diff options
author | Hugh Dickins <hugh@veritas.com> | 2005-04-19 16:29:15 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org.(none)> | 2005-04-19 16:29:15 -0400 |
commit | e0da382c92626ad1d7f4b7527d19b80104d67a83 (patch) | |
tree | b3f455518c286ee14cb2755ced8808487bca7911 /include/linux | |
parent | 9f6c6fc505560465be0964eb4da1b6ca97bd3951 (diff) |
[PATCH] freepgt: free_pgtables use vma list
Recent woes with some arches needing their own pgd_addr_end macro; and 4-level
clear_page_range regression since 2.6.10's clear_page_tables; and its
long-standing well-known inefficiency in searching throughout the higher-level
page tables for those few entries to clear and free: all can be blamed on
ignoring the list of vmas when we free page tables.
Replace exit_mmap's clear_page_range of the total user address space by
free_pgtables operating on the mm's vma list; unmap_region use it in the same
way, giving floor and ceiling beyond which it may not free tables. This
brings lmbench fork/exec/sh numbers back to 2.6.10 (unless preempt is enabled,
in which case latency fixes spoil unmap_vmas throughput).
Beware: the do_mmap_pgoff driver failure case must now use unmap_region
instead of zap_page_range, since a page table might have been allocated, and
can only be freed while it is touched by some vma.
Move free_pgtables from mmap.c to memory.c, where its lower levels are adapted
from the clear_page_range levels. (Most of free_pgtables' old code was
actually for a non-existent case, prev not properly set up, dating from before
hch gave us split_vma.) Pass mmu_gather** in the public interfaces, since we
might want to add latency lockdrops later; but no attempt to do so yet, going
by vma should itself reduce latency.
But what if is_hugepage_only_range? Those ia64 and ppc64 cases need careful
examination: put that off until a later patch of the series.
What of x86_64's 32bit vdso page __map_syscall32 maps outside any vma?
And the range to sparc64's flush_tlb_pgtables? It's less clear to me now that
we need to do more than is done here - every PMD_SIZE ever occupied will be
flushed, do we really have to flush every PGDIR_SIZE ever partially occupied?
A shame to complicate it unnecessarily.
Special thanks to David Miller for time spent repairing my ceilings.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Diffstat (limited to 'include/linux')
-rw-r--r-- | include/linux/mm.h | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/include/linux/mm.h b/include/linux/mm.h index 85f7d1bea937..c3f6c39d41d0 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h | |||
@@ -592,7 +592,8 @@ int unmap_vmas(struct mmu_gather **tlbp, struct mm_struct *mm, | |||
592 | struct vm_area_struct *start_vma, unsigned long start_addr, | 592 | struct vm_area_struct *start_vma, unsigned long start_addr, |
593 | unsigned long end_addr, unsigned long *nr_accounted, | 593 | unsigned long end_addr, unsigned long *nr_accounted, |
594 | struct zap_details *); | 594 | struct zap_details *); |
595 | void clear_page_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end); | 595 | void free_pgtables(struct mmu_gather **tlb, struct vm_area_struct *vma, |
596 | unsigned long floor, unsigned long ceiling); | ||
596 | int copy_page_range(struct mm_struct *dst, struct mm_struct *src, | 597 | int copy_page_range(struct mm_struct *dst, struct mm_struct *src, |
597 | struct vm_area_struct *vma); | 598 | struct vm_area_struct *vma); |
598 | int zeromap_page_range(struct vm_area_struct *vma, unsigned long from, | 599 | int zeromap_page_range(struct vm_area_struct *vma, unsigned long from, |