author | Hugh Dickins <hugh@veritas.com> | 2005-10-29 21:16:18 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@g5.osdl.org> | 2005-10-30 00:40:39 -0400 |
commit | 365e9c87a982c03d0af3886e29d877f581b59611 (patch) | |
tree | d06c1918ca9fe6677d7e4e869555e095004274f7 /mm/mremap.c | |
parent | 861f2fb8e796022b4928cab9c74fca6681a1c557 (diff) |
[PATCH] mm: update_hiwaters just in time

update_mem_hiwater has attracted various criticisms, in particular from those
concerned with mm scalability. Originally it was called whenever rss or
total_vm got raised. Then many of those callsites were replaced by a timer
tick call from account_system_time. Now Frank van Maarseveen reports that the
timer tick approach is inadequate. How about this? Works for Frank.

Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
update_hiwater_rss and update_hiwater_vm. Don't attempt to keep
mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
by 1): those are hot paths. Do the opposite, update only when about to lower
rss (usually by many), or just before final accounting in do_exit. Handle
mm->hiwater_vm in the same way, though it's much less of an issue. Demand
that whoever collects these hiwater statistics do the work of taking the
maximum with rss or total_vm.
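
The convention described above can be modelled in userspace. This is a minimal sketch, not the kernel code: `struct mm_model` and the `model_*` helpers are hypothetical names, and the macro bodies are an assumption about what update_hiwater_rss and update_hiwater_vm do, inferred only from the description in this message (the diff shown here is limited to mm/mremap.c).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's mm_struct: only the four
 * counters discussed in the commit message are modelled. */
struct mm_model {
	unsigned long rss;         /* current resident pages */
	unsigned long total_vm;    /* current mapped pages */
	unsigned long hiwater_rss; /* recorded peak of rss */
	unsigned long hiwater_vm;  /* recorded peak of total_vm */
};

/* Assumed macro bodies: fold the current value into the mark.
 * Meant to be called only just before a counter is lowered,
 * or before final accounting. */
#define update_hiwater_rss(mm) do {			\
	if ((mm)->hiwater_rss < (mm)->rss)		\
		(mm)->hiwater_rss = (mm)->rss;		\
} while (0)

#define update_hiwater_vm(mm) do {			\
	if ((mm)->hiwater_vm < (mm)->total_vm)		\
		(mm)->hiwater_vm = (mm)->total_vm;	\
} while (0)

/* Hot path: raising rss leaves the high-water mark alone. */
static void model_fault_in(struct mm_model *mm, unsigned long pages)
{
	assert(mm != NULL);
	mm->rss += pages;
}

/* Cold path: record the peak first, then lower rss. */
static void model_unmap(struct mm_model *mm, unsigned long pages)
{
	assert(mm != NULL && pages <= mm->rss);
	update_hiwater_rss(mm);
	mm->rss -= pages;
}

/* Collector: the stored mark may be stale, so take the maximum
 * with the current value. */
static unsigned long model_peak_rss(const struct mm_model *mm)
{
	assert(mm != NULL);
	return mm->hiwater_rss > mm->rss ? mm->hiwater_rss : mm->rss;
}
```

In this model, hiwater_rss stays stale while rss only grows, which is why the collector has to take the maximum itself.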

And there has been no collector of these hiwater statistics in the tree. The
new convention needs an example, so match Frank's usage by adding a VmPeak
line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
(High-Water-Mark or High-Water-Memory).
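
A userspace consumer of the new lines might look like the sketch below. The helper name `status_field_kb` is hypothetical, and the line layout it expects (`Key:`, whitespace, a number, then `kB`) is an assumption based on how the existing VmSize and VmRSS lines are printed. The function works on text already read into memory, so it can be tried without a live /proc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up a "Key:   <number> kB" line in status-style text.
 * Returns 0 and stores the number in *kb on success, -1 if the key
 * is absent or its value does not parse. */
static int status_field_kb(const char *text, const char *key,
			   unsigned long *kb)
{
	size_t keylen;
	const char *line = text;

	assert(text != NULL && key != NULL && kb != NULL);
	keylen = strlen(key);

	while (line && *line) {
		/* Match the key only at the start of a line, followed by ':' */
		if (strncmp(line, key, keylen) == 0 && line[keylen] == ':')
			return sscanf(line + keylen + 1, "%lu", kb) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}
```

To query a running process, read /proc/self/status (or /proc/<pid>/status) into a NUL-terminated buffer and pass it as `text` with the key "VmPeak" or "VmHWM".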

There was a particular anomaly during mremap move, that hiwater_vm might be
captured too high. A fleeting anomaly of that kind remains, but it is now
quickly corrected, whereas before it would stick.
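
The mremap anomaly can be reproduced in a small userspace model. This is a sketch under assumptions: `struct vm_model`, `model_munmap` and `model_move` are hypothetical names, and the model assumes that do_munmap() records the high-water mark just before lowering total_vm, as the just-in-time convention implies. The `restore` flag is only there to compare behaviour with and without the save and restore that this patch adds to move_vma().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two counters involved in the anomaly. */
struct vm_model {
	unsigned long total_vm;
	unsigned long hiwater_vm;
};

/* Stand-in for do_munmap(): record the peak, then lower total_vm. */
static void model_munmap(struct vm_model *mm, unsigned long pages)
{
	assert(mm != NULL && pages <= mm->total_vm);
	if (mm->hiwater_vm < mm->total_vm)
		mm->hiwater_vm = mm->total_vm;
	mm->total_vm -= pages;
}

/* Stand-in for the move_vma() accounting when old_len == new_len.
 * With restore == 0 the inflated mark sticks; with restore != 0 the
 * saved mark is put back once the old range has been unmapped. */
static void model_move(struct vm_model *mm, unsigned long pages, int restore)
{
	unsigned long hiwater_vm;

	assert(mm != NULL);
	hiwater_vm = mm->hiwater_vm;	/* save before the transient bump */
	mm->total_vm += pages;		/* new range counted, old not yet gone */
	model_munmap(mm, pages);	/* captures the inflated total_vm */
	if (restore)
		mm->hiwater_vm = hiwater_vm;
}
```

Between the bump and the restore, a reader of the model's counters would still see both values too high, which corresponds to the fleeting anomaly the message mentions.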

What locking? None: if the app is racy then these statistics will be racy,
it's not worth any overhead to make them exact. But whenever it suits,
hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
page_table_lock (for now) or with preemption disabled (later on): without
going to any trouble, minimize the time between reading current values and
updating, to minimize those occasions when a racing thread bumps a count up
and back down in between.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Diffstat (limited to 'mm/mremap.c')
-rw-r--r-- | mm/mremap.c | 12 |
1 files changed, 10 insertions, 2 deletions
diff --git a/mm/mremap.c b/mm/mremap.c
index 318eea5467a0..ccf456477020 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -167,6 +167,7 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 	unsigned long new_pgoff;
 	unsigned long moved_len;
 	unsigned long excess = 0;
+	unsigned long hiwater_vm;
 	int split = 0;
 
 	/*
@@ -205,9 +206,15 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 	}
 
 	/*
-	 * if we failed to move page tables we still do total_vm increment
-	 * since do_munmap() will decrement it by old_len == new_len
+	 * If we failed to move page tables we still do total_vm increment
+	 * since do_munmap() will decrement it by old_len == new_len.
+	 *
+	 * Since total_vm is about to be raised artificially high for a
+	 * moment, we need to restore high watermark afterwards: if stats
+	 * are taken meanwhile, total_vm and hiwater_vm appear too high.
+	 * If this were a serious issue, we'd add a flag to do_munmap().
 	 */
+	hiwater_vm = mm->hiwater_vm;
 	mm->total_vm += new_len >> PAGE_SHIFT;
 	vm_stat_account(mm, vma->vm_flags, vma->vm_file, new_len>>PAGE_SHIFT);
 
@@ -216,6 +223,7 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 		vm_unacct_memory(excess >> PAGE_SHIFT);
 		excess = 0;
 	}
+	mm->hiwater_vm = hiwater_vm;
 
 	/* Restore VM_ACCOUNT if one or two pieces of vma left */
 	if (excess) {