| author | Wu Fengguang <fengguang.wu@intel.com> | 2009-09-21 20:03:11 -0400 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2009-09-22 10:17:39 -0400 |
| commit | f86296317434b21585e229f6c49a33cb9ebab4d3 (patch) | |
| tree | d4fb05d4aee1a8e373ec053e7316dc9847b2c417 /include/linux | |
| parent | 1a8670a29b5277cbe601f74ab63d2c5211fb3005 (diff) | |
mm: do batched scans for mem_cgroup
For mem_cgroup, shrink_zone() may call shrink_list() with nr_to_scan=1, in
which case shrink_list() _still_ calls isolate_pages() with the much
larger SWAP_CLUSTER_MAX. This effectively scales up the inactive list scan
rate by up to 32 times.
For example, with 16k inactive pages and DEF_PRIORITY=12, (16k >> 12)=4.
So when shrink_zone() expects to scan 4 pages in each of the active and
inactive lists, 4 pages of the active list will be scanned, while the
inactive list will in effect be (over)scanned by SWAP_CLUSTER_MAX=32 pages.
That could break the balance between the two lists.
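The arithmetic above can be sketched in user-space C. This is only an illustration: the helper names intended_scan() and unbatched_inactive_scan() are hypothetical, and the rounding model assumes the inactive list always holds at least one full SWAP_CLUSTER_MAX chunk to isolate.

```c
/* Constants as quoted in the commit message. */
#define DEF_PRIORITY		12
#define SWAP_CLUSTER_MAX	32UL

/* Pages shrink_zone() intends to scan for a list of lru_size pages. */
static unsigned long intended_scan(unsigned long lru_size, int priority)
{
	return lru_size >> priority;
}

/*
 * Hypothetical model of the unbatched behaviour: any non-zero request on
 * the inactive list is served in SWAP_CLUSTER_MAX-sized chunks, so the
 * number of pages actually scanned rounds up to a multiple of that size.
 */
static unsigned long unbatched_inactive_scan(unsigned long nr_to_scan)
{
	unsigned long chunks = (nr_to_scan + SWAP_CLUSTER_MAX - 1) / SWAP_CLUSTER_MAX;

	return chunks * SWAP_CLUSTER_MAX;
}
```

Under this model, intended_scan(16 * 1024, DEF_PRIORITY) yields 4, while unbatched_inactive_scan(4) yields 32, and a request for a single page is also served as 32, which is the worst-case 32x inflation.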
It can further impact the scan of anon active list, due to the anon
active/inactive ratio rebalance logic in balance_pgdat()/shrink_zone():
    inactive anon list over scanned => inactive_anon_is_low() == TRUE
                                    => shrink_active_list()
                                    => active anon list over scanned
So the end result may be:
- anon inactive => over scanned
- anon active => over scanned (maybe not as much)
- file inactive => over scanned
- file active => under scanned (relatively)
The accesses to nr_saved_scan are not lock protected and so are not 100%
accurate. However, we can tolerate small errors and the resulting small
imbalances in scan rates between zones.
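The batching idea can be sketched as a small accumulator. This is a minimal user-space sketch, not the code from this patch: the helper name try_batch() is hypothetical, and SWAP_CLUSTER_MAX is fixed at 32 as quoted above. Small requests are saved up in nr_saved_scan and released only once a full batch has accumulated.

```c
#define SWAP_CLUSTER_MAX	32UL

/*
 * Hypothetical helper: add nr_to_scan to the saved counter and hand out
 * the whole accumulated amount once it reaches one batch. Until then,
 * return 0 so the caller scans nothing on this pass.
 *
 * The read-modify-write of *nr_saved_scan is deliberately unlocked, so
 * concurrent callers could lose or double-count an increment.
 */
static unsigned long try_batch(unsigned long nr_to_scan,
			       unsigned long *nr_saved_scan)
{
	unsigned long nr;

	*nr_saved_scan += nr_to_scan;
	nr = *nr_saved_scan;

	if (nr >= SWAP_CLUSTER_MAX)
		*nr_saved_scan = 0;	/* batch is full: release it */
	else
		nr = 0;			/* keep accumulating */

	return nr;
}
```

With requests of 4 pages each, the first seven calls return 0 and the eighth returns 32, so the long-run scan rate matches what the caller asked for instead of being inflated on every call.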
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/mmzone.h | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9c50309b30a1..c188ea624c74 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -273,6 +273,11 @@ struct zone_reclaim_stat {
 	 */
 	unsigned long		recent_rotated[2];
 	unsigned long		recent_scanned[2];
+
+	/*
+	 * accumulated for batching
+	 */
+	unsigned long		nr_saved_scan[NR_LRU_LISTS];
 };
 
 struct zone {
@@ -327,7 +332,6 @@ struct zone {
 	spinlock_t		lru_lock;
 	struct zone_lru {
 		struct list_head list;
-		unsigned long nr_saved_scan;	/* accumulated for batching */
 	} lru[NR_LRU_LISTS];
 
 	struct zone_reclaim_stat reclaim_stat;
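The effect of moving the counter can be sketched with simplified stand-in types. The enum below is an assumed, reduced version of the kernel's LRU list enumeration, and saved_scan_slot() is a hypothetical accessor. The point is that each struct zone_reclaim_stat instance now carries its own per-LRU batching counters, so any owner of such a struct (the zone, and presumably a mem_cgroup's per-zone state) accumulates independently.

```c
/* Simplified stand-in for the kernel's LRU list enumeration (assumed). */
enum lru_list {
	LRU_INACTIVE_ANON,
	LRU_ACTIVE_ANON,
	LRU_INACTIVE_FILE,
	LRU_ACTIVE_FILE,
	LRU_UNEVICTABLE,
	NR_LRU_LISTS
};

/* Field layout after the patch, per the hunk above. */
struct zone_reclaim_stat {
	unsigned long recent_rotated[2];
	unsigned long recent_scanned[2];
	unsigned long nr_saved_scan[NR_LRU_LISTS];	/* accumulated for batching */
};

/* Hypothetical accessor: the batching counter for one LRU list. */
static unsigned long *saved_scan_slot(struct zone_reclaim_stat *stat,
				      enum lru_list lru)
{
	return &stat->nr_saved_scan[lru];
}
```

Two separate struct zone_reclaim_stat instances never share a counter, which is what lets scans driven by different owners batch without disturbing each other.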
