author | KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> | 2012-01-12 20:17:44 -0500 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-01-12 23:13:04 -0500 |
commit | ab936cbcd02072a34b60d268f94440fd5cf1970b (patch) | |
tree | d37e3e3c54cc4cc691a428b6ceb71b4b40e4f42b /include/linux/memcontrol.h | |
parent | 28d82dc1c4edbc352129f97f4ca22624d1fe61de (diff) |
memcg: add mem_cgroup_replace_page_cache() to fix LRU issue
Commit ef6a3c6311 ("mm: add replace_page_cache_page() function") added a
function replace_page_cache_page(). This function replaces a page in the
radix-tree with a new page. When doing this, memory cgroup needs to fix
up the accounting information. memcg needs to check the PCG_USED bit, etc.
In some (many?) cases, 'newpage' is on LRU before calling
replace_page_cache_page(). So, memcg's LRU accounting information should be
fixed, too.
This patch adds mem_cgroup_replace_page_cache() and removes the old hooks.
In that function, the old page is unaccounted without touching
res_counter and the new page is accounted to the memcg (of the old page).
When overwriting pc->mem_cgroup of newpage, take zone->lru_lock to avoid
races with LRU handling.
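The hand-over described above can be sketched as a small userspace model. This is an illustrative toy under stated assumptions, not the kernel implementation: `toy_memcg`, `toy_page` and `toy_replace_page_cache()` are hypothetical names, `charged` stands in for res_counter usage, `lru_pages` for the per-memcg LRU accounting, and locking is only marked by a comment. The later removal of oldpage from its LRU is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup. */
struct toy_memcg {
	long charged;	/* pages charged; stands in for res_counter usage */
	long lru_pages;	/* pages linked on this group's LRU */
};

/* Hypothetical stand-in for a page plus its page_cgroup. */
struct toy_page {
	struct toy_memcg *memcg;	/* stands in for pc->mem_cgroup */
	bool used;			/* stands in for the PCG_USED bit */
	bool on_lru;			/* stands in for PageLRU() */
};

static void toy_replace_page_cache(struct toy_page *oldpage,
				   struct toy_page *newpage)
{
	struct toy_memcg *memcg = oldpage->memcg;

	if (!oldpage->used || memcg == NULL)
		return;	/* oldpage was never accounted: nothing to hand over */

	/*
	 * Unaccount oldpage. 'charged' is left alone because the charge
	 * is handed to newpage rather than dropped and taken again.
	 */
	oldpage->used = false;

	/* In the kernel this part would run under zone->lru_lock. */
	if (newpage->on_lru && newpage->memcg != NULL)
		newpage->memcg->lru_pages--;	/* unlink from the current LRU */
	newpage->memcg = memcg;
	newpage->used = true;
	if (newpage->on_lru)
		memcg->lru_pages++;		/* relink on the new owner's LRU */
}
```

In this model, a newpage that was already on another group's LRU ends up both charged to and linked under the memcg of oldpage, while the charge counter never changes.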
Background:
replace_page_cache_page() is called by FUSE code in its splice() handling.
Here, 'newpage' replaces oldpage, but this newpage is not a newly allocated
page and may be on LRU. LRU mis-accounting is critical for memory cgroup
because rmdir() checks that the whole LRU is empty and that there is no
account leak. If a page is on a different LRU than it should be, rmdir() will fail.
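To make the failure mode concrete, here is a minimal userspace sketch, assuming a simplified model in which removing a group can only drop charges for pages it finds on its own LRU. The names `toy_memcg`, `toy_page`, `toy_lru_add()` and `toy_rmdir()` are hypothetical and do not correspond to kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LRU_MAX 4

struct toy_page;

/* Hypothetical stand-in for a memory cgroup with its own LRU. */
struct toy_memcg {
	long charged;				/* pages charged to this group */
	struct toy_page *lru[TOY_LRU_MAX];	/* pages linked on this group's LRU */
	size_t nr_lru;
};

struct toy_page {
	struct toy_memcg *memcg;		/* group the page is charged to */
};

/* Link a page on the LRU of 'lru_owner', which may differ from page->memcg. */
static void toy_lru_add(struct toy_memcg *lru_owner, struct toy_page *page)
{
	assert(lru_owner->nr_lru < TOY_LRU_MAX);
	lru_owner->lru[lru_owner->nr_lru++] = page;
}

/*
 * Drop every charge that can be reached by walking the group's own LRU,
 * then report whether the group is empty. A page charged to this group
 * but linked on another LRU is never visited, so its charge leaks.
 */
static bool toy_rmdir(struct toy_memcg *memcg)
{
	while (memcg->nr_lru > 0) {
		struct toy_page *page = memcg->lru[--memcg->nr_lru];

		if (page->memcg == memcg) {
			page->memcg = NULL;
			memcg->charged--;
		}
	}
	return memcg->charged == 0;
}
```

With one page charged to a group but linked on a different group's LRU, `toy_rmdir()` on the first group returns false in this model, which mirrors the account leak described above.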
This bug was added in March 2011, but there has been no bug report yet. I
guess there are not many people who use memcg and FUSE at the same time
with upstream kernels.
The result of this bug is that the admin cannot destroy a memcg because of
the account leak. So, no panic, no deadlock. And even if an active cgroup
exists, umount can succeed. So no problem at shutdown.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'include/linux/memcontrol.h')
-rw-r--r-- | include/linux/memcontrol.h | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index f944591765eb..3558a5e268cf 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -122,6 +122,8 @@ struct zone_reclaim_stat*
 mem_cgroup_get_reclaim_stat_from_page(struct page *page);
 extern void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
 					struct task_struct *p);
+extern void mem_cgroup_replace_page_cache(struct page *oldpage,
+					struct page *newpage);
 
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
 extern int do_swap_account;
@@ -369,6 +371,10 @@ static inline
 void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
 {
 }
+static inline void mem_cgroup_replace_page_cache(struct page *oldpage,
+				struct page *newpage)
+{
+}
 #endif /* CONFIG_CGROUP_MEM_CONT */
 
 #if !defined(CONFIG_CGROUP_MEM_RES_CTLR) || !defined(CONFIG_DEBUG_VM)