aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/cgroups
diff options
context:
space:
mode:
authorKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>2012-05-29 18:06:51 -0400
committerLinus Torvalds <torvalds@linux-foundation.org>2012-05-29 19:22:24 -0400
commit4b91355e9dc9ac1eb3d69e56de093899ff2677ef (patch)
treeb6d18618013eadd3b3e3144c90b16d0d5f07b3af /Documentation/cgroups
parent181eb39425f2b9275afcb015eaa547d11f71a02f (diff)
memcg: fix/change behavior of shared anon at moving task
This patch changes memcg's behavior at task_move(). At task_move(), the kernel scans a task's page table and move the changes for mapped pages from source cgroup to target cgroup. There has been a bug at handling shared anonymous pages for a long time. Before patch: - The spec says 'shared anonymous pages are not moved.' - The implementation was 'shared anonymoys pages may be moved'. If page_mapcount <=2, shared anonymous pages's charge were moved. After patch: - The spec says 'all anonymous pages are moved'. - The implementation is 'all anonymous pages are moved'. Considering usage of memcg, this will not affect user's experience. 'shared anonymous' pages only exists between a tree of processes which don't do exec(). Moving one of process without exec() seems not sane. For example, libcgroup will not be affected by this change. (Anyway, no one noticed the implementation for a long time...) Below is a discussion log: - current spec/implementation are complex - Now, shared file caches are moved - It adds unclear check as page_mapcount(). To do correct check, we should check swap users, etc. - No one notice this implementation behavior. So, no one get benefit from the design. - In general, once task is moved to a cgroup for running, it will not be moved.... - Finally, we have control knob as memory.move_charge_at_immigrate. Here is a patch to allow moving shared pages, completely. This makes memcg simpler and fix current broken code. Suggested-by: Hugh Dickins <hughd@google.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation/cgroups')
-rw-r--r--Documentation/cgroups/memory.txt9
1 files changed, 4 insertions, 5 deletions
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index e479007f1a75..6a066a270fc5 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -184,12 +184,14 @@ behind this approach is that a cgroup that aggressively uses a shared
184page will eventually get charged for it (once it is uncharged from 184page will eventually get charged for it (once it is uncharged from
185the cgroup that brought it in -- this will happen on memory pressure). 185the cgroup that brought it in -- this will happen on memory pressure).
186 186
187But see section 8.2: when moving a task to another cgroup, its pages may
188be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.
189
187Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used. 190Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used.
188When you do swapoff and make swapped-out pages of shmem(tmpfs) to 191When you do swapoff and make swapped-out pages of shmem(tmpfs) to
189be backed into memory in force, charges for pages are accounted against the 192be backed into memory in force, charges for pages are accounted against the
190caller of swapoff rather than the users of shmem. 193caller of swapoff rather than the users of shmem.
191 194
192
1932.4 Swap Extension (CONFIG_CGROUP_MEM_RES_CTLR_SWAP) 1952.4 Swap Extension (CONFIG_CGROUP_MEM_RES_CTLR_SWAP)
194 196
195Swap Extension allows you to record charge for swap. A swapped-in page is 197Swap Extension allows you to record charge for swap. A swapped-in page is
@@ -615,8 +617,7 @@ memory cgroup.
615 bit | what type of charges would be moved ? 617 bit | what type of charges would be moved ?
616 -----+------------------------------------------------------------------------ 618 -----+------------------------------------------------------------------------
617 0 | A charge of an anonymous page(or swap of it) used by the target task. 619 0 | A charge of an anonymous page(or swap of it) used by the target task.
618 | Those pages and swaps must be used only by the target task. You must 620 | You must enable Swap Extension(see 2.4) to enable move of swap charges.
619 | enable Swap Extension(see 2.4) to enable move of swap charges.
620 -----+------------------------------------------------------------------------ 621 -----+------------------------------------------------------------------------
621 1 | A charge of file pages(normal file, tmpfs file(e.g. ipc shared memory) 622 1 | A charge of file pages(normal file, tmpfs file(e.g. ipc shared memory)
622 | and swaps of tmpfs file) mmapped by the target task. Unlike the case of 623 | and swaps of tmpfs file) mmapped by the target task. Unlike the case of
@@ -629,8 +630,6 @@ memory cgroup.
629 630
6308.3 TODO 6318.3 TODO
631 632
632- Implement madvise(2) to let users decide the vma to be moved or not to be
633 moved.
634- All of moving charge operations are done under cgroup_mutex. It's not good 633- All of moving charge operations are done under cgroup_mutex. It's not good
635 behavior to hold the mutex too long, so we may need some trick. 634 behavior to hold the mutex too long, so we may need some trick.
636 635