Diffstat (limited to 'Documentation/cgroups/memory.txt')
-rw-r--r-- Documentation/cgroups/memory.txt 78
1 files changed, 55 insertions, 23 deletions
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 7781857dc940..06eb6d957c83 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -1,8 +1,8 @@
1Memory Resource Controller 1Memory Resource Controller
2 2
3NOTE: The Memory Resource Controller has been generically been referred 3NOTE: The Memory Resource Controller has generically been referred to as the
4 to as the memory controller in this document. Do not confuse memory 4 memory controller in this document. Do not confuse memory controller
5 controller used here with the memory controller that is used in hardware. 5 used here with the memory controller that is used in hardware.
6 6
7(For editors) 7(For editors)
8In this document: 8In this document:
@@ -52,8 +52,10 @@ Brief summary of control files.
52 tasks # attach a task(thread) and show list of threads 52 tasks # attach a task(thread) and show list of threads
53 cgroup.procs # show list of processes 53 cgroup.procs # show list of processes
54 cgroup.event_control # an interface for event_fd() 54 cgroup.event_control # an interface for event_fd()
55 memory.usage_in_bytes # show current memory(RSS+Cache) usage. 55 memory.usage_in_bytes # show current res_counter usage for memory
56 memory.memsw.usage_in_bytes # show current memory+Swap usage 56 (See 5.5 for details)
57 memory.memsw.usage_in_bytes # show current res_counter usage for memory+Swap
58 (See 5.5 for details)
57 memory.limit_in_bytes # set/show limit of memory usage 59 memory.limit_in_bytes # set/show limit of memory usage
58 memory.memsw.limit_in_bytes # set/show limit of memory+Swap usage 60 memory.memsw.limit_in_bytes # set/show limit of memory+Swap usage
59 memory.failcnt # show the number of memory usage hits limits 61 memory.failcnt # show the number of memory usage hits limits
@@ -68,6 +70,7 @@ Brief summary of control files.
68 (See sysctl's vm.swappiness) 70 (See sysctl's vm.swappiness)
69 memory.move_charge_at_immigrate # set/show controls of moving charges 71 memory.move_charge_at_immigrate # set/show controls of moving charges
70 memory.oom_control # set/show oom controls. 72 memory.oom_control # set/show oom controls.
73 memory.numa_stat # show the number of memory usage per numa node
71 74
721. History 751. History
73 76
@@ -179,7 +182,7 @@ behind this approach is that a cgroup that aggressively uses a shared
179page will eventually get charged for it (once it is uncharged from 182page will eventually get charged for it (once it is uncharged from
180the cgroup that brought it in -- this will happen on memory pressure). 183the cgroup that brought it in -- this will happen on memory pressure).
181 184
182Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used.. 185Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used.
183When you do swapoff and make swapped-out pages of shmem(tmpfs) to 186When you do swapoff and make swapped-out pages of shmem(tmpfs) to
184be backed into memory in force, charges for pages are accounted against the 187be backed into memory in force, charges for pages are accounted against the
185caller of swapoff rather than the users of shmem. 188caller of swapoff rather than the users of shmem.
@@ -211,7 +214,7 @@ affecting global LRU, memory+swap limit is better than just limiting swap from
211OS point of view. 214OS point of view.
212 215
213* What happens when a cgroup hits memory.memsw.limit_in_bytes 216* What happens when a cgroup hits memory.memsw.limit_in_bytes
214When a cgroup his memory.memsw.limit_in_bytes, it's useless to do swap-out 217When a cgroup hits memory.memsw.limit_in_bytes, it's useless to do swap-out
215in this cgroup. Then, swap-out will not be done by cgroup routine and file 218in this cgroup. Then, swap-out will not be done by cgroup routine and file
216caches are dropped. But as mentioned above, global LRU can do swapout memory 219caches are dropped. But as mentioned above, global LRU can do swapout memory
217from it for sanity of the system's memory management state. You can't forbid 220from it for sanity of the system's memory management state. You can't forbid
@@ -261,16 +264,17 @@ b. Enable CONFIG_RESOURCE_COUNTERS
261c. Enable CONFIG_CGROUP_MEM_RES_CTLR 264c. Enable CONFIG_CGROUP_MEM_RES_CTLR
262d. Enable CONFIG_CGROUP_MEM_RES_CTLR_SWAP (to use swap extension) 265d. Enable CONFIG_CGROUP_MEM_RES_CTLR_SWAP (to use swap extension)
263 266
2641. Prepare the cgroups 2671. Prepare the cgroups (see cgroups.txt, Why are cgroups needed?)
265# mkdir -p /cgroups 268# mount -t tmpfs none /sys/fs/cgroup
266# mount -t cgroup none /cgroups -o memory 269# mkdir /sys/fs/cgroup/memory
270# mount -t cgroup none /sys/fs/cgroup/memory -o memory
267 271
2682. Make the new group and move bash into it 2722. Make the new group and move bash into it
269# mkdir /cgroups/0 273# mkdir /sys/fs/cgroup/memory/0
270# echo $$ > /cgroups/0/tasks 274# echo $$ > /sys/fs/cgroup/memory/0/tasks
271 275
272Since now we're in the 0 cgroup, we can alter the memory limit: 276Since now we're in the 0 cgroup, we can alter the memory limit:
273# echo 4M > /cgroups/0/memory.limit_in_bytes 277# echo 4M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
274 278
275NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo, 279NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
276mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes, Gibibytes.) 280mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes, Gibibytes.)
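The suffix rule in the NOTE above can be checked by hand with a small sketch. The helper name to_bytes is hypothetical (it is not part of the cgroup interface); it only mirrors the documented rule that k, m and g mean multiples of 1024, 1024^2 and 1024^3 bytes.

```shell
# Sketch only: convert a limit string such as "4M" into bytes.
# to_bytes is a hypothetical helper, not something the kernel provides.
to_bytes() {
    value=$1
    case "$value" in
        *[kK]) echo $(( ${value%?} * 1024 )) ;;                 # kibibytes
        *[mM]) echo $(( ${value%?} * 1024 * 1024 )) ;;          # mebibytes
        *[gG]) echo $(( ${value%?} * 1024 * 1024 * 1024 )) ;;   # gibibytes
        *)     echo "$value" ;;                                 # plain byte count
    esac
}

# Usage:
# to_bytes 4M    # prints 4194304
```

This is why writing 4M to memory.limit_in_bytes reads back as 4194304.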
@@ -278,11 +282,11 @@ mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes, Gibibytes.)
278NOTE: We can write "-1" to reset the *.limit_in_bytes(unlimited). 282NOTE: We can write "-1" to reset the *.limit_in_bytes(unlimited).
279NOTE: We cannot set limits on the root cgroup any more. 283NOTE: We cannot set limits on the root cgroup any more.
280 284
281# cat /cgroups/0/memory.limit_in_bytes 285# cat /sys/fs/cgroup/memory/0/memory.limit_in_bytes
2824194304 2864194304
283 287
284We can check the usage: 288We can check the usage:
285# cat /cgroups/0/memory.usage_in_bytes 289# cat /sys/fs/cgroup/memory/0/memory.usage_in_bytes
2861216512 2901216512
287 291
288A successful write to this file does not guarantee a successful set of 292A successful write to this file does not guarantee a successful set of
@@ -453,6 +457,33 @@ memory under it will be reclaimed.
453You can reset failcnt by writing 0 to failcnt file. 457You can reset failcnt by writing 0 to failcnt file.
454# echo 0 > .../memory.failcnt 458# echo 0 > .../memory.failcnt
455 459
460 5.5 usage_in_bytes
461
462 For efficiency, like other kernel components, the memory cgroup uses some optimization
463 to avoid unnecessary cacheline false sharing. usage_in_bytes is affected by this
464 method and doesn't show the 'exact' value of memory (and swap) usage; it's a fuzzy
465 value for efficient access. (Of course, when necessary, it's synchronized.)
466 If you want to know the memory usage more exactly, you should use the RSS+CACHE(+SWAP)
467 value in memory.stat (see 5.2).
468
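As a minimal sketch of the RSS+CACHE calculation described in 5.5, the function below sums the "cache" and "rss" lines of a memory.stat-style file. The name exact_usage is hypothetical, and the sketch assumes the file holds one "<name> <bytes>" pair per line; the swap term is left out for simplicity.

```shell
# Sketch only: sum the "cache" and "rss" values from a memory.stat-style file.
# exact_usage is a hypothetical helper; it assumes "<name> <bytes>" lines.
exact_usage() {
    sum=0
    while read -r key value; do
        case "$key" in
            cache|rss) sum=$((sum + value)) ;;   # exact match, so total_cache etc. are skipped
        esac
    done < "$1"
    echo "$sum"
}

# Usage (path assumes the mount point used in this document):
# exact_usage /sys/fs/cgroup/memory/0/memory.stat
# cat /sys/fs/cgroup/memory/0/memory.usage_in_bytes
```

The two numbers may differ slightly on a live system, which is the fuzziness that 5.5 describes.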
469 5.6 numa_stat
470
471 This is similar to numa_maps but operates on a per-memcg basis. This is
472 useful for providing visibility into the NUMA locality information within
473 a memcg since the pages are allowed to be allocated from any physical
474 node. One of the use cases is evaluating application performance by
475 combining this information with the application's CPU allocation.
476
477 We export "total", "file", "anon" and "unevictable" pages per-node for
478 each memcg. The output format of memory.numa_stat is:
479
480 total=<total pages> N0=<node 0 pages> N1=<node 1 pages> ...
481 file=<total file pages> N0=<node 0 pages> N1=<node 1 pages> ...
482 anon=<total anon pages> N0=<node 0 pages> N1=<node 1 pages> ...
483 unevictable=<total unevictable pages> N0=<node 0 pages> N1=<node 1 pages> ...
484
485 And we have total = file + anon + unevictable.
486
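The relation total = file + anon + unevictable can be checked with a short sketch. The function name check_numa_stat is hypothetical, and the sketch assumes the four-line "key=<pages> N0=... N1=..." format shown for memory.numa_stat; only the first field of each line is read.

```shell
# Sketch only: check total = file + anon + unevictable in a numa_stat-style file.
# check_numa_stat is a hypothetical helper, not part of the kernel interface.
check_numa_stat() {
    total=0 file=0 anon=0 unevictable=0
    while read -r first _rest; do
        # "first" looks like "total=1234"; split it at the "=" sign
        case "${first%%=*}" in
            total)       total=${first#*=} ;;
            file)        file=${first#*=} ;;
            anon)        anon=${first#*=} ;;
            unevictable) unevictable=${first#*=} ;;
        esac
    done < "$1"
    if [ "$total" -eq $((file + anon + unevictable)) ]; then
        echo "consistent"
    else
        echo "mismatch: total=$total sum=$((file + anon + unevictable))"
        return 1
    fi
}

# Usage (path assumes the mount point used in this document):
# check_numa_stat /sys/fs/cgroup/memory/0/memory.numa_stat
```

The per-node N0, N1, ... fields are ignored here; summing them per line would be a natural extension.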
4566. Hierarchy support 4876. Hierarchy support
457 488
458The memory controller supports a deep hierarchy and hierarchical accounting. 489The memory controller supports a deep hierarchy and hierarchical accounting.
@@ -460,13 +491,13 @@ The hierarchy is created by creating the appropriate cgroups in the
460cgroup filesystem. Consider for example, the following cgroup filesystem 491cgroup filesystem. Consider for example, the following cgroup filesystem
461hierarchy 492hierarchy
462 493
463 root 494 root
464 / | \ 495 / | \
465 / | \ 496 / | \
466 a b c 497 a b c
467 | \ 498 | \
468 | \ 499 | \
469 d e 500 d e
470 501
471In the diagram above, with hierarchical accounting enabled, all memory 502In the diagram above, with hierarchical accounting enabled, all memory
472usage of e, is accounted to its ancestors up until the root (i.e, c and root), 503usage of e, is accounted to its ancestors up until the root (i.e, c and root),
@@ -485,8 +516,9 @@ The feature can be disabled by
485 516
486# echo 0 > memory.use_hierarchy 517# echo 0 > memory.use_hierarchy
487 518
488NOTE1: Enabling/disabling will fail if the cgroup already has other 519NOTE1: Enabling/disabling will fail if either the cgroup already has other
489 cgroups created below it. 520 cgroups created below it, or if the parent cgroup has use_hierarchy
521 enabled.
490 522
491NOTE2: When panic_on_oom is set to "2", the whole system will panic in 523NOTE2: When panic_on_oom is set to "2", the whole system will panic in
492 case of an OOM event in any cgroup. 524 case of an OOM event in any cgroup.