aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/cgroups/memory.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/cgroups/memory.txt')
-rw-r--r--Documentation/cgroups/memory.txt58
1 files changed, 39 insertions, 19 deletions
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 7c163477fcd8..06eb6d957c83 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -1,8 +1,8 @@
1Memory Resource Controller 1Memory Resource Controller
2 2
3NOTE: The Memory Resource Controller has been generically been referred 3NOTE: The Memory Resource Controller has generically been referred to as the
4 to as the memory controller in this document. Do not confuse memory 4 memory controller in this document. Do not confuse memory controller
5 controller used here with the memory controller that is used in hardware. 5 used here with the memory controller that is used in hardware.
6 6
7(For editors) 7(For editors)
8In this document: 8In this document:
@@ -70,6 +70,7 @@ Brief summary of control files.
70 (See sysctl's vm.swappiness) 70 (See sysctl's vm.swappiness)
71 memory.move_charge_at_immigrate # set/show controls of moving charges 71 memory.move_charge_at_immigrate # set/show controls of moving charges
72 memory.oom_control # set/show oom controls. 72 memory.oom_control # set/show oom controls.
73 memory.numa_stat # show the number of memory usage per numa node
73 74
741. History 751. History
75 76
@@ -181,7 +182,7 @@ behind this approach is that a cgroup that aggressively uses a shared
181page will eventually get charged for it (once it is uncharged from 182page will eventually get charged for it (once it is uncharged from
182the cgroup that brought it in -- this will happen on memory pressure). 183the cgroup that brought it in -- this will happen on memory pressure).
183 184
184Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used.. 185Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used.
185When you do swapoff and make swapped-out pages of shmem(tmpfs) to 186When you do swapoff and make swapped-out pages of shmem(tmpfs) to
186be backed into memory in force, charges for pages are accounted against the 187be backed into memory in force, charges for pages are accounted against the
187caller of swapoff rather than the users of shmem. 188caller of swapoff rather than the users of shmem.
@@ -213,7 +214,7 @@ affecting global LRU, memory+swap limit is better than just limiting swap from
213OS point of view. 214OS point of view.
214 215
215* What happens when a cgroup hits memory.memsw.limit_in_bytes 216* What happens when a cgroup hits memory.memsw.limit_in_bytes
216When a cgroup his memory.memsw.limit_in_bytes, it's useless to do swap-out 217When a cgroup hits memory.memsw.limit_in_bytes, it's useless to do swap-out
217in this cgroup. Then, swap-out will not be done by cgroup routine and file 218in this cgroup. Then, swap-out will not be done by cgroup routine and file
218caches are dropped. But as mentioned above, global LRU can do swapout memory 219caches are dropped. But as mentioned above, global LRU can do swapout memory
219from it for sanity of the system's memory management state. You can't forbid 220from it for sanity of the system's memory management state. You can't forbid
@@ -263,16 +264,17 @@ b. Enable CONFIG_RESOURCE_COUNTERS
263c. Enable CONFIG_CGROUP_MEM_RES_CTLR 264c. Enable CONFIG_CGROUP_MEM_RES_CTLR
264d. Enable CONFIG_CGROUP_MEM_RES_CTLR_SWAP (to use swap extension) 265d. Enable CONFIG_CGROUP_MEM_RES_CTLR_SWAP (to use swap extension)
265 266
2661. Prepare the cgroups 2671. Prepare the cgroups (see cgroups.txt, Why are cgroups needed?)
267# mkdir -p /cgroups 268# mount -t tmpfs none /sys/fs/cgroup
268# mount -t cgroup none /cgroups -o memory 269# mkdir /sys/fs/cgroup/memory
270# mount -t cgroup none /sys/fs/cgroup/memory -o memory
269 271
2702. Make the new group and move bash into it 2722. Make the new group and move bash into it
271# mkdir /cgroups/0 273# mkdir /sys/fs/cgroup/memory/0
272# echo $$ > /cgroups/0/tasks 274# echo $$ > /sys/fs/cgroup/memory/0/tasks
273 275
274Since now we're in the 0 cgroup, we can alter the memory limit: 276Since now we're in the 0 cgroup, we can alter the memory limit:
275# echo 4M > /cgroups/0/memory.limit_in_bytes 277# echo 4M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
276 278
277NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo, 279NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
278mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes, Gibibytes.) 280mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes, Gibibytes.)
@@ -280,11 +282,11 @@ mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes, Gibibytes.)
280NOTE: We can write "-1" to reset the *.limit_in_bytes(unlimited). 282NOTE: We can write "-1" to reset the *.limit_in_bytes(unlimited).
281NOTE: We cannot set limits on the root cgroup any more. 283NOTE: We cannot set limits on the root cgroup any more.
282 284
283# cat /cgroups/0/memory.limit_in_bytes 285# cat /sys/fs/cgroup/memory/0/memory.limit_in_bytes
2844194304 2864194304
285 287
286We can check the usage: 288We can check the usage:
287# cat /cgroups/0/memory.usage_in_bytes 289# cat /sys/fs/cgroup/memory/0/memory.usage_in_bytes
2881216512 2901216512
289 291
290A successful write to this file does not guarantee a successful set of 292A successful write to this file does not guarantee a successful set of
@@ -464,6 +466,24 @@ value for efficient access. (Of course, when necessary, it's synchronized.)
464If you want to know more exact memory usage, you should use RSS+CACHE(+SWAP) 466If you want to know more exact memory usage, you should use RSS+CACHE(+SWAP)
465value in memory.stat(see 5.2). 467value in memory.stat(see 5.2).
466 468
4695.6 numa_stat
470
471This is similar to numa_maps but operates on a per-memcg basis. This is
472useful for providing visibility into the numa locality information within
473an memcg since the pages are allowed to be allocated from any physical
474node. One of the usecases is evaluating application performance by
475combining this information with the application's cpu allocation.
476
477We export "total", "file", "anon" and "unevictable" pages per-node for
478each memcg. The ouput format of memory.numa_stat is:
479
480total=<total pages> N0=<node 0 pages> N1=<node 1 pages> ...
481file=<total file pages> N0=<node 0 pages> N1=<node 1 pages> ...
482anon=<total anon pages> N0=<node 0 pages> N1=<node 1 pages> ...
483unevictable=<total anon pages> N0=<node 0 pages> N1=<node 1 pages> ...
484
485And we have total = file + anon + unevictable.
486
4676. Hierarchy support 4876. Hierarchy support
468 488
469The memory controller supports a deep hierarchy and hierarchical accounting. 489The memory controller supports a deep hierarchy and hierarchical accounting.
@@ -471,13 +491,13 @@ The hierarchy is created by creating the appropriate cgroups in the
471cgroup filesystem. Consider for example, the following cgroup filesystem 491cgroup filesystem. Consider for example, the following cgroup filesystem
472hierarchy 492hierarchy
473 493
474 root 494 root
475 / | \ 495 / | \
476 / | \ 496 / | \
477 a b c 497 a b c
478 | \ 498 | \
479 | \ 499 | \
480 d e 500 d e
481 501
482In the diagram above, with hierarchical accounting enabled, all memory 502In the diagram above, with hierarchical accounting enabled, all memory
483usage of e, is accounted to its ancestors up until the root (i.e, c and root), 503usage of e, is accounted to its ancestors up until the root (i.e, c and root),