author Greg Thelen <gthelen@google.com> 2011-01-13 18:47:36 -0500
committer Linus Torvalds <torvalds@linux-foundation.org> 2011-01-13 20:32:50 -0500
commit ece72400c2a27a3d726cb0854449f991d9fcd2da
tree 8815e1a75188d0e7df8e0c633778dbb166c3b75c
parent db16d5ec1f87f17511599bc77857dd1662b5a22f

memcg: document cgroup dirty memory interfaces

Document cgroup dirty memory interfaces and statistics.

[akpm@linux-foundation.org: fix use_hierarchy description]
Signed-off-by: Andrea Righi <arighi@develer.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/cgroups/memory.txt 74
1 files changed, 74 insertions, 0 deletions
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 7781857dc94..bac328c232f 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -385,6 +385,10 @@ mapped_file - # of bytes of mapped file (includes tmpfs/shmem)
 pgpgin - # of pages paged in (equivalent to # of charging events).
 pgpgout - # of pages paged out (equivalent to # of uncharging events).
 swap - # of bytes of swap usage
+dirty - # of bytes that are waiting to get written back to the disk.
+writeback - # of bytes that are actively being written back to the disk.
+nfs_unstable - # of bytes sent to the NFS server, but not yet committed to
+  the actual storage.
 inactive_anon - # of bytes of anonymous memory and swap cache memory on
   LRU list.
 active_anon - # of bytes of anonymous and swap cache memory on active
@@ -406,6 +410,9 @@ total_mapped_file - sum of all children's "cache"
 total_pgpgin - sum of all children's "pgpgin"
 total_pgpgout - sum of all children's "pgpgout"
 total_swap - sum of all children's "swap"
+total_dirty - sum of all children's "dirty"
+total_writeback - sum of all children's "writeback"
+total_nfs_unstable - sum of all children's "nfs_unstable"
 total_inactive_anon - sum of all children's "inactive_anon"
 total_active_anon - sum of all children's "active_anon"
 total_inactive_file - sum of all children's "inactive_file"
@@ -453,6 +460,73 @@ memory under it will be reclaimed.
 You can reset failcnt by writing 0 to failcnt file.
 # echo 0 > .../memory.failcnt
 
+5.5 dirty memory
+
+Control the maximum amount of dirty pages a cgroup can have at any given time.
+
+Limiting dirty memory is like fixing the max amount of dirty (hard to reclaim)
+page cache used by a cgroup. So, in case of multiple cgroup writers, they will
+not be able to consume more than their designated share of dirty pages and will
+be forced to perform write-out if they cross that limit.
+
+The interface is equivalent to the procfs interface: /proc/sys/vm/dirty_*. It
+is possible to configure a limit to trigger both a direct writeback or a
+background writeback performed by per-bdi flusher threads. The root cgroup
+memory.dirty_* control files are read-only and match the contents of
+the /proc/sys/vm/dirty_* files.
+
+Per-cgroup dirty limits can be set using the following files in the cgroupfs:
+
+- memory.dirty_ratio: the amount of dirty memory (expressed as a percentage of
+  cgroup memory) at which a process generating dirty pages will itself start
+  writing out dirty data.
+
+- memory.dirty_limit_in_bytes: the amount of dirty memory (expressed in bytes)
+  in the cgroup at which a process generating dirty pages will start itself
+  writing out dirty data. Suffix (k, K, m, M, g, or G) can be used to indicate
+  that value is kilo, mega or gigabytes.
+
+  Note: memory.dirty_limit_in_bytes is the counterpart of memory.dirty_ratio.
+  Only one of them may be specified at a time. When one is written it is
+  immediately taken into account to evaluate the dirty memory limits and the
+  other appears as 0 when read.
+
+- memory.dirty_background_ratio: the amount of dirty memory of the cgroup
+  (expressed as a percentage of cgroup memory) at which background writeback
+  kernel threads will start writing out dirty data.
+
+- memory.dirty_background_limit_in_bytes: the amount of dirty memory (expressed
+  in bytes) in the cgroup at which background writeback kernel threads will
+  start writing out dirty data. Suffix (k, K, m, M, g, or G) can be used to
+  indicate that value is kilo, mega or gigabytes.
+
+  Note: memory.dirty_background_limit_in_bytes is the counterpart of
+  memory.dirty_background_ratio. Only one of them may be specified at a time.
+  When one is written it is immediately taken into account to evaluate the dirty
+  memory limits and the other appears as 0 when read.
+
+A cgroup may contain more dirty memory than its dirty limit. This is possible
+because of the principle that the first cgroup to touch a page is charged for
+it. Subsequent page counting events (dirty, writeback, nfs_unstable) are also
+counted to the originally charged cgroup.
+
+Example: If page is allocated by a cgroup A task, then the page is charged to
+cgroup A. If the page is later dirtied by a task in cgroup B, then the cgroup A
+dirty count will be incremented. If cgroup A is over its dirty limit but cgroup
+B is not, then dirtying a cgroup A page from a cgroup B task may push cgroup A
+over its dirty limit without throttling the dirtying cgroup B task.
+
+When use_hierarchy=0, each cgroup has dirty memory usage and limits.
+System-wide dirty limits are also consulted. Dirty memory consumption is
+checked against both system-wide and per-cgroup dirty limits.
+
+The current implementation does not enforce per-cgroup dirty limits when
+use_hierarchy=1. System-wide dirty limits are used for processes in such
+cgroups. Attempts to read memory.dirty_* files return the system-wide
+values. Writes to the memory.dirty_* files return error. An enhanced
+implementation is needed to check the chain of parents to ensure that no
+dirty limit is exceeded.
+
 6. Hierarchy support
 
 The memory controller supports a deep hierarchy and hierarchical accounting.
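
The ratio/bytes pairing documented in section 5.5 of this patch can be illustrated with a small user-space model. This is only a sketch, not kernel code: `parse_size`, `DirtyLimitPair`, and the `cgroup_memory_bytes` parameter are hypothetical names introduced here, the 1024-based suffix multipliers are an assumption (the patch text only says "kilo, mega or gigabytes"), and nothing in it reads or writes a real cgroupfs file.

```python
# Hypothetical user-space model of one ratio / limit_in_bytes pair
# (for example memory.dirty_ratio and memory.dirty_limit_in_bytes).

# Assumption: suffixes are powers of 1024; the patch does not state this.
_SUFFIXES = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}


def parse_size(text):
    """Convert a string such as '4K' or '2m' to a byte count."""
    text = text.strip()
    if text and text[-1].lower() in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1].lower()]
    return int(text)


class DirtyLimitPair:
    """Only one of ratio / limit_in_bytes is in effect at a time."""

    def __init__(self):
        self.ratio = 0
        self.limit_in_bytes = 0

    def write_ratio(self, percent):
        if not 0 <= percent <= 100:
            raise ValueError("ratio must be a percentage")
        self.ratio = percent
        self.limit_in_bytes = 0  # the counterpart reads back as 0

    def write_limit_in_bytes(self, text):
        self.limit_in_bytes = parse_size(text)
        self.ratio = 0  # the counterpart reads back as 0

    def threshold(self, cgroup_memory_bytes):
        """Dirty byte count at which write-out would start in this model."""
        if self.limit_in_bytes:
            return self.limit_in_bytes
        return cgroup_memory_bytes * self.ratio // 100
```

Writing one knob clears the other, so a caller only ever needs `threshold()` to find the value in effect. What "cgroup memory" means for the percentage is left to the caller here, since the patch text does not define it precisely.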