author | Greg Thelen <gthelen@google.com> | 2011-01-13 18:47:36 -0500 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2011-01-13 20:32:50 -0500 |
commit | ece72400c2a27a3d726cb0854449f991d9fcd2da (patch) | |
tree | 8815e1a75188d0e7df8e0c633778dbb166c3b75c /Documentation | |
parent | db16d5ec1f87f17511599bc77857dd1662b5a22f (diff) |
memcg: document cgroup dirty memory interfaces
Document cgroup dirty memory interfaces and statistics.
[akpm@linux-foundation.org: fix use_hierarchy description]
Signed-off-by: Andrea Righi <arighi@develer.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/cgroups/memory.txt | 74 |
1 files changed, 74 insertions, 0 deletions
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 7781857dc940..bac328c232f5 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -385,6 +385,10 @@ mapped_file - # of bytes of mapped file (includes tmpfs/shmem) | |||
385 | pgpgin - # of pages paged in (equivalent to # of charging events). | 385 | pgpgin - # of pages paged in (equivalent to # of charging events). |
386 | pgpgout - # of pages paged out (equivalent to # of uncharging events). | 386 | pgpgout - # of pages paged out (equivalent to # of uncharging events). |
387 | swap - # of bytes of swap usage | 387 | swap - # of bytes of swap usage |
388 | dirty - # of bytes that are waiting to get written back to the disk. | ||
389 | writeback - # of bytes that are actively being written back to the disk. | ||
390 | nfs_unstable - # of bytes sent to the NFS server, but not yet committed to | ||
391 | the actual storage. | ||
388 | inactive_anon - # of bytes of anonymous memory and swap cache memory on | 392 | inactive_anon - # of bytes of anonymous memory and swap cache memory on |
389 | LRU list. | 393 | LRU list. |
390 | active_anon - # of bytes of anonymous and swap cache memory on active | 394 | active_anon - # of bytes of anonymous and swap cache memory on active |
@@ -406,6 +410,9 @@ total_mapped_file - sum of all children's "cache" | |||
406 | total_pgpgin - sum of all children's "pgpgin" | 410 | total_pgpgin - sum of all children's "pgpgin" |
407 | total_pgpgout - sum of all children's "pgpgout" | 411 | total_pgpgout - sum of all children's "pgpgout" |
408 | total_swap - sum of all children's "swap" | 412 | total_swap - sum of all children's "swap" |
413 | total_dirty - sum of all children's "dirty" | ||
414 | total_writeback - sum of all children's "writeback" | ||
415 | total_nfs_unstable - sum of all children's "nfs_unstable" | ||
409 | total_inactive_anon - sum of all children's "inactive_anon" | 416 | total_inactive_anon - sum of all children's "inactive_anon" |
410 | total_active_anon - sum of all children's "active_anon" | 417 | total_active_anon - sum of all children's "active_anon" |
411 | total_inactive_file - sum of all children's "inactive_file" | 418 | total_inactive_file - sum of all children's "inactive_file" |
@@ -453,6 +460,73 @@ memory under it will be reclaimed. | |||
453 | You can reset failcnt by writing 0 to failcnt file. | 460 | You can reset failcnt by writing 0 to failcnt file. |
454 | # echo 0 > .../memory.failcnt | 461 | # echo 0 > .../memory.failcnt |
455 | 462 | ||
463 | 5.5 dirty memory | ||
464 | |||
465 | Control the maximum amount of dirty pages a cgroup can have at any given time. | ||
466 | |||
467 | Limiting dirty memory is like fixing the max amount of dirty (hard to reclaim) | ||
468 | page cache used by a cgroup. So, in case of multiple cgroup writers, they will | ||
469 | not be able to consume more than their designated share of dirty pages and will | ||
470 | be forced to perform write-out if they cross that limit. | ||
471 | |||
472 | The interface is equivalent to the procfs interface: /proc/sys/vm/dirty_*. It | ||
473 | is possible to configure a limit to trigger either a direct writeback or a | ||
474 | background writeback performed by per-bdi flusher threads. The root cgroup | ||
475 | memory.dirty_* control files are read-only and match the contents of | ||
476 | the /proc/sys/vm/dirty_* files. | ||
477 | |||
478 | Per-cgroup dirty limits can be set using the following files in the cgroupfs: | ||
479 | |||
480 | - memory.dirty_ratio: the amount of dirty memory (expressed as a percentage of | ||
481 | cgroup memory) at which a process generating dirty pages will itself start | ||
482 | writing out dirty data. | ||
483 | |||
484 | - memory.dirty_limit_in_bytes: the amount of dirty memory (expressed in bytes) | ||
485 | in the cgroup at which a process generating dirty pages will itself start | ||
486 | writing out dirty data. A suffix (k, K, m, M, g, or G) can be used to indicate | ||
487 | that the value is in kilobytes, megabytes or gigabytes. | ||
488 | |||
489 | Note: memory.dirty_limit_in_bytes is the counterpart of memory.dirty_ratio. | ||
490 | Only one of them may be specified at a time. When one is written it is | ||
491 | immediately taken into account to evaluate the dirty memory limits and the | ||
492 | other appears as 0 when read. | ||
493 | |||
494 | - memory.dirty_background_ratio: the amount of dirty memory of the cgroup | ||
495 | (expressed as a percentage of cgroup memory) at which background writeback | ||
496 | kernel threads will start writing out dirty data. | ||
497 | |||
498 | - memory.dirty_background_limit_in_bytes: the amount of dirty memory (expressed | ||
499 | in bytes) in the cgroup at which background writeback kernel threads will | ||
500 | start writing out dirty data. A suffix (k, K, m, M, g, or G) can be used to | ||
501 | indicate that the value is in kilobytes, megabytes or gigabytes. | ||
502 | |||
503 | Note: memory.dirty_background_limit_in_bytes is the counterpart of | ||
504 | memory.dirty_background_ratio. Only one of them may be specified at a time. | ||
505 | When one is written it is immediately taken into account to evaluate the dirty | ||
506 | memory limits and the other appears as 0 when read. | ||
507 | |||
508 | A cgroup may contain more dirty memory than its dirty limit. This is possible | ||
509 | because of the principle that the first cgroup to touch a page is charged for | ||
510 | it. Subsequent page counting events (dirty, writeback, nfs_unstable) are also | ||
511 | counted to the originally charged cgroup. | ||
512 | |||
513 | Example: If a page is allocated by a cgroup A task, then the page is charged to | ||
514 | cgroup A. If the page is later dirtied by a task in cgroup B, then the cgroup A | ||
515 | dirty count will be incremented. If cgroup A is at or near its dirty limit but | ||
516 | cgroup B is not, then dirtying a cgroup A page from a cgroup B task may push | ||
517 | cgroup A over its dirty limit without throttling the dirtying cgroup B task. | ||
518 | |||
519 | When use_hierarchy=0, each cgroup has dirty memory usage and limits. | ||
520 | System-wide dirty limits are also consulted. Dirty memory consumption is | ||
521 | checked against both system-wide and per-cgroup dirty limits. | ||
522 | |||
523 | The current implementation does not enforce per-cgroup dirty limits when | ||
524 | use_hierarchy=1. System-wide dirty limits are used for processes in such | ||
525 | cgroups. Attempts to read memory.dirty_* files return the system-wide | ||
526 | values. Writes to the memory.dirty_* files return an error. An enhanced | ||
527 | implementation is needed to check the chain of parents to ensure that no | ||
528 | dirty limit is exceeded. | ||
529 | |||
456 | 6. Hierarchy support | 530 | 6. Hierarchy support |
457 | 531 | ||
458 | The memory controller supports a deep hierarchy and hierarchical accounting. | 532 | The memory controller supports a deep hierarchy and hierarchical accounting. |
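The semantics that section 5.5 of the patch describes can be sketched as a small user-space model. This is an illustration only, not kernel code: parse_bytes, DirtyLimit and must_write_out are hypothetical names, the binary suffix multipliers are assumed to follow the kernel's memparse() helper, the "cgroup memory" base for the ratio is supplied by the caller, and the strict ">" comparison is an assumption.

```python
"""Toy model of the per-cgroup dirty limit semantics from section 5.5.

Illustration only: none of these names exist in the kernel.
"""

# Binary multipliers, assumed to match the kernel's memparse() helper.
_SUFFIXES = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}


def parse_bytes(text):
    """Parse values such as '4096', '512k' or '1G' into a byte count."""
    text = text.strip()
    if text and text[-1].lower() in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1].lower()]
    return int(text)


class DirtyLimit:
    """One ratio/bytes pair, e.g. dirty_ratio and dirty_limit_in_bytes."""

    def __init__(self, ratio=0, limit_in_bytes=0):
        self.ratio = ratio
        self.limit_in_bytes = limit_in_bytes

    def write_ratio(self, percent):
        if not 0 <= percent <= 100:
            raise ValueError("ratio must be a percentage")
        self.ratio = percent
        self.limit_in_bytes = 0  # the counterpart reads back as 0

    def write_bytes(self, text):
        self.limit_in_bytes = parse_bytes(text)
        self.ratio = 0  # the counterpart reads back as 0

    def threshold(self, cgroup_memory):
        """Return the dirty threshold in bytes for a cgroup of this size."""
        if self.limit_in_bytes:
            return self.limit_in_bytes
        return cgroup_memory * self.ratio // 100


def must_write_out(dirty, cgroup_limit, cgroup_memory, system_threshold):
    """use_hierarchy=0 case: consult per-cgroup and system-wide limits."""
    over_cgroup = dirty > cgroup_limit.threshold(cgroup_memory)
    over_system = dirty > system_threshold
    return over_cgroup or over_system


if __name__ == "__main__":
    limit = DirtyLimit(ratio=20)
    limit.write_bytes("64M")  # switching to bytes clears the ratio
    print(limit.ratio, limit.limit_in_bytes)
```

The model keeps the ratio and byte forms in one object so that writing either one clears the other, mirroring the "only one of them may be specified at a time" note. It does not model the first-touch charging rule or the use_hierarchy=1 case.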