diff options
author | James Morris <jmorris@namei.org> | 2010-03-30 17:39:27 -0400 |
---|---|---|
committer | James Morris <jmorris@namei.org> | 2010-03-30 17:39:27 -0400 |
commit | d25d6fa1a95f465ff1ec4458ca15e30b2c8dffec (patch) | |
tree | 7362b182dedd825fc762ef7706830837e42943af /Documentation/cgroups/memory.txt | |
parent | 225a9be24d799aa16d543c31fb09f0c9ed1d9caa (diff) | |
parent | 2eaa9cfdf33b8d7fb7aff27792192e0019ae8fc6 (diff) |
Merge branch 'master' into next
Diffstat (limited to 'Documentation/cgroups/memory.txt')
-rw-r--r-- | Documentation/cgroups/memory.txt | 82 |
1 files changed, 78 insertions, 4 deletions
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index b871f2552b45..3a6aecd078ba 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt | |||
@@ -182,6 +182,8 @@ list. | |||
182 | NOTE: Reclaim does not work for the root cgroup, since we cannot set any | 182 | NOTE: Reclaim does not work for the root cgroup, since we cannot set any |
183 | limits on the root cgroup. | 183 | limits on the root cgroup. |
184 | 184 | ||
185 | Note2: When panic_on_oom is set to "2", the whole system will panic. | ||
186 | |||
185 | 2. Locking | 187 | 2. Locking |
186 | 188 | ||
187 | The memory controller uses the following hierarchy | 189 | The memory controller uses the following hierarchy |
@@ -262,10 +264,12 @@ some of the pages cached in the cgroup (page cache pages). | |||
262 | 4.2 Task migration | 264 | 4.2 Task migration |
263 | 265 | ||
264 | When a task migrates from one cgroup to another, it's charge is not | 266 | When a task migrates from one cgroup to another, it's charge is not |
265 | carried forward. The pages allocated from the original cgroup still | 267 | carried forward by default. The pages allocated from the original cgroup still |
266 | remain charged to it, the charge is dropped when the page is freed or | 268 | remain charged to it, the charge is dropped when the page is freed or |
267 | reclaimed. | 269 | reclaimed. |
268 | 270 | ||
271 | Note: You can move charges of a task along with task migration. See 8. | ||
272 | |||
269 | 4.3 Removing a cgroup | 273 | 4.3 Removing a cgroup |
270 | 274 | ||
271 | A cgroup can be removed by rmdir, but as discussed in sections 4.1 and 4.2, a | 275 | A cgroup can be removed by rmdir, but as discussed in sections 4.1 and 4.2, a |
@@ -336,7 +340,7 @@ Note: | |||
336 | 5.3 swappiness | 340 | 5.3 swappiness |
337 | Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only. | 341 | Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only. |
338 | 342 | ||
339 | Following cgroups' swapiness can't be changed. | 343 | Following cgroups' swappiness can't be changed. |
340 | - root cgroup (uses /proc/sys/vm/swappiness). | 344 | - root cgroup (uses /proc/sys/vm/swappiness). |
341 | - a cgroup which uses hierarchy and it has child cgroup. | 345 | - a cgroup which uses hierarchy and it has child cgroup. |
342 | - a cgroup which uses hierarchy and not the root of hierarchy. | 346 | - a cgroup which uses hierarchy and not the root of hierarchy. |
@@ -377,7 +381,8 @@ The feature can be disabled by | |||
377 | NOTE1: Enabling/disabling will fail if the cgroup already has other | 381 | NOTE1: Enabling/disabling will fail if the cgroup already has other |
378 | cgroups created below it. | 382 | cgroups created below it. |
379 | 383 | ||
380 | NOTE2: This feature can be enabled/disabled per subtree. | 384 | NOTE2: When panic_on_oom is set to "2", the whole system will panic in |
385 | case of an oom event in any cgroup. | ||
381 | 386 | ||
382 | 7. Soft limits | 387 | 7. Soft limits |
383 | 388 | ||
@@ -414,7 +419,76 @@ NOTE1: Soft limits take effect over a long period of time, since they involve | |||
414 | NOTE2: It is recommended to set the soft limit always below the hard limit, | 419 | NOTE2: It is recommended to set the soft limit always below the hard limit, |
415 | otherwise the hard limit will take precedence. | 420 | otherwise the hard limit will take precedence. |
416 | 421 | ||
417 | 8. TODO | 422 | 8. Move charges at task migration |
423 | |||
424 | Users can move charges associated with a task along with task migration, that | ||
425 | is, uncharge task's pages from the old cgroup and charge them to the new cgroup. | ||
426 | This feature is not supported in !CONFIG_MMU environments because of lack of | ||
427 | page tables. | ||
428 | |||
429 | 8.1 Interface | ||
430 | |||
431 | This feature is disabled by default. It can be enabled(and disabled again) by | ||
432 | writing to memory.move_charge_at_immigrate of the destination cgroup. | ||
433 | |||
434 | If you want to enable it: | ||
435 | |||
436 | # echo (some positive value) > memory.move_charge_at_immigrate | ||
437 | |||
438 | Note: Each bits of move_charge_at_immigrate has its own meaning about what type | ||
439 | of charges should be moved. See 8.2 for details. | ||
440 | Note: Charges are moved only when you move mm->owner, IOW, a leader of a thread | ||
441 | group. | ||
442 | Note: If we cannot find enough space for the task in the destination cgroup, we | ||
443 | try to make space by reclaiming memory. Task migration may fail if we | ||
444 | cannot make enough space. | ||
445 | Note: It can take several seconds if you move charges in giga bytes order. | ||
446 | |||
447 | And if you want disable it again: | ||
448 | |||
449 | # echo 0 > memory.move_charge_at_immigrate | ||
450 | |||
451 | 8.2 Type of charges which can be move | ||
452 | |||
453 | Each bits of move_charge_at_immigrate has its own meaning about what type of | ||
454 | charges should be moved. | ||
455 | |||
456 | bit | what type of charges would be moved ? | ||
457 | -----+------------------------------------------------------------------------ | ||
458 | 0 | A charge of an anonymous page(or swap of it) used by the target task. | ||
459 | | Those pages and swaps must be used only by the target task. You must | ||
460 | | enable Swap Extension(see 2.4) to enable move of swap charges. | ||
461 | |||
462 | Note: Those pages and swaps must be charged to the old cgroup. | ||
463 | Note: More type of pages(e.g. file cache, shmem,) will be supported by other | ||
464 | bits in future. | ||
465 | |||
466 | 8.3 TODO | ||
467 | |||
468 | - Add support for other types of pages(e.g. file cache, shmem, etc.). | ||
469 | - Implement madvise(2) to let users decide the vma to be moved or not to be | ||
470 | moved. | ||
471 | - All of moving charge operations are done under cgroup_mutex. It's not good | ||
472 | behavior to hold the mutex too long, so we may need some trick. | ||
473 | |||
474 | 9. Memory thresholds | ||
475 | |||
476 | Memory controler implements memory thresholds using cgroups notification | ||
477 | API (see cgroups.txt). It allows to register multiple memory and memsw | ||
478 | thresholds and gets notifications when it crosses. | ||
479 | |||
480 | To register a threshold application need: | ||
481 | - create an eventfd using eventfd(2); | ||
482 | - open memory.usage_in_bytes or memory.memsw.usage_in_bytes; | ||
483 | - write string like "<event_fd> <memory.usage_in_bytes> <threshold>" to | ||
484 | cgroup.event_control. | ||
485 | |||
486 | Application will be notified through eventfd when memory usage crosses | ||
487 | threshold in any direction. | ||
488 | |||
489 | It's applicable for root and non-root cgroup. | ||
490 | |||
491 | 10. TODO | ||
418 | 492 | ||
419 | 1. Add support for accounting huge pages (as a separate controller) | 493 | 1. Add support for accounting huge pages (as a separate controller) |
420 | 2. Make per-cgroup scanner reclaim not-shared pages first | 494 | 2. Make per-cgroup scanner reclaim not-shared pages first |