author | Jens Axboe <jens.axboe@oracle.com> | 2010-05-21 15:27:26 -0400 |
---|---|---|
committer | Jens Axboe <jens.axboe@oracle.com> | 2010-05-21 15:27:26 -0400 |
commit | ee9a3607fb03e804ddf624544105f4e34260c380 (patch) | |
tree | ce41b6e0fa10982a306f6c142a92dbf3c9961284 /Documentation/cgroups | |
parent | b492e95be0ae672922f4734acf3f5d35c30be948 (diff) | |
parent | d515e86e639890b33a09390d062b0831664f04a2 (diff) |
Merge branch 'master' into for-2.6.35
Conflicts:
fs/ext3/fsync.c
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Diffstat (limited to 'Documentation/cgroups')
-rw-r--r-- | Documentation/cgroups/cgroups.txt | 2 |
-rw-r--r-- | Documentation/cgroups/cpusets.txt | 38 |
-rw-r--r-- | Documentation/cgroups/memcg_test.txt | 2 |
-rw-r--r-- | Documentation/cgroups/memory.txt | 2 |
4 files changed, 22 insertions, 22 deletions
diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index a1ca5924faff..57444c2609fc 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -572,7 +572,7 @@ void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
572 | 572 | ||
573 | Called when a task attach operation has failed after can_attach() has succeeded. | 573 | Called when a task attach operation has failed after can_attach() has succeeded. |
574 | A subsystem whose can_attach() has some side-effects should provide this | 574 | A subsystem whose can_attach() has some side-effects should provide this |
575 | function, so that the subsytem can implement a rollback. If not, not necessary. | 575 | function, so that the subsystem can implement a rollback. If not, not necessary. |
576 | This will be called only about subsystems whose can_attach() operation have | 576 | This will be called only about subsystems whose can_attach() operation have |
577 | succeeded. | 577 | succeeded. |
578 | 578 | ||
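The rollback contract described in this hunk can be sketched as follows. This is a minimal illustration in Python, not kernel code: the `Subsystem` class, its `reserved` counter, and the `attach_task()` helper are hypothetical names invented for the example. It only shows that cancel_attach() is invoked for subsystems whose can_attach() already succeeded, and for no others.

```python
from typing import List


class Subsystem:
    """Hypothetical stand-in for a cgroup subsystem; not the kernel's struct."""

    def __init__(self, name: str, allow: bool = True) -> None:
        self.name = name
        self.allow = allow
        self.reserved = 0  # models a side effect made by can_attach()

    def can_attach(self, task: str) -> bool:
        if not self.allow:
            return False
        self.reserved += 1  # side effect that a rollback must undo
        return True

    def cancel_attach(self, task: str) -> None:
        self.reserved -= 1  # undo the side effect of can_attach()


def attach_task(subsystems: List[Subsystem], task: str) -> bool:
    """Ask every subsystem in turn; roll back on the first refusal."""
    succeeded: List[Subsystem] = []
    for ss in subsystems:
        if not ss.can_attach(task):
            # Only subsystems whose can_attach() succeeded are rolled back.
            for done in succeeded:
                done.cancel_attach(task)
            return False
        succeeded.append(ss)
    return True
```

A subsystem whose can_attach() has no side effects would simply leave cancel_attach() empty in this model, which matches the "If not, not necessary" remark in the text.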
diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt
index 4160df82b3f5..51682ab2dd1a 100644
--- a/Documentation/cgroups/cpusets.txt
+++ b/Documentation/cgroups/cpusets.txt
@@ -42,7 +42,7 @@ Nodes to a set of tasks. In this document "Memory Node" refers to
42 | an on-line node that contains memory. | 42 | an on-line node that contains memory. |
43 | 43 | ||
44 | Cpusets constrain the CPU and Memory placement of tasks to only | 44 | Cpusets constrain the CPU and Memory placement of tasks to only |
45 | the resources within a tasks current cpuset. They form a nested | 45 | the resources within a task's current cpuset. They form a nested |
46 | hierarchy visible in a virtual file system. These are the essential | 46 | hierarchy visible in a virtual file system. These are the essential |
47 | hooks, beyond what is already present, required to manage dynamic | 47 | hooks, beyond what is already present, required to manage dynamic |
48 | job placement on large systems. | 48 | job placement on large systems. |
@@ -53,11 +53,11 @@ Documentation/cgroups/cgroups.txt.
53 | Requests by a task, using the sched_setaffinity(2) system call to | 53 | Requests by a task, using the sched_setaffinity(2) system call to |
54 | include CPUs in its CPU affinity mask, and using the mbind(2) and | 54 | include CPUs in its CPU affinity mask, and using the mbind(2) and |
55 | set_mempolicy(2) system calls to include Memory Nodes in its memory | 55 | set_mempolicy(2) system calls to include Memory Nodes in its memory |
56 | policy, are both filtered through that tasks cpuset, filtering out any | 56 | policy, are both filtered through that task's cpuset, filtering out any |
57 | CPUs or Memory Nodes not in that cpuset. The scheduler will not | 57 | CPUs or Memory Nodes not in that cpuset. The scheduler will not |
58 | schedule a task on a CPU that is not allowed in its cpus_allowed | 58 | schedule a task on a CPU that is not allowed in its cpus_allowed |
59 | vector, and the kernel page allocator will not allocate a page on a | 59 | vector, and the kernel page allocator will not allocate a page on a |
60 | node that is not allowed in the requesting tasks mems_allowed vector. | 60 | node that is not allowed in the requesting task's mems_allowed vector. |
61 | 61 | ||
62 | User level code may create and destroy cpusets by name in the cgroup | 62 | User level code may create and destroy cpusets by name in the cgroup |
63 | virtual file system, manage the attributes and permissions of these | 63 | virtual file system, manage the attributes and permissions of these |
@@ -121,9 +121,9 @@ Cpusets extends these two mechanisms as follows:
121 | - Each task in the system is attached to a cpuset, via a pointer | 121 | - Each task in the system is attached to a cpuset, via a pointer |
122 | in the task structure to a reference counted cgroup structure. | 122 | in the task structure to a reference counted cgroup structure. |
123 | - Calls to sched_setaffinity are filtered to just those CPUs | 123 | - Calls to sched_setaffinity are filtered to just those CPUs |
124 | allowed in that tasks cpuset. | 124 | allowed in that task's cpuset. |
125 | - Calls to mbind and set_mempolicy are filtered to just | 125 | - Calls to mbind and set_mempolicy are filtered to just |
126 | those Memory Nodes allowed in that tasks cpuset. | 126 | those Memory Nodes allowed in that task's cpuset. |
127 | - The root cpuset contains all the systems CPUs and Memory | 127 | - The root cpuset contains all the systems CPUs and Memory |
128 | Nodes. | 128 | Nodes. |
129 | - For any cpuset, one can define child cpusets containing a subset | 129 | - For any cpuset, one can define child cpusets containing a subset |
@@ -141,11 +141,11 @@ into the rest of the kernel, none in performance critical paths:
141 | - in init/main.c, to initialize the root cpuset at system boot. | 141 | - in init/main.c, to initialize the root cpuset at system boot. |
142 | - in fork and exit, to attach and detach a task from its cpuset. | 142 | - in fork and exit, to attach and detach a task from its cpuset. |
143 | - in sched_setaffinity, to mask the requested CPUs by what's | 143 | - in sched_setaffinity, to mask the requested CPUs by what's |
144 | allowed in that tasks cpuset. | 144 | allowed in that task's cpuset. |
145 | - in sched.c migrate_live_tasks(), to keep migrating tasks within | 145 | - in sched.c migrate_live_tasks(), to keep migrating tasks within |
146 | the CPUs allowed by their cpuset, if possible. | 146 | the CPUs allowed by their cpuset, if possible. |
147 | - in the mbind and set_mempolicy system calls, to mask the requested | 147 | - in the mbind and set_mempolicy system calls, to mask the requested |
148 | Memory Nodes by what's allowed in that tasks cpuset. | 148 | Memory Nodes by what's allowed in that task's cpuset. |
149 | - in page_alloc.c, to restrict memory to allowed nodes. | 149 | - in page_alloc.c, to restrict memory to allowed nodes. |
150 | - in vmscan.c, to restrict page recovery to the current cpuset. | 150 | - in vmscan.c, to restrict page recovery to the current cpuset. |
151 | 151 | ||
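The masking step named in the hook list above (requested CPUs or Memory Nodes masked by what the task's cpuset allows) amounts to a set intersection. The sketch below is illustrative only: `filter_affinity()` is a hypothetical helper, and raising an error on an empty intersection is an assumption of this example rather than something the documentation text states.

```python
from typing import Set


def filter_affinity(requested: Set[int], cpuset_cpus: Set[int]) -> Set[int]:
    """Keep only the requested CPUs that the task's cpuset allows."""
    effective = requested & cpuset_cpus
    if not effective:
        # Assumption for this sketch: an empty result is treated as an error.
        raise ValueError("no requested CPU is allowed by the task's cpuset")
    return effective
```

The same intersection applies to Memory Nodes requested through mbind and set_mempolicy, with the cpuset's allowed nodes in place of its CPUs.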
@@ -155,7 +155,7 @@ new system calls are added for cpusets - all support for querying and
155 | modifying cpusets is via this cpuset file system. | 155 | modifying cpusets is via this cpuset file system. |
156 | 156 | ||
157 | The /proc/<pid>/status file for each task has four added lines, | 157 | The /proc/<pid>/status file for each task has four added lines, |
158 | displaying the tasks cpus_allowed (on which CPUs it may be scheduled) | 158 | displaying the task's cpus_allowed (on which CPUs it may be scheduled) |
159 | and mems_allowed (on which Memory Nodes it may obtain memory), | 159 | and mems_allowed (on which Memory Nodes it may obtain memory), |
160 | in the two formats seen in the following example: | 160 | in the two formats seen in the following example: |
161 | 161 | ||
@@ -323,17 +323,17 @@ stack segment pages of a task.
323 | 323 | ||
324 | By default, both kinds of memory spreading are off, and memory | 324 | By default, both kinds of memory spreading are off, and memory |
325 | pages are allocated on the node local to where the task is running, | 325 | pages are allocated on the node local to where the task is running, |
326 | except perhaps as modified by the tasks NUMA mempolicy or cpuset | 326 | except perhaps as modified by the task's NUMA mempolicy or cpuset |
327 | configuration, so long as sufficient free memory pages are available. | 327 | configuration, so long as sufficient free memory pages are available. |
328 | 328 | ||
329 | When new cpusets are created, they inherit the memory spread settings | 329 | When new cpusets are created, they inherit the memory spread settings |
330 | of their parent. | 330 | of their parent. |
331 | 331 | ||
332 | Setting memory spreading causes allocations for the affected page | 332 | Setting memory spreading causes allocations for the affected page |
333 | or slab caches to ignore the tasks NUMA mempolicy and be spread | 333 | or slab caches to ignore the task's NUMA mempolicy and be spread |
334 | instead. Tasks using mbind() or set_mempolicy() calls to set NUMA | 334 | instead. Tasks using mbind() or set_mempolicy() calls to set NUMA |
335 | mempolicies will not notice any change in these calls as a result of | 335 | mempolicies will not notice any change in these calls as a result of |
336 | their containing tasks memory spread settings. If memory spreading | 336 | their containing task's memory spread settings. If memory spreading |
337 | is turned off, then the currently specified NUMA mempolicy once again | 337 | is turned off, then the currently specified NUMA mempolicy once again |
338 | applies to memory page allocations. | 338 | applies to memory page allocations. |
339 | 339 | ||
@@ -357,7 +357,7 @@ pages from the node returned by cpuset_mem_spread_node().
357 | 357 | ||
358 | The cpuset_mem_spread_node() routine is also simple. It uses the | 358 | The cpuset_mem_spread_node() routine is also simple. It uses the |
359 | value of a per-task rotor cpuset_mem_spread_rotor to select the next | 359 | value of a per-task rotor cpuset_mem_spread_rotor to select the next |
360 | node in the current tasks mems_allowed to prefer for the allocation. | 360 | node in the current task's mems_allowed to prefer for the allocation. |
361 | 361 | ||
362 | This memory placement policy is also known (in other contexts) as | 362 | This memory placement policy is also known (in other contexts) as |
363 | round-robin or interleave. | 363 | round-robin or interleave. |
@@ -594,7 +594,7 @@ is attached, is subtle.
594 | If a cpuset has its Memory Nodes modified, then for each task attached | 594 | If a cpuset has its Memory Nodes modified, then for each task attached |
595 | to that cpuset, the next time that the kernel attempts to allocate | 595 | to that cpuset, the next time that the kernel attempts to allocate |
596 | a page of memory for that task, the kernel will notice the change | 596 | a page of memory for that task, the kernel will notice the change |
597 | in the tasks cpuset, and update its per-task memory placement to | 597 | in the task's cpuset, and update its per-task memory placement to |
598 | remain within the new cpusets memory placement. If the task was using | 598 | remain within the new cpusets memory placement. If the task was using |
599 | mempolicy MPOL_BIND, and the nodes to which it was bound overlap with | 599 | mempolicy MPOL_BIND, and the nodes to which it was bound overlap with |
600 | its new cpuset, then the task will continue to use whatever subset | 600 | its new cpuset, then the task will continue to use whatever subset |
@@ -603,13 +603,13 @@ was using MPOL_BIND and now none of its MPOL_BIND nodes are allowed
603 | in the new cpuset, then the task will be essentially treated as if it | 603 | in the new cpuset, then the task will be essentially treated as if it |
604 | was MPOL_BIND bound to the new cpuset (even though its NUMA placement, | 604 | was MPOL_BIND bound to the new cpuset (even though its NUMA placement, |
605 | as queried by get_mempolicy(), doesn't change). If a task is moved | 605 | as queried by get_mempolicy(), doesn't change). If a task is moved |
606 | from one cpuset to another, then the kernel will adjust the tasks | 606 | from one cpuset to another, then the kernel will adjust the task's |
607 | memory placement, as above, the next time that the kernel attempts | 607 | memory placement, as above, the next time that the kernel attempts |
608 | to allocate a page of memory for that task. | 608 | to allocate a page of memory for that task. |
609 | 609 | ||
610 | If a cpuset has its 'cpuset.cpus' modified, then each task in that cpuset | 610 | If a cpuset has its 'cpuset.cpus' modified, then each task in that cpuset |
611 | will have its allowed CPU placement changed immediately. Similarly, | 611 | will have its allowed CPU placement changed immediately. Similarly, |
612 | if a tasks pid is written to another cpusets 'cpuset.tasks' file, then its | 612 | if a task's pid is written to another cpusets 'cpuset.tasks' file, then its |
613 | allowed CPU placement is changed immediately. If such a task had been | 613 | allowed CPU placement is changed immediately. If such a task had been |
614 | bound to some subset of its cpuset using the sched_setaffinity() call, | 614 | bound to some subset of its cpuset using the sched_setaffinity() call, |
615 | the task will be allowed to run on any CPU allowed in its new cpuset, | 615 | the task will be allowed to run on any CPU allowed in its new cpuset, |
@@ -626,16 +626,16 @@ cpusets memory placement policy 'cpuset.mems' subsequently changes.
626 | If the cpuset flag file 'cpuset.memory_migrate' is set true, then when | 626 | If the cpuset flag file 'cpuset.memory_migrate' is set true, then when |
627 | tasks are attached to that cpuset, any pages that task had | 627 | tasks are attached to that cpuset, any pages that task had |
628 | allocated to it on nodes in its previous cpuset are migrated | 628 | allocated to it on nodes in its previous cpuset are migrated |
629 | to the tasks new cpuset. The relative placement of the page within | 629 | to the task's new cpuset. The relative placement of the page within |
630 | the cpuset is preserved during these migration operations if possible. | 630 | the cpuset is preserved during these migration operations if possible. |
631 | For example if the page was on the second valid node of the prior cpuset | 631 | For example if the page was on the second valid node of the prior cpuset |
632 | then the page will be placed on the second valid node of the new cpuset. | 632 | then the page will be placed on the second valid node of the new cpuset. |
633 | 633 | ||
634 | Also if 'cpuset.memory_migrate' is set true, then if that cpusets | 634 | Also if 'cpuset.memory_migrate' is set true, then if that cpuset's |
635 | 'cpuset.mems' file is modified, pages allocated to tasks in that | 635 | 'cpuset.mems' file is modified, pages allocated to tasks in that |
636 | cpuset, that were on nodes in the previous setting of 'cpuset.mems', | 636 | cpuset, that were on nodes in the previous setting of 'cpuset.mems', |
637 | will be moved to nodes in the new setting of 'mems.' | 637 | will be moved to nodes in the new setting of 'mems.' |
638 | Pages that were not in the tasks prior cpuset, or in the cpusets | 638 | Pages that were not in the task's prior cpuset, or in the cpuset's |
639 | prior 'cpuset.mems' setting, will not be moved. | 639 | prior 'cpuset.mems' setting, will not be moved. |
640 | 640 | ||
641 | There is an exception to the above. If hotplug functionality is used | 641 | There is an exception to the above. If hotplug functionality is used |
@@ -655,7 +655,7 @@ There is a second exception to the above. GFP_ATOMIC requests are
655 | kernel internal allocations that must be satisfied, immediately. | 655 | kernel internal allocations that must be satisfied, immediately. |
656 | The kernel may drop some request, in rare cases even panic, if a | 656 | The kernel may drop some request, in rare cases even panic, if a |
657 | GFP_ATOMIC alloc fails. If the request cannot be satisfied within | 657 | GFP_ATOMIC alloc fails. If the request cannot be satisfied within |
658 | the current tasks cpuset, then we relax the cpuset, and look for | 658 | the current task's cpuset, then we relax the cpuset, and look for |
659 | memory anywhere we can find it. It's better to violate the cpuset | 659 | memory anywhere we can find it. It's better to violate the cpuset |
660 | than stress the kernel. | 660 | than stress the kernel. |
661 | 661 | ||
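The 'cpuset.memory_migrate' text in the hunks above says a page keeps its relative position when it is migrated: a page on the second valid node of the prior cpuset lands on the second valid node of the new one. A minimal sketch of that mapping follows. `remap_node()` is a hypothetical helper, and wrapping around when the new cpuset has fewer nodes is an assumption of this example; the text only promises the behavior "if possible".

```python
from typing import Iterable, Optional


def remap_node(node: int, old_mems: Iterable[int],
               new_mems: Iterable[int]) -> Optional[int]:
    """Map a page's node from the old cpuset to the same ordinal in the new."""
    old = sorted(set(old_mems))
    new = sorted(set(new_mems))
    if not new:
        raise ValueError("new cpuset must allow at least one memory node")
    if node not in old:
        # Pages that were not in the prior cpuset are not moved.
        return None
    position = old.index(node)
    # Assumption: wrap around if the new cpuset has fewer nodes.
    return new[position % len(new)]
```

With old nodes 4-7 and new nodes 0-3, a page on node 5 (the second valid node) maps to node 1.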
diff --git a/Documentation/cgroups/memcg_test.txt b/Documentation/cgroups/memcg_test.txt
index f7f68b2ac199..b7eececfb195 100644
--- a/Documentation/cgroups/memcg_test.txt
+++ b/Documentation/cgroups/memcg_test.txt
@@ -244,7 +244,7 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
244 | we have to check if OLDPAGE/NEWPAGE is a valid page after commit(). | 244 | we have to check if OLDPAGE/NEWPAGE is a valid page after commit(). |
245 | 245 | ||
246 | 8. LRU | 246 | 8. LRU |
247 | Each memcg has its own private LRU. Now, it's handling is under global | 247 | Each memcg has its own private LRU. Now, its handling is under global |
248 | VM's control (means that it's handled under global zone->lru_lock). | 248 | VM's control (means that it's handled under global zone->lru_lock). |
249 | Almost all routines around memcg's LRU is called by global LRU's | 249 | Almost all routines around memcg's LRU is called by global LRU's |
250 | list management functions under zone->lru_lock(). | 250 | list management functions under zone->lru_lock(). |
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 3a6aecd078ba..6cab1f29da4c 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -263,7 +263,7 @@ some of the pages cached in the cgroup (page cache pages).
263 | 263 | ||
264 | 4.2 Task migration | 264 | 4.2 Task migration |
265 | 265 | ||
266 | When a task migrates from one cgroup to another, it's charge is not | 266 | When a task migrates from one cgroup to another, its charge is not |
267 | carried forward by default. The pages allocated from the original cgroup still | 267 | carried forward by default. The pages allocated from the original cgroup still |
268 | remain charged to it, the charge is dropped when the page is freed or | 268 | remain charged to it, the charge is dropped when the page is freed or |
269 | reclaimed. | 269 | reclaimed. |
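The default charge behavior described in section 4.2 can be illustrated with a toy accounting model. Everything here is hypothetical (the `Cgroup`, `Page`, and `Task` classes and the one-unit-per-page charge); it is a sketch of the stated rule, not the memory controller's implementation.

```python
from typing import List, Optional


class Cgroup:
    """Hypothetical cgroup that counts the pages charged to it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.charge = 0


class Page:
    def __init__(self, owner: Cgroup) -> None:
        self.charged_to: Optional[Cgroup] = owner
        owner.charge += 1  # charge the cgroup the page was allocated from

    def free(self) -> None:
        # The charge is dropped only when the page is freed or reclaimed.
        if self.charged_to is not None:
            self.charged_to.charge -= 1
            self.charged_to = None


class Task:
    def __init__(self, cgroup: Cgroup) -> None:
        self.cgroup = cgroup
        self.pages: List[Page] = []

    def allocate(self) -> Page:
        page = Page(self.cgroup)
        self.pages.append(page)
        return page

    def migrate(self, dest: Cgroup) -> None:
        # Default behavior: existing charges stay with the original cgroup.
        self.cgroup = dest
```

In this model, pages allocated before migrate() keep charging the original cgroup until free() is called, while pages allocated afterwards are charged to the destination.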