field | value | date |
---|---|---|
author | Tejun Heo <tj@kernel.org> | 2015-06-18 16:54:28 -0400 |
committer | Tejun Heo <tj@kernel.org> | 2015-06-18 16:54:28 -0400 |
commit | 8a0792ef8e01f03cb43806c6a87738bde34df713 | |
tree | 8be8510a9f70cfde7c63b28e162cd967ee5010f8 /Documentation/cgroups | |
parent | 187fe84067bd377047cfcb7f2bbc7c9dc12d290c | |
cgroup: add delegation section to unified hierarchy documentation

v2: Rearranged paragraphs as suggested by Johannes Weiner.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Diffstat (limited to 'Documentation/cgroups')
-rw-r--r-- | Documentation/cgroups/unified-hierarchy.txt | 102 |
1 files changed, 84 insertions, 18 deletions
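
The common ancestor rule documented by this patch can be modelled in a few lines. The sketch below is only an illustration of the rule as the new section 4-2 words it, not the kernel's implementation. The names `common_ancestor`, `may_migrate` and `WRITABLE` are hypothetical, cgroups are represented as absolute paths relative to the hierarchy root, and the uid match between writer and target process is assumed to hold already.

```python
from pathlib import PurePosixPath


def common_ancestor(src, dst):
    """Return the deepest cgroup that contains both src and dst.

    Both arguments are absolute paths relative to the hierarchy root,
    e.g. "/C0/C00". The root itself is "/".
    """
    src_parts = PurePosixPath(src).parts
    dst_parts = PurePosixPath(dst).parts
    shared = []
    for a, b in zip(src_parts, dst_parts):
        if a != b:
            break
        shared.append(a)
    if not shared:
        raise ValueError("cgroup paths must be absolute")
    return str(PurePosixPath(*shared))


def may_migrate(writable, src, dst):
    """Model of the write check on dst's "cgroup.procs".

    `writable` is the set of cgroups whose "cgroup.procs" file the
    writer can write to. The uid match is assumed, not checked here.
    """
    return dst in writable and common_ancestor(src, dst) in writable


# Assumed layout from the example in section 4-2: C0 and C1 were
# delegated to U0, who then created C00 and C01 under C0 and C10
# under C1. U0 has no write access to the root's "cgroup.procs".
WRITABLE = {"/C0", "/C0/C00", "/C0/C01", "/C1", "/C1/C10"}

within_c0 = may_migrate(WRITABLE, "/C0/C00", "/C0/C01")   # stays under C0
across = may_migrate(WRITABLE, "/C1/C10", "/C0/C00")      # crosses C1 -> C0
```

In this model the move within C0 is allowed because the common ancestor is C0 itself, while the move from C10 to C00 is refused because the common ancestor is the root, which lies above the points of delegation. The documentation says the kernel rejects that case with -EACCES.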
diff --git a/Documentation/cgroups/unified-hierarchy.txt b/Documentation/cgroups/unified-hierarchy.txt
index eb102fb72213..86847a7647ab 100644
--- a/Documentation/cgroups/unified-hierarchy.txt
+++ b/Documentation/cgroups/unified-hierarchy.txt
@@ -17,15 +17,18 @@ CONTENTS
17 | 3. Structural Constraints | 17 | 3. Structural Constraints |
18 | 3-1. Top-down | 18 | 3-1. Top-down |
19 | 3-2. No internal tasks | 19 | 3-2. No internal tasks |
20 | 4. Other Changes | 20 | 4. Delegation |
21 | 4-1. [Un]populated Notification | 21 | 4-1. Model of delegation |
22 | 4-2. Other Core Changes | 22 | 4-2. Common ancestor rule |
23 | 4-3. Per-Controller Changes | 23 | 5. Other Changes |
24 | 4-3-1. blkio | 24 | 5-1. [Un]populated Notification |
25 | 4-3-2. cpuset | 25 | 5-2. Other Core Changes |
26 | 4-3-3. memory | 26 | 5-3. Per-Controller Changes |
27 | 5. Planned Changes | 27 | 5-3-1. blkio |
28 | 5-1. CAP for resource control | 28 | 5-3-2. cpuset |
29 | 5-3-3. memory | ||
30 | 6. Planned Changes | ||
31 | 6-1. CAP for resource control | ||
29 | 32 | ||
30 | 33 | ||
31 | 1. Background | 34 | 1. Background |
@@ -245,9 +248,72 @@ cgroup must create children and transfer all its tasks to the children
245 | before enabling controllers in its "cgroup.subtree_control" file. | 248 | before enabling controllers in its "cgroup.subtree_control" file. |
246 | 249 | ||
247 | 250 | ||
248 | 4. Other Changes | 251 | 4. Delegation |
249 | 252 | ||
250 | 4-1. [Un]populated Notification | 253 | 4-1. Model of delegation |
254 | |||
255 | A cgroup can be delegated to a less privileged user by granting write | ||
256 | access of the directory and its "cgroup.procs" file to the user. Note | ||
257 | that the resource control knobs in a given directory concern the | ||
258 | resources of the parent and thus must not be delegated along with the | ||
259 | directory. | ||
260 | |||
261 | Once delegated, the user can build sub-hierarchy under the directory, | ||
262 | organize processes as it sees fit and further distribute the resources | ||
263 | it got from the parent. The limits and other settings of all resource | ||
264 | controllers are hierarchical and regardless of what happens in the | ||
265 | delegated sub-hierarchy, nothing can escape the resource restrictions | ||
266 | imposed by the parent. | ||
267 | |||
268 | Currently, cgroup doesn't impose any restrictions on the number of | ||
269 | cgroups in or nesting depth of a delegated sub-hierarchy; however, | ||
270 | this may in the future be limited explicitly. | ||
271 | |||
272 | |||
273 | 4-2. Common ancestor rule | ||
274 | |||
275 | On the unified hierarchy, to write to a "cgroup.procs" file, in | ||
276 | addition to the usual write permission to the file and uid match, the | ||
277 | writer must also have write access to the "cgroup.procs" file of the | ||
278 | common ancestor of the source and destination cgroups. This prevents | ||
279 | delegatees from smuggling processes across disjoint sub-hierarchies. | ||
280 | |||
281 | Let's say cgroups C0 and C1 have been delegated to user U0 who created | ||
282 | C00, C01 under C0 and C10 under C1 as follows. | ||
283 | |||
284 | ~~~~~~~~~~~~~ - C0 - C00 | ||
285 | ~ cgroup ~ \ C01 | ||
286 | ~ hierarchy ~ | ||
287 | ~~~~~~~~~~~~~ - C1 - C10 | ||
288 | |||
289 | C0 and C1 are separate entities in terms of resource distribution | ||
290 | regardless of their relative positions in the hierarchy. The | ||
291 | resources the processes under C0 are entitled to are controlled by | ||
292 | C0's ancestors and may be completely different from C1. It's clear | ||
293 | that the intention of delegating C0 to U0 is allowing U0 to organize | ||
294 | the processes under C0 and further control the distribution of C0's | ||
295 | resources. | ||
296 | |||
297 | On traditional hierarchies, if a task has write access to "tasks" or | ||
298 | "cgroup.procs" file of a cgroup and its uid agrees with the target, it | ||
299 | can move the target to the cgroup. In the above example, U0 will not | ||
300 | only be able to move processes in each sub-hierarchy but also across | ||
301 | the two sub-hierarchies, effectively allowing it to violate the | ||
302 | organizational and resource restrictions implied by the hierarchical | ||
303 | structure above C0 and C1. | ||
304 | |||
305 | On the unified hierarchy, let's say U0 wants to write the pid of a | ||
306 | process which has a matching uid and is currently in C10 into | ||
307 | "C00/cgroup.procs". U0 obviously has write access to the file and | ||
308 | migration permission on the process; however, the common ancestor of | ||
309 | the source cgroup C10 and the destination cgroup C00 is above the | ||
310 | points of delegation and U0 would not have write access to its | ||
311 | "cgroup.procs" and thus be denied with -EACCES. | ||
312 | |||
313 | |||
314 | 5. Other Changes | ||
315 | |||
316 | 5-1. [Un]populated Notification | ||
251 | 317 | ||
252 | cgroup users often need a way to determine when a cgroup's | 318 | cgroup users often need a way to determine when a cgroup's |
253 | subhierarchy becomes empty so that it can be cleaned up. cgroup | 319 | subhierarchy becomes empty so that it can be cleaned up. cgroup |
@@ -289,7 +355,7 @@ supported and the interface files "release_agent" and
289 | "notify_on_release" do not exist. | 355 | "notify_on_release" do not exist. |
290 | 356 | ||
291 | 357 | ||
292 | 4-2. Other Core Changes | 358 | 5-2. Other Core Changes |
293 | 359 | ||
294 | - None of the mount options is allowed. | 360 | - None of the mount options is allowed. |
295 | 361 | ||
@@ -306,14 +372,14 @@ supported and the interface files "release_agent" and
306 | - The "cgroup.clone_children" file is removed. | 372 | - The "cgroup.clone_children" file is removed. |
307 | 373 | ||
308 | 374 | ||
309 | 4-3. Per-Controller Changes | 375 | 5-3. Per-Controller Changes |
310 | 376 | ||
311 | 4-3-1. blkio | 377 | 5-3-1. blkio |
312 | 378 | ||
313 | - blk-throttle becomes properly hierarchical. | 379 | - blk-throttle becomes properly hierarchical. |
314 | 380 | ||
315 | 381 | ||
316 | 4-3-2. cpuset | 382 | 5-3-2. cpuset |
317 | 383 | ||
318 | - Tasks are kept in empty cpusets after hotplug and take on the masks | 384 | - Tasks are kept in empty cpusets after hotplug and take on the masks |
319 | of the nearest non-empty ancestor, instead of being moved to it. | 385 | of the nearest non-empty ancestor, instead of being moved to it. |
@@ -322,7 +388,7 @@ supported and the interface files "release_agent" and
322 | masks of the nearest non-empty ancestor. | 388 | masks of the nearest non-empty ancestor. |
323 | 389 | ||
324 | 390 | ||
325 | 4-3-3. memory | 391 | 5-3-3. memory |
326 | 392 | ||
327 | - use_hierarchy is on by default and the cgroup file for the flag is | 393 | - use_hierarchy is on by default and the cgroup file for the flag is |
328 | not created. | 394 | not created. |
@@ -407,9 +473,9 @@ supported and the interface files "release_agent" and
407 | memory.low, memory.high, and memory.max will use the string "max" to | 473 | memory.low, memory.high, and memory.max will use the string "max" to |
408 | indicate and set the highest possible value. | 474 | indicate and set the highest possible value. |
409 | 475 | ||
410 | 5. Planned Changes | 476 | 6. Planned Changes |
411 | 477 | ||
412 | 5-1. CAP for resource control | 478 | 6-1. CAP for resource control |
413 | 479 | ||
414 | Unified hierarchy will require one of the capabilities(7), which is | 480 | Unified hierarchy will require one of the capabilities(7), which is |
415 | yet to be decided, for all resource control related knobs. Process | 481 | yet to be decided, for all resource control related knobs. Process |