Diffstat (limited to 'Documentation/cgroups/cgroups.txt')
-rw-r--r--  Documentation/cgroups/cgroups.txt | 143
1 files changed, 99 insertions, 44 deletions
diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index b34823ff1646..cd67e90003c0 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -18,7 +18,8 @@ CONTENTS:
   1.2 Why are cgroups needed ?
   1.3 How are cgroups implemented ?
   1.4 What does notify_on_release do ?
-  1.5 How do I use cgroups ?
+  1.5 What does clone_children do ?
+  1.6 How do I use cgroups ?
 2. Usage Examples and Syntax
   2.1 Basic Usage
   2.2 Attaching processes
@@ -109,22 +110,22 @@ university server with various users - students, professors, system
 tasks etc. The resource planning for this server could be along the
 following lines:
 
-       CPU :           Top cpuset
+       CPU :          "Top cpuset"
                        /       \
                CPUSet1         CPUSet2
                   |               |
-               (Profs)         (Students)
+               (Professors)    (Students)
 
                In addition (system tasks) are attached to topcpuset (so
                that they can run anywhere) with a limit of 20%
 
-       Memory : Professors (50%), students (30%), system (20%)
+       Memory : Professors (50%), Students (30%), system (20%)
 
-       Disk : Prof (50%), students (30%), system (20%)
+       Disk : Professors (50%), Students (30%), system (20%)
 
        Network : WWW browsing (20%), Network File System (60%), others (20%)
                                / \
-                       Prof (15%) students (5%)
+               Professors (15%)  students (5%)
 
 Browsers like Firefox/Lynx go into the WWW network class, while (k)nfsd go
 into NFS network class.
@@ -137,11 +138,11 @@ With the ability to classify tasks differently for different resources
 the admin can easily set up a script which receives exec notifications
 and depending on who is launching the browser he can
 
-    # echo browser_pid > /mnt/<restype>/<userclass>/tasks
+    # echo browser_pid > /sys/fs/cgroup/<restype>/<userclass>/tasks
 
 With only a single hierarchy, he now would potentially have to create
 a separate cgroup for every browser launched and associate it with
-approp network and other resource class.  This may lead to
+appropriate network and other resource class.  This may lead to
 proliferation of such cgroups.
 
 Also lets say that the administrator would like to give enhanced network
@@ -152,9 +153,9 @@ apps enhanced CPU power,
 With ability to write pids directly to resource classes, it's just a
 matter of :
 
-       # echo pid > /mnt/network/<new_class>/tasks
+       # echo pid > /sys/fs/cgroup/network/<new_class>/tasks
        (after some time)
-       # echo pid > /mnt/network/<orig_class>/tasks
+       # echo pid > /sys/fs/cgroup/network/<orig_class>/tasks
 
 Without this ability, he would have to split the cgroup into
 multiple separate ones and then associate the new cgroups with the
@@ -235,7 +236,8 @@ containing the following files describing that cgroup:
  - cgroup.procs: list of tgids in the cgroup.  This list is not
    guaranteed to be sorted or free of duplicate tgids, and userspace
    should sort/uniquify the list if this property is required.
-   This is a read-only file, for now.
+   Writing a thread group id into this file moves all threads in that
+   group into this cgroup.
  - notify_on_release flag: run the release agent on exit?
  - release_agent: the path to use for release notifications (this file
    exists in the top cgroup only)
@@ -293,27 +295,39 @@ notify_on_release in the root cgroup at system boot is disabled
 value of their parents notify_on_release setting. The default value of
 a cgroup hierarchy's release_agent path is empty.
 
-1.5 How do I use cgroups ?
+1.5 What does clone_children do ?
+---------------------------------
+
+If the clone_children flag is enabled (1) in a cgroup, then all
+cgroups created beneath will call the post_clone callbacks for each
+subsystem of the newly created cgroup. Usually when this callback is
+implemented for a subsystem, it copies the values of the parent
+subsystem, this is the case for the cpuset.
+
+1.6 How do I use cgroups ?
 --------------------------
 
 To start a new job that is to be contained within a cgroup, using
 the "cpuset" cgroup subsystem, the steps are something like:
 
- 1) mkdir /dev/cgroup
- 2) mount -t cgroup -ocpuset cpuset /dev/cgroup
- 3) Create the new cgroup by doing mkdir's and write's (or echo's) in
-    the /dev/cgroup virtual file system.
- 4) Start a task that will be the "founding father" of the new job.
- 5) Attach that task to the new cgroup by writing its pid to the
-    /dev/cgroup tasks file for that cgroup.
- 6) fork, exec or clone the job tasks from this founding father task.
+ 1) mount -t tmpfs cgroup_root /sys/fs/cgroup
+ 2) mkdir /sys/fs/cgroup/cpuset
+ 3) mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
+ 4) Create the new cgroup by doing mkdir's and write's (or echo's) in
+    the /sys/fs/cgroup virtual file system.
+ 5) Start a task that will be the "founding father" of the new job.
+ 6) Attach that task to the new cgroup by writing its pid to the
+    /sys/fs/cgroup/cpuset/tasks file for that cgroup.
+ 7) fork, exec or clone the job tasks from this founding father task.
 
 For example, the following sequence of commands will setup a cgroup
 named "Charlie", containing just CPUs 2 and 3, and Memory Node 1,
 and then start a subshell 'sh' in that cgroup:
 
-  mount -t cgroup cpuset -ocpuset /dev/cgroup
-  cd /dev/cgroup
+  mount -t tmpfs cgroup_root /sys/fs/cgroup
+  mkdir /sys/fs/cgroup/cpuset
+  mount -t cgroup cpuset -ocpuset /sys/fs/cgroup/cpuset
+  cd /sys/fs/cgroup/cpuset
   mkdir Charlie
   cd Charlie
   /bin/echo 2-3 > cpuset.cpus
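
Note on the hunk above: the "Charlie" sequence can be wrapped in a small function so that the hierarchy path becomes a parameter. This is only a minimal sketch. The function name make_cpuset_cgroup and its arguments are hypothetical and not part of the patch, and it assumes the cpuset hierarchy is already mounted at the directory passed in (for example /sys/fs/cgroup/cpuset). It performs only the mkdir and the two writes.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): create a cpuset cgroup and
# populate the cpus and mems files, which must be set before tasks can attach.
# Assumes $1 is an already-mounted cpuset hierarchy.
make_cpuset_cgroup() {
    hier=$1    # e.g. /sys/fs/cgroup/cpuset
    name=$2    # e.g. Charlie
    cpus=$3    # e.g. 2-3
    mems=$4    # e.g. 1

    mkdir -p "$hier/$name" || return 1
    # /bin/echo is used, as in the document, so that write errors are reported.
    /bin/echo "$cpus" > "$hier/$name/cpuset.cpus" || return 1
    /bin/echo "$mems" > "$hier/$name/cpuset.mems" || return 1
}
```

For the example in the text this would be called as: make_cpuset_cgroup /sys/fs/cgroup/cpuset Charlie 2-3 1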
@@ -334,28 +348,41 @@ Creating, modifying, using the cgroups can be done through the cgroup
 virtual filesystem.
 
 To mount a cgroup hierarchy with all available subsystems, type:
-# mount -t cgroup xxx /dev/cgroup
+# mount -t cgroup xxx /sys/fs/cgroup
 
 The "xxx" is not interpreted by the cgroup code, but will appear in
 /proc/mounts so may be any useful identifying string that you like.
 
+Note: Some subsystems do not work without some user input first.  For instance,
+if cpusets are enabled the user will have to populate the cpus and mems files
+for each new cgroup created before that group can be used.
+
+As explained in section `1.2 Why are cgroups needed?' you should create
+different hierarchies of cgroups for each single resource or group of
+resources you want to control. Therefore, you should mount a tmpfs on
+/sys/fs/cgroup and create directories for each cgroup resource or resource
+group.
+
+# mount -t tmpfs cgroup_root /sys/fs/cgroup
+# mkdir /sys/fs/cgroup/rg1
+
 To mount a cgroup hierarchy with just the cpuset and memory
 subsystems, type:
-# mount -t cgroup -o cpuset,memory hier1 /dev/cgroup
+# mount -t cgroup -o cpuset,memory hier1 /sys/fs/cgroup/rg1
 
 To change the set of subsystems bound to a mounted hierarchy, just
 remount with different options:
-# mount -o remount,cpuset,ns hier1 /dev/cgroup
+# mount -o remount,cpuset,blkio hier1 /sys/fs/cgroup/rg1
 
-Now memory is removed from the hierarchy and ns is added.
+Now memory is removed from the hierarchy and blkio is added.
 
-Note this will add ns to the hierarchy but won't remove memory or
+Note this will add blkio to the hierarchy but won't remove memory or
 cpuset, because the new options are appended to the old ones:
-# mount -o remount,ns /dev/cgroup
+# mount -o remount,blkio /sys/fs/cgroup/rg1
 
 To Specify a hierarchy's release_agent:
 # mount -t cgroup -o cpuset,release_agent="/sbin/cpuset_release_agent" \
-  xxx /dev/cgroup
+  xxx /sys/fs/cgroup/rg1
 
 Note that specifying 'release_agent' more than once will return failure.
 
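
Note on the hunk above: the new layout is one directory per resource group under a tmpfs mounted at /sys/fs/cgroup, with a separate cgroup mount on each directory. The sketch below only prints the commands for one such hierarchy and does not run them, since the real mounts need root. The helper name print_hier_mount and its parameters are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print, but do not execute,
# the commands that set up one cgroup hierarchy under a tmpfs root.
print_hier_mount() {
    root=$1        # tmpfs mount point, e.g. /sys/fs/cgroup
    group=$2       # directory for this hierarchy, e.g. rg1
    subsystems=$3  # comma-separated subsystem list, e.g. cpuset,memory
    label=$4       # uninterpreted string shown in /proc/mounts, e.g. hier1

    echo "mkdir $root/$group"
    echo "mount -t cgroup -o $subsystems $label $root/$group"
}
```

Calling print_hier_mount /sys/fs/cgroup rg1 cpuset,memory hier1 prints the mkdir and mount lines that appear in the updated text.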
@@ -364,17 +391,17 @@ when the hierarchy consists of a single (root) cgroup. Supporting
 the ability to arbitrarily bind/unbind subsystems from an existing
 cgroup hierarchy is intended to be implemented in the future.
 
-Then under /dev/cgroup you can find a tree that corresponds to the
-tree of the cgroups in the system. For instance, /dev/cgroup
+Then under /sys/fs/cgroup/rg1 you can find a tree that corresponds to the
+tree of the cgroups in the system. For instance, /sys/fs/cgroup/rg1
 is the cgroup that holds the whole system.
 
 If you want to change the value of release_agent:
-# echo "/sbin/new_release_agent" > /dev/cgroup/release_agent
+# echo "/sbin/new_release_agent" > /sys/fs/cgroup/rg1/release_agent
 
 It can also be changed via remount.
 
-If you want to create a new cgroup under /dev/cgroup:
-# cd /dev/cgroup
+If you want to create a new cgroup under /sys/fs/cgroup/rg1:
+# cd /sys/fs/cgroup/rg1
 # mkdir my_cgroup
 
 Now you want to do something with this cgroup.
@@ -416,6 +443,20 @@ You can attach the current shell task by echoing 0:
 
 # echo 0 > tasks
 
+You can use the cgroup.procs file instead of the tasks file to move all
+threads in a threadgroup at once. Echoing the pid of any task in a
+threadgroup to cgroup.procs causes all tasks in that threadgroup to be
+be attached to the cgroup. Writing 0 to cgroup.procs moves all tasks
+in the writing task's threadgroup.
+
+Note: Since every task is always a member of exactly one cgroup in each
+mounted hierarchy, to remove a task from its current cgroup you must
+move it into a new cgroup (possibly the root cgroup) by writing to the
+new cgroup's tasks file.
+
+Note: If the ns cgroup is active, moving a process to another cgroup can
+fail.
+
 2.3 Mounting hierarchies by name
 --------------------------------
 
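
Note on the hunk above: the difference between the tasks file and the now-writable cgroup.procs file can be sketched as two small helpers. The function names are hypothetical and not part of the patch. Each takes a cgroup directory and a pid. The shell builtin echo is used deliberately: a pid of 0 refers to the writing task, so an external /bin/echo would attach its own short-lived process rather than the calling shell.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the patch).
# $1 is a cgroup directory, $2 is a pid (0 means "the task doing the write").

attach_task() {
    # Writing to "tasks" moves only the single task with this pid.
    echo "$2" > "$1/tasks"
}

attach_threadgroup() {
    # Writing to "cgroup.procs" moves every thread in that pid's
    # thread group at once.
    echo "$2" > "$1/cgroup.procs"
}
```

For example, attach_threadgroup /sys/fs/cgroup/rg1/my_cgroup 0 would move the calling shell's whole thread group, assuming that hierarchy is mounted and my_cgroup exists.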
@@ -553,7 +594,7 @@ rmdir() will fail with it. From this behavior, pre_destroy() can be
 called multiple times against a cgroup.
 
 int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
-	       struct task_struct *task, bool threadgroup)
+	       struct task_struct *task)
 (cgroup_mutex held by caller)
 
 Called prior to moving a task into a cgroup; if the subsystem
@@ -562,9 +603,14 @@ task is passed, then a successful result indicates that *any*
 unspecified task can be moved into the cgroup. Note that this isn't
 called on a fork. If this method returns 0 (success) then this should
 remain valid while the caller holds cgroup_mutex and it is ensured that either
-attach() or cancel_attach() will be called in future. If threadgroup is
-true, then a successful result indicates that all threads in the given
-thread's threadgroup can be moved together.
+attach() or cancel_attach() will be called in future.
+
+int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
+(cgroup_mutex held by caller)
+
+As can_attach, but for operations that must be run once per task to be
+attached (possibly many when using cgroup_attach_proc). Called after
+can_attach.
 
 void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 	       struct task_struct *task, bool threadgroup)
@@ -576,15 +622,24 @@ function, so that the subsystem can implement a rollback. If not, not necessary.
 This will be called only about subsystems whose can_attach() operation have
 succeeded.
 
+void pre_attach(struct cgroup *cgrp);
+(cgroup_mutex held by caller)
+
+For any non-per-thread attachment work that needs to happen before
+attach_task. Needed by cpuset.
+
 void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
-	    struct cgroup *old_cgrp, struct task_struct *task,
-	    bool threadgroup)
+	    struct cgroup *old_cgrp, struct task_struct *task)
 (cgroup_mutex held by caller)
 
 Called after the task has been attached to the cgroup, to allow any
 post-attachment activity that requires memory allocations or blocking.
-If threadgroup is true, the subsystem should take care of all threads
-in the specified thread's threadgroup. Currently does not support any
+
+void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
+(cgroup_mutex held by caller)
+
+As attach, but for operations that must be run once per task to be attached,
+like can_attach_task. Called before attach. Currently does not support any
 subsystem that might need the old_cgrp for every thread in the group.
 
 void fork(struct cgroup_subsy *ss, struct task_struct *task)
@@ -608,7 +663,7 @@ always handled well.
 void post_clone(struct cgroup_subsys *ss, struct cgroup *cgrp)
 (cgroup_mutex held by caller)
 
-Called at the end of cgroup_clone() to do any parameter
+Called during cgroup_create() to do any parameter
 initialization which might be required before a task could attach.  For
 example in cpusets, no task may attach before 'cpus' and 'mems' are set
 up.