author | Vivek Goyal <vgoyal@redhat.com> | 2010-11-15 13:37:36 -0500 |
---|---|---|
committer | Jens Axboe <jaxboe@fusionio.com> | 2010-11-15 13:37:36 -0500 |
commit | bdc85df7a8417b9893443ff5520804699416b6f3 (patch) | |
tree | 7d1158ff87d327934c0c3b0d5ec9b02ed4811de6 | |
parent | 0143832cc96d0bf78486297aad5c8fb2c2ead02a (diff) |
blk-cgroup: Allow creation of hierarchical cgroups
o Allow hierarchical cgroup creation for blkio controller
o Currently we disallow it because neither IO controller policy (throttling
  or proportional bandwidth) supports hierarchical accounting and control.
  The flip side is that the blkio controller cannot be used with libvirt,
  as libvirt creates a cgroup hierarchy deeper than one level:
<top-level-cgroup-dir>/<controller>/libvirt/qemu/<virtual-machine-groups>
o So this patch allows creation of a cgroup hierarchy, but at the backend
  everything is treated as flat. So if somebody creates a hierarchy as
  follows:

                root
                /  \
            test1  test2
              |
            test3

  CFQ and throttling will in practice treat all groups as being at the
  same level:

                    pivot
                 /  |   \   \
             root test1 test2 test3
o Once we have actual support for hierarchical accounting and control,
  we can introduce another cgroup tunable file, "blkio.use_hierarchy",
  which will be 0 by default; a user who wants to enforce hierarchical
  control can set it to 1. This way there should not be any ABI problems
  down the line.
o The only not-so-pretty part is the introduction of an extra file,
  "use_hierarchy", down the line. Kame-san had mentioned that hierarchical
  accounting is expensive in the memory controller, hence they keep it off
  by default. I suspect the same will be the case for the IO controller,
  as for each IO completion we shall have to account IO through the
  hierarchy up to the root. If so, it is probably not a bad idea to
  introduce this extra file, so that it is used only when somebody needs
  it; some people might enable hierarchy in only part of the hierarchy.
o This is basically how the memory controller also uses "use_hierarchy";
  it also allowed creation of hierarchies when actual backend support
  was not available.
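
The flat treatment described above can be sketched in userspace C. This is a minimal illustration, not kernel code: `struct io_group`, `cgroup_depth()` and `flat_share_permille()` are hypothetical names, the weights are illustrative values, and the sketch assumes every group has IO pending. The cgroup filesystem still records the nesting, but the share calculation ignores it and treats all groups as siblings under a single pivot.

```c
#include <stddef.h>

/* Hypothetical userspace model of a blkio group; not a kernel structure. */
struct io_group {
    const char *name;
    int parent;           /* index of the parent group, -1 for root */
    unsigned int weight;  /* illustrative proportional weight */
};

/* Nesting depth as the cgroup filesystem sees it (root is depth 0). */
static unsigned int cgroup_depth(const struct io_group *g, size_t i)
{
    unsigned int depth = 0;
    int p = g[i].parent;

    while (p >= 0) {
        depth++;
        p = g[p].parent;
    }
    return depth;
}

/*
 * Share of group i in parts per thousand under the flat view: the parent
 * links are ignored and every group competes as a sibling under one pivot.
 */
static unsigned int flat_share_permille(const struct io_group *g, size_t n,
                                        size_t i)
{
    unsigned long total = 0;

    for (size_t k = 0; k < n; k++)
        total += g[k].weight;
    if (total == 0)
        return 0;
    return (unsigned int)((1000UL * g[i].weight) / total);
}
```

With the four groups from the diagram, test3 sits at depth 2 in the cgroup tree, yet its share depends only on its own weight relative to the sum of all four weights, not on the weight of test1.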
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Reviewed-by: Ciju Rajan K <ciju@linux.vnet.ibm.com>
Tested-by: Ciju Rajan K <ciju@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
-rw-r--r-- | Documentation/cgroups/blkio-controller.txt | 27 |
-rw-r--r-- | block/blk-cgroup.c | 4 |
2 files changed, 27 insertions, 4 deletions
diff --git a/Documentation/cgroups/blkio-controller.txt b/Documentation/cgroups/blkio-controller.txt
index d6da611f8f63..4ed7b5ceeed2 100644
--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -89,6 +89,33 @@ Throttling/Upper Limit policy

 Limits for writes can be put using blkio.write_bps_device file.

+Hierarchical Cgroups
+====================
+- Currently none of the IO control policy supports hierarhical groups. But
+  cgroup interface does allow creation of hierarhical cgroups and internally
+  IO policies treat them as flat hierarchy.
+
+  So this patch will allow creation of cgroup hierarhcy but at the backend
+  everything will be treated as flat. So if somebody created a hierarchy like
+  as follows.
+
+                root
+                /  \
+            test1  test2
+              |
+            test3
+
+  CFQ and throttling will practically treat all groups at same level.
+
+                    pivot
+                 /  |   \   \
+             root test1 test2 test3
+
+  Down the line we can implement hierarchical accounting/control support
+  and also introduce a new cgroup file "use_hierarchy" which will control
+  whether cgroup hierarchy is viewed as flat or hierarchical by the policy..
+  This is how memory controller also has implemented the things.
+
 Various user visible config options
 ===================================
 CONFIG_BLK_CGROUP
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index b1febd0f6d2a..455768a3eb9e 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1452,10 +1452,6 @@ blkiocg_create(struct cgroup_subsys *subsys, struct cgroup *cgroup)
 		goto done;
 	}

-	/* Currently we do not support hierarchy deeper than two level (0,1) */
-	if (parent != cgroup->top_cgroup)
-		return ERR_PTR(-EPERM);
-
 	blkcg = kzalloc(sizeof(*blkcg), GFP_KERNEL);
 	if (!blkcg)
 		return ERR_PTR(-ENOMEM);
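
The effect of dropping the depth check from blkiocg_create() can be modelled outside the kernel. The sketch below is a hedged userspace approximation: `struct model_cgroup`, `create_before_patch()` and `create_after_patch()` are hypothetical stand-ins, and the early return for the root group is an assumption about the code just above the hunk (the `goto done` path), which the diff does not show in full.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cgroup; only the fields the check reads. */
struct model_cgroup {
    struct model_cgroup *parent;      /* NULL for the root group */
    struct model_cgroup *top_cgroup;  /* root of this hierarchy */
};

/* Pre-patch behaviour: only direct children of the root may be created. */
static int create_before_patch(const struct model_cgroup *cgroup)
{
    /* Assumption: the root group is handled before the depth check. */
    if (cgroup->parent == NULL)
        return 0;
    /* The check removed by this patch. */
    if (cgroup->parent != cgroup->top_cgroup)
        return -EPERM;
    return 0;
}

/* Post-patch behaviour: creation succeeds at any depth. */
static int create_after_patch(const struct model_cgroup *cgroup)
{
    (void)cgroup;
    return 0;
}
```

In this model a libvirt-style chain (root, then libvirt, then qemu) fails with -EPERM at the second level before the patch and succeeds afterwards, which matches the motivation given in the commit message.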