author Tejun Heo <tj@kernel.org> 2013-01-09 11:05:11 -0500
committer Tejun Heo <tj@kernel.org> 2013-01-09 11:05:11 -0500
commit d02f7aa8dce8166dbbc515ce393912aa45e6b8a6
tree 8a3e8e54bed797bb084a83008ca47065a261a9d6 /Documentation/block
parent 41cad6ab2cb9ccb3b11546ad56b8b285e47c6279
cfq-iosched: enable full blkcg hierarchy support
With the previous two patches, all cfqg scheduling decisions are based
on vfraction and ready for hierarchy support. The only thing which
keeps the behavior flat is cfqg_flat_parent() which makes vfraction
calculation consider all non-root cfqgs children of the root cfqg.

Replace it with cfqg_parent() which returns the real parent. This
enables full blkcg hierarchy support for cfq-iosched. For example,
consider the following hierarchy.

        root
      /      \
   A:500      B:250
  /     \
 AA:500  AB:1000

For simplicity, let's say all the leaf nodes have active tasks and are
on service tree. For each leaf node, vfraction would be

 AA: ( 500 / 1500) * (500 / 750) =~ 0.2222
 AB: (1000 / 1500) * (500 / 750) =~ 0.4444
  B:                 (250 / 750) =~ 0.3333

and vdisktime will be distributed accordingly. For more detail,
please refer to Documentation/block/cfq-iosched.txt.

v2: cfq-iosched.txt updated to describe group scheduling as suggested
    by Vivek.

v3: blkio-controller.txt updated.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Diffstat (limited to 'Documentation/block')
-rw-r--r-- Documentation/block/cfq-iosched.txt 58
1 files changed, 58 insertions, 0 deletions
diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt
index d89b4fe724d7..a5eb7d19a65d 100644
--- a/Documentation/block/cfq-iosched.txt
+++ b/Documentation/block/cfq-iosched.txt
@@ -102,6 +102,64 @@ processing of request. Therefore, increasing the value can improve the
performance although this can cause the latency of some I/O to increase due
to a larger number of requests.

CFQ Group scheduling
====================

CFQ supports blkio cgroup and has "blkio." prefixed files in each
blkio cgroup directory. It is weight-based and there are four knobs
for configuration - weight[_device] and leaf_weight[_device].
Internal cgroup nodes (the ones with children) can also have tasks in
them, so the former two configure what proportion the cgroup as a
whole is entitled to at its parent's level, while the latter two
configure what proportion the tasks in the cgroup get compared to
its direct children.

Another way to think about it is to assume that each internal node has
an implicit leaf child node which hosts all of its tasks and whose
weight is configured by leaf_weight[_device]. Let's assume a blkio
hierarchy composed of five cgroups - root, A, B, AA and AB - with the
following weights where the names represent the hierarchy.

        weight leaf_weight
 root :  125    125
 A    :  500    750
 B    :  250    500
 AA   :  500    500
 AB   : 1000    500
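
The weights in the table above could be applied by writing to the
"blkio." prefixed knob files. The following is a minimal sketch, not a
supported tool: the mount point in BLKIO_ROOT and the helper names are
assumptions made for illustration, and only the blkio.weight and
blkio.leaf_weight file names come from the text above.

```python
import os

# Assumption: the blkio cgroup hierarchy is mounted here; adjust for your system.
BLKIO_ROOT = "/sys/fs/cgroup/blkio"

# cgroup path relative to the hierarchy root -> (weight, leaf_weight),
# copied from the table above. AA and AB are children of A.
EXAMPLE_WEIGHTS = {
    "A": (500, 750),
    "B": (250, 500),
    "A/AA": (500, 500),
    "A/AB": (1000, 500),
}


def set_weights(cgroup, weight, leaf_weight, root=BLKIO_ROOT):
    """Write blkio.weight and blkio.leaf_weight for one cgroup directory."""
    path = os.path.join(root, cgroup)
    # On a real cgroup filesystem, creating the directory creates the cgroup.
    os.makedirs(path, exist_ok=True)
    for knob, value in (("blkio.weight", weight),
                        ("blkio.leaf_weight", leaf_weight)):
        with open(os.path.join(path, knob), "w") as f:
            f.write(f"{value}\n")


def apply_example(root=BLKIO_ROOT):
    """Apply every row of EXAMPLE_WEIGHTS below the given hierarchy root."""
    for cgroup, (weight, leaf_weight) in EXAMPLE_WEIGHTS.items():
        set_weights(cgroup, weight, leaf_weight, root=root)
```

Calling apply_example() needs enough privilege to write to the cgroup
filesystem. The root cgroup is left out because its weight has no
effect at any parent level.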

root never has a parent, making its weight meaningless. For backward
compatibility, its weight is always kept in sync with its leaf_weight.
B, AA and AB have no children, so their tasks have no child cgroups to
compete with. They always get 100% of what the cgroup won at the
parent level. Considering only the weights which matter, the hierarchy
looks like the following.

          root
       /    |   \
      A     B    leaf
     500   250   125
   /  |  \
  AA  AB  leaf
 500 1000 750

If all cgroups have active IOs and are competing with each other, disk
time will be distributed like the following.

Distribution below root. The total active weight at this level is
A:500 + B:250 + root-leaf:125 = 875.

 root-leaf :   125 /  875      =~ 14%
 A         :   500 /  875      =~ 57%
 B(-leaf)  :   250 /  875      =~ 28%

A has children and further distributes its 57% among the children and
the implicit leaf node. The total active weight at this level is
AA:500 + AB:1000 + A-leaf:750 = 2250.

 A-leaf    : ( 750 / 2250) * A =~ 19%
 AA(-leaf) : ( 500 / 2250) * A =~ 12%
 AB(-leaf) : (1000 / 2250) * A =~ 25%
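
The arithmetic above can be reproduced with a short script. This is
only a sketch of the documented calculation, assuming every cgroup is
active. It is not the kernel's vfraction code, and the names CGROUPS,
children() and disk_shares() are made up for illustration.

```python
# name -> (parent, weight, leaf_weight); values copied from the example table.
CGROUPS = {
    "root": (None, 125, 125),
    "A": ("root", 500, 750),
    "B": ("root", 250, 500),
    "AA": ("A", 500, 500),
    "AB": ("A", 1000, 500),
}


def children(name):
    """Return the direct child cgroups of the named cgroup."""
    return [c for c, (parent, _, _) in CGROUPS.items() if parent == name]


def disk_shares(name="root", share=1.0, out=None):
    """Return {node: fraction of disk time}, assuming all cgroups are active."""
    out = {} if out is None else out
    kids = children(name)
    if not kids:
        # No child cgroups: the tasks get 100% of what the cgroup won.
        out[name] = share
        return out
    # The cgroup's own tasks compete as an implicit leaf using leaf_weight,
    # while each child competes using its weight.
    leaf_weight = CGROUPS[name][2]
    total = leaf_weight + sum(CGROUPS[k][1] for k in kids)
    out[name + "-leaf"] = share * leaf_weight / total
    for k in kids:
        disk_shares(k, share * CGROUPS[k][1] / total, out)
    return out


if __name__ == "__main__":
    for node, frac in disk_shares().items():
        print(f"{node:10s} {frac:7.2%}")
```

The shares of all leaves sum to 1, and each value agrees with the
percentages listed above before rounding; A-leaf, for instance, comes
out at roughly 19.05%.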

CFQ IOPS Mode for group scheduling
===================================
Basic CFQ design is to provide priority based time slices. Higher priority