author: Vivek Goyal <vgoyal@redhat.com> 2010-08-23 06:25:29 -0400
committer: Jens Axboe <jaxboe@fusionio.com> 2010-08-23 06:25:29 -0400
commit: 6d6ac1c1a3d4f95953aa3b085e8f16692d3a7179
tree: 2524afc21ddf04bff04e23f47d18973262c0eb26 /Documentation
parent: c4e7893ebc3a5c507b53f59b9de448db20849944
cfq-iosched: Documentation help for new tunables
Some documentation to provide help with tunables.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Diffstat (limited to 'Documentation')
-rw-r--r--  Documentation/block/cfq-iosched.txt         45
-rw-r--r--  Documentation/cgroups/blkio-controller.txt  28
2 files changed, 73 insertions, 0 deletions
diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt
new file mode 100644
index 000000000000..e578feed6d81
--- /dev/null
+++ b/Documentation/block/cfq-iosched.txt
@@ -0,0 +1,45 @@
+CFQ ioscheduler tunables
+========================
+
+slice_idle
+----------
+This specifies how long CFQ should idle for the next request on certain cfq
+queues (for sequential workloads) and service trees (for random workloads)
+before the queue is expired and CFQ selects the next queue to dispatch from.
+
+By default slice_idle is a non-zero value. That means by default we idle on
+queues/service trees. This can be very helpful on highly seeky media like
+single spindle SATA/SAS disks where we can cut down on the overall number of
+seeks and see improved throughput.
+
+Setting slice_idle to 0 will remove all the idling on queues/service tree
+level and one should see an overall improved throughput on faster storage
+devices like multiple SATA/SAS disks in hardware RAID configuration. The
+downside is that the isolation provided from WRITES also goes down and the
+notion of IO priority becomes weaker.
+
+So depending on storage and workload, it might be useful to set slice_idle=0.
+In general I think for SATA/SAS disks and software RAID of SATA/SAS disks
+keeping slice_idle enabled should be useful. For any configurations where
+there are multiple spindles behind a single LUN (host-based hardware RAID
+controller or storage arrays), setting slice_idle=0 might end up in better
+throughput and acceptable latencies.
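
As an illustration of adjusting this tunable from user space, here is a
minimal Python sketch. It assumes the tunable is exposed at
/sys/block/<disk>/queue/iosched/slice_idle, that CFQ is the active scheduler
for the disk, and that the caller has permission to write to sysfs. The helper
names and the sysfs_root parameter are hypothetical and exist only for this
example.

```python
from pathlib import Path


def tunable_path(disk, name, sysfs_root="/sys"):
    """Build the path of a CFQ tunable for one disk.

    sysfs_root is a hypothetical parameter so the helpers can be
    pointed at a scratch directory instead of the real /sys.
    """
    return Path(sysfs_root) / "block" / disk / "queue" / "iosched" / name


def read_tunable(disk, name, sysfs_root="/sys"):
    """Return the current integer value of the tunable."""
    return int(tunable_path(disk, name, sysfs_root).read_text().strip())


def write_tunable(disk, name, value, sysfs_root="/sys"):
    """Write a new integer value to the tunable."""
    tunable_path(disk, name, sysfs_root).write_text(f"{value}\n")
```

For instance, write_tunable("sda", "slice_idle", 0) would disable queue and
service tree idling on a disk named sda (the disk name is an assumption), and
read_tunable("sda", "slice_idle") would read the value back.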
+
+CFQ IOPS Mode for group scheduling
+===================================
+The basic CFQ design is to provide priority based time slices. A higher
+priority process gets a bigger time slice and a lower priority process gets
+a smaller time slice. Measuring time becomes harder if storage is fast and
+supports NCQ and it would be better to dispatch multiple requests from
+multiple cfq queues in the request queue at a time. In such a scenario, it is
+not possible to accurately measure the time consumed by a single queue.
+
+What is possible though is to measure the number of requests dispatched from a
+single queue and also allow dispatch from multiple cfq queues at the same time.
+This effectively becomes the fairness in terms of IOPS (IO operations per
+second).
+
+If one sets slice_idle=0 and if storage supports NCQ, CFQ internally switches
+to IOPS mode and starts providing fairness in terms of number of requests
+dispatched. Note that this mode switching takes effect only for group
+scheduling. For non-cgroup users nothing should change.
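
The switching rule described above can be summarized as a small predicate.
This is an illustrative Python sketch of the documented behavior, not the
kernel's implementation; the function and parameter names are hypothetical.

```python
def uses_iops_mode(slice_idle, ncq_supported, group_scheduling=True):
    """Return True when the documented conditions for IOPS mode hold.

    IOPS mode applies only to group scheduling, and only when
    slice_idle is 0 and the storage supports NCQ.
    """
    return group_scheduling and slice_idle == 0 and ncq_supported


def fairness_charge(time_used, requests_dispatched, iops_mode):
    """Pick the quantity a group is charged for.

    In IOPS mode fairness is measured in requests dispatched;
    otherwise it is measured in time consumed.
    """
    return requests_dispatched if iops_mode else time_used
```

With slice_idle=0 on NCQ-capable storage, a group that used 12 units of time
to dispatch 40 requests would be charged 40 under this sketch; with idling
enabled it would be charged 12.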
diff --git a/Documentation/cgroups/blkio-controller.txt b/Documentation/cgroups/blkio-controller.txt
index 48e0b21b0059..6919d62591d9 100644
--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -217,6 +217,7 @@ Details of cgroup files
 CFQ sysfs tunable
 =================
 /sys/block/<disk>/queue/iosched/group_isolation
+-----------------------------------------------
 
 If group_isolation=1, it provides stronger isolation between groups at the
 expense of throughput. By default group_isolation is 0. In general that
@@ -243,6 +244,33 @@ By default one should run with group_isolation=0. If that is not sufficient
 and one wants stronger isolation between groups, then set group_isolation=1
 but this will come at cost of reduced throughput.
 
+/sys/block/<disk>/queue/iosched/slice_idle
+------------------------------------------
+On faster hardware CFQ can be slow, especially with sequential workloads.
+This happens because CFQ idles on a single queue and a single queue might not
+drive deeper request queue depths to keep the storage busy. In such scenarios
+one can try setting slice_idle=0 and that would switch CFQ to IOPS
+(IO operations per second) mode on NCQ supporting hardware.
+
+That means CFQ will not idle between cfq queues of a cfq group and hence be
+able to drive higher queue depth and achieve better throughput. That also
+means that cfq provides fairness among groups in terms of IOPS and not in
+terms of disk time.
+
+/sys/block/<disk>/queue/iosched/group_idle
+------------------------------------------
+If one disables idling on individual cfq queues and cfq service trees by
+setting slice_idle=0, group_idle kicks in. That means CFQ will still idle
+on the group in an attempt to provide fairness among groups.
+
+By default group_idle is the same as slice_idle and does not do anything if
+slice_idle is enabled.
+
+One can experience an overall throughput drop if one has created multiple
+groups and put applications in those groups which are not driving enough
+IO to keep the disk busy. In that case set group_idle=0, and CFQ will not
+idle on individual groups and throughput should improve.
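
The interaction between slice_idle and group_idle described above can be
sketched as a small decision function. This is an illustrative Python
summary of the documented behavior under the stated rules, not kernel code;
the function name and the returned labels are hypothetical.

```python
def idling_level(slice_idle, group_idle):
    """Describe where CFQ idles for a given pair of tunable values.

    - slice_idle > 0: idling happens on queues/service trees and
      group_idle does not do anything.
    - slice_idle == 0 and group_idle > 0: group_idle kicks in and
      CFQ idles on the group.
    - both 0: no idling on queues, service trees or groups.
    """
    if slice_idle > 0:
        return "queue"
    if group_idle > 0:
        return "group"
    return "none"
```

Under this sketch, idling_level(0, 0) corresponds to the configuration
suggested above for groups that do not drive enough IO to keep the disk busy.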
+
 What works
 ==========
 - Currently only sync IO queues are supported. All the buffered writes are