author Linus Torvalds <torvalds@linux-foundation.org> 2010-09-10 10:26:27 -0400
committer Linus Torvalds <torvalds@linux-foundation.org> 2010-09-10 10:26:27 -0400
commit ff3cb3fec3c5bbb5110e652bbdd410bc99a47e9f
tree 5b6834a3a4ecd479d544f8cc8cd10811c1ae13e1 /Documentation
parent 6ccaa3172941c0a97c7f1c5155b1d32ecd27ec2f
parent be14eb619108fa8b7120eb2c42d66d5f623ae10e
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Range check cpu in blk_cpu_to_group
  scatterlist: prevent invalid free when alloc fails
  writeback: Fix lost wake-up shutting down writeback thread
  writeback: do not lose wakeup events when forking bdi threads
  cciss: fix reporting of max queue depth since init
  block: switch s390 tape_block and mg_disk to elevator_change()
  block: add function call to switch the IO scheduler from a driver
  fs/bio-integrity.c: return -ENOMEM on kmalloc failure
  bio-integrity.c: remove dependency on __GFP_NOFAIL
  BLOCK: fix bio.bi_rw handling
  block: put dev->kobj in blk_register_queue fail path
  cciss: handle allocation failure
  cfq-iosched: Documentation help for new tunables
  cfq-iosched: blktrace print per slice sector stats
  cfq-iosched: Implement tunable group_idle
  cfq-iosched: Do group share accounting in IOPS when slice_idle=0
  cfq-iosched: Do not idle if slice_idle=0
  cciss: disable doorbell reset on reset_devices
  blkio: Fix return code for mkdir calls
Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/block/cfq-iosched.txt         45
-rw-r--r-- Documentation/cgroups/blkio-controller.txt  28
2 files changed, 73 insertions, 0 deletions
diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt
new file mode 100644
index 000000000000..e578feed6d81
--- /dev/null
+++ b/Documentation/block/cfq-iosched.txt
@@ -0,0 +1,45 @@
+CFQ ioscheduler tunables
+========================
+
+slice_idle
+----------
+This specifies how long CFQ should idle for the next request on certain cfq
+queues (for sequential workloads) and service trees (for random workloads)
+before the queue is expired and CFQ selects the next queue to dispatch from.
+
+By default slice_idle is a non-zero value, so by default CFQ idles on
+queues/service trees. This can be very helpful on highly seeky media like
+single-spindle SATA/SAS disks, where idling cuts down on the overall number
+of seeks and improves throughput.
+
+Setting slice_idle to 0 removes all idling at the queue/service tree level,
+and one should see improved overall throughput on faster storage devices
+such as multiple SATA/SAS disks in a hardware RAID configuration. The
+downside is that the isolation provided from WRITES also goes down and the
+notion of IO priority becomes weaker.
+
+So depending on storage and workload, it might be useful to set slice_idle=0.
+In general I think for SATA/SAS disks and software RAID of SATA/SAS disks,
+keeping slice_idle enabled should be useful. For any configuration where
+there are multiple spindles behind a single LUN (host-based hardware RAID
+controllers or storage arrays), setting slice_idle=0 might result in better
+throughput and acceptable latencies.
+
+CFQ IOPS Mode for group scheduling
+===================================
+The basic CFQ design is to provide priority-based time slices. A higher
+priority process gets a bigger time slice and a lower priority process gets
+a smaller one. Measuring time becomes harder if storage is fast and supports
+NCQ, and it would be better to dispatch multiple requests from multiple cfq
+queues in the request queue at a time. In such a scenario, it is not
+possible to accurately measure the time consumed by a single queue.
+
+What is possible, though, is to measure the number of requests dispatched
+from a single queue while also allowing dispatch from multiple cfq queues at
+the same time. This effectively becomes fairness in terms of IOPS (IO
+operations per second).
+
+If one sets slice_idle=0 and the storage supports NCQ, CFQ internally
+switches to IOPS mode and starts providing fairness in terms of the number
+of requests dispatched. Note that this mode switch takes effect only for
+group scheduling. For non-cgroup users nothing should change.
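
Reader's note on the hunk above: the slice_idle rule it documents can be illustrated with a small sketch. This is not kernel code and not part of the patch. The sysfs path layout is taken from the blkio-controller.txt hunk in this same commit; the function names, the `sysfs_root` parameter, and the assumption that each tunable file holds a single integer are illustrative choices made here.

```python
from pathlib import Path


def iosched_tunable(disk, name, sysfs_root="/sys"):
    """Build the path <sysfs_root>/block/<disk>/queue/iosched/<name>.

    sysfs_root is a hypothetical parameter so the helpers can be pointed
    at a scratch directory instead of the live /sys tree.
    """
    return Path(sysfs_root) / "block" / disk / "queue" / "iosched" / name


def read_tunable(disk, name, sysfs_root="/sys"):
    """Return the tunable's value, assuming the file holds one integer."""
    return int(iosched_tunable(disk, name, sysfs_root).read_text().strip())


def write_tunable(disk, name, value, sysfs_root="/sys"):
    """Write a new value; on a live system this normally needs root."""
    iosched_tunable(disk, name, sysfs_root).write_text(f"{value}\n")


def fairness_mode(slice_idle, supports_ncq, group_scheduling):
    """Model of the documented rule, not the kernel's implementation.

    IOPS-based fairness applies only when slice_idle is 0, the storage
    supports NCQ, and group (cgroup) scheduling is in use. Otherwise CFQ
    keeps its time-slice based accounting.
    """
    if slice_idle == 0 and supports_ncq and group_scheduling:
        return "iops"
    return "time"
```

For example, `write_tunable("sda", "slice_idle", 0)` would disable queue idling on a disk named sda (a placeholder device name), and `fairness_mode(0, True, False)` returns "time" because non-cgroup users see no change.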
diff --git a/Documentation/cgroups/blkio-controller.txt b/Documentation/cgroups/blkio-controller.txt
index 48e0b21b0059..6919d62591d9 100644
--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -217,6 +217,7 @@ Details of cgroup files
 CFQ sysfs tunable
 =================
 /sys/block/<disk>/queue/iosched/group_isolation
+-----------------------------------------------
 
 If group_isolation=1, it provides stronger isolation between groups at the
 expense of throughput. By default group_isolation is 0. In general that
@@ -243,6 +244,33 @@ By default one should run with group_isolation=0. If that is not sufficient
 and one wants stronger isolation between groups, then set group_isolation=1
 but this will come at cost of reduced throughput.
 
+/sys/block/<disk>/queue/iosched/slice_idle
+------------------------------------------
+On faster hardware CFQ can be slow, especially with sequential workloads.
+This happens because CFQ idles on a single queue, and a single queue might
+not drive request queue depths deep enough to keep the storage busy. In such
+scenarios one can try setting slice_idle=0, which switches CFQ to IOPS
+(IO operations per second) mode on NCQ-supporting hardware.
+
+That means CFQ will not idle between cfq queues of a cfq group and hence will
+be able to drive a higher queue depth and achieve better throughput. It also
+means that CFQ provides fairness among groups in terms of IOPS and not in
+terms of disk time.
+
+/sys/block/<disk>/queue/iosched/group_idle
+------------------------------------------
+If one disables idling on individual cfq queues and cfq service trees by
+setting slice_idle=0, group_idle kicks in. That means CFQ will still idle
+on the group in an attempt to provide fairness among groups.
+
+By default group_idle is the same as slice_idle and does not do anything if
+slice_idle is enabled.
+
+One can experience an overall throughput drop if multiple groups have been
+created and the applications placed in those groups do not drive enough IO
+to keep the disk busy. In that case set group_idle=0; CFQ will then not idle
+on individual groups and throughput should improve.
+
 What works
 ==========
 - Currently only sync IO queues are supported. All the buffered writes are