cfq-iosched: Documentation help for new tunables

Some documentation to provide help with tunables.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
author: Vivek Goyal <vgoyal@redhat.com> 2010-08-23 06:25:29 -0400
committer: Jens Axboe <jaxboe@fusionio.com> 2010-08-23 06:25:29 -0400
commit: 6d6ac1c1a3d4f95953aa3b085e8f16692d3a7179
tree: 2524afc21ddf04bff04e23f47d18973262c0eb26 /Documentation
parent: c4e7893ebc3a5c507b53f59b9de448db20849944
2 files changed, 73 insertions, 0 deletions
diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt
new file mode 100644
index 000000000000..e578feed6d81
--- /dev/null
+++ b/Documentation/block/cfq-iosched.txt
@@ -0,0 +1,45 @@
+CFQ ioscheduler tunables
+========================
+
+slice_idle
+----------
+This specifies how long CFQ should idle for the next request on certain cfq
+queues (for sequential workloads) and service trees (for random workloads)
+before the queue is expired and CFQ selects the next queue to dispatch from.
+
+By default slice_idle is a non-zero value. That means by default we idle on
+queues/service trees. This can be very helpful on highly seeky media like
+single-spindle SATA/SAS disks, where we can cut down on the overall number
+of seeks and see improved throughput.
+
+Setting slice_idle to 0 will remove all the idling at the queue/service tree
+level and one should see an overall improved throughput on faster storage
+devices like multiple SATA/SAS disks in a hardware RAID configuration. The
+downside is that the isolation provided from WRITES also goes down and the
+notion of IO priority becomes weaker.
+
+So depending on storage and workload, it might be useful to set slice_idle=0.
+In general I think for SATA/SAS disks and software RAID of SATA/SAS disks,
+keeping slice_idle enabled should be useful. For any configuration where
+there are multiple spindles behind a single LUN (host-based hardware RAID
+controller or storage arrays), setting slice_idle=0 might result in better
+throughput and acceptable latencies.
+
+CFQ IOPS Mode for group scheduling
+===================================
+The basic CFQ design is to provide priority-based time slices. A higher
+priority process gets a bigger time slice and a lower priority process gets a
+smaller time slice. Measuring time becomes harder if storage is fast and
+supports NCQ, and it would be better to dispatch multiple requests from
+multiple cfq queues into the request queue at a time. In such a scenario, it
+is not possible to accurately measure the time consumed by a single queue.
+
+What is possible, though, is to measure the number of requests dispatched
+from a single queue and also allow dispatch from multiple cfq queues at the
+same time. This effectively becomes fairness in terms of IOPS (IO operations
+per second).
+
+If one sets slice_idle=0 and if storage supports NCQ, CFQ internally switches
+to IOPS mode and starts providing fairness in terms of number of requests
+dispatched. Note that this mode switching takes effect only for group
+scheduling. For non-cgroup users nothing should change.
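As a reading aid, the mode-switch rule stated in the cfq-iosched.txt text above can be restated as a small predicate. This is a minimal sketch in Python, not kernel code: the function name `cfq_fairness_unit` and its parameters are hypothetical, and it only encodes the three conditions the text names (slice_idle=0, NCQ support, group scheduling).

```python
def cfq_fairness_unit(slice_idle, supports_ncq, group_scheduling):
    """Return "iops" or "time" per the rule described in the text above.

    Hypothetical helper for illustration; it does not query the kernel.
    """
    # IOPS mode needs all three: idling disabled, NCQ hardware, cgroup use.
    if slice_idle == 0 and supports_ncq and group_scheduling:
        return "iops"
    # Otherwise CFQ keeps its default time-slice based fairness.
    return "time"


# The value 8 below is just an arbitrary non-zero slice_idle setting.
print(cfq_fairness_unit(0, True, True))
print(cfq_fairness_unit(8, True, True))
print(cfq_fairness_unit(0, True, False))
```

The last call reflects the note that non-cgroup users should see no change even with slice_idle=0.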
diff --git a/Documentation/cgroups/blkio-controller.txt b/Documentation/cgroups/blkio-controller.txt
index 48e0b21b0059..6919d62591d9 100644
--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -217,6 +217,7 @@ Details of cgroup files
 CFQ sysfs tunable
 =================
 /sys/block/<disk>/queue/iosched/group_isolation
+-----------------------------------------------
 If group_isolation=1, it provides stronger isolation between groups at the
 expense of throughput. By default group_isolation is 0. In general that
@@ -243,6 +244,33 @@ By default one should run with group_isolation=0. If that is not sufficient
 and one wants stronger isolation between groups, then set group_isolation=1
 but this will come at cost of reduced throughput.
+/sys/block/<disk>/queue/iosched/slice_idle
+------------------------------------------
+On faster hardware CFQ can be slow, especially with a sequential workload.
+This happens because CFQ idles on a single queue, and a single queue might
+not drive deep enough request queue depths to keep the storage busy. In such
+scenarios one can try setting slice_idle=0, which would switch CFQ to IOPS
+(IO operations per second) mode on NCQ supporting hardware.
+
+That means CFQ will not idle between cfq queues of a cfq group and hence be
+able to drive higher queue depth and achieve better throughput. That also
+means that CFQ provides fairness among groups in terms of IOPS and not in
+terms of disk time.
+
+/sys/block/<disk>/queue/iosched/group_idle
+------------------------------------------
+If one disables idling on individual cfq queues and cfq service trees by
+setting slice_idle=0, group_idle kicks in. That means CFQ will still idle
+on the group in an attempt to provide fairness among groups.
+
+By default group_idle is the same as slice_idle and does not do anything if
+slice_idle is enabled.
+
+One can experience an overall throughput drop if one has created multiple
+groups and put applications in those groups which are not driving enough
+IO to keep the disk busy. In that case set group_idle=0, and CFQ will not
+idle on individual groups and throughput should improve.
+
 What works
 ==========
 - Currently only sync IO queues are support. All the buffered writes are
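The tunables documented in this patch live under /sys/block/<disk>/queue/iosched/. A minimal Python sketch of reading and setting them follows. The helper names and the `sysfs_root` parameter are assumptions added for illustration (the parameter lets the code be tried against a scratch directory); writing the real files requires root and a disk that is using the CFQ scheduler.

```python
import os


def tunable_path(disk, name, sysfs_root="/sys"):
    """Build the path of a CFQ tunable under <root>/block/<disk>/queue/iosched/."""
    return os.path.join(sysfs_root, "block", disk, "queue", "iosched", name)


def read_tunable(disk, name, sysfs_root="/sys"):
    """Read an integer tunable such as slice_idle or group_idle."""
    with open(tunable_path(disk, name, sysfs_root)) as f:
        return int(f.read().strip())


def write_tunable(disk, name, value, sysfs_root="/sys"):
    """Write an integer tunable; on a real system this needs root."""
    with open(tunable_path(disk, name, sysfs_root), "w") as f:
        f.write(str(int(value)))


def group_idling_active(slice_idle, group_idle):
    """Per the text: group_idle only has an effect once slice_idle is 0."""
    return slice_idle == 0 and group_idle > 0
```

For example, `write_tunable("sda", "slice_idle", 0)` followed by `write_tunable("sda", "group_idle", 0)` would correspond to the advice for groups that do not drive enough IO to keep the disk busy; the disk name `sda` is only a placeholder.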