author Linus Torvalds <torvalds@linux-foundation.org> 2011-03-24 13:16:26 -0400
committer Linus Torvalds <torvalds@linux-foundation.org> 2011-03-24 13:16:26 -0400
commit 6c5103890057b1bb781b26b7aae38d33e4c517d8
tree e6e57961dcddcb5841acb34956e70b9dc696a880 /Documentation
parent 3dab04e6978e358ad2307bca563fabd6c5d2c58b
parent 9d2e157d970a73b3f270b631828e03eb452d525e
Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block

* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
  Documentation/iostats.txt: bit-size reference etc.
  cfq-iosched: removing unnecessary think time checking
  cfq-iosched: Don't clear queue stats when preempt.
  blk-throttle: Reset group slice when limits are changed
  blk-cgroup: Only give unaccounted_time under debug
  cfq-iosched: Don't set active queue in preempt
  block: fix non-atomic access to genhd inflight structures
  block: attempt to merge with existing requests on plug flush
  block: NULL dereference on error path in __blkdev_get()
  cfq-iosched: Don't update group weights when on service tree
  fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
  block: Require subsystems to explicitly allocate bio_set integrity mempool
  jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  fs: make fsync_buffers_list() plug
  mm: make generic_writepages() use plugging
  blk-cgroup: Add unaccounted time to timeslice_used.
  block: fixup plugging stubs for !CONFIG_BLOCK
  block: remove obsolete comments for blkdev_issue_zeroout.
  blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
  ...

Fix up conflicts in fs/{aio.c,super.c}

Diffstat (limited to 'Documentation')
-rw-r--r--  Documentation/block/biodoc.txt               5
-rw-r--r--  Documentation/cgroups/blkio-controller.txt  30
-rw-r--r--  Documentation/iostats.txt                    17
3 files changed, 9 insertions, 43 deletions
diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt
index b9a83dd24732..2a7b38c832c7 100644
--- a/Documentation/block/biodoc.txt
+++ b/Documentation/block/biodoc.txt
@@ -963,11 +963,6 @@ elevator_dispatch_fn* fills the dispatch queue with ready requests.
 
 elevator_add_req_fn*		called to add a new request into the scheduler
 
-elevator_queue_empty_fn		returns true if the merge queue is empty.
-				Drivers shouldn't use this, but rather check
-				if elv_next_request is NULL (without losing the
-				request if one exists!)
-
 elevator_former_req_fn
 elevator_latter_req_fn		These return the request before or after the
 				one specified in disk sort order. Used by the
diff --git a/Documentation/cgroups/blkio-controller.txt b/Documentation/cgroups/blkio-controller.txt
index 4ed7b5ceeed2..465351d4cf85 100644
--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -140,7 +140,7 @@ Proportional weight policy files
 	- Specifies per cgroup weight. This is default weight of the group
 	  on all the devices until and unless overridden by per device rule.
 	  (See blkio.weight_device).
-	  Currently allowed range of weights is from 100 to 1000.
+	  Currently allowed range of weights is from 10 to 1000.
 
 - blkio.weight_device
 	- One can specify per cgroup per device rules using this interface.
@@ -343,34 +343,6 @@ Common files among various policies
 
 CFQ sysfs tunable
 =================
-/sys/block/<disk>/queue/iosched/group_isolation
------------------------------------------------
-
-If group_isolation=1, it provides stronger isolation between groups at the
-expense of throughput. By default group_isolation is 0. In general that
-means that if group_isolation=0, expect fairness for sequential workload
-only. Set group_isolation=1 to see fairness for random IO workload also.
-
-Generally CFQ will put random seeky workload in sync-noidle category. CFQ
-will disable idling on these queues and it does a collective idling on group
-of such queues. Generally these are slow moving queues and if there is a
-sync-noidle service tree in each group, that group gets exclusive access to
-disk for certain period. That means it will bring the throughput down if
-group does not have enough IO to drive deeper queue depths and utilize disk
-capacity to the fullest in the slice allocated to it. But the flip side is
-that even a random reader should get better latencies and overall throughput
-if there are lots of sequential readers/sync-idle workload running in the
-system.
-
-If group_isolation=0, then CFQ automatically moves all the random seeky queues
-in the root group. That means there will be no service differentiation for
-that kind of workload. This leads to better throughput as we do collective
-idling on root sync-noidle tree.
-
-By default one should run with group_isolation=0. If that is not sufficient
-and one wants stronger isolation between groups, then set group_isolation=1
-but this will come at cost of reduced throughput.
-
 /sys/block/<disk>/queue/iosched/slice_idle
 ------------------------------------------
 On a faster hardware CFQ can be slow, especially with sequential workload.
diff --git a/Documentation/iostats.txt b/Documentation/iostats.txt
index f6dece5b7014..c76c21d87e85 100644
--- a/Documentation/iostats.txt
+++ b/Documentation/iostats.txt
@@ -1,8 +1,6 @@
 I/O statistics fields
 ---------------
 
-Last modified Sep 30, 2003
-
 Since 2.4.20 (and some versions before, with patches), and 2.5.45,
 more extensive disk statistics have been introduced to help measure disk
 activity. Tools such as sar and iostat typically interpret these and do
@@ -46,11 +44,12 @@ the above example, the first field of statistics would be 446216.
 By contrast, in 2.6 if you look at /sys/block/hda/stat, you'll
 find just the eleven fields, beginning with 446216. If you look at
 /proc/diskstats, the eleven fields will be preceded by the major and
-minor device numbers, and device name. Each of these formats provide
+minor device numbers, and device name. Each of these formats provides
 eleven fields of statistics, each meaning exactly the same things.
 All fields except field 9 are cumulative since boot. Field 9 should
-go to zero as I/Os complete; all others only increase. Yes, these are
-32 bit unsigned numbers, and on a very busy or long-lived system they
+go to zero as I/Os complete; all others only increase (unless they
+overflow and wrap). Yes, these are (32-bit or 64-bit) unsigned long
+(native word size) numbers, and on a very busy or long-lived system they
 may wrap. Applications should be prepared to deal with that; unless
 your observations are measured in large numbers of minutes or hours,
 they should not wrap twice before you notice them.
@@ -96,11 +95,11 @@ introduced when changes collide, so (for instance) adding up all the
 read I/Os issued per partition should equal those made to the disks ...
 but due to the lack of locking it may only be very close.
 
-In 2.6, there are counters for each cpu, which made the lack of locking
-almost a non-issue. When the statistics are read, the per-cpu counters
-are summed (possibly overflowing the unsigned 32-bit variable they are
+In 2.6, there are counters for each CPU, which make the lack of locking
+almost a non-issue. When the statistics are read, the per-CPU counters
+are summed (possibly overflowing the unsigned long variable they are
 summed to) and the result given to the user. There is no convenient
-user interface for accessing the per-cpu counters themselves.
+user interface for accessing the per-CPU counters themselves.
 
 Disks vs Partitions
 -------------------
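
Reviewer note (not part of the patch): the iostats.txt text in this diff says the counters are unsigned long values of native word size that may wrap, and that applications should be prepared for that. The sketch below shows one way a user-space reader might compute wrap-tolerant deltas between two samples. It is a minimal illustration under stated assumptions, not code from the kernel tree: the function names, the `word_bits` parameter (defaulting to 64), and the sample line are all hypothetical, and the layout assumed is the one the text describes (major, minor, device name, then eleven fields).

```python
from typing import List, Tuple

# Illustrative line in the /proc/diskstats layout described in the text:
# major, minor, device name, then eleven statistics fields.
SAMPLE_LINE = ("   3    0   hda 446216 784926 9550688 4382310 "
               "424847 312726 5922052 19310380 0 3376340 23705160")

# Per the text, field 9 (I/Os in progress) is the only non-cumulative field.
IN_FLIGHT_FIELD = 9


def parse_diskstats_line(line: str) -> Tuple[int, int, str, List[int]]:
    """Split one line into (major, minor, name, first eleven fields)."""
    parts = line.split()
    if len(parts) < 14:
        raise ValueError("expected major, minor, name and eleven fields")
    fields = [int(p) for p in parts[3:14]]
    return int(parts[0]), int(parts[1]), parts[2], fields


def counter_delta(prev: int, curr: int, word_bits: int = 64) -> int:
    """Difference of two samples of an unsigned counter.

    Modular subtraction gives the right answer if the counter wrapped
    at most once between the samples. word_bits is an assumption about
    the kernel's unsigned long width (32 or 64).
    """
    return (curr - prev) % (1 << word_bits)


def field_deltas(prev: List[int], curr: List[int],
                 word_bits: int = 64) -> List[int]:
    """Per-field change between two samples.

    Cumulative fields get a wrap-tolerant delta; field 9 is reported
    as its current value because it is a gauge, not a counter.
    """
    result = []
    for number, (old, new) in enumerate(zip(prev, curr), start=1):
        if number == IN_FLIGHT_FIELD:
            result.append(new)
        else:
            result.append(counter_delta(old, new, word_bits))
    return result
```

The modular subtraction cannot tell one wrap from several, so it only holds under the condition the text itself gives: samples must be taken often enough that a counter does not wrap twice between them.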