author		Linus Torvalds <torvalds@linux-foundation.org>	2011-03-24 13:16:26 -0400
committer	Linus Torvalds <torvalds@linux-foundation.org>	2011-03-24 13:16:26 -0400
commit		6c5103890057b1bb781b26b7aae38d33e4c517d8 (patch)
tree		e6e57961dcddcb5841acb34956e70b9dc696a880 /Documentation
parent		3dab04e6978e358ad2307bca563fabd6c5d2c58b (diff)
parent		9d2e157d970a73b3f270b631828e03eb452d525e (diff)
Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
Documentation/iostats.txt: bit-size reference etc.
cfq-iosched: removing unnecessary think time checking
cfq-iosched: Don't clear queue stats when preempt.
blk-throttle: Reset group slice when limits are changed
blk-cgroup: Only give unaccounted_time under debug
cfq-iosched: Don't set active queue in preempt
block: fix non-atomic access to genhd inflight structures
block: attempt to merge with existing requests on plug flush
block: NULL dereference on error path in __blkdev_get()
cfq-iosched: Don't update group weights when on service tree
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
block: Require subsystems to explicitly allocate bio_set integrity mempool
jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
fs: make fsync_buffers_list() plug
mm: make generic_writepages() use plugging
blk-cgroup: Add unaccounted time to timeslice_used.
block: fixup plugging stubs for !CONFIG_BLOCK
block: remove obsolete comments for blkdev_issue_zeroout.
blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
...
Fix up conflicts in fs/{aio.c,super.c}
Diffstat (limited to 'Documentation')
 -rw-r--r--  Documentation/block/biodoc.txt              |  5
 -rw-r--r--  Documentation/cgroups/blkio-controller.txt  | 30
 -rw-r--r--  Documentation/iostats.txt                   | 17
 3 files changed, 9 insertions, 43 deletions
diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt
index b9a83dd24732..2a7b38c832c7 100644
--- a/Documentation/block/biodoc.txt
+++ b/Documentation/block/biodoc.txt
@@ -963,11 +963,6 @@ elevator_dispatch_fn*		fills the dispatch queue with ready requests.
 
 elevator_add_req_fn*		called to add a new request into the scheduler
 
-elevator_queue_empty_fn	returns true if the merge queue is empty.
-			Drivers shouldn't use this, but rather check
-			if elv_next_request is NULL (without losing the
-			request if one exists!)
-
 elevator_former_req_fn
 elevator_latter_req_fn	These return the request before or after the
 			one specified in disk sort order. Used by the
diff --git a/Documentation/cgroups/blkio-controller.txt b/Documentation/cgroups/blkio-controller.txt
index 4ed7b5ceeed2..465351d4cf85 100644
--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -140,7 +140,7 @@ Proportional weight policy files
 	- Specifies per cgroup weight. This is default weight of the group
 	  on all the devices until and unless overridden by per device rule.
 	  (See blkio.weight_device).
-	  Currently allowed range of weights is from 100 to 1000.
+	  Currently allowed range of weights is from 10 to 1000.
 
 - blkio.weight_device
 	- One can specify per cgroup per device rules using this interface.
@@ -343,34 +343,6 @@ Common files among various policies
 
 CFQ sysfs tunable
 =================
-/sys/block/<disk>/queue/iosched/group_isolation
------------------------------------------------
-
-If group_isolation=1, it provides stronger isolation between groups at the
-expense of throughput. By default group_isolation is 0. In general that
-means that if group_isolation=0, expect fairness for sequential workload
-only. Set group_isolation=1 to see fairness for random IO workload also.
-
-Generally CFQ will put random seeky workload in sync-noidle category. CFQ
-will disable idling on these queues and it does a collective idling on group
-of such queues. Generally these are slow moving queues and if there is a
-sync-noidle service tree in each group, that group gets exclusive access to
-disk for certain period. That means it will bring the throughput down if
-group does not have enough IO to drive deeper queue depths and utilize disk
-capacity to the fullest in the slice allocated to it. But the flip side is
-that even a random reader should get better latencies and overall throughput
-if there are lots of sequential readers/sync-idle workload running in the
-system.
-
-If group_isolation=0, then CFQ automatically moves all the random seeky queues
-in the root group. That means there will be no service differentiation for
-that kind of workload. This leads to better throughput as we do collective
-idling on root sync-noidle tree.
-
-By default one should run with group_isolation=0. If that is not sufficient
-and one wants stronger isolation between groups, then set group_isolation=1
-but this will come at cost of reduced throughput.
-
 /sys/block/<disk>/queue/iosched/slice_idle
 ------------------------------------------
 On a faster hardware CFQ can be slow, especially with sequential workload.
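The weight-range change in the hunk above (minimum lowered from 100 to 10) is exercised through the cgroup files the document names. A minimal configuration sketch, assuming the blkio controller is mounted at /sys/fs/cgroup/blkio and a child group "slow" already exists; 8:16 is a hypothetical device number (sdb):

```shell
# Assumed paths: a mounted blkio cgroup hierarchy with an existing
# child group "slow". Weights may now range from 10 to 1000.
CG=/sys/fs/cgroup/blkio/slow
echo 10 > "$CG/blkio.weight"              # valid only after this change
echo "8:16 100" > "$CG/blkio.weight_device"  # per-device override, same range
```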
diff --git a/Documentation/iostats.txt b/Documentation/iostats.txt
index f6dece5b7014..c76c21d87e85 100644
--- a/Documentation/iostats.txt
+++ b/Documentation/iostats.txt
@@ -1,8 +1,6 @@
 I/O statistics fields
 ---------------
 
-Last modified Sep 30, 2003
-
 Since 2.4.20 (and some versions before, with patches), and 2.5.45,
 more extensive disk statistics have been introduced to help measure disk
 activity. Tools such as sar and iostat typically interpret these and do
@@ -46,11 +44,12 @@ the above example, the first field of statistics would be 446216.
 By contrast, in 2.6 if you look at /sys/block/hda/stat, you'll
 find just the eleven fields, beginning with 446216. If you look at
 /proc/diskstats, the eleven fields will be preceded by the major and
-minor device numbers, and device name. Each of these formats provide
+minor device numbers, and device name. Each of these formats provides
 eleven fields of statistics, each meaning exactly the same things.
 All fields except field 9 are cumulative since boot. Field 9 should
-go to zero as I/Os complete; all others only increase. Yes, these are
-32 bit unsigned numbers, and on a very busy or long-lived system they
+go to zero as I/Os complete; all others only increase (unless they
+overflow and wrap). Yes, these are (32-bit or 64-bit) unsigned long
+(native word size) numbers, and on a very busy or long-lived system they
 may wrap. Applications should be prepared to deal with that; unless
 your observations are measured in large numbers of minutes or hours,
 they should not wrap twice before you notice them.
@@ -96,11 +95,11 @@ introduced when changes collide, so (for instance) adding up all the
 read I/Os issued per partition should equal those made to the disks ...
 but due to the lack of locking it may only be very close.
 
-In 2.6, there are counters for each cpu, which made the lack of locking
-almost a non-issue. When the statistics are read, the per-cpu counters
-are summed (possibly overflowing the unsigned 32-bit variable they are
-summed to) and the result given to the user. There is no convenient
-user interface for accessing the per-cpu counters themselves.
+In 2.6, there are counters for each CPU, which make the lack of locking
+almost a non-issue. When the statistics are read, the per-CPU counters
+are summed (possibly overflowing the unsigned long variable they are
+summed to) and the result given to the user. There is no convenient
+user interface for accessing the per-CPU counters themselves.
 
 Disks vs Partitions
 -------------------