Diffstat (limited to 'Documentation/cgroups/blkio-controller.txt')
-rw-r--r-- | Documentation/cgroups/blkio-controller.txt | 151 |
1 files changed, 133 insertions, 18 deletions
diff --git a/Documentation/cgroups/blkio-controller.txt b/Documentation/cgroups/blkio-controller.txt
index 630879cd9a42..48e0b21b0059 100644
--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -17,6 +17,9 @@ HOWTO | |||
17 | You can do a very simple testing of running two dd threads in two different | 17 | You can do a very simple testing of running two dd threads in two different |
18 | cgroups. Here is what you can do. | 18 | cgroups. Here is what you can do. |
19 | 19 | ||
20 | - Enable Block IO controller | ||
21 | CONFIG_BLK_CGROUP=y | ||
22 | |||
20 | - Enable group scheduling in CFQ | 23 | - Enable group scheduling in CFQ |
21 | CONFIG_CFQ_GROUP_IOSCHED=y | 24 | CONFIG_CFQ_GROUP_IOSCHED=y |
22 | 25 | ||
@@ -54,32 +57,52 @@ cgroups. Here is what you can do. | |||
54 | 57 | ||
55 | Various user visible config options | 58 | Various user visible config options |
56 | =================================== | 59 | =================================== |
57 | CONFIG_CFQ_GROUP_IOSCHED | ||
58 | - Enables group scheduling in CFQ. Currently only 1 level of group | ||
59 | creation is allowed. | ||
60 | |||
61 | CONFIG_DEBUG_CFQ_IOSCHED | ||
62 | - Enables some debugging messages in blktrace. Also creates extra | ||
63 | cgroup file blkio.dequeue. | ||
64 | |||
65 | Config options selected automatically | ||
66 | ===================================== | ||
67 | These config options are not user visible and are selected/deselected | ||
68 | automatically based on IO scheduler configuration. | ||
69 | |||
70 | CONFIG_BLK_CGROUP | 60 | CONFIG_BLK_CGROUP |
71 | - Block IO controller. Selected by CONFIG_CFQ_GROUP_IOSCHED. | 61 | - Block IO controller. |
72 | 62 | ||
73 | CONFIG_DEBUG_BLK_CGROUP | 63 | CONFIG_DEBUG_BLK_CGROUP |
74 | - Debug help. Selected by CONFIG_DEBUG_CFQ_IOSCHED. | 64 | - Debug help. Right now some additional stats files show up in cgroup |
65 | if this option is enabled. | ||
66 | |||
67 | CONFIG_CFQ_GROUP_IOSCHED | ||
68 | - Enables group scheduling in CFQ. Currently only 1 level of group | ||
69 | creation is allowed. | ||
75 | 70 | ||
76 | Details of cgroup files | 71 | Details of cgroup files |
77 | ======================= | 72 | ======================= |
78 | - blkio.weight | 73 | - blkio.weight |
79 | - Specifies per cgroup weight. | 74 | - Specifies per cgroup weight. This is default weight of the group |
80 | 75 | on all the devices until and unless overridden by per device rule. | |
76 | (See blkio.weight_device). | ||
81 | Currently allowed range of weights is from 100 to 1000. | 77 | Currently allowed range of weights is from 100 to 1000. |
82 | 78 | ||
79 | - blkio.weight_device | ||
80 | - One can specify per cgroup per device rules using this interface. | ||
81 | These rules override the default value of group weight as specified | ||
82 | by blkio.weight. | ||
83 | |||
84 | Following is the format. | ||
85 | |||
86 | #echo dev_maj:dev_minor weight > /path/to/cgroup/blkio.weight_device | ||
87 | Configure weight=300 on /dev/sdb (8:16) in this cgroup | ||
88 | # echo 8:16 300 > blkio.weight_device | ||
89 | # cat blkio.weight_device | ||
90 | dev weight | ||
91 | 8:16 300 | ||
92 | |||
93 | Configure weight=500 on /dev/sda (8:0) in this cgroup | ||
94 | # echo 8:0 500 > blkio.weight_device | ||
95 | # cat blkio.weight_device | ||
96 | dev weight | ||
97 | 8:0 500 | ||
98 | 8:16 300 | ||
99 | |||
100 | Remove specific weight for /dev/sda in this cgroup | ||
101 | # echo 8:0 0 > blkio.weight_device | ||
102 | # cat blkio.weight_device | ||
103 | dev weight | ||
104 | 8:16 300 | ||
105 | |||
83 | - blkio.time | 106 | - blkio.time |
84 | - disk time allocated to cgroup per device in milliseconds. First | 107 | - disk time allocated to cgroup per device in milliseconds. First |
85 | two fields specify the major and minor number of the device and | 108 | two fields specify the major and minor number of the device and |
@@ -92,13 +115,105 @@ Details of cgroup files | |||
92 | third field specifies the number of sectors transferred by the | 115 | third field specifies the number of sectors transferred by the |
93 | group to/from the device. | 116 | group to/from the device. |
94 | 117 | ||
118 | - blkio.io_service_bytes | ||
119 | - Number of bytes transferred to/from the disk by the group. These | ||
120 | are further divided by the type of operation - read or write, sync | ||
121 | or async. First two fields specify the major and minor number of the | ||
122 | device, third field specifies the operation type and the fourth field | ||
123 | specifies the number of bytes. | ||
124 | |||
125 | - blkio.io_serviced | ||
126 | - Number of IOs completed to/from the disk by the group. These | ||
127 | are further divided by the type of operation - read or write, sync | ||
128 | or async. First two fields specify the major and minor number of the | ||
129 | device, third field specifies the operation type and the fourth field | ||
130 | specifies the number of IOs. | ||
131 | |||
132 | - blkio.io_service_time | ||
133 | - Total amount of time between request dispatch and request completion | ||
134 | for the IOs done by this cgroup. This is in nanoseconds to make it | ||
135 | meaningful for flash devices too. For devices with queue depth of 1, | ||
136 | this time represents the actual service time. When queue_depth > 1, | ||
137 | that is no longer true as requests may be served out of order. This | ||
138 | may cause the service time for a given IO to include the service time | ||
139 | of multiple IOs when served out of order which may result in total | ||
140 | io_service_time > actual time elapsed. This time is further divided by | ||
141 | the type of operation - read or write, sync or async. First two fields | ||
142 | specify the major and minor number of the device, third field | ||
143 | specifies the operation type and the fourth field specifies the | ||
144 | io_service_time in ns. | ||
145 | |||
146 | - blkio.io_wait_time | ||
147 | - Total amount of time the IOs for this cgroup spent waiting in the | ||
148 | scheduler queues for service. This can be greater than the total time | ||
149 | elapsed since it is cumulative io_wait_time for all IOs. It is not a | ||
150 | measure of total time the cgroup spent waiting but rather a measure of | ||
151 | the wait_time for its individual IOs. For devices with queue_depth > 1 | ||
152 | this metric does not include the time spent waiting for service once | ||
153 | the IO is dispatched to the device but till it actually gets serviced | ||
154 | (there might be a time lag here due to re-ordering of requests by the | ||
155 | device). This is in nanoseconds to make it meaningful for flash | ||
156 | devices too. This time is further divided by the type of operation - | ||
157 | read or write, sync or async. First two fields specify the major and | ||
158 | minor number of the device, third field specifies the operation type | ||
159 | and the fourth field specifies the io_wait_time in ns. | ||
160 | |||
161 | - blkio.io_merged | ||
162 | - Total number of bios/requests merged into requests belonging to this | ||
163 | cgroup. This is further divided by the type of operation - read or | ||
164 | write, sync or async. | ||
165 | |||
166 | - blkio.io_queued | ||
167 | - Total number of requests queued up at any given instant for this | ||
168 | cgroup. This is further divided by the type of operation - read or | ||
169 | write, sync or async. | ||
170 | |||
171 | - blkio.avg_queue_size | ||
172 | - Debugging aid only enabled if CONFIG_DEBUG_BLK_CGROUP=y. | ||
173 | The average queue size for this cgroup over the entire time of this | ||
174 | cgroup's existence. Queue size samples are taken each time one of the | ||
175 | queues of this cgroup gets a timeslice. | ||
176 | |||
177 | - blkio.group_wait_time | ||
178 | - Debugging aid only enabled if CONFIG_DEBUG_BLK_CGROUP=y. | ||
179 | This is the amount of time the cgroup had to wait since it became busy | ||
180 | (i.e., went from 0 to 1 request queued) to get a timeslice for one of | ||
181 | its queues. This is different from the io_wait_time which is the | ||
182 | cumulative total of the amount of time spent by each IO in that cgroup | ||
183 | waiting in the scheduler queue. This is in nanoseconds. If this is | ||
184 | read when the cgroup is in a waiting (for timeslice) state, the stat | ||
185 | will only report the group_wait_time accumulated till the last time it | ||
186 | got a timeslice and will not include the current delta. | ||
187 | |||
188 | - blkio.empty_time | ||
189 | - Debugging aid only enabled if CONFIG_DEBUG_BLK_CGROUP=y. | ||
190 | This is the amount of time a cgroup spends without any pending | ||
191 | requests when not being served, i.e., it does not include any time | ||
192 | spent idling for one of the queues of the cgroup. This is in | ||
193 | nanoseconds. If this is read when the cgroup is in an empty state, | ||
194 | the stat will only report the empty_time accumulated till the last | ||
195 | time it had a pending request and will not include the current delta. | ||
196 | |||
197 | - blkio.idle_time | ||
198 | - Debugging aid only enabled if CONFIG_DEBUG_BLK_CGROUP=y. | ||
199 | This is the amount of time spent by the IO scheduler idling for a | ||
200 | given cgroup in anticipation of a better request than the existing ones | ||
201 | from other queues/cgroups. This is in nanoseconds. If this is read | ||
202 | when the cgroup is in an idling state, the stat will only report the | ||
203 | idle_time accumulated till the last idle period and will not include | ||
204 | the current delta. | ||
205 | |||
95 | - blkio.dequeue | 206 | - blkio.dequeue |
96 | - Debugging aid only enabled if CONFIG_DEBUG_CFQ_IOSCHED=y. This | 207 | - Debugging aid only enabled if CONFIG_DEBUG_BLK_CGROUP=y. This |
97 | gives the statistics about how many times a group was dequeued | 208 | gives the statistics about how many times a group was dequeued |
98 | from service tree of the device. First two fields specify the major | 209 | from service tree of the device. First two fields specify the major |
99 | and minor number of the device and third field specifies the number | 210 | and minor number of the device and third field specifies the number |
100 | of times a group was dequeued from a particular device. | 211 | of times a group was dequeued from a particular device. |
101 | 212 | ||
213 | - blkio.reset_stats | ||
214 | - Writing an int to this file will result in resetting all the stats | ||
215 | for that cgroup. | ||
216 | |||
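Reviewer sketch (not part of the patch): several of the stat files above share a layout of device, operation type, and value. A minimal parser might look like the following. It assumes the device appears as a single "major:minor" token, as in the blkio.weight_device output, and the operation names in the sample are illustrative only; the helper names are hypothetical.

```python
from collections import defaultdict


def parse_blkio_stat(text):
    """Parse 'major:minor operation value' lines into {(major, minor): {op: value}}."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        # ASSUMPTION: the device is printed as one 'major:minor' token.
        # Lines that do not match this shape are ignored.
        if len(fields) != 3 or ":" not in fields[0]:
            continue
        major, minor = (int(part) for part in fields[0].split(":"))
        stats[(major, minor)][fields[1]] = int(fields[2])
    return dict(stats)


def avg_service_time_ns(service_time, serviced, dev, op):
    """Mean io_service_time per completed IO for one device and operation type.

    With queue depth > 1 this can exceed the real per-IO service time,
    as the io_service_time description notes.
    """
    count = serviced.get(dev, {}).get(op, 0)
    if count == 0:
        return None  # no completed IOs, so no meaningful average
    return service_time[dev][op] / count


if __name__ == "__main__":
    # Hypothetical sample contents, not captured from a real system.
    service_time = parse_blkio_stat("8:16 Read 4000000\n8:16 Write 9000000\n")
    serviced = parse_blkio_stat("8:16 Read 4\n8:16 Write 3\n")
    print(avg_service_time_ns(service_time, serviced, (8, 16), "Read"))
```

Dividing blkio.io_service_time by blkio.io_serviced gives only a rough per-IO figure, since both counters are cumulative until blkio.reset_stats is written.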
102 | CFQ sysfs tunable | 217 | CFQ sysfs tunable |
103 | ================= | 218 | ================= |
104 | /sys/block/<disk>/queue/iosched/group_isolation | 219 | /sys/block/<disk>/queue/iosched/group_isolation |