Diffstat (limited to 'Documentation/admin-guide/device-mapper/statistics.rst')
-rw-r--r-- | Documentation/admin-guide/device-mapper/statistics.rst | 225 |
1 files changed, 225 insertions, 0 deletions
diff --git a/Documentation/admin-guide/device-mapper/statistics.rst b/Documentation/admin-guide/device-mapper/statistics.rst
new file mode 100644
index 000000000000..41ded0bc5933
--- /dev/null
+++ b/Documentation/admin-guide/device-mapper/statistics.rst
@@ -0,0 +1,225 @@
1 | ============= | ||
2 | DM statistics | ||
3 | ============= | ||
4 | |||
5 | Device Mapper supports the collection of I/O statistics on user-defined | ||
6 | regions of a DM device. If no regions are defined, no statistics are | ||
7 | collected, so there is no performance impact. Only bio-based DM | ||
8 | devices are currently supported. | ||
9 | |||
10 | Each user-defined region specifies a starting sector, length and step. | ||
11 | Individual statistics will be collected for each step-sized area within | ||
12 | the range specified. | ||
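The relationship between range length, step and the number of areas can be sketched with shell arithmetic. This is an illustration only: it assumes the kernel rounds up, so that a trailing partial area still gets its own counters, and the helper names `area_count` and `area_size_for` are hypothetical.

```shell
#!/bin/sh
# Number of areas for a fixed <area_size> step: ceil(length / area_size).
# Assumption: a trailing partial area is counted as one more area.
area_count() {
    length=$1
    area_size=$2
    echo $(( (length + area_size - 1) / area_size ))
}

# Area size for a "/<number_of_areas>" step: ceil(length / areas),
# never smaller than one sector (also an assumption of this sketch).
area_size_for() {
    length=$1
    areas=$2
    size=$(( (length + areas - 1) / areas ))
    [ "$size" -gt 0 ] || size=1
    echo "$size"
}

area_count 1000 300     # prints 4 (300 + 300 + 300 + 100 sectors)
area_size_for 1000 3    # prints 334
```

Both helpers take sizes in 512-byte sectors, the unit used by the range and step arguments.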
13 | |||
14 | The I/O statistics counters for each step-sized area of a region are | ||
15 | in the same format as `/sys/block/*/stat` or `/proc/diskstats` (see: | ||
16 | Documentation/admin-guide/iostats.rst). In addition, two extra counters | ||
17 | (12 and 13) are provided: total time spent reading and writing. When the | ||
18 | histogram argument is used, a 14th field is also reported that holds the | ||
19 | histogram of latencies. All these counters may be accessed by sending | ||
20 | the @stats_print message to the appropriate DM device via dmsetup. | ||
21 | |||
22 | The reported times are in milliseconds and the granularity depends on | ||
23 | the kernel ticks. When the option precise_timestamps is used, the | ||
24 | reported times are in nanoseconds. | ||
25 | |||
26 | Each region has a corresponding unique identifier, which we call a | ||
27 | region_id, that is assigned when the region is created. The region_id | ||
28 | must be supplied when querying statistics about the region, deleting the | ||
29 | region, etc. Unique region_ids enable multiple userspace programs to | ||
30 | request and process statistics for the same DM device without stepping | ||
31 | on each other's data. | ||
32 | |||
33 | The creation of DM statistics will allocate memory via kmalloc or | ||
34 | fall back to using vmalloc space. At most, 1/4 of the overall system | ||
35 | memory may be allocated by DM statistics. The admin can see how much | ||
36 | memory is used by reading: | ||
37 | |||
38 | /sys/module/dm_mod/parameters/stats_current_allocated_bytes | ||
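A minimal sketch of reading that parameter from a script. The function name `dm_stats_allocated_bytes` is hypothetical, and the optional path argument exists only so the helper can be exercised on a system where dm_mod is not loaded.

```shell
#!/bin/sh
# Print the number of bytes currently allocated by DM statistics.
# $1 (optional): path to read instead of the default module parameter.
dm_stats_allocated_bytes() {
    param=${1:-/sys/module/dm_mod/parameters/stats_current_allocated_bytes}
    # Return non-zero without output if the parameter is not readable.
    [ -r "$param" ] || return 1
    cat "$param"
}

dm_stats_allocated_bytes || echo "dm_mod parameter not readable on this system"
```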
39 | |||
40 | Messages | ||
41 | ======== | ||
42 | |||
43 | @stats_create <range> <step> [<number_of_optional_arguments> <optional_arguments>...] [<program_id> [<aux_data>]] | ||
44 | Create a new region and return the region_id. | ||
45 | |||
46 | <range> | ||
47 | "-" | ||
48 | whole device | ||
49 | "<start_sector>+<length>" | ||
50 | a range of <length> 512-byte sectors | ||
51 | starting with <start_sector>. | ||
52 | |||
53 | <step> | ||
54 | "<area_size>" | ||
55 | the range is subdivided into areas each containing | ||
56 | <area_size> sectors. | ||
57 | "/<number_of_areas>" | ||
58 | the range is subdivided into the specified | ||
59 | number of areas. | ||
60 | |||
61 | <number_of_optional_arguments> | ||
62 | The number of optional arguments | ||
63 | |||
64 | <optional_arguments> | ||
65 | The following optional arguments are supported: | ||
66 | |||
67 | precise_timestamps | ||
68 | use precise timer with nanosecond resolution | ||
69 | instead of the "jiffies" variable. When this argument is | ||
70 | used, the resulting times are in nanoseconds instead of | ||
71 | milliseconds. Precise timestamps are a little bit slower | ||
72 | to obtain than jiffies-based timestamps. | ||
73 | histogram:n1,n2,n3,n4,... | ||
74 | collect histogram of latencies. The | ||
75 | numbers n1, n2, etc are times that represent the boundaries | ||
76 | of the histogram. If precise_timestamps is not used, the | ||
77 | times are in milliseconds, otherwise they are in | ||
78 | nanoseconds. For each range, the kernel will report the | ||
79 | number of requests that completed within this range. For | ||
80 | example, if we use "histogram:10,20,30", the kernel will | ||
81 | report four numbers a:b:c:d. a is the number of requests | ||
82 | that took 0-10 ms to complete, b is the number of requests | ||
83 | that took 10-20 ms to complete, c is the number of requests | ||
84 | that took 20-30 ms to complete and d is the number of | ||
85 | requests that took more than 30 ms to complete. | ||
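The bucket layout described above can be made concrete with a small helper that pairs each reported count with its latency range. This is a sketch: `label_histogram` is a hypothetical name, and the counts `5:3:1:0` are made-up sample values, not real kernel output.

```shell
#!/bin/sh
# Pair each count in an "a:b:c:d" histogram with its latency range.
# $1: boundaries as passed to histogram: (for example "10,20,30")
# $2: counts as reported for one area   (for example "5:3:1:0")
label_histogram() {
    printf '%s\n' "$2" | awk -v bounds="$1" -F: '
    {
        n = split(bounds, b, ",")
        lo = 0
        for (i = 1; i <= NF; i++) {
            if (i <= n) {
                # Bucket between the previous boundary and this one.
                printf "%s-%s: %s\n", lo, b[i], $i
                lo = b[i]
            } else {
                # Final bucket: everything above the last boundary.
                printf ">%s: %s\n", lo, $i
            }
        }
    }'
}

label_histogram "10,20,30" "5:3:1:0"
```

The boundaries are in milliseconds unless precise_timestamps was also given, in which case they are in nanoseconds. A region with both options could be requested with `dmsetup message vol 0 @stats_create - /100 2 precise_timestamps histogram:10000,20000,30000`, where the boundary values are illustrative.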
86 | |||
87 | <program_id> | ||
88 | An optional parameter. A name that uniquely identifies | ||
89 | the userspace owner of the range. This groups ranges together | ||
90 | so that userspace programs can identify the ranges they | ||
91 | created and ignore those created by others. | ||
92 | The kernel returns this string in the output of the | ||
93 | @stats_list message, but it doesn't use it for anything else. | ||
94 | If the number of optional arguments is omitted, program_id must not | ||
95 | be a number, otherwise it would be interpreted as the number of | ||
96 | optional arguments. | ||
97 | |||
98 | <aux_data> | ||
99 | An optional parameter. A word that provides auxiliary data | ||
100 | that is useful to the client program that created the range. | ||
101 | The kernel returns this string in the output of the | ||
102 | @stats_list message, but it doesn't use this value for anything. | ||
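The argument order of @stats_create can be illustrated by a helper that only assembles the message text. This is a sketch under assumptions: `build_stats_create` is a hypothetical name, and it always emits the optional-argument count (assuming a count of 0 is accepted) so that a numeric program_id cannot be mistaken for that count.

```shell
#!/bin/sh
# Assemble (but do not send) a @stats_create message.
# $1: range, $2: step, $3: program_id, $4: aux_data,
# remaining arguments: optional arguments such as precise_timestamps.
build_stats_create() {
    range=$1
    step=$2
    program_id=$3
    aux_data=$4
    shift 4
    # "$#" is now the number of optional arguments that follow.
    msg="@stats_create $range $step $#"
    for arg in "$@"; do
        msg="$msg $arg"
    done
    echo "$msg $program_id $aux_data"
}

build_stats_create - /100 myprog meta precise_timestamps histogram:10,20,30
```

The resulting words could then be passed to dmsetup, for example `dmsetup message vol 0 $(build_stats_create - /100 myprog meta precise_timestamps)`; `myprog` and `meta` are placeholder values.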
103 | |||
104 | @stats_delete <region_id> | ||
105 | Delete the region with the specified id. | ||
106 | |||
107 | <region_id> | ||
108 | region_id returned from @stats_create | ||
109 | |||
110 | @stats_clear <region_id> | ||
111 | Clear all the counters except the in-flight I/O counters. | ||
112 | |||
113 | <region_id> | ||
114 | region_id returned from @stats_create | ||
115 | |||
116 | @stats_list [<program_id>] | ||
117 | List all regions registered with @stats_create. | ||
118 | |||
119 | <program_id> | ||
120 | An optional parameter. | ||
121 | If this parameter is specified, only matching regions | ||
122 | are returned. | ||
123 | If it is not specified, all regions are returned. | ||
124 | |||
125 | Output format: | ||
126 | <region_id>: <start_sector>+<length> <step> <program_id> <aux_data> | ||
127 | precise_timestamps histogram:n1,n2,n3,... | ||
128 | |||
129 | The strings "precise_timestamps" and "histogram" are printed only | ||
130 | if they were specified when creating the region. | ||
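Because every line of the @stats_list output starts with `<region_id>:`, a client can recover its region ids with a simple filter. The sketch below assumes that layout; `region_ids` is a hypothetical helper and the two sample lines are made up for illustration.

```shell
#!/bin/sh
# Read @stats_list output on stdin and print one region_id per line.
region_ids() {
    sed -n 's/^\([0-9][0-9]*\):.*/\1/p'
}

# Hypothetical @stats_list output for two regions.
printf '%s\n' \
    '0: 0+2048 512 myprog meta' \
    '1: 0+2048 1024 myprog - precise_timestamps' | region_ids

# Possible use against a real device (not run here):
#   dmsetup message vol 0 @stats_list myprog | region_ids |
#       while read -r id; do dmsetup message vol 0 @stats_delete "$id"; done
```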
131 | |||
132 | @stats_print <region_id> [<starting_line> <number_of_lines>] | ||
133 | Print counters for each step-sized area of a region. | ||
134 | |||
135 | <region_id> | ||
136 | region_id returned from @stats_create | ||
137 | |||
138 | <starting_line> | ||
139 | The index of the starting line in the output. | ||
140 | If omitted, all lines are returned. | ||
141 | |||
142 | <number_of_lines> | ||
143 | The number of lines to include in the output. | ||
144 | If omitted, all lines are returned. | ||
145 | |||
146 | Output format for each step-sized area of a region: | ||
147 | |||
148 | <start_sector>+<length> | ||
149 | counters | ||
150 | |||
151 | The first 11 counters have the same meaning as | ||
152 | `/sys/block/*/stat` or `/proc/diskstats`. | ||
153 | |||
154 | Please refer to Documentation/admin-guide/iostats.rst for details. | ||
155 | |||
156 | 1. the number of reads completed | ||
157 | 2. the number of reads merged | ||
158 | 3. the number of sectors read | ||
159 | 4. the number of milliseconds spent reading | ||
160 | 5. the number of writes completed | ||
161 | 6. the number of writes merged | ||
162 | 7. the number of sectors written | ||
163 | 8. the number of milliseconds spent writing | ||
164 | 9. the number of I/Os currently in progress | ||
165 | 10. the number of milliseconds spent doing I/Os | ||
166 | 11. the weighted number of milliseconds spent doing I/Os | ||
167 | |||
168 | Additional counters: | ||
169 | |||
170 | 12. the total time spent reading in milliseconds | ||
171 | 13. the total time spent writing in milliseconds | ||
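Since each output line is the area followed by the counters in order, counter N sits in whitespace-separated field N+1. A minimal sketch of picking out a few counters per area follows; `summarize_areas` is a hypothetical helper and the sample line is invented, not captured from a device.

```shell
#!/bin/sh
# Read @stats_print output on stdin and print, for each area,
# reads completed (counter 1), writes completed (counter 5)
# and I/Os currently in progress (counter 9).
summarize_areas() {
    awk '{ printf "%s reads=%s writes=%s in_flight=%s\n", $1, $2, $6, $10 }'
}

# Hypothetical line: area 0+512 followed by counters 1 to 13.
echo '0+512 10 0 80 4 5 0 40 6 0 9 10 4 6' | summarize_areas
```

On a real device the input would come from something like `dmsetup message vol 0 @stats_print 0 | summarize_areas`.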
172 | |||
173 | @stats_print_clear <region_id> [<starting_line> <number_of_lines>] | ||
174 | Atomically print and then clear all the counters except the | ||
175 | in-flight I/O counters. Useful when the client consuming the | ||
176 | statistics does not want to lose any statistics (those updated | ||
177 | between printing and clearing). | ||
178 | |||
179 | <region_id> | ||
180 | region_id returned from @stats_create | ||
181 | |||
182 | <starting_line> | ||
183 | The index of the starting line in the output. | ||
184 | If omitted, all lines are printed and then cleared. | ||
185 | |||
186 | <number_of_lines> | ||
187 | The number of lines to process. | ||
188 | If omitted, all lines are printed and then cleared. | ||
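One way a client might use @stats_print_clear is interval sampling: each call returns only the activity since the previous call. The sketch below sums reads and writes over all areas of a region for one interval. `sample_totals` and the `DMSETUP` override variable are hypothetical conveniences of this example, not part of dmsetup.

```shell
#!/bin/sh
# Print total reads and writes completed since the previous sample.
# $1: DM device name, $2: region_id
# DMSETUP may be set to substitute another command for dmsetup.
sample_totals() {
    "${DMSETUP:-dmsetup}" message "$1" 0 @stats_print_clear "$2" |
        awk '{ r += $2; w += $6 }
             END { printf "reads=%d writes=%d\n", r, w }'
}

# Possible use (not run here): one sample every 10 seconds.
#   while sleep 10; do sample_totals vol 0; done
```

Because printing and clearing happen atomically, consecutive samples should neither miss nor double-count completed I/Os.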
189 | |||
190 | @stats_set_aux <region_id> <aux_data> | ||
191 | Store auxiliary data aux_data for the specified region. | ||
192 | |||
193 | <region_id> | ||
194 | region_id returned from @stats_create | ||
195 | |||
196 | <aux_data> | ||
197 | The string that identifies data which is useful to the client | ||
198 | program that created the range. The kernel returns this | ||
199 | string in the output of the @stats_list message, but it | ||
200 | doesn't use this value for anything. | ||
201 | |||
202 | Examples | ||
203 | ======== | ||
204 | |||
205 | Subdivide the DM device 'vol' into 100 pieces and start collecting | ||
206 | statistics on them:: | ||
207 | |||
208 | dmsetup message vol 0 @stats_create - /100 | ||
209 | |||
210 | Set the auxiliary data string to "foo bar baz" (the escape for each | ||
211 | space must also be escaped, otherwise the shell will consume them):: | ||
212 | |||
213 | dmsetup message vol 0 @stats_set_aux 0 foo\\ bar\\ baz | ||
214 | |||
215 | List the statistics:: | ||
216 | |||
217 | dmsetup message vol 0 @stats_list | ||
218 | |||
219 | Print the statistics:: | ||
220 | |||
221 | dmsetup message vol 0 @stats_print 0 | ||
222 | |||
223 | Delete the statistics:: | ||
224 | |||
225 | dmsetup message vol 0 @stats_delete 0 | ||
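The examples above use region 0 directly. A script would more likely capture the region_id that @stats_create returns and reuse it, as in this sketch. The wrapper names `create_region` and `delete_region` and the `DMSETUP` override variable are hypothetical.

```shell
#!/bin/sh
# Command used to talk to device-mapper; can be overridden for testing.
DMSETUP=${DMSETUP:-dmsetup}

# Create a region and print the region_id reported for it.
# $1: device, $2: range, $3: step
create_region() {
    "$DMSETUP" message "$1" 0 @stats_create "$2" "$3"
}

# Delete a region. $1: device, $2: region_id
delete_region() {
    "$DMSETUP" message "$1" 0 @stats_delete "$2"
}

# Possible use (not run here):
#   id=$(create_region vol - /100)
#   dmsetup message vol 0 @stats_print "$id"
#   delete_region vol "$id"
```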