author | Adrian Hunter <adrian.hunter@intel.com> | 2015-07-17 12:34:00 -0400 |
---|---|---|
committer | Arnaldo Carvalho de Melo <acme@redhat.com> | 2015-08-24 16:51:09 -0400 |
commit | 9d1bf02ac3d41367896b38793db6f8f30bb9a295 (patch) | |
tree | 04ef7b21be9064e56541e028649bb875f60214e0 /tools | |
parent | 7eacca3ebb03a4ee7bb41284aafeb19a54242621 (diff) |
perf tools: Update Intel PT documentation
Update Intel PT documentation to describe new features.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-26-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Diffstat (limited to 'tools')
-rw-r--r-- | tools/perf/Documentation/intel-pt.txt | 194 |
1 files changed, 186 insertions, 8 deletions
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index 2866b62eb293..4a0501d7a3b4 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -142,19 +142,21 @@ which is the same as | |||
142 | 142 | ||
143 | -e intel_pt/tsc=1,noretcomp=0/ | 143 | -e intel_pt/tsc=1,noretcomp=0/ |
144 | 144 | ||
145 | Note there are now new config terms - see section 'config terms' further below. | ||
146 | |||
145 | The config terms are listed in /sys/devices/intel_pt/format. They are bit | 147 | The config terms are listed in /sys/devices/intel_pt/format. They are bit |
146 | fields within the config member of the struct perf_event_attr which is | 148 | fields within the config member of the struct perf_event_attr which is |
147 | passed to the kernel by the perf_event_open system call. They correspond to bit | 149 | passed to the kernel by the perf_event_open system call. They correspond to bit |
148 | fields in the IA32_RTIT_CTL MSR. Here is a list of them and their definitions: | 150 | fields in the IA32_RTIT_CTL MSR. Here is a list of them and their definitions: |
149 | 151 | ||
150 | $ for f in `ls /sys/devices/intel_pt/format`;do | 152 | $ grep -H . /sys/bus/event_source/devices/intel_pt/format/* |
151 | > echo $f | 153 | /sys/bus/event_source/devices/intel_pt/format/cyc:config:1 |
152 | > cat /sys/devices/intel_pt/format/$f | 154 | /sys/bus/event_source/devices/intel_pt/format/cyc_thresh:config:19-22 |
153 | > done | 155 | /sys/bus/event_source/devices/intel_pt/format/mtc:config:9 |
154 | noretcomp | 156 | /sys/bus/event_source/devices/intel_pt/format/mtc_period:config:14-17 |
155 | config:11 | 157 | /sys/bus/event_source/devices/intel_pt/format/noretcomp:config:11 |
156 | tsc | 158 | /sys/bus/event_source/devices/intel_pt/format/psb_period:config:24-27 |
157 | config:10 | 159 | /sys/bus/event_source/devices/intel_pt/format/tsc:config:10 |
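To illustrate how the format entries listed above map config terms onto bit fields of the config member, here is a minimal Python sketch. The helper names parse_format and set_term are hypothetical and exist only for this example; perf's own parser is not shown here. The sketch assumes the sysfs strings have the form "config:N" or "config:N-M", as in the listing.

```python
def parse_format(spec):
    """Parse a sysfs format string such as 'config:19-22' into (first_bit, last_bit)."""
    field, _, bits = spec.partition(":")
    if field != "config":
        raise ValueError(f"unexpected field: {field}")
    first, _, last = bits.partition("-")
    # A single-bit term such as 'config:10' has no '-' part.
    return int(first), int(last or first)


def set_term(config, spec, value):
    """Return config with the bit field described by spec set to value."""
    first, last = parse_format(spec)
    width = last - first + 1
    if value < 0 or value >> width:
        raise ValueError("value does not fit in the bit field")
    mask = ((1 << width) - 1) << first
    return (config & ~mask) | (value << first)


# Pack tsc=1 (bit 10) and psb_period=3 (bits 24-27) into one config word.
config = set_term(0, "config:10", 1)
config = set_term(config, "config:24-27", 3)
print(hex(config))
```

The bit positions used in the usage lines are the ones shown in the listing; on a given machine they should be read from the format directory rather than hard-coded.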
158 | 160 | ||
159 | Note that the default config must be overridden for each term i.e. | 161 | Note that the default config must be overridden for each term i.e. |
160 | 162 | ||
@@ -209,9 +211,185 @@ perf_event_attr is displayed if the -vv option is used e.g. | |||
209 | ------------------------------------------------------------ | 211 | ------------------------------------------------------------ |
210 | 212 | ||
211 | 213 | ||
214 | config terms | ||
215 | ------------ | ||
216 | |||
217 | The June 2015 version of Intel 64 and IA-32 Architectures Software Developer | ||
218 | Manuals, Chapter 36 Intel Processor Trace, defined new Intel PT features. | ||
219 | Some of the features are reflected in new config terms. All the config terms are | ||
220 | described below. | ||
221 | |||
222 | tsc Always supported. Produces TSC timestamp packets to provide | ||
223 | timing information. In some cases it is possible to decode | ||
224 | without timing information, for example a per-thread context | ||
225 | that does not overlap executable memory maps. | ||
226 | |||
227 | The default config selects tsc (i.e. tsc=1). | ||
228 | |||
229 | noretcomp Always supported. Disables "return compression" so a TIP packet | ||
230 | is produced when a function returns. Causes more packets to be | ||
231 | produced but might make decoding more reliable. | ||
232 | |||
233 | The default config does not select noretcomp (i.e. noretcomp=0). | ||
234 | |||
235 | psb_period Allows the frequency of PSB packets to be specified. | ||
236 | |||
237 | The PSB packet is a synchronization packet that provides a | ||
238 | starting point for decoding or recovery from errors. | ||
239 | |||
240 | Support for psb_period is indicated by: | ||
241 | |||
242 | /sys/bus/event_source/devices/intel_pt/caps/psb_cyc | ||
243 | |||
244 | which contains "1" if the feature is supported and "0" | ||
245 | otherwise. | ||
246 | |||
247 | Valid values are given by: | ||
248 | |||
249 | /sys/bus/event_source/devices/intel_pt/caps/psb_periods | ||
250 | |||
251 | which contains a hexadecimal value, the bits of which represent | ||
252 | valid values e.g. bit 2 set means value 2 is valid. | ||
253 | |||
254 | The psb_period value is converted to the approximate number of | ||
255 | trace bytes between PSB packets as: | ||
256 | |||
257 | 2 ^ (value + 11) | ||
258 | |||
259 | e.g. value 3 means 16KiB between PSBs | ||
260 | |||
261 | If an invalid value is entered, the error message | ||
262 | will give a list of valid values e.g. | ||
263 | |||
264 | $ perf record -e intel_pt/psb_period=15/u uname | ||
265 | Invalid psb_period for intel_pt. Valid values are: 0-5 | ||
266 | |||
267 | If MTC packets are selected, the default config selects a value | ||
268 | of 3 (i.e. psb_period=3) or the nearest lower value that is | ||
269 | supported (0 is always supported). Otherwise the default is 0. | ||
270 | |||
271 | If decoding is expected to be reliable and the buffer is large | ||
272 | then a large PSB period can be used. | ||
273 | |||
274 | Because a TSC packet is produced with PSB, the PSB period can | ||
275 | also affect the granularity of timing information in the absence | ||
276 | of MTC or CYC. | ||
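As a rough illustration of the psb_period description above, the following Python sketch decodes a hexadecimal capability mask of the kind found in caps/psb_periods and converts a psb_period value into the approximate PSB spacing. The function names are hypothetical, and the sample mask "3f" is an assumption chosen to match the "0-5" range in the error message example; it is not read from real hardware.

```python
def valid_values(mask_hex):
    """Return the values whose bits are set in a hexadecimal capability mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


def psb_period_bytes(value):
    """Approximate number of trace bytes between PSB packets: 2 ^ (value + 11)."""
    return 1 << (value + 11)


# Hypothetical mask; on a real system it would come from
# /sys/bus/event_source/devices/intel_pt/caps/psb_periods
for value in valid_values("3f"):
    print(value, psb_period_bytes(value))
```

The same bit-per-value decoding applies to the other capability masks described in this section, since they use the same encoding.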
277 | |||
278 | mtc Produces MTC timing packets. | ||
279 | |||
280 | MTC packets provide finer grain timestamp information than TSC | ||
281 | packets. MTC packets record time using the hardware crystal | ||
282 | clock (CTC) which is related to TSC packets using a TMA packet. | ||
283 | |||
284 | Support for this feature is indicated by: | ||
285 | |||
286 | /sys/bus/event_source/devices/intel_pt/caps/mtc | ||
287 | |||
288 | which contains "1" if the feature is supported and | ||
289 | "0" otherwise. | ||
290 | |||
291 | The frequency of MTC packets can also be specified - see | ||
292 | mtc_period below. | ||
293 | |||
294 | mtc_period Specifies how frequently MTC packets are produced - see mtc | ||
295 | above for how to determine if MTC packets are supported. | ||
296 | |||
297 | Valid values are given by: | ||
298 | |||
299 | /sys/bus/event_source/devices/intel_pt/caps/mtc_periods | ||
300 | |||
301 | which contains a hexadecimal value, the bits of which represent | ||
302 | valid values e.g. bit 2 set means value 2 is valid. | ||
303 | |||
304 | The mtc_period value is converted to the MTC frequency as: | ||
305 | |||
306 | CTC-frequency / (2 ^ value) | ||
307 | |||
308 | e.g. value 3 means one eighth of CTC-frequency | ||
309 | |||
310 | Where CTC is the hardware crystal clock, the frequency of which | ||
311 | can be related to TSC via values provided in cpuid leaf 0x15. | ||
312 | |||
313 | If an invalid value is entered, the error message | ||
314 | will give a list of valid values e.g. | ||
315 | |||
316 | $ perf record -e intel_pt/mtc_period=15/u uname | ||
317 | Invalid mtc_period for intel_pt. Valid values are: 0,3,6,9 | ||
318 | |||
319 | The default value is 3 or the nearest lower value | ||
320 | that is supported (0 is always supported). | ||
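The mtc_period arithmetic and the default-selection rule described above can be sketched in Python as follows. This is an illustration of the rule as stated in the text, not the code perf uses; the function names, the 24 MHz CTC frequency and the mask 0x249 (bits 0, 3, 6 and 9, matching the error message example) are all assumptions made for the example.

```python
def mtc_frequency(ctc_hz, value):
    """MTC packet frequency: CTC-frequency / (2 ^ value)."""
    return ctc_hz / (2 ** value)


def default_mtc_period(mask, preferred=3):
    """Pick preferred, or the nearest lower value whose bit is set in mask."""
    for value in range(preferred, -1, -1):
        if mask & (1 << value):
            return value
    return 0  # the text states that 0 is always supported


# Hypothetical inputs: a 24 MHz crystal clock and a mask allowing 0, 3, 6, 9.
period = default_mtc_period(0x249)
print(period, mtc_frequency(24_000_000, period))
```

With a mask that lacks bit 3, for instance 0x5 (values 0 and 2), the sketch falls back to 2, which is the "nearest lower value" behaviour the text describes.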
321 | |||
322 | cyc Produces CYC timing packets. | ||
323 | |||
324 | CYC packets provide even finer grain timestamp information than | ||
325 | MTC and TSC packets. A CYC packet contains the number of CPU | ||
326 | cycles since the last CYC packet. Unlike MTC and TSC packets, | ||
327 | CYC packets are only sent when another packet is also sent. | ||
328 | |||
329 | Support for this feature is indicated by: | ||
330 | |||
331 | /sys/bus/event_source/devices/intel_pt/caps/psb_cyc | ||
332 | |||
333 | which contains "1" if the feature is supported and | ||
334 | "0" otherwise. | ||
335 | |||
336 | The number of CYC packets produced can be reduced by specifying | ||
337 | a threshold - see cyc_thresh below. | ||
338 | |||
339 | cyc_thresh Specifies how frequently CYC packets are produced - see cyc | ||
340 | above for how to determine if CYC packets are supported. | ||
341 | |||
342 | Valid cyc_thresh values are given by: | ||
343 | |||
344 | /sys/bus/event_source/devices/intel_pt/caps/cycle_thresholds | ||
345 | |||
346 | which contains a hexadecimal value, the bits of which represent | ||
347 | valid values e.g. bit 2 set means value 2 is valid. | ||
348 | |||
349 | The cyc_thresh value represents the minimum number of CPU cycles | ||
350 | that must have passed before a CYC packet can be sent. The | ||
351 | number of CPU cycles is: | ||
352 | |||
353 | 2 ^ (value - 1) | ||
354 | |||
355 | e.g. value 4 means 8 CPU cycles must pass before a CYC packet | ||
356 | can be sent. Note a CYC packet is still only sent when another | ||
357 | packet is sent, not at fixed intervals of, e.g., 8 CPU cycles. | ||
358 | |||
359 | If an invalid value is entered, the error message | ||
360 | will give a list of valid values e.g. | ||
361 | |||
362 | $ perf record -e intel_pt/cyc,cyc_thresh=15/u uname | ||
363 | Invalid cyc_thresh for intel_pt. Valid values are: 0-12 | ||
364 | |||
365 | CYC packets are not requested by default. | ||
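The cyc_thresh formula above can be checked with a short Python sketch. The function name is hypothetical. The sketch applies the formula only for values of 1 and above, because the text does not spell out what a value of 0 means in terms of cycles; that restriction is an assumption of this example, not a statement about the hardware.

```python
def cyc_thresh_cycles(value):
    """Minimum CPU cycles that must pass before a CYC packet: 2 ^ (value - 1)."""
    if value < 1:
        # The formula in the text is not applied to value 0 in this sketch.
        raise ValueError("formula is only sketched here for value >= 1")
    return 1 << (value - 1)


for value in (1, 4, 12):
    print(value, cyc_thresh_cycles(value))
```

As the text notes, the result is a lower bound on the spacing: a CYC packet is still only emitted alongside another packet.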
366 | |||
367 | no_force_psb This is a driver option and is not in the IA32_RTIT_CTL MSR. | ||
368 | |||
369 | It stops the driver resetting the byte count to zero whenever | ||
370 | enabling the trace (for example on context switches) which in | ||
371 | turn results in no PSB being forced. However some processors | ||
372 | will produce a PSB anyway. | ||
373 | |||
374 | In any case, there is still a PSB when the trace is enabled for | ||
375 | the first time. | ||
376 | |||
377 | no_force_psb can be used to slightly decrease the trace size but | ||
378 | may make it harder for the decoder to recover from errors. | ||
379 | |||
380 | no_force_psb is not selected by default. | ||
381 | |||
382 | |||
212 | new snapshot option | 383 | new snapshot option |
213 | ------------------- | 384 | ------------------- |
214 | 385 | ||
386 | The difference between full trace and snapshot from the kernel's perspective is | ||
387 | that in full trace we don't overwrite trace data that the user hasn't collected | ||
388 | yet (and indicated that by advancing aux_tail), whereas in snapshot mode we let | ||
389 | the trace run and overwrite older data in the buffer so that whenever something | ||
390 | interesting happens, we can stop it and grab a snapshot of what was going on | ||
391 | around that interesting moment. | ||
392 | |||
215 | To select snapshot mode a new option has been added: | 393 | To select snapshot mode a new option has been added: |
216 | 394 | ||
217 | -S | 395 | -S |