Diffstat (limited to 'tools/perf/Documentation/intel-pt.txt')
| -rw-r--r-- | tools/perf/Documentation/intel-pt.txt | 194 |
1 files changed, 186 insertions, 8 deletions
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index 2866b62eb293..4a0501d7a3b4 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
| @@ -142,19 +142,21 @@ which is the same as | |||
| 142 | 142 | ||
| 143 | -e intel_pt/tsc=1,noretcomp=0/ | 143 | -e intel_pt/tsc=1,noretcomp=0/ |
| 144 | 144 | ||
| 145 | Note there are now new config terms - see section 'config terms' further below. | ||
| 146 | |||
| 145 | The config terms are listed in /sys/devices/intel_pt/format. They are bit | 147 | The config terms are listed in /sys/devices/intel_pt/format. They are bit |
| 146 | fields within the config member of the struct perf_event_attr which is | 148 | fields within the config member of the struct perf_event_attr which is |
| 147 | passed to the kernel by the perf_event_open system call. They correspond to bit | 149 | passed to the kernel by the perf_event_open system call. They correspond to bit |
| 148 | fields in the IA32_RTIT_CTL MSR. Here is a list of them and their definitions: | 150 | fields in the IA32_RTIT_CTL MSR. Here is a list of them and their definitions: |
| 149 | 151 | ||
| 150 | $ for f in `ls /sys/devices/intel_pt/format`;do | 152 | $ grep -H . /sys/bus/event_source/devices/intel_pt/format/* |
| 151 | > echo $f | 153 | /sys/bus/event_source/devices/intel_pt/format/cyc:config:1 |
| 152 | > cat /sys/devices/intel_pt/format/$f | 154 | /sys/bus/event_source/devices/intel_pt/format/cyc_thresh:config:19-22 |
| 153 | > done | 155 | /sys/bus/event_source/devices/intel_pt/format/mtc:config:9 |
| 154 | noretcomp | 156 | /sys/bus/event_source/devices/intel_pt/format/mtc_period:config:14-17 |
| 155 | config:11 | 157 | /sys/bus/event_source/devices/intel_pt/format/noretcomp:config:11 |
| 156 | tsc | 158 | /sys/bus/event_source/devices/intel_pt/format/psb_period:config:24-27 |
| 157 | config:10 | 159 | /sys/bus/event_source/devices/intel_pt/format/tsc:config:10 |
| 158 | 160 | ||
| 159 | Note that the default config must be overridden for each term i.e. | 161 | Note that the default config must be overridden for each term i.e. |
| 160 | 162 | ||
| @@ -209,9 +211,185 @@ perf_event_attr is displayed if the -vv option is used e.g. | |||
| 209 | ------------------------------------------------------------ | 211 | ------------------------------------------------------------ |
| 210 | 212 | ||
| 211 | 213 | ||
| 214 | config terms | ||
| 215 | ------------ | ||
| 216 | |||
| 217 | The June 2015 version of Intel 64 and IA-32 Architectures Software Developer | ||
| 218 | Manuals, Chapter 36 Intel Processor Trace, defined new Intel PT features. | ||
| 219 | Some of the features are reflected in new config terms. All the config terms are | ||
| 220 | described below. | ||
| 221 | |||
| 222 | tsc Always supported. Produces TSC timestamp packets to provide | ||
| 223 | timing information. In some cases it is possible to decode | ||
| 224 | without timing information, for example a per-thread context | ||
| 225 | that does not overlap executable memory maps. | ||
| 226 | |||
| 227 | The default config selects tsc (i.e. tsc=1). | ||
| 228 | |||
| 229 | noretcomp Always supported. Disables "return compression" so a TIP packet | ||
| 230 | is produced when a function returns. Causes more packets to be | ||
| 231 | produced but might make decoding more reliable. | ||
| 232 | |||
| 233 | The default config does not select noretcomp (i.e. noretcomp=0). | ||
| 234 | |||
| 235 | psb_period Allows the frequency of PSB packets to be specified. | ||
| 236 | |||
| 237 | The PSB packet is a synchronization packet that provides a | ||
| 238 | starting point for decoding or recovery from errors. | ||
| 239 | |||
| 240 | Support for psb_period is indicated by: | ||
| 241 | |||
| 242 | /sys/bus/event_source/devices/intel_pt/caps/psb_cyc | ||
| 243 | |||
| 244 | which contains "1" if the feature is supported and "0" | ||
| 245 | otherwise. | ||
| 246 | |||
| 247 | Valid values are given by: | ||
| 248 | |||
| 249 | /sys/bus/event_source/devices/intel_pt/caps/psb_periods | ||
| 250 | |||
| 251 | which contains a hexadecimal value, the bits of which represent | ||
| 252 | valid values e.g. bit 2 set means value 2 is valid. | ||
| 253 | |||
| 254 | The psb_period value is converted to the approximate number of | ||
| 255 | trace bytes between PSB packets as: | ||
| 256 | |||
| 257 | 2 ^ (value + 11) | ||
| 258 | |||
| 259 | e.g. value 3 means approximately 16KiB of trace between PSBs | ||
| 260 | |||
| 261 | If an invalid value is entered, the error message | ||
| 262 | will give a list of valid values e.g. | ||
| 263 | |||
| 264 | $ perf record -e intel_pt/psb_period=15/u uname | ||
| 265 | Invalid psb_period for intel_pt. Valid values are: 0-5 | ||
| 266 | |||
| 267 | If MTC packets are selected, the default config selects a value | ||
| 268 | of 3 (i.e. psb_period=3) or the nearest lower value that is | ||
| 269 | supported (0 is always supported). Otherwise the default is 0. | ||
| 270 | |||
| 271 | If decoding is expected to be reliable and the buffer is large | ||
| 272 | then a large PSB period can be used. | ||
| 273 | |||
| 274 | Because a TSC packet is produced with PSB, the PSB period can | ||
| 275 | also affect the granularity of timing information in the absence | ||
| 276 | of MTC or CYC. | ||
| 277 | |||
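As a rough illustration of the psb_period description above, the following shell sketch decodes a capability bitmask into its list of valid values and applies the 2 ^ (value + 11) formula. The helper names valid_values and psb_bytes are hypothetical, and the mask 3f is an assumed example chosen to match the "0-5" error message shown above, not a value read from real hardware. The sketch also assumes the caps file holds bare hexadecimal digits; an optional 0x prefix is stripped just in case.

```shell
#!/bin/sh
# Hypothetical helper: print the values whose bits are set in a hex mask.
# $1 is the mask as hexadecimal digits, with or without a leading 0x.
valid_values() {
    mask=$((0x${1#0x}))
    value=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            # Append the bit position, space separated.
            out="$out${out:+ }$value"
        fi
        mask=$((mask >> 1))
        value=$((value + 1))
    done
    echo "$out"
}

# Hypothetical helper: approximate trace bytes between PSB packets,
# computed as 2 ^ (value + 11).
psb_bytes() {
    echo $((1 << ($1 + 11)))
}

# Assumed example mask; bits 0 to 5 are set.
valid_values 3f
# psb_period=3 gives 2 ^ 14 bytes.
psb_bytes 3
```

On a machine with Intel PT, the mask could instead be taken from the file named above, for example valid_values "$(cat /sys/bus/event_source/devices/intel_pt/caps/psb_periods)".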
| 278 | mtc Produces MTC timing packets. | ||
| 279 | |||
| 280 | MTC packets provide finer grain timestamp information than TSC | ||
| 281 | packets. MTC packets record time using the hardware crystal | ||
| 282 | clock (CTC) which is related to TSC packets using a TMA packet. | ||
| 283 | |||
| 284 | Support for this feature is indicated by: | ||
| 285 | |||
| 286 | /sys/bus/event_source/devices/intel_pt/caps/mtc | ||
| 287 | |||
| 288 | which contains "1" if the feature is supported and | ||
| 289 | "0" otherwise. | ||
| 290 | |||
| 291 | The frequency of MTC packets can also be specified - see | ||
| 292 | mtc_period below. | ||
| 293 | |||
| 294 | mtc_period Specifies how frequently MTC packets are produced - see mtc | ||
| 295 | above for how to determine if MTC packets are supported. | ||
| 296 | |||
| 297 | Valid values are given by: | ||
| 298 | |||
| 299 | /sys/bus/event_source/devices/intel_pt/caps/mtc_periods | ||
| 300 | |||
| 301 | which contains a hexadecimal value, the bits of which represent | ||
| 302 | valid values e.g. bit 2 set means value 2 is valid. | ||
| 303 | |||
| 304 | The mtc_period value is converted to the MTC frequency as: | ||
| 305 | |||
| 306 | CTC-frequency / (2 ^ value) | ||
| 307 | |||
| 308 | e.g. value 3 means one eighth of CTC-frequency | ||
| 309 | |||
| 310 | Where CTC is the hardware crystal clock, the frequency of which | ||
| 311 | can be related to TSC via values provided in cpuid leaf 0x15. | ||
| 312 | |||
| 313 | If an invalid value is entered, the error message | ||
| 314 | will give a list of valid values e.g. | ||
| 315 | |||
| 316 | $ perf record -e intel_pt/mtc_period=15/u uname | ||
| 317 | Invalid mtc_period for intel_pt. Valid values are: 0,3,6,9 | ||
| 318 | |||
| 319 | The default value is 3 or the nearest lower value | ||
| 320 | that is supported (0 is always supported). | ||
| 321 | |||
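The mtc_period arithmetic and the "3 or the nearest lower value that is supported" default described above can be sketched in shell as follows. This is only an illustration under stated assumptions, not the perf implementation: the helper names mtc_hz and nearest_lower are hypothetical, the 24000000 Hz CTC frequency is an assumed figure for the example, and the masks 249 and 241 are made-up bitmasks (249 matches the "0,3,6,9" error message above).

```shell
#!/bin/sh
# Hypothetical helper: MTC frequency = CTC-frequency / (2 ^ value).
# $1 is the CTC frequency in Hz, $2 is the mtc_period value.
mtc_hz() {
    echo $(($1 / (1 << $2)))
}

# Hypothetical helper: largest valid value that is <= the wanted value.
# $1 is the hex bitmask of valid values, $2 is the wanted value.
# Falls back to 0, which the text above says is always supported.
nearest_lower() {
    mask=$((0x${1#0x}))
    v=$2
    while [ "$v" -gt 0 ] && [ $(((mask >> v) & 1)) -eq 0 ]; do
        v=$((v - 1))
    done
    echo "$v"
}

# Assumed 24 MHz CTC with mtc_period=3: one eighth of the CTC frequency.
mtc_hz 24000000 3
# Mask 249 has bit 3 set, so the wanted value 3 is kept.
nearest_lower 249 3
# Mask 241 has only bits 0, 6 and 9 set, so 3 falls back to 0.
nearest_lower 241 3
```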
| 322 | cyc Produces CYC timing packets. | ||
| 323 | |||
| 324 | CYC packets provide even finer grain timestamp information than | ||
| 325 | MTC and TSC packets. A CYC packet contains the number of CPU | ||
| 326 | cycles since the last CYC packet. Unlike MTC and TSC packets, | ||
| 327 | CYC packets are only sent when another packet is also sent. | ||
| 328 | |||
| 329 | Support for this feature is indicated by: | ||
| 330 | |||
| 331 | /sys/bus/event_source/devices/intel_pt/caps/psb_cyc | ||
| 332 | |||
| 333 | which contains "1" if the feature is supported and | ||
| 334 | "0" otherwise. | ||
| 335 | |||
| 336 | The number of CYC packets produced can be reduced by specifying | ||
| 337 | a threshold - see cyc_thresh below. | ||
| 338 | |||
| 339 | cyc_thresh Specifies how frequently CYC packets are produced - see cyc | ||
| 340 | above for how to determine if CYC packets are supported. | ||
| 341 | |||
| 342 | Valid cyc_thresh values are given by: | ||
| 343 | |||
| 344 | /sys/bus/event_source/devices/intel_pt/caps/cycle_thresholds | ||
| 345 | |||
| 346 | which contains a hexadecimal value, the bits of which represent | ||
| 347 | valid values e.g. bit 2 set means value 2 is valid. | ||
| 348 | |||
| 349 | The cyc_thresh value represents the minimum number of CPU cycles | ||
| 350 | that must have passed before a CYC packet can be sent. The | ||
| 351 | number of CPU cycles is: | ||
| 352 | |||
| 353 | 2 ^ (value - 1) | ||
| 354 | |||
| 355 | e.g. value 4 means 8 CPU cycles must pass before a CYC packet | ||
| 356 | can be sent. Note a CYC packet is still only sent when another | ||
| 357 | packet is sent, not at, e.g. every 8 CPU cycles. | ||
| 358 | |||
| 359 | If an invalid value is entered, the error message | ||
| 360 | will give a list of valid values e.g. | ||
| 361 | |||
| 362 | $ perf record -e intel_pt/cyc,cyc_thresh=15/u uname | ||
| 363 | Invalid cyc_thresh for intel_pt. Valid values are: 0-12 | ||
| 364 | |||
| 365 | CYC packets are not requested by default. | ||
| 366 | |||
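The cyc_thresh formula above, 2 ^ (value - 1), can be checked with a small shell sketch. The helper name cyc_min_cycles is hypothetical. The text above does not say what value 0 means, so this sketch deliberately rejects it instead of guessing.

```shell
#!/bin/sh
# Hypothetical helper: minimum CPU cycles that must pass before a CYC
# packet can be sent, computed as 2 ^ (value - 1).
# Value 0 is not covered by the formula as stated, so it is rejected.
cyc_min_cycles() {
    if [ "$1" -lt 1 ]; then
        echo "cyc_thresh value $1 is not covered by this sketch" >&2
        return 1
    fi
    echo $((1 << ($1 - 1)))
}

# cyc_thresh=4 means 2 ^ 3 CPU cycles.
cyc_min_cycles 4
```

Note that the result is only a lower bound on the spacing of CYC packets, since a CYC packet is still sent only alongside another packet.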
| 367 | no_force_psb This is a driver option and is not in the IA32_RTIT_CTL MSR. | ||
| 368 | |||
| 369 | It stops the driver resetting the byte count to zero whenever | ||
| 370 | enabling the trace (for example on context switches) which in | ||
| 371 | turn results in no PSB being forced. However some processors | ||
| 372 | will produce a PSB anyway. | ||
| 373 | |||
| 374 | In any case, there is still a PSB when the trace is enabled for | ||
| 375 | the first time. | ||
| 376 | |||
| 377 | no_force_psb can be used to slightly decrease the trace size but | ||
| 378 | may make it harder for the decoder to recover from errors. | ||
| 379 | |||
| 380 | no_force_psb is not selected by default. | ||
| 381 | |||
| 382 | |||
| 212 | new snapshot option | 383 | new snapshot option |
| 213 | ------------------- | 384 | ------------------- |
| 214 | 385 | ||
| 386 | The difference between full trace and snapshot from the kernel's perspective is | ||
| 387 | that in full trace we don't overwrite trace data that the user hasn't collected | ||
| 388 | yet (and indicated that by advancing aux_tail), whereas in snapshot mode we let | ||
| 389 | the trace run and overwrite older data in the buffer so that whenever something | ||
| 390 | interesting happens, we can stop it and grab a snapshot of what was going on | ||
| 391 | around that interesting moment. | ||
| 392 | |||
| 215 | To select snapshot mode a new option has been added: | 393 | To select snapshot mode a new option has been added: |
| 216 | 394 | ||
| 217 | -S | 395 | -S |
