author | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 14:15:14 -0500 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 14:15:14 -0500 |
commit | d8b91dde38f4c43bd0bbbf17a90f735b16aaff2c (patch) | |
tree | bd72dabf6e4b23e060fce429c87e60504f69de54 /tools/perf/util/stat.h | |
parent | 5e7481a25e90b661d1dbbba18be3fd3dfe12ec6f (diff) | |
parent | e4c1091cb495d9cbec8956d642644a71a1689958 (diff) |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Clean up the x86 instruction decoder (Masami Hiramatsu)
- Add new uprobes optimization for PUSH instructions on x86 (Yonghong
Song)
- Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian)
- Fix misc bugs, update documentation, plus various cleanups (Jiri
Olsa)
There's a large number of tooling side improvements:
- Intel-PT/BTS improvements (Adrian Hunter)
- Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo)
- Introduce an errno code to string facility (Hendrik Brueckner)
- Various build system improvements (Jiri Olsa)
- Add support for CoreSight trace decoding by making the perf tools
use the external openCSD (Mathieu Poirier, Tor Jeremiassen)
- Add ARM Statistical Profiling Extensions (SPE) support (Kim
Phillips)
- libtraceevent updates (Steven Rostedt)
- Intel vendor event JSON updates (Andi Kleen)
- Introduce 'perf report --mmaps' and 'perf report --tasks' to show
info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)
- Add infrastructure to record the first and last sample times in the
perf.data file header. The times are stored when a 'perf record'
session processes all samples, such as when doing build-id
processing, or when recording them is specifically requested.
'perf report --time' uses them and now also supports percent slices
in addition to absolute ones.
I.e. now it is possible to ask for the samples in the 10%-20% time
slice of a perf.data file (Jin Yao)
- Allow system wide 'perf stat --per-thread', sorting the result (Jin
Yao)
E.g.:
[root@jouet ~]# perf stat --per-thread --metrics IPC
^C
Performance counter stats for 'system wide':
make-22229 23,012,094,032 inst_retired.any # 0.8 IPC
cc1-22419 692,027,497 inst_retired.any # 0.8 IPC
gcc-22418 328,231,855 inst_retired.any # 0.9 IPC
cc1-22509 220,853,647 inst_retired.any # 0.8 IPC
gcc-22486 199,874,810 inst_retired.any # 1.0 IPC
as-22466 177,896,365 inst_retired.any # 0.9 IPC
cc1-22465 150,732,374 inst_retired.any # 0.8 IPC
gcc-22508 112,555,593 inst_retired.any # 0.9 IPC
cc1-22487 108,964,079 inst_retired.any # 0.7 IPC
qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC
systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC
docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC
dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC
make-22153 12,504,194,380 inst_retired.any # 0.8 IPC
python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC
<SNIP>
python2-22429 15,026,328,103 cpu_clk_unhalted.thread
cc1-22419 826,660,193 cpu_clk_unhalted.thread
gcc-22418 365,321,295 cpu_clk_unhalted.thread
cc1-22509 279,169,362 cpu_clk_unhalted.thread
gcc-22486 210,156,950 cpu_clk_unhalted.thread
<SNIP>
5.638075538 seconds time elapsed
[root@jouet ~]#
- Improve shell auto-completion of perf events (Jin Yao)
- 'perf probe' improvements (Masami Hiramatsu)
- Improve PMU infrastructure to support arm64's ThunderX2
implementation defined core events (Ganapatrao Kulkarni)
- Various annotation related improvements and fixes (Thomas Richter)
- Clarify usage of 'overwrite' and 'backward' in the evlist/mmap
code, removing the 'overwrite' parameter from several functions as
it was always passed as 'false' (Wang Nan)
- Fix/improve 'perf record' reverse recording support (Wang Nan)
- Improve command line options documentation (Sihyeon Jang)
- Optimize sample parsing for ordering events, where we don't need to
parse all the PERF_SAMPLE_ bits, just the ones leading to the
timestamp needed to reorder events (Jiri Olsa)
- Generalize the annotation code to support other source information
besides objdump/DWARF obtained ones, starting with python scripts,
support for which is slated to be merged soon (Jiri Olsa)
- ... and a lot more that I failed to list, see the shortlog and
changelog for details"
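
The IPC column in the 'perf stat --per-thread' example quoted above can be checked by hand. The sketch below assumes the IPC metric is simply inst_retired.any divided by cpu_clk_unhalted.thread for the same thread; the helper name ipc_from_counts is hypothetical and is not part of the perf tools.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from perf: instructions per cycle
 * computed from two raw counter values of the same thread.
 * Assumption: IPC = inst_retired.any / cpu_clk_unhalted.thread.
 */
static double ipc_from_counts(uint64_t instructions, uint64_t cycles)
{
	/* A thread whose cycles event never counted has no meaningful IPC. */
	if (cycles == 0)
		return 0.0;

	return (double)instructions / (double)cycles;
}
```

Fed with the python2-22429 counts from the example (12,081,290,954 instructions and 15,026,328,103 cycles), the ratio is a little above 0.80, which agrees with the "0.8 IPC" printed for that thread.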
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
perf trace beauty flock: Move to separate object file
perf evlist: Remove fcntl.h from evlist.h
perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY
perf trace: Do not print from time delta for interrupted syscall lines
perf trace: Add --print-sample
perf bpf: Remove misplaced __maybe_unused attribute
MAINTAINERS: Adding entry for CoreSight trace decoding
perf tools: Add mechanic to synthesise CoreSight trace packets
perf tools: Add full support for CoreSight trace decoding
pert tools: Add queue management functionality
perf tools: Add functionality to communicate with the openCSD decoder
perf tools: Add support for decoding CoreSight trace data
perf tools: Add decoder mechanic to support dumping trace data
perf tools: Add processing of coresight metadata
perf tools: Add initial entry point for decoder CoreSight traces
perf tools: Integrating the CoreSight decoding library
perf vendor events intel: Update IvyTown files to V20
perf vendor events intel: Update IvyBridge files to V20
perf vendor events intel: Update BroadwellDE events to V7
perf vendor events intel: Update SkylakeX events to V1.06
...
Diffstat (limited to 'tools/perf/util/stat.h')
-rw-r--r-- | tools/perf/util/stat.h | 63 |
1 files changed, 60 insertions, 3 deletions
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index eefca5c981fd..dbc6f7134f61 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -5,6 +5,7 @@
 #include <linux/types.h>
 #include <stdio.h>
 #include "xyarray.h"
+#include "rblist.h"
 
 struct stats
 {
@@ -43,11 +44,54 @@ enum aggr_mode {
 	AGGR_UNSET,
 };
 
+enum {
+	CTX_BIT_USER = 1 << 0,
+	CTX_BIT_KERNEL = 1 << 1,
+	CTX_BIT_HV = 1 << 2,
+	CTX_BIT_HOST = 1 << 3,
+	CTX_BIT_IDLE = 1 << 4,
+	CTX_BIT_MAX = 1 << 5,
+};
+
+#define NUM_CTX CTX_BIT_MAX
+
+enum stat_type {
+	STAT_NONE = 0,
+	STAT_NSECS,
+	STAT_CYCLES,
+	STAT_STALLED_CYCLES_FRONT,
+	STAT_STALLED_CYCLES_BACK,
+	STAT_BRANCHES,
+	STAT_CACHEREFS,
+	STAT_L1_DCACHE,
+	STAT_L1_ICACHE,
+	STAT_LL_CACHE,
+	STAT_ITLB_CACHE,
+	STAT_DTLB_CACHE,
+	STAT_CYCLES_IN_TX,
+	STAT_TRANSACTION,
+	STAT_ELISION,
+	STAT_TOPDOWN_TOTAL_SLOTS,
+	STAT_TOPDOWN_SLOTS_ISSUED,
+	STAT_TOPDOWN_SLOTS_RETIRED,
+	STAT_TOPDOWN_FETCH_BUBBLES,
+	STAT_TOPDOWN_RECOVERY_BUBBLES,
+	STAT_SMI_NUM,
+	STAT_APERF,
+	STAT_MAX
+};
+
+struct runtime_stat {
+	struct rblist value_list;
+};
+
 struct perf_stat_config {
 	enum aggr_mode aggr_mode;
 	bool scale;
 	FILE *output;
 	unsigned int interval;
+	struct runtime_stat *stats;
+	int stats_num;
 };
 
 void update_stats(struct stats *stats, u64 val);
@@ -67,6 +111,15 @@ static inline void init_stats(struct stats *stats)
 struct perf_evsel;
 struct perf_evlist;
 
+struct perf_aggr_thread_value {
+	struct perf_evsel *counter;
+	int id;
+	double uval;
+	u64 val;
+	u64 run;
+	u64 ena;
+};
+
 bool __perf_evsel_stat__is(struct perf_evsel *evsel,
 			   enum perf_stat_evsel_id id);
 
@@ -75,16 +128,20 @@ bool __perf_evsel_stat__is(struct perf_evsel *evsel,
 
 void perf_stat_evsel_id_init(struct perf_evsel *evsel);
 
+extern struct runtime_stat rt_stat;
 extern struct stats walltime_nsecs_stats;
 
 typedef void (*print_metric_t)(void *ctx, const char *color, const char *unit,
 			       const char *fmt, double val);
 typedef void (*new_line_t )(void *ctx);
 
+void runtime_stat__init(struct runtime_stat *st);
+void runtime_stat__exit(struct runtime_stat *st);
 void perf_stat__init_shadow_stats(void);
 void perf_stat__reset_shadow_stats(void);
+void perf_stat__reset_shadow_per_stat(struct runtime_stat *st);
 void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
-				    int cpu);
+				    int cpu, struct runtime_stat *st);
 struct perf_stat_output_ctx {
 	void *ctx;
 	print_metric_t print_metric;
@@ -92,11 +149,11 @@ struct perf_stat_output_ctx {
 	bool force_header;
 };
 
-struct rblist;
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
 				   struct perf_stat_output_ctx *out,
-				   struct rblist *metric_events);
+				   struct rblist *metric_events,
+				   struct runtime_stat *st);
 void perf_stat__collect_metric_expr(struct perf_evlist *);
 
 int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw);
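
The CTX_BIT_* values added to stat.h are single-bit flags, and NUM_CTX equals CTX_BIT_MAX (32), so any OR-combination of the five flags fits in the range 0..NUM_CTX-1. The sketch below illustrates one plausible use, folding a set of exclusion flags into such an index. The enum is copied from the diff; struct excl_flags and context_index() are hypothetical names introduced only for this illustration, and the diff itself does not show how perf derives the index.

```c
#include <stdbool.h>

/* Copied from the stat.h change in this diff. */
enum {
	CTX_BIT_USER = 1 << 0,
	CTX_BIT_KERNEL = 1 << 1,
	CTX_BIT_HV = 1 << 2,
	CTX_BIT_HOST = 1 << 3,
	CTX_BIT_IDLE = 1 << 4,
	CTX_BIT_MAX = 1 << 5,
};

#define NUM_CTX CTX_BIT_MAX

/* Hypothetical stand-in for an event's exclusion settings; not a perf type. */
struct excl_flags {
	bool user;
	bool kernel;
	bool hv;
	bool host;
	bool idle;
};

/*
 * Hypothetical helper: OR together one CTX_BIT_* flag per set field.
 * The result is always in 0..NUM_CTX-1, so it can index an array of
 * NUM_CTX per-context slots.
 */
static int context_index(const struct excl_flags *f)
{
	int ctx = 0;

	if (f->user)
		ctx |= CTX_BIT_USER;
	if (f->kernel)
		ctx |= CTX_BIT_KERNEL;
	if (f->hv)
		ctx |= CTX_BIT_HV;
	if (f->host)
		ctx |= CTX_BIT_HOST;
	if (f->idle)
		ctx |= CTX_BIT_IDLE;

	return ctx;
}
```

With only the kernel and hv fields set, context_index() returns CTX_BIT_KERNEL | CTX_BIT_HV, i.e. 6; with every field set it returns 31, the largest valid index below NUM_CTX.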