diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-03-20 13:29:15 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-03-20 13:29:15 -0400 |
commit | 9c2b957db1772ebf942ae7a9346b14eba6c8ca66 (patch) | |
tree | 0dbb83e57260ea7fc0dc421f214d5f1b26262005 /include/trace | |
parent | 0bbfcaff9b2a69c71a95e6902253487ab30cb498 (diff) | |
parent | bea95c152dee1791dd02cbc708afbb115bb00f9a (diff) |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events changes for v3.4 from Ingo Molnar:
- New "hardware based branch profiling" feature both on the kernel and
the tooling side, on CPUs that support it. (modern x86 Intel CPUs
with the 'LBR' hardware feature currently.)
This new feature is basically a sophisticated 'magnifying glass' for
branch execution - something that is pretty difficult to extract from
regular, function histogram centric profiles.
The simplest mode is activated via 'perf record -b', and the result
looks like this in perf report:
$ perf record -b any_call,u -e cycles:u branchy
$ perf report -b --sort=symbol
52.34% [.] main [.] f1
24.04% [.] f1 [.] f3
23.60% [.] f1 [.] f2
0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
0.01% [k] _IO_vfprintf_internal [k] strchrnul
0.01% [k] __printf [k] _IO_vfprintf_internal
0.01% [k] main [k] __printf
This output shows from/to branch columns and shows the highest
percentage (from,to) jump combinations - i.e. the most likely taken
branches in the system. "branches" can also include function calls
and any other synchronous and asynchronous transitions of the
instruction pointer that are not 'next instruction' - such as system
calls, traps, interrupts, etc.
This feature comes with (hopefully intuitive) flat ascii and TUI
support in perf report.
- Various 'perf annotate' visual improvements for us assembly junkies.
It will now recognize function calls in the TUI and by hitting enter
you can follow the call (recursively) and back, amongst other
improvements.
- Multiple threads/processes recording support in perf record, perf
stat, perf top - which is activated via a comma-list of PIDs:
perf top -p 21483,21485
perf stat -p 21483,21485 -ddd
perf record -p 21483,21485
- Support for per UID views, via the --uid paramter to perf top, perf
report, etc. For example 'perf top --uid mingo' will only show the
tasks that I am running, excluding other users, root, etc.
- Jump label restructurings and improvements - this includes the
factoring out of the (hopefully much clearer) include/linux/static_key.h
generic facility:
struct static_key key = STATIC_KEY_INIT_FALSE;
...
if (static_key_false(&key))
do unlikely code
else
do likely code
...
static_key_slow_inc();
...
static_key_slow_inc();
...
The static_key_false() branch will be generated into the code with as
little impact to the likely code path as possible. the
static_key_slow_*() APIs flip the branch via live kernel code patching.
This facility can now be used more widely within the kernel to
micro-optimize hot branches whose likelihood matches the static-key
usage and fast/slow cost patterns.
- SW function tracer improvements: perf support and filtering support.
- Various hardenings of the perf.data ABI, to make older perf.data's
smoother on newer tool versions, to make new features integrate more
smoothly, to support cross-endian recording/analyzing workflows
better, etc.
- Restructuring of the kprobes code, the splitting out of 'optprobes',
and a corner case bugfix.
- Allow the tracing of kernel console output (printk).
- Improvements/fixes to user-space RDPMC support, allowing user-space
self-profiling code to extract PMU counts without performing any
system calls, while playing nice with the kernel side.
- 'perf bench' improvements
- ... and lots of internal restructurings, cleanups and fixes that made
these features possible. And, as usual this list is incomplete as
there were also lots of other improvements
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
perf report: Fix annotate double quit issue in branch view mode
perf report: Remove duplicate annotate choice in branch view mode
perf/x86: Prettify pmu config literals
perf report: Enable TUI in branch view mode
perf report: Auto-detect branch stack sampling mode
perf record: Add HEADER_BRANCH_STACK tag
perf record: Provide default branch stack sampling mode option
perf tools: Make perf able to read files from older ABIs
perf tools: Fix ABI compatibility bug in print_event_desc()
perf tools: Enable reading of perf.data files from different ABI rev
perf: Add ABI reference sizes
perf report: Add support for taken branch sampling
perf record: Add support for sampling taken branch
perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
x86/kprobes: Split out optprobe related code to kprobes-opt.c
x86/kprobes: Fix a bug which can modify kernel code permanently
x86/kprobes: Fix instruction recovery on optimized path
perf: Add callback to flush branch_stack on context switch
perf: Disable PERF_SAMPLE_BRANCH_* when not supported
perf/x86: Add LBR software filter support for Intel CPUs
...
Diffstat (limited to 'include/trace')
-rw-r--r-- | include/trace/events/power.h | 2 | ||||
-rw-r--r-- | include/trace/events/printk.h | 41 | ||||
-rw-r--r-- | include/trace/events/sched.h | 27 | ||||
-rw-r--r-- | include/trace/events/signal.h | 85 |
4 files changed, 92 insertions, 63 deletions
diff --git a/include/trace/events/power.h b/include/trace/events/power.h index 1bcc2a8c00e2..14b38940062b 100644 --- a/include/trace/events/power.h +++ b/include/trace/events/power.h | |||
@@ -151,6 +151,8 @@ enum { | |||
151 | events get removed */ | 151 | events get removed */ |
152 | static inline void trace_power_start(u64 type, u64 state, u64 cpuid) {}; | 152 | static inline void trace_power_start(u64 type, u64 state, u64 cpuid) {}; |
153 | static inline void trace_power_end(u64 cpuid) {}; | 153 | static inline void trace_power_end(u64 cpuid) {}; |
154 | static inline void trace_power_start_rcuidle(u64 type, u64 state, u64 cpuid) {}; | ||
155 | static inline void trace_power_end_rcuidle(u64 cpuid) {}; | ||
154 | static inline void trace_power_frequency(u64 type, u64 state, u64 cpuid) {}; | 156 | static inline void trace_power_frequency(u64 type, u64 state, u64 cpuid) {}; |
155 | #endif /* _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED */ | 157 | #endif /* _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED */ |
156 | 158 | ||
diff --git a/include/trace/events/printk.h b/include/trace/events/printk.h new file mode 100644 index 000000000000..94ec79cc011a --- /dev/null +++ b/include/trace/events/printk.h | |||
@@ -0,0 +1,41 @@ | |||
1 | #undef TRACE_SYSTEM | ||
2 | #define TRACE_SYSTEM printk | ||
3 | |||
4 | #if !defined(_TRACE_PRINTK_H) || defined(TRACE_HEADER_MULTI_READ) | ||
5 | #define _TRACE_PRINTK_H | ||
6 | |||
7 | #include <linux/tracepoint.h> | ||
8 | |||
9 | TRACE_EVENT_CONDITION(console, | ||
10 | TP_PROTO(const char *log_buf, unsigned start, unsigned end, | ||
11 | unsigned log_buf_len), | ||
12 | |||
13 | TP_ARGS(log_buf, start, end, log_buf_len), | ||
14 | |||
15 | TP_CONDITION(start != end), | ||
16 | |||
17 | TP_STRUCT__entry( | ||
18 | __dynamic_array(char, msg, end - start + 1) | ||
19 | ), | ||
20 | |||
21 | TP_fast_assign( | ||
22 | if ((start & (log_buf_len - 1)) > (end & (log_buf_len - 1))) { | ||
23 | memcpy(__get_dynamic_array(msg), | ||
24 | log_buf + (start & (log_buf_len - 1)), | ||
25 | log_buf_len - (start & (log_buf_len - 1))); | ||
26 | memcpy((char *)__get_dynamic_array(msg) + | ||
27 | log_buf_len - (start & (log_buf_len - 1)), | ||
28 | log_buf, end & (log_buf_len - 1)); | ||
29 | } else | ||
30 | memcpy(__get_dynamic_array(msg), | ||
31 | log_buf + (start & (log_buf_len - 1)), | ||
32 | end - start); | ||
33 | ((char *)__get_dynamic_array(msg))[end - start] = 0; | ||
34 | ), | ||
35 | |||
36 | TP_printk("%s", __get_str(msg)) | ||
37 | ); | ||
38 | #endif /* _TRACE_PRINTK_H */ | ||
39 | |||
40 | /* This part must be outside protection */ | ||
41 | #include <trace/define_trace.h> | ||
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h index e33ed1bfa113..fbc7b1ad929b 100644 --- a/include/trace/events/sched.h +++ b/include/trace/events/sched.h | |||
@@ -6,6 +6,7 @@ | |||
6 | 6 | ||
7 | #include <linux/sched.h> | 7 | #include <linux/sched.h> |
8 | #include <linux/tracepoint.h> | 8 | #include <linux/tracepoint.h> |
9 | #include <linux/binfmts.h> | ||
9 | 10 | ||
10 | /* | 11 | /* |
11 | * Tracepoint for calling kthread_stop, performed to end a kthread: | 12 | * Tracepoint for calling kthread_stop, performed to end a kthread: |
@@ -276,6 +277,32 @@ TRACE_EVENT(sched_process_fork, | |||
276 | ); | 277 | ); |
277 | 278 | ||
278 | /* | 279 | /* |
280 | * Tracepoint for exec: | ||
281 | */ | ||
282 | TRACE_EVENT(sched_process_exec, | ||
283 | |||
284 | TP_PROTO(struct task_struct *p, pid_t old_pid, | ||
285 | struct linux_binprm *bprm), | ||
286 | |||
287 | TP_ARGS(p, old_pid, bprm), | ||
288 | |||
289 | TP_STRUCT__entry( | ||
290 | __string( filename, bprm->filename ) | ||
291 | __field( pid_t, pid ) | ||
292 | __field( pid_t, old_pid ) | ||
293 | ), | ||
294 | |||
295 | TP_fast_assign( | ||
296 | __assign_str(filename, bprm->filename); | ||
297 | __entry->pid = p->pid; | ||
298 | __entry->old_pid = p->pid; | ||
299 | ), | ||
300 | |||
301 | TP_printk("filename=%s pid=%d old_pid=%d", __get_str(filename), | ||
302 | __entry->pid, __entry->old_pid) | ||
303 | ); | ||
304 | |||
305 | /* | ||
279 | * XXX the below sched_stat tracepoints only apply to SCHED_OTHER/BATCH/IDLE | 306 | * XXX the below sched_stat tracepoints only apply to SCHED_OTHER/BATCH/IDLE |
280 | * adding sched_stat support to SCHED_FIFO/RR would be welcome. | 307 | * adding sched_stat support to SCHED_FIFO/RR would be welcome. |
281 | */ | 308 | */ |
diff --git a/include/trace/events/signal.h b/include/trace/events/signal.h index 17df43464df0..39a8a430d90f 100644 --- a/include/trace/events/signal.h +++ b/include/trace/events/signal.h | |||
@@ -23,11 +23,23 @@ | |||
23 | } \ | 23 | } \ |
24 | } while (0) | 24 | } while (0) |
25 | 25 | ||
26 | #ifndef TRACE_HEADER_MULTI_READ | ||
27 | enum { | ||
28 | TRACE_SIGNAL_DELIVERED, | ||
29 | TRACE_SIGNAL_IGNORED, | ||
30 | TRACE_SIGNAL_ALREADY_PENDING, | ||
31 | TRACE_SIGNAL_OVERFLOW_FAIL, | ||
32 | TRACE_SIGNAL_LOSE_INFO, | ||
33 | }; | ||
34 | #endif | ||
35 | |||
26 | /** | 36 | /** |
27 | * signal_generate - called when a signal is generated | 37 | * signal_generate - called when a signal is generated |
28 | * @sig: signal number | 38 | * @sig: signal number |
29 | * @info: pointer to struct siginfo | 39 | * @info: pointer to struct siginfo |
30 | * @task: pointer to struct task_struct | 40 | * @task: pointer to struct task_struct |
41 | * @group: shared or private | ||
42 | * @result: TRACE_SIGNAL_* | ||
31 | * | 43 | * |
32 | * Current process sends a 'sig' signal to 'task' process with | 44 | * Current process sends a 'sig' signal to 'task' process with |
33 | * 'info' siginfo. If 'info' is SEND_SIG_NOINFO or SEND_SIG_PRIV, | 45 | * 'info' siginfo. If 'info' is SEND_SIG_NOINFO or SEND_SIG_PRIV, |
@@ -37,9 +49,10 @@ | |||
37 | */ | 49 | */ |
38 | TRACE_EVENT(signal_generate, | 50 | TRACE_EVENT(signal_generate, |
39 | 51 | ||
40 | TP_PROTO(int sig, struct siginfo *info, struct task_struct *task), | 52 | TP_PROTO(int sig, struct siginfo *info, struct task_struct *task, |
53 | int group, int result), | ||
41 | 54 | ||
42 | TP_ARGS(sig, info, task), | 55 | TP_ARGS(sig, info, task, group, result), |
43 | 56 | ||
44 | TP_STRUCT__entry( | 57 | TP_STRUCT__entry( |
45 | __field( int, sig ) | 58 | __field( int, sig ) |
@@ -47,6 +60,8 @@ TRACE_EVENT(signal_generate, | |||
47 | __field( int, code ) | 60 | __field( int, code ) |
48 | __array( char, comm, TASK_COMM_LEN ) | 61 | __array( char, comm, TASK_COMM_LEN ) |
49 | __field( pid_t, pid ) | 62 | __field( pid_t, pid ) |
63 | __field( int, group ) | ||
64 | __field( int, result ) | ||
50 | ), | 65 | ), |
51 | 66 | ||
52 | TP_fast_assign( | 67 | TP_fast_assign( |
@@ -54,11 +69,14 @@ TRACE_EVENT(signal_generate, | |||
54 | TP_STORE_SIGINFO(__entry, info); | 69 | TP_STORE_SIGINFO(__entry, info); |
55 | memcpy(__entry->comm, task->comm, TASK_COMM_LEN); | 70 | memcpy(__entry->comm, task->comm, TASK_COMM_LEN); |
56 | __entry->pid = task->pid; | 71 | __entry->pid = task->pid; |
72 | __entry->group = group; | ||
73 | __entry->result = result; | ||
57 | ), | 74 | ), |
58 | 75 | ||
59 | TP_printk("sig=%d errno=%d code=%d comm=%s pid=%d", | 76 | TP_printk("sig=%d errno=%d code=%d comm=%s pid=%d grp=%d res=%d", |
60 | __entry->sig, __entry->errno, __entry->code, | 77 | __entry->sig, __entry->errno, __entry->code, |
61 | __entry->comm, __entry->pid) | 78 | __entry->comm, __entry->pid, __entry->group, |
79 | __entry->result) | ||
62 | ); | 80 | ); |
63 | 81 | ||
64 | /** | 82 | /** |
@@ -101,65 +119,6 @@ TRACE_EVENT(signal_deliver, | |||
101 | __entry->sa_handler, __entry->sa_flags) | 119 | __entry->sa_handler, __entry->sa_flags) |
102 | ); | 120 | ); |
103 | 121 | ||
104 | DECLARE_EVENT_CLASS(signal_queue_overflow, | ||
105 | |||
106 | TP_PROTO(int sig, int group, struct siginfo *info), | ||
107 | |||
108 | TP_ARGS(sig, group, info), | ||
109 | |||
110 | TP_STRUCT__entry( | ||
111 | __field( int, sig ) | ||
112 | __field( int, group ) | ||
113 | __field( int, errno ) | ||
114 | __field( int, code ) | ||
115 | ), | ||
116 | |||
117 | TP_fast_assign( | ||
118 | __entry->sig = sig; | ||
119 | __entry->group = group; | ||
120 | TP_STORE_SIGINFO(__entry, info); | ||
121 | ), | ||
122 | |||
123 | TP_printk("sig=%d group=%d errno=%d code=%d", | ||
124 | __entry->sig, __entry->group, __entry->errno, __entry->code) | ||
125 | ); | ||
126 | |||
127 | /** | ||
128 | * signal_overflow_fail - called when signal queue is overflow | ||
129 | * @sig: signal number | ||
130 | * @group: signal to process group or not (bool) | ||
131 | * @info: pointer to struct siginfo | ||
132 | * | ||
133 | * Kernel fails to generate 'sig' signal with 'info' siginfo, because | ||
134 | * siginfo queue is overflow, and the signal is dropped. | ||
135 | * 'group' is not 0 if the signal will be sent to a process group. | ||
136 | * 'sig' is always one of RT signals. | ||
137 | */ | ||
138 | DEFINE_EVENT(signal_queue_overflow, signal_overflow_fail, | ||
139 | |||
140 | TP_PROTO(int sig, int group, struct siginfo *info), | ||
141 | |||
142 | TP_ARGS(sig, group, info) | ||
143 | ); | ||
144 | |||
145 | /** | ||
146 | * signal_lose_info - called when siginfo is lost | ||
147 | * @sig: signal number | ||
148 | * @group: signal to process group or not (bool) | ||
149 | * @info: pointer to struct siginfo | ||
150 | * | ||
151 | * Kernel generates 'sig' signal but loses 'info' siginfo, because siginfo | ||
152 | * queue is overflow. | ||
153 | * 'group' is not 0 if the signal will be sent to a process group. | ||
154 | * 'sig' is always one of non-RT signals. | ||
155 | */ | ||
156 | DEFINE_EVENT(signal_queue_overflow, signal_lose_info, | ||
157 | |||
158 | TP_PROTO(int sig, int group, struct siginfo *info), | ||
159 | |||
160 | TP_ARGS(sig, group, info) | ||
161 | ); | ||
162 | |||
163 | #endif /* _TRACE_SIGNAL_H */ | 122 | #endif /* _TRACE_SIGNAL_H */ |
164 | 123 | ||
165 | /* This part must be outside protection */ | 124 | /* This part must be outside protection */ |