-rw-r--r--  Documentation/perf-counters.txt  104
1 files changed, 104 insertions, 0 deletions
diff --git a/Documentation/perf-counters.txt b/Documentation/perf-counters.txt
new file mode 100644
index 000000000000..19033a0bb526
--- /dev/null
+++ b/Documentation/perf-counters.txt
@@ -0,0 +1,104 @@

Performance Counters for Linux
------------------------------

Performance counters are special hardware registers available on most modern
CPUs. These registers count the number of certain types of hardware events,
such as instructions executed, cache misses suffered, or branches
mispredicted, without slowing down the kernel or applications. These
registers can also trigger interrupts when a threshold number of events has
passed, and can thus be used to profile the code that runs on that CPU.

The Linux Performance Counter subsystem provides an abstraction of these
hardware capabilities. It provides per-task and per-CPU counters, and
it provides event capabilities on top of those.

Performance counters are accessed via special file descriptors.
There is one file descriptor per virtual counter used.

The special file descriptor is opened via the perf_counter_open()
system call:

        int
        perf_counter_open(u32 hw_event_type,
                          u32 hw_event_period,
                          u32 record_type,
                          pid_t pid,
                          int cpu);

The syscall returns the new fd. The fd can be used via the normal
VFS system calls: read() can be used to read the counter, fcntl()
can be used to set the blocking mode, etc.

Multiple counters can be kept open at a time, and the counters
can be poll()ed.
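
As a rough illustration of the poll() usage mentioned above, the sketch
below waits on several already-opened counter fds and reports which one
became readable first. The helper name wait_any_counter() and the
MAX_COUNTERS limit are assumptions made for this example only; they are
not part of the counter API.

```c
#include <poll.h>

#define MAX_COUNTERS 16		/* arbitrary limit chosen for this sketch */

/*
 * Hypothetical helper: wait until one of the counter fds becomes
 * readable. Returns the index of the first readable fd, or -1 on
 * timeout, on poll() failure, or if nfds is out of range.
 */
static int wait_any_counter(const int *fds, int nfds, int timeout_ms)
{
	struct pollfd pfds[MAX_COUNTERS];
	int i;

	if (nfds <= 0 || nfds > MAX_COUNTERS)
		return -1;

	for (i = 0; i < nfds; i++) {
		pfds[i].fd = fds[i];
		pfds[i].events = POLLIN;
		pfds[i].revents = 0;
	}

	/* poll() returns 0 on timeout and -1 on error */
	if (poll(pfds, nfds, timeout_ms) <= 0)
		return -1;

	for (i = 0; i < nfds; i++)
		if (pfds[i].revents & POLLIN)
			return i;

	return -1;
}
```

A caller would then read() the fd at the returned index to fetch the
counter data.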

When creating a new counter fd, 'hw_event_type' is one of:

        enum hw_event_types {
                PERF_COUNT_CYCLES,
                PERF_COUNT_INSTRUCTIONS,
                PERF_COUNT_CACHE_REFERENCES,
                PERF_COUNT_CACHE_MISSES,
                PERF_COUNT_BRANCH_INSTRUCTIONS,
                PERF_COUNT_BRANCH_MISSES,
        };

These are standardized types of events that work uniformly on all CPUs
that implement Performance Counters support under Linux. If a CPU is
not able to count a requested event type (branch misses, for example),
then the system call will return -EINVAL.

[ Note: more hw_event_types are supported as well, but they are
  CPU-specific and are enumerated via /sys on a per-CPU basis. Raw hw
  event types can be passed in as negative numbers. For example, to count
  "External bus cycles while bus lock signal asserted" events on Intel
  Core CPUs, pass in a -0x4064 event type value. ]
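
One way to read the note above is that a raw event code is negated
before it is passed as 'hw_event_type'. The sketch below shows that
encoding for the u32 parameter. The helper names are hypothetical, and
treating the u32 value as a two's complement negative number is an
assumption about how the value is interpreted, not something this
document spells out.

```c
#include <stdint.h>

/*
 * Hypothetical helper: turn a raw, CPU-specific event code such as
 * 0x4064 into the negated value passed as 'hw_event_type'.
 * Unsigned subtraction wraps, which gives the two's complement
 * representation of -raw_code.
 */
static uint32_t raw_hw_event_type(uint32_t raw_code)
{
	return 0u - raw_code;
}

/*
 * Hypothetical helper: a type value counts as "raw" in this sketch
 * when its sign bit is set, i.e. when it is negative as a signed
 * 32-bit number. The generic enum values are small and non-negative.
 */
static int is_raw_hw_event_type(uint32_t type)
{
	return (type & 0x80000000u) != 0;
}
```

With these helpers, raw_hw_event_type(0x4064) yields the same bit
pattern as the -0x4064 value from the example in the note.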

The parameter 'hw_event_period' is the number of events before waking up
a read() that is blocked on a counter fd. A value of zero means a
non-blocking counter.

'record_type' is the type of data that a read() will provide for the
counter, and it can be one of:

        enum perf_record_type {
                PERF_RECORD_SIMPLE,
                PERF_RECORD_IRQ,
        };

A "simple" counter is one that counts hardware events and allows
them to be read out into a u64 count value. (read() returns 8 on
a successful read of a simple counter.)

An "irq" counter is one that also provides IRQ context information:
the IP of the interrupted context. In this case read() will return
the 8-byte counter value, plus the Instruction Pointer address of the
interrupted context.
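
The two record layouts described above can be sketched as a small
decoder for the bytes returned by read(). The struct and function names
are hypothetical. The sketch also assumes that the Instruction Pointer
is an 8-byte value that directly follows the count; this document does
not state the size of the IP field.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

enum perf_record_type {
	PERF_RECORD_SIMPLE,
	PERF_RECORD_IRQ,
};

/* Hypothetical container for one decoded read() result. */
struct counter_sample {
	uint64_t count;		/* the 8-byte counter value */
	uint64_t ip;		/* interrupted IP, 0 for simple counters */
};

/*
 * Hypothetical helper: decode 'len' bytes that read() placed in 'buf'.
 * Expects exactly 8 bytes for a simple counter and, by assumption,
 * 16 bytes (count followed by IP) for an irq counter.
 * Returns 0 on success, -1 if the length does not match.
 */
static int decode_counter_read(enum perf_record_type type,
			       const void *buf, ssize_t len,
			       struct counter_sample *out)
{
	const unsigned char *p = buf;
	size_t want = sizeof(uint64_t);

	if (type == PERF_RECORD_IRQ)
		want += sizeof(uint64_t);

	if (len < 0 || (size_t)len != want)
		return -1;

	memcpy(&out->count, p, sizeof(out->count));
	out->ip = 0;
	if (type == PERF_RECORD_IRQ)
		memcpy(&out->ip, p + sizeof(out->count), sizeof(out->ip));

	return 0;
}
```

Checking the length first means a short or failed read() is rejected
instead of being decoded as a partial record.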

The 'pid' parameter allows the counter to be specific to a task:

 pid == 0: if the pid parameter is zero, the counter is attached to the
           current task.

 pid > 0: the counter is attached to a specific task (if the current task
          has sufficient privilege to do so)

 pid < 0: all tasks are counted (per-CPU counters)

The 'cpu' parameter allows a counter to be made specific to a CPU:

 cpu >= 0: the counter is restricted to a specific CPU
 cpu == -1: the counter counts on all CPUs

Note: the combination of 'pid == -1' and 'cpu == -1' is not valid.

A 'pid > 0' and 'cpu == -1' counter is a per-task counter that counts
events of that task and 'follows' that task to whatever CPU the task
gets scheduled to. Per-task counters can be created by any user, for
their own tasks.

A 'pid == -1' and 'cpu == x' counter is a per-CPU counter that counts
all events on CPU-x. Per-CPU counters need CAP_SYS_ADMIN privilege.
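
The 'pid' and 'cpu' rules above can be summarized in a small
classification sketch. The enum and function names are hypothetical and
exist only for this example. Two points are assumptions rather than
statements from this document: any negative pid combined with
'cpu == -1' is treated as invalid, and so is any cpu value below -1.

```c
#include <sys/types.h>

/* Hypothetical labels for the combinations described above. */
enum counter_scope {
	SCOPE_INVALID,		/* e.g. pid == -1 and cpu == -1 */
	SCOPE_TASK_ANY_CPU,	/* pid >= 0, cpu == -1: follows the task */
	SCOPE_TASK_ONE_CPU,	/* pid >= 0, cpu >= 0 */
	SCOPE_CPU_WIDE,		/* pid < 0, cpu >= 0: all tasks on one CPU */
};

static enum counter_scope classify_counter(pid_t pid, int cpu)
{
	if (cpu < -1)
		return SCOPE_INVALID;	/* assumption: not defined above */

	if (pid < 0)
		return cpu >= 0 ? SCOPE_CPU_WIDE : SCOPE_INVALID;

	/* pid == 0 is the current task, pid > 0 a specific task */
	return cpu >= 0 ? SCOPE_TASK_ONE_CPU : SCOPE_TASK_ANY_CPU;
}

/* Per-CPU counters are the ones that need CAP_SYS_ADMIN. */
static int needs_cap_sys_admin(enum counter_scope scope)
{
	return scope == SCOPE_CPU_WIDE;
}
```

For example, classify_counter(0, -1) describes a counter that follows
the current task across CPUs, while classify_counter(-1, 2) describes a
counter for all tasks on CPU 2.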