aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2013-02-19 20:49:41 -0500
committerLinus Torvalds <torvalds@linux-foundation.org>2013-02-19 20:49:41 -0500
commit8f55cea410dbc56114bb71a3742032070c8108d0 (patch)
tree59605f0ee961274b22f91add33f5c32459471a83 /arch/powerpc
parentb7133a9a103655cda254987a3c0975fd9d8c443f (diff)
parente259514eef764a5286873618e34c560ecb6cff13 (diff)
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf changes from Ingo Molnar: "There are lots of improvements, the biggest changes are: Main kernel side changes: - Improve uprobes performance by adding 'pre-filtering' support, by Oleg Nesterov. - Make some POWER7 events available in sysfs, equivalent to what was done on x86, from Sukadev Bhattiprolu. - tracing updates by Steve Rostedt - mostly misc fixes and smaller improvements. - Use perf/event tracing to report PCI Express advanced errors, by Tony Luck. - Enable northbridge performance counters on AMD family 15h, by Jacob Shin. - This tracing commit: tracing: Remove the extra 4 bytes of padding in events changes the ABI. All involved parties (PowerTop in particular) seem to agree that it's safe to do now with the introduction of libtraceevent, but the devil is in the details ... Main tooling side changes: - Add 'event group view', from Namyung Kim: To use it, 'perf record' should group events when recording. And then perf report parses the saved group relation from file header and prints them together if --group option is provided. You can use the 'perf evlist' command to see event group information: $ perf record -e '{ref-cycles,cycles}' noploop 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ] $ perf evlist --group {ref-cycles,cycles} With this example, default perf report will show you each event separately. You can use --group option to enable event group view: $ perf report --group ... # group: {ref-cycles,cycles} # ======== # Samples: 7K of event 'anon group { ref-cycles, cycles }' # Event count (approx.): 6876107743 # # Overhead Command Shared Object Symbol # ................ ....... ................. .......................... 99.84% 99.76% noploop noploop [.] main 0.07% 0.00% noploop ld-2.15.so [.] strcmp 0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del 0.03% 0.03% noploop [kernel.kallsyms] [k] sched_clock_cpu 0.02% 0.00% noploop [kernel.kallsyms] [k] account_user_time 0.01% 0.00% noploop [kernel.kallsyms] [k] __alloc_pages_nodemask 0.00% 0.00% noploop [kernel.kallsyms] [k] native_write_msr_safe 0.00% 0.11% noploop [kernel.kallsyms] [k] _raw_spin_lock 0.00% 0.06% noploop [kernel.kallsyms] [k] find_get_page 0.00% 0.02% noploop [kernel.kallsyms] [k] rcu_check_callbacks 0.00% 0.02% noploop [kernel.kallsyms] [k] __current_kernel_time As you can see the Overhead column now contains both of ref-cycles and cycles and header line shows group information also - 'anon group { ref-cycles, cycles }'. The output is sorted by period of group leader first. - Initial GTK+ annotate browser, from Namhyung Kim. - Add option for runtime switching perf data file in perf report, just press 's' and a menu with the valid files found in the current directory will be presented, from Feng Tang. - Add support to display whole group data for raw columns, from Jiri Olsa. - Add per processor socket count aggregation in perf stat, from Stephane Eranian. - Add interval printing in 'perf stat', from Stephane Eranian. - 'perf test' improvements - Add support for wildcards in tracepoint system name, from Jiri Olsa. - Add anonymous huge page recognition, from Joshua Zhu. - perf build-id cache now can show DSOs present in a perf.data file that are not in the cache, to integrate with build-id servers being put in place by organizations such as Fedora. - perf top now shares more of the evsel config/creation routines with 'record', paving the way for further integration like 'top' snapshots, etc. - perf top now supports DWARF callchains. - Fix mmap limitations on 32-bit, fix from David Miller. - 'perf bench numa mem' NUMA performance measurement suite - ... and lots of fixes, performance improvements, cleanups and other improvements I failed to list - see the shortlog and git log for details." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (270 commits) perf/x86/amd: Enable northbridge performance counters on AMD family 15h perf/hwbp: Fix cleanup in case of kzalloc failure perf tools: Fix build with bison 2.3 and older. perf tools: Limit unwind support to x86 archs perf annotate: Make it to be able to skip unannotatable symbols perf gtk/annotate: Fail early if it can't annotate perf gtk/annotate: Show source lines with gray color perf gtk/annotate: Support multiple event annotation perf ui/gtk: Implement basic GTK2 annotation browser perf annotate: Fix warning message on a missing vmlinux perf buildid-cache: Add --update option uprobes/perf: Avoid uprobe_apply() whenever possible uprobes/perf: Teach trace_uprobe/perf code to use UPROBE_HANDLER_REMOVE uprobes/perf: Teach trace_uprobe/perf code to pre-filter uprobes/perf: Teach trace_uprobe/perf code to track the active perf_event's uprobes: Introduce uprobe_apply() perf: Introduce hw_perf_event->tp_target and ->tp_list uprobes/perf: Always increment trace_uprobe->nhit uprobes/tracing: Kill uprobe_trace_consumer, embed uprobe_consumer into trace_uprobe uprobes/tracing: Introduce is_trace_uprobe_enabled() ...
Diffstat (limited to 'arch/powerpc')
-rw-r--r--arch/powerpc/include/asm/perf_event_server.h26
-rw-r--r--arch/powerpc/perf/core-book3s.c12
-rw-r--r--arch/powerpc/perf/power7-pmu.c80
3 files changed, 110 insertions, 8 deletions
diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 9710be3a2d17..136bba62efa4 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -11,6 +11,7 @@
11 11
12#include <linux/types.h> 12#include <linux/types.h>
13#include <asm/hw_irq.h> 13#include <asm/hw_irq.h>
14#include <linux/device.h>
14 15
15#define MAX_HWEVENTS 8 16#define MAX_HWEVENTS 8
16#define MAX_EVENT_ALTERNATIVES 8 17#define MAX_EVENT_ALTERNATIVES 8
@@ -35,6 +36,7 @@ struct power_pmu {
35 void (*disable_pmc)(unsigned int pmc, unsigned long mmcr[]); 36 void (*disable_pmc)(unsigned int pmc, unsigned long mmcr[]);
36 int (*limited_pmc_event)(u64 event_id); 37 int (*limited_pmc_event)(u64 event_id);
37 u32 flags; 38 u32 flags;
39 const struct attribute_group **attr_groups;
38 int n_generic; 40 int n_generic;
39 int *generic_events; 41 int *generic_events;
40 int (*cache_events)[PERF_COUNT_HW_CACHE_MAX] 42 int (*cache_events)[PERF_COUNT_HW_CACHE_MAX]
@@ -109,3 +111,27 @@ extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
109 * If an event_id is not subject to the constraint expressed by a particular 111 * If an event_id is not subject to the constraint expressed by a particular
110 * field, then it will have 0 in both the mask and value for that field. 112 * field, then it will have 0 in both the mask and value for that field.
111 */ 113 */
114
115extern ssize_t power_events_sysfs_show(struct device *dev,
116 struct device_attribute *attr, char *page);
117
118/*
119 * EVENT_VAR() is same as PMU_EVENT_VAR with a suffix.
120 *
121 * Having a suffix allows us to have aliases in sysfs - eg: the generic
122 * event 'cpu-cycles' can have two entries in sysfs: 'cpu-cycles' and
123 * 'PM_CYC' where the latter is the name by which the event is known in
124 * POWER CPU specification.
125 */
126#define EVENT_VAR(_id, _suffix) event_attr_##_id##_suffix
127#define EVENT_PTR(_id, _suffix) &EVENT_VAR(_id, _suffix).attr.attr
128
129#define EVENT_ATTR(_name, _id, _suffix) \
130 PMU_EVENT_ATTR(_name, EVENT_VAR(_id, _suffix), PME_PM_##_id, \
131 power_events_sysfs_show)
132
133#define GENERIC_EVENT_ATTR(_name, _id) EVENT_ATTR(_name, _id, _g)
134#define GENERIC_EVENT_PTR(_id) EVENT_PTR(_id, _g)
135
136#define POWER_EVENT_ATTR(_name, _id) EVENT_ATTR(PM_##_name, _id, _p)
137#define POWER_EVENT_PTR(_id) EVENT_PTR(_id, _p)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index aa2465e21f1a..fa476d50791f 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1305,6 +1305,16 @@ static int power_pmu_event_idx(struct perf_event *event)
1305 return event->hw.idx; 1305 return event->hw.idx;
1306} 1306}
1307 1307
1308ssize_t power_events_sysfs_show(struct device *dev,
1309 struct device_attribute *attr, char *page)
1310{
1311 struct perf_pmu_events_attr *pmu_attr;
1312
1313 pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
1314
1315 return sprintf(page, "event=0x%02llx\n", pmu_attr->id);
1316}
1317
1308struct pmu power_pmu = { 1318struct pmu power_pmu = {
1309 .pmu_enable = power_pmu_enable, 1319 .pmu_enable = power_pmu_enable,
1310 .pmu_disable = power_pmu_disable, 1320 .pmu_disable = power_pmu_disable,
@@ -1537,6 +1547,8 @@ int __cpuinit register_power_pmu(struct power_pmu *pmu)
1537 pr_info("%s performance monitor hardware support registered\n", 1547 pr_info("%s performance monitor hardware support registered\n",
1538 pmu->name); 1548 pmu->name);
1539 1549
1550 power_pmu.attr_groups = ppmu->attr_groups;
1551
1540#ifdef MSR_HV 1552#ifdef MSR_HV
1541 /* 1553 /*
1542 * Use FCHV to ignore kernel events if MSR.HV is set. 1554 * Use FCHV to ignore kernel events if MSR.HV is set.
diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index 2ee01e38d5e2..b554879bd31e 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -51,6 +51,18 @@
51#define MMCR1_PMCSEL_MSK 0xff 51#define MMCR1_PMCSEL_MSK 0xff
52 52
53/* 53/*
54 * Power7 event codes.
55 */
56#define PME_PM_CYC 0x1e
57#define PME_PM_GCT_NOSLOT_CYC 0x100f8
58#define PME_PM_CMPLU_STALL 0x4000a
59#define PME_PM_INST_CMPL 0x2
60#define PME_PM_LD_REF_L1 0xc880
61#define PME_PM_LD_MISS_L1 0x400f0
62#define PME_PM_BRU_FIN 0x10068
63#define PME_PM_BRU_MPRED 0x400f6
64
65/*
54 * Layout of constraint bits: 66 * Layout of constraint bits:
55 * 6666555555555544444444443333333333222222222211111111110000000000 67 * 6666555555555544444444443333333333222222222211111111110000000000
56 * 3210987654321098765432109876543210987654321098765432109876543210 68 * 3210987654321098765432109876543210987654321098765432109876543210
@@ -307,14 +319,14 @@ static void power7_disable_pmc(unsigned int pmc, unsigned long mmcr[])
307} 319}
308 320
309static int power7_generic_events[] = { 321static int power7_generic_events[] = {
310 [PERF_COUNT_HW_CPU_CYCLES] = 0x1e, 322 [PERF_COUNT_HW_CPU_CYCLES] = PME_PM_CYC,
311 [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x100f8, /* GCT_NOSLOT_CYC */ 323 [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = PME_PM_GCT_NOSLOT_CYC,
312 [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x4000a, /* CMPLU_STALL */ 324 [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = PME_PM_CMPLU_STALL,
313 [PERF_COUNT_HW_INSTRUCTIONS] = 2, 325 [PERF_COUNT_HW_INSTRUCTIONS] = PME_PM_INST_CMPL,
314 [PERF_COUNT_HW_CACHE_REFERENCES] = 0xc880, /* LD_REF_L1_LSU*/ 326 [PERF_COUNT_HW_CACHE_REFERENCES] = PME_PM_LD_REF_L1,
315 [PERF_COUNT_HW_CACHE_MISSES] = 0x400f0, /* LD_MISS_L1 */ 327 [PERF_COUNT_HW_CACHE_MISSES] = PME_PM_LD_MISS_L1,
316 [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x10068, /* BRU_FIN */ 328 [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = PME_PM_BRU_FIN,
317 [PERF_COUNT_HW_BRANCH_MISSES] = 0x400f6, /* BR_MPRED */ 329 [PERF_COUNT_HW_BRANCH_MISSES] = PME_PM_BRU_MPRED,
318}; 330};
319 331
320#define C(x) PERF_COUNT_HW_CACHE_##x 332#define C(x) PERF_COUNT_HW_CACHE_##x
@@ -362,6 +374,57 @@ static int power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
362 }, 374 },
363}; 375};
364 376
377
378GENERIC_EVENT_ATTR(cpu-cycles, CYC);
379GENERIC_EVENT_ATTR(stalled-cycles-frontend, GCT_NOSLOT_CYC);
380GENERIC_EVENT_ATTR(stalled-cycles-backend, CMPLU_STALL);
381GENERIC_EVENT_ATTR(instructions, INST_CMPL);
382GENERIC_EVENT_ATTR(cache-references, LD_REF_L1);
383GENERIC_EVENT_ATTR(cache-misses, LD_MISS_L1);
384GENERIC_EVENT_ATTR(branch-instructions, BRU_FIN);
385GENERIC_EVENT_ATTR(branch-misses, BRU_MPRED);
386
387POWER_EVENT_ATTR(CYC, CYC);
388POWER_EVENT_ATTR(GCT_NOSLOT_CYC, GCT_NOSLOT_CYC);
389POWER_EVENT_ATTR(CMPLU_STALL, CMPLU_STALL);
390POWER_EVENT_ATTR(INST_CMPL, INST_CMPL);
391POWER_EVENT_ATTR(LD_REF_L1, LD_REF_L1);
392POWER_EVENT_ATTR(LD_MISS_L1, LD_MISS_L1);
393POWER_EVENT_ATTR(BRU_FIN, BRU_FIN)
394POWER_EVENT_ATTR(BRU_MPRED, BRU_MPRED);
395
396static struct attribute *power7_events_attr[] = {
397 GENERIC_EVENT_PTR(CYC),
398 GENERIC_EVENT_PTR(GCT_NOSLOT_CYC),
399 GENERIC_EVENT_PTR(CMPLU_STALL),
400 GENERIC_EVENT_PTR(INST_CMPL),
401 GENERIC_EVENT_PTR(LD_REF_L1),
402 GENERIC_EVENT_PTR(LD_MISS_L1),
403 GENERIC_EVENT_PTR(BRU_FIN),
404 GENERIC_EVENT_PTR(BRU_MPRED),
405
406 POWER_EVENT_PTR(CYC),
407 POWER_EVENT_PTR(GCT_NOSLOT_CYC),
408 POWER_EVENT_PTR(CMPLU_STALL),
409 POWER_EVENT_PTR(INST_CMPL),
410 POWER_EVENT_PTR(LD_REF_L1),
411 POWER_EVENT_PTR(LD_MISS_L1),
412 POWER_EVENT_PTR(BRU_FIN),
413 POWER_EVENT_PTR(BRU_MPRED),
414 NULL
415};
416
417
418static struct attribute_group power7_pmu_events_group = {
419 .name = "events",
420 .attrs = power7_events_attr,
421};
422
423static const struct attribute_group *power7_pmu_attr_groups[] = {
424 &power7_pmu_events_group,
425 NULL,
426};
427
365static struct power_pmu power7_pmu = { 428static struct power_pmu power7_pmu = {
366 .name = "POWER7", 429 .name = "POWER7",
367 .n_counter = 6, 430 .n_counter = 6,
@@ -373,6 +436,7 @@ static struct power_pmu power7_pmu = {
373 .get_alternatives = power7_get_alternatives, 436 .get_alternatives = power7_get_alternatives,
374 .disable_pmc = power7_disable_pmc, 437 .disable_pmc = power7_disable_pmc,
375 .flags = PPMU_ALT_SIPR, 438 .flags = PPMU_ALT_SIPR,
439 .attr_groups = power7_pmu_attr_groups,
376 .n_generic = ARRAY_SIZE(power7_generic_events), 440 .n_generic = ARRAY_SIZE(power7_generic_events),
377 .generic_events = power7_generic_events, 441 .generic_events = power7_generic_events,
378 .cache_events = &power7_cache_events, 442 .cache_events = &power7_cache_events,