aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/perf
Commit message (Collapse)AuthorAge
* Merge branch 'next' of ↵Linus Torvalds2013-02-23
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Benjamin Herrenschmidt: "So from the depth of frozen Minnesota, here's the powerpc pull request for 3.9. It has a few interesting highlights, in addition to the usual bunch of bug fixes, minor updates, embedded device tree updates and new boards: - Hand tuned asm implementation of SHA1 (by Paulus & Michael Ellerman) - Support for Doorbell interrupts on Power8 (kind of fast thread-thread IPIs) by Ian Munsie - Long overdue cleanup of the way we handle relocation of our open firmware trampoline (prom_init.c) on 64-bit by Anton Blanchard - Support for saving/restoring & context switching the PPR (Processor Priority Register) on server processors that support it. This allows the kernel to preserve thread priorities established by userspace. By Haren Myneni. - DAWR (new watchpoint facility) support on Power8 by Michael Neuling - Ability to change the DSCR (Data Stream Control Register) which controls cache prefetching on a running process via ptrace by Alexey Kardashevskiy - Support for context switching the TAR register on Power8 (new branch target register meant to be used by some new specific userspace perf event interrupt facility which is yet to be enabled) by Ian Munsie. - Improve preservation of the CFAR register (which captures the origin of a branch) on various exception conditions by Paulus. - Move the Bestcomm DMA driver from arch powerpc to drivers/dma where it belongs by Philippe De Muyter - Support for Transactional Memory on Power8 by Michael Neuling (based on original work by Matt Evans). For those curious about the feature, the patch contains a pretty good description." (See commit db8ff907027b: "powerpc: Documentation for transactional memory on powerpc" for the mentioned description added to the file Documentation/powerpc/transactional_memory.txt) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (140 commits) powerpc/kexec: Disable hard IRQ before kexec powerpc/85xx: l2sram - Add compatible string for BSC9131 platform powerpc/85xx: bsc9131 - Correct typo in SDHC device node powerpc/e500/qemu-e500: enable coreint powerpc/mpic: allow coreint to be determined by MPIC version powerpc/fsl_pci: Store the pci ctlr device ptr in the pci ctlr struct powerpc/85xx: Board support for ppa8548 powerpc/fsl: remove extraneous DIU platform functions arch/powerpc/platforms/85xx/p1022_ds.c: adjust duplicate test powerpc: Documentation for transactional memory on powerpc powerpc: Add transactional memory to pseries and ppc64 defconfigs powerpc: Add config option for transactional memory powerpc: Add transactional memory to POWER8 cpu features powerpc: Add new transactional memory state to the signal context powerpc: Hook in new transactional memory code powerpc: Routines for FP/VSX/VMX unavailable during a transaction powerpc: Add transactional memory unavaliable execption handler powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes powerpc: Add FP/VSX and VMX register load functions for transactional memory powerpc: Add helper functions for transactional memory context switching ...
| * perf/Power: PERF_EVENT_IOC_ENABLE does not reenable eventsukadev@linux.vnet.ibm.com2013-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf/Power: PERF_EVENT_IOC_ENABLE does not reenable event If we disable a perf event because we exceeded the specified ->event_limit, power_pmu_stop() sets the PERF_HES_STOPPED flag on the event. If the application then re-enables the event using PERF_EVENT_IOC_ENABLE ioctl, we don't ever clear this STOPPED flag. Consequently, the user space is never notified of the event. Following message has more background and test case. http://lists.eecs.utk.edu/pipermail/ptools-perfapi/2012-October/002528.html Used the following test cases to verify that this patch works on latest PAPI. $ papi.git/src/ctests/nonthread PAPI_TOT_CYC@5000000 $ papi.git/src/ctests/overflow_single_event Changelog[v2]: - [Paul Mackerras] Also clear PERF_HES_UPTODATE flag since we are restarting the event; cleanup comments and patch description. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/perf: Fix for PMCs not making progressMichael Neuling2013-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On POWER7 when we have really small counts left before overflow, we can take a PMU IRQ, but the PMC gets wound back to just before the overflow. If the kernel is setting the PMC to a value just before the overflow, we can get interrupted again without the PMC making any progress (ie another buggy overflow). In this case, we can end up making no forward progress, with the PMC interrupt returning us to the same count over and over. The below detects when we are making no forward progress (ie. delta = 0) and then increases the amount left before the overflow. This stops us from locking up. Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> cc: Paul Mackerras <paulus@samba.org> cc: Anton Blanchard <anton@samba.org> cc: Linux PPC dev <linuxppc-dev@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/perf: Fix finding overflowed PMC in interruptMichael Neuling2013-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a PMC is about to overflow on a counter that's on an active perf event (ie. less than 256 from the end) and a _different_ PMC overflows just at this time (a PMC that's not on an active perf event), we currently mark the event as found, but in reality it's not as it's likely the other PMC that caused the IRQ. Since we mark it as found the second catch all for overflows doesn't run, and we don't reset the overflowing PMC ever. Hence we keep hitting that same PMC IRQ over and over and don't reset the actual overflowing counter. This is a rewrite of the perf interrupt handler for book3s to get around this. We now check to see if any of the PMCs have actually overflowed (ie >= 0x80000000). If yes, record it for active counters and just reset it for inactive counters. If it's not overflowed, then we check to see if it's one of the buggy power7 counters and if it is, record it and continue. If none of the PMCs match this, then we make note that we couldn't find the PMC that caused the IRQ. Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> cc: Paul Mackerras <paulus@samba.org> cc: Anton Blanchard <anton@samba.org> cc: Linux PPC dev <linuxppc-dev@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/perf: Add stalled-cycles eventsChris Freehill2013-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support for stalled-cycles-frontend and stalled-cycles-backend is added for e500-based processors. The following mappings are used: stalled-cycles-frontend or idle-cycles-frontend: Com:18 Cycles decode stalled stalled-cycles-backend or idle-cycles-backend Com:19 cycles issue stalled Signed-off-by: Chris Freehill <chrisf@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | perf/POWER7: Make some POWER7 events available in sysfsSukadev Bhattiprolu2013-01-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make some POWER7-specific perf events available in sysfs. $ /bin/ls -1 /sys/bus/event_source/devices/cpu/events/ branch-instructions branch-misses cache-misses cache-references cpu-cycles instructions PM_BRU_FIN PM_BRU_MPRED PM_CMPLU_STALL PM_CYC PM_GCT_NOSLOT_CYC PM_INST_CMPL PM_LD_MISS_L1 PM_LD_REF_L1 stalled-cycles-backend stalled-cycles-frontend where the 'PM_*' events are POWER specific and the others are the generic events. This will enable users to specify these events with their symbolic names rather than with their raw code. perf stat -e 'cpu/PM_CYC' ... Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062528.GE13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf/POWER7: Make generic event translations available in sysfsSukadev Bhattiprolu2013-01-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the generic perf events in POWER7 available via sysfs. $ ls /sys/bus/event_source/devices/cpu/events branch-instructions branch-misses cache-misses cache-references cpu-cycles instructions stalled-cycles-backend stalled-cycles-frontend $ cat /sys/bus/event_source/devices/cpu/events/cache-misses event=0x400f0 This patch is based on commits that implement this functionality on x86. Eg: commit a47473939db20e3961b200eb00acf5fcf084d755 Author: Jiri Olsa <jolsa@redhat.com> Date: Wed Oct 10 14:53:11 2012 +0200 perf/x86: Make hardware event translations available in sysfs Changelog:[v2] [Jiri Osla] Drop EVENT_ID() macro since it is only used once. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062454.GD13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf/Power7: Use macros to identify perf eventsSukadev Bhattiprolu2013-01-31
|/ | | | | | | | | | | | | | | | | | | | Define and use macros to identify perf events codes This would make it easier and more readable when these event codes need to be used in more than one place. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062353.GB13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* powerpc/perf: Add missing L2 constraint handling in Power7 PMUMichael Ellerman2012-11-14
| | | | | | | | | | If we have two cache events that require different settings of the L2SEL bits in MMCR1 then we can not schedule those events simultaneously. Add logic to the constraint handling to express that. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Revert "powerpc/perf: Use pmc_overflow() to detect rolled back events"Benjamin Herrenschmidt2012-10-17
| | | | | | | This reverts commit 813312110bede27bffd082c25cd31730bd567beb. This revert was requested by the author of the patch as it seems to cause system hangs with some low frequency events
* powerpc/perf: Sample only if SIAR-Valid bit is set in P7+sukadev@linux.vnet.ibm.com2012-09-26
| | | | | | | | | | | | | | | | | | | | powerpc/perf: Sample only if SIAR-Valid bit is set in P7+ On POWER7+ two new bits (mmcra[35] and mmcra[36]) indicate whether the contents of SIAR and SDAR are valid. For marked instructions on P7+, we must save the contents of SIAR and SDAR registers only if these new bits are set. This code/check for the SIAR-Valid bit is specific to P7+, so rather than waste a CPU-feature bit use the PVR flag. Note that Carl Love proposed a similar change for oprofile: https://lkml.org/lkml/2012/6/22/309 Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'merge' into nextBenjamin Herrenschmidt2012-09-06
|\ | | | | | | Brings in various bug fixes from 3.6-rcX
| * powerpc/perf: Use pmc_overflow() to detect rolled back eventsSukadev Bhattiprolu2012-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For certain speculative events on Power7, 'perf stat' reports far higher event count than 'perf record' for the same event. As described in following commit, a performance monitor exception is raised even when the the performance events are rolled back. commit 0837e3242c73566fc1c0196b4ec61779c25ffc93 Author: Anton Blanchard <anton@samba.org> Date: Wed Mar 9 14:38:42 2011 +1100 perf_event_interrupt() records an event only when an overflow occurs. But this check for overflow is a simple 'if (val < 0)'. Because the events are rolled back, this check for overflow fails and the event is not recorded. perf_event_interrupt() later uses pmc_overflow() to detect the overflow and resets the counters and the events are lost completely. To properly detect the overflow of rolled back events, use pmc_overflow() even when recording events. To reproduce: $ cat strcpy.c #include <stdio.h> #include <string.h> main() { char buf[256]; alarm(5); while(1) strcpy(buf, "string1"); } $ perf record -e r20014 ./strcpy $ perf report -n > report.1 $ perf stat -e r20014 > report.2 # Compare report.1 and report.2 Reported-by: Maynard Johnson <mpjohn@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Rename 64-bit PVR constants to PVR_fooMichael Ellerman2012-09-05
|/ | | | | | | | | | | | | | | We have an old FIXME in reg.h which points out that we should standardise on PVR_foo for our PVR #defines. Currently we use PVR_ on 32-bit and PV_ on 64-bit. So do that rename and remove the FIXME. Seeing as we're touching all but one usage of __is_processor(), rename it to something less ugly and more indicative of what it does, which is simply to check the PVR version. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/perf: Use perf_instruction_pointer in callchainsAnton Blanchard2012-07-10
| | | | | | | | | | We use SIAR or regs->nip for the instruction pointer depending on the PMU configuration, but we always use regs->nip in the callchain. Use perf_instruction_pointer so the backtrace is consistent. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/perf: Always use pt_regs for userspace samplesAnton Blanchard2012-07-10
| | | | | | | | | | | | | | | | | | | At the moment we always use the SIAR if the PMU supports continuous sampling. Unfortunately the SIAR and the PMU exception are not synchronised for non marked events so we can end up with callchains that dont make sense. The following patch checks the HV and PR bits for samples coming from userspace and always uses pt_regs for them. Userspace will never have interrupts off so there is no real advantage to using the SIAR for non marked events in userspace. I had experimented with a patch that did a similar thing for kernel samples but we lost a significant amount of information. I was unable to profile any of our early exception code for example. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/perf: Move code to select SIAR or pt_regs into perf_read_regsAnton Blanchard2012-07-10
| | | | | | | | | | | | The logic to choose whether to use the SIAR or get the information out of pt_regs is going to get more complicated, so do it once in perf_read_regs. We overload regs->result which is gross but we are already doing it with regs->dsisr. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/perf: Create mmcra_sihv/mmcra_sipv helpersAnton Blanchard2012-07-10
| | | | | | | | | We want to access the MMCRA_SIHV and MMCRA_SIPR bits elsewhere so create mmcra_sihv and mmcra_sipr which hide the differences between the old and new layout of the bits. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* perf: Pass last sampling period to perf_sample_data_init()Robert Richter2012-05-09
| | | | | | | | | | | | | | | | | | | | | We always need to pass the last sample period to perf_sample_data_init(), otherwise the event distribution will be wrong. Thus, modifiyng the function interface with the required period as argument. So basically a pattern like this: perf_sample_data_init(&data, ~0ULL); data.period = event->hw.last_period; will now be like that: perf_sample_data_init(&data, ~0ULL, event->hw.last_period); Avoids unininitialized data.period and simplifies code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* powerpc/perf: Fix instruction address sampling on 970 and Power4Benjamin Herrenschmidt2012-03-27
| | | | | | | | | | | | | | 970 and Power4 don't support "continuous sampling" which means that when we aren't in marked instruction sampling mode (marked events), SIAR isn't updated with the last instruction sampled before the perf interrupt. On those processors, we must thus use the exception SRR0 value as the sampled instruction pointer. Those processors also don't support the SIPR and SIHV bits in MMCRA which means we need some kind of heuristic to decide if SIAR values represent kernel or user addresses. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'next' of ↵Linus Torvalds2012-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc merge from Benjamin Herrenschmidt: "Here's the powerpc batch for this merge window. It is going to be a bit more nasty than usual as in touching things outside of arch/powerpc mostly due to the big iSeriesectomy :-) We finally got rid of the bugger (legacy iSeries support) which was a PITA to maintain and that nobody really used anymore. Here are some of the highlights: - Legacy iSeries is gone. Thanks Stephen ! There's still some bits and pieces remaining if you do a grep -ir series arch/powerpc but they are harmless and will be removed in the next few weeks hopefully. - The 'fadump' functionality (Firmware Assisted Dump) replaces the previous (equivalent) "pHyp assisted dump"... it's a rewrite of a mechanism to get the hypervisor to do crash dumps on pSeries, the new implementation hopefully being much more reliable. Thanks Mahesh Salgaonkar. - The "EEH" code (pSeries PCI error handling & recovery) got a big spring cleaning, motivated by the need to be able to implement a new backend for it on top of some new different type of firwmare. The work isn't complete yet, but a good chunk of the cleanups is there. Note that this adds a field to struct device_node which is not very nice and which Grant objects to. I will have a patch soon that moves that to a powerpc private data structure (hopefully before rc1) and we'll improve things further later on (hopefully getting rid of the need for that pointer completely). Thanks Gavin Shan. - I dug into our exception & interrupt handling code to improve the way we do lazy interrupt handling (and make it work properly with "edge" triggered interrupt sources), and while at it found & fixed a wagon of issues in those areas, including adding support for page fault retry & fatal signals on page faults. - Your usual random batch of small fixes & updates, including a bunch of new embedded boards, both Freescale and APM based ones, etc..." I fixed up some conflicts with the generalized irq-domain changes from Grant Likely, hopefully correctly. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (141 commits) powerpc/ps3: Do not adjust the wrapper load address powerpc: Remove the rest of the legacy iSeries include files powerpc: Remove the remaining CONFIG_PPC_ISERIES pieces init: Remove CONFIG_PPC_ISERIES powerpc: Remove FW_FEATURE ISERIES from arch code tty/hvc_vio: FW_FEATURE_ISERIES is no longer selectable powerpc/spufs: Fix double unlocks powerpc/5200: convert mpc5200 to use of_platform_populate() powerpc/mpc5200: add options to mpc5200_defconfig powerpc/mpc52xx: add a4m072 board support powerpc/mpc5200: update mpc5200_defconfig to fit for charon board Documentation/powerpc/mpc52xx.txt: Checkpatch cleanup powerpc/44x: Add additional device support for APM821xx SoC and Bluestone board powerpc/44x: Add support PCI-E for APM821xx SoC and Bluestone board MAINTAINERS: Update PowerPC 4xx tree powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board powerpc: document the FSL MPIC message register binding powerpc: add support for MPIC message register API powerpc/fsl: Added aliased MSIIR register address to MSI node in dts powerpc/85xx: mpc8548cds - add 36-bit dts ...
* powerpc/perf: Move perf core & PMU code into a subdirectoryMichael Ellerman2012-02-22
The perf code has grown a lot since it started, and is big enough to warrant its own subdirectory. For reference it's ~60% bigger than the oprofile code. It declutters the kernel directory, makes it simpler to grep for "just perf stuff", and allows us to shorten some filenames. While we're at it, make it more obvious that we have two implementations of the core perf logic. One for (roughly) Book3S CPUs, which was the original implementation, and the other for Freescale embedded CPUs. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>