aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/perf_event.c
Commit message (Collapse)AuthorAge
...
| | * perf: Rework the PMU methodsPeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace pmu::{enable,disable,start,stop,unthrottle} with pmu::{add,del,start,stop}, all of which take a flags argument. The new interface extends the capability to stop a counter while keeping it scheduled on the PMU. We replace the throttled state with the generic stopped state. This also allows us to efficiently stop/start counters over certain code paths (like IRQ handlers). It also allows scheduling a counter without it starting, allowing for a generic frozen state (useful for rotating stopped counters). The stopped state is implemented in two different ways, depending on how the architecture implemented the throttled state: 1) We disable the counter: a) the pmu has per-counter enable bits, we flip that b) we program a NOP event, preserving the counter state 2) We store the counter state and ignore all read/overflow events Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * perf: Shrink hw_perf_eventPeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use hw_perf_event::period_left instead of hw_perf_event::remaining and win back 8 bytes. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * perf: Default PMU opsPeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide default implementations for the pmu txn methods, this allows us to remove some conditional code. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * perf: Per PMU disablePeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes perf_disable() into perf_pmu_disable(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * perf: Reduce perf_disable() usagePeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the current perf_disable() usage is only an optimization, remove it for now. This eases the removal of the __weak hw_perf_enable() interface. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * perf: Unindent labelsPeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fixup random annoying style bits. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * perf: Register PMU implementationsPeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simple registration interface for struct pmu, this provides the infrastructure for removing all the weak functions. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * perf: Deconstify struct pmuPeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"` Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * Merge branch 'perf/urgent' into perf/coreIngo Molnar2010-09-09
| | |\ | | | | | | | | | | | | | | | | | | | | Merge reason: Pick up pending fixes before applying dependent new changes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * \ Merge branch 'tip/perf/urgent' of ↵Ingo Molnar2010-08-19
| | |\ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
| | * | | perf: Humanize the number of contextsFrederic Weisbecker2010-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of hardcoding the number of contexts for the recursions barriers, define a cpp constant to make the code more self-explanatory. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com>
| | * | | perf: Fix race in callchainsFrederic Weisbecker2010-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that software events don't have interrupt disabled anymore in the event path, callchains can nest on any context. So seperating nmi and others contexts in two buffers has become racy. Fix this by providing one buffer per nesting level. Given the size of the callchain entries (2040 bytes * 4), we now need to allocate them dynamically. v2: Fixed put_callchain_entry call after recursion. Fix the type of the recursion, it must be an array. v3: Use a manual pr cpu allocation (temporary solution until NMIs can safely access vmalloc'ed memory). Do a better separation between callchain reference tracking and allocation. Make the "put" path lockless for non-release cases. v4: Protect the callchain buffers with rcu. v5: Do the cpu buffers allocations node affine. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David Miller <davem@davemloft.net> Cc: Borislav Petkov <bp@amd64.org>
| | * | | perf: Factorize callchain context handlingFrederic Weisbecker2010-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Store the kernel and user contexts from the generic layer instead of archs, this gathers some repetitive code. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mackerras <paulus@samba.org> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Miller <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Borislav Petkov <bp@amd64.org>
| | * | | perf: Generalize some arch callchain codeFrederic Weisbecker2010-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Most archs use one callchain buffer per cpu, except x86 that needs to deal with NMIs. Provide a default perf_callchain_buffer() implementation that x86 overrides. - Centralize all the kernel/user regs handling and invoke new arch handlers from there: perf_callchain_user() / perf_callchain_kernel() That avoid all the user_mode(), current->mm checks and so... - Invert some parameters in perf_callchain_*() helpers: entry to the left, regs to the right, following the traditional (dst, src). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mackerras <paulus@samba.org> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Miller <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Borislav Petkov <bp@amd64.org>
| * | | | Merge branch 'oprofile/urgent' (early part) into oprofile/perfRobert Richter2010-10-11
| |\ \ \ \ | | | |_|/ | | |/| |
| * | | | perf: New helper function for pmu nameMatt Fleming2010-10-11
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce perf_pmu_name() helper function that returns the name of the pmu. This gives us a generic way to get the name of a pmu regardless of how an architecture identifies it internally. Signed-off-by: Matt Fleming <matt@console-pimps.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Robert Richter <robert.richter@amd.com>
* | | | perf: Fix incorrect copy_from_user() usageJohn Blackwood2010-10-12
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf events: repair incorrect use of copy_from_user This makes the perf_event_period() return 0 instead of -EFAULT on success. Signed-off-by: John Blackwood<john.blackwood@ccur.com> Signed-off-by: Joe Korty <joe.korty@ccur.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100928220311.GA18145@tsunami.ccur.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | perf: Fix CPU hotplugPeter Zijlstra2010-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | Since we have UP_PREPARE, we should also have UP_CANCELED. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | perf_events: Fix time tracking for events with pid != -1 and cpu != -1Stephane Eranian2010-08-30
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per-thread events with a cpu filter, i.e., cpu != -1, were not reporting correct timings when the thread never ran on the monitored cpu. The time enabled was reported as a negative value. This patch fixes the problem by updating tstamp_stopped, tstamp_running in event_sched_out() for events with filters and which are marked as INACTIVE. The function group_sched_out() is modified to systematically call into event_sched_out() to avoid duplicating the timing adjustment code twice. With the patch, I now get: $ task_cpu -i -e unhalted_core_cycles,unhalted_core_cycles noploop 2 noploop for 2 seconds CPU0 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU0 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU1 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU1 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU2 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU2 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU3 4,747,990,931 unhalted_core_cycles (ena=1,991,136,594, run=1,991,136,594) CPU3 4,747,990,931 unhalted_core_cycles (ena=1,991,136,594, run=1,991,136,594) Signed-off-by: Stephane Eranian <eranian@gmail.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: davem@davemloft.net Cc: fweisbec@gmail.com Cc: perfmon2-devel@lists.sf.net Cc: eranian@google.com LKML-Reference: <4c76802d.aae9d80a.115d.70fe@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2010-08-06
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits) sched: Use correct macro to display sched_child_runs_first in /proc/sched_debug sched: No need for bootmem special cases sched: Revert nohz_ratelimit() for now sched: Reduce update_group_power() calls sched: Update rq->clock for nohz balanced cpus sched: Fix spelling of sibling sched, cpuset: Drop __cpuexit from cpu hotplug callbacks sched: Fix the racy usage of thread_group_cputimer() in fastpath_timer_check() sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand() sched: thread_group_cputime: Simplify, document the "alive" check sched: Remove the obsolete exit_state/signal hacks sched: task_tick_rt: Remove the obsolete ->signal != NULL check sched: __sched_setscheduler: Read the RLIMIT_RTPRIO value lockless sched: Fix comments to make them DocBook happy sched: Fix fix_small_capacity powerpc: Exclude arch_sd_sibiling_asym_packing() on UP powerpc: Enable asymmetric SMT scheduling on POWER7 sched: Add asymmetric group packing option for sibling domain sched: Fix capacity calculations for SMT4 sched: Change nohz idle load balancing logic to push model ...
| * Merge commit 'v2.6.35-rc3' into sched/coreIngo Molnar2010-06-18
| |\ | | | | | | | | | Merge reason: Update to the latest -rc.
| * | sched_clock: Add local_clock() API and improve documentationPeter Zijlstra2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For people who otherwise get to write: cpu_clock(smp_processor_id()), there is now: local_clock(). Also, as per suggestion from Andrew, provide some documentation on the various clock interfaces, and minimize the unsigned long long vs u64 mess. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jens Axboe <jaxboe@fusionio.com> LKML-Reference: <1275052414.1645.52.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'perf/core' of ↵Ingo Molnar2010-06-09
|\ \ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
| * | | perf: Drop the skip argument from perf_arch_fetch_regs_callerFrederic Weisbecker2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop this argument now that we always want to rewind only to the state of the first caller. It means frame pointers are not necessary anymore to reliably get the source of an event. But this also means we need this helper to be a macro now, as an inline function is not an option since we need to know when to provide a default implentation. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | perf: Convert perf_event to local_tPeter Zijlstra2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since now all modification to event->count (and ->prev_count and ->period_left) are local to a cpu, change then to local64_t so we avoid the LOCK'ed ops. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf: Add perf_event::child_countPeter Zijlstra2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only child counters adding back their values into the parent counter are responsible for cross-cpu updates to event->count. So if we pull that out into a new child_count variable, we get an event->count that is only modified locally. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf: Add perf_event_count()Peter Zijlstra2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a helper function for those sites that want to read the event count. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf: Simplify the ring-buffer logic: make perf_buffer_alloc() do everything ↵Peter Zijlstra2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | needed Currently there are perf_buffer_alloc() + perf_buffer_init() + some separate bits, fold it all into a single perf_buffer_alloc() and only leave the attachment to the event separate. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf: Rename perf_mmap_data to perf_bufferPeter Zijlstra2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename to clarify code. s/perf_mmap_data/perf_buffer/g and selective s/data/buffer/g Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf: Cleanup {start,commit,cancel}_txn detailsPeter Zijlstra2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clarify some of the transactional group scheduling API details and change it so that a successfull ->commit_txn also closes the transaction. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1274803086.5882.1752.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf: Add non-exec mmap() trackingEric B Munson2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the capacility to track data mmap()s. This can be used together with PERF_SAMPLE_ADDR for data profiling. Signed-off-by: Anton Blanchard <anton@samba.org> [Updated code for stable perf ABI] Signed-off-by: Eric B Munson <ebmunson@us.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1274193049-25997-1-git-send-email-ebmunson@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf, trace: Remove superfluous rcu_read_lock()Peter Zijlstra2010-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __DO_TRACE() already calls the callbacks under rcu_read_lock_sched(), which is sufficient for our needs, avoid doing it again. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf, trace: Inline perf_swevent_put_recursion_context()Peter Zijlstra2010-06-09
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | Inline perf_swevent_put_recursion_context into perf_tp_event(), this shrinks the per trace template code footprint and saves a function call. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | perf: Fix signed comparison in perf_adjust_period()Peter Zijlstra2010-06-08
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Frederic reported that frequency driven swevents didn't work properly and even caused a division-by-zero error. It turns out there are two bugs, the division-by-zero comes from a failure to deal with that in perf_calculate_period(). The other was more interesting and turned out to be a wrong comparison in perf_adjust_period(). The comparison was between an s64 and u64 and got implicitly converted to an unsigned comparison. The problem is that period_left is typically < 0, so it ended up being always true. Cure this by making the local period variables s64. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf: Fix crash in sweventsPeter Zijlstra2010-06-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Frederic reported that because swevents handling doesn't disable IRQs anymore, we can get a recursion of perf_adjust_period(), once from overflow handling and once from the tick. If both call ->disable, we get a double hlist_del_rcu() and trigger a LIST_POISON2 dereference. Since we don't actually need to stop/start a swevent to re-programm the hardware (lack of hardware to program), simply nop out these callbacks for the swevent pmu. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1275557609.27810.35218.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf_events: Fix unincremented buffer base on partial copyFrederic Weisbecker2010-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a sample size crosses to the next page boundary, the copy will be made in more than one step. However we forget to advance the source offset for the next copy, leading to unexpected double copies that completely mess up the traces. This fixes various kinds of bad traces that have irrelevant data inside, as an example: geany-4979 [001] 5758.077775: sched_switch: prev_comm=! prev_pid=121 prev_prio=0 prev_state=S|D|Z|X|x ==> next_comm= next_pid=7497072 next_prio=0 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1274988898-5639-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf_events: Fix event scheduling issues introduced by transactional APIStephane Eranian2010-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The transactional API patch between the generic and model-specific code introduced several important bugs with event scheduling, at least on X86. If you had pinned events, e.g., watchdog, and were over-committing the PMU, you would get bogus counts. The bug was showing up on Intel CPU because events would move around more often that on AMD. But the problem also existed on AMD, though harder to expose. The issues were: - group_sched_in() was missing a cancel_txn() in the error path - cpuc->n_added was not properly maintained, leading to missing actions in hw_perf_enable(), i.e., n_running being 0. You cannot update n_added until you know the transaction has succeeded. In case of failed transaction n_added was not adjusted back. - in case of failed transactions, event_sched_out() was called and eventually invoked x86_disable_event() to touch the HW reg. But with transactions, on X86, event_sched_in() does not touch HW registers, it simply collects events into a list. Thus, you could end up calling x86_disable_event() on a counter which did not correspond to the current event when idx != -1. The patch modifies the generic and X86 code to avoid all those problems. First, we keep track of the number of events added last. In case the transaction fails, we substract them from n_added. This approach is necessary (as opposed to delaying updates to n_added) because not all event updates use the transaction API, e.g., single events. Second, we encapsulate the event_sched_in() and event_sched_out() in group_sched_in() inside the transaction. That makes the operations symmetrical and you can also detect that you are inside a transaction and skip the HW reg access by checking cpuc->group_flag. With this patch, you can now overcommit the PMU even with pinned system-wide events present and still get valid counts. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1274796225.5882.1389.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf_events: Fix races in group compositionPeter Zijlstra2010-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Group siblings don't pin each-other or the parent, so when we destroy events we must make sure to clean up all cross referencing pointers. In particular, for destruction of a group leader we must be able to find all its siblings and remove their reference to it. This means that detaching an event from its context must not detach it from the group, otherwise we can end up failing to clear all pointers. Solve this by clearly separating the attachment to a context and attachment to a group, and keep the group composed until we destroy the events. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf_events: Fix races and clean up perf_event and perf_mmap_data interactionPeter Zijlstra2010-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to move toward separate buffer objects, rework the whole perf_mmap_data construct to be a more self-sufficient entity, one with its own lifetime rules. This greatly sanitizes the whole output redirection code, which was riddled with bugs and races. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Fix racy use of anon_inode_getfd() in perf_event.cAl Viro2010-05-27
|/ | | | | | | | | once anon_inode_getfd() is called, you can't expect *anything* about struct file that descriptor points to - another thread might be doing whatever it likes with descriptor table at that point. Cc: stable <stable@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* perf: Optimize perf_tp_event_match()Peter Zijlstra2010-05-21
| | | | | | | | | | | | | | | Since we know tracepoints come from kernel context, avoid conditionals that try and establish that very fact. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.904944001@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Remove more code from the fastpathPeter Zijlstra2010-05-21
| | | | | | | | | | | | | Sanity checks cost instructions. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.852926930@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Optimize the !vmalloc backed bufferPeter Zijlstra2010-05-21
| | | | | | | | | | | | | | Reduce code and data by using the knowledge that for !PERF_USE_VMALLOC data_order is always 0. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.795019386@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Optimize perf_output_copy()Peter Zijlstra2010-05-21
| | | | | | | | | | | | | | Reduce the clutter in perf_output_copy() by keeping an interator in perf_output_handle. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.742809176@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Fix wakeup storm for RO mmap()sPeter Zijlstra2010-05-21
| | | | | | | | | | | | | | | | | | RO mmap()s don't update the tail pointer, so comparing against it for determining the written data size doesn't really do any good. Keep track of when we last did a wakeup, and compare against that. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.684479310@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Ensure that IOC_OUTPUT isn't used to create multi-writer buffersPeter Zijlstra2010-05-21
| | | | | | | | | | | | | | Since we want to ensure buffers only have a single writer, we must avoid creating one with multiple. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.528215873@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, trace: Optimize tracepoints by using per-tracepoint-per-cpu hlist to ↵Peter Zijlstra2010-05-21
| | | | | | | | | | | | | | | | | | | | track events Avoid the swevent hash-table by using per-tracepoint hlists. Also, avoid conditionals on the fast path by ordering with probe unregister so that we should never get on the callback path without the data being there. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.473188012@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Fix forgotten preempt_enable by nested writersFrederic Weisbecker2010-05-20
| | | | | | | | | | | | | | | | | | | | | | | | A writer that gets a reference to the buffer handle disables preemption. When we put that reference, we check if we are the outer most writer and if not, we simply return and defer the head update to the outer most writer. The problem here is that preemption is only reenabled by the outer most, that produces preemption count imbalance for every nested writer that exit. So just don't forget to always re-enable preemption when we put the buffer reference, whoever we are. Fixes lots of sleeping in atomic warnings, visible with lock events recording. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: Robert Richter <robert.richter@amd.com>
* Merge branch 'perf/urgent' of ↵Ingo Molnar2010-05-20
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
| * perf: Comply with new rcu checks APIFrederic Weisbecker2010-05-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The software events hlist doesn't fully comply with the new rcu checks api. We need to consider three different sides that access the hlist: - the hlist allocation/release side. This side happens when an events is created or released, accesses to the hlist are serialized under the cpuctx mutex. - the events insertion/removal in the hlist. This side is always serialized against the above one. The hlist is always present during such operations. This side happens when a software event is scheduled in/out. The serialization that ensures the software event is really attached to the context is made under the ctx->lock. - events triggering. This is the read side, it can happen concurrently with any update side. This patch deals with them one by one and anticipates with the separate rcu mem space patches in preparation. This patch fixes various annoying rcu warnings. Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org>