aboutsummaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAge
...
| * | | | | | | | | | | sched: Fix race between ttwu() and task_rq_lock()Peter Zijlstra2010-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thomas found that due to ttwu() changing a task's cpu without holding the rq->lock, task_rq_lock() might end up locking the wrong rq. Avoid this by serializing against TASK_WAKING. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1266241712.15770.420.camel@laptop> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | | | sched: Fix SMT scheduler regression in find_busiest_queue()Suresh Siddha2010-02-16
| | |/ / / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a SMT scheduler performance regression that is leading to a scenario where SMT threads in one core are completely idle while both the SMT threads in another core (on the same socket) are busy. This is caused by this commit (with the problematic code highlighted) commit bdb94aa5dbd8b55e75f5a50b61312fe589e2c2d1 Author: Peter Zijlstra <a.p.zijlstra@chello.nl> Date: Tue Sep 1 10:34:38 2009 +0200 sched: Try to deal with low capacity @@ -4203,15 +4223,18 @@ find_busiest_queue() ... for_each_cpu(i, sched_group_cpus(group)) { + unsigned long power = power_of(i); ... - wl = weighted_cpuload(i); + wl = weighted_cpuload(i) * SCHED_LOAD_SCALE; + wl /= power; - if (rq->nr_running == 1 && wl > imbalance) + if (capacity && rq->nr_running == 1 && wl > imbalance) continue; On a SMT system, power of the HT logical cpu will be 589 and the scheduler load imbalance (for scenarios like the one mentioned above) can be approximately 1024 (SCHED_LOAD_SCALE). The above change of scaling the weighted load with the power will result in "wl > imbalance" and ultimately resulting in find_busiest_queue() return NULL, causing load_balance() to think that the load is well balanced. But infact one of the tasks can be moved to the idle core for optimal performance. We don't need to use the weighted load (wl) scaled by the cpu power to compare with imabalance. In that condition, we already know there is only a single task "rq->nr_running == 1" and the comparison between imbalance, wl is to make sure that we select the correct priority thread which matches imbalance. So we really need to compare the imabalnce with the original weighted load of the cpu and not the scaled load. But in other conditions where we want the most hammered(busiest) cpu, we can use scaled load to ensure that we consider the cpu power in addition to the actual load on that cpu, so that we can move the load away from the guy that is getting most hammered with respect to the actual capacity, as compared with the rest of the cpu's in that busiest group. Fix it. Reported-by: Ma Ling <ling.ma@intel.com> Initial-Analysis-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1266023662.2808.118.camel@sbs-t61.sc.intel.com> Cc: stable@kernel.org [2.6.32.x] Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | | kernel/sched.c: Suppress unused var warningAndrew Morton2010-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On UP: kernel/sched.c: In function 'wake_up_new_task': kernel/sched.c:2631: warning: unused variable 'cpu' Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | | | | | Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2010-02-28
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (172 commits) perf_event, amd: Fix spinlock initialization perf_event: Fix preempt warning in perf_clock() perf tools: Flush maps on COMM events perf_events, x86: Split PMU definitions into separate files perf annotate: Handle samples not at objdump output addr boundaries perf_events, x86: Remove superflous MSR writes perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in() perf_events, x86: AMD event scheduling perf_events: Add new start/stop PMU callbacks perf_events: Report the MMAP pgoff value in bytes perf annotate: Defer allocating sym_priv->hist array perf symbols: Improve debugging information about symtab origins perf top: Use a macro instead of a constant variable perf symbols: Check the right return variable perf/scripts: Tag syscall_name helper as not yet available perf/scripts: Add perf-trace-python Documentation perf/scripts: Remove unnecessary PyTuple resizes perf/scripts: Add syscall tracing scripts perf/scripts: Add Python scripting engine perf/scripts: Remove check-perf-trace from listed scripts ... Fix trivial conflict in tools/perf/util/probe-event.c
| * | | | | | | | | | | perf_event: Fix preempt warning in perf_clock()Peter Zijlstra2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent commit introduced a preemption warning for perf_clock(), use raw_smp_processor_id() to avoid this, it really doesn't matter which cpu we use here. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1267198583.22519.684.camel@laptop> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in()Peter Zijlstra2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the cpu argument to hw_perf_group_sched_in() is always smp_processor_id(), simplify the code a little by removing this argument and using the current cpu where needed. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1265890918.5396.3.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | perf_events, x86: AMD event schedulingStephane Eranian2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds correct AMD NorthBridge event scheduling. NB events are events measuring L3 cache, Hypertransport traffic. They are identified by an event code >= 0xe0. They measure events on the Northbride which is shared by all cores on a package. NB events are counted on a shared set of counters. When a NB event is programmed in a counter, the data actually comes from a shared counter. Thus, access to those counters needs to be synchronized. We implement the synchronization such that no two cores can be measuring NB events using the same counters. Thus, we maintain a per-NB allocation table. The available slot is propagated using the event_constraint structure. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4b703957.0702d00a.6bf2.7b7d@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | perf_events: Add new start/stop PMU callbacksStephane Eranian2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In certain situations, the kernel may need to stop and start the same event rapidly. The current PMU callbacks do not distinguish between stop and release (i.e., stop + free the resource). Thus, a counter may be released, then it will be immediately re-acquired. Event scheduling will again take place with no guarantee to assign the same counter. On some processors, this may event yield to failure to assign the event back due to competion between cores. This patch is adding a new pair of callback to stop and restart a counter without actually release the underlying counter resource. On stop, the counter is stopped, its values saved and that's it. On start, the value is reloaded and counter is restarted (on x86, actual restart is delayed until perf_enable()). Signed-off-by: Stephane Eranian <eranian@google.com> [ added fallback to ->enable/->disable for all other PMUs fixed x86_pmu_start() to call x86_pmu.enable() merged __x86_pmu_disable into x86_pmu_stop() ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4b703875.0a04d00a.7896.ffffb824@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | perf_events: Report the MMAP pgoff value in bytesPeter Zijlstra2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DaveM reported that currently perf interprets the pgoff value reported by the MMAP events as a byte range, but the kernel reports it as a page offset. Since its broken (and unusable) anyway, change the kernel behaviour (ABI) to report bytes indeed, avoiding the need for userspace to deal with PAGE_SIZE things. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | kprobes: Add mcount to the kprobes blacklistMasami Hiramatsu2010-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since mcount function can be called from everywhere, it should be blacklisted. Moreover, the "mcount" symbol is a special symbol name. So, it is better to put it in the generic blacklist. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100205062433.3745.36726.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | perf_events: Optimize perf_event_task_tick()Peter Zijlstra2010-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pretty much all of the calls do perf_disable/perf_enable cycles, pull that out to cut back on hardware programming. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | ftrace: Remove record freezingMasami Hiramatsu2010-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove record freezing. Because kprobes never puts probe on ftrace's mcount call anymore, it doesn't need ftrace to check whether kprobes on it. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: przemyslaw@pawelczyk.it Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100202214925.4694.73469.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | kprobes: Check probe address is reservedMasami Hiramatsu2010-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check whether the address of new probe is already reserved by ftrace or alternatives (on x86) when registering new probe. If reserved, it returns an error and not register the probe. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: przemyslaw@pawelczyk.it Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <20100202214918.4694.94179.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | ftrace/alternatives: Introducing *_text_reserved functionsMasami Hiramatsu2010-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introducing *_text_reserved functions for checking the text address range is partially reserved or not. This patch provides checking routines for x86 smp alternatives and dynamic ftrace. Since both functions modify fixed pieces of kernel text, they should reserve and protect those from other dynamic text modifier, like kprobes. This will also be extended when introducing other subsystems which modify fixed pieces of kernel text. Dynamic text modifiers should avoid those. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: przemyslaw@pawelczyk.it Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <20100202214911.4694.16587.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | kprobes: Disable booster when CONFIG_PREEMPT=yMasami Hiramatsu2010-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Disable kprobe booster when CONFIG_PREEMPT=y at this time, because it can't ensure that all kernel threads preempted on kprobe's boosted slot run out from the slot even using freeze_processes(). The booster on preemptive kernel will be resumed if synchronize_tasks() or something like that is introduced. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100202214904.4694.24330.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | Merge branch 'perf/urgent' into perf/coreIngo Molnar2010-01-29
| |\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: We want to queue up a dependent patch. Also update to later -rc's. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | perf_events: Fix sample_period transfer on inheritPeter Zijlstra2010-01-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One problem with frequency driven counters is that we cannot predict the rate at which they trigger, therefore we have to start them at period=1, this causes a ramp up effect. However, if we fail to propagate the stable state on fork each new child will have to ramp up again. This can lead to significant artifacts in sample data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: eranian@google.com Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1264752266.4283.2121.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | tracing/kprobe: Cleanup unused return value of tracing functionsXiao Guangrong2010-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The return values of the kprobe's tracing functions are meaningless, lets remove these. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4B60E9A3.2040505@cn.fujitsu.com> [fweisbec@gmail: whitespace fixes, drop useless void returns in end of functions] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| * | | | | | | | | | | | perf: Factorize trace events raw sample buffer operationsXiao Guangrong2010-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce ftrace_perf_buf_prepare() and ftrace_perf_buf_submit() to gather the common code that operates on raw events sampling buffer. This cleans up redundant code between regular trace events, syscall events and kprobe events. Changelog v1->v2: - Rename function name as per Masami and Frederic's suggestion - Add __kprobes for ftrace_perf_buf_prepare() and make ftrace_perf_buf_submit() inline as per Masami's suggestion - Export ftrace_perf_buf_prepare since modules will use it Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4B60E92D.9000808@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| * | | | | | | | | | | | perf: Reimplement frequency driven samplingPeter Zijlstra2010-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There was a bug in the old period code that caused intel_pmu_enable_all() or native_write_msr_safe() to show up quite high in the profiles. In staring at that code it made my head hurt, so I rewrote it in a hopefully simpler fashion. Its now fully symetric between tick and overflow driven adjustments and uses less data to boot. The only complication is that it basically wants to do a u128 division. The code approximates that in a rather simple truncate until it fits fashion, taking care to balance the terms while truncating. This version does not generate that sampling artefact. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | Merge branch 'perf/scheduling' of ↵Ingo Molnar2010-01-18
| |\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
| | * | | | | | | | | | | | perf: Better order flexible and pinned schedulingFrederic Weisbecker2010-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a task gets scheduled in. We don't touch the cpu bound events so the priority order becomes: cpu pinned, cpu flexible, task pinned, task flexible. So schedule out cpu flexibles when a new task context gets in and correctly order the groups to schedule in: task pinned, cpu flexible, task flexible. Cpu pinned groups don't need to be touched at this time. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
| | * | | | | | | | | | | | perf: Don't schedule out/in pinned events on task tickFrederic Weisbecker2010-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't need to schedule in/out pinned events on task tick, now that pinned and flexible groups can be scheduled separately. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
| | * | | | | | | | | | | | perf: Allow pinned and flexible groups to be scheduled separatelyFrederic Weisbecker2010-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tune the scheduling helpers so that we can choose to schedule either pinned and/or flexible groups from a context. And while at it, refactor a bit the naming of these helpers to make these more consistent and flexible. There is no (intended) change in scheduling behaviour in this patch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
| | * | | | | | | | | | | | perf: Make __perf_event_sched_out staticFrederic Weisbecker2010-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __perf_event_sched_out doesn't need to be globally available, make it static. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
| * | | | | | | | | | | | | tracing/kprobe: Update kprobe tracing self test for new syntaxMasami Hiramatsu2010-01-17
| |/ / / / / / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update kprobe tracing self test for new syntax (it supports deleting individual probes, and drops $argN support) and behavior change (new probes are disabled in default). This selftest includes the following checks: - Adding function-entry probe and return probe with arguments. - Enabling these probes. - Deleting it individually. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100114051211.7814.29436.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | perf: Export software-only event group characteristic as a flagFrederic Weisbecker2010-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before scheduling an event group, we first check if a group can go on. We first check if the group is made of software only events first, in which case it is enough to know if the group can be scheduled in. For that purpose, we iterate through the whole group, which is wasteful as we could do this check when we add/delete an event to a group. So we create a group_flags field in perf event that can host characteristics from a group of events, starting with a first PERF_GROUP_SOFTWARE flag that reduces the check on the fast path. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
| * | | | | | | | | | | | perf: Round robin flexible groups of events using list_rotate_left()Frederic Weisbecker2010-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is more proper that doing it through a list_for_each_entry() that breaks after the first entry. v2: Don't rotate pinned groups as its not needed to time share them. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
| * | | | | | | | | | | | perf/core: Split context's event group list into pinned and non-pinned listsFrederic Weisbecker2010-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split-up struct perf_event_context::group_list into pinned_groups and flexible_groups (non-pinned). This first appears to be useless as it duplicates various loops around the group list handlings. But it scales better in the fast-path in perf_sched_in(). We don't anymore iterate twice through the entire list to separate pinned and non-pinned scheduling. Instead we interate through two distinct lists. The another desired effect is that it makes easier to define distinct scheduling rules on both. Changes in v2: - Respectively rename pinned_grp_list and volatile_grp_list into pinned_groups and flexible_groups as per Ingo suggestion. - Various cleanups Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
| * | | | | | | | | | | | sched/perf: Make sure irqs are disabled for perf_event_task_sched_in()Jamie Iles2010-01-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf_event_task_sched_in() expects interrupts to be disabled, but on architectures with __ARCH_WANT_INTERRUPTS_ON_CTXSW defined, this isn't true. If this is defined, disable irqs around the call in finish_task_switch(). Signed-off-by: Jamie Iles <jamie.iles@picochip.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> LKML-Reference: <1262964453-27370-1-git-send-email-jamie.iles@picochip.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | tracing/kprobe: Drop function argument access syntaxMasami Hiramatsu2010-01-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop function argument access syntax, because the function arguments depend on not only architecture but also compile-options and function API. And now, we have perf-probe for finding register/memory assigned to each argument. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Neuling <mikey@neuling.org> Cc: linuxppc-dev@ozlabs.org LKML-Reference: <20100105224648.19431.52309.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | Merge branch 'perf/urgent' into perf/coreIngo Molnar2010-01-13
| |\ \ \ \ \ \ \ \ \ \ \ \ | | | |_|_|_|_|_|_|_|/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: queue up dependent patch, update to -rc4 Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | perf events: Remove CONFIG_EVENT_PROFILELi Zefan2009-12-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quoted from Ingo: | This reminds me - i think we should eliminate CONFIG_EVENT_PROFILE - | it's an unnecessary Kconfig complication. If both PERF_EVENTS and | EVENT_TRACING is enabled we should expose generic tracepoints. | | Nor is it limited to event 'profiling', so it has become a misnomer as | well. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <4B2F1557.2050705@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | perf events: Remove arg from perf sched hooksPeter Zijlstra2009-12-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we only ever schedule the local cpu, there is no need to pass the cpu number to the perf sched hooks. This micro-optimizes things a bit. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | | | | | | | Merge branch 'tracing-core-for-linus' of ↵Linus Torvalds2010-02-28
|\ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits) ftrace: Add function names to dangling } in function graph tracer tracing: Simplify memory recycle of trace_define_field tracing: Remove unnecessary variable in print_graph_return tracing: Fix typo of info text in trace_kprobe.c tracing: Fix typo in prof_sysexit_enable() tracing: Remove CONFIG_TRACE_POWER from kernel config tracing: Fix ftrace_event_call alignment for use with gcc 4.5 ftrace: Remove memory barriers from NMI code when not needed tracing/kprobes: Add short documentation for HAVE_REGS_AND_STACK_ACCESS_API s390: Add pt_regs register and stack access API tracing/kprobes: Make Kconfig dependencies generic tracing: Unify arch_syscall_addr() implementations tracing: Add notrace to TRACE_EVENT implementation functions ftrace: Allow to remove a single function from function graph filter tracing: Add correct/incorrect to sort keys for branch annotation output tracing: Simplify test for function_graph tracing start point tracing: Drop the tr check from the graph tracing path tracing: Add stack dump to trace_printk if stacktrace option is set tracing: Use appropriate perl constructs in recordmcount.pl tracing: optimize recordmcount.pl for offsets-handling ...
| * \ \ \ \ \ \ \ \ \ \ \ \ Merge branch 'tip/tracing/core' of ↵Ingo Molnar2010-02-27
| |\ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/core
| | * | | | | | | | | | | | | ftrace: Add function names to dangling } in function graph tracerSteven Rostedt2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function graph tracer is currently the most invasive tracer in the ftrace family. It can easily overflow the buffer even with 10megs per CPU. This means that events can often be lost. On start up, or after events are lost, if the function return is recorded but the function enter was lost, all we get to see is the exiting '}'. Here is how a typical trace output starts: [tracing] cat trace # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 0) + 91.897 us | } 0) ! 567.961 us | } 0) <========== | 0) ! 579.083 us | _raw_spin_lock_irqsave(); 0) 4.694 us | _raw_spin_unlock_irqrestore(); 0) ! 594.862 us | } 0) ! 603.361 us | } 0) ! 613.574 us | } 0) ! 623.554 us | } 0) 3.653 us | fget_light(); 0) | sock_poll() { There are a series of '}' with no matching "func() {". There's no information to what functions these ending brackets belong to. This patch adds a stack on the per cpu structure used in outputting the function graph tracer to keep track of what function was outputted. Then on a function exit event, it checks the depth to see if the function exit has a matching entry event. If it does, then it only prints the '}', otherwise it adds the function name after the '}'. This allows function exit events to show what function they belong to at trace output startup, when the entry was lost due to ring buffer overflow, or even after a new task is scheduled in. Here is what the above trace will look like after this patch: [tracing] cat trace # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 0) + 91.897 us | } (irq_exit) 0) ! 567.961 us | } (smp_apic_timer_interrupt) 0) <========== | 0) ! 579.083 us | _raw_spin_lock_irqsave(); 0) 4.694 us | _raw_spin_unlock_irqrestore(); 0) ! 594.862 us | } (add_wait_queue) 0) ! 603.361 us | } (__pollwait) 0) ! 613.574 us | } (tcp_poll) 0) ! 623.554 us | } (sock_poll) 0) 3.653 us | fget_light(); 0) | sock_poll() { Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | | | | | | | | | Merge branch 'tracing/core' of ↵Ingo Molnar2010-02-27
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ | | |/ / / / / / / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into tracing/core
| | * | | | | | | | | | | | | tracing/kprobes: Make Kconfig dependencies genericHeiko Carstens2010-02-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KPROBES_EVENT actually depends on the regs and stack access API (b1cf540f) and not on x86. So introduce a new config option which architectures can select if they have the API implemented and switch x86. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> LKML-Reference: <20100210162517.GB6933@osiris.boeblingen.de.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| | * | | | | | | | | | | | | tracing: Unify arch_syscall_addr() implementationsMike Frysinger2010-02-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most implementations of arch_syscall_addr() are the same, so create a default version in common code and move the one piece that differs (the syscall table) to asm/syscall.h. New arch ports don't have to waste time copying & pasting this simple function. The s390/sparc versions need to be different, so document why. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1264498803-17278-1-git-send-email-vapier@gentoo.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| * | | | | | | | | | | | | | Merge branch 'tip/tracing/core' of ↵Ingo Molnar2010-02-26
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/core
| | * | | | | | | | | | | | | | tracing: Simplify memory recycle of trace_define_fieldWenji Huang2010-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Discard freeing field->type since it is not necessary. Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Wenji Huang <wenji.huang@oracle.com> LKML-Reference: <1266997226-6833-5-git-send-email-wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Remove unnecessary variable in print_graph_returnWenji Huang2010-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "cpu" variable is declared at the start of the function and also within a branch, with the exact same initialization. Remove the local variable of the same name in the branch. Signed-off-by: Wenji Huang <wenji.huang@oracle.com> LKML-Reference: <1266997226-6833-3-git-send-email-wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Fix typo of info text in trace_kprobe.cWenji Huang2010-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Wenji Huang <wenji.huang@oracle.com> LKML-Reference: <1266997226-6833-2-git-send-email-wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Fix typo in prof_sysexit_enable()Wenji Huang2010-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Wenji Huang <wenji.huang@oracle.com> LKML-Reference: <1266997226-6833-1-git-send-email-wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Remove CONFIG_TRACE_POWER from kernel configLi Zefan2010-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The power tracer has been converted to power trace events. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4B84D50E.4070806@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Fix ftrace_event_call alignment for use with gcc 4.5Jeff Mahoney2010-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC 4.5 introduces behavior that forces the alignment of structures to use the largest possible value. The default value is 32 bytes, so if some structures are defined with a 4-byte alignment and others aren't declared with an alignment constraint at all - it will align at 32-bytes. For things like the ftrace events, this results in a non-standard array. When initializing the ftrace subsystem, we traverse the _ftrace_events section and call the initialization callback for each event. When the structures are misaligned, we could be treating another part of the structure (or the zeroed out space between them) as a function pointer. This patch forces the alignment for all the ftrace_event_call structures to 4 bytes. Without this patch, the kernel fails to boot very early when built with gcc 4.5. It's trivial to check the alignment of the members of the array, so it might be worthwhile to add something to the build system to do that automatically. Unfortunately, that only covers this case. I've asked one of the gcc developers about adding a warning when this condition is seen. Cc: stable@kernel.org Signed-off-by: Jeff Mahoney <jeffm@suse.com> LKML-Reference: <4B85770B.6010901@suse.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | | | | | | | | | | Merge commit 'v2.6.33' into tracing/coreIngo Molnar2010-02-26
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | |/ / / / / / / / / / / / / / | |/| | | | | | | | | | | | | / | | | |_|_|_|_|_|_|_|_|_|_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: scripts/recordmcount.pl Merge reason: Merge up to v2.6.33. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | ftrace: Allow to remove a single function from function graph filterLi Zefan2010-02-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I don't see why we can only clear all functions from the filter. After patching: # echo sys_open > set_graph_function # echo sys_close >> set_graph_function # cat set_graph_function sys_open sys_close # echo '!sys_close' >> set_graph_function # cat set_graph_function sys_open Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4B726388.2000408@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | | | | | | | | | tracing: Add correct/incorrect to sort keys for branch annotation outputSteven Rostedt2010-02-09
| | |/ / / / / / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The branch annotation is a bit difficult to see the worst offenders because it only sorts by percentage: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 163 100 qdisc_restart sch_generic.c 179 0 163 100 pfifo_fast_dequeue sch_generic.c 447 0 4 100 pskb_trim_rcsum skbuff.h 1689 0 4 100 llc_rcv llc_input.c 170 0 18 100 psmouse_interrupt psmouse-base.c 304 0 3 100 atkbd_interrupt atkbd.c 389 0 5 100 usb_alloc_dev usb.c 437 0 11 100 vsscanf vsprintf.c 1897 0 2 100 IS_ERR err.h 34 0 23 100 __rmqueue_fallback page_alloc.c 865 0 4 100 probe_wakeup_sched_switch trace_sched_wakeup.c 142 0 3 100 move_masked_irq migration.c 11 Adding the incorrect and correct values as sort keys makes this file a bit more informative: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 366541 100 audit_syscall_entry auditsc.c 1637 0 366538 100 audit_syscall_exit auditsc.c 1685 0 115839 100 sched_info_switch sched_stats.h 269 0 74567 100 sched_info_queued sched_stats.h 222 0 66578 100 sched_info_dequeued sched_stats.h 177 0 15113 100 trace_workqueue_insertion workqueue.h 38 0 15107 100 trace_workqueue_execution workqueue.h 45 0 3622 100 syscall_trace_leave ptrace.c 1772 0 2750 100 sched_move_task sched.c 10100 0 2750 100 sched_move_task sched.c 10110 0 1815 100 pre_schedule_rt sched_rt.c 1462 0 837 100 audit_alloc auditsc.c 879 0 814 100 tcp_mss_split_point tcp_output.c 1302 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>