aboutsummaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAge
* sched: Optimize cgroup vs wakeup a bitPeter Zijlstra2009-09-16
| | | | | | | | | | | | We don't need to call update_shares() for each domain we iterate, just got the largets one. However, we should call it before wake_affine() as well, so that that can use up-to-date values too. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Implement a gentler fair-sleepers featureIngo Molnar2009-09-16
| | | | | | | | | | | | | | | | | | Add back FAIR_SLEEPERS and GENTLE_FAIR_SLEEPERS. FAIR_SLEEPERS is the old logic: credit sleepers with their sleep time. GENTLE_FAIR_SLEEPERS dampens this a bit: 50% of their sleep time gets credited. The hope here is to still give the benefits of fair-sleepers logic (quick wakeups, etc.) while not allow them to have 100% of their sleep time as if they were running. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Add SD_PREFER_LOCALPeter Zijlstra2009-09-16
| | | | | | | | | | | | And turn it on for NUMA and MC domains. This improves locality in balancing decisions by keeping up to capacity amount of tasks local before looking for idle CPUs. (and twice the capacity if SD_POWERSAVINGS_BALANCE is set.) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Add a few SYNC hint knobs to play withPeter Zijlstra2009-09-15
| | | | | | | | | | | | | | | | Currently we use overlap to weaken the SYNC hint, but allow it to set the hint as well. echo NO_SYNC_WAKEUP > /debug/sched_features echo SYNC_MORE > /debug/sched_features preserves pipe-test behaviour without using the WF_SYNC hint. Worth playing with on more workloads... Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Fix sync wakeups againPeter Zijlstra2009-09-15
| | | | | | | | | | | | | The sync argument rename to introduce WF_* broke stuff by missing a local alias for an argument in __wake_up_common, fix it by using the more descriptive wake_flags name. This restores WF_SYNC propagation, which fixes wake_affine() behaviour, which fixes pipe-test. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Add WF_FORKPeter Zijlstra2009-09-15
| | | | | | | | | | | | Avoid the cache buddies from biasing the time distribution away from fork()ers. Normally the next buddy will be the preferred scheduling target, but this makes fork()s prefer to run the new child, whereas we prefer to run the parent, since that will generate more work. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Rename sync argumentsPeter Zijlstra2009-09-15
| | | | | | | | | | In order to extend the functions to have more than 1 flag (sync), rename the argument to flags, and explicitly define a WF_ space for individual flags. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Rename select_task_rq() argumentPeter Zijlstra2009-09-15
| | | | | | | | | In order to be able to rename the sync argument, we need to rename the current flag argument. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Feature to disable APERF/MPERF cpu_powerPeter Zijlstra2009-09-15
| | | | | | | | | | | | | I suspect a feed-back loop between cpuidle and the aperf/mperf cpu_power bits, where when we have idle C-states lower the ratio, which leads to lower cpu_power and then less load, which generates more idle time, etc.. Put in a knob to disable it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Provide arch_scale_freq_powerPeter Zijlstra2009-09-15
| | | | | | | | | | Provide an ach specific hook for cpufreq based scaling of cpu_power. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> [ego@in.ibm.com: spotting bugs] LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Improve latencies and throughputMike Galbraith2009-09-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the idle balancer more agressive, to improve a x264 encoding workload provided by Jason Garrett-Glaser: NEXT_BUDDY NO_LB_BIAS encoded 600 frames, 252.82 fps, 22096.60 kb/s encoded 600 frames, 250.69 fps, 22096.60 kb/s encoded 600 frames, 245.76 fps, 22096.60 kb/s NO_NEXT_BUDDY LB_BIAS encoded 600 frames, 344.44 fps, 22096.60 kb/s encoded 600 frames, 346.66 fps, 22096.60 kb/s encoded 600 frames, 352.59 fps, 22096.60 kb/s NO_NEXT_BUDDY NO_LB_BIAS encoded 600 frames, 425.75 fps, 22096.60 kb/s encoded 600 frames, 425.45 fps, 22096.60 kb/s encoded 600 frames, 422.49 fps, 22096.60 kb/s Peter pointed out that this is better done via newidle_idx, not via LB_BIAS, newidle balancing should look for where there is load _now_, not where there was load 2 ticks ago. Worst-case latencies are improved as well as no buddies means less vruntime spread. (as per prior lkml discussions) This change improves kbuild-peak parallelism as well. Reported-by: Jason Garrett-Glaser <darkshikari@gmail.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1253011667.9128.16.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Tweak wake_idxPeter Zijlstra2009-09-15
| | | | | | | | | | When merging select_task_rq_fair() and sched_balance_self() we lost the use of wake_idx, restore that and set them to 0 to make wake balancing more aggressive. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Fix task affinity for select_task_rq_fairPeter Zijlstra2009-09-15
| | | | | | | | | While merging select_task_rq_fair() and sched_balance_self() I made a mistake that leads to testing the wrong task affinty. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: for_each_domain() vs RCUPeter Zijlstra2009-09-15
| | | | | | | | | | | | for_each_domain() uses RCU to serialize the sched_domains, except it doesn't actually use rcu_read_lock() and instead relies on disabling preemption -> FAIL. XXX: audit other sched_domain code. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Weaken SD_POWERSAVINGS_BALANCEPeter Zijlstra2009-09-15
| | | | | | | | | | | | One of the problems of power-saving balancing is that under certain scenarios it is too slow and allows tons of real work to pile up. Avoid this by ignoring the powersave stuff when there's real work to be done. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Merge select_task_rq_fair() and sched_balance_self()Peter Zijlstra2009-09-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem with wake_idle() is that is doesn't respect things like cpu_power, which means it doesn't deal well with SMT nor the recent RT interaction. To cure this, it needs to do what sched_balance_self() does, which leads to the possibility of merging select_task_rq_fair() and sched_balance_self(). Modify sched_balance_self() to: - update_shares() when walking up the domain tree, (it only called it for the top domain, but it should have done this anyway), which allows us to remove this ugly bit from try_to_wake_up(). - do wake_affine() on the smallest domain that contains both this (the waking) and the prev (the wakee) cpu for WAKE invocations. Then use the top-down balance steps it had to replace wake_idle(). This leads to the dissapearance of SD_WAKE_BALANCE and SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE. SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective. Touch all topology bits to replace the old with new SD flags -- platforms might need re-tuning, enabling SD_BALANCE_WAKE conditionally on a NUMA distance seems like a good additional feature, magny-core and small nehalem systems would want this enabled, systems with slow interconnects would not. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Add TASK_WAKINGPeter Zijlstra2009-09-15
| | | | | | | | | | | | | | | | | We're going to want to drop rq->lock in try_to_wake_up() for a longer period of time, however we also want to deal with concurrent waking of the same task, which is currently handled by holding rq->lock. So introduce a new TASK state, namely TASK_WAKING, which indicates someone is already waking the task (other wakers will fail p->state & state). We also keep preemption disabled over the whole ttwu(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Hook sched_balance_self() into sched_class::select_task_rq()Peter Zijlstra2009-09-15
| | | | | | | | | Rather ugly patch to fully place the sched_balance_self() code inside the fair class. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Move sched_balance_self() into sched_fair.cPeter Zijlstra2009-09-15
| | | | | | | | | | | Move the sched_balance_self() code into sched_fair.c This facilitates the merger of sched_balance_self() and sched_fair::select_task_rq(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Move code aroundPeter Zijlstra2009-09-15
| | | | | | | | | In preparation to other code movement, move weighted_cpuload(), source_load() and target_load() before the class includes. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Add come comments to the sched featuresPeter Zijlstra2009-09-15
| | | | | | | | Add text... Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Complete buddy switchesMike Galbraith2009-09-15
| | | | | | | | | Add a NEXT_BUDDY feature flag to aid in debugging. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Split WAKEUP_OVERLAPPeter Zijlstra2009-09-15
| | | | | | | | | It consists of two conditions, split them out in separate toggles so we can test them independently. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Fix double_rq_lock() compile warningPeter Zijlstra2009-09-15
| | | | | | Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'tracing-core-for-linus' of ↵Linus Torvalds2009-09-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (105 commits) ring-buffer: only enable ring_buffer_swap_cpu when needed ring-buffer: check for swapped buffers in start of committing tracing: report error in trace if we fail to swap latency buffer tracing: add trace_array_printk for internal tracers to use tracing: pass around ring buffer instead of tracer tracing: make tracing_reset safe for external use tracing: use timestamp to determine start of latency traces tracing: Remove mentioning of legacy latency_trace file from documentation tracing/filters: Defer pred allocation, fix memory leak tracing: remove users of tracing_reset tracing: disable buffers and synchronize_sched before resetting tracing: disable update max tracer while reading trace tracing: print out start and stop in latency traces ring-buffer: disable all cpu buffers when one finds a problem ring-buffer: do not count discarded events ring-buffer: remove ring_buffer_event_discard ring-buffer: fix ring_buffer_read crossing pages ring-buffer: remove unnecessary cpu_relax ring-buffer: do not swap buffers during a commit ring-buffer: do not reset while in a commit ...
| * Merge branch 'tracing/core' of ↵Ingo Molnar2009-09-06
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into tracing/core
| | * tracing/filters: Defer pred allocation, fix memory leakLi Zefan2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The predicates of an event and their filter structure are allocated when we create an event filter for the first time. These objects must be created once but each time we come with a new filter, we overwrite such pre-existing allocation, if any. Thus, this patch checks if the filter has already been allocated before going ahead. Spotted-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <4A9CB1BA.3060402@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| * | Merge commit 'v2.6.31-rc9' into tracing/coreIngo Molnar2009-09-06
| |\ \ | | | | | | | | | | | | | | | | | | | | Merge reason: move from -rc5 to -rc9. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | ring-buffer: only enable ring_buffer_swap_cpu when neededSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the ability to swap the cpu buffers adds a small overhead to the recording of a trace, we only want to add it when needed. Only the irqsoff and preemptoff tracers use this feature, and both are not recommended for production kernels. This patch disables its use when neither irqsoff nor preemptoff is configured. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: check for swapped buffers in start of committingSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because the irqsoff tracer can swap an internal CPU buffer, it is possible that a swap happens between the start of the write and before the committing bit is set (the committing bit will disable swapping). This patch adds a check for this and will fail the write if it detects it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: report error in trace if we fail to swap latency bufferSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The irqsoff tracer will fail to swap the cpu buffer with the max buffer if it preempts a commit. Instead of ignoring this, this patch makes the tracer report it if the last max latency failed due to preempting a current commit. The output of the latency tracer will look like this: # tracer: irqsoff # # irqsoff latency trace v1.1.5 on 2.6.31-rc5 # -------------------------------------------------------------------- # latency: 112 us, #1/1, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) # ----------------- # | task: -4281 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # => started at: save_args # => ended at: __do_softirq # # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / # ||||| delay # cmd pid ||||| time | caller # \ / ||||| \ | / bash-4281 1d.s6 265us : update_max_tr_single: Failed to swap buffers due to commit in progress Note the latency time and the functions that disabled the irqs or preemption will still be listed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: add trace_array_printk for internal tracers to useSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a trace_array_printk to allow a tracer to use the trace_printk on its own trace array. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: pass around ring buffer instead of tracerSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The latency tracers (irqsoff and wakeup) can swap trace buffers on the fly. If an event is happening and has reserved data on one of the buffers, and the latency tracer swaps the global buffer with the max buffer, the result is that the event may commit the data to the wrong buffer. This patch changes the API to the trace recording to be recieve the buffer that was used to reserve a commit. Then this buffer can be passed in to the commit. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: make tracing_reset safe for external useSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reseting the trace buffer without first disabling the buffer and waiting for any writers to complete, can corrupt the ring buffer. This patch makes the external version of tracing_reset safe from corruption by disabling the ring buffer and calling synchronize_sched. This version can no longer be called from interrupt context. But all those callers have been removed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: use timestamp to determine start of latency tracesSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the latency tracers reset the ring buffer. Unfortunately if a commit is in process (due to a trace event), this can corrupt the ring buffer. When this happens, the ring buffer will detect the corruption and then permanently disable the ring buffer. The bug does not crash the system, but it does prevent further tracing after the bug is hit. Instead of reseting the trace buffers, the timestamp of the start of the trace is used instead. The buffers will still contain the previous data, but the output will not count any data that is before the timestamp of the trace. Note, this only affects the static trace output (trace) and not the runtime trace output (trace_pipe). The runtime trace output does not make sense for the latency tracers anyway. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: remove users of tracing_resetSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function tracing_reset is deprecated for outside use of trace.c. The new function to reset the the buffers is tracing_reset_online_cpus. The reason for this is that resetting the buffers while the event trace points are active can corrupt the buffers, because they may be writing at the time of reset. The tracing_reset_online_cpus disables writes and waits for current writers to finish. This patch replaces all users of tracing_reset except for the latency tracers. Those changes require more work and will be removed in the following patches. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: disable buffers and synchronize_sched before resettingSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resetting the ring buffers while traces are happening can corrupt the ring buffer and disable it (no kernel crash to worry about). The safest thing to do is disable the ring buffers, call synchronize_sched() to wait for all current writers to finish and then reset the buffer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: disable update max tracer while reading traceSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading the tracer from the trace file, updating the max latency may corrupt the output. This patch disables the tracing of the max latency while reading the trace file. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: print out start and stop in latency tracesSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During development of the tracer, we would copy information from the live tracer to the max tracer with one memcpy. Since then we added a generic ring buffer and we handle the copies differently now. Unfortunately, we never copied the critical section information, and we lost the output: # => started at: kmem_cache_alloc # => ended at: kmem_cache_alloc This patch adds back the critical start and end copying as well as removes the unused "trace_idx" and "overrun" fields of the trace_array_cpu structure. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: disable all cpu buffers when one finds a problemSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the way RB_WARN_ON works, is to disable either the current CPU buffer or all CPU buffers, depending on whether a ring_buffer or ring_buffer_per_cpu struct was passed into the macro. Most users of the RB_WARN_ON pass in the CPU buffer, so only the one CPU buffer gets disabled but the rest are still active. This may confuse users even though a warning is sent to the console. This patch changes the macro to disable the entire buffer even if the CPU buffer is passed in. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: do not count discarded eventsSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The latency tracers report the number of items in the trace buffer. This uses the ring buffer data to calculate this. Because discarded events are also counted, the numbers do not match the number of items that are printed. The ring buffer also adds a "padding" item to the end of each buffer page which also gets counted as a discarded item. This patch decrements the counter to the page entries on a discard. This allows us to ignore discarded entries while reading the buffer. Decrementing the counter is still safe since it can only happen while the committing flag is still set. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: remove ring_buffer_event_discardSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function ring_buffer_event_discard can be used on any item in the ring buffer, even after the item was committed. This function provides no safety nets and is very race prone. An item may be safely removed from the ring buffer before it is committed with the ring_buffer_discard_commit. Since there are currently no users of this function, and because this function is racey and error prone, this patch removes it altogether. Note, removing this function also allows the counters to ignore all discarded events (patches will follow). Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: fix ring_buffer_read crossing pagesSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the ring buffer uses an iterator (static read mode, not on the fly reading), when it crosses a page boundery, it will skip the first entry on the next page. The reason is that the last entry of a page is usually padding if the page is not full. The padding will not be returned to the user. The problem arises on ring_buffer_read because it also increments the iterator. Because both the read and peek use the same rb_iter_peek, the rb_iter_peak will return the padding but also increment to the next item. This is because the ring_buffer_peek will not incerment it itself. The ring_buffer_read will increment it again and then call rb_iter_peek again to get the next item. But that will be the second item, not the first one on the page. The reason this never showed up before, is because the ftrace utility always calls ring_buffer_peek first and only uses ring_buffer_read to increment to the next item. The ring_buffer_peek will always keep the pointer to a valid item and not padding. This just hid the bug. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: remove unnecessary cpu_relaxSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The loops in the ring buffer that use cpu_relax are not dependent on other CPUs. They simply came across some padding in the ring buffer and are skipping over them. It is a normal loop and does not require a cpu_relax. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: do not swap buffers during a commitSteven Rostedt2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If a commit is taking place on a CPU ring buffer, do not allow it to be swapped. Return -EBUSY when this is detected instead. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | ring-buffer: do not reset while in a commitSteven Rostedt2009-09-04
| | |/ | |/| | | | | | | | | | | | | | | | The callers of reset must ensure that no commit can be taking place at the time of the reset. If it does then we may corrupt the ring buffer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | tracing/filters: Defer pred allocationLi Zefan2009-08-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | init_preds() allocates about 5392 bytes of memory (on x86_32) for a TRACE_EVENT. With my config, at system boot total memory occupied is: 5392 * (642 + 15) == 3459KB 642 == cat available_events | wc -l 15 == number of dirs in events/ftrace That's quite a lot, so we'd better defer memory allocation util it's needed, that's when filter is used. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <4A9B8EA5.6020700@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | Merge branch 'tip/tracing/core' of ↵Ingo Molnar2009-08-29
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/core
| | * | tracing: only show tracing_max_latency when latency tracer configuredSteven Rostedt2009-08-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tracing_max_latency file should only be present when one of the latency tracers ({preempt|irqs}off, wakeup*) are enabled. This patch also removes tracing_thresh when latency tracers are not enabled, as well as compiles out code that is only used for latency tracers. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | tracing: remove legacy select of MARKERS by context switch tracingSteven Rostedt2009-08-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The context switch tracer was made before tracepoints were mature, and the original version used markers. This is no longer true and this patch removes the select. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>