litmus-rt.git - The LITMUS^RT kernel.

	Commit message	Author	Age
*	ftrace: Use manual free after synchronize_sched() not call_rcu_sched()	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The entries to the probe hash must be freed after a synchronize_sched() after the entry has been removed from the hash. As the entries are registered with ops that may have their own callbacks, and these callbacks may sleep, we can not use call_rcu_sched() because the rcu callbacks registered with that are called from a softirq context. Instead of using call_rcu_sched(), manually save the entries on a free_list and at the end of the loop that removes the entries, do a synchronize_sched() and then go through the free_list, freeing the entries. Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
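The deferred-free pattern described above can be sketched in user-space C. This is only an illustration, not the ftrace code: `struct probe_entry`, `add_probe()`, `remove_probes()` and `wait_for_readers()` are hypothetical names, and `wait_for_readers()` merely stands in for synchronize_sched(), which does not exist outside the kernel.

```c
#include <stdlib.h>

/* Hypothetical probe entry; the real ftrace structure differs. */
struct probe_entry {
	struct probe_entry *next;	/* hash-chain link, readers may still follow it */
	struct probe_entry *free_next;	/* separate link for the deferred free list */
	unsigned long ip;
};

static int grace_periods;	/* counts calls to the stand-in below */

/* Stand-in for synchronize_sched(); assumption for this sketch only. */
static void wait_for_readers(void)
{
	grace_periods++;
}

/* Push a new entry for ip onto the chain. Returns NULL on allocation failure. */
static struct probe_entry *add_probe(struct probe_entry **head, unsigned long ip)
{
	struct probe_entry *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->ip = ip;
	e->next = *head;
	*head = e;
	return e;
}

/*
 * Unlink every entry matching ip, wait once, then free them all.
 * Returns the number of entries freed.
 */
static int remove_probes(struct probe_entry **head, unsigned long ip)
{
	struct probe_entry *free_list = NULL;
	struct probe_entry **pp = head;
	int freed = 0;

	while (*pp) {
		struct probe_entry *e = *pp;

		if (e->ip == ip) {
			*pp = e->next;			/* remove from the chain */
			e->free_next = free_list;	/* defer the free */
			free_list = e;
		} else {
			pp = &e->next;
		}
	}

	wait_for_readers();	/* one wait covers every removed entry */

	while (free_list) {
		struct probe_entry *e = free_list;

		free_list = e->free_next;
		free(e);	/* process context here, so a callback could sleep */
		freed++;
	}
	return freed;
}
```

The single wait after the removal loop is the point of the design: the cost of waiting is paid once per batch rather than once per entry, and the frees run in a context that is allowed to sleep.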
*	ftrace: Clean up function probe methods	Steven Rostedt (Red Hat)	2013-03-15
	When a function probe is created, a "callback" method is called for each function that the probe is attached to. On release of the probe, each function entry calls the "free" method. First, "callback" is a confusing name and does not really match what it does. Callback sounds like it will be called when the probe triggers. But that's not the case. This is really an "init" function, so let's rename it as such. Secondly, both "init" and "free" do not pass enough information back to the handlers. Pass back the ops, ip and data for each time the method is called. We have the information, might as well use it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Fix comments for ftrace_event_file/call flags	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \|	Most of the flags for the struct ftrace_event_file were moved over to the flags of the struct ftrace_event_call, but the comments were never updated. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add snapshot trigger to function probes	Steven Rostedt (Red Hat)	2013-03-15
	    echo 'schedule:snapshot:1' > /debug/tracing/set_ftrace_filter
	This will cause the scheduler to trigger a snapshot the next time it's called (you can use any function that's not called by NMI). Even though it triggers only once, you still need to remove it with:
	    echo '!schedule:snapshot:0' > /debug/tracing/set_ftrace_filter
	The :1 can be left off for the first command:
	    echo 'schedule:snapshot' > /debug/tracing/set_ftrace_filter
	But this will cause all calls to schedule to trigger a snapshot. This must be removed without the ':0'
	    echo '!schedule:snapshot' > /debug/tracing/set_ftrace_filter
	As adding a "count" is a different operation (internally).
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add alloc/free_snapshot() to replace duplicate code	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \|	Add alloc_snapshot() and free_snapshot() to allocate and free the snapshot buffer respectively, and use these to remove duplicate code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	ftrace: Fix function probe to only enable needed functions	Steven Rostedt (Red Hat)	2013-03-15
	Currently the function probe enables all functions and runs a "hash" against every function call to see if it should call a probe. This is extremely wasteful. Note, a probe is something like:
	    echo schedule:traceoff > /debug/tracing/set_ftrace_filter
	When schedule is called, the probe will disable tracing. But currently, it has a call back for all functions, and checks to see if the called function is the probe that is needed. The probe function has been created before ftrace was rewritten to allow for more than one "op" to be registered by the function tracer. When probes were created, it couldn't limit the functions without also limiting normal function calls. But now we can, it's about time to update the probe code. Todo, have separate ops for different entries. That is, assign a ftrace_ops per probe, instead of one op for all probes. But as there are not many probes assigned, this may not be that urgent.
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	ftrace: Separate unlimited probes from count limited probes	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \|	The function tracing probes that trigger traceon or traceoff can be set to unlimited, or given a count of # of times to execute. By separating these two types of probes, we can then use the dynamic ftrace function filtering directly, and remove the brute force "check if this function called is my probe" routines in ftrace. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Consolidate ftrace_trace_onoff_unreg() into callback	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \|	The only thing ftrace_trace_onoff_unreg() does is to do a strcmp() against the cmd parameter to determine what op to unregister. But this compare is also done after the location that this function is called (and returns). By moving the check for '!' to unregister after the strcmp(), the callback function itself can just do the unregister and we can get rid of the helper function. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Consolidate updating of count for traceon/off	Steven Rostedt (Red Hat)	2013-03-15
	Remove some duplicate code and replace it with a helper function. This makes the code a bit cleaner. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Let tracing_snapshot() be used by modules but not NMI	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \|	Add EXPORT_SYMBOL_GPL() to let the tracing_snapshot() functions be called from modules. Also add a test to see if the snapshot was called from NMI context and just warn in the tracing buffer if so, and return. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add internal ftrace trace_puts() for ftrace to use	Steven Rostedt (Red Hat)	2013-03-15
	There are a few places where ftrace uses trace_printk() for internal use, but this requires context (normal, softirq, irq, NMI) buffers to keep things lockless. But trace_puts() does not, as it can write the string directly into the ring buffer. Make an internal helper for trace_puts() and have the internal functions use that. This way the extra context buffers are not used. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Optimize trace_printk() with one arg to use trace_puts()	Steven Rostedt (Red Hat)	2013-03-15
	Although trace_printk() is extremely fast, especially when it uses trace_bprintk() (writes args straight to buffer instead of inserting into string), it still has the overhead of calling one of the printf()/sprintf() family of functions, which need to scan the fmt string to determine what, if any, args it has. This is a waste of precious CPU cycles if the printk format has no args but a single constant string. It is better to use trace_puts() which does not have the overhead of the fmt scanning. But wouldn't it be nice if the developer didn't have to think about such things, and the compiler would just do it for them?
	    trace_printk("this string has no args\n");
	    [...]
	    trace_printk("this string does %p %d\n", foo, bar);
	As tracing is critical to have the least amount of overhead, especially when dealing with race conditions, and you want to eliminate any "Heisenbugs", you want the trace_printk() to use the fastest possible means of tracing. Currently the macro magic determines if it will use trace_bprintk() or if the fmt is a dynamic string (a variable), it will fall back to the slow trace_printk() method that does a full snprintf() before copying it into the buffer, whereas trace_bprintk() only copies the pointer to the fmt and the args into the buffer. Well, now there's a way to spend some more Hogwarts cash and come up with new fancy macro magic.
	    #define trace_printk(fmt, ...) \
	    do { \
	        char _______STR[] = __stringify((__VA_ARGS__)); \
	        if (sizeof(_______STR) > 3) \
	            do_trace_printk(fmt, ##__VA_ARGS__); \
	        else \
	            trace_puts(fmt); \
	    } while (0)
	The above needs a bit of explaining (both here and in the comments). By stringifying the __VA_ARGS__, we can, at compile time, determine the number of args that are being passed to trace_printk(). The extra parentheses are required, otherwise the compiler complains about too many parameters for __stringify if there is more than one arg. When there are no args, the __stringify((__VA_ARGS__)) converts into "()\0", a string of 3 characters. Anything else will be a string containing more than 3 characters. Now we assign that string to a dynamic char array, and then take the sizeof() of that array. If it is greater than 3 characters, we know trace_printk() has args and we need to do the full "do_trace_printk()" on them, otherwise it was only passed a single arg and we can optimize to use trace_puts().
	Cc: Thomas Gleixner <tglx@linutronix.de>
	Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
	Cc: Frederic Weisbecker <fweisbec@gmail.com>
	Signed-off-by: Steven "The King of Nasty Macros!" Rostedt <rostedt@goodmis.org>
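The stringify-and-sizeof trick can be tried outside the kernel. The following is a user-space sketch, not the kernel macro: `demo_printk()`, `STRINGIFY()` and the two counters are hypothetical names introduced here, printf() and fputs() stand in for do_trace_printk() and trace_puts(), and the `, ##__VA_ARGS__` form assumes a compiler with the GNU extension (GCC or Clang in their default modes).

```c
#include <stdio.h>

/* Two-level stringify so the argument is macro-expanded before '#' applies. */
#define STRINGIFY_1(...)	#__VA_ARGS__
#define STRINGIFY(...)		STRINGIFY_1(__VA_ARGS__)

static int puts_path;	/* times the no-argument path was taken */
static int printf_path;	/* times the formatted path was taken */

/*
 * With no variadic args, STRINGIFY((__VA_ARGS__)) is "()", whose sizeof
 * is 3 (two characters plus the terminating NUL). Any argument makes the
 * string longer, so the branch is decided by a compile-time constant.
 */
#define demo_printk(fmt, ...)					\
do {								\
	char _str[] = STRINGIFY((__VA_ARGS__));			\
	if (sizeof(_str) > 3) {					\
		printf_path++;					\
		printf(fmt, ##__VA_ARGS__);			\
	} else {						\
		puts_path++;					\
		fputs(fmt, stdout);				\
	}							\
} while (0)
```

Calling `demo_printk("no args\n")` increments `puts_path`, while `demo_printk("%d %d\n", foo, bar)` increments `printf_path`. Because the condition is constant, an optimizing compiler can drop the branch that is not taken.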
*	tracing: Add trace_puts() for even faster trace_printk() tracing	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The trace_printk() is extremely fast and is very handy as it can be used in any context (including NMIs!). But it still requires scanning the fmt string for parsing the args. Even the trace_bprintk() requires a scan to know what args will be saved, although it doesn't copy the format string itself. Several times trace_printk() has no args, and wastes cpu cycles scanning the fmt string. Adding trace_puts() allows the developer to use an even faster tracing method that only saves the pointer to the string in the ring buffer without doing any format parsing at all. This will help remove even more of the "Heisenbug" effect, when debugging. Also fixed up the F_printk()s for the ftrace internal bprint and print events. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Fix the branch tracer that broke with buffer change	Steven Rostedt (Red Hat)	2013-03-15
	The change to add the trace_buffer struct to have the trace array have both the main buffer and max buffer broke the branch tracer because the change did not update that code. As the branch tracer adds a significant amount of overhead, and must be selected via a selection (not an allyesconfig) it was missed in testing. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add alloc_snapshot kernel command line parameter	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If debugging the kernel, and the developer wants to use tracing_snapshot() in places where tracing_snapshot_alloc() may be difficult (or more likely, the developer is lazy and doesn't want to bother with tracing_snapshot_alloc() at all), then adding alloc_snapshot to the kernel command line parameter will tell ftrace to allocate the snapshot buffer (if configured) when it allocates the main tracing buffer. I also noticed that ring_buffer_expanded and tracing_selftest_disabled had inconsistent use of boolean "true" and "false" with "0" and "1". I cleaned that up too. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Move the tracing selftest code into its own function	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \|	Move the tracing startup selftest code into its own function and when not enabled, always have that function succeed. This makes the register_tracer() function much more readable. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	ring-buffer: Do not use schedule_work_on() for current CPU	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ring buffer updates when done while the ring buffer is active, needs to be completed on the CPU that is used for the ring buffer per_cpu buffer. To accomplish this, schedule_work_on() is used to schedule work on the given CPU. Now there's no reason to use schedule_work_on() if the process doing the update happens to be on the CPU that it is processing. It has already filled the requirement. Instead, just do the work and continue. This is needed for tracing_snapshot_alloc() where it may be called really early in boot, where the work queues have not been set up yet. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add internal tracing_snapshot() functions	Steven Rostedt (Red Hat)	2013-03-15
	The new snapshot feature is quite handy. It's a way for the user to take advantage of the spare buffer that, until then, only the latency tracers used to "snapshot" the buffer when it hit a max latency. Now users can trigger a "snapshot" manually when some condition is hit in a program. But a snapshot currently can not be triggered by a condition inside the kernel. With the addition of tracing_snapshot() and tracing_snapshot_alloc(), snapshots can now be taken when a condition is hit, and the developer wants to snapshot the case without stopping the trace. Note, any snapshot will overwrite the old one, so take care in how this is done. These new functions are to be used like tracing_on(), tracing_off() and trace_printk() are. That is, they should never be called in the mainline Linux kernel. They are solely for the purpose of debugging. The tracing_snapshot() will not allocate a buffer, but it is safe to be called from any context (except NMIs). But if a snapshot buffer isn't allocated when it is called, it will write to the live buffer, complaining about the lack of a snapshot buffer, and then stop tracing (giving you the "permanent snapshot"). tracing_snapshot_alloc() will allocate the snapshot buffer if it was not already allocated and then take the snapshot. This routine may sleep, and must be called from context that can sleep. The allocation is done with GFP_KERNEL and not atomic. If you need a snapshot in an atomic context, say in early boot, then it is best to call the tracing_snapshot_alloc() before then, where it will allocate the buffer, and then you can use the tracing_snapshot() anywhere you want and still get snapshots. Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Prevent deleting instances when they are being read	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \|	Add a ref count to the trace_array structure and prevent removal of instances that have open descriptors. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
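A reference count guarding removal can be sketched as follows. This is a user-space illustration with hypothetical names (`struct demo_instance`, `demo_open()`, `demo_release()`, `demo_remove()`); returning -EBUSY is an assumption of the sketch, and the locking that would be needed around the count in the kernel is omitted.

```c
#include <errno.h>

/* Hypothetical stand-in for a tracing instance. */
struct demo_instance {
	int ref;	/* number of open descriptors */
	int removed;	/* set once the instance has been torn down */
};

/* Opening a file in the instance takes a reference. */
static void demo_open(struct demo_instance *inst)
{
	inst->ref++;
}

/* Closing the file drops it again. */
static void demo_release(struct demo_instance *inst)
{
	inst->ref--;
}

/* Removal is refused while any descriptor is still open. */
static int demo_remove(struct demo_instance *inst)
{
	if (inst->ref > 0)
		return -EBUSY;	/* assumed error code for this sketch */
	inst->removed = 1;
	return 0;
}
```

A reader that holds the instance open therefore makes `demo_remove()` fail until it calls `demo_release()`, after which removal succeeds.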
*	tracing: Add per_cpu directory into tracing instances	Steven Rostedt (Red Hat)	2013-03-15
	Add the per_cpu directory to the created tracing instances:
	    cd /sys/kernel/debug/tracing/instances
	    mkdir foo
	    ls foo/per_cpu/cpu0
	    buffer_size_kb  snapshot_raw  trace  trace_pipe_raw  snapshot  stats  trace_pipe
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add snapshot feature to instances	Steven Rostedt (Red Hat)	2013-03-15
	Add the "snapshot" file to the multi-buffer instances.
	    cd /sys/kernel/debug/tracing/instances
	    mkdir foo
	    ls foo
	    buffer_size_kb  buffer_total_size_kb  events  free_buffer  set_event  snapshot  trace  trace_clock  trace_marker  trace_options  trace_pipe  tracing_on
	    cat foo/snapshot
	    # tracer: nop
	    #
	    #
	    # * Snapshot is freed *
	    #
	    # Snapshot commands:
	    # echo 0 > snapshot : Clears and frees snapshot buffer
	    # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
	    #                      Takes a snapshot of the main buffer.
	    # echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
	    #                      (Doesn't have to be '2' works with any number that
	    #                       is not a '0' or '1')
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Consolidate buffer allocation code	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \|	There's a bit of duplicate code in creating the trace buffers for the normal trace buffer and the max trace buffer among the instances and the main global_trace. This code can be consolidated and cleaned up a bit making the code cleaner and more readable as well as less duplication. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Have trace_array keep track if snapshot buffer is allocated	Steven Rostedt (Red Hat)	2013-03-15
	The snapshot buffer belongs to the trace array not the tracer that is running. The trace array should be the data structure that keeps track of whether or not the snapshot buffer is allocated, not the tracer descriptor. Having the trace array keep track of it makes modifications so much easier. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add snapshot_raw to extract the raw data from snapshot	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \|	Add a 'snapshot_raw' per_cpu file that allows tools to read the raw binary data of the snapshot buffer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add config option to allow snapshot to swap per cpu	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the preempt or irq latency tracers are enabled, they require the ring buffer to be able to swap the per cpu sub buffers between two main buffers. This adds a slight overhead to tracing as the trace recording needs to perform some checks to synchronize between recording and swaps that might be happening on other CPUs. The config RING_BUFFER_ALLOW_SWAP is set when a user of the ring buffer needs the "swap cpu" feature, otherwise the extra checks are not implemented and removed from the tracing overhead. The snapshot feature will swap per CPU if the RING_BUFFER_ALLOW_SWAP config is set. But that only gets set by things like OPROFILE and the irqs and preempt latency tracers. This config is added to let the user decide to include this feature with the snapshot agnostic from whether or not another user of the ring buffer sets this config. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add snapshot in the per_cpu trace directories	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add the snapshot file into the per_cpu tracing directories to allow them to be read for an individual cpu. This also allows to clear an individual cpu from the snapshot buffer. If the kernel allows it (CONFIG_RING_BUFFER_ALLOW_SWAP is set), then echoing in '1' into one of the per_cpu snapshot files will do an individual cpu buffer swap instead of the entire file. Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Consolidate max_tr into main trace_array structure	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, the way the latency tracers and snapshot feature works is to have a separate trace_array called "max_tr" that holds the snapshot buffer. For latency tracers, this snapshot buffer is used to swap the running buffer with this buffer to save the current max latency. The only items needed for the max_tr is really just a copy of the buffer itself, the per_cpu data pointers, the time_start timestamp that states when the max latency was triggered, and the cpu that the max latency was triggered on. All other fields in trace_array are unused by the max_tr, making the max_tr mostly bloat. This change removes the max_tr completely, and adds a new structure called trace_buffer, that holds the buffer pointer, the per_cpu data pointers, the time_start timestamp, and the cpu where the latency occurred. The trace_array, now has two trace_buffers, one for the normal trace and one for the max trace or snapshot. By doing this, not only do we remove the bloat from the max_trace but the instances of traces can now use their own snapshot feature and not have just the top level global_trace have the snapshot feature and latency tracers for itself. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Enable snapshot when any latency tracer is enabled	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The snapshot utility is extremely useful, and does not add any more overhead in memory when another latency tracer is enabled. They use the snapshot underneath. There's no reason to hide the snapshot file when a latency tracer has been enabled in the kernel. If any of the latency tracers (irq, preempt or wakeup) is enabled then also select the snapshot facility. Note, snapshot can be enabled without the latency tracers enabled. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Clear all trace buffers when unloaded module event was used	Steven Rostedt (Red Hat)	2013-03-15
	Currently we do not know what buffer a module event was enabled in. On unload, it is safest to clear all buffer instances, not just the top level buffer. Todo: Clear only the buffer that the event was used in. The infrastructure is there to do this, but it makes the code a bit more complex. Let's get the current code vetted before we add that. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Only clear trace buffer on module unload if event was traced	Steven Rostedt (Red Hat)	2013-03-15
	Currently, when a module with events is unloaded, the trace buffer is cleared. This is just a safety net in case the module might have some strange callback when its event is outputted. But there's no reason to reset the buffer if the module didn't have any of its events traced. Add a flag to the event "call" structure called WAS_ENABLED that gets set whenever the event is enabled and is never cleared. When a module gets unloaded, if any of its events have this flag set, then the trace buffer will get cleared. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
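The sticky-flag idea can be sketched in user-space C. All names here (`struct demo_event`, the `DEMO_EVENT_*` bits and the helper functions) are hypothetical and only mirror the behaviour described in the message; they are not the kernel's definitions.

```c
#include <stddef.h>

enum {
	DEMO_EVENT_ENABLED     = 1 << 0,	/* currently enabled */
	DEMO_EVENT_WAS_ENABLED = 1 << 1,	/* sticky: set on first enable, never cleared */
};

/* Hypothetical stand-in for an event "call" structure. */
struct demo_event {
	unsigned int flags;
};

static void demo_event_enable(struct demo_event *ev)
{
	ev->flags |= DEMO_EVENT_ENABLED | DEMO_EVENT_WAS_ENABLED;
}

/* Disabling clears only the live bit; the history bit stays set. */
static void demo_event_disable(struct demo_event *ev)
{
	ev->flags &= ~DEMO_EVENT_ENABLED;
}

/* On module unload: clear the buffer only if some event was ever enabled. */
static int demo_unload_must_clear(const struct demo_event *events, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (events[i].flags & DEMO_EVENT_WAS_ENABLED)
			return 1;
	return 0;
}
```

An event that was enabled and later disabled still forces a clear on unload, because its records may remain in the buffer.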
*	tracing: Add comment for trace event flag IGNORE_ENABLE	Steven Rostedt (Red Hat)	2013-03-15
	All the trace event flags have comments except the IGNORE_ENABLE flag, which is set for ftrace internal events that should not be enabled via the debugfs "enable" file. That is, if the top level enable file is set, it will enable all events. It used to just check the ftrace event call descriptor "reg" field and skip those without it, but now some ftrace internal events have a reg field but still need to be skipped. The flag was created to ignore those events. Now document it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	ring-buffer: Init waitqueue for blocked readers	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \|	The move of blocked readers to the ring buffer left out the init of the wait queue that is used. Tests missed this due to running stress tests against the buffers, which didn't allow for any readers to end up waiting. Running a simple read and wait triggered a bug. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Fix some section mismatch warnings	Li Zefan	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \|	As we've added __init annotation to field-defining functions, we should add __refdata annotation to event_call variables, which reference those functions. Link: http://lkml.kernel.org/r/51343C1F.2050502@huawei.com Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Fix trace events build without modules	Steven Rostedt (Red Hat)	2013-03-15
	The new multi-buffers added a descriptor that kept track of module events, and the directories they use, with struct ftrace_module_file_ops. This is used to add a ref count to keep modules from unloading while their files are being accessed. As the descriptor is only needed when CONFIG_MODULES is enabled, it is only declared when the config is enabled. But that struct is dereferenced in a few areas outside the #ifdef CONFIG_MODULES. By adding some helper routines and moving code around a little, events can be compiled again without modules. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add __per_cpu annotation to trace array percpu data pointer	Steven Rostedt (Red Hat)	2013-03-15
	With the conversion of the data array to per cpu, sparse now complains about the use of per_cpu_ptr() on the variable. But the variable is allocated with alloc_percpu() and is fine to use. But since the structure that contains the data variable does not annotate it as such, sparse gives out a lot of false warnings. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing/syscalls: Annotate field-defining functions with __init	Li Zefan	2013-03-15
\| \| \| \| \| \| \| \| \|	These two functions are called during kernel boot only. Link: http://lkml.kernel.org/r/51258796.7020704@huawei.com Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Annotate event field-defining functions with __init	Li Zefan	2013-03-15
	Those functions are called either during kernel boot or module init.
	Before:
	    $ dmesg | grep 'Freeing unused kernel memory'
	    Freeing unused kernel memory: 1208k freed
	    Freeing unused kernel memory: 1360k freed
	    Freeing unused kernel memory: 1960k freed
	After:
	    $ dmesg | grep 'Freeing unused kernel memory'
	    Freeing unused kernel memory: 1236k freed
	    Freeing unused kernel memory: 1388k freed
	    Freeing unused kernel memory: 1960k freed
	Link: http://lkml.kernel.org/r/5125877D.5000201@huawei.com
	Signed-off-by: Li Zefan <lizefan@huawei.com>
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add a helper function for event print functions	Li Zefan	2013-03-15
	Move duplicate code in event print functions to a helper function. This shrinks the size of the kernel by ~13K.
	       text     data       bss       dec      hex  filename
	    6596137  1743966  10138672  18478775  119f6b7  vmlinux.o.old
	    6583002  1743849  10138672  18465523  119c2f3  vmlinux.o.new
	Link: http://lkml.kernel.org/r/51258746.2060304@huawei.com
	Signed-off-by: Li Zefan <lizefan@huawei.com>
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing/ring-buffer: Move poll wake ups into ring buffer code	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \| \| \|	Move the logic to wake up on ring buffer data into the ring buffer code itself. This simplifies the tracing code a lot and also has the added benefit that waiters on one of the instance buffers can be woken only when data is added to that instance instead of data added to any instance. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Fix read blocking on trace_pipe_raw	Steven Rostedt	2013-03-15
	If the ring buffer is empty, a read to trace_pipe_raw won't block. The tracing code has the infrastructure to wake up waiting readers, but the trace_pipe_raw doesn't take advantage of that. When a read is done to trace_pipe_raw without the O_NONBLOCK flag set, have the read block until there's data in the requested buffer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Fix polling on trace_pipe_raw	Steven Rostedt	2013-03-15
	The trace_pipe_raw never implemented polling and this was causing issues for several utilities. This is now implemented. Blocked reads still are on the TODO list. Reported-by: Mauro Carvalho Chehab <mchehab@redhat.com> Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Do not block on splice if either file or splice NONBLOCK flag is set	Steven Rostedt (Red Hat)	2013-03-15
\| \| \| \| \| \| \| \|	Currently only the splice NONBLOCK flag is checked to determine if the splice read should block or not. But the file descriptor NONBLOCK flag also needs to be checked. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Use direct field, type and system names	Steven Rostedt	2013-03-15
	The names used to display the field and type in the event format files are copied, as well as the system name that is displayed. All these names are created by constant values passed in. If one of these values were to be removed by a module, the module would also be required to remove any event it created. By using the strings directly, we can save over 100K of memory. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Use kmem_cache_alloc instead of kmalloc in trace_events.c	Steven Rostedt	2013-03-15
	The event structures used by the trace events are mostly persistent, but they are also allocated by kmalloc, which is not the best at allocating space for what is used. By converting these kmallocs into kmem_cache_allocs, we can save over 50K of space that is permanently allocated. After boot we have:
	    slab name            active  allocated  size
	    ---------            ------  ---------  ----
	    ftrace_event_file       979       1005    56   67    1
	    ftrace_event_field     2301       2310    48   77    1
	The ftrace_event_file has at boot up 979 active objects out of 1005 allocated in the slabs. Each object is 56 bytes. In a normal kmalloc, that would allocate 64 bytes for each object.
	    1005 - 979  = 26 objects not used
	    26 * 56     = 1456 bytes wasted
	But if we used kmalloc:
	    64 - 56     = 8 bytes unused per allocation
	    8 * 979     = 7832 bytes wasted
	    7832 - 1456 = 6376 bytes in savings
	Doing the same for ftrace_event_field where there's 2301 objects allocated in a slab that can hold 2310 with 48 bytes each we have:
	    2310 - 2301 = 9 objects not used
	    9 * 48      = 432 bytes wasted
	A kmalloc would also use 64 bytes per object:
	    64 - 48     = 16 bytes unused per allocation
	    16 * 2301   = 36816 bytes wasted!
	    36816 - 432 = 36384 bytes in savings
	This change gives us a total of 42760 bytes in savings. At least on my machine, but as there's a lot of these persistent objects for all configurations that use trace points, this is a net win. Thanks to Ezequiel Garcia for his trace_analyze presentation which pointed out the wasted space in my code.
	Cc: Ezequiel Garcia <elezegarcia@gmail.com>
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
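The savings arithmetic above can be reproduced with a few small C helpers. The function names are hypothetical; the object counts, object sizes and the 64-byte kmalloc bucket are the figures quoted in the message.

```c
/* Bytes held by slab objects that are allocated but not in use. */
static long slab_waste(long active, long allocated, long obj_size)
{
	return (allocated - active) * obj_size;
}

/* Bytes lost to rounding each object up to the kmalloc bucket size. */
static long kmalloc_waste(long active, long obj_size, long bucket_size)
{
	return (bucket_size - obj_size) * active;
}

/* Net bytes saved by a dedicated slab cache compared with kmalloc. */
static long cache_savings(long active, long allocated, long obj_size,
			  long bucket_size)
{
	return kmalloc_waste(active, obj_size, bucket_size) -
	       slab_waste(active, allocated, obj_size);
}
```

With the quoted figures, `cache_savings(979, 1005, 56, 64)` gives 6376 and `cache_savings(2301, 2310, 48, 64)` gives 36384, which sum to the 42760 bytes stated in the message.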
*	tracing: Get trace_events kernel command line working again	Steven Rostedt	2013-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the new descriptors used to allow multiple buffers in the tracing directory added, the kernel command line parameter trace_events=... no longer works. This is because the top level (global) trace array now has a list of descriptors associated with the events and the files in the debugfs directory. But in early bootup, when the command line is processed and the events enabled, the trace array list of events has not been set up yet. Without the list of events in the trace array, the setting of events to record will fail because it would not match any events. The solution is to set up the top level array in two stages. The first is to just add the ftrace file descriptors that just point to the events. This will allow events to be enabled and start tracing. The second stage is called after the filesystem is set up, and this stage will create the debugfs event files and directories associated with the trace array events. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add rmdir to remove multibuffer instances	Steven Rostedt	2013-03-15
	Add a method to the hijacked dentry descriptor of the "instances" directory to allow for rmdir to remove an instance of a multibuffer.

	Example:

	  cd /debug/tracing/instances
	  mkdir hello
	  ls
	  hello/
	  rmdir hello
	  ls

	Like the mkdir method, the i_mutex is dropped for the instances directory. The instances directory is created at boot up and can not be renamed or removed. The trace_types_lock mutex is used to synchronize adding and removing of instances.

	I've run several stress tests with different threads trying to create and delete directories of the same name, and it has stood up fine.

	Cc: Al Viro <viro@zeniv.linux.org.uk>
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Add interface to allow multiple trace buffers	Steven Rostedt	2013-03-15
	Add the interface ("instances" directory) to add multiple buffers to ftrace. To create a new instance, simply do a mkdir in the instances directory:

	This will create a directory with the following:

	  # cd instances
	  # mkdir foo
	  # ls foo
	  buffer_size_kb        free_buffer  trace_clock    trace_pipe
	  buffer_total_size_kb  set_event    trace_marker   tracing_enabled
	  events/               trace        trace_options  tracing_on

	Currently only events are able to be set, and there isn't a way to delete a buffer when one is created (yet).

	Note, the i_mutex lock is dropped from the parent "instances" directory during the mkdir operation. As the "instances" directory can not be renamed or deleted (created on boot), I do not see any harm in dropping the lock. The creation of the sub directories is protected by trace_types_lock mutex, which only lets one instance get into the code path at a time. If two tasks try to create or delete directories of the same name, only one will occur and the other will fail with -EEXIST.

	Cc: Al Viro <viro@zeniv.linux.org.uk>
	Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
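The locking behaviour described in this message (one task at a time in the creation path, duplicates rejected with -EEXIST) can be sketched in user space. This is an analogy under stated assumptions, not the kernel code: `threading.Lock` stands in for the trace_types_lock mutex, and the `Instances` class is hypothetical.

```python
# User-space analogy for serialized instance creation; not kernel code.
# The Instances class is hypothetical; the lock stands in for trace_types_lock.
import errno
import threading


class Instances:
    def __init__(self):
        self._lock = threading.Lock()
        self._names = set()

    def mkdir(self, name):
        """Create an instance; return 0 on success or -EEXIST if it exists."""
        with self._lock:              # only one task in this path at a time
            if name in self._names:
                return -errno.EEXIST
            self._names.add(name)
            return 0


instances = Instances()
results = []


def worker():
    # list.append is atomic in CPython, so results needs no extra locking here
    results.append(instances.mkdir("foo"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = results.count(0)
losers = results.count(-errno.EEXIST)
```

Whichever thread gets the lock first creates "foo"; the remaining seven see the existing name and get -EEXIST, so `winners` is 1 and `losers` is 7 regardless of scheduling order.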
*	tracing: Make syscall events suitable for multiple buffers	Steven Rostedt	2013-03-15
	Currently the syscall events record into the global buffer. But if multiple buffers are in place, then we need to have syscall events record in the proper buffers. By adding descriptors to pass to the syscall event functions, the syscall events can now record into the buffers that have been assigned to them (one event may be applied to multiple buffers). This will allow tracing high volume syscalls along with seldom occurring syscalls without losing the seldom syscall events. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
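Why separate buffers keep seldom-occurring syscalls from being lost can be shown with a toy model. It assumes a fixed-size buffer that overwrites its oldest entries when full, which is a simplification; the classes and the buffer size are hypothetical, and the event names are used only as labels.

```python
# Toy model of events recording into assigned buffers; not kernel code.
# Assumption: a full buffer drops its oldest entry (simplified overwrite mode).
from collections import deque


class Buffer:
    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)  # oldest entry dropped when full

    def record(self, event_name):
        self.entries.append(event_name)


class SyscallEvent:
    def __init__(self, name, buffers):
        self.name = name
        self.buffers = list(buffers)  # one event may be applied to several buffers

    def fire(self):
        for buf in self.buffers:
            buf.record(self.name)


shared = Buffer(4)     # plays the role of a single global buffer
busy_only = Buffer(4)  # instance for the high-volume syscall
rare_only = Buffer(4)  # instance for the seldom-occurring syscall

busy = SyscallEvent("sys_enter_read", [shared, busy_only])
rare = SyscallEvent("sys_enter_mount", [shared, rare_only])

rare.fire()
for _ in range(100):
    busy.fire()

lost_in_shared = "sys_enter_mount" not in shared.entries
kept_in_own = "sys_enter_mount" in rare_only.entries
```

In the shared buffer the single rare event is pushed out by the flood of frequent ones, while the buffer assigned only to the rare event still holds it.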
*	tracing: Replace the static global per_cpu arrays with allocated per_cpu	Steven Rostedt	2013-03-15
	The global and max-tr currently use static per_cpu arrays for the CPU data descriptors. But in order to get new allocated trace_arrays, they need to be allocated per_cpu arrays. Instead of using the static arrays, switch the global and max-tr to use allocated data. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
*	tracing: Pass the ftrace_file to the buffer lock reserve code	Steven Rostedt	2013-03-15
	Pass the struct ftrace_event_file *ftrace_file to trace_event_buffer_lock_reserve() (a new function that replaces trace_current_buffer_lock_reserve()). The ftrace_file holds a pointer to the trace_array that is in use. In the case of multiple buffers with different trace_arrays, this allows different events to be recorded into different buffers. Also fixed some of the stale comments in include/trace/ftrace.h. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>