aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge tag 'trace-3.19' of ↵Linus Torvalds2014-12-10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "There was a lot of clean ups and minor fixes. One of those clean ups was to the trace_seq code. It also removed the return values to the trace_seq_*() functions and use trace_seq_has_overflowed() to see if the buffer filled up or not. This is similar to work being done to the seq_file code as well in another tree. Some of the other goodies include: - Added some "!" (NOT) logic to the tracing filter. - Fixed the frame pointer logic to the x86_64 mcount trampolines - Added the logic for dynamic trampolines on !CONFIG_PREEMPT systems. That is, the ftrace trampoline can be dynamically allocated and be called directly by functions that only have a single hook to them" * tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (55 commits) tracing: Truncated output is better than nothing tracing: Add additional marks to signal very large time deltas Documentation: describe trace_buf_size parameter more accurately tracing: Allow NOT to filter AND and OR clauses tracing: Add NOT to filtering logic ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter ftrace/x86: Get rid of ftrace_caller_setup ftrace/x86: Have save_mcount_regs macro also save stack frames if needed ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs ftrace/x86: Simplify save_mcount_regs on getting RIP ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file ftrace/x86: Have static tracing also use ftrace_caller_setup ftrace/x86: Have static function tracing always test for function graph kprobes: Add IPMODIFY flag to kprobe_ftrace_ops ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict kprobes/ftrace: Recover original IP if pre_handler doesn't change it tracing/trivial: Fix typos and make an int into a bool tracing: Deletion of an unnecessary check before iput() ...
| * tracing: Truncated output is better than nothingDan Carpenter2014-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The initial reason for this patch is that I noticed that: if (len > TRACE_BUF_SIZE) is off by one. In this code, if len == TRACE_BUF_SIZE, then it means we have truncated the last character off the output string. If we truncate two or more characters then we exit without printing. After some discussion, we decided that printing truncated data is better than not printing at all so we should just use vscnprintf() and remove the test entirely. Also I have updated memcpy() to copy the NUL char instead of setting the NUL in a separate step. Link: http://lkml.kernel.org/r/20141127155752.GA21914@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Add additional marks to signal very large time deltasByungchul Park2014-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, function graph tracer prints "!" or "+" just before function execution time to signal a function overhead, depending on the time. And some tracers tracing latency also print "!" or "+" just after time to signal overhead, depending on the interval between events. Even it is usually enough to do that, we sometimes need to signal for bigger execution time than 100 micro seconds. For example, I used function graph tracer to detect if there is any case that exit_mm() takes too much time. I did following steps in /sys/kernel/debug/tracing. It was easier to detect very large excution time with patched kernel than with original kernel. $ echo exit_mm > set_graph_function $ echo function_graph > current_tracer $ echo > trace $ cat trace_pipe > $LOGFILE ... (do something and terminate logging) $ grep "\\$" $LOGFILE 3) $ 22082032 us | } /* kernel_map_pages */ 3) $ 22082040 us | } /* free_pages_prepare */ 3) $ 22082113 us | } /* free_hot_cold_page */ 3) $ 22083455 us | } /* free_hot_cold_page_list */ 3) $ 22083895 us | } /* release_pages */ 3) $ 22177873 us | } /* free_pages_and_swap_cache */ 3) $ 22178929 us | } /* unmap_single_vma */ 3) $ 22198885 us | } /* unmap_vmas */ 3) $ 22206949 us | } /* exit_mmap */ 3) $ 22207659 us | } /* mmput */ 3) $ 22207793 us | } /* exit_mm */ And then, it was easy to find out that a schedule-out occured by sub_preempt_count() within kernel_map_pages(). To detect very large function exection time caused by either problematic function implementation or scheduling issues, this patch can be useful. Link: http://lkml.kernel.org/r/1416789259-24038-1-git-send-email-byungchul.park@lge.com Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * Documentation: describe trace_buf_size parameter more accuratelyJoonsoo Kim2014-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | I'm stuck into panic that too litte free memory is left when I boot with trace_buf_size parameter. After digging into the problem, I found that trace_buf_size is the size of trace buffer on each cpu rather than total size of trace buffer. To prevent victim like me, change description of trace_buf_size parameter more accurately. Link: http://lkml.kernel.org/r/1417570760-10620-1-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Allow NOT to filter AND and OR clausesSteven Rostedt (Red Hat)2014-12-03
| | | | | | | | | | | | | | | | | | | | Add support to allow not "!" for and (&&) and (||). That is: !(field1 == X && field2 == Y) Where the value of the full clause will be notted. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Add NOT to filtering logicSteven Rostedt (Red Hat)2014-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ted noticed that he could not filter on an event for a bit being cleared. That's because the filtering logic only tests event fields with a limited number of comparisons which, for bit logic, only include "&", which can test if a bit is set, but there's no good way to see if a bit is clear. This adds a way to do: !(field & 2048) Which returns true if the bit is not set, and false otherwise. Note, currently !(field1 == 10 && field2 == 15) is not supported. That is, the 'not' only works for direct comparisons, not for the AND and OR logic. Link: http://lkml.kernel.org/r/20141202021912.GA29096@thunk.org Link: http://lkml.kernel.org/r/20141202120430.71979060@gandalf.local.home Acked-by: Alexei Starovoitov <ast@plumgrid.com> Suggested-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameterSteven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | The function graph helper function prepare_ftrace_return() which does the work to hijack the parent pointer has that parent pointer as its first parameter. Instead, if we make it the second parameter and have ip as the first parameter (self_addr), then it can use the %rdi from save_mcount_regs that loads it already. Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Get rid of ftrace_caller_setupSteven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | Move all the work from ftrace_caller_setup into save_mcount_regs. This simplifies the code and makes it easier to understand. Link: http://lkml.kernel.org/r/CA+55aFxUTUbdxpjVMW8X9c=o8sui7OB_MYPfcbJuDyfUWtNrNg@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Have save_mcount_regs macro also save stack frames if neededSteven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The save_mcount_regs macro saves and restores the required mcount regs that need to be saved before calling C code. It is done for all the function hook utilities (static tracing, dynamic tracing, regs, function graph). When frame pointers are enabled, the ftrace trampolines need to set up frames and pointers such that a back trace (dump stack) can continue passed them. Currently, a separate macro is used (create_frame) to do this, but it's only done for the ftrace_caller and ftrace_reg_caller functions. It is not done for the static tracer or function graph tracing. Instead of having a separate macro doing the recording of the frames, have the save_mcount_regs perform this task. This also has all tracers saving the frame pointers when needed. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save ↵Steven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mcount regs The macro save_mcount_regs saves regs onto the stack. But to uncouple the amount of stack used in that macro from the users of the macro, we need to have a define that tells all the users how much stack is used by that macro. This way we can change the amount of stack the macro uses without breaking its users. Also remove some dead code that was left over from commit fdc841b58cf5 "ftrace: x86: Remove check of obsolete variable function_trace_stop". Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Simplify save_mcount_regs on getting RIPSteven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently save_mcount_regs is passed a "skip" parameter to know how much stack updated the pt_regs, as it tries to keep the saved pt_regs in the same location for all users. This is rather stupid, especially since the part stored on the pt_regs has nothing to do with what is suppose to be in that location. Instead of doing that, just pass in an "added" parameter that lets that macro know how much stack was added before it was called so that it can get to the RIP. But the difference is that it will now offset the pt_regs by that "added" count. The caller now needs to take care of the offset of the pt_regs. This will make it easier to simplify the code later. Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameterSteven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of having save_mcount_regs store the RIP in %rdx as a temp register to place it in the proper location of the pt_regs on the stack. Use the %rdi register as the temp register. This lets us remove the extra store in the ftrace_caller_setup macro. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed commentsSteven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The name MCOUNT_SAVE_FRAME is rather confusing as it really isn't a function frame that is saved, but just the required mcount registers that are needed to be saved before C code may be called. The word "frame" confuses it as being a function frame which it is not. Rename MCOUNT_SAVE_FRAME and MCOUNT_RESTORE_FRAME to save_mcount_regs and restore_mcount_regs respectively. Noticed the lower case, which keeps it from screaming at the reviewers. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Move MCOUNT_SAVE_FRAME out of header fileSteven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | Linus pointed out that MCOUNT_SAVE_FRAME is used in only a single file and that there's no reason that it should be in a header file. Move the macro to the code that uses it. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Have static tracing also use ftrace_caller_setupSteven Rostedt (Red Hat)2014-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linus pointed out that there were locations that did the hard coded update of the parent and rip parameters. One of them was the static tracer which could also use the ftrace_caller_setup to do that work. In fact, because it did not use it, it is prone to bugs, and since the static tracer is hardly ever used (who wants function tracing code always being called?) it doesn't get tested very often. I only run a few "does it still work" tests on it. But I do not run stress tests on that code. Although, since it is never turned off, just having it on should be stressful enough. (especially for the performance folks) There's no reason that the static tracer can't also use ftrace_caller_setup. Have it do so. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Have static function tracing always test for function graphSteven Rostedt (Red Hat)2014-11-24
| | | | | | | | | | | | | | | | | | | | | | | | | | New updates to the ftrace generic code had ftrace_stub not always being called when ftrace is off. This causes the static tracer to always save and restore functions. But it also showed that when function tracing is running, the function graph tracer can not. We should always check to see if function graph tracing is running even if the function tracer is running too. The function tracer code is not the only one that uses the hook to function mcount. Cc: Markos Chandras <Markos.Chandras@imgtec.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * kprobes: Add IPMODIFY flag to kprobe_ftrace_opsMasami Hiramatsu2014-11-21
| | | | | | | | | | | | | | | | | | | | Add FTRACE_OPS_FL_IPMODIFY flag to kprobe_ftrace_ops since kprobes can changes regs->ip. Link: http://lkml.kernel.org/r/20141121102523.11844.21298.stgit@localhost.localdomain Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace, kprobes: Support IPMODIFY flag to find IP modify conflictMasami Hiramatsu2014-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce FTRACE_OPS_FL_IPMODIFY to avoid conflict among ftrace users who may modify regs->ip to change the execution path. If two or more users modify the regs->ip on the same function entry, one of them will be broken. So they must add IPMODIFY flag and make sure that ftrace_set_filter_ip() succeeds. Note that ftrace doesn't allow ftrace_ops which has IPMODIFY flag to have notrace hash, and the ftrace_ops must have a filter hash (so that the ftrace_ops can hook only specific entries), because it strongly depends on the address and must be allowed for only few selected functions. Link: http://lkml.kernel.org/r/20141121102516.11844.27829.stgit@localhost.localdomain Cc: Jiri Kosina <jkosina@suse.cz> Cc: Seth Jennings <sjenning@redhat.com> Cc: Petr Mladek <pmladek@suse.cz> Cc: Vojtech Pavlik <vojtech@suse.cz> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> [ fixed up some of the comments ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * kprobes/ftrace: Recover original IP if pre_handler doesn't change itMasami Hiramatsu2014-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recover original IP register if the pre_handler doesn't change it. Since current kprobes doesn't expect that another ftrace handler may change regs->ip, it sets kprobe.addr + MCOUNT_INSN_SIZE to regs->ip and returns to ftrace. This seems wrong behavior since kprobes can recover regs->ip and safely pass it to another handler. This adds code which recovers original regs->ip passed from ftrace right before returning to ftrace, so that another ftrace user can change regs->ip. Link: http://lkml.kernel.org/r/20141009130106.4698.26362.stgit@kbuild-f20.novalocal Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/trivial: Fix typos and make an int into a boolSteven Rostedt (Red Hat)2014-11-20
| | | | | | | | | | | | | | | | | | | | Fix up a few typos in comments and convert an int into a bool in update_traceon_count(). Link: http://lkml.kernel.org/r/546DD445.5080108@hitachi.com Suggested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Deletion of an unnecessary check before iput()Markus Elfring2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | The iput() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Link: http://lkml.kernel.org/r/5468F875.7080907@users.sourceforge.net Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Fix return value of ftrace_raw_output_prep()Steven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the trace_seq of ftrace_raw_output_prep() is full this function returns TRACE_TYPE_PARTIAL_LINE, otherwise it returns zero. The problem is that TRACE_TYPE_PARTIAL_LINE happens to be zero! The thing is, the caller of ftrace_raw_output_prep() expects a success to be zero. Change that to expect it to be TRACE_TYPE_HANDLED. Link: http://lkml.kernel.org/r/20141114112522.GA2988@dhcp128.suse.cz Reminded-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Remove return values of most trace_seq_*() functionsSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The trace_seq_printf() and friends are used to store strings into a buffer that can be passed around from function to function. If the trace_seq buffer fills up, it will not print any more. The return values were somewhat inconsistant and using trace_seq_has_overflowed() was a better way to know if the write to the trace_seq buffer succeeded or not. Now that all users have removed reading the return value of the printf() type functions, they can safely return void and keep future users of them from reading the inconsistent values as well. Link: http://lkml.kernel.org/r/20141114011411.992510720@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Do not use return values of trace_seq_printf() in syscall tracingSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | The functions trace_seq_printf() and friends will not be returning values soon and will be void functions. To know if they succeeded or not, the functions trace_seq_has_overflowed() and trace_handle_return() should be used instead. Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/uprobes: Do not use return values of trace_seq_printf()Steven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The functions trace_seq_printf() and friends will soon no longer have return values. Using trace_seq_has_overflowed() and trace_handle_return() should be used instead. Link: http://lkml.kernel.org/r/20141114011411.693008134@goodmis.org Link: http://lkml.kernel.org/r/20141115050602.333705855@goodmis.org Reviewed-by: Masami Hiramatsu <masami.hiramatu.pt@hitachi.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/probes: Do not use return value of trace_seq_printf()Steven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | The functions trace_seq_printf() and friends will soon not have a return value and will only be a void function. Use trace_seq_has_overflowed() instead to know if the trace_seq operations succeeded or not. Link: http://lkml.kernel.org/r/20141114011411.530216306@goodmis.org Reviewed-by: Petr Mladek <pmladek@suse.cz> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Do not check return values of trace_seq_p*() for mmio tracerSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The return values for trace_seq_printf() and friends are going to be removed and they will become void functions. The mmio tracer checked their return and even did so incorrectly. Some of the funtions which returned the values were never checked themselves. Removing all the checks simplifies the code. Use trace_seq_has_overflowed() and trace_handle_return() where necessary instead. Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * kprobes/tracing: Use trace_seq_has_overflowed() for overflow checksSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of checking the return value of trace_seq_printf() and friends for overflowing of the buffer, use the trace_seq_has_overflowed() helper function. This cleans up the code quite a bit and also takes us a step closer to changing the return values of trace_seq_printf() and friends to void. Link: http://lkml.kernel.org/r/20141114011411.181812785@goodmis.org Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Petr Mladek <pmladek@suse.cz> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Have function_graph use trace_seq_has_overflowed()Steven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of doing individual checks all over the place that makes the code very messy. Just check trace_seq_has_overflowed() at the end or in strategic places. This makes the code much cleaner and also helps with getting closer to removing the return values of trace_seq_printf() and friends. Link: http://lkml.kernel.org/r/20141114011410.987913836@goodmis.org Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Have branch tracer use trace_handle_return() helper functionSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | The branch tracer should not be checking the trace_seq_printf() return value as that will soon be void. There's a new trace_handle_return() helper function that will return TRACE_TYPE_PARTIAL_LINE if the trace_seq overflowed and TRACE_TYPE_HANDLED otherwise. Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ring-buffer: Remove check of trace_seq_{puts,printf}() return valuesSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | Remove checking the return value of all trace_seq_puts(). It was wrong anyway as only the last return value mattered. But as the trace_seq_puts() is going to be a void function in the future, we should not be checking the return value of it anyway. Just return !trace_seq_has_overflowed() instead. Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * blktrace/tracing: Use trace_seq_has_overflowed() helper functionSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checking the return code of every trace_seq_printf() operation and having to return early if it overflowed makes the code messy. Using the new trace_seq_has_overflowed() and trace_handle_return() functions allows us to clean up the code. In the future, trace_seq_printf() and friends will be turning into void functions and not returning a value. The trace_seq_has_overflowed() is to be used instead. This cleanup allows that change to take place. Cc: Jens Axboe <axboe@fb.com> Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Add trace_seq_has_overflowed() and trace_handle_return()Steven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding a trace_seq_has_overflowed() which returns true if the trace_seq had too much written into it allows us to simplify the code. Instead of checking the return value of every call to trace_seq_printf() and friends, they can all be called normally, and at the end we can return !trace_seq_has_overflowed() instead. Several functions also return TRACE_TYPE_PARTIAL_LINE when the trace_seq overflowed and TRACE_TYPE_HANDLED otherwise. Another helper function was created called trace_handle_return() which takes a trace_seq and returns these enums. Using this helper function also simplifies the code. This change also makes it possible to remove the return values of trace_seq_printf() and friends. They should instead just be void functions. Link: http://lkml.kernel.org/r/20141114011410.365183157@goodmis.org Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Fix trace_seq_bitmask() to start at current positionSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In trace_seq_bitmask() it calls bitmap_scnprintf() not from the current position of the trace_seq buffer (s->buffer + s->len), but instead from the beginning of the buffer (s->buffer). Luckily, the only user of this "ipi_raise tracepoint" uses it as the first parameter, and as such, the start of the temp buffer in include/trace/ftrace.h (see __get_bitmask()). Reported-by: Petr Mladek <pmladek@suse.cz> Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * RAS/tracing: Use trace_seq_buffer_ptr() helper instead of open codedSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the helper function trace_seq_buffer_ptr() to get the current location of the next buffer write of a trace_seq object, instead of open coding it. This facilitates the conversion of trace_seq to use seq_buf. Tested-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Borislav Petkov <bp@suse.de> Reviewed-by: Petr Mladek <pmladek@suse.cz> Cc: Chen Gong <gong.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * x86/kvm/tracing: Use helper function trace_seq_buffer_ptr()Steven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To allow for the restructiong of the trace_seq code, we need users of it to use the helper functions instead of accessing the internals of the trace_seq structure itself. Link: http://lkml.kernel.org/r/20141104160221.585025609@goodmis.org Tested-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Mark Rustad <mark.d.rustad@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.cz> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86/extable: Add is_ftrace_trampoline() functionSteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stack traces that happen from function tracing check if the address on the stack is a __kernel_text_address(). That is, is the address kernel code. This calls core_kernel_text() which returns true if the address is part of the builtin kernel code. It also calls is_module_text_address() which returns true if the address belongs to module code. But what is missing is ftrace dynamically allocated trampolines. These trampolines are allocated for individual ftrace_ops that call the ftrace_ops callback functions directly. But if they do a stack trace, the code checking the stack wont detect them as they are neither core kernel code nor module address space. Adding another field to ftrace_ops that also stores the size of the trampoline assigned to it we can create a new function called is_ftrace_trampoline() that returns true if the address is a dynamically allocate ftrace trampoline. Note, it ignores trampolines that are not dynamically allocated as they will return true with the core_kernel_text() function. Link: http://lkml.kernel.org/r/20141119034829.497125839@goodmis.org Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Add frames pointers to trampoline as necessarySteven Rostedt (Red Hat)2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_FRAME_POINTERS are enabled, it is required that the ftrace_caller and ftrace_regs_caller trampolines set up frame pointers otherwise a stack trace from a function call wont print the functions that called the trampoline. This is due to a check in __save_stack_address(): #ifdef CONFIG_FRAME_POINTER if (!reliable) return; #endif The "reliable" variable is only set if the function address is equal to contents of the address before the address the frame pointer register points to. If the frame pointer is not set up for the ftrace caller then this will fail the reliable test. It will miss the function that called the trampoline. Worse yet, if fentry is used (gcc 4.6 and beyond), it will also miss the parent, as the fentry is called before the stack frame is set up. That means the bp frame pointer points to the stack of just before the parent function was called. Link: http://lkml.kernel.org/r/20141119034829.355440340@goodmis.org Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: stable@vger.kernel.org # 3.7+ Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Fix race of function probes countingSteven Rostedt (Red Hat)2014-11-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function probe counting for traceon and traceoff suffered a race condition where if the probe was executing on two or more CPUs at the same time, it could decrement the counter by more than one when disabling (or enabling) the tracer only once. The way the traceon and traceoff probes are suppose to work is that they disable (or enable) tracing once per count. If a user were to echo 'schedule:traceoff:3' into set_ftrace_filter, then when the schedule function was called, it would disable tracing. But the count should only be decremented once (to 2). Then if the user enabled tracing again (via tracing_on file), the next call to schedule would disable tracing again and the count would be decremented to 1. But if multiple CPUS called schedule at the same time, it is possible that the count would be decremented more than once because of the simple "count--" used. By reading the count into a local variable and using memory barriers we can guarantee that the count would only be decremented once per disable (or enable). The stack trace probe had a similar race, but here the stack trace will decrement for each time it is called. But this had the read-modify- write race, where it could stack trace more than the number of times that was specified. This case we use a cmpxchg to stack trace only the number of times specified. The dump probes can still use the old "update_count()" function as they only run once, and that is controlled by the dump logic itself. Link: http://lkml.kernel.org/r/20141118134643.4b550ee4@gandalf.local.home Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * function_graph: Fix micro seconds notationsByungchul Park2014-11-14
| | | | | | | | | | | | | | | | | | | | | | Usually, "msecs" notation means milli-seconds, and "usecs" notation means micro-seconds. Since the unit used in the code is micro-seconds, the notation should be replaced from msecs to usecs. Link: http://lkml.kernel.org/r/1415171926-9782-2-git-send-email-byungchul.park@lge.com Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace-graph: show latency-format on print_graph_irq()Daniel Bristot de Oliveira2014-11-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On the function_graph tracer, the print_graph_irq() function prints a trace line with the flag ==========> on an irq handler entry, and the flag <========== on an irq handler return. But when the latency-format is enable, it is not printing the latency-format flags, causing the following error in the trace output: 0) ==========> | 0) d... | smp_apic_timer_interrupt() { This patch fixes this issue by printing the latency-format flags when it is enable. Link: http://lkml.kernel.org/r/7c2e226dac20c940b6242178fab7f0e3c9b5ce58.1415233316.git.bristot@redhat.com Reviewed-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * trace: Replace single-character seq_puts with seq_putcRasmus Villemoes2014-11-14
| | | | | | | | | | | | | | | | | | | | | | Printing a single character to a seqfile might as well be done with seq_putc instead of seq_puts; this avoids a strlen() call and a memory access. It also shaves another few bytes off the generated code. Link: http://lkml.kernel.org/r/1415479332-25944-4-git-send-email-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Merge consecutive seq_puts callsRasmus Villemoes2014-11-13
| | | | | | | | | | | | | | | | | | | | | | | | Consecutive seq_puts calls with literal strings can be merged to a single call. This reduces the size of the generated code, and can also lead to slight .rodata reduction (because of fewer nul and padding bytes). It should also shave a off a few clock cycles. Link: http://lkml.kernel.org/r/1415479332-25944-3-git-send-email-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Replace seq_printf by simpler equivalentsRasmus Villemoes2014-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using seq_printf to print a simple string or a single character is a lot more expensive than it needs to be, since seq_puts and seq_putc exist. These patches do seq_printf(m, s) -> seq_puts(m, s) seq_printf(m, "%s", s) -> seq_puts(m, s) seq_printf(m, "%c", c) -> seq_putc(m, c) Subsequent patches will simplify further. Link: http://lkml.kernel.org/r/1415479332-25944-2-git-send-email-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: kdb: Fix kernel livelock with empty buffersDaniel Thompson2014-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently kdb's ftdump command will livelock by constantly printk'ing the empty string at KERN_EMERG level if it run when the ftrace system is not in use. This occurs because trace_empty() never returns false when the ring buffers are left at the start of a non-consuming read [launched by ring_buffer_read_start()]. This patch changes the loop exit condition to use the result of trace_find_next_entry_inc(). Effectively this switches the non-consuming kdb dumper to follow the approach of the non-consuming userspace interface [s_next()] rather than the consuming ftrace_dump(). Link: http://lkml.kernel.org/r/1415277716-19419-3-git-send-email-daniel.thompson@linaro.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: kdb: Fix kernel panic during ftdumpDaniel Thompson2014-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently kdb's ftdump command unconditionally crashes due to a null pointer de-reference whenever the command is run. This in turn causes the kernel to panic. The abridged stacktrace (gathered with ARCH=arm) is: --- cut here --- [<c09535ac>] (panic) from [<c02132dc>] (die+0x264/0x440) [<c02132dc>] (die) from [<c0952eb8>] (__do_kernel_fault.part.11+0x74/0x84) [<c0952eb8>] (__do_kernel_fault.part.11) from [<c021f954>] (do_page_fault+0x1d0/0x3c4) [<c021f954>] (do_page_fault) from [<c020846c>] (do_DataAbort+0x48/0xac) [<c020846c>] (do_DataAbort) from [<c0213c58>] (__dabt_svc+0x38/0x60) Exception stack(0xc0deba88 to 0xc0debad0) ba80: e8c29180 00000001 e9854304 e9854300 c0f567d8 c0df2580 baa0: 00000000 00000000 00000000 c0f117b8 c0e3a3c0 c0debb0c 00000000 c0debad0 bac0: 0000672e c02f4d60 60000193 ffffffff [<c0213c58>] (__dabt_svc) from [<c02f4d60>] (kdb_ftdump+0x1e4/0x3d8) [<c02f4d60>] (kdb_ftdump) from [<c02ce328>] (kdb_parse+0x2b8/0x698) [<c02ce328>] (kdb_parse) from [<c02ceef0>] (kdb_main_loop+0x52c/0x784) [<c02ceef0>] (kdb_main_loop) from [<c02d1b0c>] (kdb_stub+0x238/0x490) --- cut here --- The NULL deref occurs due to the initialized use of struct trace_iter's buffer_iter member. This is a regression, albeit a fairly elderly one. It was introduced by commit 6d158a813efc ("tracing: Remove NR_CPUS array from trace_iterator"). This patch solves this by providing a collection of ring_buffer_iter(s) and using this to initialize buffer_iter. Note that static allocation is used solely because the trace_iter itself is also static allocated. Static allocation also means that we have to NULL-ify the pointer during cleanup to avoid use-after-free problems. Link: http://lkml.kernel.org/r/1415277716-19419-2-git-send-email-daniel.thompson@linaro.org Cc: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Fix traceoff_on_warning handling on boot command lineLuis Claudio R. Goncalves2014-11-13
| | | | | | | | | | | | | | | | | | | | | | | | According to the documentation, adding "traceoff_on_warning" to the boot command line should be enough to enable the feature. But right now it is necessary to specify "traceoff_on_warning=". Along with fixing that, also verify if the value passed, if any, is either "0" or "off". Link: http://lkml.kernel.org/r/20141112231400.GL12281@uudg.org Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace: Have the control_ops get a trampolineSteven Rostedt (Red Hat)2014-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the new logic, if only a single user of ftrace function hooks is used, it will get its own trampoline assigned to it. The problem is that the control_ops is an indirect ops that perf ops uses. What that means is that when perf registers its ops with register_ftrace_function(), it has the CONTROL flag set and gets added to the control list instead of the global ftrace list. The control_ops gets added to that instead and the mcount trampoline calls the control_ops function. The control_ops function will iterate the control list and call the ops functions that are attached to it. But currently the trampoline is added to the perf ops and not the control ops, and when ftrace tries to find a trampoline hook for it, it fails to find one and gives the following splat: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 10133 at kernel/trace/ftrace.c:2033 ftrace_get_addr_new+0x6f/0xc0() Modules linked in: [...] CPU: 0 PID: 10133 Comm: perf Tainted: P 3.18.0-rc1-test+ #388 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 00000000000007f1 ffff8800c2643bc8 ffffffff814fca6e ffff88011ea0ed01 0000000000000000 ffff8800c2643c08 ffffffff81041ffd 0000000000000000 ffffffff810c388c ffffffff81a5a350 ffff880119b00000 ffffffff810001c8 Call Trace: [<ffffffff814fca6e>] dump_stack+0x46/0x58 [<ffffffff81041ffd>] warn_slowpath_common+0x81/0x9b [<ffffffff810c388c>] ? ftrace_get_addr_new+0x6f/0xc0 [<ffffffff810001c8>] ? 0xffffffff810001c8 [<ffffffff81042031>] warn_slowpath_null+0x1a/0x1c [<ffffffff810c388c>] ftrace_get_addr_new+0x6f/0xc0 [<ffffffff8102e938>] ftrace_replace_code+0xd6/0x334 [<ffffffff810c4116>] ftrace_modify_all_code+0x41/0xc5 [<ffffffff8102eba6>] arch_ftrace_update_code+0x10/0x19 [<ffffffff810c293c>] ftrace_run_update_code+0x21/0x42 [<ffffffff810c298f>] ftrace_startup_enable+0x32/0x34 [<ffffffff810c3049>] ftrace_startup+0x14e/0x15a [<ffffffff810c307c>] register_ftrace_function+0x27/0x40 [<ffffffff810dc118>] perf_ftrace_event_register+0x3e/0xee [<ffffffff810dbfbe>] perf_trace_init+0x29d/0x2a9 [<ffffffff810eb422>] perf_tp_event_init+0x27/0x3a [<ffffffff810f18bc>] perf_init_event+0x9e/0xed [<ffffffff810f1ba4>] perf_event_alloc+0x299/0x330 [<ffffffff810f236b>] SYSC_perf_event_open+0x3ee/0x816 [<ffffffff8115a066>] ? mntput+0x2d/0x2f [<ffffffff81142b00>] ? __fput+0xa7/0x1b2 [<ffffffff81091300>] ? do_gettimeofday+0x22/0x3a [<ffffffff810f279c>] SyS_perf_event_open+0x9/0xb [<ffffffff81502a92>] system_call_fastpath+0x12/0x17 ---[ end trace 81a53565150e4982 ]--- Bad trampoline accounting at: ffffffff810001c8 (run_init_process+0x0/0x2d) (10000001) Update the control_ops trampoline instead of the perf ops one. Reported-by: lkp@01.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Add entry->next_cpu to trace_ctxwake_bin()Jiang Liu2014-11-11
| | | | | | | | | | | | | | | | | | | | Function trace_ctxwake_bin() misses ctx_switch_entry->next_cpu field, so user will get stale value for "next_cpu". Link: http://lkml.kernel.org/p/1377176379-27908-1-git-send-email-liuj97@gmail.com Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Move tracing_sched_{switch,wakeup}() into wakeup tracerSteven Rostedt (Red Hat)2014-11-11
| | | | | | | | | | | | | | | | | | | | | | | | The only code that references tracing_sched_switch_trace() and tracing_sched_wakeup_trace() is the wakeup latency tracer. Those two functions use to belong to the sched_switch tracer which has long been removed. These functions were left behind because the wakeup latency tracer used them. But since the wakeup latency tracer is the only one to use them, they should be static functions inside that code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>