| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit b6345879ccbd9b92864fbd7eb8ac48acdb4d6b15 upstream.
A bug was found with Li Zefan's ftrace_stress_test that caused applications
to segfault during the test.
Placing a tracing_off() in the segfault code, and examining several
traces, I found that the following was always the case. The lock tracer
was enabled (lockdep being required) and userstack was enabled. Testing
this out, I just enabled the two, but that was not good enough. I needed
to run something else that could trigger it. Running a load like hackbench
did not work, but executing a new program would. The following would
trigger the segfault within seconds:
# echo 1 > /debug/tracing/options/userstacktrace
# echo 1 > /debug/tracing/events/lock/enable
# while :; do ls > /dev/null ; done
Enabling the function graph tracer and looking at what was happening
I finally noticed that all cashes happened just after an NMI.
1) | copy_user_handle_tail() {
1) | bad_area_nosemaphore() {
1) | __bad_area_nosemaphore() {
1) | no_context() {
1) | fixup_exception() {
1) 0.319 us | search_exception_tables();
1) 0.873 us | }
[...]
1) 0.314 us | __rcu_read_unlock();
1) 0.325 us | native_apic_mem_write();
1) 0.943 us | }
1) 0.304 us | rcu_nmi_exit();
[...]
1) 0.479 us | find_vma();
1) | bad_area() {
1) | __bad_area() {
After capturing several traces of failures, all of them happened
after an NMI. Curious about this, I added a trace_printk() to the NMI
handler to read the regs->ip to see where the NMI happened. In which I
found out it was here:
ffffffff8135b660 <page_fault>:
ffffffff8135b660: 48 83 ec 78 sub $0x78,%rsp
ffffffff8135b664: e8 97 01 00 00 callq ffffffff8135b800 <error_entry>
What was happening is that the NMI would happen at the place that a page
fault occurred. It would call rcu_read_lock() which was traced by
the lock events, and the user_stack_trace would run. This would trigger
a page fault inside the NMI. I do not see where the CR2 register is
saved or restored in NMI handling. This means that it would corrupt
the page fault handling that the NMI interrupted.
The reason the while loop of ls helped trigger the bug, was that
each execution of ls would cause lots of pages to be faulted in, and
increase the chances of the race happening.
The simple solution is to not allow user stack traces in NMI context.
After this patch, I ran the above "ls" test for a couple of hours
without any issues. Without this patch, the bug would trigger in less
than a minute.
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit a2f8071428ed9a0f06865f417c962421c9a6b488 upstream.
When the trace iterator is read, tracing_start() and tracing_stop()
is called to stop tracing while the iterator is processing the trace
output.
These functions disable both the standard buffer and the max latency
buffer. But if the wakeup tracer is running, it can switch these
buffers between the two disables:
buffer = global_trace.buffer;
if (buffer)
ring_buffer_record_disable(buffer);
<<<--------- swap happens here
buffer = max_tr.buffer;
if (buffer)
ring_buffer_record_disable(buffer);
What happens is that we disabled the same buffer twice. On tracing_start()
we can enable the same buffer twice. All ring_buffer_record_disable()
must be matched with a ring_buffer_record_enable() or the buffer
can be disable permanently, or enable prematurely, and cause a bug
where a reset happens while a trace is commiting.
This patch protects these two by taking the ftrace_max_lock to prevent
a switch from occurring.
Found with Li Zefan's ftrace_stress_test.
Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 283740c619d211e34572cc93c8cdba92ccbdb9cc upstream.
In the ftrace code that resets the ring buffer it references the
buffer with a local variable, but then uses the tr->buffer as the
parameter to reset. If the wakeup tracer is running, which can
switch the tr->buffer with the max saved buffer, this can break
the requirement of disabling the buffer before the reset.
buffer = tr->buffer;
ring_buffer_record_disable(buffer);
synchronize_sched();
__tracing_reset(tr->buffer, cpu);
If the tr->buffer is swapped, then the reset is not happening to the
buffer that was disabled. This will cause the ring buffer to fail.
Found with Li Zefan's ftrace_stress_test.
Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit ac91d85456372a90af5b85eb6620fd2efb1e431b upstream.
This warning in s_next() can be triggered by lseek():
[<c018b3f7>] ? s_next+0x77/0x80
[<c013e3c1>] warn_slowpath_common+0x81/0xa0
[<c018b3f7>] ? s_next+0x77/0x80
[<c013e3fa>] warn_slowpath_null+0x1a/0x20
[<c018b3f7>] s_next+0x77/0x80
[<c01efa77>] traverse+0x117/0x200
[<c01eff13>] seq_lseek+0xa3/0x120
[<c01efe70>] ? seq_lseek+0x0/0x120
[<c01d7081>] vfs_llseek+0x41/0x50
[<c01d8116>] sys_llseek+0x66/0xa0
[<c0102bd0>] sysenter_do_call+0x12/0x26
The iterator "leftover" variable is zeroed in the opening of the trace
file. But lseek can call s_start() which will call s_next() without
reseting the "leftover" variable back to zero, which might trigger
the WARN_ON_ONCE(iter->leftover) that is in s_next().
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B8CE06A.9090207@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit ea14eb714041d40fcc5180b5a586034503650149 upstream.
If the graph tracer is active, and a task is forked but the allocating of
the processes graph stack fails, it can cause crash later on.
This is due to the temporary stack being NULL, but the curr_ret_stack
variable is copied from the parent. If it is not -1, then in
ftrace_graph_probe_sched_switch() the following:
for (index = next->curr_ret_stack; index >= 0; index--)
next->ret_stack[index].calltime += timestamp;
Will cause a kernel OOPS.
Found with Li Zefan's ftrace_stress_test.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 52fbe9cde7fdb5c6fac196d7ebd2d92d05ef3cd4 upstream.
The ring buffer resizing and resetting relies on a schedule RCU
action. The buffers are disabled, a synchronize_sched() is called
and then the resize or reset takes place.
But this only works if the disabling of the buffers are within the
preempt disabled section, otherwise a window exists that the buffers
can be written to while a reset or resize takes place.
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B949E43.2010906@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 86c38a31aa7f2dd6e74a262710bf8ebf7455acc5 upstream.
GCC 4.5 introduces behavior that forces the alignment of structures to
use the largest possible value. The default value is 32 bytes, so if
some structures are defined with a 4-byte alignment and others aren't
declared with an alignment constraint at all - it will align at 32-bytes.
For things like the ftrace events, this results in a non-standard array.
When initializing the ftrace subsystem, we traverse the _ftrace_events
section and call the initialization callback for each event. When the
structures are misaligned, we could be treating another part of the
structure (or the zeroed out space between them) as a function pointer.
This patch forces the alignment for all the ftrace_event_call structures
to 4 bytes.
Without this patch, the kernel fails to boot very early when built with
gcc 4.5.
It's trivial to check the alignment of the members of the array, so it
might be worthwhile to add something to the build system to do that
automatically. Unfortunately, that only covers this case. I've asked one
of the gcc developers about adding a warning when this condition is seen.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
LKML-Reference: <4B85770B.6010901@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Trying to add a probe like:
echo p:myprobe 0x10000 > /sys/kernel/debug/tracing/kprobe_events
will fail since the wrong pointer is passed to strict_strtoul
when trying to convert the address to an unsigned long.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20100210162346.GA6933@osiris.boeblingen.de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we cat <debugfs>/tracing/stack_trace, we may cause circular lock:
sys_read()
t_start()
arch_spin_lock(&max_stack_lock);
t_show()
seq_printf(), vsnprintf() .... /* they are all trace-able,
when they are traced, max_stack_lock may be required again. */
The following script can trigger this circular dead lock very easy:
#!/bin/bash
echo 1 > /proc/sys/kernel/stack_tracer_enabled
mount -t debugfs xxx /mnt > /dev/null 2>&1
(
# make check_stack() zealous to require max_stack_lock
for ((; ;))
{
echo 1 > /mnt/tracing/stack_max_size
}
) &
for ((; ;))
{
cat /mnt/tracing/stack_trace > /dev/null
}
To fix this bug, we increase the percpu trace_active before
require the lock.
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B67D4F9.9080905@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
| |
Update the graph tracer examples to cover the new frame pointer semantics
(in terms of passing it along). Move the HAVE_FUNCTION_GRAPH_FP_TEST docs
out of the Kconfig, into the right place, and expand on the details.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
LKML-Reference: <1264165967-18938-1-git-send-email-vapier@gentoo.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the iterator comes to an empty page for some reason, or if
the page is emptied by a consuming read. The iterator code currently
does not check if the iterator is pass the contents, and may
return a false entry.
This patch adds a check to the ring buffer iterator to test if the
current page has been completely read and sets the iterator to the
next page if necessary.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Usually reads of the ring buffer is performed by a single task.
There are two types of reads from the ring buffer.
One is a consuming read which will consume the entry that was read
and the next read will be the entry that follows.
The other is an iterator that will let the user read the contents of
the ring buffer without modifying it. When an iterator is allocated,
writes to the ring buffer are disabled to protect the iterator.
The problem exists when consuming reads happen while an iterator is
allocated. Specifically, the kind of read that swaps out an entire
page (used by splice) and replaces it with a new read. If the iterator
is on the page that is swapped out, then the next read may read
from this swapped out page and return garbage.
This patch adds a check when reading the iterator to make sure that
the iterator contents are still valid. If a consuming read has taken
place, the iterator is reset.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the contents of the ftrace ring buffer gets corrupted and the trace
file is read, it could create a kernel oops (usualy just killing the user
task thread). This is caused by the checking of the pid in the buffer.
If the pid is negative, it still references the cmdline cache array,
which could point to an invalid address.
The simple fix is to test for negative PIDs.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We should be clear on 2 things:
- the length parameter of a match callback includes
tailing '\0'.
- the string to be searched might not be NULL-terminated.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8770.7000608@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MATCH_FULL matching for PTR_STRING is not working correctly:
# echo 'func == vt' > events/bkl/lock_kernel/filter
# echo 1 > events/bkl/lock_kernel/enable
...
# cat trace
Xorg-1484 [000] 1973.392586: lock_kernel: ... func=vt_ioctl()
gpm-1402 [001] 1974.027740: lock_kernel: ... func=vt_ioctl()
We should pass to regex.match(..., len) the length (including '\0')
of the source string instead of the length of the pattern string.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8763.5070707@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
| |
The @str might not be NULL-terminated if it's of type
DYN_STRING or STATIC_STRING, so we should use strnstr()
instead of strstr().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8753.2000102@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
| |
For '*foo' pattern, we should allow any string ending with
'foo', but event filtering incorrectly disallows strings
like bar_foo_foo:
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8735.6070604@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MATCH_FRONT_ONLY actually is a full matching:
# ./perf record -R -f -a -e lock:lock_acquire \
--filter 'name ~rcu_*' sleep 1
# ./perf trace
(no output)
We should pass the length of the pattern string to strncmp().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E8721.5090301@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For '*foo' pattern, we should allow any string ending with
'foo', but ftrace filter incorrectly disallows strings
like bar_foo_foo:
# echo '*io' > set_ftrace_filter
# cat set_ftrace_filter | grep 'req_bio_endio'
# cat available_filter_functions | grep 'req_bio_endio'
req_bio_endio
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4E870E.6060607@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the very unlikely case happens where the writer moves the head by one
between where the head page is read and where the new reader page
is assigned _and_ the writer then writes and wraps the entire ring buffer
so that the head page is back to what was originally read as the head page,
the page to be swapped will have a corrupted next pointer.
Simple solution is to wrap the assignment of the next pointer with a
rb_list_head().
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reference at the end of rb_get_reader_page() was causing off-by-one
writes to the prev pointer of the page after the reader page when that
page is the head page, and therefore the reader page has the RB_PAGE_HEAD
flag in its list.next pointer. This eventually results in a GPF in a
subsequent call to rb_set_head_page() (usually from rb_get_reader_page())
when that prev pointer is dereferenced. The dereferenced register would
characteristically have an address that appears shifted left by one byte
(eg, ffxxxxxxxxxxxxyy instead of ffffxxxxxxxxxxxx) due to being written at
an address one byte too high.
Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1262826727-9090-1-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Fix sign fields in ftrace_define_fields_##call()
tracing/syscalls: Fix typo in SYSCALL_DEFINE0
tracing/kprobe: Show sign of fields in trace_kprobe format files
ksym_tracer: Remove trace_stat
ksym_tracer: Fix race when incrementing count
ksym_tracer: Fix to allow writing newline to ksym_trace_filter
ksym_tracer: Fix to make the tracer work
tracing: Kconfig spelling fixes and cleanups
tracing: Fix setting tracer specific options
Documentation: Update ftrace-design.txt
Documentation: Update tracepoint-analysis.txt
Documentation: Update mmiotrace.txt
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add is_signed_type() call to trace_define_field() in ftrace macros.
The code previously just passed in 0 (false), disregarding whether
or not the field was actually a signed type.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B273D3A.6020007@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The format files of trace_kprobe do not show the sign of the fields.
The other format files show the field signed type of the fields and
this patch makes the trace_kprobe formats consistent with the others.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B273D27.5040009@cn.fujitsu.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
trace_stat is problematic. Don't use it, use seqfile instead.
This fixes a race that reading the stat file is not protected by
any lock, which can lead to use after free.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4B3AF203.40200@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We are under rcu read section but not holding the write lock, so
count++ is not atomic. Use atomic64_t instead.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4B3AF1EC.9010608@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It used to work, but now doesn't:
# echo > ksym_filter
bash: echo: write error: Invalid argument
It's caused by d954fbf0ff6b5fdfb32350e85a2f15d3db976506
("tracing: Fix wrong usage of strstrip in trace_ksyms").
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4B3AF1D7.5040400@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
ksym tracer doesn't work:
# echo tasklist_lock:rw- > ksym_trace_filter
-bash: echo: write error: No such device
It's because we pass to perf_event_create_kernel_counter()
a cpu number which is not present.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4B3AF19E.1010201@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fix filename reference (ftrace-implementation.txt ->
ftrace-design.txt).
Fix spelling, punctuation, grammar.
Fix help text indentation and line lengths to reduce need for
horizontal scrolling or larger window sizes.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091221120117.3fb49cdc.randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The function __set_tracer_option() takes as its last parameter a
"neg" value. If set it should negate the value of the option.
The trace_options_write() passed the value written to the file
which is what the new value needs to be set as. But since this
is not the negative, it never sets the value.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf session: Make events_stats u64 to avoid overflow on 32-bit arches
hw-breakpoints: Fix hardware breakpoints -> perf events dependency
perf events: Dont report side-band events on each cpu for per-task-per-cpu events
perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker
perf events, x86/stacktrace: Make stack walking optional
perf events: Remove unused perf_counter.h header file
perf probe: Check new event name
kprobe-tracer: Check new event/group name
perf probe: Check whether debugfs path is correct
perf probe: Fix libdwarf include path for Debian
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The current print_context_stack helper that does the stack
walking job is good for usual stacktraces as it walks through
all the stack and reports even addresses that look unreliable,
which is nice when we don't have frame pointers for example.
But we have users like perf that only require reliable
stacktraces, and those may want a more adapted stack walker, so
lets make this function a callback in stacktrace_ops that users
can tune for their needs.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1261024834-5336-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Check new event/group name is same syntax as a C symbol. In other
words, checking the name is as like as other tracepoint events.
This can prevent user to create an event with useless name (e.g.
foo|bar, foo*bar).
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20091216222408.14459.68790.stgit@dhcp-100-2-132.bos.redhat.com>
[ v2: minor cleanups ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|\ \ \
| |/ /
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* 'for-33' of git://repo.or.cz/linux-kbuild: (29 commits)
net: fix for utsrelease.h moving to generated
gen_init_cpio: fixed fwrite warning
kbuild: fix make clean after mismerge
kbuild: generate modules.builtin
genksyms: properly consider EXPORT_UNUSED_SYMBOL{,_GPL}()
score: add asm/asm-offsets.h wrapper
unifdef: update to upstream revision 1.190
kbuild: specify absolute paths for cscope
kbuild: create include/generated in silentoldconfig
scripts/package: deb-pkg: use fakeroot if available
scripts/package: add KBUILD_PKG_ROOTCMD variable
scripts/package: tar-pkg: use tar --owner=root
Kbuild: clean up marker
net: add net_tstamp.h to headers_install
kbuild: move utsrelease.h to include/generated
kbuild: move autoconf.h to include/generated
drop explicit include of autoconf.h
kbuild: move compile.h to include/generated
kbuild: drop include/asm
kbuild: do not check for include/asm-$ARCH
...
Fixed non-conflicting clean merge of modpost.c as per comments from
Stephen Rothwell (modpost.c had grown an include of linux/autoconf.h
that needed to be changed to generated/autoconf.h)
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fix up all users of utsrelease.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
|
|\ \ \
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Fix return of trace_dump_stack()
ksym_tracer: Fix bad cast
tracing/power: Remove two exports
tracing: Change event->profile_count to be int type
tracing: Simplify trace_option_write()
tracing: Remove useless trace option
tracing: Use seq file for trace_clock
tracing: Use seq file for trace_options
function-graph: Allow writing the same val to set_graph_function
ftrace: Call trace_parser_clear() properly
ftrace: Return EINVAL when writing invalid val to set_ftrace_filter
tracing: Move a printk out of ftrace_raw_reg_event_foo()
tracing: Pull up calls to trace_define_common_fields()
tracing: Extract duplicate ftrace_raw_init_event_foo()
ftrace.h: Use common pr_info fmt string
tracing: Add stack trace to irqsoff tracer
tracing: Add trace_dump_stack()
ring-buffer: Move resize integrity check under reader lock
ring-buffer: Use sync sched protection on ring buffer resizing
tracing: Fix wrong usage of strstrip in trace_ksyms
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The trace_dump_stack() returned a value for a void function.
Also, added the missing stub for trace_dump_stack() when tracing is
not configured.
Reported-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20091214162713.GA31060@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |\ \
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The irqsoff and friends tracers help in finding causes of latency in the
kernel. The also work with the function tracer to show what was happening
when interrupts or preemption are disabled. But the function tracer has
a bit of an overhead and can cause exagerated readings.
Currently, when tracing with /proc/sys/kernel/ftrace_enabled = 0, where the
function tracer is disabled, the information that is provided can end up
being useless. For example, a 2 and a half millisecond latency only showed:
# tracer: preemptirqsoff
#
# preemptirqsoff latency trace v1.1.5 on 2.6.32
# --------------------------------------------------------------------
# latency: 2463 us, #4/4, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -4242 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
# => started at: _spin_lock_irqsave
# => ended at: remove_wait_queue
#
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| /_--=> lock-depth
# |||||/ delay
# cmd pid |||||| time | caller
# \ / |||||| \ | /
hackbenc-4242 2d.... 0us!: trace_hardirqs_off <-_spin_lock_irqsave
hackbenc-4242 2...1. 2463us+: _spin_unlock_irqrestore <-remove_wait_queue
hackbenc-4242 2...1. 2466us : trace_preempt_on <-remove_wait_queue
The above lets us know that hackbench with pid 2463 grabbed a spin lock
somewhere and enabled preemption at remove_wait_queue. This helps a little
but where this actually happened is not informative.
This patch adds the stack dump to the end of the irqsoff tracer. This provides
the following output:
hackbenc-4242 2d.... 0us!: trace_hardirqs_off <-_spin_lock_irqsave
hackbenc-4242 2...1. 2463us+: _spin_unlock_irqrestore <-remove_wait_queue
hackbenc-4242 2...1. 2466us : trace_preempt_on <-remove_wait_queue
hackbenc-4242 2...1. 2467us : <stack trace>
=> sub_preempt_count
=> _spin_unlock_irqrestore
=> remove_wait_queue
=> free_poll_entry
=> poll_freewait
=> do_sys_poll
=> sys_poll
=> system_call_fastpath
Now we see that the culprit of this latency was the free_poll_entry code.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
I've been asked a few times about how to find out what is calling
some location in the kernel. One way is to use dynamic function tracing
and implement the func_stack_trace. But this only finds out who is
calling a particular function. It does not tell you who is calling
that function and entering a specific if conditional.
I have myself implemented a quick version of trace_dump_stack() for
this purpose a few times, and just needed it now. This is when I realized
that this would be a good tool to have in the kernel like trace_printk().
Using trace_dump_stack() is similar to dump_stack() except that it
writes to the trace buffer instead and can be used in critical locations.
For example:
@@ -5485,8 +5485,12 @@ need_resched_nonpreemptible:
if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
if (unlikely(signal_pending_state(prev->state, prev)))
prev->state = TASK_RUNNING;
- else
+ else {
deactivate_task(rq, prev, 1);
+ trace_printk("Deactivating task %s:%d\n",
+ prev->comm, prev->pid);
+ trace_dump_stack();
+ }
switch_count = &prev->nvcsw;
}
Produces:
<...>-3249 [001] 296.105269: schedule: Deactivating task ntpd:3249
<...>-3249 [001] 296.105270: <stack trace>
=> schedule
=> schedule_hrtimeout_range
=> poll_schedule_timeout
=> do_select
=> core_sys_select
=> sys_select
=> system_call_fastpath
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| |\ \ \
| | | |/
| | |/|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Conflicts:
kernel/trace/trace_kprobe.c
Merge reason: resolve the conflict.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Fix this warning:
kernel/trace/trace_ksym.c: In function 'ksym_trace_filter_read':
kernel/trace/trace_ksym.c:239: warning: cast to pointer from integer of different size
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: "K.Prasad" <prasad@linux.vnet.ibm.com>
LKML-Reference: <4B1DC578.9020909@cn.fujitsu.com>
[remove the strstrip fix as tglx already fixed that]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
trace_power_start and trace_power_end are used in
arch/x86/kernel/power.c, and this file can't be compiled
as a module, so these two tracepoints don't need to be
exported.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4B1DC55F.7060305@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Like total_profile_count, struct ftrace_event_call::profile_count
is protected by event_mutex, so it doesn't need to be atomic_t.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4B1DC549.5010705@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
- remove duplicate code inside trace_options_write()
- extract duplicate code in trace_options_write() and set_tracer_option()
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4B1DC532.9010802@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Since commit 4d9493c90f8e6e1b164aede3814010a290161abb
("ftrace: remove add-hoc code"), option "sched-tree"
has become useless.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4B1DC50A.7040402@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The buffer for the output is as small as 64 bytes, so it'll
overflow if we add more clock type. Use seq file instead.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4B1DC4FB.5030407@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Code simplification for reading trace_options.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-reference: <4B1DC4EF.3090106@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
# echo 'do_open' > set_graph_function
# echo 'do_open' >> set_graph_function
bash: echo: write error: Invalid argument
Make it valid to write the same value to set_graph_function,
which is consistent with set_ftrace_filter interface.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-reference: <4B1DC4E1.1060303@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
I found a weird behavior:
# echo 'fuse:*' > set_ftrace_filter
bash: echo: write error: Invalid argument
# cat set_ftrace_filter
fuse_dev_fasync
fuse_dev_poll
fuse_copy_do
We should call trace_parser_clear() no matter ftrace_process_regex()
returns 0 or -errno, otherwise we will actually take the unaccepted
records from ftrace_regex_release().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4B1DC4D2.3000406@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|