aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* perf: Validate cpu early in perf_event_alloc()Oleg Nesterov2011-01-18
| | | | | | | | | | | | | | | | | | | | | | Starting from perf_event_alloc()->perf_init_event(), the kernel assumes that event->cpu is either -1 or the valid CPU number. Change perf_event_alloc() to validate this argument early. This also means we can remove the similar check in find_get_context(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com> Cc: gregkh@suse.de Cc: stable@kernel.org LKML-Reference: <20110118161032.GC693@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Find_get_context: fix the per-cpu-counter checkOleg Nesterov2011-01-18
| | | | | | | | | | | | | | | | | | | | | | | If task == NULL, find_get_context() should always check that cpu is correct. Afaics, the bug was introduced by 38a81da2 "perf events: Clean up pid passing", but even before that commit "&& cpu != -1" was not exactly right, -ESRCH from find_task_by_vpid() is not accurate. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com> Cc: gregkh@suse.de Cc: stable@kernel.org LKML-Reference: <20110118161008.GB693@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Fix contexted inheritancePeter Zijlstra2011-01-18
| | | | | | | | | | | | | | | | | | | | | | | Linus reported that the RCU lockdep annotation bits triggered for this rcu_dereference() because we're not holding rcu_read_lock(). Going over the code I cannot convince myself its correct: - holding a ref on the parent_ctx, doesn't avoid it being uncloned concurrently (as the comment says), so we can race with a free. - holding parent_ctx->mutex doesn't avoid the above free from taking place either, it would at best avoid parent_ctx from being freed. I.e. the warning is correct. To fix the bug, serialize against the unclone_ctx() call by extending the reach of the parent_ctx->lock. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf tools: Fix tracepoint id to string perf.data header tableArnaldo Carvalho de Melo2011-01-17
| | | | | | | | | | | | | | | | | | | | | | | It was broken by f006d25 that passed just the event name, not the complete sys:event that it expected to open the /sys/.../sys/sys:event/id file to get the id. Fix it by moving it to after parse_events in cmd_record, as at that point we can just traverse the evsel_list and use evsel->attr.config + event_name(evsel) instead of re-opening the /id file. Reported-by: Franck Bui-Huu <vagabon.xyz@gmail.com> Cc: Franck Bui-Huu <vagabon.xyz@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Han Pingtian <phan@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <20110117202801.GG2085@ghostprotocols.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Fix handling of wildcards in tracepoint event selectorsArnaldo Carvalho de Melo2011-01-17
| | | | | | | | | | | | | | | | | | | | | | It wasn't accounting the ':' when consuming bytes in the the event selector string, so parse_events() would fail in this test: if (!(*str == 0 || *str == ',' || isspace(*str))) return -1; as *str would be pointing to '*', the last character in the '-e' arg in: $ perf record -q -a -D -e sched:sched_* | perf script -i - -s perf-script.py Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* powerpc: perf: Fix frequency calculation for overflowing countersAnton Blanchard2011-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | When profiling a benchmark that is almost 100% userspace, I noticed some wildly inaccurate profiles that showed almost all time spent in the kernel. Closer examination shows we were programming a tiny number of cycles into the PMU after each overflow (about ~200 away from the next overflow). This gets us stuck in a loop which we eventually break out of by throttling the PMU (there are regular throttle/unthrottle events in the log). It looks like we aren't setting event->hw.last_period to something same and the frequency to period calculations in perf are going haywire. With the following patch we find the correct period after a few interrupts and stay there. I also see no more throttle events. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> LKML-Reference: <20110117161742.5feb3761@kryten> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'tip/perf/urgent' of ↵Ingo Molnar2011-01-15
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
| * tracing: Remove syscall_exit_fieldsLai Jiangshan2011-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need for syscall_exit_fields as the syscall exit event class can already host the fields in its structure, like most other trace events do by default. Use that default behavior instead. Following this scheme, we don't need anymore to override the get_fields() callback of the syscall exit event class either. Hence both syscall_exit_fields and syscall_get_exit_fields() can be removed. Also changed some indentation to keep the following under 80 characters: ".fields = LIST_HEAD_INIT(event_class_syscall_exit.fields)," Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4D301C0E.8090408@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Only process module tracepoints onceSteven Rostedt2011-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit: 9f987b3141f086de27832514aad9f50a53f754 tracing: Include module.h in define_trace.h only solved half the problem. If the trace/events/module.h header is included at the time of define_trace.h (or in ftrace.h within it), the module.h TRACE_SYSTEM will override the current TRACE_SYSTEM macro. Since define_trace.h is included when CREATE_TRACE_POINTS is set, and the first thing it does is to #undef CREATE_TRACE_POINTS, by placing the module.h TRACE_SYSTEM inside a #ifdef CREATE_TRACE_POINTS we can prevent it from overriding the TRACE_SYSTEM that is being processed, and still process the module.h tracepoints when the module code defines CREATE_TRACE_POINTS and includes the trace/events/module.h header. As with commit 9f987b3141, this is only an issue if module.h is not included before the trace/events/<event>.h file is included, which (luckily) has not happened yet. Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | perf record: Add "nodelay" mode, disabled by defaultKirill Smelkov2011-01-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sometimes there is a need to use perf in "live-log" mode. The problem is, for seldom events, actual info output is largely delayed because perf-record reads sample data in whole pages. So for such scenarious, add flag for perf-record to go in "nodelay" mode. To track e.g. what's going on in icmp_rcv while ping is running Use it with something like this: (1) $ perf probe -L icmp_rcv | grep -U8 '^ *43\>' goto error; } 38 if (!pskb_pull(skb, sizeof(*icmph))) goto error; icmph = icmp_hdr(skb); 43 ICMPMSGIN_INC_STATS_BH(net, icmph->type); /* * 18 is the highest 'known' ICMP type. Anything else is a mystery * * RFC 1122: 3.2.2 Unknown ICMP messages types MUST be silently * discarded. */ 50 if (icmph->type > NR_ICMP_TYPES) goto error; $ perf probe icmp_rcv:43 'type=icmph->type' (2) $ cat trace-icmp.py [...] def trace_begin(): print "in trace_begin" def trace_end(): print "in trace_end" def probe__icmp_rcv(event_name, context, common_cpu, common_secs, common_nsecs, common_pid, common_comm, __probe_ip, type): print_header(event_name, common_cpu, common_secs, common_nsecs, common_pid, common_comm) print "__probe_ip=%u, type=%u\n" % \ (__probe_ip, type), [...] (3) $ perf record -a -D -e probe:icmp_rcv -o - | \ perf script -i - -s trace-icmp.py Thanks to Peter Zijlstra for pointing how to do it. Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <20110112140613.GA11698@tugrik.mns.mnsspb.ru> Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf sched: Fix list of events, dropping unsupported ':r' modifierStephane Eranian2011-01-13
|/ | | | | | | | | | | | | | Looks to me like the :r modifier is not supported anymore, so remove it from the list of events. Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <AANLkTim=jawJyBj0iFd0r4-LCKzvjFW+NddzJMD5GUB9@mail.gmail.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Revert "perf tools: Emit clearer message for sys_perf_event_open ENOENT return"Arnaldo Carvalho de Melo2011-01-11
| | | | | | | | | | | | | | | | | | | This reverts commit aa7bc7ef73efc46d7c3a0e185eefaf85744aec98. It removed the fallback from hardware profiling to software profiling. .e.g., in a VM with no PMU. Reported-by: David Ahern <daahern@cisco.com> Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf top: Fix annotate segvArnaldo Carvalho de Melo2011-01-11
| | | | | | | | | | | | | | | | | | | | | | | Before we had sym_counter, it was initialized to zero and we used that as an index in the global attrs variable, now we have a list of evsel entries, and sym_counter became sym_evsel, that remained initialized to zero (NULL): b00m. Fix it by initializing it to the first entry in the evsel list. Bug-introduced: 69aad6f Reported-by: Kirill Smelkov <kirr@mns.spb.ru> Tested-by: Kirill Smelkov <kirr@mns.spb.ru> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill Smelkov <kirr@mns.spb.ru> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf evsel: Fix order of event list deletionArnaldo Carvalho de Melo2011-01-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to defer calling perf_evsel_list__delete() till after atexit registered routines, because we need to traverse the events being recorded at that time at least on 'perf record'. This fixes the problem reported by Thomas Renninger where cmd_record called by cmd_timechart would not write the tracing data to the perf.data file header because the evsel_list at atexit (control+C on 'perf timechart record') time would be empty, being already deleted by run_builtin(), and thus 'perf timechart' when trying to process such perf.data file would die with: "no trace data in the file" Problem introduced in 70d544d. Reported-by: Thomas Renninger <trenn@suse.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf session: Fix infinite loop in __perf_session__process_eventsArnaldo Carvalho de Melo2011-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this if statement: if (head + event->header.size >= mmap_size) { if (mmaps[map_idx]) { munmap(mmaps[map_idx], mmap_size); mmaps[map_idx] = NULL; } page_offset = page_size * (head / page_size); file_offset += page_offset; head -= page_offset; goto remap; } With, for instance, these values: head=2992 event->header.size=48 mmap_size=3040 We end up endlessly looping back to remap. Off by one. Problem introduced in 55b4462. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Ingo Molnar <mingo@elte.hu> Reported-by: David Ahern <daahern@cisco.com> Bisected-by: David Ahern <daahern@cisco.com> Tested-by: David Ahern <daahern@cisco.com> Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1)Arnaldo Carvalho de Melo2011-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | And a test for it: [acme@felicio linux]$ perf test 1: vmlinux symtab matches kallsyms: Ok 2: detect open syscall event: Ok 3: detect open syscall event on all cpus: Ok [acme@felicio linux]$ Translating C the test does: 1. generates different number of open syscalls on each CPU by using sched_setaffinity 2. Verifies that the expected number of events is generated on each CPU It works as expected. LKML-Reference: <new-submission> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sched: Use PTHREAD_STACK_MIN to avoid pthread_attr_setstacksize() failJiri Pirko2011-01-10
| | | | | | | | | | | | | | | | | | | on ppc64: /usr/include/bits/local_lim.h:#define PTHREAD_STACK_MIN 131072 therefore following set of commands: gives: perf.2.6.37test: builtin-sched.c:493: create_tasks: Assertion `!(err)' failed. So make sure we do not set stack size lower than PTHREAD_STACK_MIN. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110110160417.GB2685@psychotron.brq.redhat.com> Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Emit clearer message for sys_perf_event_open ENOENT returnArnaldo Carvalho de Melo2011-01-10
| | | | | | | | | | | | | | | | Improve sys_perf_event_open ENOENT return handling in top and record, just like 5a3446b does for stat. Cc: David Ahern <daahern@cisco.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf stat: better error message for unsupported eventsDavid Ahern2011-01-10
| | | | | | | | | | | | | | | | | | | | | | | For unsupported events (e.g., H/W events when running in a VM) perf stat currently fails with the error message: Error: open_counter returned with 2 (No such file or directory). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. dmesg is of no help and it is not clear as to why it fails to open the counter. This patch changes the error message to Error: cache-misses event is not supported. Fatal: Not all events could be opened. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: a.p.zijlstra@chello.nl LPU-Reference: <1294597272-17335-1-git-send-email-daahern@cisco.com> Signed-off-by: David Ahern <daahern@cisco.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf sched: Fix allocation result checkArnaldo Carvalho de Melo2011-01-10
| | | | | | | | | | | | | | | | Bug introduced in ce47dc56. Reported-by: Mike Galbraith <efault@gmx.de> Cc: Chris Samuel <chris@csamuel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge branch 'tip/perf/core' of ↵Ingo Molnar2011-01-09
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
| * dynamic debug: Fix build issue with older gccJason Baron2011-01-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On older gcc (3.3) dynamic debug fails to compile: include/net/inet_connection_sock.h: In function `inet_csk_reset_xmit_timer': include/net/inet_connection_sock.h:236: error: duplicate label declaration `do_printk' include/net/inet_connection_sock.h:219: error: this is a previous declaration include/net/inet_connection_sock.h:236: error: duplicate label declaration `out' include/net/inet_connection_sock.h:219: error: this is a previous declaration include/net/inet_connection_sock.h:236: error: duplicate label `do_printk' include/net/inet_connection_sock.h:236: error: duplicate label `out' Fix, by reverting the usage of JUMP_LABEL() in dynamic debug for now. Cc: <stable@kernel.org> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Fix TRACE_EVENT power tracepoint creationMathieu Desnoyers2011-01-07
| | | | | | | | | | | | | | | | | | | | DEFINE_TRACE should also exist when CONFIG_EVENT_TRACING=n. Otherwise, setting only TRACEPOINTS=y is broken. Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20101028153117.GA4051@Krystal> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Fix preempt count leakLi Zefan2011-01-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While running my ftrace stress test, this showed up: BUG: sleeping function called from invalid context at mm/mmap.c:233 ... note: cat[3293] exited with preempt_count 1 The bug was introduced by commit 91e86e560d0b3ce4c5fc64fd2bbb99f856a30a4e ("tracing: Fix recursive user stack trace") Cc: <stable@kernel.org> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4D0089AC.1020802@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracepoint: Add __rcu annotationLai Jiangshan2011-01-07
| | | | | | | | | | | | | | | | | | | | Add __rcu annotation to : (struct tracepoint)->funcs Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4D22D4F1.50505@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: remove duplicate null-pointer check in skb tracepointMathieu Desnoyers2011-01-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The check for NULL skb in the kfree_skb trace event is a duplicate from the check already done in its only caller, kfree_skb(). Remove this duplicate check. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20110106175319.GA30610@Krystal> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: David S. Miller <davem@davemloft.net> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> CC: Zhaolei <zhaolei@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/trivial: Add missing comma in TRACE_EVENT commentMathieu Desnoyers2011-01-07
| | | | | | | | | | | | | | | | | | | | | | Add missing comma within the TRACE_EVENT() example in tracepoint.h. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20110106184532.GA2526@Krystal> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Include module.h in define_trace.hSteven Rostedt2011-01-07
| | | | | | | | | | | | | | | | | | | | | | | | While doing some developing, Peter Zijlstra and I have found that if a CREATE_TRACE_POINTS include is done before module.h is included, it can break the build. We have been lucky so far that this has not broke the build since module.h is included in almost everything. Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2011-01-06
| |\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mm: Initialize initial_page_table before paravirt jumps
| | * x86, mm: Initialize initial_page_table before paravirt jumpsRusty Russell2011-01-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2.6.36-rc8-54-gb40827f (x86-32, mm: Add an initial page table for core bootstrapping) made x86 boot using initial_page_table and broke lguest. For 2.6.37 we simply cut & paste the initialization code into lguest (da32dac10126 "lguest: populate initial_page_table"), now we fix it properly by doing that initialization before the paravirt jump. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: lguest <lguest@ozlabs.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <201101041720.54535.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | |
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| *---------. \ Merge branches 'x86-alternatives-for-linus', 'x86-fpu-for-linus', ↵Linus Torvalds2011-01-06
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'x86-hwmon-for-linus', 'x86-paravirt-for-linus', 'core-locking-for-linus' and 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64, asm: Use fxsaveq/fxrestorq in more places * 'x86-hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hwmon: Add core threshold notification to therm_throt.c * 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, paravirt: Use native_halt on a halt, not native_safe_halt * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking, lockdep: Convert sprintf_symbol to %pS * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq: Better struct irqaction layout
| | | | | | | * | irq: Better struct irqaction layoutEric Dumazet2010-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently use kmalloc-96 slab for struct irqaction allocations on 64bit arches. This is unfortunate because of possible false sharing and two cache lines accesses. Move 'name' and 'dir' fields at the end of the structure, and force a suitable alignement. Hot path fields now use one cache line on x86_64. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Andi Kleen <andi@firstfloor.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1288865628.2659.69.camel@edumazet-laptop> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | | | | | * | | locking, lockdep: Convert sprintf_symbol to %pSJoe Perches2010-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Joe Perches <joe@perches.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <trivial@kernel.org> LKML-Reference: <1288998760-11775-6-git-send-email-joe@perches.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | | * | | | x86, paravirt: Use native_halt on a halt, not native_safe_haltCliff Wickman2010-12-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | halt() should use native_halt() safe_halt() uses native_safe_halt() If CONFIG_PARAVIRT=y, halt() is defined in arch/x86/include/asm/paravirt.h as static inline void halt(void) { PVOP_VCALL0(pv_irq_ops.safe_halt); } Otherwise (no CONFIG_PARAVIRT) halt() in arch/x86/include/asm/irqflags.h is static inline void halt(void) { native_halt(); } So it looks to me like the CONFIG_PARAVIRT case of using native_safe_halt() for a halt() is an oversight. Am I missing something? It probably hasn't shown up as a problem because the local apic is disabled on a shutdown or restart. But if we disable interrupts and call halt() we shouldn't expect that the halt() will re-enable interrupts. Signed-off-by: Cliff Wickman <cpw@sgi.com> LKML-Reference: <E1PSBcz-0001g1-FM@eag09.americas.sgi.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| | | | * | | | | x86, hwmon: Add core threshold notification to therm_throt.cR, Durgadoss2011-01-03
| | | | | |_|_|/ | | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds code to therm_throt.c to notify core thermal threshold events. These thresholds are supported by the IA32_THERM_INTERRUPT register. The status/log for the same is monitored using the IA32_THERM_STATUS register. The necessary #defines are in msr-index.h. A call back is added to mce.h, to further notify the thermal stack, about the threshold events. Signed-off-by: Durgadoss R <durgadoss.r@intel.com> LKML-Reference: <D6D887BA8C9DFF48B5233887EF04654105C1251710@bgsmsx502.gar.corp.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| | | * | | | | x86-64, asm: Use fxsaveq/fxrestorq in more placesH. Peter Anvin2010-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checkin d7acb92fea932ad2e7846480aeacddc2c03c8485 made use of fxsaveq in fpu_fxsave() if the assembler supports it; this adds fxsaveq/fxrstorq to fxrstor_checking() and fxsave_user() as well. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <AANLkTi=RKyHLNTq6iomZOXkc6Zw1j9iAgsq8388XmzwN@mail.gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resumeSuresh Siddha2010-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During suspend, we disable all the non boot cpus. And during resume we bring them all back again. So no need to do alternatives_smp_switch() in between. On my core 2 based laptop, this speeds up the suspend path by 15msec and the resume path by 5 msec (suspend/resume speed up differences can be attributed to the different P-states that the cpu is in during suspend/resume). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1290557500.4946.8.camel@sbsiddha-MOBL3.sc.intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds2011-01-06
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, UV, BAU: Extend for more than 16 cpus per socket x86, UV: Fix the effect of extra bits in the hub nodeid register x86, UV: Add common uv_early_read_mmr() function for reading MMRs
| | * | | | | | | x86, UV, BAU: Extend for more than 16 cpus per socketCliff Wickman2011-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a hard-coded limit of a maximum of 16 cpu's per socket. The UV Broadcast Assist Unit code initializes by scanning the cpu topology of the system and assigning a master cpu for each socket and UV hub. That scan had an assumption of a limit of 16 cpus per socket. With Westmere we are going over that limit. The UV hub hardware will allow up to 32. If the scan finds the system has gone over that limit it returns an error and we print a warning and fall back to doing TLB shootdowns without the BAU. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: <stable@kernel.org> # .37.x LKML-Reference: <E1PZol7-0000mM-77@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | x86, UV: Fix the effect of extra bits in the hub nodeid registerJack Steiner2010-12-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | UV systems can be partitioned into multiple independent SSIs. Large partitioned systems may have extra bits in the node_id register. These bits are used when the total memory on all SSIs exceeds 16TB. These extra bits need to be ignored when calculating x2apic_extra_bits. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20101130195926.972776133@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | x86, UV: Add common uv_early_read_mmr() function for reading MMRsJack Steiner2010-12-22
| | | |_|_|/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Early in boot, reading MMRs from the UV hub controller require calls to early_ioremap()/early_iounmap(). Rather than duplicating code, add a common function to do the map/read/unmap. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20101130195926.834804371@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | Merge branch 'x86-tsc-for-linus' of ↵Linus Torvalds2011-01-06
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-tsc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Check tsc available/disabled in the delayed init function x86: Improve TSC calibration using a delayed workqueue x86: Make tsc=reliable override boot time stability checks
| | * | | | | | | x86: Check tsc available/disabled in the delayed init functionThomas Gleixner2010-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The delayed TSC init function does not check whether the system has no TSC or TSC is disabled at the kernel command line, which results in a crash in the work queue based extended calibration due to division by zero because the basic calibration never happened. Add the missing checks and do not touch TSC when not available or disabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com>
| | * | | | | | | x86: Improve TSC calibration using a delayed workqueueJohn Stultz2010-12-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Boot to boot the TSC calibration may vary by quite a large amount. While normal variance of 50-100ppm can easily be seen, the quick calibration code only requires 500ppm accuracy, which is the limit of what NTP can correct for. This can cause problems for systems being used as NTP servers, as every time they reboot it can take hours for them to calculate the new drift error caused by the calibration. The classic trade-off here is calibration accuracy vs slow boot times, as during the calibration nothing else can run. This patch uses a delayed workqueue to calibrate the TSC over the period of a second. This allows very accurate calibration (in my tests only varying by 1khz or 0.4ppm boot to boot). Additionally this refined calibration step does not block the boot process, and only delays the TSC clocksoure registration by a few seconds in early boot. If the refined calibration strays 1% from the early boot calibration value, the system will fall back to already calculated early boot calibration. Credit to Andi Kleen who suggested using a timer quite awhile back, but I dismissed it thinking the timer calibration would be done after the clocksource was registered (which would break things). Forgive me for my short-sightedness. This patch has worked very well in my testing, but TSC hardware is quite varied so it would probably be good to get some extended testing, possibly pushing inclusion out to 2.6.39. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1289003985-29060-1-git-send-email-johnstul@us.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@elte.hu> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Clark Williams <williams@redhat.com> CC: Andi Kleen <andi@firstfloor.org>
| | * | | | | | | Merge remote branch 'tip/x86/tsc' into fortglx/2.6.38/tip/x86/tscJohn Stultz2010-12-02
| | |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: Documentation/kernel-parameters.txt
| | | * | | | | | | x86: Make tsc=reliable override boot time stability checksjohn stultz2009-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes the tsc=reliable option disable the boot time stability checks. Currently the option only disables the runtime watchdog checks. This change allows folks who want to override the boot time TSC stability checks and use the TSC when the system would otherwise disqualify it. There still are some situations that the TSC will be disqualified, such as cpufreq scaling. But these are situations where the box will hang if allowed. Patch also includes a fix for an issue found by Thomas Gleixner, where the TSC disqualification message wouldn't be printed after a call to unsynchronized_tsc(). Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: akataria@vmware.com Cc: Stephen Hemminger <shemminger@vyatta.com> LKML-Reference: <1250552447.7212.92.camel@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | Merge branch 'x86-security-for-linus' of ↵Linus Torvalds2011-01-06
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-security-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: module: Move RO/NX module protection to after ftrace module update x86: Resume trampoline must be executable x86: Add RO/NX protection for loadable kernel modules x86: Add NX protection for kernel data x86: Fix improper large page preservation
| | * | | | | | | | | module: Move RO/NX module protection to after ftrace module updateSteven Rostedt2010-12-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit: 84e1c6bb38eb318e456558b610396d9f1afaabf0 x86: Add RO/NX protection for loadable kernel modules Broke the function tracer with this output: ------------[ cut here ]------------ WARNING: at kernel/trace/ftrace.c:1014 ftrace_bug+0x114/0x171() Hardware name: Precision WorkStation 470 Modules linked in: i2c_core(+) Pid: 86, comm: modprobe Not tainted 2.6.37-rc2+ #68 Call Trace: [<ffffffff8104e957>] warn_slowpath_common+0x85/0x9d [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core] [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core] [<ffffffff8104e989>] warn_slowpath_null+0x1a/0x1c [<ffffffff810a9dfe>] ftrace_bug+0x114/0x171 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core] [<ffffffff810aa0db>] ftrace_process_locs+0x1ae/0x274 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core] [<ffffffff810aa29e>] ftrace_module_notify+0x39/0x44 [<ffffffff814405cf>] notifier_call_chain+0x37/0x63 [<ffffffff8106e054>] __blocking_notifier_call_chain+0x46/0x5b [<ffffffff8106e07d>] blocking_notifier_call_chain+0x14/0x16 [<ffffffff8107ffde>] sys_init_module+0x73/0x1f3 [<ffffffff8100acf2>] system_call_fastpath+0x16/0x1b ---[ end trace 2aff4f4ca53ec746 ]--- ftrace faulted on writing [<ffffffffa00026db>] __process_new_adapter+0x7/0x34 [i2c_core] The cause was that the module text was set to read only before ftrace could convert the calls to mcount to nops. Thus, the conversions failed due to not being able to write to the text locations. The simple fix is to move setting the module to read only after the module notifiers are called (where ftrace sets the module mcounts to nops). Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | Merge commit 'v2.6.37-rc7' into x86/securityIngo Molnar2010-12-23
| | |\ \ \ \ \ \ \ \ \ | | | | |_|/ / / / / / | | | |/| | | | | | |
| | * | | | | | | | | x86: Resume trampoline must be executableLin Ming2010-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5bd5a452(x86: Add NX protection for kernel data) marked the trampoline area NX - which unsurprisingly breaks resume and cpu hotplug. Revert the portion of that commit, which touches the trampoline. Originally-from: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <1290410581.2405.24.camel@minggr.sh.intel.com> Cc: Matthieu Castet <castet.matthieu@free.fr> Cc: Siarhei Liakh <sliakh.lkml@gmail.com> Cc: Xuxian Jiang <jiang@cs.ncsu.edu> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Andi Kleen <andi@firstfloor.org> Tested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>