litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	perf/documentation: Add description for conditional branch filter	Anshuman Khandual	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mpe@ellerman.id.au Cc: benh@kernel.crashing.org Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1400743210-32289-4-git-send-email-khandual@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	perf/x86: Add conditional branch filtering support	Anshuman Khandual	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds conditional branch filtering support, enabling it for PERF_SAMPLE_BRANCH_COND in perf branch stack sampling framework by utilizing an available software filter X86_BR_JCC. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mpe@ellerman.id.au Cc: benh@kernel.crashing.org Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1400743210-32289-3-git-send-email-khandual@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	perf/tool: Add conditional branch filter 'cond' to perf record	Anshuman Khandual	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adding perf record support for new branch stack filter criteria PERF_SAMPLE_BRANCH_COND. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1400743210-32289-2-git-send-email-khandual@linux.vnet.ibm.com Cc: mpe@ellerman.id.au Cc: benh@kernel.crashing.org Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	perf: Add new conditional branch filter 'PERF_SAMPLE_BRANCH_COND'	Anshuman Khandual	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces new branch filter PERF_SAMPLE_BRANCH_COND which will extend the existing perf ABI. This will filter branches which are conditional. Various architectures can provide this functionality either with HW filtering support (if present) or with SW filtering of captured branch instructions. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Reviewed-by: Stephane Eranian <eranian@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: mpe@ellerman.id.au Cc: benh@kernel.crashing.org Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1400743210-32289-1-git-send-email-khandual@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	uprobes: Teach copy_insn() to support tmpfs	Oleg Nesterov	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tmpfs is widely used but as Denys reports shmem_aops doesn't have ->readpage() and thus you can't probe a binary on this filesystem. As Hugh suggested we can use shmem_read_mapping_page() in this case, just we need to check shmem_mapping() if ->readpage == NULL. Reported-by: Denys Vlasenko <dvlasenk@redhat.com> Suggested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140519184136.GB6750@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	uprobes: Shift ->readpage check from __copy_insn() to uprobe_register()	Oleg Nesterov	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	copy_insn() fails with -EIO if ->readpage == NULL, but this error is not propagated unless uprobe_register() path finds ->mm which already mmaps this file. In this case (say) "perf record" does not actually install the probe, but the user can't know about this. Move this check into uprobe_register() so that this problem can be detected earlier and reported to user. Note: this is still not perfect, - copy_insn() and arch_uprobe_analyze_insn() should be called by uprobe_register() but this is not simple, we need vm_file for read_mapping_page() (although perhaps we can pass NULL), and we need ->mm for is_64bit_mm() (although this logic is broken anyway). - uprobe_register() should be called by create_trace_uprobe(), not by probe_event_enable(), so that an error can be detected at "perf probe -x" time. This also needs more changes in the core uprobe code, uprobe register/unregister interface was poorly designed from the very beginning. Reported-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Hugh Dickins <hughd@google.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140519184054.GA6750@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	perf/x86: Use common PMU interrupt disabled code	Vince Weaver	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the x86 perf code use the new common PMU interrupt disabled code. Typically most x86 machines have working PMU interrupts, although some older p6-class machines had this problem. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161715560.11099@vincent-weaver-1.umelst.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	perf/ARM: Use common PMU interrupt disabled code	Vince Weaver	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the ARM perf code use the new common PMU interrupt disabled code. This allows perf to work on ARM machines without a working PMU interrupt (for example, raspberry pi). Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> [peterz: applied changes suggested by Will] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: devicetree@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161712190.11099@vincent-weaver-1.umelst.maine.edu [ Small readability tweaks to the code. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	perf: Disable sampled events if no PMU interrupt	Vince Weaver	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add common code to generate -ENOTSUPP at event creation time if an architecture attempts to create a sampled event and PERF_PMU_NO_INTERRUPT is set. This adds a new pmu->capabilities flag. Initially we only support PERF_PMU_NO_INTERRUPT (to indicate a PMU has no support for generating hardware interrupts) but there are other capabilities that can be added later. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Will Deacon <will.deacon@arm.com> [peterz: rename to PERF_PMU_CAP_* and moved the pmu::capabilities word into a hole] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161708060.11099@vincent-weaver-1.umelst.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	perf: Fix use after free in perf_remove_from_context()	Peter Zijlstra	2014-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While that mutex should guard the elements, it doesn't guard against the use-after-free that's from list_for_each_entry_rcu(). __perf_event_exit_task() can actually free the event. And because list addition/deletion is guarded by both ctx->mutex and ctx->lock, holding ctx->mutex is sufficient for reading the list, so we don't actually need the rcu list iteration. Fixes: 3a497f48637e ("perf: Simplify perf_event_exit_task_context()") Reported-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Dave Jones <davej@redhat.com> Cc: acme@ghostprotocols.net Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140529170024.GA2315@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
*	Merge branch 'perf/kprobes' into perf/core	Ingo Molnar	2014-06-05
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: arch/x86/kernel/traps.c The kprobes enhancements are fully cooked, ship them upstream. Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes: Ensure blacklist data is aligned	Vineet Gupta	2014-05-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ARC Linux (not supporting native unaligned access) was failing to boot because __start_kprobe_blacklist was not aligned. This was because per generated vmlinux.lds it was emitted right next to .rodata with strings etc hence could be randomly unaligned. Fix that by ensuring a word alignment. While 4 would suffice for 32bit arches and problem at hand, it is probably better to put 8. \| Path: (null) CPU: 0 PID: 1 Comm: swapper Not tainted \| 3.15.0-rc3-next-20140430 #2 \| task: 8f044000 ti: 8f01e000 task.ti: 8f01e000 \| \| [ECR ]: 0x00230400 => Misaligned r/w from 0x800fb0d3 \| [EFA ]: 0x800fb0d3 \| [BLINK ]: do_one_initcall+0x86/0x1bc \| [ERET ]: init_kprobes+0x52/0x120 Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: <torvalds@linux-foundation.org> Cc: <rusty@rustcorp.com.au> Cc: <rdunlap@infradead.org> Cc: <jeremy@goop.org> Cc: <arnd@arndb.de> Cc: <dl9pf@gmx.de> Cc: <sparse@chrisli.org> Cc: <anil.s.keshavamurthy@intel.com> Cc: <davem@davemloft.net> Cc: <ananth@in.ibm.com> Cc: <masami.hiramatsu.pt@hitachi.com> Cc: <chrisw@sous-sol.org> Cc: <akataria@vmware.com> Cc: anton Kolesov <Anton.Kolesov@synopsys.com> Link: http://lkml.kernel.org/r/5361DB14.7010406@synopsys.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes: Show blacklist entries via debugfs	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Show blacklist entries (function names with the address range) via /sys/kernel/debug/kprobes/blacklist. Note that at this point the blacklist supports only in vmlinux, not module. So the list is fixed and not updated. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Link: http://lkml.kernel.org/r/20140417081849.26341.11609.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, sched: Use NOKPROBE_SYMBOL macro in sched	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use NOKPROBE_SYMBOL macro to protect functions from kprobes instead of __kprobes annotation in sched/core.c. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/20140417081842.26341.83959.stgit@ltc230.yrl.intra.hitachi.co.jp
\| *	kprobes, notifier: Use NOKPROBE_SYMBOL macro in notifier	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use NOKPROBE_SYMBOL macro to protect functions from kprobes instead of __kprobes annotation in notifier. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20140417081835.26341.56128.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, ftrace: Use NOKPROBE_SYMBOL macro in ftrace	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use NOKPROBE_SYMBOL macro to protect functions from kprobes instead of __kprobes annotation in ftrace. This applies nokprobe_inline annotation for some cases, because NOKPROBE_SYMBOL() will inhibit inlining by referring the symbol address. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140417081828.26341.55152.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes: Use NOKPROBE_SYMBOL macro instead of __kprobes	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use NOKPROBE_SYMBOL macro to protect functions from kprobes instead of __kprobes annotation. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Link: http://lkml.kernel.org/r/20140417081821.26341.40362.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use NOKPROBE_SYMBOL macro for protecting functions from kprobes instead of __kprobes annotation under arch/x86. This applies nokprobe_inline annotation for some cases, because NOKPROBE_SYMBOL() will inhibit inlining by referring the symbol address. This just folds a bunch of previous NOKPROBE_SYMBOL() cleanup patches for x86 to one patch. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20140417081814.26341.51656.stgit@ltc230.yrl.intra.hitachi.co.jp Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fernando Luis Vázquez Cao <fernando_b1@lab.ntt.co.jp> Cc: Gleb Natapov <gleb@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Lebon <jlebon@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Michel Lespinasse <walken@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, x86: Allow kprobes on text_poke/hw_breakpoint	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow kprobes on text_poke/hw_breakpoint because those are not related to the critical int3-debug recursive path of kprobes at this moment. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/20140417081807.26341.73219.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, ftrace: Allow probing on some functions	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no need to prohibit probing on the functions used for preparation and uprobe only fetch functions. Those are safely probed because those are not invoked from kprobe's breakpoint/fault/debug handlers. So there is no chance to cause recursive exceptions. Following functions are now removed from the kprobes blacklist: update_bitfield_fetch_param free_bitfield_fetch_param kprobe_register FETCH_FUNC_NAME(stack, type) in trace_uprobe.c FETCH_FUNC_NAME(memory, type) in trace_uprobe.c FETCH_FUNC_NAME(memory, string) in trace_uprobe.c FETCH_FUNC_NAME(memory, string_size) in trace_uprobe.c FETCH_FUNC_NAME(file_offset, type) in trace_uprobe.c Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140417081800.26341.56504.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes: Allow probe on some kprobe functions	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no need to prohibit probing on the functions used for preparation, registeration, optimization, controll etc. Those are safely probed because those are not invoked from breakpoint/fault/debug handlers, there is no chance to cause recursive exceptions. Following functions are now removed from the kprobes blacklist: add_new_kprobe aggr_kprobe_disabled alloc_aggr_kprobe alloc_aggr_kprobe arm_all_kprobes __arm_kprobe arm_kprobe arm_kprobe_ftrace check_kprobe_address_safe collect_garbage_slots collect_garbage_slots collect_one_slot debugfs_kprobe_init __disable_kprobe disable_kprobe disarm_all_kprobes __disarm_kprobe disarm_kprobe disarm_kprobe_ftrace do_free_cleaned_kprobes do_optimize_kprobes do_unoptimize_kprobes enable_kprobe force_unoptimize_kprobe free_aggr_kprobe free_aggr_kprobe __free_insn_slot __get_insn_slot get_optimized_kprobe __get_valid_kprobe init_aggr_kprobe init_aggr_kprobe in_nokprobe_functions kick_kprobe_optimizer kill_kprobe kill_optimized_kprobe kprobe_addr kprobe_optimizer kprobe_queued kprobe_seq_next kprobe_seq_start kprobe_seq_stop kprobes_module_callback kprobes_open optimize_all_kprobes optimize_kprobe prepare_kprobe prepare_optimized_kprobe register_aggr_kprobe register_jprobe register_jprobes register_kprobe register_kprobes register_kretprobe register_kretprobe register_kretprobes register_kretprobes report_probe show_kprobe_addr try_to_optimize_kprobe unoptimize_all_kprobes unoptimize_kprobe unregister_jprobe unregister_jprobes unregister_kprobe __unregister_kprobe_bottom unregister_kprobes __unregister_kprobe_top unregister_kretprobe unregister_kretprobe unregister_kretprobes unregister_kretprobes wait_for_kprobe_optimizer I tested those functions by putting kprobes on all instructions in the functions with the bash script I sent to LKML. See: https://lkml.org/lkml/2014/3/27/33 Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20140417081753.26341.57889.stgit@ltc230.yrl.intra.hitachi.co.jp Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: fche@redhat.com Cc: systemtap@sourceware.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes/x86: Allow probe on some kprobe preparation functions	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no need to prohibit probing on the functions used in preparation phase. Those are safely probed because those are not invoked from breakpoint/fault/debug handlers, there is no chance to cause recursive exceptions. Following functions are now removed from the kprobes blacklist: can_boost can_probe can_optimize is_IF_modifier __copy_instruction copy_optimized_instructions arch_copy_kprobe arch_prepare_kprobe arch_arm_kprobe arch_disarm_kprobe arch_remove_kprobe arch_trampoline_kprobe arch_prepare_kprobe_ftrace arch_prepare_optimized_kprobe arch_check_optimized_kprobe arch_within_optimized_kprobe __arch_remove_optimized_kprobe arch_remove_optimized_kprobe arch_optimize_kprobes arch_unoptimize_kprobe I tested those functions by putting kprobes on all instructions in the functions with the bash script I sent to LKML. See: https://lkml.org/lkml/2014/3/27/33 Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Lebon <jlebon@redhat.com> Link: http://lkml.kernel.org/r/20140417081747.26341.36065.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, x86: Call exception_enter after kprobes handled	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move exception_enter() call after kprobes handler is done. Since the exception_enter() involves many other functions (like printk), it can cause recursive int3/break loop when kprobes probe such functions. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kees Cook <keescook@chromium.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/20140417081740.26341.10894.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes/x86: Call exception handlers directly from do_int3/do_debug	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To avoid a kernel crash by probing on lockdep code, call kprobe_int3_handler() and kprobe_debug_handler()(which was formerly called post_kprobe_handler()) directly from do_int3 and do_debug. Currently kprobes uses notify_die() to hook the int3/debug exceptoins. Since there is a locking code in notify_die, the lockdep code can be invoked. And because the lockdep involves printk() related things, theoretically, we need to prohibit probing on such code, which means much longer blacklist we'll have. Instead, hooking the int3/debug for kprobes before notify_die() can avoid this problem. Anyway, most of the int3 handlers in the kernel are already called from do_int3 directly, e.g. ftrace_int3_handler, poke_int3_handler, kgdb_ll_trap. Actually only kprobe_exceptions_notify is on the notifier_call_chain. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Lebon <jlebon@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/20140417081733.26341.24423.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, x86: Prohibit probing on thunk functions and restore	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	thunk/restore functions are also used for tracing irqoff etc. and those are involved in kprobe's exception handling. Prohibit probing on them to avoid kernel crash. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140417081726.26341.3872.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, x86: Prohibit probing on native_set_debugreg()/load_idt()	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the kprobes uses do_debug for single stepping, functions called from do_debug() before notify_die() must not be probed. And also native_load_idt() is called from paranoid_exit when returning int3, this also must not be probed. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: virtualization@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20140417081719.26341.65542.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes, x86: Prohibit probing on debug_stack_*()	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prohibit probing on debug_stack_reset and debug_stack_set_zero. Since the both functions are called from TRACE_IRQS_ON/OFF_DEBUG macros which run in int3 ist entry, probing it may cause a soft lockup. This happens when the kernel built with CONFIG_DYNAMIC_FTRACE=y and CONFIG_TRACE_IRQFLAGS=y. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@suse.de> Cc: Jan Beulich <JBeulich@suse.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/20140417081712.26341.32994.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes: Introduce NOKPROBE_SYMBOL() macro to maintain kprobes blacklist	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce NOKPROBE_SYMBOL() macro which builds a kprobes blacklist at kernel build time. The usage of this macro is similar to EXPORT_SYMBOL(), placed after the function definition: NOKPROBE_SYMBOL(function); Since this macro will inhibit inlining of static/inline functions, this patch also introduces a nokprobe_inline macro for static/inline functions. In this case, we must use NOKPROBE_SYMBOL() for the inline function caller. When CONFIG_KPROBES=y, the macro stores the given function address in the "_kprobe_blacklist" section. Since the data structures are not fully initialized by the macro (because there is no "size" information), those are re-initialized at boot time by using kallsyms. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20140417081705.26341.96719.stgit@ltc230.yrl.intra.hitachi.co.jp Cc: Alok Kataria <akataria@vmware.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christopher Li <sparse@chrisli.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: David S. Miller <davem@davemloft.net> Cc: Jan-Simon Möller <dl9pf@gmx.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: linux-arch@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-sparse@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes: Prohibit probing on .entry.text code	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	.entry.text is a code area which is used for interrupt/syscall entries, which includes many sensitive code. Thus, it is better to prohibit probing on all of such code instead of a part of that. Since some symbols are already registered on kprobe blacklist, this also removes them from the blacklist. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Lebon <jlebon@redhat.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/20140417081658.26341.57354.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| *	kprobes/x86: Allow to handle reentered kprobe on single-stepping	Masami Hiramatsu	2014-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the NMI handlers(e.g. perf) can interrupt in the single stepping (or preparing the single stepping, do_debug etc.), we should consider a kprobe is hit in the NMI handler. Even in that case, the kprobe is allowed to be reentered as same as the kprobes hit in kprobe handlers (KPROBE_HIT_ACTIVE or KPROBE_HIT_SSDONE). The real issue will happen when a kprobe hit while another reentered kprobe is processing (KPROBE_REENTER), because we already consumed a saved-area for the previous kprobe. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Lebon <jlebon@redhat.com> Link: http://lkml.kernel.org/r/20140417081651.26341.10593.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \|	Merge branch 'perf/uprobes' into perf/core	Ingo Molnar	2014-06-05
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These bits from Oleg are fully cooked, ship them to Linus. Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| * \	Merge branch 'uprobes/core' of ↵	Ingo Molnar	2014-05-22
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/uprobes Pull uprobes fixes and changes from Oleg Nesterov: " Denys found another nasty old bug in uprobes/x86: div, mul, shifts with count in CL, and cmpxchg are not handled correctly. Plus a couple of other minor fixes. Nobody acked the changes in x86/traps, hopefully they are simple enough, and I believe that they make sense anyway and allow to do more cleanups." Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| \| * \|	uprobes/x86: Fix the wrong ->si_addr when xol triggers a trap	Oleg Nesterov	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the probed insn triggers a trap, ->si_addr = regs->ip is technically correct, but this is not what the signal handler wants; we need to pass the address of the probed insn, not the address of xol slot. Add the new arch-agnostic helper, uprobe_get_trap_addr(), and change fill_trap_info() and math_error() to use it. !CONFIG_UPROBES case in uprobes.h uses a macro to avoid include hell and ensure that it can be compiled even if an architecture doesn't define instruction_pointer(). Test-case: #include <signal.h> #include <stdio.h> #include <unistd.h> extern void probe_div(void); void sigh(int sig, siginfo_t info, void c) { int passed = (info->si_addr == probe_div); printf(passed ? "PASS\n" : "FAIL\n"); _exit(!passed); } int main(void) { struct sigaction sa = { .sa_sigaction = sigh, .sa_flags = SA_SIGINFO, }; sigaction(SIGFPE, &sa, NULL); asm ( "xor %ecx,%ecx\n" ".globl probe_div; probe_div:\n" "idiv %ecx\n" ); return 0; } it fails if probe_div() is probed. Note: show_unhandled_signals users should probably use this helper too, but we need to cleanup them first. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
\| \| * \|	x86/traps: Kill DO_ERROR_INFO()	Oleg Nesterov	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that DO_ERROR_INFO() doesn't differ from DO_ERROR() we can remove it and use DO_ERROR() instead. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	x86/traps: Shift fill_trap_info() from DO_ERROR_INFO() to do_error_trap()	Oleg Nesterov	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the callsite of fill_trap_info() into do_error_trap() and remove the "siginfo_t info" argument. This obviously breaks DO_ERROR() which passed info == NULL, we simply change fill_trap_info() to return "siginfo_t " and add the "default" case which returns SEND_SIG_PRIV. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	x86/traps: Introduce fill_trap_info(), simplify DO_ERROR_INFO()	Oleg Nesterov	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Extract the fill-siginfo code from DO_ERROR_INFO() into the new helper, fill_trap_info(). It can calculate si_code and si_addr looking at trapnr, so we can remove these arguments from DO_ERROR_INFO() and simplify the source code. The generated code is the same, __builtin_constant_p(trapnr) == T. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	x86/traps: Introduce do_error_trap()	Oleg Nesterov	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the common code from DO_ERROR() and DO_ERROR_INFO() into the new helper, do_error_trap(). This simplifies define's and shaves 527 bytes from traps.o. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	x86/traps: Use SEND_SIG_PRIV instead of force_sig()	Oleg Nesterov	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	force_sig() is just force_sig_info(SEND_SIG_PRIV). Imho it should die, we have too many ugly "send signal" helpers. And do_trap() looks just ugly because it uses force_sig_info() or force_sig() depending on info != NULL. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	x86/traps: Make math_error() static	Oleg Nesterov	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Trivial, make math_error() static. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	uprobes/x86: Fix scratch register selection for rip-relative fixups	Denys Vlasenko	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this patch, instructions such as div, mul, shifts with count in CL, cmpxchg are mishandled. This patch adds vex prefix handling. In particular, it avoids colliding with register operand encoded in vex.vvvv field. Since we need to avoid two possible register operands, the selection of scratch register needs to be from at least three registers. After looking through a lot of CPU docs, it looks like the safest choice is SI,DI,BX. Selecting BX needs care to not collide with implicit use of BX by cmpxchg8b. Test-case: #include <stdio.h> static const char const pass[] = { "FAIL", "pass" }; long two = 2; void test1(void) { long ax = 0, dx = 0; asm volatile("\n" " xor %%edx,%%edx\n" " lea 2(%%edx),%%eax\n" // We divide 2 by 2. Result (in eax) should be 1: " probe1: .globl probe1\n" " divl two(%%rip)\n" // If we have a bug (eax mangled on entry) the result will be 2, // because eax gets restored by probe machinery. : "=a" (ax), "=d" (dx) /out/ : "0" (ax), "1" (dx) /in/ : "memory" /clobber/ ); dprintf(2, "%s: %s\n", __func__, pass[ax == 1] ); } long val2 = 0; void test2(void) { long old_val = val2; long ax = 0, dx = 0; asm volatile("\n" " mov val2,%%eax\n" // eax := val2 " lea 1(%%eax),%%edx\n" // edx := eax+1 // eax is equal to val2. cmpxchg should store edx to val2: " probe2: .globl probe2\n" " cmpxchg %%edx,val2(%%rip)\n" // If we have a bug (eax mangled on entry), val2 will stay unchanged : "=a" (ax), "=d" (dx) /out/ : "0" (ax), "1" (dx) /in/ : "memory" /clobber/ ); dprintf(2, "%s: %s\n", __func__, pass[val2 == old_val + 1] ); } long val3[2] = {0,0}; void test3(void) { long old_val = val3[0]; long ax = 0, dx = 0; asm volatile("\n" " mov val3,%%eax\n" // edx:eax := val3 " mov val3+4,%%edx\n" " mov %%eax,%%ebx\n" // ecx:ebx := edx:eax + 1 " mov %%edx,%%ecx\n" " add $1,%%ebx\n" " adc $0,%%ecx\n" // edx:eax is equal to val3. cmpxchg8b should store ecx:ebx to val3: " probe3: .globl probe3\n" " cmpxchg8b val3(%%rip)\n" // If we have a bug (edx:eax mangled on entry), val3 will stay unchanged. // If ecx:edx in mangled, val3 will get wrong value. : "=a" (ax), "=d" (dx) /out/ : "0" (ax), "1" (dx) /in/ : "cx", "bx", "memory" /clobber/ ); dprintf(2, "%s: %s\n", __func__, pass[val3[0] == old_val + 1 && val3[1] == 0] ); } int main(int argc, char *argv) { test1(); test2(); test3(); return 0; } Before this change all tests fail if probe{1,2,3} are probed. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Reviewed-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	uprobes/x86: Simplify rip-relative handling	Denys Vlasenko	2014-05-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is possible to replace rip-relative addressing mode with addressing mode of the same length: (reg+disp32). This eliminates the need to fix up immediate and correct for changing instruction length. And we can kill arch_uprobe->def.riprel_target. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Reviewed-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode()	Oleg Nesterov	2014-05-14
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Hugh says: The one I noticed was that it forgets all about memcg (because it was copied from KSM, and there the replacement page has already been charged to a memcg). See how mm/memory.c do_anonymous_page() does a mem_cgroup_charge_anon(). Hopefully not a big problem, uprobes is a system-wide thing and only root can insert the probes. But I agree, should be fixed anyway. Add mem_cgroup_{un,}charge_anon() into uprobe_write_opcode(). To simplify the error handling (and avoid the new "uncharge" label) the patch also moves anon_vma_prepare() up before we alloc/charge the new page. While at it fix the comment about ->mmap_sem, it is held for write. Suggested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| * \|	Merge branch 'uprobes/core' of ↵	Ingo Molnar	2014-05-05
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/uprobes Pull uprobes updates from Oleg Nesterov: "This hopefully completes the previous 'fix the handling of relative jmp/call's' series, all changes except the last 3 unrelated fixes try to address TODO's mentioned in the changelogs." Signed-off-by: Ingo Molnar <mingo@kernel.org>
\| \| * \|	uprobes: Refuse to insert a probe into MAP_SHARED vma	Oleg Nesterov	2014-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	valid_vma() rejects the VM_SHARED vmas, but this still allows to insert a probe into the MAP_SHARED but not VM_MAYWRITE vma. Currently this is fine, such a mapping doesn't really differ from the private read-only mmap except mprotect(PROT_WRITE) won't work. However, get_user_pages(FOLL_WRITE \| FOLL_FORCE) doesn't allow to COW in this case, and it would be safer to follow the same conventions as mm even if currently this happens to work. After the recent cda540ace6a1 "mm: get_user_pages(write,force) refuse to COW in shared areas" only uprobes can insert an anon page into the shared file-backed area, lets stop this and change valid_vma() to check VM_MAYSHARE instead. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
\| \| * \|	uprobes/tracing: Fix uprobe_perf_open() on uprobe_apply() failure	Oleg Nesterov	2014-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	uprobe_perf_open()->uprobe_apply() can fail, but this error is wrongly ignored. Change uprobe_perf_open() to do uprobe_perf_close() and return the error code in this case. Change uprobe_perf_close() to propogate the error from uprobe_apply() as well, although it should not fail. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
\| \| * \|	uprobes/tracing: Make uprobe_perf_close() visible to uprobe_perf_open()	Oleg Nesterov	2014-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Preparation. Move uprobe_perf_close() up before uprobe_perf_open() to avoid the forward declaration in the next patch and make it readable. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
\| \| * \|	uprobes/x86: Simplify riprel_{pre,post}_xol() and make them similar	Oleg Nesterov	2014-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ignoring the "correction" logic riprel_pre_xol() and riprel_post_xol() are very similar but look quite differently. 1. Add the "UPROBE_FIX_RIP_AX \| UPROBE_FIX_RIP_CX" check at the start of riprel_pre_xol(), like the same check in riprel_post_xol(). 2. Add the trivial scratch_reg() helper which returns the address of scratch register pre_xol/post_xol need to change. 3. Change these functions to use the new helper and avoid copy-and-paste under if/else branches. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
\| \| * \|	uprobes/x86: Kill the "autask" arg of riprel_pre_xol()	Oleg Nesterov	2014-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	default_pre_xol_op() passes &current->utask->autask to riprel_pre_xol() and this is just ugly because it still needs to load current->utask to read ->vaddr. Remove this argument, change riprel_pre_xol() to use current->utask. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
\| \| * \|	uprobes/x86: Rename riprel helpers to make the naming consistent	Oleg Nesterov	2014-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	handle_riprel_insn(), pre_xol_rip_insn() and handle_riprel_post_xol() look confusing and inconsistent. Rename them into riprel_analyze(), riprel_pre_xol(), and riprel_post_xol() respectively. No changes in compiled code. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
\| \| * \|	uprobes/x86: Cleanup the usage of UPROBE_FIX_IP/UPROBE_FIX_CALL	Oleg Nesterov	2014-04-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that UPROBE_FIX_IP/UPROBE_FIX_CALL are mutually exclusive we can use a single "fix_ip_or_call" enum instead of 2 fix_* booleans. This way the logic looks more understandable and clean to me. While at it, join "case 0xea" with other "ip is correct" ret/lret cases. Also change default_post_xol_op() to use "else if" for the same reason. Signed-off-by: Oleg Nesterov <oleg@redhat.com>