aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel/misc.S
Commit message (Collapse)AuthorAge
* powerpc: Rename __get_SP() to current_stack_pointer()Anton Blanchard2014-10-14
| | | | | | | | Michael points out that __get_SP() is a pretty horrible function name. Let's give it a better name. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Reimplement __get_SP() as a function not a defineAnton Blanchard2014-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Li Zhong points out an issue with our current __get_SP() implementation. If ftrace function tracing is enabled (ie -pg profiling using _mcount) we spill a stack frame on 64bit all the time. If a function calls __get_SP() and later calls a function that is tail call optimised, we will pop the stack frame and the value returned by __get_SP() is no longer valid. An example from Li can be found in save_stack_trace -> save_context_stack: c0000000000432c0 <.save_stack_trace>: c0000000000432c0: mflr r0 c0000000000432c4: std r0,16(r1) c0000000000432c8: stdu r1,-128(r1) <-- stack frame for _mcount c0000000000432cc: std r3,112(r1) c0000000000432d0: bl <._mcount> c0000000000432d4: nop c0000000000432d8: mr r4,r1 <-- __get_SP() c0000000000432dc: ld r5,632(r13) c0000000000432e0: ld r3,112(r1) c0000000000432e4: li r6,1 c0000000000432e8: addi r1,r1,128 <-- pop stack frame c0000000000432ec: ld r0,16(r1) c0000000000432f0: mtlr r0 c0000000000432f4: b <.save_context_stack> <-- tail call optimized save_context_stack ends up with a stack pointer below the current one, and it is likely to be scribbled over. Fix this by making __get_SP() a function which returns the callers stack frame. Also replace inline assembly which grabs the stack pointer in save_stack_trace and show_stack with __get_SP(). This also fixes an issue with perf_arch_fetch_caller_regs(). It currently unwinds the stack once, which will skip a valid stack frame on a leaf function. With the __get_SP() fixes in this patch, we never need to unwind the stack frame to get to the first interesting frame. We have to export __get_SP() because perf_arch_fetch_caller_regs() (which is used in modules) calls it from a header file. Reported-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: switch to generic sys_execve()/kernel_execve()Al Viro2012-09-30
| | | | | | | | the only non-obvious part is that current_pt_regs() is really needed here - task_pt_regs() is NULL for kernel threads; it's OK for ptrace uses (the thing task_pt_regs() is intended for), but not for us. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: Remove legacy iSeries bits from assembly filesBenjamin Herrenschmidt2012-03-08
| | | | | | | | | This removes the various bits of assembly in the kernel entry, exception handling and SLB management code that were specific to running under the legacy iSeries hypervisor which is no longer supported. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Remove unneeded cpu_setup/restore from POWER7 cputable entryMichael Neuling2010-11-28
| | | | | | | These are not needed so just remove them Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* perf: Drop the skip argument from perf_arch_fetch_regs_callerFrederic Weisbecker2010-06-08
| | | | | | | | | | | | | | | | Drop this argument now that we always want to rewind only to the state of the first caller. It means frame pointers are not necessary anymore to reliably get the source of an event. But this also means we need this helper to be a macro now, as an inline function is not an option since we need to know when to provide a default implentation. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf: Always build the powerpc perf_arch_fetch_caller_regs versionFrederic Weisbecker2010-04-03
| | | | | | | | | | | | | | | | | | | | | | | Now that software events use perf_arch_fetch_caller_regs() too, we need the powerpc version to be always built. Fixes the following build error: (.text+0x3210): undefined reference to `perf_arch_fetch_caller_regs' (.text+0x3324): undefined reference to `perf_arch_fetch_caller_regs' (.text+0x33bc): undefined reference to `perf_arch_fetch_caller_regs' (.text+0x33ec): undefined reference to `perf_arch_fetch_caller_regs' (.text+0xd4a0): undefined reference to `perf_arch_fetch_caller_regs' arch/powerpc/kernel/built-in.o:(.text+0xd528): more undefined references to `perf_arch_fetch_caller_regs' follow make[1]: *** [.tmp_vmlinux1] Error 1 make: *** [sub-make] Error 2 Reported-by: Michael Ellerman <michael@ellerman.id.au> Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org>
* powerpc/perf_events: Fix call-graph recording, add perf_arch_fetch_caller_regsPaul Mackerras2010-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements a powerpc version of perf_arch_fetch_caller_regs to get correct call-graphs. It's implemented in assembly because that way we can be sure there isn't a stack frame for perf_arch_fetch_caller_regs. If it was in C, gcc might or might not create a stack frame for it, which would affect the number of levels we have to skip. With this, we see results from perf record -e lock:lock_acquire like this: # Samples: 24878 # # Overhead Command Shared Object Symbol # ........ .............. ................. ...... # 14.99% perf [kernel.kallsyms] [k] ._raw_spin_lock | --- ._raw_spin_lock | |--25.00%-- .alloc_fd | (nil) | | | |--50.00%-- .anon_inode_getfd | | .sys_perf_event_open | | syscall_exit | | syscall | | create_counter | | __cmd_record | | run_builtin | | main | | 0xfd2e704 | | 0xfd2e8c0 | | (nil) ... etc. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: anton@samba.org Cc: linuxppc-dev@ozlabs.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100318050513.GA6575@drongo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* powerpc: Use LOAD_REG_IMMEDIATE only for constants on 64-bitPaul Mackerras2008-09-15
| | | | | | | | | | | | | | | | | | | | | Using LOAD_REG_IMMEDIATE to get the address of kernel symbols generates 5 instructions where LOAD_REG_ADDR can do it in one, and will generate R_PPC64_ADDR16_* relocations in the output when we get to making the kernel as a position-independent executable, which we'd rather not have to handle. This changes various bits of assembly code to use LOAD_REG_ADDR when we need to get the address of a symbol, or to use suitable position-independent code for cases where we can't access the TOC for various reasons, or if we're not running at the address we were linked at. It also cleans up a few minor things; there's no reason to save and restore SRR0/1 around RTAS calls, __mmu_off can get the return address from LR more conveniently than the caller can supply it in R4 (and we already assume elsewhere that EA == RA if the MMU is on in early boot), and enable_64b_mode was using 5 instructions where 2 would do. Signed-off-by: Paul Mackerras <paulus@samba.org>
* powerpc: Add cputable entry for POWER7Michael Neuling2008-06-30
| | | | | | | | | | Add a cputable entry for the POWER7 processor. Also tell firmware that we know about POWER7. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Make setjmp/longjmp code usable outside of xmonMichael Neuling2008-01-25
| | | | | | | | | This makes the setjmp/longjmp code used by xmon, generically available to other code. It also removes the requirement for debugger hooks to be only called on 0x300 (data storage) exception. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] kernel_execve is identical in 32 and 64 bitStephen Rothwell2007-12-10
| | | | | | | so consolidate it into misc.S. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] convert string i/o operations to CStephen Rothwell2006-09-20
| | | | | | | | | | This produces essentially the same code and will make the iSeries i/o consolidation easier. The count parameter is changed to long since that will produce the same (better) code on 32 and 64 bit builds. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
* [POWERPC] clean up ide io accessorsStephen Rothwell2006-09-20
| | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
* [POWERPC] remove unused asm routinesStephen Rothwell2006-09-20
| | | | | | _insw, _outsw, _insl amd _outsl are all unused, so remove them. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
* [POWERPC] Fix MMIO ops to provide expected barrier behaviourPaul Mackerras2006-09-13
| | | | | | | | | | | | | | | | | | | | | | | | | | This changes the writeX family of functions to have a sync instruction before the MMIO store rather than after, because the generally expected behaviour is that the device receiving the MMIO store can be guaranteed to see the effects of any preceding writes to normal memory. To preserve ordering between writeX and readX, and to preserve ordering between preceding stores and the readX, the readX family of functions have had an sync added before the load. Although writeX followed by spin_unlock is not officially guaranteed to keep the writeX inside the spin-locked region unless an mmiowb() is used, there are currently drivers that depend on the previous behaviour on powerpc, which was that the mmiowb wasn't actually required. Therefore we have a per-cpu flag that is set by writeX, cleared by __raw_spin_lock and mmiowb, and tested by __raw_spin_unlock. If it is set, __raw_spin_unlock does a sync and clears it. This changes both 32-bit and 64-bit readX/writeX. 32-bit already has a sync in __raw_spin_unlock (since lwsync doesn't exist on 32-bit), and thus doesn't need the per-cpu flag. Tested on G5 (PPC970) and POWER5. Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Consolidate some of kernel/misc*.SStephen Rothwell2006-06-28
There were some common functions (mainly i/o). Also some small white space cleanups and remove a couple of small unused functions. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>