aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel/entry_64.S
Commit message (Collapse)AuthorAge
* Revert "powerpc/tm: Abort syscalls in active transactions"Michael Ellerman2015-04-30
| | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit feba40362b11341bee6d8ed58d54b896abbd9f84. Although the principle of this change is good, the implementation has a few issues. Firstly we can sometimes fail to abort a syscall because r12 may have been clobbered by C code if we went down the virtual CPU accounting path, or if syscall tracing was enabled. Secondly we have decided that it is safer to abort the syscall even earlier in the syscall entry path, so that we avoid the syscall tracing path when we are transactional. So that we have time to thoroughly test those changes we have decided to revert this for this merge window and will merge the fixed version in the next window. NB. Rather than reverting the selftest we just drop tm-syscall from TEST_PROGS so that it's not run by default. Fixes: feba40362b11 ("powerpc/tm: Abort syscalls in active transactions") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/tm: Abort syscalls in active transactionsSam bobroff2015-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the syscall handler to doom (tabort) active transactions when a syscall is made and return immediately without performing the syscall. Currently, the system call instruction automatically suspends an active transaction which causes side effects to persist when an active transaction fails. This does change the kernel's behaviour, but in a way that was documented as unsupported. It doesn't reduce functionality because syscalls will still be performed after tsuspend. It also provides a consistent interface and makes the behaviour of user code substantially the same across powerpc and platforms that do not support suspended transactions (e.g. x86 and s390). Performance measurements using http://ozlabs.org/~anton/junkcode/null_syscall.c indicate the cost of a system call increases by about 0.5%. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Acked-By: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Add a proper syscall for switching endiannessMichael Ellerman2015-03-28
| | | | | | | | | | | | | | | | | | | | | | | | We currently have a "special" syscall for switching endianness. This is syscall number 0x1ebe, which is handled explicitly in the 64-bit syscall exception entry. That has a few problems, firstly the syscall number is outside of the usual range, which confuses various tools. For example strace doesn't recognise the syscall at all. Secondly it's handled explicitly as a special case in the syscall exception entry, which is complicated enough without it. As a first step toward removing the special syscall, we need to add a regular syscall that implements the same functionality. The logic is simple, it simply toggles the MSR_LE bit in the userspace MSR. This is the same as the special syscall, with the caveat that the special syscall clobbers fewer registers. This version clobbers r9-r12, XER, CTR, and CR0-1,5-7. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Remove old compile time disabled syscall tracing codeMichael Ellerman2015-02-01
| | | | | | | | | | | | | | | | | We have code to do syscall tracing which is disabled at compile time by default. It's not been touched since the dawn of time (ie. v2.6.12). There are now better ways to do syscall tracing, ie. using the raw_syscall, or syscall tracepoints. For the specific case of tracing syscalls at boot on a system that doesn't get to userspace, you can boot with: trace_event=syscalls tp_printk=on Which will trace syscalls from boot, and echo all output to the console. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/kernel: Make syscall_exit a local labelMichael Ellerman2015-02-01
| | | | | | | | | | | | | | | | | | | | | Currently when we back trace something that is in a syscall we see something like this: [c000000000000000] [c000000000000000] SyS_read+0x6c/0x110 [c000000000000000] [c000000000000000] syscall_exit+0x0/0x98 Although it's entirely correct, seeing syscall_exit at the bottom can be confusing - we were exiting from a syscall and then called SyS_read() ? If we instead change syscall_exit to be a local label we get something more intuitive: [c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110 [c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0 ie. we were handling a system call, and it was SyS_read(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Rename _TIF_SYSCALL_T_OR_A to _TIF_SYSCALL_DOTRACEMichael Ellerman2015-01-22
| | | | | | | | | | | Once upon a time, at least 9 years ago (< 2.6.12), _TIF_SYSCALL_T_OR_A meant "TRACE or AUDIT". But these days it means TRACE or AUDIT or SECCOMP or TRACEPOINT or NOHZ. All of those are implemented via syscall_dotrace() so rename the flag to that to try and clarify things. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/ftrace: simplify prepare_ftrace_returnAnton Blanchard2014-11-09
| | | | | | | | | | | | | | | | | Instead of passing in the stack address of the link register to be modified, just pass in the old value and return the new value and rely on ftrace_graph_caller to do the modification. This removes the exception handling around the stack update - it isn't needed and we weren't consistent about it. Later on we would do an unprotected modification: if (!ftrace_graph_entry(&trace)) { *parent = old; Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/ftrace: Remove mod_return_to_handlerAnton Blanchard2014-11-09
| | | | | | | | | mod_return_to_handler is the same as return_to_handler, except it handles the change of the TOC (r2). Add this into return_to_handler and remove mod_return_to_handler. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: do_notify_resume can be called with bad thread_info flags argumentAnton Blanchard2014-10-31
| | | | | | | | | | | | | | | | | | | | | | | | | | Back in 7230c5644188 ("powerpc: Rework lazy-interrupt handling") we added a call out to restore_interrupts() (written in c) before calling do_notify_resume: bl restore_interrupts addi r3,r1,STACK_FRAME_OVERHEAD bl do_notify_resume Unfortunately do_notify_resume takes two arguments, the second one being the thread_info flags: void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags) We do populate r4 (the second argument) earlier, but restore_interrupts() is free to muck it up all it wants. My guess is the gcc compiler gods shone down on us and its register allocator never used r4. Sometimes, rarely, luck is on our side. LLVM on the other hand did trample r4. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/book3s: Add basic infrastructure to handle HMI in Linux.Mahesh Salgaonkar2014-08-05
| | | | | | | | | | Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements basic infrastructure to handle HMI in Linux host. The design is to invoke opal handle hmi in real mode for recovery and set irq_pending when we hit HMI. During check_irq_replay pull opal hmi event and print hmi info on console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Remove MMU_FTR_SLBMichael Ellerman2014-07-28
| | | | | | | | We now only support cpus that use an SLB, so we don't need an MMU feature to indicate that. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Correct DSCR during TM context switchSam bobroff2014-06-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct the DSCR SPR becoming temporarily corrupted if a task is context switched during a transaction. The problem occurs while suspending the task and is caused by saving the DSCR to thread.dscr after it has already been set to the CPU's default value: __switch_to() calls __switch_to_tm() which calls tm_reclaim_task() which calls tm_reclaim_thread() which calls tm_reclaim() where the DSCR is set to the CPU's default __switch_to() calls _switch() where thread.dscr is set to the DSCR When the task is resumed, it's transaction will be doomed (as usual) and the DSCR SPR will be corrupted, although the checkpointed value will be correct. Therefore the DSCR will be immediately corrected by the transaction aborting, unless it has been suspended. In that case the incorrect value can be seen by the task until it resumes the transaction. The fix is to treat the DSCR similarly to the TAR and save it early in __switch_to(). A program exposing the problem is added to the kernel self tests as: tools/testing/selftests/powerpc/tm/tm-resched-dscr. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> CC: <stable@vger.kernel.org> [v3.10+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Fix regression of per-CPU DSCR settingSam bobroff2014-05-27
| | | | | | | | | | | | | | | | | | | | | Since commit "efcac65 powerpc: Per process DSCR + some fixes (try#4)" it is no longer possible to set the DSCR on a per-CPU basis. The old behaviour was to minipulate the DSCR SPR directly but this is no longer sufficient: the value is quickly overwritten by context switching. This patch stores the per-CPU DSCR value in a kernel variable rather than directly in the SPR and it is used whenever a process has not set the DSCR itself. The sysfs interface (/sys/devices/system/cpu/cpuN/dscr) is unchanged. Writes to the old global default (/sys/devices/system/cpu/dscr_default) now set all of the per-CPU values and reads return the last written value. The new per-CPU default is added to the paca_struct and is used everywhere outside of sysfs.c instead of the old global default. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: ftrace_caller, _mcount is exported to modules so needs _GLOBAL_TOC()Anton Blanchard2014-04-22
| | | | | | | | When testing the ftrace function tracer, I realised that ftrace_caller and mcount are called from modules and they both call into C, therefore they need the ABIv2 global entry point to establish r2. Signed-off-by: Anton Blanchard <anton@samba.org>
* powerpc: Fix kernel thread creation on ABIv2Anton Blanchard2014-04-22
| | | | | | | | | | Change how we setup registers for ret_from_kernel_thread. In ABIv1, instead of passing a function descriptor in, dereference it and pass the target in directly. Use ppc_global_function_entry to get it right on both ABIv1 and ABIv2. Signed-off-by: Anton Blanchard <anton@samba.org>
* powerpc: ABIv2 function calls must place target address in r12Anton Blanchard2014-04-22
| | | | | | | | | | To establish addressability quickly, ABIv2 requires the target address of the function being called to be in r12. Fix a number of places in assembly code that we do indirect function calls. We need to avoid function descriptors on ABIv2 too. Signed-off-by: Anton Blanchard <anton@samba.org>
* powerpc: Don't use a function descriptor for system call tableAnton Blanchard2014-04-22
| | | | | | | | | | There is no need to create a function descriptor for the system call table. By using one we force the system call table into the text section and it really belongs in the rodata section. This also removes another use of dot symbols. Signed-off-by: Anton Blanchard <anton@samba.org>
* powerpc: Remove superflous function descriptors in assembly only codeAnton Blanchard2014-04-22
| | | | | | | | | | | | We have a number of places where we load the text address of a local function and indirectly branch to it in assembly. Since it is an indirect branch binutils will not know to use the function text address, so that trick wont work. There is no need for these functions to have a function descriptor so we can replace it with a label and remove the dot symbol. Signed-off-by: Anton Blanchard <anton@samba.org>
* powerpc: No need to use dot symbols when branching to a functionAnton Blanchard2014-04-22
| | | | | | | | | binutils is smart enough to know that a branch to a function descriptor is actually a branch to the functions text address. Alan tells me that binutils has been doing this for 9 years. Signed-off-by: Anton Blanchard <anton@samba.org>
* powerpc: Don't corrupt transactional state when using FP/VMX in kernelPaul Mackerras2014-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when we have a process using the transactional memory facilities on POWER8 (that is, the processor is in transactional or suspended state), and the process enters the kernel and the kernel then uses the floating-point or vector (VMX/Altivec) facility, we end up corrupting the user-visible FP/VMX/VSX state. This happens, for example, if a page fault causes a copy-on-write operation, because the copy_page function will use VMX to do the copy on POWER8. The test program below demonstrates the bug. The bug happens because when FP/VMX state for a transactional process is stored in the thread_struct, we store the checkpointed state in .fp_state/.vr_state and the transactional (current) state in .transact_fp/.transact_vr. However, when the kernel wants to use FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(), which saves the current state in .fp_state/.vr_state. Furthermore, when we return to the user process we return with FP/VMX/VSX disabled. The next time the process uses FP/VMX/VSX, we don't know which set of state (the current register values, .fp_state/.vr_state, or .transact_fp/.transact_vr) we should be using, since we have no way to tell if we are still in the same transaction, and if not, whether the previous transaction succeeded or failed. Thus it is necessary to strictly adhere to the rule that if FP has been enabled at any point in a transaction, we must keep FP enabled for the user process with the current transactional state in the FP registers, until we detect that it is no longer in a transaction. Similarly for VMX; once enabled it must stay enabled until the process is no longer transactional. In order to keep this rule, we add a new thread_info flag which we test when returning from the kernel to userspace, called TIF_RESTORE_TM. This flag indicates that there is FP/VMX/VSX state to be restored before entering userspace, and when it is set the .tm_orig_msr field in the thread_struct indicates what state needs to be restored. The restoration is done by restore_tm_state(). The TIF_RESTORE_TM bit is set by new giveup_fpu/altivec_maybe_transactional helpers, which are called from enable_kernel_fp/altivec, giveup_vsx, and flush_fp/altivec_to_thread instead of giveup_fpu/altivec. The other thing to be done is to get the transactional FP/VMX/VSX state from .fp_state/.vr_state when doing reclaim, if that state has been saved there by giveup_fpu/altivec_maybe_transactional. Having done this, we set the FP/VMX bit in the thread's MSR after reclaim to indicate that that part of the state is now valid (having been reclaimed from the processor's checkpointed state). Finally, in the signal handling code, we move the clearing of the transactional state bits in the thread's MSR a bit earlier, before calling flush_fp_to_thread(), so that we don't unnecessarily set the TIF_RESTORE_TM bit. This is the test program: /* Michael Neuling 4/12/2013 * * See if the altivec state is leaked out of an aborted transaction due to * kernel vmx copy loops. * * gcc -m64 htm_vmxcopy.c -o htm_vmxcopy * */ /* We don't use all of these, but for reference: */ int main(int argc, char *argv[]) { long double vecin = 1.3; long double vecout; unsigned long pgsize = getpagesize(); int i; int fd; int size = pgsize*16; char tmpfile[] = "/tmp/page_faultXXXXXX"; char buf[pgsize]; char *a; uint64_t aborted = 0; fd = mkstemp(tmpfile); assert(fd >= 0); memset(buf, 0, pgsize); for (i = 0; i < size; i += pgsize) assert(write(fd, buf, pgsize) == pgsize); unlink(tmpfile); a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); assert(a != MAP_FAILED); asm __volatile__( "lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value TBEGIN "beq 3f ;" TSUSPEND "xxlxor 40,40,40 ; " // set 40 to 0 "std 5, 0(%[map]) ;" // cause kernel vmx copy page TABORT TRESUME TEND "li %[res], 0 ;" "b 5f ;" "3: ;" // Abort handler "li %[res], 1 ;" "5: ;" "stxvd2x 40,0,%[vecoutptr] ; " : [res]"=r"(aborted) : [vecinptr]"r"(&vecin), [vecoutptr]"r"(&vecout), [map]"r"(a) : "memory", "r0", "r3", "r4", "r5", "r6", "r7"); if (aborted && (vecin != vecout)){ printf("FAILED: vector state leaked on abort %f != %f\n", (double)vecin, (double)vecout); exit(1); } munmap(a, size); close(fd); printf("PASSED!\n"); return 0; } Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Move precessing of MCE queued event out from syscall exit path.Mahesh Salgaonkar2014-01-14
| | | | | | | | | | | | Huge Dickins reported an issue that b5ff4211a829 "powerpc/book3s: Queue up and process delayed MCE events" breaks the PowerMac G5 boot. This patch fixes it by moving the mce even processing away from syscall exit, which was wrong to do that in first place, and using irq work framework to delay processing of mce event. Reported-by: Hugh Dickins <hughd@google.com Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/book3s: Queue up and process delayed MCE events.Mahesh Salgaonkar2013-12-05
| | | | | | | | | | | When machine check real mode handler can not continue into host kernel in V mode, it returns from the interrupt and we loose MCE event which never gets logged. In such a situation queue up the MCE event so that we can log it later when we get back into host kernel with r1 pointing to kernel stack e.g. during syscall exit. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Fix fatal SLB miss when restoring PPRBenjamin Herrenschmidt2013-11-05
| | | | | | | | | | | | When restoring the PPR value, we incorrectly access the thread structure at a time where MSR:RI is clear, which means we cannot recover from nested faults. However the thread structure isn't covered by the "bolted" SLB entries and thus accessing can fault. This fixes it by splitting the code so that the PPR value is loaded into a GPR before MSR:RI is cleared. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/ppc64: Remove the unneeded load of ti_flags in resume_kernelKevin Hao2013-10-11
| | | | | | | | | We already got the value of current_thread_info and ti_flags and store them into r9 and r4 respectively before jumping to resume_kernel. So there is no reason to reload them again. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Endian safe trampolineBenjamin Herrenschmidt2013-10-11
| | | | | | | | | | | | Create a trampoline that works in either endian and flips to the expected endian. Use it for primary and secondary thread entry as well as RTAS and OF call return. Credit for finding the magic instruction goes to Paul Mackerras Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Cleanup handling of the DSCR bit in the FSCR registerMichael Neuling2013-08-27
| | | | | | | | | | | | | | | | As suggested by paulus we can simplify the Data Stream Control Register (DSCR) Facility Status and Control Register (FSCR) handling. Firstly, we simplify the asm by using a rldimi. Secondly, we now use the FSCR only to control the DSCR facility, rather than both the FSCR and HFSCR. Users will see no functional change from this but will get a minor speedup as they will trap into the kernel only once (rather than twice) when they first touch the DSCR. Also, this changes removes a bunch of ugly FTR_SECTION code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'merge' into nextBenjamin Herrenschmidt2013-08-27
|\ | | | | | | | | Merge stuff that already went into Linus via "merge" which are pre-reqs for subsequent patches
| * powerpc: Save the TAR register earlierMichael Neuling2013-08-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves us to save the Target Address Register (TAR) a earlier in __switch_to. It introduces a new function save_tar() to do this. We need to save the TAR earlier as we will overwrite it in the transactional memory reclaim/recheckpoint path. We are going to do this in a subsequent patch which will fix saving the TAR register when it's modified inside a transaction. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Fix context switch DSCR on POWER8Michael Neuling2013-08-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POWER8 allows the DSCR to be accessed directly from userspace via a new SPR number 0x3 (Rather than 0x11. DSCR SPR number 0x11 is still used on POWER8 but like POWER7, is only accessible in HV and OS modes). Currently, we allow this by setting H/FSCR DSCR bit on boot. Unfortunately this doesn't work, as the kernel needs to see the DSCR change so that it knows to no longer restore the system wide version of DSCR on context switch (ie. to set thread.dscr_inherit). This clears the H/FSCR DSCR bit initially. If a process then accesses the DSCR (via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in facility_unavailable_exception(). We also change _switch() so that we set or clear the H/FSCR DSCR bit based on the thread.dscr_inherit. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Fix little endian lppaca, slb_shadow and dtl_entryAnton Blanchard2013-08-14
| | | | | | | | | | | | | | | | | | | | | | The lppaca, slb_shadow and dtl_entry hypervisor structures are big endian, so we have to byte swap them in little endian builds. LE KVM hosts will also need to be fixed but for now add an #error to remind us. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/ppc64: Rename SOFT_DISABLE_INTS with RECONCILE_IRQ_STATETiejun Chen2013-08-14
|/ | | | | | | | | The SOFT_DISABLE_INTS seems an odd name for something that updates the software state to be consistent with interrupts being hard disabled, so rename SOFT_DISABLE_INTS with RECONCILE_IRQ_STATE to avoid this confusion. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Restore dbcr0 on user space exitBharat Bhushan2013-06-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On BookE (Branch taken + Single Step) is as same as Branch Taken on BookS and in Linux we simulate BookS behavior for BookE as well. When doing so, in Branch taken handling we want to set DBCR0_IC but we update the current->thread->dbcr0 and not DBCR0. Now on 64bit the current->thread.dbcr0 (and other debug registers) is synchronized ONLY on context switch flow. But after handling Branch taken in debug exception if we return back to user space without context switch then single stepping change (DBCR0_ICMP) does not get written in h/w DBCR0 and Instruction Complete exception does not happen. This fixes using ptrace reliably on BookE-PowerPC lmbench latency test (lat_syscall) Results are (they varies a little on each run) 1) ./lat_syscall <action> /dev/shm/uImage action: Open read write stat fstat null Before: 3.8618 0.2017 0.2851 1.6789 0.2256 0.0856 After: 3.8580 0.2017 0.2851 1.6955 0.2255 0.0856 1) ./lat_syscall -P 2 -N 10 <action> /dev/shm/uImage action: Open read write stat fstat null Before: 4.1388 0.2238 0.3066 1.7106 0.2256 0.0856 After: 4.1413 0.2236 0.3062 1.7107 0.2256 0.0856 [ Slightly modified to avoid extra branch in the fast path on Book3S and fix build on all non-BookE 64-bit -- BenH ] Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Partial revert of "Context switch more PMU related SPRs"Michael Ellerman2013-06-09
| | | | | | | | | | | | | In commit 59affcd I added context switching of more PMU SPRs, because they are potentially exposed to userspace on Power8. However despite me being a smart arse in the commit message it's actually not correct. In particular it interacts badly with a global perf record. We will have to do something more complicated, but that will have to wait for 3.11. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Kill all prefetch streams on context switchMichael Neuling2013-05-31
| | | | | | | | | | | On context switch, we should have no prefetch streams leak from one userspace process to another. This frees up prefetch resources for the next process. Based on patch from Milton Miller. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Context switch more PMU related SPRsMichael Ellerman2013-05-24
| | | | | | | | | | | In commit 9353374 "Context switch the new EBB SPRs" we added support for context switching some new EBB SPRs. However despite four of us signing off on that patch we missed some. To be fair these are not actually new SPRs, but they are now potentially user accessible so need to be context switched. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'merge' of ↵Linus Torvalds2013-05-14
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Benjamin Herrenschmidt: "This is mostly bug fixes (some of them regressions, some of them I deemed worth merging now) along with some patches from Li Zhong hooking up the new context tracking stuff (for the new full NO_HZ)" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits) powerpc: Set show_unhandled_signals to 1 by default powerpc/perf: Fix setting of "to" addresses for BHRB powerpc/pmu: Fix order of interpreting BHRB target entries powerpc/perf: Move BHRB code into CONFIG_PPC64 region powerpc: select HAVE_CONTEXT_TRACKING for pSeries powerpc: Use the new schedule_user API on userspace preemption powerpc: Exit user context on notify resume powerpc: Exception hooks for context tracking subsystem powerpc: Syscall hooks for context tracking subsystem powerpc/booke64: Fix kernel hangs at kernel_dbg_exc powerpc: Fix irq_set_affinity() return values powerpc: Provide __bswapdi2 powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3 powerpc/powernv: Detect OPAL v3 API version powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS powerpc: Bring all threads online prior to migration/hibernation powerpc/rtas_flash: Fix validate_flash buffer overflow issue powerpc/kexec: Fix kexec when using VMX optimised memcpy powerpc: Fix build errors STRICT_MM_TYPECHECKS ...
| * powerpc: Use the new schedule_user API on userspace preemptionLi Zhong2013-05-14
| | | | | | | | | | | | | | | | | | This patch corresponds to [PATCH] x86: Use the new schedule_user API on userspace preemption commit 0430499ce9d78691f3985962021b16bf8f8a8048 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning againLi Zhong2013-05-14
| | | | | | | | | | | | | | | | | | | | | | Saw this warning again, and this time from the ret_from_fork path. It seems we could clear the back chain earlier in copy_thread(), which could cover both path, and also fix potential lockdep usage in schedule_tail(), or exception occurred before we clear the back chain. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | Merge git://git.infradead.org/users/eparis/auditLinus Torvalds2013-05-11
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull audit changes from Eric Paris: "Al used to send pull requests every couple of years but he told me to just start pushing them to you directly. Our touching outside of core audit code is pretty straight forward. A couple of interface changes which hit net/. A simple argument bug calling audit functions in namei.c and the removal of some assembly branch prediction code on ppc" * git://git.infradead.org/users/eparis/audit: (31 commits) audit: fix message spacing printing auid Revert "audit: move kaudit thread start from auditd registration to kaudit init" audit: vfs: fix audit_inode call in O_CREAT case of do_last audit: Make testing for a valid loginuid explicit. audit: fix event coverage of AUDIT_ANOM_LINK audit: use spin_lock in audit_receive_msg to process tty logging audit: do not needlessly take a lock in tty_audit_exit audit: do not needlessly take a spinlock in copy_signal audit: add an option to control logging of passwords with pam_tty_audit audit: use spin_lock_irqsave/restore in audit tty code helper for some session id stuff audit: use a consistent audit helper to log lsm information audit: push loginuid and sessionid processing down audit: stop pushing loginid, uid, sessionid as arguments audit: remove the old depricated kernel interface audit: make validity checking generic audit: allow checking the type of audit message in the user filter audit: fix build break when AUDIT_DEBUG == 2 audit: remove duplicate export of audit_enabled Audit: do not print error when LSMs disabled ...
| * powerpc: Remove static branch prediction in 64bit traced syscall pathAnton Blanchard2013-04-10
| | | | | | | | | | | | | | | | | | Some distros enable auditing by default which forces us through the syscall trace path. Remove the static branch prediction in our 64bit syscall handler and let the hardware do the prediction. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Eric Paris <eparis@redhat.com>
* | powerpc: Context switch the new EBB SPRsMichael Ellerman2013-05-01
| | | | | | | | | | | | | | | | | | | | | | | | | | This context switches the new Event Based Branching (EBB) SPRs. The three new SPRs are: - Event Based Branch Handler Register (EBBHR) - Event Based Branch Return Register (EBBRR) - Branch Event Status and Control Register (BESCR) Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207SMichael Ellerman2013-05-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are getting low on cpu feature bits. So rather than add a separate bit for every new Power8 feature, add a bit for arch 2.07 server catagory and use that instead. Hijack the value we had for BCTAR, but swap the value with CFAR so that all the ARCH defines are together. Note we don't touch CPU_FTR_TM, because it is conditionally enabled if the kernel is built with TM support. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: add a missing label in resume_kernelKevin Hao2013-04-15
| | | | | | | | | | | | | | | | | | | | | | | | A label 0 was missed in the patch a9c4e541 (powerpc/kprobe: Complete kprobe and migrate exception frame). This will cause the kernel branch to an undetermined address if there really has a conflict when updating the thread flags. Signed-off-by: Kevin Hao <haokexin@gmail.com> Cc: stable@vger.kernel.org Acked-By: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
* | powerpc: Fix audit crash due to save/restore PPR changesAlistair Popple2013-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current mainline crashes when hitting userspace with the following: kernel BUG at kernel/auditsc.c:1769! cpu 0x1: Vector: 700 (Program Check) at [c000000023883a60] pc: c0000000001047a8: .__audit_syscall_entry+0x38/0x130 lr: c00000000000ed64: .do_syscall_trace_enter+0xc4/0x270 sp: c000000023883ce0 msr: 8000000000029032 current = 0xc000000023800000 paca = 0xc00000000f080380 softe: 0 irq_happened: 0x01 pid = 1629, comm = start_udev kernel BUG at kernel/auditsc.c:1769! enter ? for help [c000000023883d80] c00000000000ed64 .do_syscall_trace_enter+0xc4/0x270 [c000000023883e30] c000000000009b08 syscall_dotrace+0xc/0x38 --- Exception: c00 (System Call) at 0000008010ec50dc Bisecting found the following patch caused it: commit 44e9309f1f357794b7ae93d5f3e3e6f11d2b8a7f Author: Haren Myneni <haren@linux.vnet.ibm.com> powerpc: Implement PPR save/restore It was found this patch corrupted r9 when calling SET_DEFAULT_THREAD_PPR() Using r10 as a scratch register instead of r9 solved the problem. Signed-off-by: Alistair Popple <alistair@popple.id.au> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
* | Merge branch 'next' of ↵Linus Torvalds2013-02-23
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Benjamin Herrenschmidt: "So from the depth of frozen Minnesota, here's the powerpc pull request for 3.9. It has a few interesting highlights, in addition to the usual bunch of bug fixes, minor updates, embedded device tree updates and new boards: - Hand tuned asm implementation of SHA1 (by Paulus & Michael Ellerman) - Support for Doorbell interrupts on Power8 (kind of fast thread-thread IPIs) by Ian Munsie - Long overdue cleanup of the way we handle relocation of our open firmware trampoline (prom_init.c) on 64-bit by Anton Blanchard - Support for saving/restoring & context switching the PPR (Processor Priority Register) on server processors that support it. This allows the kernel to preserve thread priorities established by userspace. By Haren Myneni. - DAWR (new watchpoint facility) support on Power8 by Michael Neuling - Ability to change the DSCR (Data Stream Control Register) which controls cache prefetching on a running process via ptrace by Alexey Kardashevskiy - Support for context switching the TAR register on Power8 (new branch target register meant to be used by some new specific userspace perf event interrupt facility which is yet to be enabled) by Ian Munsie. - Improve preservation of the CFAR register (which captures the origin of a branch) on various exception conditions by Paulus. - Move the Bestcomm DMA driver from arch powerpc to drivers/dma where it belongs by Philippe De Muyter - Support for Transactional Memory on Power8 by Michael Neuling (based on original work by Matt Evans). For those curious about the feature, the patch contains a pretty good description." (See commit db8ff907027b: "powerpc: Documentation for transactional memory on powerpc" for the mentioned description added to the file Documentation/powerpc/transactional_memory.txt) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (140 commits) powerpc/kexec: Disable hard IRQ before kexec powerpc/85xx: l2sram - Add compatible string for BSC9131 platform powerpc/85xx: bsc9131 - Correct typo in SDHC device node powerpc/e500/qemu-e500: enable coreint powerpc/mpic: allow coreint to be determined by MPIC version powerpc/fsl_pci: Store the pci ctlr device ptr in the pci ctlr struct powerpc/85xx: Board support for ppa8548 powerpc/fsl: remove extraneous DIU platform functions arch/powerpc/platforms/85xx/p1022_ds.c: adjust duplicate test powerpc: Documentation for transactional memory on powerpc powerpc: Add transactional memory to pseries and ppc64 defconfigs powerpc: Add config option for transactional memory powerpc: Add transactional memory to POWER8 cpu features powerpc: Add new transactional memory state to the signal context powerpc: Hook in new transactional memory code powerpc: Routines for FP/VSX/VMX unavailable during a transaction powerpc: Add transactional memory unavaliable execption handler powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes powerpc: Add FP/VSX and VMX register load functions for transactional memory powerpc: Add helper functions for transactional memory context switching ...
| * | powerpc: Add transactional memory paca scratch register to show_regsMichael Neuling2013-02-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add transactional memory paca scratch register to show_regs. This is useful for debugging. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | powerpc: Add support for context switching the TAR registerIan Munsie2013-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for enabling and context switching the Target Address Register in Power8. The TAR is a new special purpose register that can be used for computed branches with the bctar[l] (branch conditional to TAR) instruction in the same manner as the count and link registers. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | Merge branch 'merge' into nextBenjamin Herrenschmidt2013-01-28
| |\| | | | | | | | | | | | | Merge "merge" branch to bring in various bug fixes that are going into 3.8
| * | powerpc: Implement PPR save/restoreHaren Myneni2013-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [PATCH 6/6] powerpc: Implement PPR save/restore When the task enters in to kernel space, the user defined priority (PPR) will be saved in to PACA at the beginning of first level exception vector and then copy from PACA to thread_info in second level vector. PPR will be restored from thread_info before exits the kernel space. P7/P8 temporarily raises the thread priority to higher level during exception until the program executes HMT_* calls. But it will not modify PPR register. So we save PPR value whenever some register is available to use and then calls HMT_MEDIUM to increase the priority. This feature supports on P7 or later processors. We save/ restore PPR for all exception vectors except system call entry. GLIBC will be saving / restore for system calls. So the default PPR value (3) will be set for the system call exit when the task returned to the user space. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | powerpc: Move branch instruction from ACCOUNT_CPU_USER_ENTRY to callerHaren Myneni2013-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [PATCH 1/6] powerpc: Move branch instruction from ACCOUNT_CPU_USER_ENTRY to caller The first instruction in ACCOUNT_CPU_USER_ENTRY is 'beq' which checks for exceptions coming from kernel mode. PPR value will be saved immediately after ACCOUNT_CPU_USER_ENTRY and is also for user level exceptions. So moved this branch instruction in the caller code. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>