From 74bf4312fff083ab25c3f357cc653ada7995e5f6 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 31 Jan 2006 18:29:18 -0800 Subject: [SPARC64]: Move away from virtual page tables, part 1. We now use the TSB hardware assist features of the UltraSPARC MMUs. SMP is currently knowingly broken, we need to find another place to store the per-cpu base pointers. We hid them away in the TSB base register, and that obviously will not work any more :-) Another known broken case is non-8KB base page size. Also noticed that flush_tlb_all() is not referenced anywhere, only the internal __flush_tlb_all() (local cpu only) is used by the sparc64 port, so we can get rid of flush_tlb_all(). The kernel gets it's own 8KB TSB (swapper_tsb) and each address space gets it's own private 8K TSB. Later we can add code to dynamically increase the size of per-process TSB as the RSS grows. An 8KB TSB is good enough for up to about a 4MB RSS, after which the TSB starts to incur many capacity and conflict misses. We even accumulate OBP translations into the kernel TSB. Another area for refinement is large page size support. We could use a secondary address space TSB to handle those. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index b80eba0081ca..213eb4a9d8a4 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -223,10 +223,14 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 ldx [%sp + PTREGS_OFF + PT_V9_G3], %g3 ldx [%sp + PTREGS_OFF + PT_V9_G4], %g4 ldx [%sp + PTREGS_OFF + PT_V9_G5], %g5 +#ifdef CONFIG_SMP +#error IMMU TSB usage must be fixed mov TSB_REG, %g6 brnz,a,pn %l3, 1f ldxa [%g6] ASI_IMMU, %g5 -1: ldx [%sp + PTREGS_OFF + PT_V9_G6], %g6 +#endif +1: + ldx [%sp + PTREGS_OFF + PT_V9_G6], %g6 ldx [%sp + PTREGS_OFF + PT_V9_G7], %g7 wrpr %g0, RTRAP_PSTATE_AG_IRQOFF, %pstate ldx [%sp + PTREGS_OFF + PT_V9_I0], %i0 -- cgit v1.2.2 From 56fb4df6da76c35dca22036174e2d1edef83ff1f Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sun, 26 Feb 2006 23:24:22 -0800 Subject: [SPARC64]: Elminate all usage of hard-coded trap globals. UltraSPARC has special sets of global registers which are switched to for certain trap types. There is one set for MMU related traps, one set of Interrupt Vector processing, and another set (called the Alternate globals) for all other trap types. For what seems like forever we've hard coded the values in some of these trap registers. Some examples include: 1) Interrupt Vector global %g6 holds current processors interrupt work struct where received interrupts are managed for IRQ handler dispatch. 2) MMU global %g7 holds the base of the page tables of the currently active address space. 3) Alternate global %g6 held the current_thread_info() value. Such hardcoding has resulted in some serious issues in many areas. There are some code sequences where having another register available would help clean up the implementation. Taking traps such as cross-calls from the OBP firmware requires some trick code sequences wherein we have to save away and restore all of the special sets of global registers when we enter/exit OBP. We were also using the IMMU TSB register on SMP to hold the per-cpu area base address, which doesn't work any longer now that we actually use the TSB facility of the cpu. The implementation is pretty straight forward. One tricky bit is getting the current processor ID as that is different on different cpu variants. We use a stub with a fancy calling convention which we patch at boot time. The calling convention is that the stub is branched to and the (PC - 4) to return to is in register %g1. The cpu number is left in %g6. This stub can be invoked by using the __GET_CPUID macro. We use an array of per-cpu trap state to store the current thread and physical address of the current address space's page tables. The TRAP_LOAD_THREAD_REG loads %g6 with the current thread from this table, it uses __GET_CPUID and also clobbers %g1. TRAP_LOAD_IRQ_WORK is used by the interrupt vector processing to load the current processor's IRQ software state into %g6. It also uses __GET_CPUID and clobbers %g1. Finally, TRAP_LOAD_PGD_PHYS loads the physical address base of the current address space's page tables into %g7, it clobbers %g1 and uses __GET_CPUID. Many refinements are possible, as well as some tuning, with this stuff in place. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index 213eb4a9d8a4..5a62ec5d531c 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -223,12 +223,10 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 ldx [%sp + PTREGS_OFF + PT_V9_G3], %g3 ldx [%sp + PTREGS_OFF + PT_V9_G4], %g4 ldx [%sp + PTREGS_OFF + PT_V9_G5], %g5 -#ifdef CONFIG_SMP -#error IMMU TSB usage must be fixed - mov TSB_REG, %g6 - brnz,a,pn %l3, 1f - ldxa [%g6] ASI_IMMU, %g5 -#endif + brz,pt %l3, 1f + nop + /* Must do this before thread reg is clobbered below. */ + LOAD_PER_CPU_BASE(%g6, %g7) 1: ldx [%sp + PTREGS_OFF + PT_V9_G6], %g6 ldx [%sp + PTREGS_OFF + PT_V9_G7], %g7 -- cgit v1.2.2 From 4da808c352c290d3f762933d44d4ab90c2fd65f3 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 31 Jan 2006 18:33:00 -0800 Subject: [SPARC64]: Fix bogus flush instruction usage. Some of the trap code was still assuming that alternate global %g6 was hard coded with current_thread_info(). Let's just consistently flush at KERNBASE when we need a pipeline synchronization. That's locked into the TLB and will always work. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index 5a62ec5d531c..89794ebdcbcf 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -259,7 +259,8 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 ldx [%l1 + %lo(sparc64_kern_pri_nuc_bits)], %l1 or %l0, %l1, %l0 stxa %l0, [%l7] ASI_DMMU - flush %g6 + sethi %hi(KERNBASE), %l7 + flush %l7 rdpr %wstate, %l1 rdpr %otherwin, %l2 srl %l1, 3, %l1 -- cgit v1.2.2 From 86b818687d4894063ecd1190e54717a0cce8c009 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 31 Jan 2006 18:34:51 -0800 Subject: [SPARC64]: Fix race in LOAD_PER_CPU_BASE() Since we use %g5 itself as a temporary, it can get clobbered if we take an interrupt mid-stream and thus cause end up with the final %g5 value too early as a result of rtrap processing. Set %g5 at the very end, atomically, to avoid this problem. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index 89794ebdcbcf..64bc03610bc6 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -226,7 +226,7 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 brz,pt %l3, 1f nop /* Must do this before thread reg is clobbered below. */ - LOAD_PER_CPU_BASE(%g6, %g7) + LOAD_PER_CPU_BASE(%i0, %i1, %i2) 1: ldx [%sp + PTREGS_OFF + PT_V9_G6], %g6 ldx [%sp + PTREGS_OFF + PT_V9_G7], %g7 -- cgit v1.2.2 From ffe483d55229fadbaf4cc7316d47024a24ecd1a2 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 2 Feb 2006 21:55:10 -0800 Subject: [SPARC64]: Add explicit register args to trap state loading macros. This, as well as making the code cleaner, allows a simplification in the TSB miss handling path. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index 64bc03610bc6..61bd45e7697e 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -226,7 +226,7 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 brz,pt %l3, 1f nop /* Must do this before thread reg is clobbered below. */ - LOAD_PER_CPU_BASE(%i0, %i1, %i2) + LOAD_PER_CPU_BASE(%g5, %g6, %i0, %i1, %i2) 1: ldx [%sp + PTREGS_OFF + PT_V9_G6], %g6 ldx [%sp + PTREGS_OFF + PT_V9_G7], %g7 -- cgit v1.2.2 From 314ef6859750b6539eac48d78059bb7986f29cb1 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sat, 4 Feb 2006 00:10:01 -0800 Subject: [SPARC64]: Refine register window trap handling. When saving and restoing trap state, do the window spill/fill handling inline so that we never trap deeper than 2 trap levels. This is important for chips like Niagara. The window fixup code is massively simplified, and many more improvements are now possible. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 58 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 56 insertions(+), 2 deletions(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index 61bd45e7697e..ecfbbdc56125 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -267,15 +267,69 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 wrpr %l2, %g0, %canrestore wrpr %l1, %g0, %wstate - wrpr %g0, %g0, %otherwin + brnz,pt %l2, user_rtt_restore + wrpr %g0, %g0, %otherwin + + ldx [%g6 + TI_FLAGS], %g3 + wr %g0, ASI_AIUP, %asi + rdpr %cwp, %g1 + andcc %g3, _TIF_32BIT, %g0 + sub %g1, 1, %g1 + bne,pt %xcc, user_rtt_fill_32bit + wrpr %g1, %cwp + ba,a,pt %xcc, user_rtt_fill_64bit + +user_rtt_fill_fixup: + rdpr %cwp, %g1 + add %g1, 1, %g1 + wrpr %g1, 0x0, %cwp + + rdpr %wstate, %g2 + sll %g2, 3, %g2 + wrpr %g2, 0x0, %wstate + + /* We know %canrestore and %otherwin are both zero. */ + + sethi %hi(sparc64_kern_pri_context), %g2 + ldx [%g2 + %lo(sparc64_kern_pri_context)], %g2 + mov PRIMARY_CONTEXT, %g1 + stxa %g2, [%g1] ASI_DMMU + sethi %hi(KERNBASE), %g1 + flush %g1 + + or %g4, FAULT_CODE_WINFIXUP, %g4 + stb %g4, [%g6 + TI_FAULT_CODE] + stx %g5, [%g6 + TI_FAULT_ADDR] + + mov %g6, %l1 + wrpr %g0, 0x0, %tl + wrpr %g0, RTRAP_PSTATE, %pstate + mov %l1, %g6 + ldx [%g6 + TI_TASK], %g4 + LOAD_PER_CPU_BASE(%g5, %g6, %g1, %g2, %g3) + call do_sparc64_fault + add %sp, PTREGS_OFF, %o0 + ba,pt %xcc, rtrap + nop + +user_rtt_pre_restore: + add %g1, 1, %g1 + wrpr %g1, 0x0, %cwp + +user_rtt_restore: restore rdpr %canrestore, %g1 wrpr %g1, 0x0, %cleanwin retry nop -kern_rtt: restore +kern_rtt: rdpr %canrestore, %g1 + brz,pn %g1, kern_rtt_fill + nop +kern_rtt_restore: + restore retry + to_kernel: #ifdef CONFIG_PREEMPT ldsw [%g6 + TI_PRE_COUNT], %l5 -- cgit v1.2.2 From 936f482af1743141d637483ec10eb881537c26dc Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sun, 5 Feb 2006 21:29:28 -0800 Subject: [SPARC64]: Add initial code to twiddle %gl on trap entry/exit. Instead of setting/clearing PSTATE_AG we have to change the %gl register value on sun4v. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index ecfbbdc56125..e6130956307f 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -230,7 +230,14 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 1: ldx [%sp + PTREGS_OFF + PT_V9_G6], %g6 ldx [%sp + PTREGS_OFF + PT_V9_G7], %g7 - wrpr %g0, RTRAP_PSTATE_AG_IRQOFF, %pstate + + /* Normal globals are restored, go to trap globals. */ +661: wrpr %g0, RTRAP_PSTATE_AG_IRQOFF, %pstate + .section .gl_1insn_patch, "ax" + .word 661b + SET_GL(1) + .previous + ldx [%sp + PTREGS_OFF + PT_V9_I0], %i0 ldx [%sp + PTREGS_OFF + PT_V9_I1], %i1 @@ -304,6 +311,13 @@ user_rtt_fill_fixup: mov %g6, %l1 wrpr %g0, 0x0, %tl wrpr %g0, RTRAP_PSTATE, %pstate + +661: nop + .section .gl_1insn_patch, "ax" + .word 661b + SET_GL(0) + .previous + mov %l1, %g6 ldx [%g6 + TI_TASK], %g4 LOAD_PER_CPU_BASE(%g5, %g6, %g1, %g2, %g3) -- cgit v1.2.2 From 314981ac7177a933319e3c071a5cf0a579205e6e Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sun, 5 Feb 2006 21:59:03 -0800 Subject: [SPARC64]: Kill all %pstate changes in context switch code. They are totally unnecessary because: 1) Interrupts are already disabled when switch_to() runs. 2) We don't use hard-coded alternate globals any longer. This found a case in rtrap, which still assumed alternate global %g6 was current_thread_info(), and that is fixed by this changeset as well. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index e6130956307f..a2fa277da62b 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -224,7 +224,8 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 ldx [%sp + PTREGS_OFF + PT_V9_G4], %g4 ldx [%sp + PTREGS_OFF + PT_V9_G5], %g5 brz,pt %l3, 1f - nop + mov %g6, %l2 + /* Must do this before thread reg is clobbered below. */ LOAD_PER_CPU_BASE(%g5, %g6, %i0, %i1, %i2) 1: @@ -238,6 +239,8 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 SET_GL(1) .previous + mov %l2, %g6 + ldx [%sp + PTREGS_OFF + PT_V9_I0], %i0 ldx [%sp + PTREGS_OFF + PT_V9_I1], %i1 -- cgit v1.2.2 From df7d6aec96ab98cb182dd5138a85bdc363a9bf0d Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 7 Feb 2006 00:00:16 -0800 Subject: [SPARC64]: Rename gl_{1,2}insn_patch --> sun4v_{1,2}insn_patch Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index a2fa277da62b..a55d517e76aa 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -234,7 +234,7 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 /* Normal globals are restored, go to trap globals. */ 661: wrpr %g0, RTRAP_PSTATE_AG_IRQOFF, %pstate - .section .gl_1insn_patch, "ax" + .section .sun4v_1insn_patch, "ax" .word 661b SET_GL(1) .previous @@ -316,7 +316,7 @@ user_rtt_fill_fixup: wrpr %g0, RTRAP_PSTATE, %pstate 661: nop - .section .gl_1insn_patch, "ax" + .section .sun4v_1insn_patch, "ax" .word 661b SET_GL(0) .previous -- cgit v1.2.2 From 8b11bd12aff76e02cdc2cbc9e439bba88d281223 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 7 Feb 2006 22:13:05 -0800 Subject: [SPARC64]: Patch up mmu context register writes for sun4v. sun4v uses ASI_MMU instead of ASI_DMMU Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index a55d517e76aa..551f71982008 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -264,11 +264,23 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 brnz,pn %l3, kern_rtt mov PRIMARY_CONTEXT, %l7 - ldxa [%l7 + %l7] ASI_DMMU, %l0 + +661: ldxa [%l7 + %l7] ASI_DMMU, %l0 + .section .sun4v_1insn_patch, "ax" + .word 661b + ldxa [%l7 + %l7] ASI_MMU, %l0 + .previous + sethi %hi(sparc64_kern_pri_nuc_bits), %l1 ldx [%l1 + %lo(sparc64_kern_pri_nuc_bits)], %l1 or %l0, %l1, %l0 - stxa %l0, [%l7] ASI_DMMU + +661: stxa %l0, [%l7] ASI_DMMU + .section .sun4v_1insn_patch, "ax" + .word 661b + stxa %l0, [%l7] ASI_MMU + .previous + sethi %hi(KERNBASE), %l7 flush %l7 rdpr %wstate, %l1 @@ -303,7 +315,13 @@ user_rtt_fill_fixup: sethi %hi(sparc64_kern_pri_context), %g2 ldx [%g2 + %lo(sparc64_kern_pri_context)], %g2 mov PRIMARY_CONTEXT, %g1 - stxa %g2, [%g1] ASI_DMMU + +661: stxa %g2, [%g1] ASI_DMMU + .section .sun4v_1insn_patch, "ax" + .word 661b + stxa %g2, [%g1] ASI_MMU + .previous + sethi %hi(KERNBASE), %g1 flush %g1 -- cgit v1.2.2 From af02bec66294c76fba181c665c68a31fd4392020 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 16 Feb 2006 16:23:45 -0800 Subject: [SPARC64]: Fix return from trap on SUN4V. We need to set the global register set _AND_ disable PSTATE_IE in %pstate. The original patch sequence was leaving PSTATE_IE enabled when returning to kernel mode, oops. This fixes the random register corruption being seen on SUN4V. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index 551f71982008..1e724fe172ae 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -234,8 +234,10 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 /* Normal globals are restored, go to trap globals. */ 661: wrpr %g0, RTRAP_PSTATE_AG_IRQOFF, %pstate - .section .sun4v_1insn_patch, "ax" + nop + .section .sun4v_2insn_patch, "ax" .word 661b + wrpr %g0, RTRAP_PSTATE_IRQOFF, %pstate SET_GL(1) .previous -- cgit v1.2.2 From fc504928677049f0ad3f1fd4e0bb3908172df8f3 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Wed, 22 Feb 2006 16:15:45 -0800 Subject: [SPARC64]: Drop %gl to 0 before re-enabling PSTATE_IE in rtrap If we take a window fault, on SUN4V set %gl to zero before we turn PSTATE_IE back on in %pstate. Otherwise if we take an interrupt we'll end up with corrupt register state. Signed-off-by: David S. Miller --- arch/sparc64/kernel/rtrap.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'arch/sparc64/kernel/rtrap.S') diff --git a/arch/sparc64/kernel/rtrap.S b/arch/sparc64/kernel/rtrap.S index 1e724fe172ae..7130e866f935 100644 --- a/arch/sparc64/kernel/rtrap.S +++ b/arch/sparc64/kernel/rtrap.S @@ -333,7 +333,6 @@ user_rtt_fill_fixup: mov %g6, %l1 wrpr %g0, 0x0, %tl - wrpr %g0, RTRAP_PSTATE, %pstate 661: nop .section .sun4v_1insn_patch, "ax" @@ -341,6 +340,8 @@ user_rtt_fill_fixup: SET_GL(0) .previous + wrpr %g0, RTRAP_PSTATE, %pstate + mov %l1, %g6 ldx [%g6 + TI_TASK], %g4 LOAD_PER_CPU_BASE(%g5, %g6, %g1, %g2, %g3) -- cgit v1.2.2