author: Suresh Siddha <suresh.b.siddha@intel.com> 2012-08-24 17:13:02 -0400
committer: H. Peter Anvin <hpa@linux.intel.com> 2012-09-18 18:52:11 -0400
commit: 304bceda6a18ae0b0240b8aac9a6bdf8ce2d2469
tree: 9ffae43391d69aa4765590b942b907da4a189041 /arch/x86/kernel/cpu/bugs.c
parent: 9c6ff8bbb69a4e7b47ac40bfa44509296e89c5c0
x86, fpu: use non-lazy fpu restore for processors supporting xsave

The fundamental model of the current Linux kernel is to lazily init and
restore the FPU instead of restoring the task state during context switch.
This changes that fundamental lazy model to the non-lazy model for
processors supporting the xsave feature.

Reasons driving this model change are:

i. Newer processors support optimized state save/restore using xsaveopt and
xrstor by tracking the INIT state and MODIFIED state during context switch.
This is faster than modifying the cr0.TS bit, which has serializing semantics.

ii. Newer glibc versions use SSE for some of the optimized copy/clear routines.
With certain workloads (like boot, kernel compilation, etc.), an application
completes its work within the first 5 task switches, thus taking up to 5 #DNA
traps with the kernel not getting a chance to apply its pre-load heuristic.

iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit
and thus will not work correctly in the presence of lazy restore. Non-lazy
state restore is needed for enabling such features.

Some data on a two-socket SNB system:
 * Saved 20K DNA exceptions during boot.
 * Saved 50K DNA exceptions during a kernel-compilation workload.
 * Improved throughput of the AVX-based checksumming function inside the
   kernel by ~15%, as xsave/xrstor is faster than the serializing
   clts/stts pair.

Also, kernel_fpu_begin/end() now relies on the patched alternative
instructions. So move check_fpu(), which uses kernel_fpu_begin/end(),
after alternative_instructions().

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com
Merge 32-bit boot fix from,
Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

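Reason ii in the message can be illustrated with a small user-space toy model. This is only a sketch, not kernel code: the names restore_policy, PRELOAD_THRESHOLD and count_dna_traps are hypothetical, and the threshold of 5 is an assumption taken from the "first 5 task switches" wording. It assumes the task touches the FPU in every time slice and that the lazy policy starts preloading state only after the task has used the FPU in 5 consecutive slices.

```c
#include <stdbool.h>

/* Hypothetical toy model of the two restore policies (not kernel code). */
enum restore_policy { RESTORE_LAZY, RESTORE_EAGER };

/* Assumption: lazy mode preloads FPU state only after this many slices. */
#define PRELOAD_THRESHOLD 5

/*
 * Count the device-not-available (#DNA) traps taken by a task that uses
 * the FPU in each of `slices` consecutive time slices.
 */
static int count_dna_traps(enum restore_policy policy, int slices)
{
	int traps = 0;
	int fpu_slices = 0;	/* slices so far in which the task used the FPU */

	for (int i = 0; i < slices; i++) {
		/* Eager restore always loads state at switch-in time. */
		bool preloaded = policy == RESTORE_EAGER ||
				 fpu_slices >= PRELOAD_THRESHOLD;

		if (!preloaded)
			traps++;	/* first FPU instruction faults */
		fpu_slices++;
	}
	return traps;
}
```

Under these assumptions a lazily restored task that finishes within 3 slices takes 3 traps, one that runs for 20 slices takes 5, and the non-lazy policy takes none in either case. Short-lived tasks therefore pay a trap on every slice in the lazy model, which is the case the message describes for boot and kernel compilation.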
Diffstat (limited to 'arch/x86/kernel/cpu/bugs.c')
-rw-r--r-- arch/x86/kernel/cpu/bugs.c | 7
1 files changed, 6 insertions, 1 deletions
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index c97bb7b5a9f8..d0e910da16c5 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -165,10 +165,15 @@ void __init check_bugs(void)
 	print_cpu_info(&boot_cpu_data);
 #endif
 	check_config();
-	check_fpu();
 	check_hlt();
 	check_popad();
 	init_utsname()->machine[1] =
 		'0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86);
 	alternative_instructions();
+
+	/*
+	 * kernel_fpu_begin/end() in check_fpu() relies on the patched
+	 * alternative instructions.
+	 */
+	check_fpu();
 }
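
The ordering dependency behind this hunk can be sketched as a user-space toy model. All names below (toy_boot, toy_alternative_instructions, toy_check_fpu, and the two boot-order helpers) are hypothetical stand-ins, and a simple enum field stands in for the real mechanism, which rewrites instruction bytes in place at boot. The sketch only shows that a check running before the patching step observes the unpatched variant.

```c
/* Hypothetical toy model of the boot-order dependency (not kernel code). */
enum save_variant { SAVE_UNPATCHED, SAVE_PATCHED };

struct toy_boot {
	enum save_variant active;	 /* variant the FPU helpers would run */
	enum save_variant seen_by_check; /* variant observed by the FPU check */
};

/* Stand-in for alternative_instructions(): switch to the patched variant. */
static void toy_alternative_instructions(struct toy_boot *b)
{
	b->active = SAVE_PATCHED;
}

/* Stand-in for check_fpu(): record which variant it ended up exercising. */
static void toy_check_fpu(struct toy_boot *b)
{
	b->seen_by_check = b->active;
}

/* Old order: the FPU check runs before patching. */
static enum save_variant toy_boot_old_order(void)
{
	struct toy_boot b = { SAVE_UNPATCHED, SAVE_UNPATCHED };

	toy_check_fpu(&b);
	toy_alternative_instructions(&b);
	return b.seen_by_check;
}

/* New order, as in this patch: the FPU check runs after patching. */
static enum save_variant toy_boot_new_order(void)
{
	struct toy_boot b = { SAVE_UNPATCHED, SAVE_UNPATCHED };

	toy_alternative_instructions(&b);
	toy_check_fpu(&b);
	return b.seen_by_check;
}
```

In this model toy_boot_old_order() reports SAVE_UNPATCHED and toy_boot_new_order() reports SAVE_PATCHED, mirroring why check_fpu() is moved below alternative_instructions().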