author Andi Kleen <andi@firstfloor.org> 2008-03-11 22:53:32 -0400
committer Ingo Molnar <mingo@elte.hu> 2008-04-17 11:41:30 -0400
commit 8346ea17aa20e9864b0f7dc03d55f3cd5620b8c1
tree 23a0e515b5aa3c5611f2b821ecc3a2b08c463858
parent 1de87bd40e119d26533b5135677901990390bfa9
x86: split large page mapping for AMD TSEG

On AMD, SMM-protected memory is part of the address map, but handled
internally like an MTRR. That leads to large pages getting split
internally, which has some performance implications. Check for the
AMD TSEG MSR and split the large page mapping on that area
explicitly if it is part of the direct mapping.

There is also SMM ASEG, but it is in the first 1MB and already covered by
the earlier split first page patch.

Idea for this came from an earlier patch by Andreas Herrmann.

On a RevF dual Socket Opteron system kernbench shows a clear
improvement from this:
(together with the earlier patches in this series, especially the
split first 2MB patch)

[lower is better]
               no split  stddev        split    stddev       delta
Elapsed Time    87.146   (0.727516)     84.296  (1.09098)    -3.2%
User Time      274.537   (4.05226)     273.692  (3.34344)    -0.3%
System Time     34.907   (0.42492)      34.508  (0.26832)    -1.1%
Percent CPU    322.5     (38.3007)     326.5    (44.5128)    +1.2%

=> About 3.2% improvement in elapsed time for kernbench.

With GB pages on AMD Fam10h the impact of splitting is much higher of course,
since it would split two full GB pages (together with the first
1MB split patch) instead of two 2MB pages. I could not benchmark
a clear difference in kernbench on gbpages, so I kept it disabled
for that case.

That was only limited benchmarking of course, so if someone
was interested in running more tests for the gbpages case
that could be revisited (contributions welcome).

I didn't bother implementing this for 32bit because it is very
unlikely the 32bit lowmem mapping overlaps into the TSEG near 4GB
and the 2MB low split is already handled for both.

[ mingo@elte.hu: do it on gbpages kernels too, there's no clear reason
  why it shouldn't help there. ]

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Diffstat (limited to 'arch/x86')
-rw-r--r-- arch/x86/kernel/setup_64.c 13
1 files changed, 13 insertions, 0 deletions
diff --git a/arch/x86/kernel/setup_64.c b/arch/x86/kernel/setup_64.c
index 3d76dbd9f2c0..b5425979501c 100644
--- a/arch/x86/kernel/setup_64.c
+++ b/arch/x86/kernel/setup_64.c
@@ -729,6 +729,19 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
 
 	if (amd_apic_timer_broken())
 		disable_apic_timer = 1;
+
+	if (c == &boot_cpu_data && c->x86 >= 0xf && c->x86 <= 0x11) {
+		unsigned long long tseg;
+
+		/*
+		 * Split up direct mapping around the TSEG SMM area.
+		 * Don't do it for gbpages because there seems very little
+		 * benefit in doing so.
+		 */
+		if (!rdmsrl_safe(MSR_K8_TSEG_ADDR, &tseg) &&
+		    (tseg >> PMD_SHIFT) < (max_pfn_mapped >> (PMD_SHIFT-PAGE_SHIFT)))
+			set_memory_4k((unsigned long)__va(tseg), 1);
+	}
 }
 
 void __cpuinit detect_ht(struct cpuinfo_x86 *c)