aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/lib
diff options
context:
space:
mode:
authorBorislav Petkov <bp@suse.de>2015-01-05 07:48:41 -0500
committerBorislav Petkov <bp@suse.de>2015-02-23 07:44:11 -0500
commit48c7a2509f9e237d8465399d9cdfe487d3212a23 (patch)
tree46d12431183b5173295843f41213141e155f6749 /arch/x86/lib
parent4332195c5615bf748624094ce4ff6797e475024d (diff)
x86/alternatives: Make JMPs more robust
Up until now we had to pay attention to relative JMPs in alternatives about how their relative offset gets computed so that the jump target is still correct. Or, as it is the case for near CALLs (opcode e8), we still have to go and readjust the offset at patching time. What is more, the static_cpu_has_safe() facility had to forcefully generate 5-byte JMPs since we couldn't rely on the compiler to generate properly sized ones so we had to force the longest ones. Worse than that, sometimes it would generate a replacement JMP which is longer than the original one, thus overwriting the beginning of the next instruction at patching time. So, in order to alleviate all that and make using JMPs more straight-forward we go and pad the original instruction in an alternative block with NOPs at build time, should the replacement(s) be longer. This way, alternatives users shouldn't pay special attention so that original and replacement instruction sizes are fine but the assembler would simply add padding where needed and not do anything otherwise. As a second aspect, we go and recompute JMPs at patching time so that we can try to make 5-byte JMPs into two-byte ones if possible. If not, we still have to recompute the offsets as the replacement JMP gets put far away in the .altinstr_replacement section leading to a wrong offset if copied verbatim. For example, on a locally generated kernel image old insn VA: 0xffffffff810014bd, CPU feat: X86_FEATURE_ALWAYS, size: 2 __switch_to: ffffffff810014bd: eb 21 jmp ffffffff810014e0 repl insn: size: 5 ffffffff81d0b23c: e9 b1 62 2f ff jmpq ffffffff810014f2 gets corrected to a 2-byte JMP: apply_alternatives: feat: 3*32+21, old: (ffffffff810014bd, len: 2), repl: (ffffffff81d0b23c, len: 5) alt_insn: e9 b1 62 2f ff recompute_jumps: next_rip: ffffffff81d0b241, tgt_rip: ffffffff810014f2, new_displ: 0x00000033, ret len: 2 converted to: eb 33 90 90 90 and a 5-byte JMP: old insn VA: 0xffffffff81001516, CPU feat: X86_FEATURE_ALWAYS, size: 2 __switch_to: ffffffff81001516: eb 30 jmp ffffffff81001548 repl insn: size: 5 ffffffff81d0b241: e9 10 63 2f ff jmpq ffffffff81001556 gets shortened into a two-byte one: apply_alternatives: feat: 3*32+21, old: (ffffffff81001516, len: 2), repl: (ffffffff81d0b241, len: 5) alt_insn: e9 10 63 2f ff recompute_jumps: next_rip: ffffffff81d0b246, tgt_rip: ffffffff81001556, new_displ: 0x0000003e, ret len: 2 converted to: eb 3e 90 90 90 ... and so on. This leads to a net win of around 40ish replacements * 3 bytes savings =~ 120 bytes of I$ on an AMD guest which means some savings of precious instruction cache bandwidth. The padding to the shorter 2-byte JMPs are single-byte NOPs which on smart microarchitectures means discarding NOPs at decode time and thus freeing up execution bandwidth. Signed-off-by: Borislav Petkov <bp@suse.de>
Diffstat (limited to 'arch/x86/lib')
-rw-r--r--arch/x86/lib/copy_user_64.S11
1 files changed, 5 insertions, 6 deletions
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index a9aedd6aa7f7..dad718ce805c 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -25,14 +25,13 @@
25 */ 25 */
26 .macro ALTERNATIVE_JUMP feature1,feature2,orig,alt1,alt2 26 .macro ALTERNATIVE_JUMP feature1,feature2,orig,alt1,alt2
270: 270:
28 .byte 0xe9 /* 32bit jump */ 28 jmp \orig
29 .long \orig-1f /* by default jump to orig */
301: 291:
31 .section .altinstr_replacement,"ax" 30 .section .altinstr_replacement,"ax"
322: .byte 0xe9 /* near jump with 32bit immediate */ 312:
33 .long \alt1-1b /* offset */ /* or alternatively to alt1 */ 32 jmp \alt1
343: .byte 0xe9 /* near jump with 32bit immediate */ 333:
35 .long \alt2-1b /* offset */ /* or alternatively to alt2 */ 34 jmp \alt2
36 .previous 35 .previous
37 36
38 .section .altinstructions,"a" 37 .section .altinstructions,"a"