aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/relocate_kernel_64.S
Commit message (Collapse)AuthorAge
* x86/asm: Replace "MOVQ $imm, %reg" with MOVLDenys Vlasenko2015-04-01
| | | | | | | | | | | | | | | | | | | | | | | There is no reason to use MOVQ to load a non-negative immediate constant value into a 64-bit register. MOVL does the same, since the upper 32 bits are zero-extended by the CPU. This makes the code a bit smaller, while leaving functionality unchanged. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427821211-25099-8-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86/asm: Optimize unnecessarily wide TEST instructionsDenys Vlasenko2015-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By the nature of the TEST operation, it is often possible to test a narrower part of the operand: "testl $3, mem" -> "testb $3, mem", "testq $3, %rcx" -> "testb $3, %cl" This results in shorter instructions, because the TEST instruction has no sign-entending byte-immediate forms unlike other ALU ops. Note that this change does not create any LCP (Length-Changing Prefix) stalls, which happen when adding a 0x66 prefix, which happens when 16-bit immediates are used, which changes such TEST instructions: [test_opcode] [modrm] [imm32] to: [0x66] [test_opcode] [modrm] [imm16] where [imm16] has a *different length* now: 2 bytes instead of 4. This confuses the decoder and slows down execution. REX prefixes were carefully designed to almost never hit this case: adding REX prefix does not change instruction length except MOVABS and MOV [addr],RAX instruction. This patch does not add instructions which would use a 0x66 prefix, code changes in assembly are: -48 f7 07 01 00 00 00 testq $0x1,(%rdi) +f6 07 01 testb $0x1,(%rdi) -48 f7 c1 01 00 00 00 test $0x1,%rcx +f6 c1 01 test $0x1,%cl -48 f7 c1 02 00 00 00 test $0x2,%rcx +f6 c1 02 test $0x2,%cl -41 f7 c2 01 00 00 00 test $0x1,%r10d +41 f6 c2 01 test $0x1,%r10b -48 f7 c1 04 00 00 00 test $0x4,%rcx +f6 c1 04 test $0x4,%cl -48 f7 c1 08 00 00 00 test $0x8,%rcx +f6 c1 08 test $0x8,%cl Linus further notes: "There are no stalls from using 8-bit instruction forms. Now, changing from 64-bit or 32-bit 'test' instructions to 8-bit ones *could* cause problems if it ends up having forwarding issues, so that instead of just forwarding the result, you end up having to wait for it to be stable in the L1 cache (or possibly the register file). The forwarding from the store buffer is simplest and most reliable if the read is done at the exact same address and the exact same size as the write that gets forwarded. But that's true only if: (a) the write was very recent and is still in the write queue. I'm not sure that's the case here anyway. (b) on at least most Intel microarchitectures, you have to test a different byte than the lowest one (so forwarding a 64-bit write to a 8-bit read ends up working fine, as long as the 8-bit read is of the low 8 bits of the written data). A very similar issue *might* show up for registers too, not just memory writes, if you use 'testb' with a high-byte register (where instead of forwarding the value from the original producer it needs to go through the register file and then shifted). But it's mainly a problem for store buffers. But afaik, the way Denys changed the test instructions, neither of the above issues should be true. The real problem for store buffer forwarding tends to be "write 8 bits, read 32 bits". That can be really surprisingly expensive, because the read ends up having to wait until the write has hit the cacheline, and we might talk tens of cycles of latency here. But "write 32 bits, read the low 8 bits" *should* be fast on pretty much all x86 chips, afaik." Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1425675332-31576-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* x86, reloc: Use xorl instead of xorq in relocate_kernel_64.SH. Peter Anvin2013-06-21
| | | | | | | | | | There is no point in using "xorq" to clear a register... use "xorl" to clear the bottom 32 bits, and the upper 32 bits get cleared by virtue of zero extension. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/n/tip-b76zi1gep39c0zs8fbvkhie9@git.kernel.org
* x86: Fix typo in kexec register clearingKees Cook2013-06-12
| | | | | | | | | | | Fixes a typo in register clearing code. Thanks to PaX Team for fixing this originally, and James Troup for pointing it out. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20130605184718.GA8396@www.outflux.net Cc: <stable@vger.kernel.org> v2.6.30+ Cc: PaX Team <pageexec@freemail.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* kexec, x86: Fix incorrect jump back address if not preserving contextHuang Ying2011-07-21
| | | | | | | | | | | | | | | | | | | | | | In kexec jump support, jump back address passed to the kexeced kernel via function calling ABI, that is, the function call return address is the jump back entry. Furthermore, jump back entry == 0 should be used to signal that the jump back or preserve context is not enabled in the original kernel. But in the current implementation the stack position used for function call return address is not cleared context preservation is disabled. The patch fixes this bug. Reported-and-tested-by: Yin Kangkai <kangkai.yin@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1310607277-25029-1-git-send-email-ying.huang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, kexec: x86_64: add kexec jump support for x86_64Huang Ying2009-03-10
| | | | | | | | | | Impact: New major feature This patch add kexec jump support for x86_64. More information about kexec jump can be found in corresponding x86_32 support patch. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86, kexec: fix kexec x86 coding styleHuang Ying2009-03-10
| | | | | | | | | Impact: Cleanup Fix some coding style issue for kexec x86. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86: use _types.h headers in asm where availableJeremy Fitzhardinge2009-02-13
| | | | | | | In general, the only definitions that assembly files can use are in _types.S headers (where available), so convert them. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* x86: kexec: Use one page table in x86_64 machine_kexecHuang Ying2009-02-03
| | | | | | | | | | | | | | | | Impact: reduce kernel BSS size by 7 pages, improve code readability Two page tables are used in current x86_64 kexec implementation. One is used to jump from kernel virtual address to identity map address, the other is used to map all physical memory. In fact, on x86_64, there is no conflict between kernel virtual address space and physical memory space, so just one page table is sufficient. The page table pages used to map control page are dynamically allocated to save memory if kexec image is not loaded. ASM code used to map control page is replaced by C code too. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86: relocate_kernel - use predefined macroses for page attributesgorcunov@gmail.com2008-04-17
| | | | | Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: relocate_kernel - use predefined macroses for processor stategorcunov@gmail.com2008-04-17
| | | | | Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: relocate_kernel - use PAGE_SIZE instead of numeric constantgorcunov@gmail.com2008-04-17
| | | | | Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: relocate_kernel - use predefined PAGE_SIZE instead of own aliasCyrill Gorcunov2008-04-17
| | | | | | | | This patch does clean up relocate_kernel_(32|64).S a bit by getting rid of local PAGE_ALIGNED macro. We should use well-known PAGE_SIZE instead Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86_64: move kernelThomas Gleixner2007-10-11
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>