path: root/arch/x86/kvm
...
* KVM: x86 emulator: Don't write-back cpu-state on X86EMUL_INTERCEPTED (Joerg Roedel, 2011-05-11)
  This patch prevents the changed CPU state from being written back when the
  emulator detects that the instruction was intercepted by the guest.
  Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: add SVM intercepts (Avi Kivity, 2011-05-11)
  Add intercept codes for instructions defined by SVM as interceptable.
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: add framework for instruction intercepts (Avi Kivity, 2011-05-11)
  When running in guest mode, certain instructions can be intercepted by
  hardware. This also holds for nested guests running on emulated
  virtualization hardware, in particular instructions emulated by kvm itself.
  This patch adds a framework for intercepting instructions. If an
  instruction is marked for interception, and if we're running in guest mode,
  a callback is called to check whether an intercept is needed or not. The
  callback is called at three points in time: immediately after beginning
  execution, after checking privilege exceptions, and after checking memory
  exceptions. This suits the different interception points defined for
  different instructions and for the various virtualization instruction sets.
  In addition, a new X86EMUL_INTERCEPT is defined, which any callback or
  memory access may return, allowing the more complicated intercepts to be
  implemented in existing callbacks.
  Signed-off-by: Avi Kivity <avi@redhat.com>
  Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
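  A minimal C sketch of this shape (hypothetical names; the real emulator
  callback and constants differ):

    /* Three checkpoints at which an intercept may fire during emulation. */
    enum intercept_stage {
        ICPT_PRE_EXCEPT,      /* immediately after beginning execution */
        ICPT_POST_EXCEPT,     /* after privilege-exception checks      */
        ICPT_POST_MEMACCESS,  /* after memory-exception checks         */
    };

    #define EMUL_CONTINUE    0
    #define EMUL_INTERCEPTED 1

    struct emu_ctxt {
        int guest_mode;   /* nonzero when emulating for a nested guest */
        int intercept;    /* intercept code for this insn, 0 = none    */
        int (*check_intercept)(struct emu_ctxt *c, int code, int stage);
    };

    /* Called at each checkpoint; a nonzero result aborts emulation
     * before any cpu-state write-back happens. */
    static int maybe_intercept(struct emu_ctxt *c, int stage)
    {
        if (!c->guest_mode || !c->intercept)
            return EMUL_CONTINUE;
        return c->check_intercept(c, c->intercept, stage);
    }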
* KVM: x86 emulator: implement movdqu instruction (f3 0f 6f, f3 0f 7f) (Avi Kivity, 2011-05-11)
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: SSE support (Avi Kivity, 2011-05-11)
  Add support for marking an instruction as SSE, switching registers used to
  the SSE register file.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: Specialize decoding for insns with 66/f2/f3 prefixes (Avi Kivity, 2011-05-11)
  Most SIMD instructions use the 66/f2/f3 prefixes to distinguish between
  different variants of the same instruction. Usually the encoding is quite
  regular, but in some cases (including non-SIMD instructions) the prefixes
  generate very different instructions. Examples include XCHG/PAUSE,
  MOVQ/MOVDQA/MOVDQU, and MOVBE/CRC32.
  Allow the emulator to handle these special cases by splitting such opcodes
  into groups, with different decode flags and execution functions for
  different prefixes.
  Signed-off-by: Avi Kivity <avi@redhat.com>
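  One way to picture the split (an illustrative sketch, not the emulator's
  actual table layout):

    /* A decode-table entry: flags plus an execution handler. */
    struct opcode {
        unsigned long flags;
        int (*execute)(void *ctxt);
    };

    /* One variant per mandatory prefix: none, 66, f2, f3. */
    struct gprefix {
        struct opcode pfx_no, pfx_66, pfx_f2, pfx_f3;
    };

    /* Pick the variant selected by the last mandatory prefix seen. */
    static const struct opcode *select_variant(const struct gprefix *g,
                                               unsigned char prefix)
    {
        switch (prefix) {
        case 0x66: return &g->pfx_66;
        case 0xf2: return &g->pfx_f2;
        case 0xf3: return &g->pfx_f3;
        default:   return &g->pfx_no;
        }
    }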
* KVM: x86 emulator: define callbacks for using the guest fpu within the emulator (Avi Kivity, 2011-05-11)
  Needed for emulating fpu instructions.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: do not munge rep prefix (Avi Kivity, 2011-05-11)
  Currently we store a rep prefix as 1 or 2, depending on whether it is a
  REPE or a REPNE. Since sse instructions depend on the prefix value, store
  it as the original opcode to simplify things further on.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: 16-byte mmio support (Avi Kivity, 2011-05-11)
  Since sse instructions can issue 16-byte mmios, we need to support them. We
  can't increase the kvm_run mmio buffer size to 16 bytes without breaking
  compatibility, so instead we break the large mmios into two smaller 8-byte
  ones. Since the bus is 64-bit, we aren't breaking any atomicity guarantees.
  Signed-off-by: Avi Kivity <avi@redhat.com>
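  A self-contained sketch of the splitting idea (names and the emit path are
  hypothetical stand-ins for the real kvm code):

    #include <stdint.h>
    #include <stdio.h>

    #define MMIO_CHUNK 8  /* the kvm_run mmio buffer holds at most 8 bytes */

    /* Stand-in for handing one bounded transaction to the existing path. */
    static void emit_mmio(uint64_t gpa, const uint8_t *data, unsigned len)
    {
        (void)data;  /* a real implementation would copy the bytes out */
        printf("mmio write: gpa=%#llx len=%u\n",
               (unsigned long long)gpa, len);
    }

    /* Split an up-to-16-byte access into <=8-byte chunks; each chunk
     * remains atomic because the underlying bus is 64 bits wide. */
    static void write_mmio(uint64_t gpa, const uint8_t *data, unsigned len)
    {
        while (len) {
            unsigned n = len < MMIO_CHUNK ? len : MMIO_CHUNK;

            emit_mmio(gpa, data, n);
            gpa += n;
            data += n;
            len -= n;
        }
    }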
* KVM: Split mmio completion into a function (Avi Kivity, 2011-05-11)
  Make room for sse mmio completions.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: extend in-kernel mmio to handle >8 byte transactions (Avi Kivity, 2011-05-11)
  Needed for coalesced mmio using sse.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: better fix for race between nmi injection and enabling nmi window (Gleb Natapov, 2011-05-11)
  Fix the race between nmi injection and enabling the nmi window in a simpler
  way.
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* Revert "KVM: Fix race between nmi injection and enabling nmi window"Marcelo Tosatti2011-05-11
| | | | | | | | This reverts commit f86368493ec038218e8663cc1b6e5393cd8e008a. Simpler fix to follow. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: expose async pf through our standard mechanism (Glauber Costa, 2011-05-11)
  As Avi recently mentioned, the new standard mechanism for exposing features
  is KVM_GET_SUPPORTED_CPUID, not spamming CAPs. For some reason async pf
  missed that. So expose async_pf here.
  Signed-off-by: Glauber Costa <glommer@redhat.com>
  CC: Gleb Natapov <gleb@redhat.com>
  CC: Avi Kivity <avi@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: simplify NMI mask management (Avi Kivity, 2011-05-11)
  Use vmx_set_nmi_mask() instead of open-coding management of the hardware
  bit and the software hint (nmi_known_unmasked).
  There's a slight change of behaviour when running without hardware virtual
  NMI support - we now clear the NMI mask if NMI delivery faulted in that
  case as well. This improves emulation accuracy.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Remove unused svm_features (Jan Kiszka, 2011-05-11)
  We use boot_cpu_has now.
  Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Use cached VM_EXIT_INTR_INFO in handle_exception (Avi Kivity, 2011-05-11)
  vmx_complete_atomic_exit() cached it for us, so we can use it here.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Don't VMREAD VM_EXIT_INTR_INFO unconditionally (Avi Kivity, 2011-05-11)
  Only read it if we're going to use it later.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Refactor vmx_complete_atomic_exit() (Avi Kivity, 2011-05-11)
  Move the exit-reason checks to the front of the function, for early exit in
  the common case.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Qualify check for host NMI (Avi Kivity, 2011-05-11)
  Check for the exit reason first; this allows us, later, to avoid a VMREAD
  for VM_EXIT_INTR_INFO_FIELD.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Avoid vmx_recover_nmi_blocking() when unneeded (Avi Kivity, 2011-05-11)
  When we haven't injected an interrupt, we don't need to recover the nmi
  blocking state (since the guest can't set it by itself). This allows us to
  avoid a VMREAD later on.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Cache cpl (Avi Kivity, 2011-05-11)
  We may read the cpl quite often in the same vmexit (instruction privilege
  check, memory access checks for instruction and operands), so we gain a bit
  if we cache the value.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Optimize vmx_get_cpl() (Avi Kivity, 2011-05-11)
  In long mode, vm86 mode is disallowed, so we need not check for it. Reading
  rflags.vm may require a VMREAD, so it is expensive.
  Signed-off-by: Avi Kivity <avi@redhat.com>
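  The shape of the optimization, as a hedged sketch (illustrative helper and
  stub, not the actual vmx_get_cpl(), which is more involved):

    #include <stdbool.h>
    #include <stdint.h>

    #define X86_EFLAGS_VM (1u << 17)

    static uint64_t read_rflags(void)
    {
        return 0x2;  /* stub for the expensive VMREAD; bit 1 is always set */
    }

    static int get_cpl(bool long_mode, uint16_t cs_selector)
    {
        /* Only consult rflags.vm outside long mode, where vm86 is
         * possible; this skips the expensive read on the hot path. */
        if (!long_mode && (read_rflags() & X86_EFLAGS_VM))
            return 3;            /* vm86 code always runs at CPL 3 */
        return cs_selector & 3;  /* otherwise CPL is the RPL of CS */
    }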
* KVM: VMX: Optimize vmx_get_rflags() (Avi Kivity, 2011-05-11)
  If called several times within the same exit, return cached results.
  Signed-off-by: Avi Kivity <avi@redhat.com>
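  The caching pattern, sketched under assumed names (the real code tracks
  register availability differently):

    #include <stdbool.h>
    #include <stdint.h>

    struct vcpu_cache {
        uint64_t rflags;
        bool rflags_valid;  /* invalidated at every vmexit */
    };

    static uint64_t vmread_guest_rflags(void)
    {
        return 0x2;         /* stub for the expensive VMREAD */
    }

    /* The first call per exit pays for the VMREAD; later calls are free. */
    static uint64_t get_rflags(struct vcpu_cache *c)
    {
        if (!c->rflags_valid) {
            c->rflags = vmread_guest_rflags();
            c->rflags_valid = true;
        }
        return c->rflags;
    }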
* KVM: Use kvm_get_rflags() and kvm_set_rflags() instead of the raw versions (Avi Kivity, 2011-05-11)
  Some rflags bits are owned by the host, not the guest, so we need to use
  kvm_get_rflags() to strip those bits away or kvm_set_rflags() to add them
  back.
  Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: move and fix substitute search for missing CPUID entries (Andre Przywara, 2011-04-06)
  If KVM cannot find an exact match for a requested CPUID leaf, the code will
  try to find the closest match instead of simply confessing its failure. The
  implementation was meant to satisfy the CPUID specification, but did not
  properly check for extended and standard leaves and also didn't account for
  the index subleaf.
  Besides that, this rule only applies to CPUID intercepts, which is not the
  only user of the kvm_find_cpuid_entry() function. So fix this algorithm and
  call it from kvm_emulate_cpuid().
  This fixes a crash of newer Linux kernels as KVM guests on AMD Bulldozer
  CPUs, where bogus values were returned in response to a CPUID intercept.
  Signed-off-by: Andre Przywara <andre.przywara@amd.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
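  A hedged sketch of the fallback logic (hypothetical names, simplified from
  what the real lookup does):

    #include <stddef.h>
    #include <stdint.h>

    struct cpuid_entry {
        uint32_t function, index;     /* leaf and subleaf */
        uint32_t eax, ebx, ecx, edx;  /* cached output    */
    };

    /* Exact lookup over the cached entries, matching leaf and subleaf. */
    static struct cpuid_entry *find_exact(struct cpuid_entry *e, size_t n,
                                          uint32_t fn, uint32_t idx)
    {
        for (size_t i = 0; i < n; i++)
            if (e[i].function == fn && e[i].index == idx)
                return &e[i];
        return NULL;
    }

    /* Out-of-range leaves of either class (standard, or extended when
     * bit 31 of fn is set) fall back to the highest standard leaf, as
     * the CPUID specification prescribes; anything else stays a miss. */
    static struct cpuid_entry *find_cpuid(struct cpuid_entry *e, size_t n,
                                          uint32_t fn, uint32_t idx,
                                          uint32_t max_std, uint32_t max_ext)
    {
        struct cpuid_entry *hit = find_exact(e, n, fn, idx);

        if (hit)
            return hit;
        if ((fn & 0x80000000u) ? (fn > max_ext) : (fn > max_std))
            return find_exact(e, n, max_std, idx);
        return NULL;
    }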
* KVM: fix XSAVE bit scanning (Andre Przywara, 2011-04-06)
  When KVM scans the 0xD CPUID leaf for propagating the XSAVE save area
  leaves, it assumes that the leaves are contiguous and stops at the first
  zero one. On AMD hardware there is a gap, though, as LWP uses leaf 62 to
  announce its state save area. So let's iterate through all 64 possible
  leaves and simply skip zero ones to also cover later features.
  Signed-off-by: Andre Przywara <andre.przywara@amd.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
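  In sketch form (the cpuid query is stubbed here; the real code issues
  CPUID with EAX=0xD and ECX set to the subleaf index):

    #include <stdint.h>

    #define XSTATE_LEAVES 64  /* CPUID 0xD defines up to 64 subleaves */

    struct cpuid_out { uint32_t eax, ebx, ecx, edx; };

    static struct cpuid_out cpuid_d(uint32_t idx)
    {
        (void)idx;
        struct cpuid_out o = {0};  /* stub for the real CPUID query */
        return o;
    }

    /* Walk every possible subleaf and skip the empty ones, instead of
     * stopping at the first zero leaf; this covers gaps such as AMD's
     * LWP save area at leaf 62. */
    static int count_xsave_leaves(void)
    {
        int found = 0;

        for (uint32_t idx = 0; idx < XSTATE_LEAVES; idx++) {
            struct cpuid_out o = cpuid_d(idx);

            if (!o.eax)
                continue;  /* gap: skip, don't stop */
            found++;
        }
        return found;
    }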
* Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip (Linus Torvalds, 2011-03-18)
  Merged commits:
  - x86: Flush TLB if PGD entry is changed in i386 PAE mode
  - x86, dumpstack: Correct stack dump info when frame pointer is available
  - x86: Clean up csum-copy_64.S a bit
  - x86: Fix common misspellings
  - x86: Fix misspelling and align params
  - x86: Use PentiumPro-optimized partial_csum() on VIA C7
| * x86: Fix common misspellings (Lucas De Marchi, 2011-03-18)
  They were generated by 'codespell' and then manually reviewed.
  Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
  Cc: trivial@kernel.org
  LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | KVM: unbreak userspace that does not set tss address (Gleb Natapov, 2011-03-17)
  Commit 6440e5967bc broke old userspaces that do not set the tss address
  before entering the vcpu. Unbreak them by setting the tss address to a safe
  value on the first vcpu entry. New userspaces should set the tss address,
  so print a warning in case one doesn't.
  Signed-off-by: Gleb Natapov <gleb@redhat.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* | KVM: MMU: cleanup pte write path (Xiao Guangrong, 2011-03-17)
  This patch does:
  - call vcpu->arch.mmu.update_pte directly
  - use gfn_to_pfn_atomic in update_pte path
  The suggestion is from Avi.
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: introduce a common function to get no-dirty-logged slot (Xiao Guangrong, 2011-03-17)
  Clean up the code of pte_prefetch_gfn_to_memslot and
  mapping_level_dirty_bitmap.
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: fix rcu usage in init_rmode_* functions (Xiao Guangrong, 2011-03-17)
  fix:
  [ 3494.671786] stack backtrace:
  [ 3494.671789] Pid: 10527, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23
  [ 3494.671790] Call Trace:
  [ 3494.671796] [] ? lockdep_rcu_dereference+0x9d/0xa5
  [ 3494.671826] [] ? kvm_memslots+0x6b/0x73 [kvm]
  [ 3494.671834] [] ? gfn_to_memslot+0x16/0x4f [kvm]
  [ 3494.671843] [] ? gfn_to_hva+0x16/0x27 [kvm]
  [ 3494.671851] [] ? kvm_write_guest_page+0x31/0x83 [kvm]
  [ 3494.671861] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm]
  [ 3494.671867] [] ? vmx_set_tss_addr+0x83/0x122 [kvm_intel]
  and:
  [ 8328.789599] stack backtrace:
  [ 8328.789601] Pid: 18736, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23
  [ 8328.789603] Call Trace:
  [ 8328.789609] [] ? lockdep_rcu_dereference+0x9d/0xa5
  [ 8328.789621] [] ? kvm_memslots+0x6b/0x73 [kvm]
  [ 8328.789628] [] ? gfn_to_memslot+0x16/0x4f [kvm]
  [ 8328.789635] [] ? gfn_to_hva+0x16/0x27 [kvm]
  [ 8328.789643] [] ? kvm_write_guest_page+0x31/0x83 [kvm]
  [ 8328.789699] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm]
  [ 8328.789713] [] ? vmx_create_vcpu+0x316/0x3c8 [kvm_intel]
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: fix kvmclock regression due to missing clock update (Nikola Ciprich, 2011-03-17)
  Commit 387b9f97750444728962b236987fbe8ee8cc4f8c moved
  kvm_request_guest_time_update(vcpu), breaking 32bit SMP guests using
  kvm-clock. Fix this by moving the (new) clock update function to its proper
  place.
  Signed-off-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
  Acked-by: Zachary Amsden <zamsden@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: emulator: Fix permission checking in io permission bitmap (Gleb Natapov, 2011-03-17)
  Currently, if io port + len crosses an 8-bit boundary in the io permission
  bitmap, the check may allow IO that otherwise should not be allowed. The
  patch fixes that.
  Signed-off-by: Gleb Natapov <gleb@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
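  A sketch of a boundary-safe check (illustrative only; the real code reads
  the bitmap out of the guest TSS):

    #include <stdbool.h>
    #include <stdint.h>

    /* Read two bytes of the I/O permission bitmap so that a port range
     * crossing a byte boundary is still fully covered; a set bit denies
     * access. The caller guarantees the bitmap extends one byte past the
     * last port it describes. */
    static bool io_allowed(const uint8_t *bitmap, uint16_t port, unsigned len)
    {
        uint16_t two = (uint16_t)bitmap[port / 8] |
                       ((uint16_t)bitmap[port / 8 + 1] << 8);
        uint16_t mask = (uint16_t)(((1u << len) - 1) << (port & 7));

        return (two & mask) == 0;  /* every covered bit must be clear */
    }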
* | KVM: emulator: Fix io permission checking for 64bit guest (Gleb Natapov, 2011-03-17)
  The current implementation truncates the upper 32 bits of the TR base
  address during the IO permission bitmap check. The patch fixes this.
  Reported-and-tested-by: Francis Moreau <francis.moro@gmail.com>
  Signed-off-by: Gleb Natapov <gleb@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n (Avi Kivity, 2011-03-17)
  With CONFIG_CC_STACKPROTECTOR, we need a valid %gs at all times, so disable
  lazy reload and do an eager reload immediately after the vmexit.
  Reported-by: IVAN ANGELOV <ivangotoy@gmail.com>
  Acked-By: Joerg Roedel <joerg.roedel@amd.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86: Remove useless regs_page pointer from kvm_lapic (Takuya Yoshikawa, 2011-03-17)
  Access to this page is mostly done through the regs member, which holds the
  address of this page. The exceptions are in vmx_vcpu_reset() and
  kvm_free_lapic(), and these can both easily be converted to using regs.
  Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: remove unused macros (Xiao Guangrong, 2011-03-17)
  These macros are not used, so remove them.
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: cleanup page alloc and free (Xiao Guangrong, 2011-03-17)
  Use __get_free_page instead of alloc_page plus page_address, and free_page
  instead of __free_page plus virt_to_page.
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: do not record gfn in kvm_mmu_pte_write (Xiao Guangrong, 2011-03-17)
  There is no need to record the gfn to verify that the pte has the same mode
  as the current vcpu, because we only speculatively update the pte if the
  pte and the vcpu have the same mode.
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: move mmu pages calculation out of mmu lock (Xiao Guangrong, 2011-03-17)
  kvm_mmu_calculate_mmu_pages needs to walk all memslots and is protected by
  kvm->slots_lock, so move it out of the mmu spinlock.
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: set spte accessed bit properly (Xiao Guangrong, 2011-03-17)
  Set the spte accessed bit only if guest_initiated == 1, which means the gfn
  was really accessed by the guest.
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits (Xiao Guangrong, 2011-03-17)
  Only remove write access in the last-level sptes.
  Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: better readability of efer_reserved_bits (Lai Jiangshan, 2011-03-17)
  Use EFER_SCE, EFER_LME and EFER_LMA instead of magic numbers.
  Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Clear async page fault hash after switching to real mode (Lai Jiangshan, 2011-03-17)
  The hash array of async gfns may still contain some leftover gfns after
  kvm_clear_async_pf_completion_queue() is called; we need to clear them.
  Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
  Acked-by: Gleb Natapov <gleb@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: Initialize vm86 TSS only once. (Gleb Natapov, 2011-03-17)
  Currently the vm86 task is initialized on each real mode entry and vcpu
  reset. Initialization is done by zeroing the TSS and updating relevant
  fields. But since all vcpus are using the same TSS, there is a race where
  one vcpu may use the TSS while another vcpu is initializing it, so the vcpu
  that uses the TSS will see wrong TSS content and will behave incorrectly.
  Fix that by initializing the TSS only once.
  Signed-off-by: Gleb Natapov <gleb@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: update live TR selector if it changes in real mode (Gleb Natapov, 2011-03-17)
  When rmode.vm86 is active, the TR descriptor is updated with vm86 task
  values, but the selector is left intact. vmx_set_segment() makes sure that
  if the TR register is written into while vm86 is active, the new values are
  saved for use after vm86 is deactivated; but since the selector is not
  updated on vm86 activation/deactivation, the new value is lost. Fix this by
  writing the new selector into the vmcs immediately.
  Signed-off-by: Gleb Natapov <gleb@redhat.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: add the __noclone attribute to vmx_vcpu_run (Lai Jiangshan, 2011-03-17)
  The changelog of 104f226 said "adds the __noclone attribute", but it was
  missing from its patch. I think it is still needed.
  Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
  Acked-by: Andi Kleen <ak@linux.intel.com>
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* | KVM: x86: Convert tsc_write_lock to raw_spinlock (Jan Kiszka, 2011-03-17)
  Code under this lock requires non-preemptibility. Ensure this also on -rt
  by converting it to a raw spinlock.
  Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  Signed-off-by: Avi Kivity <avi@redhat.com>