aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm
Commit message (Collapse)AuthorAge
* KVM: Break dependency between vcpu index in vcpus array and vcpu_id.Gleb Natapov2009-09-10
| | | | | | | | | Archs are free to use vcpu_id as they see fit. For x86 it is used as vcpu's apic id. New ioctl is added to configure boot vcpu id that was assumed to be 0 till now. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Use pointer to vcpu instead of vcpu_id in timer code.Gleb Natapov2009-09-10
| | | | | Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Introduce kvm_vcpu_is_bsp() function.Gleb Natapov2009-09-10
| | | | | | | Use it instead of open code "vcpu_id zero is BSP" assumption. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: s/shadow_pte/spte/Avi Kivity2009-09-10
| | | | | | | | | We use shadow_pte and spte inconsistently, switch to the shorter spelling. Rename set_shadow_pte() to __set_spte() to avoid a conflict with the existing set_spte(), and to indicate its lowlevelness. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: Adjust pte accessors to explicitly indicate guest or shadow pteAvi Kivity2009-09-10
| | | | | | | | | Since the guest and host ptes can have wildly different format, adjust the pte accessor names to indicate on which type of pte they operate on. No functional changes. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: Fix is_dirty_pte()Avi Kivity2009-09-10
| | | | | | | | | | is_dirty_pte() is used on guest ptes, not shadow ptes, so it needs to avoid shadow_dirty_mask and use PT_DIRTY_MASK instead. Misdetecting dirty pages could lead to unnecessarily setting the dirty bit under EPT. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Move rmode structure to vmx-specific codeAvi Kivity2009-09-10
| | | | | | rmode is only used in vmx, so move it to vmx.c Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Support Unrestricted Guest featureNitin A Kamble2009-09-10
| | | | | | | | | | | | | | | | | | "Unrestricted Guest" feature is added in the VMX specification. Intel Westmere and onwards processors will support this feature. It allows kvm guests to run real mode and unpaged mode code natively in the VMX mode when EPT is turned on. With the unrestricted guest there is no need to emulate the guest real mode code in the vm86 container or in the emulator. Also the guest big real mode code works like native. The attached patch enhances KVM to use the unrestricted guest feature if available on the processor. It also adds a new kernel/module parameter to disable the unrestricted guest feature at the boot time. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: switch irq injection/acking data structures to irq_lockMarcelo Tosatti2009-09-10
| | | | | | | | | | | | | | | | | | | Protect irq injection/acking data structures with a separate irq_lock mutex. This fixes the following deadlock: CPU A CPU B kvm_vm_ioctl_deassign_dev_irq() mutex_lock(&kvm->lock); worker_thread() -> kvm_deassign_irq() -> kvm_assigned_dev_interrupt_work_handler() -> deassign_host_irq() mutex_lock(&kvm->lock); -> cancel_work_sync() [blocked] [gleb: fix ia64 path] Reported-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Grab pic lock in kvm_pic_clear_isr_ackMarcelo Tosatti2009-09-10
| | | | | | | isr_ack is protected by kvm_pic->lock. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Cleanup LAPIC interfaceJan Kiszka2009-09-10
| | | | | | | | | None of the interface services the LAPIC emulation provides need to be exported to modules, and kvm_lapic_get_base is even totally unused today. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fix reporting of unhandled EPT violationsAvi Kivity2009-09-10
| | | | | | | Instead of returning -ENOTSUPP, exit normally but indicate the hardware exit reason. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Cache pdptrsAvi Kivity2009-09-10
| | | | | | | Instead of reloading the pdptrs on every entry and exit (vmcs writes on vmx, guest memory access on svm) extract them on demand. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Simplify pdptr and cr3 managementAvi Kivity2009-09-10
| | | | | | | | | Instead of reading the PDPTRs from memory after every exit (which is slow and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from memory to the VMCS before entry, and from the VMCS to memory after exit. Do the same for cr3. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Avoid duplicate ept tlb flush when setting cr3Avi Kivity2009-09-10
| | | | | | | vmx_set_cr3() will call vmx_tlb_flush(), which will flush the ept context. So there is no need to call ept_sync_context() explicitly. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: do not register i8254 PIO regions until we are initializedGregory Haskins2009-09-10
| | | | | | | | | | | We currently publish the i8254 resources to the pio_bus before the devices are fully initialized. Since we hold the pit_lock, its probably not a real issue. But lets clean this up anyway. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: cleanup io_device codeGregory Haskins2009-09-10
| | | | | | | | | | | We modernize the io_device code so that we use container_of() instead of dev->private, and move the vtable to a separate ops structure (theoretically allows better caching for multiple instances of the same ops structure) Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Fold kvm_svm.h info svm.cAvi Kivity2009-09-10
| | | | | | kvm_svm.h is only included from svm.c, so fold it in. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: use explicit 64bit storage for sysenter valuesAndre Przywara2009-09-10
| | | | | | | | | | | | Since AMD does not support sysenter in 64bit mode, the VMCB fields storing the MSRs are truncated to 32bit upon VMRUN/#VMEXIT. So store the values in a separate 64bit storage to avoid truncation. [andre: fix amd->amd migration] Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Allow PIT emulation without speaker portJan Kiszka2009-09-10
| | | | | | | | | | | | | The in-kernel speaker emulation is only a dummy and also unneeded from the performance point of view. Rather, it takes user space support to generate sound output on the host, e.g. console beeps. To allow this, introduce KVM_CREATE_PIT2 which controls in-kernel speaker port emulation via a flag passed along the new IOCTL. It also leaves room for future extensions of the PIT configuration interface. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: irqfdGregory Haskins2009-09-10
| | | | | | | | | | | | | | | | KVM provides a complete virtual system environment for guests, including support for injecting interrupts modeled after the real exception/interrupt facilities present on the native platform (such as the IDT on x86). Virtual interrupts can come from a variety of sources (emulated devices, pass-through devices, etc) but all must be injected to the guest via the KVM infrastructure. This patch adds a new mechanism to inject a specific interrupt to a guest using a decoupled eventfd mechnanism: Any legal signal on the irqfd (using eventfd semantics from either userspace or kernel) will translate into an injected interrupt in the guest at the next available interrupt window. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Move common KVM Kconfig items to new file virt/kvm/KconfigAvi Kivity2009-09-10
| | | | | | Reduce Kconfig code duplication. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Drop interrupt shadow when single stepping should be done only on VMXGleb Natapov2009-09-10
| | | | | | | | The problem exists only on VMX. Also currently we skip this step if there is pending exception. The patch fixes this too. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: cleanup arch/x86/kvm/MakefileChristoph Hellwig2009-09-10
| | | | | | | | | Use proper foo-y style list additions to cleanup all the conditionals, move module selection after compound object selection and remove the superflous comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: fix jmp far decoding (opcode 0xea)Avi Kivity2009-09-10
| | | | | | | The jump target should not be sign extened; use an unsigned decode flag. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: Implement zero-extended immediate decodingAvi Kivity2009-09-10
| | | | | | | Absolute jumps use zero extended immediate operands. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: fix cpuid E2BIG handling for extended request typesMark McLoughlin2009-09-10
| | | | | | | | | If we run out of cpuid entries for extended request types we should return -E2BIG, just like we do for the standard request types. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Use MSR names in place of addressJaswinder Singh Rajput2009-09-10
| | | | | | | Replace 0xc0010010 with MSR_K8_SYSCFG and 0xc0010015 with MSR_K7_HWCR. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Add MCE supportHuang Ying2009-09-10
| | | | | | | | | | | | The related MSRs are emulated. MCE capability is exported via extension KVM_CAP_MCE and ioctl KVM_X86_GET_MCE_CAP_SUPPORTED. A new vcpu ioctl command KVM_X86_SETUP_MCE is used to setup MCE emulation such as the mcg_cap. MCE is injected via vcpu ioctl command KVM_X86_SET_MCE. Extended machine-check state (MCG_EXT_P) and CMCI are not implemented. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Replace MSR_IA32_TIME_STAMP_COUNTER with MSR_IA32_TSC of msr-index.hJaswinder Singh Rajput2009-09-10
| | | | | | | | | | Use standard msr-index.h's MSR declaration. MSR_IA32_TSC is better than MSR_IA32_TIME_STAMP_COUNTER as it also solves 80 column issue. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Properly handle software interrupt re-injection in real modeGleb Natapov2009-09-10
| | | | | | | | | | | | | When reinjecting a software interrupt or exception, use the correct instruction length provided by the hardware instead of a hardcoded 1. Fixes problems running the suse 9.1 livecd boot loader. Problem introduced by commit f0a3602c20 ("KVM: Move interrupt injection logic to x86.c"). Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: limit rmap chain lengthMarcelo Tosatti2009-08-06
| | | | | | | | | Otherwise the host can spend too long traversing an rmap chain, which happens under a spinlock. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fix locking imbalance on emulation failureJan Kiszka2009-08-05
| | | | | | | | | | We have to disable preemption and IRQs on every exit from handle_invalid_guest_state, otherwise we generate at least a preempt_disable imbalance. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fix locking order in handle_invalid_guest_stateJan Kiszka2009-08-05
| | | | | | | | | Release and re-acquire preemption and IRQ lock in the same order as vcpu_enter_guest does. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in ↵Marcelo Tosatti2009-08-05
| | | | | | | | | | | | | | kvm_mmu_change_mmu_pages kvm_mmu_change_mmu_pages mishandles the case where n_alloc_mmu_pages is smaller then n_free_mmu_pages, by not checking if the result of the subtraction is negative. Its a valid condition which can happen if a large number of pages has been recently freed. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: force new asid on vcpu migrationMarcelo Tosatti2009-08-05
| | | | | | | | | | | | | | | | If a migrated vcpu matches the asid_generation value of the target pcpu, there will be no TLB flush via TLB_CONTROL_FLUSH_ALL_ASID. The check for vcpu.cpu in pre_svm_run is meaningless since svm_vcpu_load already updated it on schedule in. Such vcpu will VMRUN with stale TLB entries. Based on original patch from Joerg Roedel (http://patchwork.kernel.org/patch/10021/) Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: verify MTRR/PAT validityMarcelo Tosatti2009-08-05
| | | | | | | | Do not allow invalid memory types in MTRR/PAT (generating a #GP otherwise). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: PIT: fix kpit_elapsed division by zeroMarcelo Tosatti2009-08-05
| | | | | | | | | Fix division by zero triggered by latch count command on uninitialized counter. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix KVM_GET_MSR_INDEX_LISTJan Kiszka2009-08-05
| | | | | | | | | | | | So far, KVM copied the emulated_msrs (only MSR_IA32_MISC_ENABLE) to a wrong address in user space due to broken pointer arithmetic. This caused subtle corruption up there (missing MSR_IA32_MISC_ENABLE had probably no practical relevance). Moreover, the size check for the user-provided kvm_msr_list forgot about emulated MSRs. Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: shut up uninit compiler warning in paging_tmpl.hJaswinder Singh Rajput2009-06-28
| | | | | | | | | | | | | | | Dixes compilation warning: CC arch/x86/kernel/io_delay.o arch/x86/kvm/paging_tmpl.h: In function ‘paging64_fetch’: arch/x86/kvm/paging_tmpl.h:279: warning: ‘sptep’ may be used uninitialized in this function arch/x86/kvm/paging_tmpl.h: In function ‘paging32_fetch’: arch/x86/kvm/paging_tmpl.h:279: warning: ‘sptep’ may be used uninitialized in this function warning is bogus (always have a least one level), but need to shut the compiler up. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Ignore reads to K7 EVNTSEL MSRsAmit Shah2009-06-28
| | | | | | | | | | | | | In commit 7fe29e0faacb650d31b9e9f538203a157bec821d we ignored the reads to the P6 EVNTSEL MSRs. That fixed crashes on Intel machines. Ignore the reads to K7 EVNTSEL MSRs as well to fix this on AMD hosts. This fixes Kaspersky antivirus crashing Windows guests on AMD hosts. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Handle vmx instruction vmexitsAvi Kivity2009-06-28
| | | | | | | | | | IF a guest tries to use vmx instructions, inject a #UD to let it know the instruction is not implemented, rather than crashing. This prevents guest userspace from crashing the guest kernel. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: kvm/x86_emulate.c toggle_interruptibility() should be staticJaswinder Singh Rajput2009-06-28
| | | | | | | | | | | toggle_interruptibility() is used only by same file, it should be static. Fixed following sparse warning : arch/x86/kvm/x86_emulate.c:1364:6: warning: symbol 'toggle_interruptibility' was not declared. Should it be static? Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: Allow 4K ptes with bit 7 (PAT) setAvi Kivity2009-06-28
| | | | | | Bit 7 is perfectly legal in the 4K page leve; it is used for the PAT. Signed-off-by: Avi Kivity <avi@redhat.com>
* page allocator: do not check NUMA node ID when the caller knows the node is ↵Mel Gorman2009-06-16
| | | | | | | | | | | | | | | | | | | | | | | valid Callers of alloc_pages_node() can optionally specify -1 as a node to mean "allocate from the current node". However, a number of the callers in fast paths know for a fact their node is valid. To avoid a comparison and branch, this patch adds alloc_pages_exact_node() that only checks the nid with VM_BUG_ON(). Callers that know their node is valid are then converted. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Paul Mundt <lethal@linux-sh.org> [for the SLOB NUMA bits] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* KVM: Add VT-x machine check supportAndi Kleen2009-06-10
| | | | | | | | | | | | | | | | | | | VT-x needs an explicit MC vector intercept to handle machine checks in the hyper visor. It also has a special option to catch machine checks that happen during VT entry. Do these interceptions and forward them to the Linux machine check handler. Make it always look like user space is interrupted because the machine check handler treats kernel/user space differently. Thanks to Jiang Yunhong for help and testing. Cc: stable@kernel.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Rename rmode.active to rmode.vm86_activeNitin A Kamble2009-06-10
| | | | | | | | That way the interpretation of rmode.active becomes more clear with unrestricted guest code. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Move "exit due to NMI" handling into vmx_complete_interrupts()Gleb Natapov2009-06-10
| | | | | | | To save us one reading of VM_EXIT_INTR_INFO. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Disable CR8 intercept if tpr patching is activeGleb Natapov2009-06-10
| | | | | Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Do not migrate pending software interrupts.Gleb Natapov2009-06-10
| | | | | | | | | | | INTn will be re-executed after migration. If we wanted to migrate pending software interrupt we would need to migrate interrupt type and instruction length too, but we do not have all required info on SVM, so SVM->VMX migration would need to re-execute INTn anyway. To make it simple never migrate pending soft interrupt. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>