aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm/vmx.c
Commit message (Collapse)AuthorAge
* KVM: VMX: Fix locking imbalance on emulation failureJan Kiszka2009-08-05
| | | | | | | | | | We have to disable preemption and IRQs on every exit from handle_invalid_guest_state, otherwise we generate at least a preempt_disable imbalance. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fix locking order in handle_invalid_guest_stateJan Kiszka2009-08-05
| | | | | | | | | Release and re-acquire preemption and IRQ lock in the same order as vcpu_enter_guest does. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Handle vmx instruction vmexitsAvi Kivity2009-06-28
| | | | | | | | | | IF a guest tries to use vmx instructions, inject a #UD to let it know the instruction is not implemented, rather than crashing. This prevents guest userspace from crashing the guest kernel. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
* page allocator: do not check NUMA node ID when the caller knows the node is ↵Mel Gorman2009-06-16
| | | | | | | | | | | | | | | | | | | | | | | valid Callers of alloc_pages_node() can optionally specify -1 as a node to mean "allocate from the current node". However, a number of the callers in fast paths know for a fact their node is valid. To avoid a comparison and branch, this patch adds alloc_pages_exact_node() that only checks the nid with VM_BUG_ON(). Callers that know their node is valid are then converted. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Paul Mundt <lethal@linux-sh.org> [for the SLOB NUMA bits] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* KVM: Add VT-x machine check supportAndi Kleen2009-06-10
| | | | | | | | | | | | | | | | | | | VT-x needs an explicit MC vector intercept to handle machine checks in the hyper visor. It also has a special option to catch machine checks that happen during VT entry. Do these interceptions and forward them to the Linux machine check handler. Make it always look like user space is interrupted because the machine check handler treats kernel/user space differently. Thanks to Jiang Yunhong for help and testing. Cc: stable@kernel.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Rename rmode.active to rmode.vm86_activeNitin A Kamble2009-06-10
| | | | | | | | That way the interpretation of rmode.active becomes more clear with unrestricted guest code. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Move "exit due to NMI" handling into vmx_complete_interrupts()Gleb Natapov2009-06-10
| | | | | | | To save us one reading of VM_EXIT_INTR_INFO. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Do not re-execute INTn instruction.Gleb Natapov2009-06-10
| | | | | | | | Re-inject event instead. This is what Intel suggest. Also use correct instruction length when re-injecting soft fault/interrupt. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Unprotect a page if #PF happens during NMI injection.Gleb Natapov2009-06-10
| | | | | | | It is done for exception and interrupt already. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Replace ->drop_interrupt_shadow() by ->set_interrupt_shadow()Glauber Costa2009-06-10
| | | | | | | | | | | | This patch replaces drop_interrupt_shadow with the more general set_interrupt_shadow, that can either drop or raise it, depending on its parameter. It also adds ->get_interrupt_shadow() for future use. Signed-off-by: Glauber Costa <glommer@redhat.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Enable snooping control for supported hardwareSheng Yang2009-06-10
| | | | | | | | | | | | | | | | Memory aliases with different memory type is a problem for guest. For the guest without assigned device, the memory type of guest memory would always been the same as host(WB); but for the assigned device, some part of memory may be used as DMA and then set to uncacheable memory type(UC/WC), which would be a conflict of host memory type then be a potential issue. Snooping control can guarantee the cache correctness of memory go through the DMA engine of VT-d. [avi: fix build on ia64] Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Replace get_mt_mask_shift with get_mt_maskSheng Yang2009-06-10
| | | | | | | | | | Shadow_mt_mask is out of date, now it have only been used as a flag to indicate if TDP enabled. Get rid of it and use tdp_enabled instead. Also put memory type logical in kvm_x86_ops->get_mt_mask(). Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Get rid of get_irq() callbackGleb Natapov2009-06-10
| | | | | | | | | It just returns pending IRQ vector from the queue for VMX/SVM. Get IRQ directly from the queue before migration and put it back after. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Add NMI injection supportGleb Natapov2009-06-10
| | | | | Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Get rid of arch.interrupt_window_open & arch.nmi_window_openGleb Natapov2009-06-10
| | | | | | | They are recalculated before each use anyway. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Do not report TPR write to userspace if new value bigger or equal to a ↵Gleb Natapov2009-06-10
| | | | | | | | | previous one. Saves many exits to userspace in a case of IRQ chip in userspace. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Remove inject_pending_vectors() callbackGleb Natapov2009-06-10
| | | | | | | It is the same as inject_pending_irq() for VMX/SVM now. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Remove exception_injected() callback.Gleb Natapov2009-06-10
| | | | | | | It always return false for VMX/SVM now. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Cleanup vmx_intr_assist()Gleb Natapov2009-06-10
| | | | | Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Consolidate userspace and kernel interrupt injection for VMXGleb Natapov2009-06-10
| | | | | | | | Use the same callback to inject irq/nmi events no matter what irqchip is in use. Only from VMX for now. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Make kvm_cpu_(has|get)_interrupt() work for userspace irqchip tooGleb Natapov2009-06-10
| | | | | | | At the vector level, kernel and userspace irqchip are fairly similar. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix unneeded instruction skipping during task switching.Gleb Natapov2009-06-10
| | | | | | | | | | There is no need to skip instruction if the reason for a task switch is a task gate in IDT and access to it is caused by an external even. The problem is currently solved only for VMX since there is no reliable way to skip an instruction in SVM. We should emulate it instead. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Do not zero idt_vectoring_info in vmx_complete_interrupts().Gleb Natapov2009-06-10
| | | | | | | | | | We will need it later in task_switch(). Code in handle_exception() is dead. is_external_interrupt(vect_info) will always be false since idt_vectoring_info is zeroed in vmx_complete_interrupts(). Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Rewrite vmx_complete_interrupt()'s twisted maze of if() statementsGleb Natapov2009-06-10
| | | | | | | | | | | ...with a more straightforward switch(). Also fix a bug when NMI could be dropped on exit. Although this should never happen in practice, since NMIs can only be injected, never triggered internally by the guest like exceptions. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fix handling of a fault during NMI unblocked due to IRETGleb Natapov2009-06-10
| | | | | | | | | Bit 12 is undefined in any of the following cases: If the VM exit sets the valid bit in the IDT-vectoring information field. If the VM exit is due to a double fault. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fix feature testingSheng Yang2009-06-10
| | | | | | | The testing of feature is too early now, before vmcs_config complete initialization. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Clean up Flex Priority relatedSheng Yang2009-06-10
| | | | | | | And clean paranthes on returns. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Correct wrong vmcs field sizesSheng Yang2009-06-10
| | | | | | | EXIT_QUALIFICATION and GUEST_LINEAR_ADDRESS are natural width, not 64-bit. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Make flexpriority module parameter reflect hardware capabilityAvi Kivity2009-06-10
| | | | | | If the hardware does not support flexpriority, zero the module parameter. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fix interrupt unhalting a vcpu when it shouldn'tGleb Natapov2009-06-10
| | | | | | | | kvm_vcpu_block() unhalts vpu on an interrupt/timer without checking if interrupt window is actually opened. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fold vm_need_ept() into callersAvi Kivity2009-06-10
| | | | | | Trivial. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Zero ept module parameter if ept is not presentAvi Kivity2009-06-10
| | | | | | Allows reading back hardware capability. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Zero the vpid module parameter if vpid is not supportedAvi Kivity2009-06-10
| | | | | | This allows reading back how the hardware is configured. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Annotate module parameters as __read_mostlyAvi Kivity2009-06-10
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Simplify module parameter namesAvi Kivity2009-06-10
| | | | | | Instead of 'enable_vpid=1', use a simple 'vpid=1'. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Rename kvm_handle_exit() to vmx_handle_exit()Avi Kivity2009-06-10
| | | | | | It is a static vmx-specific function. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Make module parameters readableAvi Kivity2009-06-10
| | | | | | Useful to see how the module was loaded. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: reuse (pop|push)_irq from svm.c in vmx.cGleb Natapov2009-06-10
| | | | | | | | The prioritized bit vector manipulation functions are useful in both vmx and svm. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Don't intercept MSR_KERNEL_GS_BASEAvi Kivity2009-06-10
| | | | | | | | | | | | | Windows 2008 accesses this MSR often on context switch intensive workloads; since we run in guest context with the guest MSR value loaded (so swapgs can work correctly), we can simply disable interception of rdmsr/wrmsr for this MSR. A complication occurs since in legacy mode, we run with the host MSR value loaded. In this case we enable interception. This means we need two MSR bitmaps, one for legacy mode and one for long mode. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Don't use highmem pages for the msr and pio bitmapsAvi Kivity2009-06-10
| | | | | | | Highmem pages are a pain, and saving three lowmem pages on i386 isn't worth the extra code. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Don't allow uninhibited access to EFER on i386Avi Kivity2009-03-24
| | | | | | | | | | | | | vmx_set_msr() does not allow i386 guests to touch EFER, but they can still do so through the default: label in the switch. If they set EFER_LME, they can oops the host. Fix by having EFER access through the normal channel (which will check for EFER_LME) even on i386. Reported-and-tested-by: Benjamin Gilbert <bgilbert@cs.cmu.edu> Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Update necessary state when guest enters long modeAmit Shah2009-03-24
| | | | | | | | | | | | | | | | | setup_msrs() should be called when entering long mode to save the shadow state for the 64-bit guest state. Using vmx_set_efer() in enter_lmode() removes some duplicated code and also ensures we call setup_msrs(). We can safely pass the value of shadow_efer to vmx_set_efer() as no other bits in the efer change while enabling long mode (guest first sets EFER.LME, then sets CR0.PG which causes a vmexit where we activate long mode). With this fix, is_long_mode() can check for EFER.LMA set instead of EFER.LME and 5e23049e86dd298b72e206b420513dbc3a240cd9 can be reverted. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Use kvm_mmu_page_fault() handle EPT violation mmioSheng Yang2009-03-24
| | | | | | | Removed duplicated code. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Drop unused evaluations from string pio handlersJan Kiszka2009-03-24
| | | | | | | | Looks like neither the direction nor the rep prefix are used anymore. Drop related evaluations from SVM's and VMX's I/O exit handlers. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: When emulating on invalid vmx state, don't return to userspace ↵Avi Kivity2009-03-24
| | | | | | | | | unnecessarily If we aren't doing mmio there's no need to exit to userspace (which will just be confused). Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Prevent exit handler from running if emulating due to invalid stateAvi Kivity2009-03-24
| | | | | | | | | | | If we've just emulated an instruction, we won't have any valid exit reason and associated information. Fix by moving the clearing of the emulation_required flag to the exit handler. This way the exit handler can notice that we've been emulating and abort early. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: don't clobber segment AR if emulating invalid stateAvi Kivity2009-03-24
| | | | | | | The ususable bit is important for determining state validity; don't clobber it. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Fix guest state validity checksAvi Kivity2009-03-24
| | | | | | | The vmx guest state validity checks are full of bugs. Make them conform to the manual. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: initialize TSC offset relative to vm creation timeMarcelo Tosatti2009-03-24
| | | | | | | | | | | | | | | | | | | | | VMX initializes the TSC offset for each vcpu at different times, and also reinitializes it for vcpus other than 0 on APIC SIPI message. This bug causes the TSC's to appear unsynchronized in the guest, even if the host is good. Older Linux kernels don't handle the situation very well, so gettimeofday is likely to go backwards in time: http://www.mail-archive.com/kvm@vger.kernel.org/msg02955.html http://sourceforge.net/tracker/index.php?func=detail&aid=2025534&group_id=180599&atid=893831 Fix it by initializating the offset of each vcpu relative to vm creation time, and moving it from vmx_vcpu_reset to vmx_vcpu_setup, out of the APIC MP init path. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: Wire-up hardware breakpoints for guest debuggingJan Kiszka2009-03-24
| | | | | | | | | Add the remaining bits to make use of debug registers also for guest debugging, thus enabling the use of hardware breakpoints and watchpoints. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>