aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAge
...
| * | | | | | | | | | | | kvm: KVM_CAP_IOMMU only available with device assignmentAlex Williamson2013-04-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix build with CONFIG_PCI unset by linking KVM_CAP_IOMMU to device assignment config option. It has no purpose otherwise. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Skip PF interception check when queuing during nested runJan Kiszka2013-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While a nested run is pending, vmx_queue_exception is only called to requeue exceptions that were previously picked up via vmx_cancel_injection. Therefore, we must not check for PF interception by L1, possibly causing a bogus nested vmexit. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: x86: Increase the "hard" max VCPU limitChegu Vinod2013-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM guests today use 8bit APIC ids allowing for 256 ID's. Reserving one ID for Broadcast interrupts should leave 255 ID's. In case of KVM there is no need for reserving another ID for IO-APIC so the hard max limit for VCPUS can be increased from 254 to 255. (This was confirmed by Gleb Natapov http://article.gmane.org/gmane.comp.emulators.kvm.devel/99713 ) Signed-off-by: Chegu Vinod <chegu_vinod@hp.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | kvm: Allow build-time configuration of KVM device assignmentAlex Williamson2013-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We hope to at some point deprecate KVM legacy device assignment in favor of VFIO-based assignment. Towards that end, allow legacy device assignment to be deconfigured. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | Merge git://github.com/agraf/linux-2.6.git kvm-ppc-next into queueGleb Natapov2013-04-28
| |\ \ \ \ \ \ \ \ \ \ \ \
| | * | | | | | | | | | | | KVM: Move irqfd resample cap handling to generic codeAlexander Graf2013-04-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have most irqfd code completely platform agnostic, let's move irqfd's resample capability return to generic code as well. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | * | | | | | | | | | | | KVM: Extract generic irqchip logic into irqchip.cAlexander Graf2013-04-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current irq_comm.c file contains pieces of code that are generic across different irqchip implementations, as well as code that is fully IOAPIC specific. Split the generic bits out into irqchip.c. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | * | | | | | | | | | | | KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTINGAlexander Graf2013-04-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quite a bit of code in KVM has been conditionalized on availability of IOAPIC emulation. However, most of it is generically applicable to platforms that don't have an IOPIC, but a different type of irq chip. Make code that only relies on IRQ routing, not an APIC itself, on CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | * | | | | | | | | | | | KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINSAlexander Graf2013-04-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The concept of routing interrupt lines to an irqchip is nothing that is IOAPIC specific. Every irqchip has a maximum number of pins that can be linked to irq lines. So let's add a new define that allows us to reuse generic code for non-IOAPIC platforms. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
| * | | | | | | | | | | | | KVM: x86: Rework request for immediate exitJan Kiszka2013-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The VMX implementation of enable_irq_window raised KVM_REQ_IMMEDIATE_EXIT after we checked it in vcpu_enter_guest. This caused infinite loops on vmentry. Fix it by letting enable_irq_window signal the need for an immediate exit via its return value and drop KVM_REQ_IMMEDIATE_EXIT. This issue only affects nested VMX scenarios. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | | kvm, svm: Fix typo in printk messageBorislav Petkov2013-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is "exit_int_info". It is actually EXITINTINFO in the official docs but we don't like screaming docs. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | | KVM: VMX: remove unprintable characters from commentJan Kiszka2013-04-28
| |/ / / / / / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Slipped in while copy&pasting from the SDM. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: X86 emulator: fix source operand decoding for 8bit mov[zs]x instructionsGleb Natapov2013-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Source operand for one byte mov[zs]x is decoded incorrectly if it is in high byte register. Fix that. Cc: stable@vger.kernel.org Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: VM_ENTRY/EXIT_LOAD_IA32_EFER overrides EFER.LMA settingsJan Kiszka2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we load the complete EFER MSR on entry or exit, EFER.LMA (and LME) loading is skipped. Their consistency is already checked now before starting the transition. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Validate EFER values for VM_ENTRY/EXIT_LOAD_IA32_EFERJan Kiszka2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we may emulate the loading of EFER on VM-entry and VM-exit, implement the checks that VMX performs on the guest and host values on vmlaunch/ vmresume. Factor out kvm_valid_efer for this purpose which checks for set reserved bits. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Fix conditions for NMI injectionJan Kiszka2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The logic for checking if interrupts can be injected has to be applied also on NMIs. The difference is that if NMI interception is on these events are consumed and blocked by the VM exit. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: VMX: Move vmx_nmi_allowed after vmx_set_nmi_maskJan Kiszka2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vmx_set_nmi_mask will soon be used by vmx_nmi_allowed. No functional changes. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: x86: Fix memory leak in vmx.cAndrew Honig2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If userspace creates and destroys multiple VMs within the same process we leak 20k of memory in the userspace process context per VM. This patch frees the memory in kvm_arch_destroy_vm. If the process exits without closing the VM file descriptor or the file descriptor has been shared with another process then we don't free the memory. It's still possible for a user space process to leak memory if the last process to close the fd for the VM is not the process that created it. However, this is an unexpected case that's only caused by a user space process that's misbehaving. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: x86: fix error return code in kvm_arch_vcpu_init()Wei Yongjun2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix to return a negative error code from the error handling case instead of 0, as returned elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Enable and disable shadow vmcs functionalityAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once L1 loads VMCS12 we enable shadow-vmcs capability and copy all the VMCS12 shadowed fields to the shadow vmcs. When we release the VMCS12, we also disable shadow-vmcs capability. Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Synchronize VMCS12 content with the shadow vmcsAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Synchronize between the VMCS12 software controlled structure and the processor-specific shadow vmcs Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Copy VMCS12 to processor-specific shadow vmcsAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a function used to copy fields from the software controlled VMCS12 to the processor-specific shadow vmcs Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Copy processor-specific shadow-vmcs to VMCS12Abel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a function used to copy fields from the processor-specific shadow vmcs to the software controlled VMCS12 Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Release shadow vmcsAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unmap vmcs12 and release the corresponding shadow vmcs Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Allocate shadow vmcsAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocate a shadow vmcs used by the processor to shadow part of the fields stored in the software defined VMCS12 (let L1 access fields without causing exits). Note we keep a shadow vmcs only for the current vmcs12. Once a vmcs12 becomes non-current, its shadow vmcs is released. Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Fix VMXON emulationAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | handle_vmon doesn't check if L1 is already in root mode (VMXON was previously called). This patch adds this missing check and calls nested_vmx_failValid if VMX is already ON. We need this check because L0 will allocate the shadow vmcs when L1 executes VMXON and we want to avoid host leaks (due to shadow vmcs allocation) if L1 executes VMXON repeatedly. Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Refactor handle_vmwriteAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor existent code so we re-use vmcs12_write_any to copy fields from the shadow vmcs specified by the link pointer (used by the processor, implementation-specific) to the VMCS12 software format used by L0 to hold the fields in L1 memory address space. Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Introduce vmread and vmwrite bitmapsAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare vmread and vmwrite bitmaps according to a pre-specified list of fields. These lists are intended to specifiy most frequent accessed fields so we can minimize the number of fields that are copied from/to the software controlled VMCS12 format to/from to processor-specific shadow vmcs. The lists were built measuring the VMCS fields access rate after L2 Ubuntu 12.04 booted when it was running on top of L1 KVM, also Ubuntu 12.04. Note that during boot there were additional fields which were frequently modified but they were not added to these lists because after boot these fields were not longer accessed by L1. Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Detect shadow-vmcs capabilityAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add logic required to detect if shadow-vmcs is supported by the processor. Introduce a new kernel module parameter to specify if L0 should use shadow vmcs (or not) to run L1. Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Shadow-vmcs control fields/bitsAbel Gordon2013-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add definitions for all the vmcs control fields/bits required to enable vmcs-shadowing Signed-off-by: Abel Gordon <abelg@il.ibm.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: x86: Fix posted interrupt with CONFIG_SMP=nZhang, Yang Z2013-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ->send_IPI_mask is not defined on UP. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: VMX: Fix check guest state validity if a guest is in VM86 modeGleb Natapov2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If guest vcpu is in VM86 mode the vcpu state should be checked as if in real mode. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: check vmcs12 for valid activity statePaolo Bonzini2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM does not use the activity state VMCS field, and does not support it in nested VMX either (the corresponding bits in the misc VMX feature MSR are zero). Fail entry if the activity state is set to anything but "active". Since the value will always be the same for L1 and L2, we do not need to read and write the corresponding VMCS field on L1/L2 transitions, either. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: VMX: Use posted interrupt to deliver virtual interruptYang Zhang2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If posted interrupt is avaliable, then uses it to inject virtual interrupt to guest. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: VMX: Add the deliver posted interrupt algorithmYang Zhang2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only deliver the posted interrupt when target vcpu is running and there is no previous interrupt pending in pir. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: Set TMR when programming ioapic entryYang Zhang2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already know the trigger mode of a given interrupt when programming the ioapice entry. So it's not necessary to set it in each interrupt delivery. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: Call common update function when ioapic entry changed.Yang Zhang2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both TMR and EOI exit bitmap need to be updated when ioapic changed or vcpu's id/ldr/dfr changed. So use common function instead eoi exit bitmap specific function. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: VMX: Check the posted interrupt capabilityYang Zhang2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Detect the posted interrupt feature. If it exists, then set it in vmcs_config. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: VMX: Register a new IPI for posted interruptYang Zhang2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Posted Interrupt feature requires a special IPI to deliver posted interrupt to guest. And it should has a high priority so the interrupt will not be blocked by others. Normally, the posted interrupt will be consumed by vcpu if target vcpu is running and transparent to OS. But in some cases, the interrupt will arrive when target vcpu is scheduled out. And host will see it. So we need to register a dump handler to handle it. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: VMX: Enable acknowledge interupt on vmexitYang Zhang2013-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "acknowledge interrupt on exit" feature controls processor behavior for external interrupt acknowledgement. When this control is set, the processor acknowledges the interrupt controller to acquire the interrupt vector on VM exit. After enabling this feature, an interrupt which arrived when target cpu is running in vmx non-root mode will be handled by vmx handler instead of handler in idt. Currently, vmx handler only fakes an interrupt stack and jump to idt table to let real handler to handle it. Further, we will recognize the interrupt and only delivery the interrupt which not belong to current vcpu through idt table. The interrupt which belonged to current vcpu will be handled inside vmx handler. This will reduce the interrupt handle cost of KVM. Also, interrupt enable logic is changed if this feature is turnning on: Before this patch, hypervior call local_irq_enable() to enable it directly. Now IF bit is set on interrupt stack frame, and will be enabled on a return from interrupt handler if exterrupt interrupt exists. If no external interrupt, still call local_irq_enable() to enable it. Refer to Intel SDM volum 3, chapter 33.2. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: Let ioapic know the irq line statusYang Zhang2013-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userspace may deliver RTC interrupt without query the status. So we want to track RTC EOI for this case. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: Add reset/restore rtc_status supportYang Zhang2013-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | restore rtc_status from migration or save/restore Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: Return destination vcpu on interrupt injectionYang Zhang2013-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new parameter to know vcpus who received the interrupt. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: Add vcpu info to ioapic_update_eoi()Yang Zhang2013-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add vcpu info to ioapic_update_eoi, so we can know which vcpu issued this EOI. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Avoid reading VM_EXIT_INTR_ERROR_CODE needlessly on nested exitsJan Kiszka2013-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We only need to update vm_exit_intr_error_code if there is a valid exit interruption information and it comes with a valid error code. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Fix conditions for interrupt injectionJan Kiszka2013-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we are entering guest mode, we do not want L0 to interrupt this vmentry with all its side effects on the vmcs. Therefore, injection shall be disallowed during L1->L2 transitions, as in the previous version. However, this check is conceptually independent of nested_exit_on_intr, so decouple it. If L1 traps external interrupts, we can kick the guest from L2 to L1, also just like the previous code worked. But we no longer need to consider L1's idt_vectoring_info_field. It will always be empty at this point. Instead, if L2 has pending events, those are now found in the architectural queues and will, thus, prevent vmx_interrupt_allowed from being called at all. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Rework event injection and recoveryJan Kiszka2013-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The basic idea is to always transfer the pending event injection on vmexit into the architectural state of the VCPU and then drop it from there if it turns out that we left L2 to enter L1, i.e. if we enter prepare_vmcs12. vmcs12_save_pending_events takes care to transfer pending L0 events into the queue of L1. That is mandatory as L1 may decide to switch the guest state completely, invalidating or preserving the pending events for later injection (including on a different node, once we support migration). This concept is based on the rule that a pending vmlaunch/vmresume is not canceled. Otherwise, we would risk to lose injected events or leak them into the wrong queues. Encode this rule via a WARN_ON_ONCE at the entry of nested_vmx_vmexit. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1Jan Kiszka2013-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check if the interrupt or NMI window exit is for L1 by testing if it has the corresponding controls enabled. This is required when we allow direct injection from L0 to L2 Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: emulator: mark 0xff 0x7d opcode as undefined.Gleb Natapov2013-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Gleb Natapov <gleb@redhat.com>
| * | | | | | | | | | | | KVM: emulator: Do not fail on emulation of undefined opcodeGleb Natapov2013-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Emulation of undefined opcode should inject #UD instead of causing emulation failure. Do that by moving Undefined flag check to emulation stage and injection #UD there. Signed-off-by: Gleb Natapov <gleb@redhat.com>