Commit message (Author, Age)
* Merge branch 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm (Linus Torvalds, 2009-06-11)
|\
| |   * 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (138 commits)
| |     KVM: Prevent overflow in largepages calculation
| |     KVM: Disable large pages on misaligned memory slots
| |     KVM: Add VT-x machine check support
| |     KVM: VMX: Rename rmode.active to rmode.vm86_active
| |     KVM: Move "exit due to NMI" handling into vmx_complete_interrupts()
| |     KVM: Disable CR8 intercept if tpr patching is active
| |     KVM: Do not migrate pending software interrupts.
| |     KVM: inject NMI after IRET from a previous NMI, not before.
| |     KVM: Always request IRQ/NMI window if an interrupt is pending
| |     KVM: Do not re-execute INTn instruction.
| |     KVM: skip_emulated_instruction() decode instruction if size is not known
| |     KVM: Remove irq_pending bitmap
| |     KVM: Do not allow interrupt injection from userspace if there is a pending event.
| |     KVM: Unprotect a page if #PF happens during NMI injection.
| |     KVM: s390: Verify memory in kvm run
| |     KVM: s390: Sanity check on validity intercept
| |     KVM: s390: Unlink vcpu on destroy - v2
| |     KVM: s390: optimize float int lock: spin_lock_bh --> spin_lock
| |     KVM: s390: use hrtimer for clock wakeup from idle - v2
| |     KVM: s390: Fix memory slot versus run - v3
| |     ...
| * KVM: Prevent overflow in largepages calculation (Avi Kivity, 2009-06-10)
| |   If userspace specifies a memory slot that is larger than 8 petabytes, it could overflow the largepages variable.
| |
| |   Cc: stable@kernel.org
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
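The arithmetic behind the 8 petabyte figure can be sketched as follows. This is only an illustration, not the kernel code: it assumes 4 KiB base pages, 2 MiB large pages (512 base pages each) and a 32-bit counter, none of which the commit message states. The function names are hypothetical.

```c
#include <stdint.h>

#define PAGES_PER_HPAGE 512ULL   /* assumed: 2 MiB large page / 4 KiB base page */

/* Large-page count held in a 64-bit variable: no truncation. */
static uint64_t largepages_wide(uint64_t npages)
{
    return npages / PAGES_PER_HPAGE;
}

/* The same count stored in a 32-bit variable (assumed width): the
 * upper bits are silently dropped once the count reaches 2^32. */
static uint32_t largepages_narrow(uint64_t npages)
{
    return (uint32_t)(npages / PAGES_PER_HPAGE);
}
```

Under these assumptions an 8 PiB slot has 2^41 base pages and therefore 2^32 large pages, which is the first count that no longer fits in 32 bits.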
| * KVM: Disable large pages on misaligned memory slots (Avi Kivity, 2009-06-10)
| |   If a slot's guest physical address and host virtual address are unequal (mod large page size), we would erroneously try to back guest large pages with host large pages. Detect this misalignment and disable large page support for the troublesome slot.
| |
| |   Cc: stable@kernel.org
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
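A minimal sketch of the "unequal (mod large page size)" test, assuming 4 KiB base pages and 2 MiB large pages. The function and parameter names are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT      12       /* assumed: 4 KiB base pages */
#define PAGES_PER_HPAGE 512ULL   /* assumed: 2 MiB large pages */

/* A slot can be backed by host large pages only if its guest physical
 * address and host virtual address have the same offset within a
 * large page, i.e. their frame numbers agree in the low bits. */
static bool slot_can_use_large_pages(uint64_t guest_phys_addr,
                                     uint64_t host_virt_addr)
{
    uint64_t gfn = guest_phys_addr >> PAGE_SHIFT;
    uint64_t hfn = host_virt_addr >> PAGE_SHIFT;

    /* XOR leaves a set bit wherever the two frame numbers differ. */
    return ((gfn ^ hfn) & (PAGES_PER_HPAGE - 1)) == 0;
}
```

Note that the two addresses do not need to be large-page aligned themselves; they only need to be congruent modulo the large page size.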
| * KVM: Add VT-x machine check support (Andi Kleen, 2009-06-10)
| |   VT-x needs an explicit MC vector intercept to handle machine checks in the hypervisor.
| |
| |   It also has a special option to catch machine checks that happen during VT entry.
| |
| |   Do these interceptions and forward them to the Linux machine check handler. Make it always look like user space is interrupted because the machine check handler treats kernel/user space differently.
| |
| |   Thanks to Jiang Yunhong for help and testing.
| |
| |   Cc: stable@kernel.org
| |   Signed-off-by: Andi Kleen <ak@linux.intel.com>
| |   Signed-off-by: Huang Ying <ying.huang@intel.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: VMX: Rename rmode.active to rmode.vm86_active (Nitin A Kamble, 2009-06-10)
| |   That way the interpretation of rmode.active becomes more clear with unrestricted guest code.
| |
| |   Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Move "exit due to NMI" handling into vmx_complete_interrupts() (Gleb Natapov, 2009-06-10)
| |   To save us one reading of VM_EXIT_INTR_INFO.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Disable CR8 intercept if tpr patching is active (Gleb Natapov, 2009-06-10)
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Do not migrate pending software interrupts. (Gleb Natapov, 2009-06-10)
| |   INTn will be re-executed after migration. If we wanted to migrate a pending software interrupt we would need to migrate the interrupt type and instruction length too, but we do not have all the required info on SVM, so SVM->VMX migration would need to re-execute INTn anyway. To keep it simple, never migrate a pending soft interrupt.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: inject NMI after IRET from a previous NMI, not before. (Gleb Natapov, 2009-06-10)
| |   If an NMI is received during handling of another NMI, it should be injected immediately after IRET from the previous NMI handler. But SVM intercepts IRET before instruction execution, so we can't inject the pending NMI at this point, and there is no way to request an exit when the NMI window opens. This patch fixes the SVM code to open the NMI window after IRET by single stepping over the IRET instruction.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Always request IRQ/NMI window if an interrupt is pending (Gleb Natapov, 2009-06-10)
| |   Currently they are not requested if there is a pending exception.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Do not re-execute INTn instruction. (Gleb Natapov, 2009-06-10)
| |   Re-inject the event instead. This is what Intel suggests. Also use the correct instruction length when re-injecting a soft fault/interrupt.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: skip_emulated_instruction() decode instruction if size is not known (Gleb Natapov, 2009-06-10)
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Remove irq_pending bitmap (Gleb Natapov, 2009-06-10)
| |   Only one interrupt vector can be injected from a userspace irqchip at any given time, so there is no need to store it in a bitmap. Put it into the interrupt queue directly.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Do not allow interrupt injection from userspace if there is a pending event. (Gleb Natapov, 2009-06-10)
| |   The exception will immediately close the interrupt window.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Unprotect a page if #PF happens during NMI injection. (Gleb Natapov, 2009-06-10)
| |   It is already done for exceptions and interrupts.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: s390: Verify memory in kvm run (Carsten Otte, 2009-06-10)
| |   This check verifies that the guest we're trying to run in KVM_RUN has some memory assigned to it. Without memory, the guest enters an endless exception loop.
| |
| |   Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
| |   Signed-off-by: Carsten Otte <cotte@de.ibm.com>
| |   Signed-off-by: Christian Ehrhardt <ehrhardt@de.ibm.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: s390: Sanity check on validity intercept (Carsten Otte, 2009-06-10)
| |   This patch adds a sanity check for the content of the guest prefix register before faulting in the cpu lowcore that it refers to. Without this fix, the guest might end up in an endless loop where SIE complains about a missing lowcore because the prefix register content is incorrect.
| |
| |   Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
| |   Signed-off-by: Carsten Otte <cotte@de.ibm.com>
| |   Signed-off-by: Christian Ehrhardt <ehrhardt@de.ibm.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: s390: Unlink vcpu on destroy - v2 (Carsten Otte, 2009-06-10)
| |   This patch makes sure we do unlink a vcpu's sie control block from the system control area in kvm_arch_vcpu_destroy. This prevents illegal accesses to the sie control block from other virtual cpus after free.
| |
| |   Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
| |   Signed-off-by: Carsten Otte <cotte@de.ibm.com>
| |   Signed-off-by: Christian Ehrhardt <ehrhardt@de.ibm.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: s390: optimize float int lock: spin_lock_bh --> spin_lock (Christian Borntraeger, 2009-06-10)
| |   The floating interrupt lock is only taken in process context. We can replace all spin_lock_bh with standard spin_lock calls.
| |
| |   Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| |   Signed-off-by: Christian Ehrhardt <ehrhardt@de.ibm.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: s390: use hrtimer for clock wakeup from idle - v2 (Christian Borntraeger, 2009-06-10)
| |   This patch reworks the s390 clock comparator wakeup to hrtimer. The clock comparator is a per-cpu value that is compared against the TOD clock. If ckc <= TOD an external interrupt 1004 is triggered. Since the clock comparator and the TOD clock have a much higher resolution than jiffies we should use hrtimers to trigger the wakeup. This speeds up guest nanosleep for small values.
| |
| |   Since hrtimers callbacks run in hard-irq context, I added a tasklet to do the actual work with enabled interrupts.
| |
| |   Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| |   Signed-off-by: Carsten Otte <cotte@de.ibm.com>
| |   Signed-off-by: Christian Ehrhardt <ehrhardt@de.ibm.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
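To show why a high-resolution timer matters here, the following sketch converts the distance between the clock comparator and the TOD clock into nanoseconds. It relies on one assumption that the commit message does not spell out: one TOD unit is 1/4096 of a microsecond (TOD bit 51 ticks once per microsecond). The function name is hypothetical.

```c
#include <stdint.h>

/* Nanoseconds until the clock comparator (ckc) is reached, or 0 if the
 * interrupt condition ckc <= TOD already holds.
 *
 * Assumed unit: 1 TOD unit = 1/4096 us, so
 *   ns = units * 1000 / 4096 = (units * 125) >> 9.
 * The multiplication can overflow for very large differences; a real
 * implementation would have to bound the value first. */
static uint64_t ckc_sleep_ns(uint64_t ckc, uint64_t tod)
{
    if (ckc <= tod)
        return 0;
    return ((ckc - tod) * 125) >> 9;
}
```

A difference of 4096 TOD units comes out as 1000 ns, far below the granularity of a jiffy, which is why a jiffies-based timer would oversleep on short guest nanosleep requests.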
| * KVM: s390: Fix memory slot versus run - v3 (Carsten Otte, 2009-06-10)
| |   This patch fixes an incorrectness in the kvm backend for s390. In case virtual cpus are being created before the corresponding memory slot is being registered, we need to update the sie control blocks for the virtual cpus.
| |
| |   *updates in v3*
| |   In consideration of the s390 memslot constraints, locking was changed to trylock. These locks should never be held, as vcpus can't run without the single memslot we just assign when running this code. To ensure this never deadlocks in case other code changes, the code uses trylocks and bails out if it can't get all locks.
| |
| |   Additionally, most of the discussed special conditions for s390, like only one memslot and no user_alloc, are now checked for validity in kvm_arch_set_memory_region.
| |
| |   Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
| |   Signed-off-by: Carsten Otte <cotte@de.ibm.com>
| |   Signed-off-by: Christian Ehrhardt <ehrhardt@de.ibm.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Expand on "help" info to specify kvm intel and amd module names (Robert P. J. Day, 2009-06-10)
| |   Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
| |   Cc: Avi Kivity <avi@redhat.com>
| |   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: x86: check for cr3 validity in mmu_alloc_roots (Marcelo Tosatti, 2009-06-10)
| |   Verify that the cr3 address stored in vcpu->arch.cr3 points to an existing memslot. If not, inject a triple fault.
| |
| |   Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: take mmu_lock when updating a deleted slot (Marcelo Tosatti, 2009-06-10)
| |   kvm_handle_hva relies on mmu_lock protection to safely access the memslot structures.
| |
| |   Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: protect kvm_mmu_change_mmu_pages with mmu_lock (Marcelo Tosatti, 2009-06-10)
| |   kvm_handle_hva, called by MMU notifiers, manipulates mmu data only with the protection of mmu_lock.
| |
| |   Update kvm_mmu_change_mmu_pages callers to take mmu_lock, thus protecting against kvm_handle_hva.
| |
| |   Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Deal with interrupt shadow state for emulated instructions (Glauber Costa, 2009-06-10)
| |   We currently unblock shadow interrupt state when we skip an instruction, but fail to do so when we actually emulate one. This blocks interrupts in key instruction blocks, in particular sti; hlt; sequences.
| |
| |   If the instruction emulated is an sti, we have to block shadow interrupts. The same goes for mov ss. pop ss also needs it, but we don't currently emulate it.
| |
| |   Without this patch, I cannot boot gpxe option roms at vmx machines. This is described at https://bugzilla.redhat.com/show_bug.cgi?id=494469
| |
| |   Signed-off-by: Glauber Costa <glommer@redhat.com>
| |   CC: H. Peter Anvin <hpa@zytor.com>
| |   CC: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Replace ->drop_interrupt_shadow() by ->set_interrupt_shadow() (Glauber Costa, 2009-06-10)
| |   This patch replaces drop_interrupt_shadow with the more general set_interrupt_shadow, that can either drop or raise it, depending on its parameter. It also adds ->get_interrupt_shadow() for future use.
| |
| |   Signed-off-by: Glauber Costa <glommer@redhat.com>
| |   CC: H. Peter Anvin <hpa@zytor.com>
| |   CC: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: protect assigned dev workqueue, int handler and irq acker (Marcelo Tosatti, 2009-06-10)
| |   kvm_assigned_dev_ack_irq is vulnerable to a race condition with the interrupt handler function. It does:
| |
| |       if (dev->host_irq_disabled) {
| |           enable_irq(dev->host_irq);
| |           dev->host_irq_disabled = false;
| |       }
| |
| |   If an interrupt triggers before the dev->host_irq_disabled assignment, it will disable the interrupt and set dev->host_irq_disabled to true.
| |
| |   On return to kvm_assigned_dev_ack_irq, dev->host_irq_disabled is set to false, and the next kvm_assigned_dev_ack_irq call will fail to reenable it.
| |
| |   Other than that, having the interrupt handler and work handlers run in parallel sounds like asking for trouble (could not spot any obvious problem, but better not have to, it's fragile).
| |
| |   CC: sheng.yang@intel.com
| |   Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
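One way to close a check-then-set race of this kind is to put the flag test and the flag update under a single lock that the interrupt handler also takes. The sketch below is a userspace model of that idea using C11 atomics; it is not the kernel patch, and every name in it (assigned_dev_model, model_lock, and so on) is hypothetical. The integer irq_disable_depth stands in for the disable_irq()/enable_irq() nesting.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct assigned_dev_model {
    atomic_flag lock;          /* hypothetical stand-in for a spinlock */
    bool host_irq_disabled;
    int irq_disable_depth;     /* models disable/enable nesting of the host IRQ */
};

static void model_lock(struct assigned_dev_model *dev)
{
    while (atomic_flag_test_and_set_explicit(&dev->lock, memory_order_acquire))
        ; /* spin until the lock is free */
}

static void model_unlock(struct assigned_dev_model *dev)
{
    atomic_flag_clear_explicit(&dev->lock, memory_order_release);
}

/* Interrupt handler side: mask the host IRQ until the guest acks it. */
static void model_irq_handler(struct assigned_dev_model *dev)
{
    model_lock(dev);
    dev->irq_disable_depth++;
    dev->host_irq_disabled = true;
    model_unlock(dev);
}

/* Ack side: the test and the update now happen atomically with
 * respect to the handler, so the flag cannot be overwritten between
 * the enable and the assignment. */
static void model_ack_irq(struct assigned_dev_model *dev)
{
    model_lock(dev);
    if (dev->host_irq_disabled) {
        dev->irq_disable_depth--;
        dev->host_irq_disabled = false;
    }
    model_unlock(dev);
}
```

With the lock held across both steps, the flag and the disable depth always change together, which is the invariant the unlocked version could break.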
| * KVM: use smp_send_reschedule in kvm_vcpu_kick (Marcelo Tosatti, 2009-06-10)
| |   KVM uses a function call IPI to cause the exit of a guest running on a physical cpu. For virtual interrupt notification there is no need to wait for IPI receipt, or to execute any function.
| |
| |   This is exactly what the reschedule IPI does, without the overhead of function IPI. So use it instead of smp_call_function_single in kvm_vcpu_kick.
| |
| |   Also change the "guest_mode" variable to a bit in vcpu->requests, and use that to collapse multiple IPIs that would be issued between the first one and zeroing of guest mode.
| |
| |   This allows kvm_vcpu_kick to be called with interrupts disabled.
| |
| |   Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
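The IPI-collapsing idea can be modelled with an atomic test-and-set on a request bit: only the caller that flips the bit from 0 to 1 sends the IPI. This is a userspace sketch under stated assumptions, not the kernel code. The struct, the bit index REQ_KICK_BIT and the counter that stands in for the reschedule IPI are all hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define REQ_KICK_BIT 0u   /* hypothetical bit index in the request word */

struct vcpu_model {
    atomic_ulong requests;
    atomic_int ipis_sent;   /* counts simulated reschedule IPIs */
};

/* Returns true if this call actually "sent" an IPI. */
static bool model_vcpu_kick(struct vcpu_model *vcpu)
{
    unsigned long mask = 1UL << REQ_KICK_BIT;
    unsigned long old = atomic_fetch_or(&vcpu->requests, mask);

    if (old & mask)
        return false;       /* a kick is already pending: collapse */

    atomic_fetch_add(&vcpu->ipis_sent, 1);   /* stand-in for the IPI */
    return true;
}

/* Called when the vcpu has left guest mode: re-arm the kick. */
static void model_guest_exit(struct vcpu_model *vcpu)
{
    atomic_fetch_and(&vcpu->requests, ~(1UL << REQ_KICK_BIT));
}
```

Because the kick path neither waits for the target nor takes a lock, a scheme like this is compatible with being called while interrupts are disabled.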
| * KVM: Update cpuid 1.ecx reporting (Avi Kivity, 2009-06-10)
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * x86: Add cpu features MOVBE and POPCNT (Avi Kivity, 2009-06-10)
| |   Add cpu feature bit support for the MOVBE and POPCNT instructions.
| |
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Add AMD cpuid bit: cr8_legacy, abm, misaligned sse, sse4, 3dnow prefetch (Avi Kivity, 2009-06-10)
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Fix cpuid feature misreporting (Avi Kivity, 2009-06-10)
| |   MTRR, PAT, MCE, and MCA are all supported (to some extent) but not reported. Vista requires these features, so if userspace relies on kernel cpuid reporting, it loses support for Vista.
| |
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Drop request_nmi from stats (Jan Kiszka, 2009-06-10)
| |   The stats entry request_nmi is no longer used as the related user space interface was dropped. So clean it up.
| |
| |   Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Don't reinject event that caused a task switch (Gleb Natapov, 2009-06-10)
| |   If a task switch was caused by an event, remove it from the event queue. VMX already does that.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Fix cross vendor migration issue in segment segment descriptor (Andre Przywara, 2009-06-10)
| |   On AMD CPUs sometimes the DB bit in the stack segment descriptor is left as 1, although the whole segment has been made unusable. Clear it here to pass an Intel VMX entry check when cross vendor migrating.
| |
| |   Signed-off-by: Andre Przywara <andre.przywara@amd.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: fix apic_debug instances (Glauber Costa, 2009-06-10)
| |   Apparently nobody turned this on in a while... Setting apic_debug to something compilable generates some errors. This patch fixes it.
| |
| |   Signed-off-by: Glauber Costa <glommer@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Trivial format fix in setup_routing_entry() (Chris Wright, 2009-06-10)
| |   Remove extra tab.
| |
| |   Signed-off-by: Chris Wright <chrisw@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: VMX: Disable VMX when system shutdown (Sheng Yang, 2009-06-10)
| |   Intel TXT (Trusted Execution Technology) requires VMX to be off on all CPUs in order to work at system shutdown.
| |
| |   CC: Joseph Cihula <joseph.cihula@intel.com>
| |   Signed-off-by: Sheng Yang <sheng@linux.intel.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Enable snooping control for supported hardware (Sheng Yang, 2009-06-10)
| |   Memory aliases with different memory types are a problem for the guest. For a guest without an assigned device, the memory type of guest memory is always the same as the host's (WB). With an assigned device, however, some part of memory may be used for DMA and set to an uncacheable memory type (UC/WC), which conflicts with the host memory type and is a potential issue.
| |
| |   Snooping control can guarantee cache correctness for memory that goes through the DMA engine of VT-d.
| |
| |   [avi: fix build on ia64]
| |
| |   Signed-off-by: Sheng Yang <sheng@linux.intel.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Replace get_mt_mask_shift with get_mt_mask (Sheng Yang, 2009-06-10)
| |   Shadow_mt_mask is out of date; it is now only used as a flag to indicate whether TDP is enabled. Get rid of it and use tdp_enabled instead.
| |
| |   Also put the memory type logic in kvm_x86_ops->get_mt_mask().
| |
| |   Signed-off-by: Sheng Yang <sheng@linux.intel.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Wake up waitqueue before calling get_cpu() (Jan Blunck, 2009-06-10)
| |   This moves the get_cpu() call down to be called after we wake up the waiters. Therefore the waitqueue locks can safely be rt mutex.
| |
| |   Signed-off-by: Jan Blunck <jblunck@suse.de>
| |   Signed-off-by: Sven-Thorsten Dietrich <sven@thebigcorporation.com>
| |   Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Get rid of get_irq() callback (Gleb Natapov, 2009-06-10)
| |   It just returns the pending IRQ vector from the queue for VMX/SVM. Get the IRQ directly from the queue before migration and put it back after.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Fix userspace IRQ chip migration (Gleb Natapov, 2009-06-10)
| |   Re-put the pending IRQ vector into interrupt_bitmap before migration. Otherwise it will be lost if migration happens at the wrong time.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add NMI injection support (Gleb Natapov, 2009-06-10)
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Get rid of arch.interrupt_window_open & arch.nmi_window_open (Gleb Natapov, 2009-06-10)
| |   They are recalculated before each use anyway.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Do not report TPR write to userspace if new value bigger or equal to a previous one. (Gleb Natapov, 2009-06-10)
| |   Saves many exits to userspace when the IRQ chip is in userspace.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
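The rule in this commit's subject reduces to a single comparison, sketched below. The function name is hypothetical. The rationale in the comment is a reading of the change rather than something the commit message states: raising the TPR can only mask more interrupts, so a userspace IRQ chip has nothing new to deliver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a guest TPR write must be reported to userspace.
 * Only a lowered TPR can unmask a pending interrupt, so writes that
 * keep or raise the value are handled without an exit. */
static bool tpr_write_needs_userspace_exit(uint8_t old_tpr, uint8_t new_tpr)
{
    return new_tpr < old_tpr;
}
```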
| * KVM: sync_lapic_to_cr8() should always sync cr8 to V_TPR (Gleb Natapov, 2009-06-10)
| |   Even if the IRQ chip is in userspace.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Remove kvm_push_irq() (Gleb Natapov, 2009-06-10)
| |   No longer used.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Remove inject_pending_vectors() callback (Gleb Natapov, 2009-06-10)
| |   It is the same as inject_pending_irq() for VMX/SVM now.
| |
| |   Signed-off-by: Gleb Natapov <gleb@redhat.com>
| |   Signed-off-by: Avi Kivity <avi@redhat.com>