aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm
Commit message (Collapse)AuthorAge
...
* | KVM: x86 emulator: add Src2 decode setGuillaume Thouvenin2008-12-31
| | | | | | | | | | | | | | | | | | Instruction like shld has three operands, so we need to add a Src2 decode set. We start with Src2None, Src2CL, and Src2ImmByte, Src2One to support shld/shrd and we will expand it later. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: Extend the opcode descriptorGuillaume Thouvenin2008-12-31
| | | | | | | | | | | | | | | | Extend the opcode descriptor to 32 bits. This is needed by the introduction of a new Src2 operand type. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: fix sparse warningHannes Eder2008-12-31
| | | | | | | | | | | | | | | | | | Impact: make global function static arch/x86/kvm/vmx.c:134:3: warning: symbol 'vmx_capability' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Remove extraneous semicolon after do/whileAvi Kivity2008-12-31
| | | | | | | | | | | | Notices by Guillaume Thouvenin. Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: fix popf emulationAvi Kivity2008-12-31
| | | | | | | | | | | | Set operand type and size to get correct writeback behavior. Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: fix ret emulationAvi Kivity2008-12-31
| | | | | | | | | | | | | | 'ret' did not set the operand type or size for the destination, so writeback ignored it. Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: switch 'pop reg' instruction to emulate_pop()Avi Kivity2008-12-31
| | | | | | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: allow pop from mmioAvi Kivity2008-12-31
| | | | | | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: Extract 'pop' sequence into a functionAvi Kivity2008-12-31
| | | | | | | | | | | | Switch 'pop r/m' instruction to use the new function. Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: consolidate emulation of two operand instructionsAvi Kivity2008-12-31
| | | | | | | | | | | | No need to repeat the same assembly block over and over. Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: reduce duplication in one operand emulation thunksAvi Kivity2008-12-31
| | | | | | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: optimize set_spte for page syncMarcelo Tosatti2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The write protect verification in set_spte is unnecessary for page sync. Its guaranteed that, if the unsync spte was writable, the target page does not have a write protected shadow (if it had, the spte would have been write protected under mmu_lock by rmap_write_protect before). Same reasoning applies to mark_page_dirty: the gfn has been marked as dirty via the pagefault path. The cost of hash table and memslot lookups are quite significant if the workload is pagetable write intensive resulting in increased mmu_lock contention. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: Conditionally request interrupt window after injecting irqAvi Kivity2008-12-31
| | | | | | | | | | | | | | | | | | | | | | If we're injecting an interrupt, and another one is pending, request an interrupt window notification so we don't have excess latency on the second interrupt. This shouldn't happen in practice since an EOI will be issued, giving a second chance to request an interrupt window, but... Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: SVM: move svm_hardware_disable() code to asm/virtext.hEduardo Habkost2008-12-31
| | | | | | | | | | | | | | Create cpu_svm_disable() function. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: SVM: move has_svm() code to asm/virtext.hEduardo Habkost2008-12-31
| | | | | | | | | | | | | | | | | | Use a trick to keep the printk()s on has_svm() working as before. gcc will take care of not generating code for the 'msg' stuff when the function is called with a NULL msg argument. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: extract kvm_cpu_vmxoff() from hardware_disable()Eduardo Habkost2008-12-31
| | | | | | | | | | | | | | | | Along with some comments on why it is different from the core cpu_vmxoff() function. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: move cpu_has_kvm_support() to an inline on asm/virtext.hEduardo Habkost2008-12-31
| | | | | | | | | | | | | | | | It will be used by core code on kdump and reboot, to disable vmx if needed. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: SVM: move svm.h to include/asmEduardo Habkost2008-12-31
| | | | | | | | | | | | | | | | svm.h will be used by core code that is independent of KVM, so I am moving it outside the arch/x86/kvm directory. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: move vmx.h to include/asmEduardo Habkost2008-12-31
| | | | | | | | | | | | | | | | vmx.h will be used by core code that is independent of KVM, so I am moving it outside the arch/x86/kvm directory. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Fix cpuid iteration on multiple leaves per eacNitin A Kamble2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code to traverse the cpuid data array list for counting type of leaves is currently broken. This patches fixes the 2 things in it. 1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT. Without it the code will never find a valid entry. 2. Also the stop condition in the for loop while looking for the next unflaged entry is broken. It needs to stop when it find one matching entry; and in the case of count of 1, it will be the same entry found in this iteration. Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Fix cpuid leaf 0xb loop terminationNitin A Kamble2008-12-31
| | | | | | | | | | | | | | | | | | For cpuid leaf 0xb the bits 8-15 in ECX register define the end of counting leaf. The previous code was using bits 0-7 for this purpose, which is a bug. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: Fix aliased gfns treated as unaliasedIzik Eidus2008-12-31
| | | | | | | | | | | | | | | | | | | | | | Some areas of kvm x86 mmu are using gfn offset inside a slot without unaliasing the gfn first. This patch makes sure that the gfn will be unaliased and add gfn_to_memslot_unaliased() to save the calculating of the gfn unaliasing in case we have it unaliased already. Signed-off-by: Izik Eidus <ieidus@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Enable Function Level Reset for assigned deviceSheng Yang2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ideally, every assigned device should in a clear condition before and after assignment, so that the former state of device won't affect later work. Some devices provide a mechanism named Function Level Reset, which is defined in PCI/PCI-e document. We should execute it before and after device assignment. (But sadly, the feature is new, and most device on the market now don't support it. We are considering using D0/D3hot transmit to emulate it later, but not that elegant and reliable as FLR itself.) [Update: Reminded by Xiantao, execute FLR after we ensure that the device can be assigned to the guest.] Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: Handle mmio emulation when guest state is invalidGuillaume Thouvenin2008-12-31
| | | | | | | | | | | | | | | | | | | | | | If emulate_invalid_guest_state is enabled, the emulator is called when guest state is invalid. Until now, we reported an mmio failure when emulate_instruction() returned EMULATE_DO_MMIO. This patch adds the case where emulate_instruction() failed and an MMIO emulation is needed. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: allow emulator to adjust rip for emulated pio instructionsGuillaume Thouvenin2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | If we call the emulator we shouldn't call skip_emulated_instruction() in the first place, since the emulator already computes the next rip for us. Thus we move ->skip_emulated_instruction() out of kvm_emulate_pio() and into handle_io() (and the svm equivalent). We also replaced "return 0" by "break" in the "do_io:" case because now the shadow register state needs to be committed. Otherwise eip will never be updated. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: SVM: Set the 'busy' flag of the TR selectorAmit Shah2008-12-31
| | | | | | | | | | | | | | | | The busy flag of the TR selector is not set by the hardware. This breaks migration from amd hosts to intel hosts. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: SVM: Set the 'g' bit of the cs selector for cross-vendor migrationAmit Shah2008-12-31
| | | | | | | | | | | | | | | | | | The hardware does not set the 'g' bit of the cs selector and this breaks migration from amd hosts to intel hosts. Set this bit if the segment limit is beyond 1 MB. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86: Fix typo in function nameAmit Shah2008-12-31
| | | | | | | | | | | | | | get_segment_descritptor_dtable() contains an obvious type. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86: Optimize NMI watchdog deliveryJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | As suggested by Avi, this patch introduces a counter of VCPUs that have LVT0 set to NMI mode. Only if the counter > 0, we push the PIT ticks via all LAPIC LVT0 lines to enable NMI watchdog support. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86: Fix and refactor NMI watchdog emulationJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the NMI watchdog delivery patch, consolidating tests and providing a proper API for delivering watchdog events. An included micro-optimization is to check only for apic_hw_enabled in kvm_apic_local_deliver (the test for LVT mask is covering the soft-disabled case already). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86 emulator: Add decode entries for 0x04 and 0x05 opcodes (add acc, imm)Guillaume Thouvenin2008-12-31
| | | | | | | | | | | | | | | | Add decode entries for 0x04 and 0x05 (ADD) opcodes, execution is already implemented. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: Move private memory slot positionSheng Yang2008-12-31
| | | | | | | | | | | | | | | | | | | | | | PCI device assignment would map guest MMIO spaces as separate slot, so it is possible that the device has more than 2 MMIO spaces and overwrite current private memslot. The patch move private memory slot to the top of userspace visible memory slots. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: MMU: Extend kvm_mmu_page->slot_bitmap sizeSheng Yang2008-12-31
| | | | | | | | | | | | | | | | Otherwise set_bit() for private memory slot(above KVM_MEMORY_SLOTS) would corrupted memory in 32bit host. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Enable MTRR for EPTSheng Yang2008-12-31
| | | | | | | | | | | | | | | | The effective memory type of EPT is the mixture of MSR_IA32_CR_PAT and memory type field of EPT entry. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Add local get_mtrr_type() to support MTRRSheng Yang2008-12-31
| | | | | | | | | | | | | | For EPT memory type support. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: Add PAT support for EPTSheng Yang2008-12-31
| | | | | | | | | | | | | | | | | | | | | | GUEST_PAT support is a new feature introduced by Intel Core i7 architecture. With this, cpu would save/load guest and host PAT automatically, for EPT memory type in guest depends on MSR_IA32_CR_PAT. Also add save/restore for MSR_IA32_CR_PAT. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Improve MTRR structureSheng Yang2008-12-31
| | | | | | | | | | | | | | As well as reset mmu context when set MTRR. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: call kvm_arch_vcpu_reset() instead of the kvm_x86_ops callbackGleb Natapov2008-12-31
| | | | | | | | | | | | | | | | Call kvm_arch_vcpu_reset() instead of directly using arch callback. The function does additional things. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: work around lacking VNMI supportJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Older VMX supporting CPUs do not provide the "Virtual NMI" feature for tracking the NMI-blocked state after injecting such events. For now KVM is unable to inject NMIs on those CPUs. Derived from Sheng Yang's suggestion to use the IRQ window notification for detecting the end of NMI handlers, this patch implements virtual NMI support without impact on the host's ability to receive real NMIs. The downside is that the given approach requires some heuristics that can cause NMI nesting in vary rare corner cases. The approach works as follows: - inject NMI and set a software-based NMI-blocked flag - arm the IRQ window start notification whenever an NMI window is requested - if the guest exits due to an opening IRQ window, clear the emulated NMI-blocked flag - if the guest net execution time with NMI-blocked but without an IRQ window exceeds 1 second, force NMI-blocked reset and inject anyway This approach covers most practical scenarios: - succeeding NMIs are seperated by at least one open IRQ window - the guest may spin with IRQs disabled (e.g. due to a bug), but leaving the NMI handler takes much less time than one second - the guest does not rely on strict ordering or timing of NMIs (would be problematic in virtualized environments anyway) Successfully tested with the 'nmi n' monitor command, the kgdbts testsuite on smp guests (additional patches required to add debug register support to kvm) + the kernel's nmi_watchdog=1, and a Siemens- specific board emulation (+ guest) that comes with its own NMI watchdog mechanism. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: Provide support for user space injected NMIsJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the required bits to the VMX side for user space injected NMIs. As with the preexisting in-kernel irqchip support, the CPU must provide the "virtual NMI" feature for proper tracking of the NMI blocking state. Based on the original patch by Sheng Yang. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86: Support for user space injected NMIsJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | Introduces the KVM_NMI IOCTL to the generic x86 part of KVM for injecting NMIs from user space and also extends the statistic report accordingly. Based on the original patch by Sheng Yang. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: Kick NMI receiving VCPUJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | Kick the NMI receiving VCPU in case the triggering caller runs in a different context. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86: VCPU with pending NMI is runnabledJan Kiszka2008-12-31
| | | | | | | | | | | | | | Ensure that a VCPU with pending NMIs is considered runnable. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86: Enable NMI Watchdog via in-kernel PIT sourceJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | LINT0 of the LAPIC can be used to route PIT events as NMI watchdog ticks into the guest. This patch aligns the in-kernel irqchip emulation with the user space irqchip with already supports this feature. The trick is to route PIT interrupts to all LAPIC's LVT0 lines. Rebased and slightly polished patch originally posted by Sheng Yang. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: fix real-mode NMI supportJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | Fix NMI injection in real-mode with the same pattern we perform IRQ injection. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: refactor IRQ and NMI window enablingJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | do_interrupt_requests and vmx_intr_assist go different way for achieving the same: enabling the nmi/irq window start notification. Unify their code over enable_{irq|nmi}_window, get rid of a redundant call to enable_intr_window instead of direct enable_nmi_window invocation and unroll enable_intr_window for both in-kernel and user space irq injection accordingly. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: refactor/fix IRQ and NMI injectability determinationJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are currently two ways in VMX to check if an IRQ or NMI can be injected: - vmx_{nmi|irq}_enabled and - vcpu.arch.{nmi|interrupt}_window_open. Even worse, one test (at the end of vmx_vcpu_run) uses an inconsistent, likely incorrect logic. This patch consolidates and unifies the tests over {nmi|interrupt}_window_open as cache + vmx_update_window_states for updating the cache content. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: x86: Reset pending/inject NMI state on CPU resetJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | CPU reset invalidates pending or already injected NMIs, therefore reset the related state variables. Based on original patch by Gleb Natapov. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: Support for NMI task gatesJan Kiszka2008-12-31
| | | | | | | | | | | | | | | | | | | | | | | | | | Properly set GUEST_INTR_STATE_NMI and reset nmi_injected when a task-switch vmexit happened due to a task gate being used for handling NMIs. Also avoid the false warning about valid vectoring info in kvm_handle_exit. Based on original patch by Gleb Natapov. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: VMX: Use INTR_TYPE_NMI_INTR instead of magic valueJan Kiszka2008-12-31
| | | | | | | | | | Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>