aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2012-07-24
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Avi Kivity: "Highlights include - full big real mode emulation on pre-Westmere Intel hosts (can be disabled with emulate_invalid_guest_state=0) - relatively small ppc and s390 updates - PCID/INVPCID support in guests - EOI avoidance; 3.6 guests should perform better on 3.6 hosts on interrupt intensive workloads) - Lockless write faults during live migration - EPT accessed/dirty bits support for new Intel processors" Fix up conflicts in: - Documentation/virtual/kvm/api.txt: Stupid subchapter numbering, added next to each other. - arch/powerpc/kvm/booke_interrupts.S: PPC asm changes clashing with the KVM fixes - arch/s390/include/asm/sigp.h, arch/s390/kvm/sigp.c: Duplicated commits through the kvm tree and the s390 tree, with subsequent edits in the KVM tree. * tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits) KVM: fix race with level interrupts x86, hyper: fix build with !CONFIG_KVM_GUEST Revert "apic: fix kvm build on UP without IOAPIC" KVM guest: switch to apic_set_eoi_write, apic_write apic: add apic_set_eoi_write for PV use KVM: VMX: Implement PCID/INVPCID for guests with EPT KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check KVM: PPC: Critical interrupt emulation support KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests KVM: PPC64: booke: Set interrupt computation mode for 64-bit host KVM: PPC: bookehv: Add ESR flag to Data Storage Interrupt KVM: PPC: bookehv64: Add support for std/ld emulation. booke: Added crit/mc exception handler for e500v2 booke/bookehv: Add host crit-watchdog exception support KVM: MMU: document mmu-lock and fast page fault KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint KVM: MMU: trace fast page fault KVM: MMU: fast path of handling guest page fault KVM: MMU: introduce SPTE_MMU_WRITEABLE bit KVM: MMU: fold tlb flush judgement into mmu_spte_update ...
| * KVM: fix race with level interruptsMichael S. Tsirkin2012-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When more than 1 source id is in use for the same GSI, we have the following race related to handling irq_states race: CPU 0 clears bit 0. CPU 0 read irq_state as 0. CPU 1 sets level to 1. CPU 1 calls kvm_ioapic_set_irq(1). CPU 0 calls kvm_ioapic_set_irq(0). Now ioapic thinks the level is 0 but irq_state is not 0. Fix by performing all irq_states bitmap handling under pic/ioapic lock. This also removes the need for atomics with irq_states handling. Reported-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * x86, hyper: fix build with !CONFIG_KVM_GUESTAvi Kivity2012-07-18
| | | | | | | | | | Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * Revert "apic: fix kvm build on UP without IOAPIC"Michael S. Tsirkin2012-07-16
| | | | | | | | | | | | | | | | | | This reverts commit f9808b7fd422b965cea52e05ba470e0a473c53d3. After commit 'kvm: switch to apic_set_eoi_write, apic_write' the stubs are no longer needed as kvm does not look at apicdrivers anymore. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM guest: switch to apic_set_eoi_write, apic_writeMichael S. Tsirkin2012-07-16
| | | | | | | | | | | | | | | | Use apic_set_eoi_write, apic_write to avoid meedling in core apic driver data structures directly. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * apic: add apic_set_eoi_write for PV useMichael S. Tsirkin2012-07-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM PV EOI optimization overrides eoi_write apic op with its own version. Add an API for this to avoid meddling with core x86 apic driver data structures directly. For KVM use, we don't need any guarantees about when the switch to the new op will take place, so it could in theory use this API after SMP init, but it currently doesn't, and restricting callers to early init makes it clear that it's safe as it won't race with actual APIC driver use. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Avi Kivity <avi@redhat.com>
| * Merge branch 'for-upstream' of git://github.com/agraf/linux-2.6 into nextAvi Kivity2012-07-15
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ppc queue from Alex Graf: * Prepare some of the booke code for 64 bit support * BookE: Fix ESR flag in DSI * BookE: Add rfci emulation * 'for-upstream' of git://github.com/agraf/linux-2.6: KVM: PPC: Critical interrupt emulation support KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests KVM: PPC64: booke: Set interrupt computation mode for 64-bit host KVM: PPC: bookehv: Add ESR flag to Data Storage Interrupt KVM: PPC: bookehv64: Add support for std/ld emulation. booke: Added crit/mc exception handler for e500v2 booke/bookehv: Add host crit-watchdog exception support Signed-off-by: Avi Kivity <avi@redhat.com>
| | * KVM: PPC: Critical interrupt emulation supportBharat Bhushan2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | rfci instruction and CSRR0/1 registers are emulated. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guestsMihai Caraman2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | tlbilxva emulation was using an u32 variable for guest effective address. Replace it with gva_t type to handle 64-bit guests. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC64: booke: Set interrupt computation mode for 64-bit hostMihai Caraman2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | 64-bit host needs to remain in 64-bit mode when an exception take place. Set interrupt computaion mode in EPCR register. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: bookehv: Add ESR flag to Data Storage InterruptMihai Caraman2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | ESR register is required by Data Storage Interrupt handling code. Add the specific flag to the interrupt handler. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: bookehv64: Add support for std/ld emulation.Varun Sethi2012-07-11
| | | | | | | | | | | | | | | | | | | | | Add support for std/ld emulation. Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * booke: Added crit/mc exception handler for e500v2Bharat Bhushan2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | Watchdog is taken at critical exception level. So this patch is tested with host watchdog exception happening when guest is running. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * booke/bookehv: Add host crit-watchdog exception supportBharat Bhushan2012-07-11
| | | | | | | | | | | | | | | Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| * | KVM: VMX: Implement PCID/INVPCID for guests with EPTMao, Junjie2012-07-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch handles PCID/INVPCID for guests. Process-context identifiers (PCIDs) are a facility by which a logical processor may cache information for multiple linear-address spaces so that the processor may retain cached information when software switches to a different linear address space. Refer to section 4.10.1 in IA32 Intel Software Developer's Manual Volume 3A for details. For guests with EPT, the PCID feature is enabled and INVPCID behaves as running natively. For guests without EPT, the PCID feature is disabled and INVPCID triggers #UD. Signed-off-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform checkPrarit Bhargava2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While debugging I noticed that unlike all the other hypervisor code in the kernel, kvm does not have an entry for x86_hyper which is used in detect_hypervisor_platform() which results in a nice printk in the syslog. This is only really a stub function but it does make kvm more consistent with the other hypervisors. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marcelo Tostatti <mtosatti@redhat.com> Cc: kvm@vger.kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: document mmu-lock and fast page faultXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | Document fast page fault and mmu-lock in locking.txt Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: fix kvm_mmu_pagetable_walk tracepointXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | The P bit of page fault error code is missed in this tracepoint, fix it by passing the full error code Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: trace fast page faultXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | To see what happen on this path and help us to optimize it Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: fast path of handling guest page faultXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the the present bit of page fault error code is set, it indicates the shadow page is populated on all levels, it means what we do is only modify the access bit which can be done out of mmu-lock Currently, in order to simplify the code, we only fix the page fault caused by write-protect on the fast path Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: introduce SPTE_MMU_WRITEABLE bitXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bit indicates whether the spte can be writable on MMU, that means the corresponding gpte is writable and the corresponding gfn is not protected by shadow page protection In the later path, SPTE_MMU_WRITEABLE will indicates whether the spte can be locklessly updated Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: fold tlb flush judgement into mmu_spte_updateXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | mmu_spte_update() is the common function, we can easily audit the path Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: export PFEC.P bit on eptXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | Export the present bit of page fault error code, the later patch will use it Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: cleanup spte_write_protectXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | Use __drop_large_spte to cleanup this function and comment spte_write_protect Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: abstract spte write-protectXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | Introduce a common function to abstract spte write-protect to cleanup the code Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: MMU: return bool in __rmap_write_protectXiao Guangrong2012-07-11
| | | | | | | | | | | | | | | | | | | | | | | | The reture value of __rmap_write_protect is either 1 or 0, use true/false instead of these Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: Emulate invalid guest state by defaultAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | Our emulation should be complete enough that we can emulate guests while they are in big real mode, or in a mode transition that is not virtualizable without unrestricted guest support. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: implement LTRAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | Opcode 0F 00 /3. Encountered during Windows XP secondary processor bringup. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: make loading TR set the busy bitAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | Guest software doesn't actually depend on it, but vmx will refuse us entry if we don't. Set the bit in both the cached segment and memory, just to be nice. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: make read_segment_descriptor() return the addressAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | Some operations want to modify the descriptor later on, so save the address for future use. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: emulate LLDTAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | Opcode 0F 00 /2. Used by isolinux durign the protected mode transition. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: emulate BSWAPAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | Opcodes 0F C8 - 0F CF. Used by the SeaBIOS cdrom code (though not in big real mode). Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: Improve error reporting during invalid guest state emulationAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | If instruction emulation fails, report it properly to userspace. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: Stop invalid guest state emulation on pending eventAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | Process the event, possibly injecting an interrupt, before continuing. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: implement ENTERAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Opcode C8. Only ENTER with lexical nesting depth 0 is implemented, since others are very rare. We'll fail emulation if nonzero lexical depth is used so data is not corrupted. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: split push logic from push opcode emulationAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | This allows us to reuse the code without populating ctxt->src and overriding ctxt->op_bytes. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: fix byte-sized MOVZX/MOVSXAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2adb5ad9fe1 removed ByteOp from MOVZX/MOVSX, replacing them by SrcMem8, but neglected to fix the dependency in the emulation code on ByteOp. This caused the instruction not to have any effect in some circumstances. Fix by replacing the check for ByteOp with the equivalent src.op_bytes == 1. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: emulate LAHFAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | Opcode 9F. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: Continue emulating after batch exhaustedAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | If we return early from an invalid guest state emulation loop, make sure we return to it later if the guest state is still invalid. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: Fix interrupt exit condition during emulationAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checking EFLAGS.IF is incorrect as we might be in interrupt shadow. If that is the case, the main loop will notice that and not inject the interrupt, causing an endless loop. Fix by using vmx_interrupt_allowed() to check if we can inject an interrupt instead. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: emulate SGDT/SIDTAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | Opcodes 0F 01 /0 and 0F 01 /1 Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: Fix SS default ESP/EBP based addressingAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We correctly default to SS when BP is used as a base in 16-bit address mode, but we don't do that for 32-bit mode. Fix by adjusting the default to SS when either ESP or EBP is used as the base register. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: initialize memopAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memop is not initialized; this can lead to a two-byte operation following a 4-byte operation to see garbage values. Usually truncation fixes things fot us later on, but at least in one case (call abs) it doesn't. Fix by moving memop to the auto-initialized field area. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: emulate LEAVEAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | Opcode c9; used by some variants of Windows during boot, in big real mode. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: Limit iterations with emulator_invalid_guest_stateAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | Otherwise, if the guest ends up looping, we never exit the srcu critical section, which causes synchronize_srcu() to hang. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: VMX: Relax check on unusable segmentAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | Some userspace (e.g. QEMU 1.1) munge the d and g bits of segment descriptors, causing us not to recognize them as unusable segments with emulate_invalid_guest_state=1. Relax the check by testing for segment not present (a non-present segment cannot be usable). Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: fix LIDT/LGDT in long modeAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | The operand size for these instructions is 8 bytes in long mode, even without a REX prefix. Set it explicitly. Triggered while booting Linux with emulate_invalid_guest_state=1. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: allow loading null SS in long modeAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | Null SS is valid in long mode; allow loading it. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: emulate cpuidAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Opcode 0F A2. Used by Linux during the mode change trampoline while in a state that is not virtualizable on vmx without unrestricted_guest, so we need to emulate it is emulate_invalid_guest_state=1. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: change ->get_cpuid() accessor to use the x86 semanticsAvi Kivity2012-07-09
| | | | | | | | | | | | | | | | | | | | | | | | Instead of getting an exact leaf, follow the spec and fall back to the last main leaf instead. This lets us easily emulate the cpuid instruction in the emulator. Signed-off-by: Avi Kivity <avi@redhat.com>