aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/kvm/vmx.c
Commit message (Collapse)AuthorAge
* KVM: Improve emulation failure reportingAvi Kivity2007-10-13
| | | | | | Report failed opcodes from all locations. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Fix exit qualification width on i386He, Qing2007-10-13
| | | | | | | | | | | | | | According to Intel Software Developer's Manual, Vol. 3B, Appendix H.4.2, exit qualification should be of natural width. However, current code uses u64 as the data type for this register, which occasionally introduces invalid value to VMExit handling logics. This patch fixes this bug. I have tested Windows and Linux guest on i386 host, and they can boot successfully with this patch. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move main vcpu loop into subarch independent codeAvi Kivity2007-10-13
| | | | | | This simplifies adding new code as well as reducing overall code size. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Move vm entry failure handling to the exit handlerAvi Kivity2007-10-13
| | | | | | This will help moving the main loop to subarch independent code. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Rename kvm_arch_ops to kvm_x86_opsChristian Ehrhardt2007-10-13
| | | | | | | | This patch just renames the current (misnamed) _arch namings to _x86 to ensure better readability when a real arch layer takes place. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: enable in-kernel APIC INIT/SIPI handlingHe, Qing2007-10-13
| | | | | | | | | | | This patch enables INIT/SIPI handling using in-kernel APIC by introducing a ->mp_state field to emulate the SMP state transition. [avi: remove smp_processor_id() warning] Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Xin Li <xin.b.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Migrate lapic hrtimer when vcpu moves to another cpuEddie Dong2007-10-13
| | | | | | | | | | | This reduces overhead by accessing cachelines from the wrong node, as well as simplifying locking. [Qing: fix for inactive or expired one-shot timer] Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Keep track of missed timer irq injectionsEddie Dong2007-10-13
| | | | | | | | | | | | | APIC timer IRQ is set every time when a certain period expires at host time, but the guest may be descheduled at that time and thus the irq be overwritten by later fire. This patch keep track of firing irq numbers and decrease only when the IRQ is injected to guest or buffered in APIC. Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Use shadow TPR/cr8 for 64-bits guestsYang, Sheng2007-10-13
| | | | | | | | | | | This patch enables TPR shadow of VMX on CR8 access. 64bit Windows using CR8 access TPR frequently. The TPR shadow can improve the performance of access TPR by not causing vmexit. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: pending irq save/restoreEddie Dong2007-10-13
| | | | | | | | | | | Add in kernel irqchip save/restore support for pending vectors. [avi: fix compile warning on i386] [avi: remove printk] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Emulate hlt in the kernelEddie Dong2007-10-13
| | | | | | | | | By sleeping in the kernel when hlt is executed, we simplify the in-kernel guest interrupt path considerably. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Emulate local APIC in kernelEddie Dong2007-10-13
| | | | | | | | | | | | | | | | Because lightweight exits (exits which don't involve userspace) are many times faster than heavyweight exits, it makes sense to emulate high usage devices in the kernel. The local APIC is one such device, especially for Windows and for SMP, so we add an APIC model to kvm. It also allows in-kernel host-side drivers to inject interrupts without going through userspace. [compile fix on i386 from Jindrich Makovicka] Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Define and use cr8 access functionsEddie Dong2007-10-13
| | | | | | | | | This patch is to wrap APIC base register and CR8 operation which can provide a unique API for user level irqchip and kernel irqchip. This is a preparation of merging lapic/ioapic patch. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add support for in-kernel PIC emulationEddie Dong2007-10-13
| | | | | Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Split segments reload in vmx_load_host_state()Laurent Vivier2007-10-13
| | | | | | | | | | | vmx_load_host_state() bundles fs, gs, ldt, and tss reloading into one in the hope that it is infrequent. With smp guests, fs reloading is frequent due to fs being used by threads. Unbundle the reloads so reduce expensive gs reloads. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: allow rmode_tss_base() to work with >2G of guest memoryIzik Eidus2007-10-13
| | | | | Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Communicate cr8 changes to userspaceYang, Sheng2007-10-13
| | | | | | | This allows running 64-bit Windows. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Close minor race in signal handlingAvi Kivity2007-10-13
| | | | | | | | | | We need to check for signals inside the critical section, otherwise a signal can be sent which we will not notice. Also move the check before entry, so that if the signal happens before the first entry, we exit immediately instead of waiting for something to happen to the guest. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Clean up kvm_setup_pio()Laurent Vivier2007-10-13
| | | | | | | | Split kvm_setup_pio() into two functions, one to setup in/out pio (kvm_emulate_pio()) and one to setup ins/outs pio (kvm_emulate_pio_string()). Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Cleanup string I/O instruction emulationLaurent Vivier2007-10-13
| | | | | | | | | | | | | | | | | Both vmx and svm decode the I/O instructions, and both botch the job, requiring the instruction prefixes to be fetched in order to completely decode the instruction. So, if we see a string I/O instruction, use the x86 emulator to decode it, as it already has all the prefix decoding machinery. This patch defines ins/outs opcodes in x86_emulate.c and calls emulate_instruction() from io_interception() (svm.c) and from handle_io() (vmx.c). It removes all vmx/svm prefix instruction decoders (get_addr_size(), io_get_override(), io_address(), get_io_count()) Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Remove a duplicated ia32e mode vm entry controlLi, Xin B2007-10-13
| | | | | | | | Remove a duplicated ia32e mode VM Entry control definition and use the proper one. Signed-off-by: Xin Li <xin.b.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use kmem_cache_free for kmem_cache_zalloc'ed objectsRusty Russell2007-10-13
| | | | | | | | We use kfree in svm.c and vmx.c, and this works, but it could break at any time. kfree() is supposed to match up with kmalloc(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add and use pr_unimpl for standard formatting of unimplemented featuresRusty Russell2007-10-13
| | | | | | | | All guest-invokable printks should be ratelimited to prevent malicious guests from flooding logs. This is a start. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Fix defined but not used warning in drivers/kvm/vmx.cGabriel C2007-10-13
| | | | | | | move_msr_up() is used only on X86_64 and generates a warning on !X86_64 Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove redundant alloc_vmcs_cpu declarationRusty Russell2007-10-13
| | | | | | | | alloc_vmcs_cpu is already declared (static) above, no need to redeclare. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Add cpu consistency checkYang, Sheng2007-10-13
| | | | | | | | | All the physical CPUs on the board should support the same VMX feature set. Add check_processor_compatibility to kvm_arch_ops for the consistency check. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use kmem cache for allocating vcpusRusty Russell2007-10-13
| | | | | | | | | Avi wants the allocations of vcpus centralized again. The easiest way is to add a "size" arg to kvm_init_arch, and expose the thus-prepared cache to the modules. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove kvm_{read,write}_guest()Laurent Vivier2007-10-13
| | | | | | | ... in favor of the more general emulator_{read,write}_*. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: pass vcpu_vmx internallyRusty Russell2007-10-13
| | | | | | | | | container_of is wonderful, but not casting at all is better. This patch changes vmx.c's internal functions to pass "struct vcpu_vmx" instead of "struct kvm_vcpu" and using container_of. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Convert vm lock to a mutexShaohua Li2007-10-13
| | | | | | | | This allows the kvm mmu to perform sleepy operations, such as memory allocation. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use the scheduler preemption notifiers to make kvm preemptibleAvi Kivity2007-10-13
| | | | | | | | | | | | | | Current kvm disables preemption while the new virtualization registers are in use. This of course is not very good for latency sensitive workloads (one use of virtualization is to offload user interface and other latency insensitive stuff to a container, so that it is easier to analyze the remaining workload). This patch re-enables preemption for kvm; preemption is now only disabled when switching the registers in and out, and during the switch to guest mode and back. Contains fixes from Shaohua Li <shaohua.li@intel.com>. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Improve the method of writing vmcs controlYang, Sheng2007-10-13
| | | | | | | | | | | | Put cpu feature detecting part in hardware_setup, and stored the vmcs condition in global variable for further check. [glommer: fix for some i386-only machines not supporting CR8 load/store exiting] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Dynamically allocate vcpusRusty Russell2007-10-13
| | | | | | | | | | This patch converts the vcpus array in "struct kvm" to a pointer array, and changes the "vcpu_create" and "vcpu_setup" hooks into one "vcpu_create" call which does the allocation and initialization of the vcpu (calling back into the kvm_vcpu_init core helper). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove arch specific components from the general codeGregory Haskins2007-10-13
| | | | | | | | struct kvm_vcpu has vmx-specific members; remove them to a private structure. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Import some constants of vmcs from IA32 SDMYang, Sheng2007-10-13
| | | | | | | | | | | | This patch mainly imports some constants and rename two exist constants of vmcs according to IA32 SDM. It also adds two constants to indicate Lock bit and Enable bit in MSR_IA32_FEATURE_CONTROL, and replace the hardcode _5_ with these two bits. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Hoist kvm_mmu_reload() out of the critical sectionShaohua Li2007-10-13
| | | | | | | | vmx_cpu_run doesn't handle error correctly and kvm_mmu_reload might sleep with mutex changes, so I move it above. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Set exit_reason to KVM_EXIT_MMIO where run->mmio is initialized.Jeff Dike2007-10-13
| | | | | Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use standard CR4 flags, tighten checkingRusty Russell2007-10-13
| | | | | | | | | | | On this machine (Intel), writing to the CR4 bits 0x00000800 and 0x00001000 cause a GPF. The Intel manual is a little unclear, but AFIACT they're reserved, too. Also fix spelling of CR4_RESEVED_BITS. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Trivial: Use standard CR0 flags macros from asm/cpu-features.hRusty Russell2007-10-13
| | | | | | | | | | The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR0_RESEVED_BITS (no code change) and fix typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SMP: Add vcpu_id field in struct vcpuQing He2007-10-13
| | | | | | | | This patch adds a `vcpu_id' field in `struct vcpu', so we can differentiate BSP and APs without pointer comparison or arithmetic. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Clean up #includesAvi Kivity2007-07-16
| | | | | | Remove unnecessary ones, and rearange the remaining in the standard order. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Remove unnecessary code in vmx_tlb_flush()Avi Kivity2007-07-16
| | | | | | | | A vmexit implicitly flushes the tlb; the code is bogus. Noted by Shaohua Li. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Reinitialize the real-mode tss when entering real modeAvi Kivity2007-07-16
| | | | | | | Protected mode code may have corrupted the real-mode tss, so re-initialize it when switching to real mode. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Fix interrupt checking on lightweight exitGregory Haskins2007-07-16
| | | | | | | | With kernel-injected interrupts, we need to check for interrupts on lightweight exits too. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Ensure vcpu time stamp counter is monotonousAvi Kivity2007-07-16
| | | | | | | | | | | | If the time stamp counter goes backwards, a guest delay loop can become infinite. This can happen if a vcpu is migrated to another cpu, where the counter has a lower value than the first cpu. Since we're doing an IPI to the first cpu anyway, we can use that to pick up the old tsc, and use that to calculate the adjustment we need to make to the tsc offset. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Initialize the BSP bit in the APIC_BASE msr correctlyAvi Kivity2007-07-16
| | | | | | Needs to be set on vcpu 0 only. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>)Shani Moideen2007-07-16
| | | | | Signed-off-by: Shani Moideen <shani.moideen@wipro.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Flush remote tlbs when reducing shadow pte permissionsAvi Kivity2007-07-16
| | | | | | | | | | | When a vcpu causes a shadow tlb entry to have reduced permissions, it must also clear the tlb on remote vcpus. We do that by: - setting a bit on the vcpu that requests a tlb flush before the next entry - if the vcpu is currently executing, we send an ipi to make sure it exits before we continue Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Emulate hlt on real mode for IntelAvi Kivity2007-07-16
| | | | | | | This has two use cases: the bios can't boot from disk, and guest smp bootstrap. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move duplicate halt handling code into kvm_main.cAvi Kivity2007-07-16
| | | | | | Will soon have a thid user. Signed-off-by: Avi Kivity <avi@qumranet.com>