aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/kvm/kvm.h
Commit message (Collapse)AuthorAge
...
* KVM: Portability: Split kvm_vm_ioctl v3Carsten Otte2008-01-30
| | | | | | | | | | | | | | | | | | | | This patch splits kvm_vm_ioctl into archtecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. The patch is unchanged since last submission. Common ioctls for all architectures are: KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG, KVM_SET_USER_MEMORY_REGION x86 specific ioctls are: KVM_SET_MEMORY_REGION, KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP, KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP KVM_SET_TSS_ADDR Signed-off-by: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add ioctl to tss address from userspace,Izik Eidus2008-01-30
| | | | | | | | | | Currently kvm has a wart in that it requires three extra pages for use as a tss when emulating real mode on Intel. This patch moves the allocation internally, only requiring userspace to tell us where in the physical address space we can place the tss. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add kernel-internal memory slotsIzik Eidus2008-01-30
| | | | | | | | | | Reserve a few memory slots for kernel internal use. This is good for case you have to register memory region and you want to be sure it was not registered from userspace, and for case you want to register a memory region that won't be seen from userspace. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Export memory slot allocation mechanismIzik Eidus2008-01-30
| | | | | | | | Remove kvm memory slot allocation mechanism from the ioctl and put it to exported function. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Unmap kernel-allocated memory on slot destructionIzik Eidus2008-01-30
| | | | | | | | | kvm_vm_ioctl_set_memory_region() is able to remove memory in addition to adding it. Therefore when using kernel swapping support for old userspaces, we need to munmap the memory if the user request to remove it Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move vmx_vcpu_reset() out of vmx_vcpu_setup()Avi Kivity2008-01-30
| | | | | | | | | | Split guest reset code out of vmx_vcpu_setup(). Besides being cleaner, this moves the realmode tss setup (which can sleep) outside vmx_vcpu_setup() (which is executed with preemption enabled). [izik: remove unused variable] Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Split kvm_vcpu into arch dependent and independent parts ↵Zhang Xiantao2008-01-30
| | | | | | | | | | | (part 1) First step to split kvm_vcpu. Currently, we just use an macro to define the common fields in kvm_vcpu for all archs, and all archs need to define its own kvm_vcpu struct. Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Allocate userspace memory for older userspaceAnthony Liguori2008-01-30
| | | | | | | | | | | | Allocate a userspace buffer for older userspaces. Also eliminate phys_mem buffer. The memset() in kvmctl really kills initial memory usage but swapping works even with old userspaces. A side effect is that maximum guest side is reduced for older userspace on i386. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use virtual cpu accounting if available for guest times.Christian Borntraeger2008-01-30
| | | | | | | | | | | | | | | | | | ppc and s390 offer the possibility to track process times precisely by looking at cpu timer on every context switch, irq, softirq etc. We can use that infrastructure as well for guest time accounting. We need to account the used time before we change the state. This patch adds a call to account_system_vtime to kvm_guest_enter and kvm_guest exit. If CONFIG_VIRT_CPU_ACCOUNTING is not set, account_system_vtime is defined in hardirq.h as an empty function, which means this patch does not change the behaviour on other platforms. I compile tested this patch on x86 and function tested the patch on s390. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Partial swapping of guest memoryIzik Eidus2008-01-30
| | | | | | | | | | | | This allows guest memory to be swapped. Pages which are currently mapped via shadow page tables are pinned into memory, but all other pages can be freely swapped. The patch makes gfn_to_page() elevate the page's reference count, and introduces kvm_release_page() that pairs with it. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Make gfn_to_page() always safeIzik Eidus2008-01-30
| | | | | | | | | | In case the page is not present in the guest memory map, return a dummy page the guest can scribble on. This simplifies error checking in its users. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Rename KVM_TLB_FLUSH to KVM_REQ_TLB_FLUSHAvi Kivity2008-01-30
| | | | | | We now have a new namespace, KVM_REQ_*, for bits in vcpu->requests. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: split kvm_vcpu_ioctlCarsten Otte2008-01-30
| | | | | | | | | | | | | | | | | | | | | | | | This patch splits kvm_vcpu_ioctl into archtecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. Common ioctls for all architectures are: KVM_RUN, KVM_GET/SET_(S-)REGS, KVM_TRANSLATE, KVM_INTERRUPT, KVM_DEBUG_GUEST, KVM_SET_SIGNAL_MASK, KVM_GET/SET_FPU Note that some PPC chips don't have an FPU, so we might need an #ifdef around KVM_GET/SET_FPU one day. x86 specific ioctls are: KVM_GET/SET_LAPIC, KVM_SET_CPUID, KVM_GET/SET_MSRS An interresting aspect is vcpu_load/vcpu_put. We now have a common vcpu_load/put which does the preemption stuff, and an architecture specific kvm_arch_vcpu_load/put. In the x86 case, this one calls the vmx/svm function defined in kvm_x86_ops. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: More struct kvm_vcpu -> struct kvm cleanupsAnthony Liguori2008-01-30
| | | | | | | | | This time, the biggest change is gpa_to_hpa. The translation of GPA to HPA does not depend on the VCPU state unlike GVA to GPA so there's no need to pass in the kvm_vcpu. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move x86 msr handling to new files x86.[ch]Carsten Otte2008-01-30
| | | | | Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Support assigning userspace memory to the guestIzik Eidus2008-01-30
| | | | | | | | | | | Instead of having the kernel allocate memory to the guest, let userspace allocate it and pass the address to the kernel. This is required for s390 support, but also enables features like memory sharing and using hugetlbfs backed memory. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: CodingStyle cleanupMike Day2008-01-30
| | | | | Signed-off-by: Mike D. Day <ncmike@ncultra.org> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Allow dynamic allocation of the mmu shadow cache sizeIzik Eidus2008-01-30
| | | | | | | The user is now able to set how many mmu pages will be allocated to the guest. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add general accessors to read and write guest memoryIzik Eidus2008-01-30
| | | | | Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove the usage of page->private field by rmapIzik Eidus2008-01-30
| | | | | | | | | | | | | When kvm uses user-allocated pages in the future for the guest, we won't be able to use page->private for rmap, since page->rmap is reserved for the filesystem. So we move the rmap base pointers to the memory slot. A side effect of this is that we need to store the gfn of each gpte in the shadow pages, since the memory slot is addressed by gfn, instead of hfn like struct page. Signed-off-by: Izik Eidus <izik@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Make flooding detection work when guest page faults are bypassedAvi Kivity2008-01-30
| | | | | | | | | When we allow guest page faults to reach the guests directly, we lose the fault tracking which allows us to detect demand paging. So we provide an alternate mechnism by clearing the accessed bit when we set a pte, and checking it later to see if the guest actually used it. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Allow not-present guest page faults to bypass kvmAvi Kivity2008-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | There are two classes of page faults trapped by kvm: - host page faults, where the fault is needed to allow kvm to install the shadow pte or update the guest accessed and dirty bits - guest page faults, where the guest has faulted and kvm simply injects the fault back into the guest to handle The second class, guest page faults, is pure overhead. We can eliminate some of it on vmx using the following evil trick: - when we set up a shadow page table entry, if the corresponding guest pte is not present, set up the shadow pte as not present - if the guest pte _is_ present, mark the shadow pte as present but also set one of the reserved bits in the shadow pte - tell the vmx hardware not to trap faults which have the present bit clear With this, normal page-not-present faults go directly to the guest, bypassing kvm entirely. Unfortunately, this trick only works on Intel hardware, as AMD lacks a way to discriminate among page faults based on error code. It is also a little risky since it uses reserved bits which might become unreserved in the future, so a module parameter is provided to disable it. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Call x86_decode_insn() only when neededLaurent Vivier2008-01-30
| | | | | | | | | Move emulate_ctxt to kvm_vcpu to keep emulate context when we exit from kvm module. Call x86_decode_insn() only when needed. Modify x86_emulate_insn() to not modify the context if it must be re-entered. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Refactor hypercall infrastructure (v3)Anthony Liguori2008-01-30
| | | | | | | | | | | | | | | | | | This patch refactors the current hypercall infrastructure to better support live migration and SMP. It eliminates the hypercall page by trapping the UD exception that would occur if you used the wrong hypercall instruction for the underlying architecture and replacing it with the right one lazily. A fall-out of this patch is that the unhandled hypercalls no longer trap to userspace. There is very little reason though to use a hypercall to communicate with userspace as PIO or MMIO can be used. There is no code in tree that uses userspace hypercalls. [avi: fix #ud injection on vmx] Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* sched: guest CPU accounting: maintain guest state in KVMLaurent Vivier2007-10-15
| | | | | | | | | | Modify KVM to update guest time accounting. [ mingo@elte.hu: ported to 2.6.24 KVM. ] Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Acked-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* KVM: Improve emulation failure reportingAvi Kivity2007-10-13
| | | | | | Report failed opcodes from all locations. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move main vcpu loop into subarch independent codeAvi Kivity2007-10-13
| | | | | | This simplifies adding new code as well as reducing overall code size. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Rename kvm_arch_ops to kvm_x86_opsChristian Ehrhardt2007-10-13
| | | | | | | | This patch just renames the current (misnamed) _arch namings to _x86 to ensure better readability when a real arch layer takes place. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Simplify memory allocationLaurent Vivier2007-10-13
| | | | | | | | | | The mutex->splinlock convertion alllows us to make some code simplifications. As we can keep the lock longer, we don't have to release it and then have to check if the environment has not been modified before re-taking it. We can remove kvm->busy and kvm->memory_config_version. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Hoist SVM's get_cs_db_l_bits into core code.Rusty Russell2007-10-13
| | | | | | | | SVM gets the DB and L bits for the cs by decoding the segment. This is in fact the completely generic code, so hoist it for kvm-lite to use. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Clean up unloved invlpg emulationRusty Russell2007-10-13
| | | | | | | | | | invlpg shouldn't fetch the "src" address, since it may not be valid, however SVM's "solution" which neuters emulation of all group 7 instruction is horrible and breaks kvm-lite. The simplest fix is to put a special check in for invlpg. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove the unused invlpg member of struct kvm_arch_ops.Rusty Russell2007-10-13
| | | | | Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: enable in-kernel APIC INIT/SIPI handlingHe, Qing2007-10-13
| | | | | | | | | | | This patch enables INIT/SIPI handling using in-kernel APIC by introducing a ->mp_state field to emulate the SMP state transition. [avi: remove smp_processor_id() warning] Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Xin Li <xin.b.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: round robin for APIC lowest priority delivery modeHe, Qing2007-10-13
| | | | | Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: pending irq save/restoreEddie Dong2007-10-13
| | | | | | | | | | | Add in kernel irqchip save/restore support for pending vectors. [avi: fix compile warning on i386] [avi: remove printk] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Emulate hlt in the kernelEddie Dong2007-10-13
| | | | | | | | | By sleeping in the kernel when hlt is executed, we simplify the in-kernel guest interrupt path considerably. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: In-kernel I/O APIC modelEddie Dong2007-10-13
| | | | | | | | | | | | | | | This allows in-kernel host-side device drivers to raise guest interrupts without going to userspace. [avi: fix level-triggered interrupt redelivery on eoi] [avi: add missing #include] [avi: avoid redelivery of edge-triggered interrupt] [avi: implement polarity] [avi: don't deliver edge-triggered interrupts when unmasking] [avi: fix host oops on invalid guest access] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Emulate local APIC in kernelEddie Dong2007-10-13
| | | | | | | | | | | | | | | | Because lightweight exits (exits which don't involve userspace) are many times faster than heavyweight exits, it makes sense to emulate high usage devices in the kernel. The local APIC is one such device, especially for Windows and for SMP, so we add an APIC model to kvm. It also allows in-kernel host-side drivers to inject interrupts without going through userspace. [compile fix on i386 from Jindrich Makovicka] Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Define and use cr8 access functionsEddie Dong2007-10-13
| | | | | | | | | This patch is to wrap APIC base register and CR8 operation which can provide a unique API for user level irqchip and kernel irqchip. This is a preparation of merging lapic/ioapic patch. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add support for in-kernel PIC emulationEddie Dong2007-10-13
| | | | | Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Support more memory slotsIzik Eidus2007-10-13
| | | | | | | Needed for mapping memory at 4GB. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Clean up kvm_setup_pio()Laurent Vivier2007-10-13
| | | | | | | | Split kvm_setup_pio() into two functions, one to setup in/out pio (kvm_emulate_pio()) and one to setup ins/outs pio (kvm_emulate_pio_string()). Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add and use pr_unimpl for standard formatting of unimplemented featuresRusty Russell2007-10-13
| | | | | | | | All guest-invokable printks should be ratelimited to prevent malicious guests from flooding logs. This is a start. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Add cpu consistency checkYang, Sheng2007-10-13
| | | | | | | | | All the physical CPUs on the board should support the same VMX feature set. Add check_processor_compatibility to kvm_arch_ops for the consistency check. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use alignment properties of vcpu to simplify FPU opsRusty Russell2007-10-13
| | | | | | | | | Now we use a kmem cache for allocating vcpus, we can get the 16-byte alignment required by fxsave & fxrstor instructions, and avoid manually aligning the buffer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use kmem cache for allocating vcpusRusty Russell2007-10-13
| | | | | | | | | Avi wants the allocations of vcpus centralized again. The easiest way is to add a "size" arg to kvm_init_arch, and expose the thus-prepared cache to the modules. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove kvm_{read,write}_guest()Laurent Vivier2007-10-13
| | | | | | | ... in favor of the more general emulator_{read,write}_*. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Convert vm lock to a mutexShaohua Li2007-10-13
| | | | | | | | This allows the kvm mmu to perform sleepy operations, such as memory allocation. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use the scheduler preemption notifiers to make kvm preemptibleAvi Kivity2007-10-13
| | | | | | | | | | | | | | Current kvm disables preemption while the new virtualization registers are in use. This of course is not very good for latency sensitive workloads (one use of virtualization is to offload user interface and other latency insensitive stuff to a container, so that it is easier to analyze the remaining workload). This patch re-enables preemption for kvm; preemption is now only disabled when switching the registers in and out, and during the switch to guest mode and back. Contains fixes from Shaohua Li <shaohua.li@intel.com>. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Dynamically allocate vcpusRusty Russell2007-10-13
| | | | | | | | | | This patch converts the vcpus array in "struct kvm" to a pointer array, and changes the "vcpu_create" and "vcpu_setup" hooks into one "vcpu_create" call which does the allocation and initialization of the vcpu (calling back into the kvm_vcpu_init core helper). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>