aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* KVM: VMX: Use vmx to inject real-mode interruptsAvi Kivity2008-01-30
| | | | | | | | | | | | Instead of injecting real-mode interrupts by writing the interrupt frame into guest memory, abuse vmx by injecting a software interrupt. We need to pretend the software interrupt instruction had a length > 0, so we have to adjust rip backward. This lets us not to mess with writing guest memory, which is complex and also sleeps. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add make_page_dirty() to kvm_clear_guest_page()Dor Laor2008-01-30
| | | | | | | | Every write access to guest pages should be tracked. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Move x86 vcpu ioctl handlers to x86.cHollis Blanchard2008-01-30
| | | | | Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Move x86 FPU handling to x86.cHollis Blanchard2008-01-30
| | | | | Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Move x86 instruction emulation code to x86.cHollis Blanchard2008-01-30
| | | | | Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Make exported debugfs data architecture-specificHollis Blanchard2008-01-30
| | | | | Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: Hoist modrm and abs decoding into separate functionsAvi Kivity2008-01-30
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Make mark_page_dirty() work for aliased pages too.Uri Lublin2008-01-30
| | | | | | | Recommended by Izik Eidus. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Simplify decode_register_operand() calling conventionAvi Kivity2008-01-30
| | | | | | | Now that rex_prefix is part of the decode cache, there is no need to pass it along. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: centralize decoding of one-byte register access insnsAvi Kivity2008-01-30
| | | | | | | | | Instructions like 'inc reg' that have the register operand encoded in the opcode are currently specially decoded. Extend decode_register_operand() to handle that case, indicated by having DstReg or SrcReg without ModRM. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: Extract the common code of SrcReg and DstRegAvi Kivity2008-01-30
| | | | | | Share the common parts of SrcReg and DstReg decoding. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Move pio emulation functions to x86.cCarsten Otte2008-01-30
| | | | | | | | | | | | | | This patch moves implementation of the following functions from kvm_main.c to x86.c: free_pio_guest_pages, vcpu_find_pio_dev, pio_copy_data, complete_pio, kernel_pio, pio_string_write, kvm_emulate_pio, kvm_emulate_pio_string The function inject_gp, which was duplicated by yesterday's patch series, is removed from kvm_main.c now because it is not needed anymore. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Move x86 emulation and mmio device hook to x86.cCarsten Otte2008-01-30
| | | | | | | | | | | | | | | | This patch moves the following functions to from kvm_main.c to x86.c: emulator_read/write_std, vcpu_find_pervcpu_dev, vcpu_find_mmio_dev, emulator_read/write_emulated, emulator_write_phys, emulator_write_emulated_onepage, emulator_cmpxchg_emulated, get_setment_base, emulate_invlpg, emulate_clts, emulator_get/set_dr, kvm_report_emulation_failure, emulate_instruction The following data type is moved to x86.c: struct x86_emulate_ops emulate_ops Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Move kvm_get/set_msr[_common] to x86.cCarsten Otte2008-01-30
| | | | | | | | | | This patch moves the implementation of the functions of kvm_get/set_msr, kvm_get/set_msr_common, and set_efer from kvm_main.c to x86.c. The definition of EFER_RESERVED_BITS is moved too. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Fix gfn_to_page() acquiring mmap_sem twiceAnthony Liguori2008-01-30
| | | | | | | | | | | | KVM's nopage handler calls gfn_to_page() which acquires the mmap_sem when calling out to get_user_pages(). nopage handlers are already invoked with the mmap_sem held though. Introduce a __gfn_to_page() for use by the nopage handler which requires the lock to already be held. This was noticed by tglx. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Enable memory mapped TPR shadow (FlexPriority)Sheng Yang2008-01-30
| | | | | | | | | | This patch based on CR8/TPR patch, and enable the TPR shadow (FlexPriority) for 32bit Windows. Since TPR is accessed very frequently by 32bit Windows, especially SMP guest, with FlexPriority enabled, we saw significant performance gain. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Move control register helper functions to x86.cCarsten Otte2008-01-30
| | | | | | | | | | | | | | | | | | This patch moves the definitions of CR0_RESERVED_BITS, CR4_RESERVED_BITS, and CR8_RESERVED_BITS along with the following functions from kvm_main.c to x86.c: set_cr0(), set_cr3(), set_cr4(), set_cr8(), get_cr8(), lmsw(), load_pdptrs() The static function wrapper inject_gp is duplicated in kvm_main.c and x86.c for now, the version in kvm_main.c should disappear once the last user of it is gone too. The function load_pdptrs is no longer static, and now defined in x86.h for the time being, until the last user of it is gone from kvm_main.c. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: move get/set_apic_base to x86.cCarsten Otte2008-01-30
| | | | | | | | | | This patch moves the implementation of get_apic_base and set_apic_base from kvm_main.c to x86.c Signed-off-by: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Move memory segmentation to x86.cCarsten Otte2008-01-30
| | | | | | | | | | | | This patch moves the definition of segment_descriptor_64 for AMD64 and EM64T from kvm_main.c to segment_descriptor.h. It also adds a proper #ifndef...#define...#endif around that header file. The implementation of segment_base is moved from kvm_main.c to x86.c. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Split kvm_vm_ioctl v3Carsten Otte2008-01-30
| | | | | | | | | | | | | | | | | | | | This patch splits kvm_vm_ioctl into archtecture independent parts, and x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. The patch is unchanged since last submission. Common ioctls for all architectures are: KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG, KVM_SET_USER_MEMORY_REGION x86 specific ioctls are: KVM_SET_MEMORY_REGION, KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP, KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP KVM_SET_TSS_ADDR Signed-off-by: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Topup the mmu memory preallocation caches before emulating an insnAvi Kivity2008-01-30
| | | | | | | Emulation may cause a shadow pte to be instantiated, which requires memory resources. Make sure the caches are filled to avoid an oops. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move page fault processing to common codeAvi Kivity2008-01-30
| | | | | | | The code that dispatches the page fault and emulates if we failed to map is duplicated across vmx and svm. Merge it to simplify further bugfixing. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: don't depend on cr2 for mov abs emulationAvi Kivity2008-01-30
| | | | | | | | | | | | | | | | | The 'mov abs' instruction family (opcodes 0xa0 - 0xa3) still depends on cr2 provided by the page fault handler. This is wrong for several reasons: - if an instruction accessed misaligned data that crosses a page boundary, and if the fault happened on the second page, cr2 will point at the second page, not the data itself. - if we're emulating in real mode, or due to a FlexPriority exit, there is no cr2 generated. So, this change adds decoding for this instruction form and drops reliance on cr2. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: Let gcc to choose which registers to save (i386)Laurent Vivier2008-01-30
| | | | | | | | | | | | | | | | | | | | | | This patch lets GCC to determine which registers to save when we switch to/from a VCPU in the case of AMD i386 * Original code saves following registers: ebx, ecx, edx, esi, edi, ebp * Patched code: - informs GCC that we modify following registers using the clobber description: ebx, ecx, edx, esi, edi - rbp is saved (pop/push) because GCC seems to ignore its use in the clobber description. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: Let gcc to choose which registers to save (x86_64)Laurent Vivier2008-01-30
| | | | | | | | | | | | | | | | | | | | | | | | This patch lets GCC to determine which registers to save when we switch to/from a VCPU in the case of AMD x86_64. * Original code saves following registers: rbx, rcx, rdx, rsi, rdi, rbp, r8, r9, r10, r11, r12, r13, r14, r15 * Patched code: - informs GCC that we modify following registers using the clobber description: rbx, rcx, rdx, rsi, rdi r8, r9, r10, r11, r12, r13, r14, r15 - rbp is saved (pop/push) because GCC seems to ignore its use in the clobber description. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Let gcc to choose which registers to save (i386)Laurent Vivier2008-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch lets GCC to determine which registers to save when we switch to/from a VCPU in the case of intel i386. * Original code saves following registers: eax, ebx, ecx, edx, edi, esi, ebp (using popa) * Patched code: - informs GCC that we modify following registers using the clobber description: ebx, edi, rsi - doesn't save eax because it is an output operand (vmx->fail) - cannot put ecx in clobber description because it is an input operand, but as we modify it and we want to keep its value (vcpu), we must save it (pop/push) - ebp is saved (pop/push) because GCC seems to ignore its use the clobber description. - edx is saved (pop/push) because it is reserved by GCC (REGPARM) and cannot be put in the clobber description. - line "mov (%%esp), %3 \n\t" has been removed because %3 is ecx and ecx is restored just after. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Let gcc to choose which registers to save (x86_64)Laurent Vivier2008-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch lets GCC to determine which registers to save when we switch to/from a VCPU in the case of intel x86_64. * Original code saves following registers: rax, rbx, rcx, rdx, rsi, rdi, rbp, r8, r9, r10, r11, r12, r13, r14, r15 * Patched code: - informs GCC that we modify following registers using the clobber description: rbx, rdi, rsi, r8, r9, r10, r11, r12, r13, r14, r15 - doesn't save rax because it is an output operand (vmx->fail) - cannot put rcx in clobber description because it is an input operand, but as we modify it and we want to keep its value (vcpu), we must save it (pop/push) - rbp is saved (pop/push) because GCC seems to ignore its use in the clobber description. - rdx is saved (pop/push) because it is reserved by GCC (REGPARM) and cannot be put in the clobber description. - line "mov (%%rsp), %3 \n\t" has been removed because %3 is rcx and rcx is restored just after. - line ASM_VMX_VMWRITE_RSP_RDX() is moved out of the ifdef/else/endif Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add ioctl to tss address from userspace,Izik Eidus2008-01-30
| | | | | | | | | | Currently kvm has a wart in that it requires three extra pages for use as a tss when emulating real mode on Intel. This patch moves the allocation internally, only requiring userspace to tell us where in the physical address space we can place the tss. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add kernel-internal memory slotsIzik Eidus2008-01-30
| | | | | | | | | | Reserve a few memory slots for kernel internal use. This is good for case you have to register memory region and you want to be sure it was not registered from userspace, and for case you want to register a memory region that won't be seen from userspace. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Export memory slot allocation mechanismIzik Eidus2008-01-30
| | | | | | | | Remove kvm memory slot allocation mechanism from the ioctl and put it to exported function. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Unmap kernel-allocated memory on slot destructionIzik Eidus2008-01-30
| | | | | | | | | kvm_vm_ioctl_set_memory_region() is able to remove memory in addition to adding it. Therefore when using kernel swapping support for old userspaces, we need to munmap the memory if the user request to remove it Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Per-architecture hypercall definitionsChristian Borntraeger2008-01-30
| | | | | | | | | | Currently kvm provides hypercalls only for x86* architectures. To provide hypercall infrastructure for other kvm architectures I split kvm_para.h into a generic header file and architecture specific definitions. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Split IOAPIC reset function and export for kernel RESETEddie Dong2008-01-30
| | | | | Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Export PIC reset for kernel device resetEddie Dong2008-01-30
| | | | | Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add a might_sleep() annotation to gfn_to_page()Avi Kivity2008-01-30
| | | | | | This will help trap accesses to guest memory in atomic context. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move vmx_vcpu_reset() out of vmx_vcpu_setup()Avi Kivity2008-01-30
| | | | | | | | | | Split guest reset code out of vmx_vcpu_setup(). Besides being cleaner, this moves the realmode tss setup (which can sleep) outside vmx_vcpu_setup() (which is executed with preemption enabled). [izik: remove unused variable] Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Portability: Split kvm_vcpu into arch dependent and independent parts ↵Zhang Xiantao2008-01-30
| | | | | | | | | | | (part 1) First step to split kvm_vcpu. Currently, we just use an macro to define the common fields in kvm_vcpu for all archs, and all archs need to define its own kvm_vcpu struct. Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Allocate userspace memory for older userspaceAnthony Liguori2008-01-30
| | | | | | | | | | | | Allocate a userspace buffer for older userspaces. Also eliminate phys_mem buffer. The memset() in kvmctl really kills initial memory usage but swapping works even with old userspaces. A side effect is that maximum guest side is reduced for older userspace on i386. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use virtual cpu accounting if available for guest times.Christian Borntraeger2008-01-30
| | | | | | | | | | | | | | | | | | ppc and s390 offer the possibility to track process times precisely by looking at cpu timer on every context switch, irq, softirq etc. We can use that infrastructure as well for guest time accounting. We need to account the used time before we change the state. This patch adds a call to account_system_vtime to kvm_guest_enter and kvm_guest exit. If CONFIG_VIRT_CPU_ACCOUNTING is not set, account_system_vtime is defined in hardirq.h as an empty function, which means this patch does not change the behaviour on other platforms. I compile tested this patch on x86 and function tested the patch on s390. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Partial swapping of guest memoryIzik Eidus2008-01-30
| | | | | | | | | | | | This allows guest memory to be swapped. Pages which are currently mapped via shadow page tables are pinned into memory, but all other pages can be freely swapped. The patch makes gfn_to_page() elevate the page's reference count, and introduces kvm_release_page() that pairs with it. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Make gfn_to_page() always safeIzik Eidus2008-01-30
| | | | | | | | | | In case the page is not present in the guest memory map, return a dummy page the guest can scribble on. This simplifies error checking in its users. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Keep a reverse mapping of non-writable translationsIzik Eidus2008-01-30
| | | | | | | | | | | | The current kvm mmu only reverse maps writable translation. This is used to write-protect a page in case it becomes a pagetable. But with swapping support, we need a reverse mapping of read-only pages as well: when we evict a page, we need to remove any mapping to it, whether writable or not. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Add rmap_next(), a helper for walking kvm rmapsIzik Eidus2008-01-30
| | | | | Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: cmc, clc, cli, stiNitin A Kamble2008-01-30
| | | | | | | | | | Instruction: cmc, clc, cli, sti opcodes: 0xf5, 0xf8, 0xfa, 0xfb respectively. [avi: fix reference to EFLG_IF which is not defined anywhere] Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Simplify page table walkerAvi Kivity2008-01-30
| | | | | | | | | | | | Simplify the walker level loop not to carry so much information from one loop to the next. In addition to being complex, this made kmap_atomic() critical sections difficult to manage. As a result of this change, kmap_atomic() sections are limited to actually touching the guest pte, which allows the other functions called from the walker to do sleepy operations. This will happen when we enable swapping. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: Implement emulation of instruction: inc & decNitin A Kamble2008-01-30
| | | | | | | | | Instructions: inc r16/r32 (opcode 0x40-0x47) dec r16/r32 (opcode 0x48-0x4f) Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Rename KVM_TLB_FLUSH to KVM_REQ_TLB_FLUSHAvi Kivity2008-01-30
| | | | | | We now have a new namespace, KVM_REQ_*, for bits in vcpu->requests. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move apic timer interrupt backlog processing to common codeAvi Kivity2008-01-30
| | | | | | | | Beside the obvious goodness of making code more common, this prevents a livelock with the next patch which moves interrupt injection out of the critical section. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add some \n in ioapic_debug()Laurent Vivier2008-01-30
| | | | | | | Add new-line at end of debug strings. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: apic round robin cleanupQing He2008-01-30
| | | | | | | | | | If no apic is enabled in the bitmap of an interrupt delivery with delivery mode of lowest priority, a warning should be reported rather than select a fallback vcpu Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Eddie (Yaozu) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>