litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	x86, hyper: fix build with !CONFIG_KVM_GUEST	Avi Kivity	2012-07-18
\| \| \| \| \|	Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	Revert "apic: fix kvm build on UP without IOAPIC"	Michael S. Tsirkin	2012-07-16
\| \| \| \| \| \| \| \| \|	This reverts commit f9808b7fd422b965cea52e05ba470e0a473c53d3. After commit 'kvm: switch to apic_set_eoi_write, apic_write' the stubs are no longer needed as kvm does not look at apicdrivers anymore. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM guest: switch to apic_set_eoi_write, apic_write	Michael S. Tsirkin	2012-07-16
\| \| \| \| \| \| \| \|	Use apic_set_eoi_write, apic_write to avoid meedling in core apic driver data structures directly. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	apic: add apic_set_eoi_write for PV use	Michael S. Tsirkin	2012-07-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	KVM PV EOI optimization overrides eoi_write apic op with its own version. Add an API for this to avoid meddling with core x86 apic driver data structures directly. For KVM use, we don't need any guarantees about when the switch to the new op will take place, so it could in theory use this API after SMP init, but it currently doesn't, and restricting callers to early init makes it clear that it's safe as it won't race with actual APIC driver use. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Implement PCID/INVPCID for guests with EPT	Mao, Junjie	2012-07-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch handles PCID/INVPCID for guests. Process-context identifiers (PCIDs) are a facility by which a logical processor may cache information for multiple linear-address spaces so that the processor may retain cached information when software switches to a different linear address space. Refer to section 4.10.1 in IA32 Intel Software Developer's Manual Volume 3A for details. For guests with EPT, the PCID feature is enabled and INVPCID behaves as running natively. For guests without EPT, the PCID feature is disabled and INVPCID triggers #UD. Signed-off-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check	Prarit Bhargava	2012-07-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While debugging I noticed that unlike all the other hypervisor code in the kernel, kvm does not have an entry for x86_hyper which is used in detect_hypervisor_platform() which results in a nice printk in the syslog. This is only really a stub function but it does make kvm more consistent with the other hypervisors. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marcelo Tostatti <mtosatti@redhat.com> Cc: kvm@vger.kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \| \|	The P bit of page fault error code is missed in this tracepoint, fix it by passing the full error code Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: trace fast page fault	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \|	To see what happen on this path and help us to optimize it Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: fast path of handling guest page fault	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \| \| \| \| \| \|	If the the present bit of page fault error code is set, it indicates the shadow page is populated on all levels, it means what we do is only modify the access bit which can be done out of mmu-lock Currently, in order to simplify the code, we only fix the page fault caused by write-protect on the fast path Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: introduce SPTE_MMU_WRITEABLE bit	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \| \| \| \| \| \|	This bit indicates whether the spte can be writable on MMU, that means the corresponding gpte is writable and the corresponding gfn is not protected by shadow page protection In the later path, SPTE_MMU_WRITEABLE will indicates whether the spte can be locklessly updated Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: fold tlb flush judgement into mmu_spte_update	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \|	mmu_spte_update() is the common function, we can easily audit the path Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: export PFEC.P bit on ept	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \| \|	Export the present bit of page fault error code, the later patch will use it Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: cleanup spte_write_protect	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \|	Use __drop_large_spte to cleanup this function and comment spte_write_protect Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: abstract spte write-protect	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \| \|	Introduce a common function to abstract spte write-protect to cleanup the code Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: return bool in __rmap_write_protect	Xiao Guangrong	2012-07-11
\| \| \| \| \| \| \| \|	The reture value of __rmap_write_protect is either 1 or 0, use true/false instead of these Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Emulate invalid guest state by default	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \|	Our emulation should be complete enough that we can emulate guests while they are in big real mode, or in a mode transition that is not virtualizable without unrestricted guest support. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: implement LTR	Avi Kivity	2012-07-09
\| \| \| \| \| \|	Opcode 0F 00 /3. Encountered during Windows XP secondary processor bringup. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: make loading TR set the busy bit	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \|	Guest software doesn't actually depend on it, but vmx will refuse us entry if we don't. Set the bit in both the cached segment and memory, just to be nice. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: make read_segment_descriptor() return the address	Avi Kivity	2012-07-09
\| \| \| \| \| \| \|	Some operations want to modify the descriptor later on, so save the address for future use. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: emulate LLDT	Avi Kivity	2012-07-09
\| \| \| \| \| \|	Opcode 0F 00 /2. Used by isolinux durign the protected mode transition. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: emulate BSWAP	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \|	Opcodes 0F C8 - 0F CF. Used by the SeaBIOS cdrom code (though not in big real mode). Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Improve error reporting during invalid guest state emulation	Avi Kivity	2012-07-09
\| \| \| \| \| \|	If instruction emulation fails, report it properly to userspace. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Stop invalid guest state emulation on pending event	Avi Kivity	2012-07-09
\| \| \| \| \| \|	Process the event, possibly injecting an interrupt, before continuing. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: implement ENTER	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \|	Opcode C8. Only ENTER with lexical nesting depth 0 is implemented, since others are very rare. We'll fail emulation if nonzero lexical depth is used so data is not corrupted. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: split push logic from push opcode emulation	Avi Kivity	2012-07-09
\| \| \| \| \| \| \|	This allows us to reuse the code without populating ctxt->src and overriding ctxt->op_bytes. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: fix byte-sized MOVZX/MOVSX	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \| \|	Commit 2adb5ad9fe1 removed ByteOp from MOVZX/MOVSX, replacing them by SrcMem8, but neglected to fix the dependency in the emulation code on ByteOp. This caused the instruction not to have any effect in some circumstances. Fix by replacing the check for ByteOp with the equivalent src.op_bytes == 1. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: emulate LAHF	Avi Kivity	2012-07-09
\| \| \| \| \| \|	Opcode 9F. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Continue emulating after batch exhausted	Avi Kivity	2012-07-09
\| \| \| \| \| \| \|	If we return early from an invalid guest state emulation loop, make sure we return to it later if the guest state is still invalid. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Fix interrupt exit condition during emulation	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \| \|	Checking EFLAGS.IF is incorrect as we might be in interrupt shadow. If that is the case, the main loop will notice that and not inject the interrupt, causing an endless loop. Fix by using vmx_interrupt_allowed() to check if we can inject an interrupt instead. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: emulate SGDT/SIDT	Avi Kivity	2012-07-09
\| \| \| \| \| \|	Opcodes 0F 01 /0 and 0F 01 /1 Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Fix SS default ESP/EBP based addressing	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \|	We correctly default to SS when BP is used as a base in 16-bit address mode, but we don't do that for 32-bit mode. Fix by adjusting the default to SS when either ESP or EBP is used as the base register. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: initialize memop	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \| \|	memop is not initialized; this can lead to a two-byte operation following a 4-byte operation to see garbage values. Usually truncation fixes things fot us later on, but at least in one case (call abs) it doesn't. Fix by moving memop to the auto-initialized field area. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: emulate LEAVE	Avi Kivity	2012-07-09
\| \| \| \| \| \|	Opcode c9; used by some variants of Windows during boot, in big real mode. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Limit iterations with emulator_invalid_guest_state	Avi Kivity	2012-07-09
\| \| \| \| \| \| \|	Otherwise, if the guest ends up looping, we never exit the srcu critical section, which causes synchronize_srcu() to hang. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Relax check on unusable segment	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \|	Some userspace (e.g. QEMU 1.1) munge the d and g bits of segment descriptors, causing us not to recognize them as unusable segments with emulate_invalid_guest_state=1. Relax the check by testing for segment not present (a non-present segment cannot be usable). Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: fix LIDT/LGDT in long mode	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \|	The operand size for these instructions is 8 bytes in long mode, even without a REX prefix. Set it explicitly. Triggered while booting Linux with emulate_invalid_guest_state=1. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: allow loading null SS in long mode	Avi Kivity	2012-07-09
\| \| \| \| \| \|	Null SS is valid in long mode; allow loading it. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: emulate cpuid	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \|	Opcode 0F A2. Used by Linux during the mode change trampoline while in a state that is not virtualizable on vmx without unrestricted_guest, so we need to emulate it is emulate_invalid_guest_state=1. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: x86 emulator: change ->get_cpuid() accessor to use the x86 semantics	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \|	Instead of getting an exact leaf, follow the spec and fall back to the last main leaf instead. This lets us easily emulate the cpuid instruction in the emulator. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Split cpuid register access from computation	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \|	Introduce kvm_cpuid() to perform the leaf limit check and calculate register values, and let kvm_emulate_cpuid() just handle reading and writing the registers from/to the vcpu. This allows us to reuse kvm_cpuid() in a context where directly reading and writing registers is not desired. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: Return correct CPL during transition to protected mode	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \| \| \| \|	In protected mode, the CPL is defined as the lower two bits of CS, as set by the last far jump. But during the transition to protected mode, there is no last far jump, so we need to return zero (the inherited real mode CPL). Fix by reading CPL from the cache during the transition. This isn't 100% correct since we don't set the CPL cache on a far jump, but since protected mode transition will always jump to a segment with RPL=0, it will always work. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: MMU: Force cr3 reload with two dimensional paging on mov cr3 emulation	Avi Kivity	2012-07-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the MMU's ->new_cr3() callback does nothing when guest paging is disabled or when two-dimentional paging (e.g. EPT on Intel) is active. This means that an emulated write to cr3 can be lost; kvm_set_cr3() will write vcpu-arch.cr3, but the GUEST_CR3 field in the VMCS will retain its old value and this is what the guest sees. This bug did not have any effect until now because: - with unrestricted guest, or with svm, we never emulate a mov cr3 instruction - without unrestricted guest, and with paging enabled, we also never emulate a mov cr3 instruction - without unrestricted guest, but with paging disabled, the guest's cr3 is ignored until the guest enables paging; at this point the value from arch.cr3 is loaded correctly my the mov cr0 instruction which turns on paging However, the patchset that enables big real mode causes us to emulate mov cr3 instructions in protected mode sometimes (when guest state is not virtualizable by vmx); this mov cr3 is effectively ignored and will crash the guest. The fix is to make nonpaging_new_cr3() call mmu_free_roots() to force a cr3 reload. This is awkward because now all the new_cr3 callbacks to the same thing, and because mmu_free_roots() is somewhat of an overkill; but fixing that is more complicated and will be done after this minimal fix. Observed in the Window XP 32-bit installer while bringing up secondary vcpus. Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: VMX: code clean for vmx_init()	Guo Chao	2012-07-03
\| \| \| \| \|	Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	apic: fix kvm build on UP without IOAPIC	Michael S. Tsirkin	2012-07-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On UP i386, when APIC is disabled # CONFIG_X86_UP_APIC is not set # CONFIG_PCI_IOAPIC is not set code looking at apicdrivers never has any effect but it still gets compiled in. In particular, this causes build failures with kvm, but it generally bloats the kernel unnecessarily. Fix by defining both __apicdrivers and __apicdrivers_end to be NULL when CONFIG_X86_LOCAL_APIC is unset: I verified that as the result any loop scanning __apicdrivers gets optimized out by the compiler. Warning: a .config with apic disabled doesn't seem to boot for me (even without this patch). Still verifying why, meanwhile this patch is compile-tested only. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: host side for eoi optimization	Michael S. Tsirkin	2012-06-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implementation of PV EOI using shared memory. This reduces the number of exits an interrupt causes as much as by half. The idea is simple: there's a bit, per APIC, in guest memory, that tells the guest that it does not need EOI. We set it before injecting an interrupt and clear before injecting a nested one. Guest tests it using a test and clear operation - this is necessary so that host can detect interrupt nesting - and if set, it can skip the EOI MSR. There's a new MSR to set the address of said register in guest memory. Otherwise not much changed: - Guest EOI is not required - Register is tested & ISR is automatically cleared on exit For testing results see description of previous patch 'kvm_para: guest side for eoi avoidance'. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: rearrange injection cancelling code	Michael S. Tsirkin	2012-06-25
\| \| \| \| \| \| \| \| \| \| \|	Each time we need to cancel injection we invoke same code (cancel_injection callback). Move it towards the end of function using the familiar goto on error pattern. Will make it easier to do more cleanups for PV EOI. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: only sync when attention bits set	Michael S. Tsirkin	2012-06-25
\| \| \| \| \| \| \| \| \| \|	Commit eb0dc6d0368072236dcd086d7fdc17fd3c4574d4 introduced apic attention bitmask but kvm still syncs lapic unconditionally. As that commit suggested and in anticipation of adding more attention bits, only sync lapic if(apic_attention). Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	x86, bitops: note on __test_and_clear_bit atomicity	Michael S. Tsirkin	2012-06-25
\| \| \| \| \| \| \| \| \| \|	__test_and_clear_bit is actually atomic with respect to the local CPU. Add a note saying that KVM on x86 relies on this behaviour so people don't accidentaly break it. Also warn not to rely on this in portable code. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM guest: guest side for eoi avoidance	Michael S. Tsirkin	2012-06-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The idea is simple: there's a bit, per APIC, in guest memory, that tells the guest that it does not need EOI. Guest tests it using a single est and clear operation - this is necessary so that host can detect interrupt nesting - and if set, it can skip the EOI MSR. I run a simple microbenchmark to show exit reduction (note: for testing, need to apply follow-up patch 'kvm: host side for eoi optimization' + a qemu patch I posted separately, on host): Before: Performance counter stats for 'sleep 1s': 47,357 kvm:kvm_entry [99.98%] 0 kvm:kvm_hypercall [99.98%] 0 kvm:kvm_hv_hypercall [99.98%] 5,001 kvm:kvm_pio [99.98%] 0 kvm:kvm_cpuid [99.98%] 22,124 kvm:kvm_apic [99.98%] 49,849 kvm:kvm_exit [99.98%] 21,115 kvm:kvm_inj_virq [99.98%] 0 kvm:kvm_inj_exception [99.98%] 0 kvm:kvm_page_fault [99.98%] 22,937 kvm:kvm_msr [99.98%] 0 kvm:kvm_cr [99.98%] 0 kvm:kvm_pic_set_irq [99.98%] 0 kvm:kvm_apic_ipi [99.98%] 22,207 kvm:kvm_apic_accept_irq [99.98%] 22,421 kvm:kvm_eoi [99.98%] 0 kvm:kvm_pv_eoi [99.99%] 0 kvm:kvm_nested_vmrun [99.99%] 0 kvm:kvm_nested_intercepts [99.99%] 0 kvm:kvm_nested_vmexit [99.99%] 0 kvm:kvm_nested_vmexit_inject [99.99%] 0 kvm:kvm_nested_intr_vmexit [99.99%] 0 kvm:kvm_invlpga [99.99%] 0 kvm:kvm_skinit [99.99%] 57 kvm:kvm_emulate_insn [99.99%] 0 kvm:vcpu_match_mmio [99.99%] 0 kvm:kvm_userspace_exit [99.99%] 2 kvm:kvm_set_irq [99.99%] 2 kvm:kvm_ioapic_set_irq [99.99%] 23,609 kvm:kvm_msi_set_irq [99.99%] 1 kvm:kvm_ack_irq [99.99%] 131 kvm:kvm_mmio [99.99%] 226 kvm:kvm_fpu [100.00%] 0 kvm:kvm_age_page [100.00%] 0 kvm:kvm_try_async_get_page [100.00%] 0 kvm:kvm_async_pf_doublefault [100.00%] 0 kvm:kvm_async_pf_not_present [100.00%] 0 kvm:kvm_async_pf_ready [100.00%] 0 kvm:kvm_async_pf_completed 1.002100578 seconds time elapsed After: Performance counter stats for 'sleep 1s': 28,354 kvm:kvm_entry [99.98%] 0 kvm:kvm_hypercall [99.98%] 0 kvm:kvm_hv_hypercall [99.98%] 1,347 kvm:kvm_pio [99.98%] 0 kvm:kvm_cpuid [99.98%] 1,931 kvm:kvm_apic [99.98%] 29,595 kvm:kvm_exit [99.98%] 24,884 kvm:kvm_inj_virq [99.98%] 0 kvm:kvm_inj_exception [99.98%] 0 kvm:kvm_page_fault [99.98%] 1,986 kvm:kvm_msr [99.98%] 0 kvm:kvm_cr [99.98%] 0 kvm:kvm_pic_set_irq [99.98%] 0 kvm:kvm_apic_ipi [99.99%] 25,953 kvm:kvm_apic_accept_irq [99.99%] 26,132 kvm:kvm_eoi [99.99%] 26,593 kvm:kvm_pv_eoi [99.99%] 0 kvm:kvm_nested_vmrun [99.99%] 0 kvm:kvm_nested_intercepts [99.99%] 0 kvm:kvm_nested_vmexit [99.99%] 0 kvm:kvm_nested_vmexit_inject [99.99%] 0 kvm:kvm_nested_intr_vmexit [99.99%] 0 kvm:kvm_invlpga [99.99%] 0 kvm:kvm_skinit [99.99%] 284 kvm:kvm_emulate_insn [99.99%] 68 kvm:vcpu_match_mmio [99.99%] 68 kvm:kvm_userspace_exit [99.99%] 2 kvm:kvm_set_irq [99.99%] 2 kvm:kvm_ioapic_set_irq [99.99%] 28,288 kvm:kvm_msi_set_irq [99.99%] 1 kvm:kvm_ack_irq [99.99%] 131 kvm:kvm_mmio [100.00%] 588 kvm:kvm_fpu [100.00%] 0 kvm:kvm_age_page [100.00%] 0 kvm:kvm_try_async_get_page [100.00%] 0 kvm:kvm_async_pf_doublefault [100.00%] 0 kvm:kvm_async_pf_not_present [100.00%] 0 kvm:kvm_async_pf_ready [100.00%] 0 kvm:kvm_async_pf_completed 1.002039622 seconds time elapsed We see that # of exits is almost halved. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: optimize ISR lookups	Michael S. Tsirkin	2012-06-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We perform ISR lookups twice: during interrupt injection and on EOI. Typical workloads only have a single bit set there. So we can avoid ISR scans by 1. counting bits as we set/clear them in ISR 2. on set, caching the injected vector number 3. on clear, invalidating the cache The real purpose of this is enabling PV EOI which needs to quickly validate the vector. But non PV guests also benefit: with this patch, and without interrupt nesting, apic_find_highest_isr will always return immediately without scanning ISR. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>