litmus-rt.git - The LITMUS^RT kernel.

	Commit message	Author	Age
*	KVM: Fix off-by-one when writing to a nonpae guest pde	Avi Kivity	2007-04-19
	Nonpae guest pdes are shadowed by two pae ptes, so we double the offset twice: once to account for the pte size difference, and once because we need two shadow pdes for a single guest pde. But when writing to the upper guest pde we also need to truncate the lower bits, otherwise the multiply shifts these bits into the pde index and causes an access to the wrong shadow pde. If we're at the end of the page (accessing the very last guest pde) we can even overflow into the next host page and oops. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: always reload segment selectors	Ingo Molnar	2007-03-27
	A failed VM entry on VMX might still change %fs or %gs, so make sure that KVM always reloads the segment selectors. This is crucial on both x86 and x86_64: x86 has __KERNEL_PDA in %fs, on which things like 'current' depend, and x86_64 has 0 there and needs MSR_GS_BASE to work. Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	KVM: Prevent system selectors leaking into guest on real->protected mode transition on vmx	Avi Kivity	2007-03-27
	Intel virtualization extensions do not support virtualizing real mode. So kvm uses virtualized vm86 mode to run real mode code. Unfortunately, this virtualized vm86 mode does not support the so-called "big real" mode, where the segment selector and base do not agree with each other according to the real mode rules (base == selector << 4). To work around this, kvm checks whether a selector/base pair violates the virtualized vm86 rules, and if so, forces it into conformance. On a transition back to protected mode, if we see that the guest did not touch a forced segment, we restore it back to the original protected mode value. This pile of hacks breaks down if the gdt has changed in real mode, as it can cause a segment selector to point to a system descriptor instead of a normal data segment. In fact, this happens with the Windows bootloader and the qemu acpi bios, where a protected mode memcpy routine issues an innocent 'pop %es' and traps on an attempt to load a system descriptor. "Fix" by checking if the to-be-restored selector points at a system segment, and if so, coercing it into a normal data segment. The long term solution, of course, is to abandon vm86 mode and use emulation for big real mode. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: MMU: Fix host memory corruption on i386 with >= 4GB ram	Avi Kivity	2007-03-18
	PAGE_MASK is an unsigned long, so using it to mask physical addresses on i386 (which are 64-bit wide) leads to truncation. This can result in page->private of unrelated memory pages being modified, with disastrous results. Fix by not using PAGE_MASK for physical addresses; instead calculate the correct value directly from PAGE_SIZE. Also fix a similar BUG_ON(). Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: MMU: Fix guest writes to nonpae pde	Avi Kivity	2007-03-18
	KVM shadow page tables are always in pae mode, regardless of the guest setting. This means that a guest pde (mapping 4MB of memory) is mapped to two shadow pdes (mapping 2MB each). When the guest writes to a pte or pde, we intercept the write and emulate it. We also remove any shadowed mappings corresponding to the write. Since the mmu did not account for the doubling in the number of pdes, it removed the wrong entry, resulting in a mismatch between shadow page tables and guest page tables, followed shortly by guest memory corruption. This patch fixes the problem by detecting the special case of writing to a non-pae pde and adjusting the address and number of shadow pdes zapped accordingly. Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Fix guest sysenter on vmx	Avi Kivity	2007-03-18
	The vmx code currently treats the guest's sysenter support msrs as 32-bit values, which breaks 32-bit compat mode userspace on 64-bit guests. Fix by using the native word width of the machine. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Unset kvm_arch_ops if arch module loading failed	Avi Kivity	2007-03-18
	Otherwise, the core module thinks the arch module is loaded, and won't let you reload it after you've fixed the bug. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Move kvmfs magic number to <linux/magic.h>	Andrew Morton	2007-03-04
	Use the standard magic.h for kvmfs. Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Fix bogus failure in kvm.ko module initialization	Avi Kivity	2007-03-04
	A bogus 'return r' can cause an otherwise successful module load to fail. This both denies users the use of kvm, and it also denies them the use of their machine, as it leaves a filesystem registered with its callbacks pointing into now-freed module memory. Fix by returning a zero like a good module. Thanks to Richard Lucassen <mailinglists@lucassen.org> (?) for reporting the problem and for providing access to a machine which exhibited it. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Remove write access permissions when dirty-page-logging is enabled	Uri Lublin	2007-03-04
	Enabling dirty page logging is done using the KVM_SET_MEMORY_REGION ioctl. If the memory region already exists, we need to remove write access, so that writes will be caught and dirty pages will be logged. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	kvm: move do_remove_write_access() up	Uri Lublin	2007-03-04
	To be called from kvm_vm_ioctl_set_memory_region() Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Fix dirty page log bitmap size/access calculation	Uri Lublin	2007-03-04
	Since dirty_bitmap is an unsigned long array, the alignment and size need to take that into account. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Add missing calls to mark_page_dirty()	Uri Lublin	2007-03-04
	A few places where we modify guest memory fail to call mark_page_dirty(), causing live migration to fail. This adds the missing calls. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Per-vcpu inodes	Avi Kivity	2007-03-04
	Allocate a distinct inode for every vcpu in a VM. This has the following benefits: (1) the filp cachelines are no longer bounced when f_count is incremented on every ioctl(); (2) the API and internal code are distinctly clearer; for example, on the KVM_GET_REGS ioctl, there is no need to copy the vcpu number from userspace and then copy the registers back; the vcpu identity is derived from the fd used to make the call. Right now the performance benefits are completely theoretical since (a) we don't support more than one vcpu per VM and (b) virtualization hardware inefficiencies completely overwhelm any cacheline bouncing effects. But both of these will change, and we need to prepare the API today. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Move kvm_vm_ioctl_create_vcpu() around	Avi Kivity	2007-03-04
	In preparation for some hacking. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Rename some kvm_dev_ioctl_() functions to kvm_vm_ioctl_()	Avi Kivity	2007-03-04
	This reflects the changed scope, from device-wide to single vm (previously every device open created a virtual machine). Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Create an inode per virtual machine	Avi Kivity	2007-03-04
	This avoids having filp->f_op and the corresponding inode->i_fop different, which is a little unorthodox. The ioctl list is split into two: global kvm ioctls and per-vm ioctls. A new ioctl, KVM_CREATE_VM, is used to create VMs and return the VM fd. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Add internal filesystem for generating inodes	Avi Kivity	2007-03-04
	The kvmfs inodes will represent virtual machines and vcpus, as necessary, reducing cacheline bouncing due to inodes and filps being shared. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: More 0 -> NULL conversions	Avi Kivity	2007-03-04
	Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: SVM: intercept SMI to handle it at host level	Joerg Roedel	2007-03-04
	This patch changes the SVM code to intercept SMIs and handle them outside the guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: svm: init cr0 with the wp bit set	Avi Kivity	2007-03-04
	Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Wire up hypercall handlers to a central arch-independent location	Avi Kivity	2007-03-04
	Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Add hypercall host support for svm	Avi Kivity	2007-03-04
	Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Add host hypercall support for vmx	Ingo Molnar	2007-03-04
	Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: add MSR based hypercall API	Ingo Molnar	2007-03-04
	This adds a special MSR based hypercall API to KVM. This is to be used by paravirtual kernels and virtual drivers. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Use page_private()/set_page_private() apis	Markus Rechberger	2007-03-04
	Besides using an established api, this allows using kvm in older kernels. Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Use ARRAY_SIZE macro instead of manual calculation.	Ahmed S. Darwish	2007-03-04
	Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: vmx: hack set_cr0_no_modeswitch() to actually do modeswitch	Joerg Roedel	2007-03-04
	The whole thing is rotten, but this allows vmx to boot with the guest reboot fix. Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Cosmetics	Avi Kivity	2007-03-04
	Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Move virtualization deactivation from CPU_DEAD state to CPU_DOWN_PREPARE	Jeremy Katz	2007-03-04
	This gives it more chances of surviving suspend. Signed-off-by: Jeremy Katz <katzj@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: mmu: add missing dirty page tracking cases	Avi Kivity	2007-03-04
	We fail to mark a page dirty in three cases: setting the accessed bit in a pte, setting the dirty bit in a pte, and emulating a write into a pagetable. This fix adds the missing cases. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	[PATCH] i386: Convert i386 PDA code to use %fs	Jeremy Fitzhardinge	2007-02-13
	Convert the PDA code to use %fs rather than %gs as the segment for per-processor data. This is because some processors show a small but measurable performance gain for reloading a NULL segment selector (as %fs generally is in user-space) versus a non-NULL one (as %gs generally is). On modern processors the difference is very small, perhaps undetectable. Some old AMD "K6 3D+" processors are noticeably slower when %fs is used rather than %gs; I have no idea why this might be, but I think they're sufficiently rare that it doesn't matter much. This patch also fixes the math emulator, which had not been adjusted to match the changed struct pt_regs. [frederik.deweerdt@gmail.com: fixit with gdb] [mingo@elte.hu: Fix KVM too] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Ian Campbell <Ian.Campbell@XenSource.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Zachary Amsden <zach@vmware.com> Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org>
*	[PATCH] KVM: Host suspend/resume support	Avi Kivity	2007-02-12
	Add the necessary callbacks to suspend and resume a host running kvm. This is just a repeat of the cpu hotplug/unplug work. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] KVM: cpu hotplug support	Avi Kivity	2007-02-12
	On hotplug, we execute the hardware extension enable sequence. On unplug, we decache any vcpus that last ran on the exiting cpu, and execute the hardware extension disable sequence. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] KVM: VMX: add vcpu_clear()	Avi Kivity	2007-02-12
	Like the inline code it replaces, this function decaches the vmcs from the cpu it last executed on. In addition, vcpu_clear() works if the last cpu is also the cpu we're running on, and it is faster on larger smps by virtue of using smp_call_function_single(). Includes fix from Ingo Molnar. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] KVM: Add a global list of all virtual machines	Avi Kivity	2007-02-12
	This will allow us to iterate over all vcpus and see which cpus they are running on. [akpm@osdl.org: use standard (ugly) initialisers] Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: fix vcpu freeing bug	Ingo Molnar	2007-02-12
	vcpu_load() can return NULL, and it sometimes does in failure paths (for example when the userspace ABI version is too old), causing a preemption count underflow in the ->vcpu_free() later on. So check for NULL. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: VMX: Reload ds and es even in 64-bit mode	Avi Kivity	2007-02-12
	Or 32-bit userspace will get confused. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: Two-way apic tpr synchronization	Dor Laor	2007-02-12
	We report the value of cr8 to userspace on an exit. Also let userspace change cr8 when we re-enter the guest. This lets 64-bit guest code maintain the tpr correctly. Thanks to Yaniv Kamay for the idea. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: SVM: Hack initial cpu csbase to be consistent with intel	Avi Kivity	2007-02-12
	This allows us to run the mmu testsuite on amd. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: Fix mmu going crazy if guest sets cr0.wp == 0	Avi Kivity	2007-02-12
	The kvm mmu relies on cr0.wp being set even if the guest does not set it. The vmx code correctly forces cr0.wp at all times; the svm code does not, so it can't boot solaris without this patch. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: vmx: handle triple faults by returning EXIT_REASON_SHUTDOWN to userspace	Avi Kivity	2007-02-12
	Just like svm. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: Fix gva_to_gpa()	Avi Kivity	2007-02-12
	gva_to_gpa() needs to be updated to the new walk_addr() calling convention, otherwise it may oops under some circumstances. Use the opportunity to remove all the code duplication in gva_to_gpa(), which essentially repeats the calculations in walk_addr(). Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: Fix asm constraint for lldt instruction	S.Caglar Onur	2007-02-12
	lldt does not accept immediate operands, which "g" allows. Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: optimize inline assembly	Ingo Molnar	2007-02-12
	Forms like "0(%rsp)" generate an instruction with an unnecessary one byte displacement under certain circumstances. Replace with the equivalent "(%rsp)". Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] misc NULL noise removal	Al Viro	2007-02-09
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: NULL noise removal	Al Viro	2007-02-09
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: __user annotations	Al Viro	2007-02-09
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] KVM: fix lockup on 32-bit intel hosts with nx disabled in the bios	Avi Kivity	2007-02-01
	Intel hosts without long mode and with nx support disabled in the bios have an efer that is readable but not writable. This causes a lockup on switch to guest mode (even though it should exit with reason 34 according to the documentation). Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] Fix "CONFIG_X86_64_" typo in drivers/kvm/svm.c	Robert P. J. Day	2007-01-30
	Fix what looks like an obvious typo in the file drivers/kvm/svm.c. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>