aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/kvm
Commit message (Collapse)AuthorAge
...
* KVM: MMU: Simplify kvm_mmu_free_page() a tiny bitAvi Kivity2007-07-16
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Implement IA32_EBL_CR_POWERON msrMatthew Gregan2007-07-16
| | | | | | | | | | | | | | Attempting to boot the default 'bsd' kernel of OpenBSD 4.1 i386 in a guest fails early in the kernel init inside p3_get_bus_clock while trying to read the IA32_EBL_CR_POWERON MSR. KVM logs an 'unhandled MSR' message and the guest kernel faults. This patch is sufficient to allow OpenBSD to boot, after which it seems to run fine. I'm not sure if this is the correct solution for dealing with this particular MSR, but it works for me. Signed-off-by: Matthew Gregan <kinetik@flim.org> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Set cr0.mp for guestsAvi Kivity2007-07-16
| | | | | | | This allows fwait instructions to be trapped when the guest fpu is not loaded. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Consolidate guest fpu activation and deactivationAvi Kivity2007-07-16
| | | | | | Easier to keep track of where the fpu is this way. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Rationalize exception bitmap usageAvi Kivity2007-07-16
| | | | | | | | Everyone owns a piece of the exception bitmap, but they happily write to the entire thing like there's no tomorrow. Centralize handling in update_exception_bitmap() and have everyone call that. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move some more msr mangling into vmx_save_host_state()Avi Kivity2007-07-16
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Fix potential guest state leak into hostAvi Kivity2007-07-16
| | | | | | | | | | | The lightweight vmexit path avoids saving and reloading certain host state. However in certain cases lightweight vmexit handling can schedule() which requires reloading the host state. So we store the host state in the vcpu structure, and reloaded it if we relinquish the vcpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Increase mmu shadow cache to 1024 pagesAvi Kivity2007-07-16
| | | | | | | This improves kbuild times by about 10%, bringing it within a respectable 25% of native. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Update shadow pte on write to guest pteAvi Kivity2007-07-16
| | | | | | | | | | | | | | | | | | | | A typical demand page/copy on write pattern is: - page fault on vaddr - kvm propagates fault to guest - guest handles fault, updates pte - kvm traps write, clears shadow pte, resumes guest - guest returns to userspace, re-faults on same vaddr - kvm installs shadow pte, resumes guest - guest continues So, three vmexits for a single guest page fault. But if instead of clearing the page table entry, we update to correspond to the value that the guest has just written, we eliminate the third vmexit. This patch does exactly that, reducing kbuild time by about 10%. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Respect nonpae pagetable quadrant when zapping ptesAvi Kivity2007-07-16
| | | | | | | | | | | | | | | | | | | When a guest writes to a page that has an mmu shadow, we have to clear the shadow pte corresponding to the memory location touched by the guest. Now, in nonpae mode, a single guest page may have two or four shadow pages (because a nonpae page maps 4MB or 4GB, whereas the pae shadow maps 2MB or 1GB), so we when we look up the page we find up to three additional aliases for the page. Since we _clear_ the shadow pte, it doesn't matter except for a slight performance penalty, but if we want to _update_ the shadow pte instead of clearing it, it is vital that we don't modify the aliases. Fortunately, exactly which page is needed (the "quadrant") is easily computed, and is accessible in the shadow page header. All we need is to ignore shadow pages from the wrong quadrants. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Unify kvm_mmu_pre_write() and kvm_mmu_post_write()Avi Kivity2007-07-16
| | | | | | | Instead of calling two functions and repeating expensive checks, call one function and provide it with before/after information. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Be more careful restoring fs on lightweight vmexitAvi Kivity2007-07-16
| | | | | | | | i386 wants fs for accessing the pda even on a lightweight exit, so ensure we can always restore it. This fixes a regression on i386 introduced by the lightweight vmexit patch. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Reduce misfirings of the fork detectorAvi Kivity2007-07-16
| | | | | | | | | | | | | | The kvm mmu tries to detects forks by looking for repeated writes to a page table. If it sees a fork, it unshadows the page table so the page table copying can proceed at native speed instead of being emulated. However, the detector also triggered on simple demand paging access patterns: a linear walk of memory would of course cause repeated writes to the same pagetable page, causing it to unshadow prematurely. Fix by resetting the fork detector if we detect a demand fault. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Unindent some codeAvi Kivity2007-07-16
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Avoid saving and restoring some host CPU state on lightweight vmexitAvi Kivity2007-07-16
| | | | | | | | | | | | Many msrs and the like will only be used by the host if we schedule() or return to userspace. Therefore, we avoid saving them if we handle the exit within the kernel, and if a reschedule is not requested. Based on a patch from Eddie Dong <eddie.dong@intel.com> with a couple of fixes by me. Signed-off-by: Yaozu(Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Assume that writes smaller than 4 bytes are to non-pagetable pagesAvi Kivity2007-07-16
| | | | | | | | This allows us to remove write protection earlier than otherwise. Should some mad OS choose to use byte writes to update pagetables, it will suffer a performance hit, but still work correctly. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: Allow direct guest access to PC debug portAnthony Liguori2007-07-16
| | | | | | | The PC debug port is used for IO delay and does not require emulation. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITsHe, Qing2007-07-16
| | | | | | | | | | | | | This patch enables IO bitmaps control on vmx and unmask the 0x80 port to avoid VMEXITs caused by accessing port 0x80. 0x80 is used as delays (see include/asm/io.h), and handling VMEXITs on its access is unnecessary but slows things down. This patch improves kernel build test at around 3%~5%. Because every VM uses the same io bitmap, it is shared between all VMs rather than a per-VM data structure. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Prevent guest fpu state from leaking into the hostAvi Kivity2007-06-15
| | | | | | | | | The lazy fpu changes did not take into account that some vmexit handlers can sleep. Move loading the guest state into the inner loop so that it can be reloaded if necessary, and move loading the host state into vmx_vcpu_put() so it can be performed whenever we relinquish the vcpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
* kvm: fix section mismatch warning in kvm-intel.oSam Ravnborg2007-06-01
| | | | | | | | | | | | | | | | | | Fix following section mismatch warning in kvm-intel.o: WARNING: o-i386/drivers/kvm/kvm-intel.o(.init.text+0xbd): Section mismatch: reference to .exit.text: (between 'hardware_setup' and 'vmx_disabled_by_bios') The function free_kvm_area is used in the function alloc_kvm_area which is marked __init. The __exit area is discarded by some archs during link-time if a module is built-in resulting in an oops. Note: This warning is only seen by my local copy of modpost but the change will soon hit upstream. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Detach sched.h from mm.hAlexey Dobriyan2007-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [S390] Kconfig: refine depends statements.Martin Schwidefsky2007-05-10
| | | | | | | Refine some depends statements to limit their visibility to the environments that are actually supported. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* Add suspend-related notifications for CPU hotplugRafael J. Wysocki2007-05-09
| | | | | | | | | | | | | | | | | | | | | Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* KVM: Remove unused 'instruction_length'Avi Kivity2007-05-03
| | | | | | | As we no longer emulate in userspace, this is meaningless. We don't compute it on SVM anyway. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Don't require explicit indication of completion of mmio or pioAvi Kivity2007-05-03
| | | | | | | | It is illegal not to return from a pio or mmio request without completing it, as mmio or pio is an atomic operation. Therefore, we can simplify the userspace interface by avoiding the completion indication. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove extraneous guest entry on mmio readAvi Kivity2007-05-03
| | | | | | | | | | | | | When emulating an mmio read, we actually emulate twice: once to determine the physical address of the mmio, and, after we've exited to userspace to get the mmio value, we emulate again to place the value in the result register and update any flags. But we don't really need to enter the guest again for that, only to take an immediate vmexit. So, if we detect that we're doing an mmio read, emulate a single instruction before entering the guest again. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: Only save/restore MSRs when neededAnthony Liguori2007-05-03
| | | | | | | | | | | | We only have to save/restore MSR_GS_BASE on every VMEXIT. The rest can be saved/restored when we leave the VCPU. Since we don't emulate the DEBUGCTL MSRs and the guest cannot write to them, we don't have to worry about saving/restoring them at all. This shaves a whopping 40% off raw vmexit costs on AMD. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: fix an if() conditionAdrian Bunk2007-05-03
| | | | | | | | It might have worked in this case since PT_PRESENT_MASK is 1, but let's express this correctly. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Add lazy FPU support for VTAnthony Liguori2007-05-03
| | | | | | | | Only save/restore the FPU host state when the guest is actually using the FPU. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Properly shadow the CR0 register in the vcpu structAnthony Liguori2007-05-03
| | | | | | | | Set all of the host mask bits for CR0 so that we can maintain a proper shadow of CR0. This exposes CR0.TS, paving the way for lazy fpu handling. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Don't complain about cpu erratum AA15Avi Kivity2007-05-03
| | | | | | It slows down Windows x64 horribly. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Lazy FPU support for SVMAnthony Liguori2007-05-03
| | | | | | | | Avoid saving and restoring the guest fpu state on every exit. This shaves ~100 cycles off the guest/host switch. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Allow passing 64-bit values to the emulated read/write APIAvi Kivity2007-05-03
| | | | | | | This simplifies the API somewhat (by eliminating the special-case cmpxchg8b on i386). Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Per-vcpu statisticsAvi Kivity2007-05-03
| | | | | | | | Make the exit statistics per-vcpu instead of global. This gives a 3.5% boost when running one virtual machine per core on my two socket dual core (4 cores total) machine. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Avoid unnecessary vcpu_load()/vcpu_put() cyclesYaozu Dong2007-05-03
| | | | | | | | By checking if a reschedule is needed, we avoid dropping the vcpu. [With changes by me, based on Anthony Liguori's observations] Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Avoid heavy ASSERT at non debug mode.Yaozu Dong2007-05-03
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Only save/restore MSR_K6_STAR if necessaryAvi Kivity2007-05-03
| | | | | | | | | | Intel hosts only support syscall/sysret in long more (and only if efer.sce is enabled), so only reload the related MSR_K6_STAR if the guest will actually be able to use it. This reduces vmexit cost by about 500 cycles (6400 -> 5870) on my setup. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Fold drivers/kvm/kvm_vmx.h into drivers/kvm/vmx.cAvi Kivity2007-05-03
| | | | | | No meat in that file. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Don't switch 64-bit msrs for 32-bit guestsAvi Kivity2007-05-03
| | | | | | | | | Some msrs are only used by x86_64 instructions, and are therefore not needed when the guest is legacy mode. By not bothering to switch them, we reduce vmexit latency by 2400 cycles (from about 8800) when running a 32-bt guest on a 64-bit host. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Reduce unnecessary saving of host msrsAvi Kivity2007-05-03
| | | | | | | | | | | THe automatically switched msrs are never changed on the host (with the exception of MSR_KERNEL_GS_BASE) and thus there is no need to save them on every vm entry. This reduces vmexit latency by ~400 cycles on i386 and by ~900 cycles (10%) on x86_64. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Handle guest page faults when emulating mmioAvi Kivity2007-05-03
| | | | | | | | | | | | | | | | | | Usually, guest page faults are detected by the kvm page fault handler, which detects if they are shadow faults, mmio faults, pagetable faults, or normal guest page faults. However, in ceratin circumstances, we can detect a page fault much later. One of these events is the following combination: - A two memory operand instruction (e.g. movsb) is executed. - The first operand is in mmio space (which is the fault reported to kvm) - The second operand is in an ummaped address (e.g. a guest page fault) The Windows 2000 installer does such an access, an promptly hangs. Fix by adding the missing page fault injection on that path. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: Report hardware exit reason to userspace instead of dmesgAvi Kivity2007-05-03
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Retry sleeping allocation if atomic allocation failsAvi Kivity2007-05-03
| | | | | | This avoids -ENOMEM under memory pressure. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use slab caches to allocate mmu data structuresAvi Kivity2007-05-03
| | | | | | | Better leak detection, statistics, memory use, speed -- goodness all around. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Handle partial pae pdptrAvi Kivity2007-05-03
| | | | | | | | | | | Some guests (Solaris) do not set up all four pdptrs, but leave some invalid. kvm incorrectly treated these as valid page directories, pinning the wrong pages and causing general confusion. Fix by checking the valid bit of a pae pdpte. This closes sourceforge bug 1698922. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Initialize cr0 to indicate an fpu is presentAvi Kivity2007-05-03
| | | | | | | Solaris panics if it sees a cpu with no fpu, and it seems to rely on this bit. Closes sourceforge bug 1698920. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Fix overflow bug in overflow detection codeEric Sesterhenn / Snakebyte2007-05-03
| | | | | | | | | | | | | The expression sp - 6 < sp where sp is a u16 is undefined in C since 'sp - 6' is promoted to int, and signed overflow is undefined in C. gcc 4.2 actually warns about it. Replace with a simpler test. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use kernel-standard typesAvi Kivity2007-05-03
| | | | | | Noted by Joerg Roedel. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: enable LBRV virtualization if availableJoerg Roedel2007-05-03
| | | | | | | | | This patch enables the virtualization of the last branch record MSRs on SVM if this feature is available in hardware. It also introduces a small and simple check feature for specific SVM extensions. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add fpu get/set operationsAvi Kivity2007-05-03
| | | | | | | These are really helpful when migrating an floating point app to another machine. Signed-off-by: Avi Kivity <avi@qumranet.com>