aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* KVM: SVM: internal function name cleanupRusty Russell2007-10-13
| | | | | | | | | | Changes some svm.c internal function names: 1) io_adress -> io_address (de-germanify the spelling) 2) kvm_reput_irq -> reput_irq (it's not a generic kvm function) 3) kvm_do_inject_irq -> (it's not a generic kvm function) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: de-containizationRusty Russell2007-10-13
| | | | | | | | | container_of is wonderful, but not casting at all is better. This patch changes svm.c's internal functions to pass "struct vcpu_svm" instead of "struct kvm_vcpu" and using container_of. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove three magic numbersRusty Russell2007-10-13
| | | | | | | | There are several places where hardcoded numbers are used in place of the easily-available constant, which is poor form. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: pass vcpu_vmx internallyRusty Russell2007-10-13
| | | | | | | | | container_of is wonderful, but not casting at all is better. This patch changes vmx.c's internal functions to pass "struct vcpu_vmx" instead of "struct kvm_vcpu" and using container_of. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: fx_init() needs preemption disabled while it plays with the FPU stateRusty Russell2007-10-13
| | | | | | | | Now that kvm generally runs with preemption enabled, we need to protect the fpu intialization sequence. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Convert vm lock to a mutexShaohua Li2007-10-13
| | | | | | | | This allows the kvm mmu to perform sleepy operations, such as memory allocation. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use the scheduler preemption notifiers to make kvm preemptibleAvi Kivity2007-10-13
| | | | | | | | | | | | | | Current kvm disables preemption while the new virtualization registers are in use. This of course is not very good for latency sensitive workloads (one use of virtualization is to offload user interface and other latency insensitive stuff to a container, so that it is easier to analyze the remaining workload). This patch re-enables preemption for kvm; preemption is now only disabled when switching the registers in and out, and during the switch to guest mode and back. Contains fixes from Shaohua Li <shaohua.li@intel.com>. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: add hypercall nr to kvm_runJeff Dike2007-10-13
| | | | | | | | Add the hypercall number to kvm_run and initialize it. This changes the ABI, but as this particular ABI was unusable before this no users are affected. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Improve the method of writing vmcs controlYang, Sheng2007-10-13
| | | | | | | | | | | | Put cpu feature detecting part in hardware_setup, and stored the vmcs condition in global variable for further check. [glommer: fix for some i386-only machines not supporting CR8 load/store exiting] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Dynamically allocate vcpusRusty Russell2007-10-13
| | | | | | | | | | This patch converts the vcpus array in "struct kvm" to a pointer array, and changes the "vcpu_create" and "vcpu_setup" hooks into one "vcpu_create" call which does the allocation and initialization of the vcpu (calling back into the kvm_vcpu_init core helper). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove arch specific components from the general codeGregory Haskins2007-10-13
| | | | | | | | struct kvm_vcpu has vmx-specific members; remove them to a private structure. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: load_pdptrs() cleanupsRusty Russell2007-10-13
| | | | | | | | | | | | | | | | | | load_pdptrs can be handed an invalid cr3, and it should not oops. This can happen because we injected #gp in set_cr3() after we set vcpu->cr3 to the invalid value, or from kvm_vcpu_ioctl_set_sregs(), or memory configuration changes after the guest did set_cr3(). We should also copy the pdpte array once, before checking and assigning, otherwise an SMP guest can potentially alter the values between the check and the set. Finally one nitpick: ret = 1 should be done as late as possible: this allows GCC to check for unset "ret" should the function change in future. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Remove dead code in the cmpxchg instruction emulationAurelien Jarno2007-10-13
| | | | | | | | The writeback fixes (02c03a326a5df825cc01de426f72e160db2b9538) let some dead code in the cmpxchg instruction emulation. Remove it. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Import some constants of vmcs from IA32 SDMYang, Sheng2007-10-13
| | | | | | | | | | | | This patch mainly imports some constants and rename two exist constants of vmcs according to IA32 SDM. It also adds two constants to indicate Lock bit and Enable bit in MSR_IA32_FEATURE_CONTROL, and replace the hardcode _5_ with these two bits. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Move gfn_to_page out of kmap/unmap pairsShaohua Li2007-10-13
| | | | | | | gfn_to_page might sleep with swap support. Move it out of the kmap calls. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Hoist kvm_mmu_reload() out of the critical sectionShaohua Li2007-10-13
| | | | | | | | vmx_cpu_run doesn't handle error correctly and kvm_mmu_reload might sleep with mutex changes, so I move it above. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Return if the pdptrs are invalid when the guest turns on PAE.Rusty Russell2007-10-13
| | | | | | | Don't fall through and turn on PAE in this case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: fix faulty check for two-byte opcodeAvi Kivity2007-10-13
| | | | | | | | | Right now, the bug is harmless as we never emulate one-byte 0xb6 or 0xb7. But things may change. Noted by the mysterious Gabriel C. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: fix cmov for writeback changesAvi Kivity2007-10-13
| | | | | | | The writeback fixes (02c03a326a5df825cc01de426f72e160db2b9538) broke cmov emulation. Fix. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use standard CR8 flags, and fix TPR definitionRusty Russell2007-10-13
| | | | | | | | | Intel manual (and KVM definition) say the TPR is 4 bits wide. Also fix CR8_RESEVED_BITS typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Set exit_reason to KVM_EXIT_MMIO where run->mmio is initialized.Jeff Dike2007-10-13
| | | | | Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Trivial: Use standard BITMAP macros, open-code userspace-exposed headerRusty Russell2007-10-13
| | | | | | | | | Creating one's own BITMAP macro seems suboptimal: if we use manual arithmetic in the one place exposed to userspace, we can use standard macros elsewhere. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use standard CR4 flags, tighten checkingRusty Russell2007-10-13
| | | | | | | | | | | On this machine (Intel), writing to the CR4 bits 0x00000800 and 0x00001000 cause a GPF. The Intel manual is a little unclear, but AFIACT they're reserved, too. Also fix spelling of CR4_RESEVED_BITS. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use standard CR3 flags, tighten checkingRusty Russell2007-10-13
| | | | | | | | | | | The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR3_RESEVED_BITS, fix spelling and tighten it for the non-PAE case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Trivial: Use standard CR0 flags macros from asm/cpu-features.hRusty Russell2007-10-13
| | | | | | | | | | The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR0_RESEVED_BITS (no code change) and fix typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Trivial: Avoid hardware_disable predeclarationRusty Russell2007-10-13
| | | | | | | Don't pre-declare hardware_disable: shuffle the reboot hook down. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Trivial: Comment spelling may escape grepRusty Russell2007-10-13
| | | | | | | Speling error in comment. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Trivial: Make decode_register() staticRusty Russell2007-10-13
| | | | | | | | | I have shied away from touching x86_emulate.c (it could definitely use some love, but it is forked from the Xen code, and it would be more productive to cross-merge fixes). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Trivial: Remove unused struct cpu_user_regs declarationRusty Russell2007-10-13
| | | | | Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Trivial: /dev/kvm interface is no longer experimental.Rusty Russell2007-10-13
| | | | | | | KVM interface is no longer experimental. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: In-kernel string pio write supportEddie Dong2007-10-13
| | | | | | | Add string pio write support to support some version of Windows. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Future-proof the exit information union ABIAvi Kivity2007-10-13
| | | | | | | Note that as the size of struct kvm_run is not part of the ABI, we can add things at the end. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SMP: Add vcpu_id field in struct vcpuQing He2007-10-13
| | | | | | | | This patch adds a `vcpu_id' field in `struct vcpu', so we can differentiate BSP and APs without pointer comparison or arithmetic. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Fix *nopage() in kvm_main.cNguyen Anh Quynh2007-10-13
| | | | | | | | *nopage() in kvm_main.c should only store the type of mmap() fault if the pointers are not NULL. This patch fixes the problem. Signed-off-by: Nguyen Anh Quynh <aquynh@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* i386: Expose IOAPIC register definitions even if CONFIG_X86_IO_APIC is not setAvi Kivity2007-10-13
| | | | | | | | | KVM reuses the IOAPIC register definitions, and needs them even if the host is not compiled with IOAPIC support. Move the #ifdef below so that only the IOAPIC variables and functions are protected, and the register definitions are available to all. Signed-off-by: Avi Kivity <avi@qumranet.com>
* x86/pci/acpi: fix DMI const-ification falloutJeff Garzik2007-10-12
| | | | | | | | Fix DMI const-ification fallout that appeared when merging subsystem trees. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: optimise barriersNick Piggin2007-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to latest memory ordering specification documents from Intel and AMD, both manufacturers are committed to in-order loads from cacheable memory for the x86 architecture. Hence, smp_rmb() may be a simple barrier. Also according to those documents, and according to existing practice in Linux (eg. spin_unlock doesn't enforce ordering), stores to cacheable memory are visible in program order too. Special string stores are safe -- their constituent stores may be out of order, but they must complete in order WRT surrounding stores. Nontemporal stores to WB memory can go out of order, and so they should be fenced explicitly to make them appear in-order WRT other stores. Hence, smp_wmb() may be a simple barrier. http://developer.intel.com/products/processor/manuals/318147.pdf http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf In userspace microbenchmarks on a core2 system, fence instructions range anywhere from around 15 cycles to 50, which may not be totally insignificant in performance critical paths (code size will go down too). However the primary motivation for this is to have the canonical barrier implementation for x86 architecture. smp_rmb on buggy pentium pros remains a locked op, which is apparently required. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: fix IO write barrierNick Piggin2007-10-12
| | | | | | | | wmb() on x86 must always include a barrier, because stores can go out of order in many cases when dealing with devices (eg. WC memory). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: fence oostores on 64-bitNick Piggin2007-10-12
| | | | | | | | | | | | | movnt* instructions are not strongly ordered with respect to other stores, so if we are to assume stores are strongly ordered in the rest of the 64 bit code, we must fence these off (see similar examples in 32 bit code). [ The AMD memory ordering document seems to say that nontemporal stores can also pass earlier regular stores, so maybe we need sfences _before_ movnt* everywhere too? ] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Only enable BLOCK_COMPAT if COMPAT is neededLinus Torvalds2007-10-12
| | | | | | IOW, it needs to depend on both CONFIG_BLOCK and CONFIG_COMPAT. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'upstream' of ↵Linus Torvalds2007-10-12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (119 commits) [libata] struct pci_dev related cleanups libata: use ata_exec_internal() for PMP register access libata: implement ATA_PFLAG_RESETTING libata: add @timeout to ata_exec_internal[_sg]() ahci: fix notification handling ahci: clean up PORT_IRQ_BAD_PMP enabling ahci: kill leftover from enabling NCQ over PMP libata: wrap schedule_timeout_uninterruptible() in loop libata: skip suppress reporting if ATA_EHI_QUIET libata: clear ehi description after initial host report pata_jmicron: match vendor and class code only libata: add ST9160821AS / 3.ALD to NCQ blacklist pata_acpi: ACPI driver support libata-core: Expose gtm methods for driver use libata: add HDT722516DLA380 to NCQ blacklist libata: blacklist NCQ on Seagate Barracuda ST380817AS [libata] Turn on ACPI by default libata_scsi: Fix ATAPI transfer lengths libata: correct handling of SRST reset sequences libata: Integrate ACPI-based PATA/SATA hotplug - version 5 ...
| * [libata] struct pci_dev related cleanupsJeff Garzik2007-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * remove pointless pci_dev_to_dev() wrapper. Just directly reference the embedded struct device like everyone else does. * pata_cs5520: delete cs5520_remove_one(), it was a duplicate of ata_pci_remove_one() * linux/libata.h: don't bother including linux/pci.h, we don't need it. Simply declare 'struct pci_dev' and assume interested parties will include the header, as they should be doing anyway. * linux/libata.h: consolidate all CONFIG_PCI declarations into a single location in the header. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * libata: use ata_exec_internal() for PMP register accessTejun Heo2007-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PMP registers used to be accessed with dedicated accessors ->pmp_read and ->pmp_write. During reset, those callbacks are called with the port frozen so they should be able to run without depending on interrupt delivery. To achieve this, they were implemented polling. However, as resetting the host port makes the PMP to isolate fan-out ports until SError.X is cleared, resetting fan-out ports while port is frozen doesn't buy much additional safety. This patch updates libata PMP support such that PMP registers are accessed using regular ata_exec_internal() mechanism and kills ->pmp_read/write() callbacks. The following changes are made. * PMP access helpers - sata_pmp_read_init_tf(), sata_pmp_read_val(), sata_pmp_write_init_tf() are folded into sata_pmp_read/write() which are now standalone PMP register access functions. * sata_pmp_read/write() returns err_mask instead of rc. This is consistent with other functions which issue internal commands and allows more detailed error reporting. * ahci interrupt handler is modified to ignore BAD_PMP and spurious/illegal completion IRQs while reset is in progress. These conditions are expected during reset. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * libata: implement ATA_PFLAG_RESETTINGTejun Heo2007-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | Implement ATA_PFLAG_RESETTING. This flag is set while reset is in progress. It's set before prereset is called and cleared after reset fails or postreset is finished. This flag itself doesn't have any function. It will be used by LLDs to tell whether reset is in progress if it needs to behave differently during reset. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * libata: add @timeout to ata_exec_internal[_sg]()Tejun Heo2007-10-12
| | | | | | | | | | | | | | | | Add @timeout argument to ata_exec_internal[_sg](). If 0, default timeout ata_probe_timeout is used. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * ahci: fix notification handlingTejun Heo2007-10-12
| | | | | | | | | | | | | | | | | | | | | | | | Asynchronous notification on ICH9 didn't work because it didn't write AN FIS into the RX area - it only updates SNotification. Also, snooping SDB_FIS RX area is racy against further SDB FIS receptions. Let sata_async_notification() determine using SNTF if it's available and snoop RX area iff SNTF isn't available Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * ahci: clean up PORT_IRQ_BAD_PMP enablingTejun Heo2007-10-12
| | | | | | | | | | | | | | | | Now that we have pp->intr_mask, move PORT_IRQ_BAD_PMP enabling to ahci_pmp_attach/detach() where it belongs. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * ahci: kill leftover from enabling NCQ over PMPTejun Heo2007-10-12
| | | | | | | | | | | | | | | | | | ahci had problems with NCQ over PMP and NCQ used to be disabled while PMP was attached. After fixing the problem, the temporary NCQ disabling code wasn't removed completely. Kill the remaining piece. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * libata: wrap schedule_timeout_uninterruptible() in loopTejun Heo2007-10-12
| | | | | | | | | | | | | | | | | | | | Tasks in uninterruptible sleep might be woken up by unrelated events and should check whether the condition it was waiting for has actually triggered. Wrap schedule_timeout_uninterruptible() in loop to achieve it. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * libata: skip suppress reporting if ATA_EHI_QUIETTejun Heo2007-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ATA_EHI_NO_AUTOPSY and ATA_EHI_QUIET are used during initial probing to skip exception analysis and reporting. Usually, there's nothing to report but on some allowed but rare corner cases (e.g. phy status changed interrupt when IRQ is enabled on frozen port - this happens if IRQ pending status isn't cleared in the IRQ router or controller) exception messages get printed. Skip reporting if ATA_EHI_QUIET is set. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>