aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm
Commit message (Collapse)AuthorAge
...
* KVM: Implement dummy values for MSR_PERF_STATUSAlexander Graf2008-04-27
| | | | | | | Darwin relies on this and ceases to work without. Signed-off-by: Alexander Graf <alex@csgraf.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: sparse fixes for kvm/x86.cHarvey Harrison2008-04-27
| | | | | | | | | | | | | | | | | | In two case statements, use the ever popular 'i' instead of index: arch/x86/kvm/x86.c:1063:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here arch/x86/kvm/x86.c:1079:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here Make it static. arch/x86/kvm/x86.c:1945:24: warning: symbol 'emulate_ops' was not declared. Should it be static? Drop the return statements. arch/x86/kvm/x86.c:2878:2: warning: returning void-valued expression arch/x86/kvm/x86.c:2944:2: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: make iopm_base staticHarvey Harrison2008-04-27
| | | | | | | | Fixes sparse warning as well. arch/x86/kvm/svm.c:69:15: warning: symbol 'iopm_base' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: fix sparse warnings in x86_emulate.cHarvey Harrison2008-04-27
| | | | | | | | | | | | | | | Nesting __emulate_2op_nobyte inside__emulate_2op produces many shadowed variable warnings on the internal variable _tmp used by both macros. Change the outer macro to use __tmp. Avoids a sparse warning like the following at every call site of __emulate_2op arch/x86/kvm/x86_emulate.c:1091:3: warning: symbol '_tmp' shadows an earlier one arch/x86/kvm/x86_emulate.c:1091:3: originally declared here [18 more warnings suppressed] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add stat counter for hypercallsAmit Shah2008-04-27
| | | | | Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use x86's segment descriptor struct instead of private definitionAvi Kivity2008-04-27
| | | | | | The x86 desc_struct unification allows us to remove segment_descriptor.h. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add API for determining the number of supported memory slotsAvi Kivity2008-04-27
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add API to retrieve the number of supported vcpus per vmAvi Kivity2008-04-27
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: make register_address_increment and JMP_REL static inlinesHarvey Harrison2008-04-27
| | | | | | | Change jmp_rel() to a function as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: make register_address, address_mask static inlinesHarvey Harrison2008-04-27
| | | | | Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: add ad_mask static inlineHarvey Harrison2008-04-27
| | | | | | | Replaces open-coded mask calculation in macros. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: paravirtualized clocksource: host partGlauber de Oliveira Costa2008-04-27
| | | | | | | | | | | | | | | | | This is the host part of kvm clocksource implementation. As it does not include clockevents, it is a fairly simple implementation. We only have to register a per-vcpu area, and start writing to it periodically. The area is binary compatible with xen, as we use the same shadow_info structure. [marcelo: fix bad_page on MSR_KVM_SYSTEM_TIME] [avi: save full value of the msr, even if enable bit is clear] [avi: clear previous value of time_page] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: enable LBR virtualizationJoerg Roedel2008-04-27
| | | | | | | | | | | | This patch implements the Last Branch Record Virtualization (LBRV) feature of the AMD Barcelona and Phenom processors into the kvm-amd module. It will only be enabled if the guest enables last branch recording in the DEBUG_CTL MSR. So there is no increased world switch overhead when the guest doesn't use these MSRs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: allocate the MSR permission map per VCPUJoerg Roedel2008-04-27
| | | | | | | | | | | This patch changes the kvm-amd module to allocate the SVM MSR permission map per VCPU instead of a global map for all VCPUs. With this we have more flexibility allowing specific guests to access virtualized MSRs. This is required for LBR virtualization. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: let init_vmcb() take struct vcpu_svm as parameterJoerg Roedel2008-04-27
| | | | | | | | | Change the parameter of the init_vmcb() function in the kvm-amd module from struct vmcb to struct vcpu_svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: fix typo in VMX header defineRyan Harper2008-04-27
| | | | | | | | | Looking at Intel Volume 3b, page 148, table 20-11 and noticed that the field name is 'Deliver' not 'Deliever'. Attached patch changes the define name and its user in vmx.c Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: add support for Nested PagingJoerg Roedel2008-04-27
| | | | | | | | This patch contains the SVM architecture dependent changes for KVM to enable support for the Nested Paging feature of AMD Barcelona and Phenom processors. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: add TDP support to the KVM MMUJoerg Roedel2008-04-27
| | | | | | | | This patch contains the changes to the KVM MMU necessary for support of the Nested Paging feature in AMD Barcelona and Phenom Processors. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: export the load_pdptrs() function to modulesJoerg Roedel2008-04-27
| | | | | | | The load_pdptrs() function is required in the SVM module for NPT support. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: make the __nonpaging_map function genericJoerg Roedel2008-04-27
| | | | | | | | | The mapping function for the nonpaging case in the softmmu does basically the same as required for Nested Paging. Make this function generic so it can be used for both. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: export information about NPT to generic x86 codeJoerg Roedel2008-04-27
| | | | | | | | | | The generic x86 code has to know if the specific implementation uses Nested Paging. In the generic code Nested Paging is called Two Dimensional Paging (TDP) to avoid confusion with (future) TDP implementations of other vendors. This patch exports the availability of TDP to the generic x86 code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: add module parameter to disable Nested PagingJoerg Roedel2008-04-27
| | | | | | | | | To disable the use of the Nested Paging feature even if it is available in hardware this patch adds a module parameter. Nested Paging can be disabled by passing npt=0 to the kvm_amd module. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: add detection of Nested Paging featureJoerg Roedel2008-04-27
| | | | | | | | Let SVM detect if the Nested Paging feature is available on the hardware. Disable it to keep this patch series bisectable. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: move feature detection to hardware setup codeJoerg Roedel2008-04-27
| | | | | | | | | By moving the SVM feature detection from the each_cpu code to the hardware setup code it runs only once. As an additional advance the feature check is now available earlier in the module setup process. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: allow access to EFER in 32bit KVMJoerg Roedel2008-04-27
| | | | | | | | This patch makes the EFER register accessible on a 32bit KVM host. This is necessary to boot 32 bit PAE guests under SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: unifdef the EFER specific codeJoerg Roedel2008-04-27
| | | | | | | | | | | To allow access to the EFER register in 32bit KVM the EFER specific code has to be exported to the x86 generic code. This patch does this in a backwards compatible manner. [avi: add check for EFER-less hosts] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: align valid EFER bits with the features of the host systemJoerg Roedel2008-04-27
| | | | | | | | | | This patch aligns the bits the guest can set in the EFER register with the features in the host processor. Currently it lets EFER.NX disabled if the processor does not support it and enables EFER.LME and EFER.LMA only for KVM on 64 bit hosts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: make EFER_RESERVED_BITS configurable for architecture codeJoerg Roedel2008-04-27
| | | | | | | | This patch give the SVM and VMX implementations the ability to add some bits the guest can set in its EFER register. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Enable Virtual Processor Identification (VPID)Sheng Yang2008-04-27
| | | | | | | | | | | | To allow TLB entries to be retained across VM entry and VM exit, the VMM can now identify distinct address spaces through a new virtual-processor ID (VPID) field of the VMCS. [avi: drop vpid_sync_all()] [avi: add "cc" to asm constraints] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Decouple mmio from shadow page tablesAvi Kivity2008-04-27
| | | | | | | | | | | | | Currently an mmio guest pte is encoded in the shadow pagetable as a not-present trapping pte, with the SHADOW_IO_MARK bit set. However nothing is ever done with this information, so maintaining it is a useless complication. This patch moves the check for mmio to before shadow ptes are instantiated, so the shadow code is never invoked for ptes that reference mmio. The code is simpler, and with future work, can be made to handle mmio concurrently. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: group decoding for group 1 instructionsAvi Kivity2008-04-27
| | | | | | Opcodes 0x80-0x83 Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: add group 7 decodingAvi Kivity2008-04-27
| | | | | | This adds group decoding for opcode 0x0f 0x01 (group 7). Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: Group decoding for groups 4 and 5Avi Kivity2008-04-27
| | | | | | Add group decoding support for opcode 0xfe (group 4) and 0xff (group 5). Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: Group decoding for group 3Avi Kivity2008-04-27
| | | | | | This adds group decoding support for opcodes 0xf6, 0xf7 (group 3). Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: group decoding for group 1AAvi Kivity2008-04-27
| | | | | | This adds group decode support for opcode 0x8f. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: add support for group decodingAvi Kivity2008-04-27
| | | | | | | | | Certain x86 instructions use bits 3:5 of the byte following the opcode as an opcode extension, with the decode sometimes depending on bits 6:7 as well. Add support for this in the main decoding table rather than an ad-hock adaptation per opcode. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Simplify hash table indexingDong, Eddie2008-04-27
| | | | | Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Update shadow ptes on partial guest pte writesDong, Eddie2008-04-27
| | | | | | | | | | | | | | | | A guest partial guest pte write will leave shadow_trap_nonpresent_pte in spte, which generates a vmexit at the next guest access through that pte. This patch improves this by reading the full guest pte in advance and thus being able to update the spte and eliminate the vmexit. This helps pae guests which use two 32-bit writes to set a single 64-bit pte. [truncation fix by Eric] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix memory leak on guest demand faultsAvi Kivity2008-03-25
| | | | | | | | | | | While backporting 72dc67a69690288538142df73a7e3ac66fea68dc, a gfn_to_page() call was duplicated instead of moved (due to an unrelated patch not being present in mainline). This caused a page reference leak, resulting in a fairly massive memory leak. Fix by removing the extraneous gfn_to_page() call. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: convert init_rmode_tss() to slots_lockMarcelo Tosatti2008-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | init_rmode_tss was forgotten during the conversion from mmap_sem to slots_lock. INFO: task qemu-system-x86:3748 blocked for more than 120 seconds. Call Trace: [<ffffffff8053d100>] __down_read+0x86/0x9e [<ffffffff8053fb43>] do_page_fault+0x346/0x78e [<ffffffff8053d235>] trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8053dcad>] error_exit+0x0/0xa9 [<ffffffff8035a7a7>] copy_user_generic_string+0x17/0x40 [<ffffffff88099a8a>] :kvm:kvm_write_guest_page+0x3e/0x5f [<ffffffff880b661a>] :kvm_intel:init_rmode_tss+0xa7/0xf9 [<ffffffff880b7d7e>] :kvm_intel:vmx_vcpu_reset+0x10/0x38a [<ffffffff8809b9a5>] :kvm:kvm_arch_vcpu_setup+0x20/0x53 [<ffffffff8809a1e4>] :kvm:kvm_vm_ioctl+0xad/0x1cf [<ffffffff80249dea>] __lock_acquire+0x4f7/0xc28 [<ffffffff8028fad9>] vfs_ioctl+0x21/0x6b [<ffffffff8028fd75>] do_vfs_ioctl+0x252/0x26b [<ffffffff8028fdca>] sys_ioctl+0x3c/0x5e [<ffffffff8020b01b>] system_call_after_swapgs+0x7b/0x80 Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: handle page removal with shadow mappingMarcelo Tosatti2008-03-25
| | | | | | | | | | Do not assume that a shadow mapping will always point to the same host frame number. Fixes crash with madvise(MADV_DONTNEED). [avi: move after first printk(), add another printk()] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix is_rmap_pte() with io ptesAvi Kivity2008-03-25
| | | | | | is_rmap_pte() doesn't take into account io ptes, which have the avail bit set. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Restore tss even on x86_64Avi Kivity2008-03-25
| | | | | | | | | | | | | | | The vmx hardware state restore restores the tss selector and base address, but not its length. Usually, this does not matter since most of the tss contents is within the default length of 0x67. However, if a process is using ioperm() to grant itself I/O port permissions, an additional bitmap within the tss, but outside the default length is consulted. The effect is that the process will receive a SIGSEGV instead of transparently accessing the port. Fix by restoring the tss length. Note that i386 had this working already. Closes bugzilla 10246. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Avoid rearranging switched guest msrs while they are loadedAvi Kivity2008-03-04
| | | | | | | | | | | | | | | | | | | | | | | | KVM tries to run as much as possible with the guest msrs loaded instead of host msrs, since switching msrs is very expensive. It also tries to minimize the number of msrs switched according to the guest mode; for example, MSR_LSTAR is needed only by long mode guests. This optimization is done by setup_msrs(). However, we must not change which msrs are switched while we are running with guest msr state: - switch to guest msr state - call setup_msrs(), removing some msrs from the list - switch to host msr state, leaving a few guest msrs loaded An easy way to trigger this is to kexec an x86_64 linux guest. Early during setup, the guest will switch EFER to not include SCE. KVM will stop saving MSR_LSTAR, and on the next msr switch it will leave the guest LSTAR loaded. The next host syscall will end up in a random location in the kernel. Fix by reloading the host msrs before changing the msr list. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix race when instantiating a shadow pteAvi Kivity2008-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For improved concurrency, the guest walk is performed concurrently with other vcpus. This means that we need to revalidate the guest ptes once we have write-protected the guest page tables, at which point they can no longer be modified. The current code attempts to avoid this check if the shadow page table is not new, on the assumption that if it has existed before, the guest could not have modified the pte without the shadow lock. However the assumption is incorrect, as the racing vcpu could have modified the pte, then instantiated the shadow page, before our vcpu regains control: vcpu0 vcpu1 fault walk pte modify pte fault in same pagetable instantiate shadow page lookup shadow page conclude it is old instantiate spte based on stale guest pte We could do something clever with generation counters, but a test run by Marcelo suggests this is unnecessary and we can just do the revalidation unconditionally. The pte will be in the processor cache and the check can be quite fast. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Avoid infinite-frequency local apic timerAvi Kivity2008-03-04
| | | | | | | If the local apic initial count is zero, don't start a an hrtimer with infinite frequency, locking up the host. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: make MMU_DEBUG compile againMarcelo Tosatti2008-03-04
| | | | | | | the cr3 variable is now inside the vcpu->arch structure. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: move alloc_apic_access_page() outside of non-preemptable regionMarcelo Tosatti2008-03-04
| | | | | | | | alloc_apic_access_page() can sleep, while vmx_vcpu_setup is called inside a non preemptable region. Move it after put_cpu(). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: fix Windows XP 64 bit installation crashJoerg Roedel2008-03-04
| | | | | | | | | | While installing Windows XP 64 bit wants to access the DEBUGCTL and the last branch record (LBR) MSRs. Don't allowing this in KVM causes the installation to crash. This patch allow the access to these MSRs and fixes the issue. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: remove the usage of the mmap_sem for the protection of the memory slots.Izik Eidus2008-03-04
| | | | | | | | | This patch replaces the mmap_sem lock for the memory slots with a new kvm private lock, it is needed beacuse untill now there were cases where kvm accesses user memory while holding the mmap semaphore. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>