author: Linus Torvalds <torvalds@linux-foundation.org> 2013-02-24 16:07:18 -0500
committer: Linus Torvalds <torvalds@linux-foundation.org> 2013-02-24 16:07:18 -0500
commit: 89f883372fa60f604d136924baf3e89ff1870e9e
tree: cb69b0a14957945ba00d3d392bf9ccbbef56f3b8 /Documentation/virtual/kvm
parent: 9e2d59ad580d590134285f361a0e80f0e98c0207
parent: 6b73a96065e89dc9fa75ba4f78b1aa3a3bbd0470
Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Marcelo Tosatti:
 "KVM updates for the 3.9 merge window, including x86 real mode
  emulation fixes, stronger memory slot interface restrictions, mmu_lock
  spinlock hold time reduction, improved handling of large page faults
  on shadow, initial APICv HW acceleration support, s390 channel IO
  based virtio, amongst others"

* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  Revert "KVM: MMU: lazily drop large spte"
  x86: pvclock kvm: align allocation size to page size
  KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
  x86 emulator: fix parity calculation for AAD instruction
  KVM: PPC: BookE: Handle alignment interrupts
  booke: Added DBCR4 SPR number
  KVM: PPC: booke: Allow multiple exception types
  KVM: PPC: booke: use vcpu reference from thread_struct
  KVM: Remove user_alloc from struct kvm_memory_slot
  KVM: VMX: disable apicv by default
  KVM: s390: Fix handling of iscs.
  KVM: MMU: cleanup __direct_map
  KVM: MMU: remove pt_access in mmu_set_spte
  KVM: MMU: cleanup mapping-level
  KVM: MMU: lazily drop large spte
  KVM: VMX: cleanup vmx_set_cr0().
  KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
  KVM: VMX: disable SMEP feature when guest is in non-paging mode
  KVM: Remove duplicate text in api.txt
  Revert "KVM: MMU: split kvm_mmu_free_page"
  ...
Diffstat (limited to 'Documentation/virtual/kvm')
-rw-r--r--  Documentation/virtual/kvm/api.txt  108
-rw-r--r--  Documentation/virtual/kvm/mmu.txt    7
2 files changed, 85 insertions, 30 deletions
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index e0fa0ea2b187..119358dfb742 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -219,19 +219,6 @@ allocation of vcpu ids. For example, if userspace wants
 single-threaded guest vcpus, it should make all vcpu ids be a multiple
 of the number of vcpus per vcore.
 
-On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
-threads in one or more virtual CPU cores. (This is because the
-hardware requires all the hardware threads in a CPU core to be in the
-same partition.) The KVM_CAP_PPC_SMT capability indicates the number
-of vcpus per virtual core (vcore). The vcore id is obtained by
-dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
-given vcore will always be in the same physical core as each other
-(though that might be a different physical core from time to time).
-Userspace can control the threading (SMT) mode of the guest by its
-allocation of vcpu ids. For example, if userspace wants
-single-threaded guest vcpus, it should make all vcpu ids be a multiple
-of the number of vcpus per vcore.
-
 For virtual cpus that have been created with S390 user controlled virtual
 machines, the resulting vcpu fd can be memory mapped at page offset
 KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
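
Reviewer note on the hunk above: the paragraph it de-duplicates describes simple integer arithmetic on vcpu ids. A minimal sketch of that arithmetic follows. It is an illustration only, not kernel or QEMU code: the function names and the threads_per_vcore parameter are hypothetical, with threads_per_vcore standing in for the value reported by KVM_CAP_PPC_SMT.

```c
#include <assert.h>

/* Hypothetical helper: the vcore id is the vcpu id divided by the number
 * of vcpus per vcore (threads_per_vcore stands in for KVM_CAP_PPC_SMT). */
unsigned int vcore_id(unsigned int vcpu_id, unsigned int threads_per_vcore)
{
	assert(threads_per_vcore > 0);
	return vcpu_id / threads_per_vcore;
}

/* Hypothetical helper: vcpu id to use for the n-th single-threaded guest
 * vcpu, i.e. a multiple of the number of vcpus per vcore, so that every
 * vcpu lands in its own vcore. */
unsigned int single_threaded_vcpu_id(unsigned int n,
				     unsigned int threads_per_vcore)
{
	assert(threads_per_vcore > 0);
	return n * threads_per_vcore;
}
```

With four vcpus per vcore, vcpu ids 0, 4, 8, ... each map to a distinct vcore, which is the single-threaded layout the text describes.
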
@@ -345,7 +332,7 @@ struct kvm_sregs {
 	__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
 };
 
-/* ppc -- see arch/powerpc/include/asm/kvm.h */
+/* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */
 
 interrupt_bitmap is a bitmap of pending external interrupts. At most
 one bit may be set. This interrupt has been acknowledged by the APIC
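
Reviewer note on the context lines above: the array size is a round-up division by 64, and "at most one bit may be set" is easy to check in userspace. The sketch below assumes 256 interrupt vectors (the x86 value of KVM_NR_INTERRUPTS, restated locally as an assumption); pending_vector and its return codes are hypothetical and not part of the KVM API.

```c
#include <assert.h>
#include <stdint.h>

#define NR_INTERRUPTS 256	/* assumption: mirrors KVM_NR_INTERRUPTS on x86 */
#define BITMAP_WORDS  ((NR_INTERRUPTS + 63) / 64)	/* round up to whole __u64s */

/* Hypothetical helper: returns the pending vector, -1 if no bit is set,
 * or -2 if more than one bit is set (which the documentation forbids). */
int pending_vector(const uint64_t bitmap[BITMAP_WORDS])
{
	int found = -1;

	assert(bitmap);
	for (int word = 0; word < BITMAP_WORDS; word++) {
		uint64_t w = bitmap[word];
		int bit = 0;

		if (w == 0)
			continue;
		/* A second non-zero word, or two bits in this word. */
		if (found != -1 || (w & (w - 1)) != 0)
			return -2;
		while (!(w & 1)) {
			w >>= 1;
			bit++;
		}
		found = word * 64 + bit;
	}
	return found;
}
```
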
@@ -892,12 +879,12 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
 be identical. This allows large pages in the guest to be backed by large
 pages in the host.
 
-The flags field supports two flag, KVM_MEM_LOG_DIRTY_PAGES, which instructs
-kvm to keep track of writes to memory within the slot. See KVM_GET_DIRTY_LOG
-ioctl. The KVM_CAP_READONLY_MEM capability indicates the availability of the
-KVM_MEM_READONLY flag. When this flag is set for a memory region, KVM only
-allows read accesses. Writes will be posted to userspace as KVM_EXIT_MMIO
-exits.
+The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
+KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
+writes to memory within the slot. See KVM_GET_DIRTY_LOG ioctl to know how to
+use it. The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
+to make a new slot read-only. In this case, writes to this memory will be
+posted to userspace as KVM_EXIT_MMIO exits.
 
 When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
 the memory region are automatically reflected into the guest. For example, an
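
Reviewer note on the hunk above: both the flag rule in the rewritten paragraph and the "lower 21 bits identical" recommendation in the context lines can be checked before a slot is registered. The following is a minimal sketch under stated assumptions: the SLOT_FLAG_* macros are local stand-ins for the uapi bits KVM_MEM_LOG_DIRTY_PAGES and KVM_MEM_READONLY, and the two helper functions and the have_readonly_cap parameter are hypothetical names, not an existing API.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the uapi flag bits (assumed values). */
#define SLOT_FLAG_LOG_DIRTY_PAGES (1u << 0)	/* KVM_MEM_LOG_DIRTY_PAGES */
#define SLOT_FLAG_READONLY        (1u << 1)	/* KVM_MEM_READONLY */

/* Mask covering the lower 21 bits (a 2 MiB large page). */
#define LARGE_PAGE_MASK ((UINT64_C(1) << 21) - 1)

/* Hypothetical check: the read-only flag is acceptable only when the
 * caller has seen KVM_CAP_READONLY_MEM; unknown bits are rejected. */
int slot_flags_valid(uint32_t flags, int have_readonly_cap)
{
	uint32_t allowed = SLOT_FLAG_LOG_DIRTY_PAGES;

	if (have_readonly_cap)
		allowed |= SLOT_FLAG_READONLY;
	return (flags & ~allowed) == 0;
}

/* Hypothetical check: non-zero when the lower 21 bits of both addresses
 * match, so guest large pages can be backed by host large pages. */
int large_page_friendly(uint64_t guest_phys_addr, uint64_t userspace_addr)
{
	return ((guest_phys_addr ^ userspace_addr) & LARGE_PAGE_MASK) == 0;
}
```
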
@@ -931,7 +918,7 @@ documentation when it pops into existence).
 4.37 KVM_ENABLE_CAP
 
 Capability: KVM_CAP_ENABLE_CAP
-Architectures: ppc
+Architectures: ppc, s390
 Type: vcpu ioctl
 Parameters: struct kvm_enable_cap (in)
 Returns: 0 on success; -1 on error
@@ -1792,6 +1779,7 @@ registers, find a list below:
   PPC   | KVM_REG_PPC_VPA_SLB   | 128
   PPC   | KVM_REG_PPC_VPA_DTL   | 128
   PPC   | KVM_REG_PPC_EPCR      | 32
+  PPC   | KVM_REG_PPC_EPR       | 32
 
 ARM registers are mapped using the lower 32 bits. The upper 16 of that
 is the register group type, or coprocessor number:
@@ -2108,6 +2096,14 @@ KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
 KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
 KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
 KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
+KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
+    I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
+    I/O interruption parameters in parm (subchannel) and parm64 (intparm,
+    interruption subclass)
+KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
+                           machine check interrupt code in parm64 (note that
+                           machine checks needing further payload are not
+                           supported by this ioctl)
 
 Note that the vcpu ioctl is asynchronous to vcpu execution.
 
@@ -2359,8 +2355,8 @@ executed a memory-mapped I/O instruction which could not be satisfied
 by kvm. The 'data' member contains the written data if 'is_write' is
 true, and should be filled by application code otherwise.
 
-NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_DCR
-      and KVM_EXIT_PAPR the corresponding
+NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_DCR,
+      KVM_EXIT_PAPR and KVM_EXIT_EPR the corresponding
 operations are complete (and guest state is consistent) only after userspace
 has re-entered the kernel with KVM_RUN. The kernel side will first finish
 incomplete operations and then check for pending signals. Userspace
@@ -2463,6 +2459,41 @@ The possible hypercalls are defined in the Power Architecture Platform
 Requirements (PAPR) document available from www.power.org (free
 developer registration required to access it).
 
+		/* KVM_EXIT_S390_TSCH */
+		struct {
+			__u16 subchannel_id;
+			__u16 subchannel_nr;
+			__u32 io_int_parm;
+			__u32 io_int_word;
+			__u32 ipb;
+			__u8 dequeued;
+		} s390_tsch;
+
+s390 specific. This exit occurs when KVM_CAP_S390_CSS_SUPPORT has been enabled
+and TEST SUBCHANNEL was intercepted. If dequeued is set, a pending I/O
+interrupt for the target subchannel has been dequeued and subchannel_id,
+subchannel_nr, io_int_parm and io_int_word contain the parameters for that
+interrupt. ipb is needed for instruction parameter decoding.
+
+		/* KVM_EXIT_EPR */
+		struct {
+			__u32 epr;
+		} epr;
+
+On FSL BookE PowerPC chips, the interrupt controller has a fast patch
+interrupt acknowledge path to the core. When the core successfully
+delivers an interrupt, it automatically populates the EPR register with
+the interrupt vector number and acknowledges the interrupt inside
+the interrupt controller.
+
+In case the interrupt controller lives in user space, we need to do
+the interrupt acknowledge cycle through it to fetch the next to be
+delivered interrupt vector using this exit.
+
+It gets triggered whenever both KVM_CAP_PPC_EPR are enabled and an
+external interrupt has just been delivered into the guest. User space
+should put the acknowledged interrupt vector into the 'epr' field.
+
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -2584,3 +2615,34 @@ For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
    where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value.
  - The tsize field of mas1 shall be set to 4K on TLB0, even though the
    hardware ignores this value for TLB0.
+
+6.4 KVM_CAP_S390_CSS_SUPPORT
+
+Architectures: s390
+Parameters: none
+Returns: 0 on success; -1 on error
+
+This capability enables support for handling of channel I/O instructions.
+
+TEST PENDING INTERRUPTION and the interrupt portion of TEST SUBCHANNEL are
+handled in-kernel, while the other I/O instructions are passed to userspace.
+
+When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
+SUBCHANNEL intercepts.
+
+6.5 KVM_CAP_PPC_EPR
+
+Architectures: ppc
+Parameters: args[0] defines whether the proxy facility is active
+Returns: 0 on success; -1 on error
+
+This capability enables or disables the delivery of interrupts through the
+external proxy facility.
+
+When enabled (args[0] != 0), every time the guest gets an external interrupt
+delivered, it automatically exits into user space with a KVM_EXIT_EPR exit
+to receive the topmost interrupt vector.
+
+When disabled (args[0] == 0), behavior is as if this facility is unsupported.
+
+When this capability is enabled, KVM_EXIT_EPR can occur.
diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index fa5f1dbc6b23..43fcb761ed16 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -187,13 +187,6 @@ Shadow pages contain the following information:
     perform a reverse map from a pte to a gfn. When role.direct is set, any
     element of this array can be calculated from the gfn field when used, in
     this case, the array of gfns is not allocated. See role.direct and gfn.
-  slot_bitmap:
-    A bitmap containing one bit per memory slot. If the page contains a pte
-    mapping a page from memory slot n, then bit n of slot_bitmap will be set
-    (if a page is aliased among several slots, then it is not guaranteed that
-    all slots will be marked).
-    Used during dirty logging to avoid scanning a shadow page if none if its
-    pages need tracking.
   root_count:
     A counter keeping track of how many hardware registers (guest cr3 or
     pdptrs) are now pointing at the page. While this counter is nonzero, the