diff options
Diffstat (limited to 'Documentation/powerpc')
-rw-r--r-- | Documentation/powerpc/kvm_440.txt | 41 |
1 files changed, 41 insertions, 0 deletions
diff --git a/Documentation/powerpc/kvm_440.txt b/Documentation/powerpc/kvm_440.txt new file mode 100644 index 000000000000..c02a003fa03a --- /dev/null +++ b/Documentation/powerpc/kvm_440.txt | |||
@@ -0,0 +1,41 @@ | |||
1 | Hollis Blanchard <hollisb@us.ibm.com> | ||
2 | 15 Apr 2008 | ||
3 | |||
4 | Various notes on the implementation of KVM for PowerPC 440: | ||
5 | |||
6 | To enforce isolation, host userspace, guest kernel, and guest userspace all | ||
7 | run at user privilege level. Only the host kernel runs in supervisor mode. | ||
8 | Executing privileged instructions in the guest traps into KVM (in the host | ||
9 | kernel), where we decode and emulate them. Through this technique, unmodified | ||
10 | 440 Linux kernels can be run (slowly) as guests. Future performance work will | ||
11 | focus on reducing the overhead and frequency of these traps. | ||
12 | |||
13 | The usual code flow is started from userspace invoking an "run" ioctl, which | ||
14 | causes KVM to switch into guest context. We use IVPR to hijack the host | ||
15 | interrupt vectors while running the guest, which allows us to direct all | ||
16 | interrupts to kvmppc_handle_interrupt(). At this point, we could either | ||
17 | - handle the interrupt completely (e.g. emulate "mtspr SPRG0"), or | ||
18 | - let the host interrupt handler run (e.g. when the decrementer fires), or | ||
19 | - return to host userspace (e.g. when the guest performs device MMIO) | ||
20 | |||
21 | Address spaces: We take advantage of the fact that Linux doesn't use the AS=1 | ||
22 | address space (in host or guest), which gives us virtual address space to use | ||
23 | for guest mappings. While the guest is running, the host kernel remains mapped | ||
24 | in AS=0, but the guest can only use AS=1 mappings. | ||
25 | |||
26 | TLB entries: The TLB entries covering the host linear mapping remain | ||
27 | present while running the guest. This reduces the overhead of lightweight | ||
28 | exits, which are handled by KVM running in the host kernel. We keep three | ||
29 | copies of the TLB: | ||
30 | - guest TLB: contents of the TLB as the guest sees it | ||
31 | - shadow TLB: the TLB that is actually in hardware while guest is running | ||
32 | - host TLB: to restore TLB state when context switching guest -> host | ||
33 | When a TLB miss occurs because a mapping was not present in the shadow TLB, | ||
34 | but was present in the guest TLB, KVM handles the fault without invoking the | ||
35 | guest. Large guest pages are backed by multiple 4KB shadow pages through this | ||
36 | mechanism. | ||
37 | |||
38 | IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network | ||
39 | and block IO, so those drivers must be enabled in the guest. It's possible | ||
40 | that some qemu device emulation (e.g. e1000 or rtl8139) may also work with | ||
41 | little effort. | ||