| author | Linus Torvalds <torvalds@linux-foundation.org> | 2008-07-21 17:55:23 -0400 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2008-07-21 17:55:23 -0400 |
| commit | eb4225b2da2b9f3c1ee43efe58ed1415cc1d4c47 (patch) | |
| tree | 573ce3591679ffcdc179801ed86107e48e0e11ca /Documentation | |
| parent | 807677f812639bdeeddf86abc66117e124eaedb2 (diff) | |
| parent | 4cddb886a4d0e5cc7a790151740bfb87b568c97d (diff) | |
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: (25 commits)
mmtimer: Push BKL down into the ioctl handler
[IA64] Remove experimental status of kdump
[IA64] Update ia64 mmr list for SGI uv
[IA64] Avoid overflowing ia64_cpu_to_sapicid in acpi_map_lsapic()
[IA64] adding parameter check to module_free()
[IA64] improper printk format in acpi-cpufreq
[IA64] pv_ops: move some functions in ivt.S to avoid lack of space.
[IA64] pvops: documentation on ia64/pv_ops
[IA64] pvops: add to hooks, pv_time_ops, for steal time accounting.
[IA64] pvops: add hooks, pv_irq_ops, to paravirtualized irq related operations.
[IA64] pvops: add hooks, pv_iosapic_ops, to paravirtualize iosapic.
[IA64] pvops: define initialization hooks, pv_init_ops, for paravirtualized environment.
[IA64] pvops: paravirtualize NR_IRQS
[IA64] pvops: paravirtualize entry.S
[IA64] pvops: paravirtualize ivt.S
[IA64] pvops: paravirtualize minstate.h.
[IA64] pvops: define paravirtualized instructions for native.
[IA64] pvops: preparation for paravirtulization of hand written assembly code.
[IA64] pvops: introduce pv_cpu_ops to paravirtualize privileged instructions.
[IA64] pvops: add an early setup hook for pv_ops.
...
Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/ia64/paravirt_ops.txt | 137 |
1 files changed, 137 insertions, 0 deletions
diff --git a/Documentation/ia64/paravirt_ops.txt b/Documentation/ia64/paravirt_ops.txt
new file mode 100644
index 000000000000..39ded02ec33f
--- /dev/null
+++ b/Documentation/ia64/paravirt_ops.txt
| @@ -0,0 +1,137 @@ | |||
| 1 | Paravirt_ops on IA64 | ||
| 2 | ==================== | ||
| 3 | 21 May 2008, Isaku Yamahata <yamahata@valinux.co.jp> | ||
| 4 | |||
| 5 | |||
| 6 | Introduction | ||
| 7 | ------------ | ||
| 8 | The aim of this documentation is to help with maintainability and/or to | ||
| 9 | encourage people to use paravirt_ops/IA64. | ||
| 10 | |||
| 11 | paravirt_ops (pv_ops for short) is a way of supporting virtualization in | ||
| 12 | the Linux kernel on x86. Several approaches to virtualization support were | ||
| 13 | proposed, and paravirt_ops was the one adopted. | ||
| 14 | Meanwhile, there are now several IA64 virtualization technologies, | ||
| 15 | such as kvm/IA64, xen/IA64 and many academic IA64 hypervisors, | ||
| 16 | so it is worthwhile to add a generic virtualization infrastructure | ||
| 17 | to Linux/IA64. | ||
| 18 | |||
| 19 | |||
| 20 | What is paravirt_ops? | ||
| 21 | --------------------- | ||
| 22 | It was developed on x86 as virtualization support via an API, not an ABI. | ||
| 23 | It allows each hypervisor to override the operations that matter to it | ||
| 24 | at the API level, and it allows a single kernel binary to run in | ||
| 25 | all supported execution environments, including the native machine. | ||
| 26 | Essentially, paravirt_ops is a set of function pointers which represent | ||
| 27 | operations corresponding to low-level sensitive instructions and | ||
| 28 | high-level functionality in various areas. One significant difference | ||
| 29 | from a usual function pointer table is that it allows optimization by | ||
| 30 | binary patching, because some of these operations are very | ||
| 31 | performance sensitive and the indirect call overhead is not negligible. | ||
| 32 | With binary patching, an indirect C function call can be transformed into | ||
| 33 | a direct C function call or in-place execution to eliminate the overhead. | ||
| 34 | |||
| 35 | Thus, operations of paravirt_ops are classified into three categories. | ||
| 36 | - simple indirect call | ||
| 37 | These operations correspond to high-level functionality, so the | ||
| 38 | overhead of an indirect call isn't very important. | ||
| 39 | |||
| 40 | - indirect call which allows optimization by binary patching | ||
| 41 | Usually these operations correspond to low-level instructions. They | ||
| 42 | are called frequently and are performance critical, so the call | ||
| 43 | overhead matters. | ||
| 44 | |||
| 45 | - a set of macros for hand-written assembly code | ||
| 46 | Hand-written assembly code (.S files) also needs paravirtualization | ||
| 47 | because it includes sensitive instructions or some of its code paths | ||
| 48 | are very performance critical. | ||
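
The first category (simple indirect call) can be sketched in ordinary
user-space C. This is only an illustration under stated assumptions: every
name below (demo_pv_ops, demo_read_timer and so on) is hypothetical and is
not taken from the kernel headers, and the returned values merely stand in
for real hardware or hypercall results.

```c
#include <assert.h>

/* Hypothetical ops table; NOT the real structures from asm-ia64/paravirt.h. */
struct demo_pv_ops {
	const char *name;                   /* execution environment name */
	unsigned long (*read_timer)(void);  /* illustrative operation */
};

/* Native implementation: stands in for touching real hardware. */
unsigned long demo_native_read_timer(void)
{
	return 1000;
}

/* Hypervisor implementation: counts how often the "hypercall" is made. */
int demo_hypercalls;

unsigned long demo_hyp_read_timer(void)
{
	demo_hypercalls++;
	return 2000;
}

/* The table starts out with the native operations. */
struct demo_pv_ops demo_pv_ops = {
	.name = "native",
	.read_timer = demo_native_read_timer,
};

/* A hypervisor overrides only the operations it cares about. */
void demo_hypervisor_setup(void)
{
	demo_pv_ops.name = "demo-hypervisor";
	demo_pv_ops.read_timer = demo_hyp_read_timer;
}

/* Callers always go through the table (a simple indirect call). */
unsigned long demo_read_timer(void)
{
	return demo_pv_ops.read_timer();
}

/* Usage: the same caller works before and after the override. */
void demo_usage(void)
{
	assert(demo_read_timer() == 1000);
	demo_hypervisor_setup();
	assert(demo_read_timer() == 2000);
}
```

The caller never changes; only the table contents do. That is what lets a
single binary serve several environments. Binary patching, the second
category, would additionally rewrite the indirect call site itself, which
this sketch does not attempt.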
| 49 | |||
| 50 | |||
| 51 | The relation to the IA64 machine vector | ||
| 52 | --------------------------------------- | ||
| 53 | Linux/IA64 has the IA64 machine vector functionality, which allows the | ||
| 54 | kernel to switch implementations (e.g. initialization, ipi, dma api...) | ||
| 55 | depending on the platform it is running on. | ||
| 56 | Some implementations can be replaced very easily by defining a new machine | ||
| 57 | vector, so another approach to virtualization support would be to | ||
| 58 | enhance the machine vector functionality. | ||
| 59 | However, the paravirt_ops approach was taken because | ||
| 60 | - virtualization support needs wider coverage than the machine vector | ||
| 61 | provides, e.g. low-level instruction paravirtualization. It must be | ||
| 62 | initialized very early, before platform detection. | ||
| 63 | |||
| 64 | - virtualization support needs more functionality, such as binary patching. | ||
| 65 | The calling overhead is probably not very large compared to the | ||
| 66 | emulation overhead of virtualization. However, in the native case the | ||
| 67 | overhead should be eliminated completely. | ||
| 68 | A single kernel binary should run in every environment, including native, | ||
| 69 | and the overhead of paravirt_ops in the native environment should be as | ||
| 70 | small as possible. | ||
| 71 | |||
| 72 | - for full virtualization technology, e.g. KVM/IA64 or a | ||
| 73 | Xen/IA64 HVM domain, the result would be | ||
| 74 | (the emulated platform's machine vector, probably dig) + (pv_ops). | ||
| 75 | This means that the virtualization support layer should sit under | ||
| 76 | the machine vector layer. | ||
| 77 | |||
| 78 | It might be better to move some function pointers from | ||
| 79 | paravirt_ops to the machine vector. In fact, the Xen domU case uses both | ||
| 80 | pv_ops and the machine vector. | ||
| 81 | |||
| 82 | |||
| 83 | IA64 paravirt_ops | ||
| 84 | ----------------- | ||
| 85 | This section discusses the concrete paravirt_ops. | ||
| 86 | Because of the architectural differences between ia64 and x86, the | ||
| 87 | resulting set of functions is very different from the x86 pv_ops. | ||
| 88 | |||
| 89 | - C function pointer tables | ||
| 90 | They are not very performance critical, so a simple C indirect | ||
| 91 | function call is acceptable. The following structures are currently | ||
| 92 | defined. For details see linux/include/asm-ia64/paravirt.h | ||
| 93 | - struct pv_info | ||
| 94 | This structure describes the execution environment. | ||
| 95 | - struct pv_init_ops | ||
| 96 | This structure describes the various initialization hooks. | ||
| 97 | - struct pv_iosapic_ops | ||
| 98 | This structure describes hooks to iosapic operations. | ||
| 99 | - struct pv_irq_ops | ||
| 100 | This structure describes hooks to irq related operations. | ||
| 101 | - struct pv_time_ops | ||
| 102 | This structure describes hooks to steal time accounting. | ||
| 103 | |||
| 104 | - a set of indirect calls which need optimization | ||
| 105 | Currently this class of functions corresponds to a subset of IA64 | ||
| 106 | intrinsics. The optimization by binary patching isn't implemented | ||
| 107 | yet. | ||
| 108 | struct pv_cpu_ops is defined. For details see | ||
| 109 | linux/include/asm-ia64/paravirt_privop.h | ||
| 110 | Mostly they correspond to ia64 intrinsics 1-to-1. | ||
| 111 | Caveat: They are currently defined as C indirect function pointers, but | ||
| 112 | in order to support binary patch optimization, they will be changed | ||
| 113 | to use GCC extended inline assembly code. | ||
| 114 | |||
| 115 | - a set of macros for hand written assembly code (.S files) | ||
| 116 | For maintainability, .S files are kept as a single source that is | ||
| 117 | compiled multiple times with different macro definitions. | ||
| 118 | Each pv_ops instance must define those macros to compile. | ||
| 119 | The important point is that sensitive but non-privileged | ||
| 120 | instructions must be paravirtualized, and that some privileged | ||
| 121 | instructions also need paravirtualization for reasonable performance. | ||
| 122 | Developers who modify .S files must be aware of this. Currently | ||
| 123 | a simple checker is implemented to detect paravirtualization breakage, | ||
| 124 | but it doesn't cover all cases. | ||
| 125 | |||
| 126 | Sometimes this set of macros is called pv_cpu_asm_op, but there is no | ||
| 127 | corresponding structure in the source code. | ||
| 128 | Those macros mostly correspond 1:1 to a subset of privileged | ||
| 129 | instructions. See linux/include/asm-ia64/native/inst.h. | ||
| 130 | Some functions written in assembly also need to be overridden, so | ||
| 131 | each pv_ops instance has to define some macros for them. Again see | ||
| 132 | linux/include/asm-ia64/native/inst.h. | ||
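
The "single source, compiled multiple times with different macro
definitions" approach for .S files can be imitated in plain C with the
preprocessor. The sketch below is an analogy only: the real macros are
assembly macros defined in headers such as native/inst.h, and every name
here (DEMO_READ_COUNTER, demo_native_handler, ...) is hypothetical. The two
counters merely stand in for a privileged register and a copy of it that a
hypervisor might expose in memory.

```c
#include <assert.h>

unsigned long demo_hw_counter = 100;    /* stands in for a hardware register */
unsigned long demo_shared_counter = 7;  /* stands in for a hypervisor-shared copy */

/*
 * The "single source": written once, in terms of DEMO_READ_COUNTER().
 * That macro is expanded when the handler is instantiated, so each
 * instantiation picks up whichever definition is active at that point.
 */
#define DEMO_DEFINE_HANDLER(name)			\
	unsigned long name(void)			\
	{						\
		return DEMO_READ_COUNTER() + 1;		\
	}

/* First "compilation": native macro definitions. */
#define DEMO_READ_COUNTER()	(demo_hw_counter)
DEMO_DEFINE_HANDLER(demo_native_handler)
#undef DEMO_READ_COUNTER

/* Second "compilation": a hypothetical guest's macro definitions. */
#define DEMO_READ_COUNTER()	(demo_shared_counter)
DEMO_DEFINE_HANDLER(demo_guest_handler)
#undef DEMO_READ_COUNTER

/* Usage: same source text, two differently specialized functions. */
void demo_macro_usage(void)
{
	assert(demo_native_handler() == 101);
	assert(demo_guest_handler() == 8);
}
```

Because the handler body exists only once, a fix made there reaches every
pv_ops instance; the cost is that each instance has to supply a complete
set of macro definitions before the source will build.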
| 133 | |||
| 134 | |||
| 135 | These structures must be initialized very early, before start_kernel, | ||
| 136 | probably in head.S using multiple entry points or some other trick. | ||
| 137 | For the native implementation see linux/arch/ia64/kernel/paravirt.c. | ||
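
One way to picture the "multi entry point" idea is sketched below in
user-space C. It is a guess at the shape of such a scheme, not a
description of what head.S does: each entry point selects its environment
description before handing over to common start-up code. All names
(early_pv_info, early_entry_native, ...) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical environment description, loosely modelled on the idea of pv_info. */
struct early_pv_info {
	const char *name;
	int paravirtualized;
};

const struct early_pv_info early_native_info = { "native", 0 };
const struct early_pv_info early_guest_info = { "demo-guest", 1 };

/* NULL until one of the entry points below has run. */
const struct early_pv_info *early_pv_info_ptr;

/* Common start-up code: refuses to run before the table is selected. */
int early_start_kernel(void)
{
	if (early_pv_info_ptr == NULL)
		return -1;
	return early_pv_info_ptr->paravirtualized;
}

/* Entry point used when booting on bare hardware. */
int early_entry_native(void)
{
	early_pv_info_ptr = &early_native_info;
	return early_start_kernel();
}

/* Entry point a hypothetical hypervisor would jump to instead. */
int early_entry_guest(void)
{
	early_pv_info_ptr = &early_guest_info;
	return early_start_kernel();
}

/* Usage: the chosen entry point decides what the common code sees. */
void early_usage(void)
{
	assert(early_entry_native() == 0);
	assert(early_entry_guest() == 1);
}
```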
