author | Mauro Carvalho Chehab <mchehab@s-opensource.com> | 2017-05-17 08:38:00 -0400 |
---|---|---|
committer | Jonathan Corbet <corbet@lwn.net> | 2017-07-14 15:58:10 -0400 |
commit | c6f4d41338a78bcc3ddcc4e00f5de63c8ee2ad20 | |
tree | 6f0e730ae79628ee10ff7dd363b317eb1d87e5e8 | |
parent | 2a26ed8e4afff2bb48c044dc3ad69da19d66debf |
vfio.txt: standardize document format
Each text file under Documentation follows a different
format. Some don't even have titles!
Change its representation to follow the adopted standard,
using ReST markups for it to be parseable by Sphinx:
- adjust title marks;
- use footnote marks;
- mark literal blocks;
- adjust indentation.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
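
To illustrate the "use footnote marks" item above: the diff turns inline marks such as `safe[2]` into `safe [2]_` and footnote bodies such as `[1] VFIO was ...` into `.. [1] VFIO was ...`. The Python sketch below mimics that rewrite for a single line of prose. It is not part of the commit, and nothing here suggests the change was scripted; the name `convert_line` and both regular expressions are assumptions made for this example.

```python
import re

# Inline reference: "[N]" directly after a word character or closing quote,
# optionally separated by one space, and not already followed by "_".
_REF = re.compile(r'(?<=[\w"])\s?\[(\d+)\](?!_)')

# Footnote definition: a line that starts with "[N]" followed by whitespace.
_DEF = re.compile(r'^\[(\d+)\]\s+')


def convert_line(line: str) -> str:
    """Convert plain-text footnote marks on one line to ReST footnote syntax."""
    if _DEF.match(line):
        # "[1] text" becomes the ReST footnote target ".. [1] text".
        return _DEF.sub(r'.. [\1] ', line, count=1)
    # "word[2]" or "word [2]" becomes the ReST reference "word [2]_".
    return _REF.sub(r' [\1]_', line)
```

The pattern cannot tell a footnote mark from a C array subscript such as `buf[0]`, so a sketch like this would only be safe on prose lines, not on the literal blocks the same commit marks with `::`.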
-rw-r--r-- | Documentation/vfio.txt | 281 |
1 files changed, 144 insertions, 137 deletions
diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
index 1dd3fddfd3a1..ef6a5111eaa1 100644
--- a/Documentation/vfio.txt
+++ b/Documentation/vfio.txt
@@ -1,5 +1,7 @@ | |||
1 | VFIO - "Virtual Function I/O"[1] | 1 | ================================== |
2 | ------------------------------------------------------------------------------- | 2 | VFIO - "Virtual Function I/O" [1]_ |
3 | ================================== | ||
4 | |||
3 | Many modern system now provide DMA and interrupt remapping facilities | 5 | Many modern system now provide DMA and interrupt remapping facilities |
4 | to help ensure I/O devices behave within the boundaries they've been | 6 | to help ensure I/O devices behave within the boundaries they've been |
5 | allotted. This includes x86 hardware with AMD-Vi and Intel VT-d, | 7 | allotted. This includes x86 hardware with AMD-Vi and Intel VT-d, |
@@ -7,14 +9,14 @@ POWER systems with Partitionable Endpoints (PEs) and embedded PowerPC | |||
7 | systems such as Freescale PAMU. The VFIO driver is an IOMMU/device | 9 | systems such as Freescale PAMU. The VFIO driver is an IOMMU/device |
8 | agnostic framework for exposing direct device access to userspace, in | 10 | agnostic framework for exposing direct device access to userspace, in |
9 | a secure, IOMMU protected environment. In other words, this allows | 11 | a secure, IOMMU protected environment. In other words, this allows |
10 | safe[2], non-privileged, userspace drivers. | 12 | safe [2]_, non-privileged, userspace drivers. |
11 | 13 | ||
12 | Why do we want that? Virtual machines often make use of direct device | 14 | Why do we want that? Virtual machines often make use of direct device |
13 | access ("device assignment") when configured for the highest possible | 15 | access ("device assignment") when configured for the highest possible |
14 | I/O performance. From a device and host perspective, this simply | 16 | I/O performance. From a device and host perspective, this simply |
15 | turns the VM into a userspace driver, with the benefits of | 17 | turns the VM into a userspace driver, with the benefits of |
16 | significantly reduced latency, higher bandwidth, and direct use of | 18 | significantly reduced latency, higher bandwidth, and direct use of |
17 | bare-metal device drivers[3]. | 19 | bare-metal device drivers [3]_. |
18 | 20 | ||
19 | Some applications, particularly in the high performance computing | 21 | Some applications, particularly in the high performance computing |
20 | field, also benefit from low-overhead, direct device access from | 22 | field, also benefit from low-overhead, direct device access from |
@@ -31,7 +33,7 @@ KVM PCI specific device assignment code as well as provide a more | |||
31 | secure, more featureful userspace driver environment than UIO. | 33 | secure, more featureful userspace driver environment than UIO. |
32 | 34 | ||
33 | Groups, Devices, and IOMMUs | 35 | Groups, Devices, and IOMMUs |
34 | ------------------------------------------------------------------------------- | 36 | --------------------------- |
35 | 37 | ||
36 | Devices are the main target of any I/O driver. Devices typically | 38 | Devices are the main target of any I/O driver. Devices typically |
37 | create a programming interface made up of I/O access, interrupts, | 39 | create a programming interface made up of I/O access, interrupts, |
@@ -114,40 +116,40 @@ well as mechanisms for describing and registering interrupt | |||
114 | notifications. | 116 | notifications. |
115 | 117 | ||
116 | VFIO Usage Example | 118 | VFIO Usage Example |
117 | ------------------------------------------------------------------------------- | 119 | ------------------ |
118 | 120 | ||
119 | Assume user wants to access PCI device 0000:06:0d.0 | 121 | Assume user wants to access PCI device 0000:06:0d.0:: |
120 | 122 | ||
121 | $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group | 123 | $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group |
122 | ../../../../kernel/iommu_groups/26 | 124 | ../../../../kernel/iommu_groups/26 |
123 | 125 | ||
124 | This device is therefore in IOMMU group 26. This device is on the | 126 | This device is therefore in IOMMU group 26. This device is on the |
125 | pci bus, therefore the user will make use of vfio-pci to manage the | 127 | pci bus, therefore the user will make use of vfio-pci to manage the |
126 | group: | 128 | group:: |
127 | 129 | ||
128 | # modprobe vfio-pci | 130 | # modprobe vfio-pci |
129 | 131 | ||
130 | Binding this device to the vfio-pci driver creates the VFIO group | 132 | Binding this device to the vfio-pci driver creates the VFIO group |
131 | character devices for this group: | 133 | character devices for this group:: |
132 | 134 | ||
133 | $ lspci -n -s 0000:06:0d.0 | 135 | $ lspci -n -s 0000:06:0d.0 |
134 | 06:0d.0 0401: 1102:0002 (rev 08) | 136 | 06:0d.0 0401: 1102:0002 (rev 08) |
135 | # echo 0000:06:0d.0 > /sys/bus/pci/devices/0000:06:0d.0/driver/unbind | 137 | # echo 0000:06:0d.0 > /sys/bus/pci/devices/0000:06:0d.0/driver/unbind |
136 | # echo 1102 0002 > /sys/bus/pci/drivers/vfio-pci/new_id | 138 | # echo 1102 0002 > /sys/bus/pci/drivers/vfio-pci/new_id |
137 | 139 | ||
138 | Now we need to look at what other devices are in the group to free | 140 | Now we need to look at what other devices are in the group to free |
139 | it for use by VFIO: | 141 | it for use by VFIO:: |
140 | 142 | ||
141 | $ ls -l /sys/bus/pci/devices/0000:06:0d.0/iommu_group/devices | 143 | $ ls -l /sys/bus/pci/devices/0000:06:0d.0/iommu_group/devices |
142 | total 0 | 144 | total 0 |
143 | lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:00:1e.0 -> | 145 | lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:00:1e.0 -> |
144 | ../../../../devices/pci0000:00/0000:00:1e.0 | 146 | ../../../../devices/pci0000:00/0000:00:1e.0 |
145 | lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:06:0d.0 -> | 147 | lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:06:0d.0 -> |
146 | ../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.0 | 148 | ../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.0 |
147 | lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:06:0d.1 -> | 149 | lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:06:0d.1 -> |
148 | ../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.1 | 150 | ../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.1 |
149 | 151 | ||
150 | This device is behind a PCIe-to-PCI bridge[4], therefore we also | 152 | This device is behind a PCIe-to-PCI bridge [4]_, therefore we also |
151 | need to add device 0000:06:0d.1 to the group following the same | 153 | need to add device 0000:06:0d.1 to the group following the same |
152 | procedure as above. Device 0000:00:1e.0 is a bridge that does | 154 | procedure as above. Device 0000:00:1e.0 is a bridge that does |
153 | not currently have a host driver, therefore it's not required to | 155 | not currently have a host driver, therefore it's not required to |
@@ -157,12 +159,12 @@ support PCI bridges). | |||
157 | The final step is to provide the user with access to the group if | 159 | The final step is to provide the user with access to the group if |
158 | unprivileged operation is desired (note that /dev/vfio/vfio provides | 160 | unprivileged operation is desired (note that /dev/vfio/vfio provides |
159 | no capabilities on its own and is therefore expected to be set to | 161 | no capabilities on its own and is therefore expected to be set to |
160 | mode 0666 by the system). | 162 | mode 0666 by the system):: |
161 | 163 | ||
162 | # chown user:user /dev/vfio/26 | 164 | # chown user:user /dev/vfio/26 |
163 | 165 | ||
164 | The user now has full access to all the devices and the iommu for this | 166 | The user now has full access to all the devices and the iommu for this |
165 | group and can access them as follows: | 167 | group and can access them as follows:: |
166 | 168 | ||
167 | int container, group, device, i; | 169 | int container, group, device, i; |
168 | struct vfio_group_status group_status = | 170 | struct vfio_group_status group_status = |
@@ -248,31 +250,31 @@ VFIO bus driver API | |||
248 | VFIO bus drivers, such as vfio-pci make use of only a few interfaces | 250 | VFIO bus drivers, such as vfio-pci make use of only a few interfaces |
249 | into VFIO core. When devices are bound and unbound to the driver, | 251 | into VFIO core. When devices are bound and unbound to the driver, |
250 | the driver should call vfio_add_group_dev() and vfio_del_group_dev() | 252 | the driver should call vfio_add_group_dev() and vfio_del_group_dev() |
251 | respectively: | 253 | respectively:: |
252 | 254 | ||
253 | extern int vfio_add_group_dev(struct iommu_group *iommu_group, | 255 | extern int vfio_add_group_dev(struct iommu_group *iommu_group, |
254 | struct device *dev, | 256 | struct device *dev, |
255 | const struct vfio_device_ops *ops, | 257 | const struct vfio_device_ops *ops, |
256 | void *device_data); | 258 | void *device_data); |
257 | 259 | ||
258 | extern void *vfio_del_group_dev(struct device *dev); | 260 | extern void *vfio_del_group_dev(struct device *dev); |
259 | 261 | ||
260 | vfio_add_group_dev() indicates to the core to begin tracking the | 262 | vfio_add_group_dev() indicates to the core to begin tracking the |
261 | specified iommu_group and register the specified dev as owned by | 263 | specified iommu_group and register the specified dev as owned by |
262 | a VFIO bus driver. The driver provides an ops structure for callbacks | 264 | a VFIO bus driver. The driver provides an ops structure for callbacks |
263 | similar to a file operations structure: | 265 | similar to a file operations structure:: |
264 | 266 | ||
265 | struct vfio_device_ops { | 267 | struct vfio_device_ops { |
266 | int (*open)(void *device_data); | 268 | int (*open)(void *device_data); |
267 | void (*release)(void *device_data); | 269 | void (*release)(void *device_data); |
268 | ssize_t (*read)(void *device_data, char __user *buf, | 270 | ssize_t (*read)(void *device_data, char __user *buf, |
269 | size_t count, loff_t *ppos); | 271 | size_t count, loff_t *ppos); |
270 | ssize_t (*write)(void *device_data, const char __user *buf, | 272 | ssize_t (*write)(void *device_data, const char __user *buf, |
271 | size_t size, loff_t *ppos); | 273 | size_t size, loff_t *ppos); |
272 | long (*ioctl)(void *device_data, unsigned int cmd, | 274 | long (*ioctl)(void *device_data, unsigned int cmd, |
273 | unsigned long arg); | 275 | unsigned long arg); |
274 | int (*mmap)(void *device_data, struct vm_area_struct *vma); | 276 | int (*mmap)(void *device_data, struct vm_area_struct *vma); |
275 | }; | 277 | }; |
276 | 278 | ||
277 | Each function is passed the device_data that was originally registered | 279 | Each function is passed the device_data that was originally registered |
278 | in the vfio_add_group_dev() call above. This allows the bus driver | 280 | in the vfio_add_group_dev() call above. This allows the bus driver |
@@ -285,50 +287,55 @@ own VFIO_DEVICE_GET_REGION_INFO ioctl. | |||
285 | 287 | ||
286 | 288 | ||
287 | PPC64 sPAPR implementation note | 289 | PPC64 sPAPR implementation note |
288 | ------------------------------------------------------------------------------- | 290 | ------------------------------- |
289 | 291 | ||
290 | This implementation has some specifics: | 292 | This implementation has some specifics: |
291 | 293 | ||
292 | 1) On older systems (POWER7 with P5IOC2/IODA1) only one IOMMU group per | 294 | 1) On older systems (POWER7 with P5IOC2/IODA1) only one IOMMU group per |
293 | container is supported as an IOMMU table is allocated at the boot time, | 295 | container is supported as an IOMMU table is allocated at the boot time, |
294 | one table per a IOMMU group which is a Partitionable Endpoint (PE) | 296 | one table per a IOMMU group which is a Partitionable Endpoint (PE) |
295 | (PE is often a PCI domain but not always). | 297 | (PE is often a PCI domain but not always). |
296 | Newer systems (POWER8 with IODA2) have improved hardware design which allows | 298 | |
297 | to remove this limitation and have multiple IOMMU groups per a VFIO container. | 299 | Newer systems (POWER8 with IODA2) have improved hardware design which allows |
300 | to remove this limitation and have multiple IOMMU groups per a VFIO | ||
301 | container. | ||
298 | 302 | ||
299 | 2) The hardware supports so called DMA windows - the PCI address range | 303 | 2) The hardware supports so called DMA windows - the PCI address range |
300 | within which DMA transfer is allowed, any attempt to access address space | 304 | within which DMA transfer is allowed, any attempt to access address space |
301 | out of the window leads to the whole PE isolation. | 305 | out of the window leads to the whole PE isolation. |
302 | 306 | ||
303 | 3) PPC64 guests are paravirtualized but not fully emulated. There is an API | 307 | 3) PPC64 guests are paravirtualized but not fully emulated. There is an API |
304 | to map/unmap pages for DMA, and it normally maps 1..32 pages per call and | 308 | to map/unmap pages for DMA, and it normally maps 1..32 pages per call and |
305 | currently there is no way to reduce the number of calls. In order to make things | 309 | currently there is no way to reduce the number of calls. In order to make |
306 | faster, the map/unmap handling has been implemented in real mode which provides | 310 | things faster, the map/unmap handling has been implemented in real mode |
307 | an excellent performance which has limitations such as inability to do | 311 | which provides an excellent performance which has limitations such as |
308 | locked pages accounting in real time. | 312 | inability to do locked pages accounting in real time. |
309 | 313 | ||
310 | 4) According to sPAPR specification, A Partitionable Endpoint (PE) is an I/O | 314 | 4) According to sPAPR specification, A Partitionable Endpoint (PE) is an I/O |
311 | subtree that can be treated as a unit for the purposes of partitioning and | 315 | subtree that can be treated as a unit for the purposes of partitioning and |
312 | error recovery. A PE may be a single or multi-function IOA (IO Adapter), a | 316 | error recovery. A PE may be a single or multi-function IOA (IO Adapter), a |
313 | function of a multi-function IOA, or multiple IOAs (possibly including switch | 317 | function of a multi-function IOA, or multiple IOAs (possibly including |
314 | and bridge structures above the multiple IOAs). PPC64 guests detect PCI errors | 318 | switch and bridge structures above the multiple IOAs). PPC64 guests detect |
315 | and recover from them via EEH RTAS services, which works on the basis of | 319 | PCI errors and recover from them via EEH RTAS services, which works on the |
316 | additional ioctl commands. | 320 | basis of additional ioctl commands. |
317 | 321 | ||
318 | So 4 additional ioctls have been added: | 322 | So 4 additional ioctls have been added: |
319 | 323 | ||
320 | VFIO_IOMMU_SPAPR_TCE_GET_INFO - returns the size and the start | 324 | VFIO_IOMMU_SPAPR_TCE_GET_INFO |
321 | of the DMA window on the PCI bus. | 325 | returns the size and the start of the DMA window on the PCI bus. |
322 | 326 | ||
323 | VFIO_IOMMU_ENABLE - enables the container. The locked pages accounting | 327 | VFIO_IOMMU_ENABLE |
328 | enables the container. The locked pages accounting | ||
324 | is done at this point. This lets user first to know what | 329 | is done at this point. This lets user first to know what |
325 | the DMA window is and adjust rlimit before doing any real job. | 330 | the DMA window is and adjust rlimit before doing any real job. |
326 | 331 | ||
327 | VFIO_IOMMU_DISABLE - disables the container. | 332 | VFIO_IOMMU_DISABLE |
333 | disables the container. | ||
328 | 334 | ||
329 | VFIO_EEH_PE_OP - provides an API for EEH setup, error detection and recovery. | 335 | VFIO_EEH_PE_OP |
336 | provides an API for EEH setup, error detection and recovery. | ||
330 | 337 | ||
331 | The code flow from the example above should be slightly changed: | 338 | The code flow from the example above should be slightly changed:: |
332 | 339 | ||
333 | struct vfio_eeh_pe_op pe_op = { .argsz = sizeof(pe_op), .flags = 0 }; | 340 | struct vfio_eeh_pe_op pe_op = { .argsz = sizeof(pe_op), .flags = 0 }; |
334 | 341 | ||
@@ -442,73 +449,73 @@ The code flow from the example above should be slightly changed: | |||
442 | .... | 449 | .... |
443 | 450 | ||
444 | 5) There is v2 of SPAPR TCE IOMMU. It deprecates VFIO_IOMMU_ENABLE/ | 451 | 5) There is v2 of SPAPR TCE IOMMU. It deprecates VFIO_IOMMU_ENABLE/ |
445 | VFIO_IOMMU_DISABLE and implements 2 new ioctls: | 452 | VFIO_IOMMU_DISABLE and implements 2 new ioctls: |
446 | VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY | 453 | VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY |
447 | (which are unsupported in v1 IOMMU). | 454 | (which are unsupported in v1 IOMMU). |
448 | 455 | ||
449 | PPC64 paravirtualized guests generate a lot of map/unmap requests, | 456 | PPC64 paravirtualized guests generate a lot of map/unmap requests, |
450 | and the handling of those includes pinning/unpinning pages and updating | 457 | and the handling of those includes pinning/unpinning pages and updating |
451 | mm::locked_vm counter to make sure we do not exceed the rlimit. | 458 | mm::locked_vm counter to make sure we do not exceed the rlimit. |
452 | The v2 IOMMU splits accounting and pinning into separate operations: | 459 | The v2 IOMMU splits accounting and pinning into separate operations: |
453 | 460 | ||
454 | - VFIO_IOMMU_SPAPR_REGISTER_MEMORY/VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY ioctls | 461 | - VFIO_IOMMU_SPAPR_REGISTER_MEMORY/VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY ioctls |
455 | receive a user space address and size of the block to be pinned. | 462 | receive a user space address and size of the block to be pinned. |
456 | Bisecting is not supported and VFIO_IOMMU_UNREGISTER_MEMORY is expected to | 463 | Bisecting is not supported and VFIO_IOMMU_UNREGISTER_MEMORY is expected to |
457 | be called with the exact address and size used for registering | 464 | be called with the exact address and size used for registering |
458 | the memory block. The userspace is not expected to call these often. | 465 | the memory block. The userspace is not expected to call these often. |
459 | The ranges are stored in a linked list in a VFIO container. | 466 | The ranges are stored in a linked list in a VFIO container. |
460 | 467 | ||
461 | - VFIO_IOMMU_MAP_DMA/VFIO_IOMMU_UNMAP_DMA ioctls only update the actual | 468 | - VFIO_IOMMU_MAP_DMA/VFIO_IOMMU_UNMAP_DMA ioctls only update the actual |
462 | IOMMU table and do not do pinning; instead these check that the userspace | 469 | IOMMU table and do not do pinning; instead these check that the userspace |
463 | address is from pre-registered range. | 470 | address is from pre-registered range. |
464 | 471 | ||
465 | This separation helps in optimizing DMA for guests. | 472 | This separation helps in optimizing DMA for guests. |
466 | 473 | ||
467 | 6) sPAPR specification allows guests to have an additional DMA window(s) on | 474 | 6) sPAPR specification allows guests to have an additional DMA window(s) on |
468 | a PCI bus with a variable page size. Two ioctls have been added to support | 475 | a PCI bus with a variable page size. Two ioctls have been added to support |
469 | this: VFIO_IOMMU_SPAPR_TCE_CREATE and VFIO_IOMMU_SPAPR_TCE_REMOVE. | 476 | this: VFIO_IOMMU_SPAPR_TCE_CREATE and VFIO_IOMMU_SPAPR_TCE_REMOVE. |
470 | The platform has to support the functionality or error will be returned to | 477 | The platform has to support the functionality or error will be returned to |
471 | the userspace. The existing hardware supports up to 2 DMA windows, one is | 478 | the userspace. The existing hardware supports up to 2 DMA windows, one is |
472 | 2GB long, uses 4K pages and called "default 32bit window"; the other can | 479 | 2GB long, uses 4K pages and called "default 32bit window"; the other can |
473 | be as big as entire RAM, use different page size, it is optional - guests | 480 | be as big as entire RAM, use different page size, it is optional - guests |
474 | create those in run-time if the guest driver supports 64bit DMA. | 481 | create those in run-time if the guest driver supports 64bit DMA. |
475 | 482 | ||
476 | VFIO_IOMMU_SPAPR_TCE_CREATE receives a page shift, a DMA window size and | 483 | VFIO_IOMMU_SPAPR_TCE_CREATE receives a page shift, a DMA window size and |
477 | a number of TCE table levels (if a TCE table is going to be big enough and | 484 | a number of TCE table levels (if a TCE table is going to be big enough and |
478 | the kernel may not be able to allocate enough of physically contiguous memory). | 485 | the kernel may not be able to allocate enough of physically contiguous |
479 | It creates a new window in the available slot and returns the bus address where | 486 | memory). It creates a new window in the available slot and returns the bus |
480 | the new window starts. Due to hardware limitation, the user space cannot choose | 487 | address where the new window starts. Due to hardware limitation, the user |
481 | the location of DMA windows. | 488 | space cannot choose the location of DMA windows. |
482 | 489 | ||
483 | VFIO_IOMMU_SPAPR_TCE_REMOVE receives the bus start address of the window | 490 | VFIO_IOMMU_SPAPR_TCE_REMOVE receives the bus start address of the window |
484 | and removes it. | 491 | and removes it. |
485 | 492 | ||
486 | ------------------------------------------------------------------------------- | 493 | ------------------------------------------------------------------------------- |
487 | 494 | ||
488 | [1] VFIO was originally an acronym for "Virtual Function I/O" in its | 495 | .. [1] VFIO was originally an acronym for "Virtual Function I/O" in its |
489 | initial implementation by Tom Lyon while as Cisco. We've since | 496 | initial implementation by Tom Lyon while as Cisco. We've since |
490 | outgrown the acronym, but it's catchy. | 497 | outgrown the acronym, but it's catchy. |
491 | 498 | ||
492 | [2] "safe" also depends upon a device being "well behaved". It's | 499 | .. [2] "safe" also depends upon a device being "well behaved". It's |
493 | possible for multi-function devices to have backdoors between | 500 | possible for multi-function devices to have backdoors between |
494 | functions and even for single function devices to have alternative | 501 | functions and even for single function devices to have alternative |
495 | access to things like PCI config space through MMIO registers. To | 502 | access to things like PCI config space through MMIO registers. To |
496 | guard against the former we can include additional precautions in the | 503 | guard against the former we can include additional precautions in the |
497 | IOMMU driver to group multi-function PCI devices together | 504 | IOMMU driver to group multi-function PCI devices together |
498 | (iommu=group_mf). The latter we can't prevent, but the IOMMU should | 505 | (iommu=group_mf). The latter we can't prevent, but the IOMMU should |
499 | still provide isolation. For PCI, SR-IOV Virtual Functions are the | 506 | still provide isolation. For PCI, SR-IOV Virtual Functions are the |
500 | best indicator of "well behaved", as these are designed for | 507 | best indicator of "well behaved", as these are designed for |
501 | virtualization usage models. | 508 | virtualization usage models. |
502 | 509 | ||
503 | [3] As always there are trade-offs to virtual machine device | 510 | .. [3] As always there are trade-offs to virtual machine device |
504 | assignment that are beyond the scope of VFIO. It's expected that | 511 | assignment that are beyond the scope of VFIO. It's expected that |
505 | future IOMMU technologies will reduce some, but maybe not all, of | 512 | future IOMMU technologies will reduce some, but maybe not all, of |
506 | these trade-offs. | 513 | these trade-offs. |
507 | 514 | ||
508 | [4] In this case the device is below a PCI bridge, so transactions | 515 | .. [4] In this case the device is below a PCI bridge, so transactions |
509 | from either function of the device are indistinguishable to the iommu: | 516 | from either function of the device are indistinguishable to the iommu:: |
510 | 517 | ||
511 | -[0000:00]-+-1e.0-[06]--+-0d.0 | 518 | -[0000:00]-+-1e.0-[06]--+-0d.0 |
512 | \-0d.1 | 519 | \-0d.1 |
513 | 520 | ||
514 | 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90) | 521 | 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90) |
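
The "VFIO Usage Example" hunk above finds the IOMMU group with `readlink`, lists the other devices in the group with `ls`, and then uses `/dev/vfio/26`. The Python sketch below wraps the same sysfs lookups in functions. It is an illustration only: the function names and the `sysfs_root` parameter (added so the code can be exercised against a fake directory tree) are assumptions, not part of the documented interface.

```python
import os


def iommu_group_of(bdf: str, sysfs_root: str = "/sys") -> int:
    """Return the IOMMU group number of a PCI device such as '0000:06:0d.0'."""
    link = os.path.join(sysfs_root, "bus/pci/devices", bdf, "iommu_group")
    # The link target ends in the group number,
    # e.g. ../../../../kernel/iommu_groups/26
    return int(os.path.basename(os.readlink(link)))


def group_members(bdf: str, sysfs_root: str = "/sys") -> list:
    """List every device that shares an IOMMU group with the given device."""
    path = os.path.join(sysfs_root, "bus/pci/devices", bdf,
                        "iommu_group", "devices")
    return sorted(os.listdir(path))


def vfio_group_node(group: int) -> str:
    """Path of the VFIO group character device for a group number."""
    return "/dev/vfio/%d" % group
```

On the system shown in the diff, `iommu_group_of("0000:06:0d.0")` would correspond to the `readlink` step and `group_members("0000:06:0d.0")` to the `ls -l` step. Binding the devices to vfio-pci and changing ownership of the group node still require the privileged commands shown in the hunk.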