-rw-r--r-- | Documentation/memory-barriers.txt | 115 |
1 files changed, 70 insertions, 45 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 1c22b21ae922..5eb6f4c6a133 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -2599,72 +2599,97 @@ likely, then interrupt-disabling locks should be used to guarantee ordering.
2599 | KERNEL I/O BARRIER EFFECTS | 2599 | KERNEL I/O BARRIER EFFECTS |
2600 | ========================== | 2600 | ========================== |
2601 | 2601 | ||
2602 | When accessing I/O memory, drivers should use the appropriate accessor | 2602 | Interfacing with peripherals via I/O accesses is deeply architecture and device |
2603 | functions: | 2603 | specific. Therefore, drivers which are inherently non-portable may rely on |
2604 | specific behaviours of their target systems in order to achieve synchronization | ||
2605 | in the most lightweight manner possible. For drivers intending to be portable | ||
2606 | between multiple architectures and bus implementations, the kernel offers a | ||
2607 | series of accessor functions that provide various degrees of ordering | ||
2608 | guarantees: | ||
2604 | 2609 | ||
2605 | (*) inX(), outX(): | 2610 | (*) readX(), writeX(): |
2606 | 2611 | ||
2607 | These are intended to talk to I/O space rather than memory space, but | 2612 | The readX() and writeX() MMIO accessors take a pointer to the peripheral |
2608 | that's primarily a CPU-specific concept. The i386 and x86_64 processors | 2613 | being accessed as an __iomem * parameter. For pointers mapped with the |
2609 | do indeed have special I/O space access cycles and instructions, but many | 2614 | default I/O attributes (e.g. those returned by ioremap()), then the |
2610 | CPUs don't have such a concept. | 2615 | ordering guarantees are as follows: |
2611 | 2616 | ||
2612 | The PCI bus, amongst others, defines an I/O space concept which - on such | 2617 | 1. All readX() and writeX() accesses to the same peripheral are ordered |
2613 | CPUs as i386 and x86_64 - readily maps to the CPU's concept of I/O | 2618 | with respect to each other. For example, this ensures that MMIO register |
2614 | space. However, it may also be mapped as a virtual I/O space in the CPU's | 2619 | writes by the CPU to a particular device will arrive in program order. |
2615 | memory map, particularly on those CPUs that don't support alternate I/O | ||
2616 | spaces. | ||
2617 | 2620 | ||
2618 | Accesses to this space may be fully synchronous (as on i386), but | 2621 | 2. A writeX() by the CPU to the peripheral will first wait for the |
2619 | intermediary bridges (such as the PCI host bridge) may not fully honour | 2622 | completion of all prior CPU writes to memory. For example, this ensures |
2620 | that. | 2623 | that writes by the CPU to an outbound DMA buffer allocated by |
2624 | dma_alloc_coherent() will be visible to a DMA engine when the CPU writes | ||
2625 | to its MMIO control register to trigger the transfer. | ||
2621 | 2626 | ||
2622 | They are guaranteed to be fully ordered with respect to each other. | 2627 | 3. A readX() by the CPU from the peripheral will complete before any |
2628 | subsequent CPU reads from memory can begin. For example, this ensures | ||
2629 | that reads by the CPU from an incoming DMA buffer allocated by | ||
2630 | dma_alloc_coherent() will not see stale data after reading from the DMA | ||
2631 | engine's MMIO status register to establish that the DMA transfer has | ||
2632 | completed. | ||
2623 | 2633 | ||
2624 | They are not guaranteed to be fully ordered with respect to other types of | 2634 | 4. A readX() by the CPU from the peripheral will complete before any |
2625 | memory and I/O operation. | 2635 | subsequent delay() loop can begin execution. For example, this ensures |
2636 | that two MMIO register writes by the CPU to a peripheral will arrive at | ||
2637 | least 1us apart if the first write is immediately read back with readX() | ||
2638 | and udelay(1) is called prior to the second writeX(). | ||
2626 | 2639 | ||
2627 | (*) readX(), writeX(): | 2640 | __iomem pointers obtained with non-default attributes (e.g. those returned |
2641 | by ioremap_wc()) are unlikely to provide many of these guarantees. | ||
2628 | 2642 | ||
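Guarantee 2 in the new text (a writeX() waits for prior CPU writes to memory) is what makes the usual "fill the buffer, then ring the doorbell" pattern safe. The following is only a user-space sketch of that call sequence: mock_writel(), mock_readl(), struct mock_dev and the register indices are hypothetical stand-ins invented for illustration, not kernel APIs, and the mock supplies no real hardware ordering. In a driver the buffer would come from dma_alloc_coherent() and the register write would be a real writel().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_WORDS    4
#define REG_DOORBELL 0      /* hypothetical register index */
#define REG_STATUS   1      /* hypothetical register index */
#define STATUS_DONE  0x1u

/* Mock peripheral: a tiny register file plus a "DMA engine". */
struct mock_dev {
	uint32_t regs[2];
	const uint32_t *dma_src;        /* buffer the engine reads from */
	uint32_t captured[BUF_WORDS];   /* what the engine observed */
};

static struct mock_dev dev;

/*
 * Stand-in for writel(). A real writel() orders prior CPU writes to
 * memory before the MMIO write; this mock only models the device
 * reacting to the doorbell by snapshotting the buffer.
 */
static void mock_writel(uint32_t val, unsigned int reg)
{
	dev.regs[reg] = val;
	if (reg == REG_DOORBELL && val) {
		memcpy(dev.captured, dev.dma_src, sizeof(dev.captured));
		dev.regs[REG_STATUS] = STATUS_DONE;
	}
}

/* Stand-in for readl(). */
static uint32_t mock_readl(unsigned int reg)
{
	return dev.regs[reg];
}

/* Driver-side sequence: fill the buffer first, then trigger the transfer. */
static int start_transfer(uint32_t *buf, const uint32_t *payload)
{
	assert(buf && payload);

	dev.dma_src = buf;
	memcpy(buf, payload, BUF_WORDS * sizeof(*buf)); /* 1. write memory */
	mock_writel(1, REG_DOORBELL);                   /* 2. ring doorbell */

	/* 3. check completion via the status register */
	return (mock_readl(REG_STATUS) & STATUS_DONE) ? 0 : -1;
}
```

With the relaxed variants described further down, step 2 would no longer be ordered after step 1, so a portable driver would need an explicit barrier between them.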
2629 | Whether these are guaranteed to be fully ordered and uncombined with | 2643 | (*) readX_relaxed(), writeX_relaxed(): |
2630 | respect to each other on the issuing CPU depends on the characteristics | ||
2631 | defined for the memory window through which they're accessing. On later | ||
2632 | i386 architecture machines, for example, this is controlled by way of the | ||
2633 | MTRR registers. | ||
2634 | 2644 | ||
2635 | Ordinarily, these will be guaranteed to be fully ordered and uncombined, | 2645 | These are similar to readX() and writeX(), but provide weaker memory |
2636 | provided they're not accessing a prefetchable device. | 2646 | ordering guarantees. Specifically, they do not guarantee ordering with |
2647 | respect to normal memory accesses or delay() loops (i.e. bullets 2-4 above) | ||
2648 | but they are still guaranteed to be ordered with respect to other accesses | ||
2649 | to the same peripheral when operating on __iomem pointers mapped with the | ||
2650 | default I/O attributes. | ||
2637 | 2651 | ||
2638 | However, intermediary hardware (such as a PCI bridge) may indulge in | 2652 | (*) readsX(), writesX(): |
2639 | deferral if it so wishes; to flush a store, a load from the same location | ||
2640 | is preferred[*], but a load from the same device or from configuration | ||
2641 | space should suffice for PCI. | ||
2642 | 2653 | ||
2643 | [*] NOTE! attempting to load from the same location as was written to may | 2654 | The readsX() and writesX() MMIO accessors are designed for accessing |
2644 | cause a malfunction - consider the 16550 Rx/Tx serial registers for | 2655 | register-based, memory-mapped FIFOs residing on peripherals that are not |
2645 | example. | 2656 | capable of performing DMA. Consequently, they provide only the ordering |
2657 | guarantees of readX_relaxed() and writeX_relaxed(), as documented above. | ||
2646 | 2658 | ||
2647 | Used with prefetchable I/O memory, an mmiowb() barrier may be required to | 2659 | (*) inX(), outX(): |
2648 | force stores to be ordered. | ||
2649 | 2660 | ||
2650 | Please refer to the PCI specification for more information on interactions | 2661 | The inX() and outX() accessors are intended to access legacy port-mapped |
2651 | between PCI transactions. | 2662 | I/O peripherals, which may require special instructions on some |
2663 | architectures (notably x86). The port number of the peripheral being | ||
2664 | accessed is passed as an argument. | ||
2652 | 2665 | ||
2653 | (*) readX_relaxed(), writeX_relaxed() | 2666 | Since many CPU architectures ultimately access these peripherals via an |
2667 | internal virtual memory mapping, the portable ordering guarantees provided | ||
2668 | by inX() and outX() are the same as those provided by readX() and writeX() | ||
2669 | respectively when accessing a mapping with the default I/O attributes. | ||
2654 | 2670 | ||
2655 | These are similar to readX() and writeX(), but provide weaker memory | 2671 | Device drivers may expect outX() to emit a non-posted write transaction |
2656 | ordering guarantees. Specifically, they do not guarantee ordering with | 2672 | that waits for a completion response from the I/O peripheral before |
2657 | respect to normal memory accesses (e.g. DMA buffers) nor do they guarantee | 2673 | returning. This is not guaranteed by all architectures and is therefore |
2658 | ordering with respect to LOCK or UNLOCK operations. If the latter is | 2674 | not part of the portable ordering semantics. |
2659 | required, an mmiowb() barrier can be used. Note that relaxed accesses to | 2675 | |
2660 | the same peripheral are guaranteed to be ordered with respect to each | 2676 | (*) insX(), outsX(): |
2661 | other. | 2677 | |
2678 | As above, the insX() and outsX() accessors provide the same ordering | ||
2679 | guarantees as readsX() and writesX() respectively when accessing a mapping | ||
2680 | with the default I/O attributes. | ||
2662 | 2681 | ||
2663 | (*) ioreadX(), iowriteX() | 2682 | (*) ioreadX(), iowriteX() |
2664 | 2683 | ||
2665 | These will perform appropriately for the type of access they're actually | 2684 | These will perform appropriately for the type of access they're actually |
2666 | doing, be it inX()/outX() or readX()/writeX(). | 2685 | doing, be it inX()/outX() or readX()/writeX(). |
2667 | 2686 | ||
2687 | All of these accessors assume that the underlying peripheral is little-endian, | ||
2688 | and will therefore perform byte-swapping operations on big-endian architectures. | ||
2689 | |||
2690 | Composing I/O ordering barriers with SMP ordering barriers and LOCK/UNLOCK | ||
2691 | operations is a dangerous sport which may require the use of mmiowb(). See the | ||
2692 | subsection "Acquires vs I/O accesses" for more information. | ||
2668 | 2693 | ||
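The little-endian assumption stated at the end of the new section can be illustrated in user space. This is a minimal sketch, not the kernel's implementation: le32_load() and le32_store() are hypothetical helpers that read and write a 32-bit register image byte by byte, so the result is the same on little-endian and big-endian hosts. That is the effect the byte-swapping in the accessors is meant to achieve on big-endian architectures.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: interpret four bytes as a little-endian 32-bit
 * value, independent of the byte order of the CPU running this code.
 */
static uint32_t le32_load(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: store a 32-bit value in little-endian byte order. */
static void le32_store(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}
```

On a little-endian CPU these helpers amount to a plain 32-bit load and store; on a big-endian CPU they are equivalent to a load or store followed by a byte swap.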
2669 | ======================================== | 2694 | ======================================== |
2670 | ASSUMED MINIMUM EXECUTION ORDERING MODEL | 2695 | ASSUMED MINIMUM EXECUTION ORDERING MODEL |