Diffstat (limited to 'Documentation/io_ordering.txt')
-rw-r--r-- | Documentation/io_ordering.txt | 47 |
1 files changed, 47 insertions, 0 deletions
diff --git a/Documentation/io_ordering.txt b/Documentation/io_ordering.txt
new file mode 100644
index 000000000000..9faae6f26d32
--- /dev/null
+++ b/Documentation/io_ordering.txt
@@ -0,0 +1,47 @@
On some platforms, so-called memory-mapped I/O is weakly ordered. On such
platforms, driver writers are responsible for ensuring that I/O writes to
memory-mapped addresses on their device arrive in the order intended. This is
typically done by reading a 'safe' device or bridge register, which causes the
I/O chipset to flush pending writes to the device before any reads are posted.
A driver would usually use this technique immediately before leaving a
critical section of code protected by spinlocks. This ensures that subsequent
writes to I/O space arrive only after all prior writes (much like a memory
barrier op, mb(), only with respect to I/O).

A more concrete example from a hypothetical device driver:

        ...
CPU A:  spin_lock_irqsave(&dev_lock, flags);
CPU A:  val = readl(my_status);
CPU A:  ...
CPU A:  writel(newval, ring_ptr);
CPU A:  spin_unlock_irqrestore(&dev_lock, flags);
        ...
CPU B:  spin_lock_irqsave(&dev_lock, flags);
CPU B:  val = readl(my_status);
CPU B:  ...
CPU B:  writel(newval2, ring_ptr);
CPU B:  spin_unlock_irqrestore(&dev_lock, flags);
        ...

In the case above, the device may receive newval2 before it receives newval,
even though CPU A released the lock before CPU B took it, which could cause
problems. Fixing it is easy enough though:

        ...
CPU A:  spin_lock_irqsave(&dev_lock, flags);
CPU A:  val = readl(my_status);
CPU A:  ...
CPU A:  writel(newval, ring_ptr);
CPU A:  (void)readl(safe_register); /* maybe a config register? */
CPU A:  spin_unlock_irqrestore(&dev_lock, flags);
        ...
CPU B:  spin_lock_irqsave(&dev_lock, flags);
CPU B:  val = readl(my_status);
CPU B:  ...
CPU B:  writel(newval2, ring_ptr);
CPU B:  (void)readl(safe_register); /* maybe a config register? */
CPU B:  spin_unlock_irqrestore(&dev_lock, flags);

Here, the reads from safe_register will cause the I/O chipset to flush any
pending writes before actually posting the read to the chipset, preventing
possible data corruption.
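
The difference between the two sequences can be illustrated with a small
user-space model. This is only a sketch under simplifying assumptions, not
kernel code: every name in it (toy_device, toy_writel, toy_readl_flush,
toy_run) is hypothetical, each CPU is given its own queue of posted writes,
and the unflushed case deliberately drains CPU B's queue first to show one
ordering that weak ordering would permit. Real chipsets may behave differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NCPUS   2
#define TOY_QDEPTH  8
#define TOY_LOGLEN  16

/* Writes a CPU has issued that have not yet reached the device. */
struct toy_queue {
	uint32_t vals[TOY_QDEPTH];
	size_t count;
};

struct toy_device {
	struct toy_queue pending[TOY_NCPUS];
	uint32_t arrived[TOY_LOGLEN];	/* values in device arrival order */
	size_t narrived;
};

/* Stand-in for writel(): the write is posted, not delivered. */
static void toy_writel(struct toy_device *dev, int cpu, uint32_t val)
{
	struct toy_queue *q = &dev->pending[cpu];

	assert(q->count < TOY_QDEPTH);
	q->vals[q->count++] = val;
}

/* Deliver one CPU's posted writes to the device in program order. */
static void toy_drain(struct toy_device *dev, int cpu)
{
	struct toy_queue *q = &dev->pending[cpu];
	size_t i;

	for (i = 0; i < q->count; i++) {
		assert(dev->narrived < TOY_LOGLEN);
		dev->arrived[dev->narrived++] = q->vals[i];
	}
	q->count = 0;
}

/*
 * Stand-in for readl(safe_register): in this model the read cannot
 * complete until the issuing CPU's earlier posted writes are delivered.
 */
static uint32_t toy_readl_flush(struct toy_device *dev, int cpu)
{
	toy_drain(dev, cpu);
	return 0;
}

/* CPU A (0) runs its critical section, then CPU B (1) runs its own. */
static void toy_run(struct toy_device *dev, uint32_t newval,
		    uint32_t newval2, int use_flush)
{
	toy_writel(dev, 0, newval);
	if (use_flush)
		(void)toy_readl_flush(dev, 0);
	/* CPU A unlocks here; CPU B takes the lock. */
	toy_writel(dev, 1, newval2);
	if (use_flush)
		(void)toy_readl_flush(dev, 1);
	/* Assumed reordering: CPU B's posted writes drain first. */
	toy_drain(dev, 1);
	toy_drain(dev, 0);
}
```

Calling toy_run() on a zero-initialized struct toy_device with use_flush set
to 0 records newval2 ahead of newval in arrived[], while use_flush set to 1
records them in lock order, because each write has already been delivered
by the time its CPU leaves the critical section.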