aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorPaul E. McKenney <paulmck@linux.vnet.ibm.com>2011-02-10 19:54:50 -0500
committerPaul E. McKenney <paulmck@linux.vnet.ibm.com>2011-03-04 11:05:49 -0500
commit241e6663b5151733294d1a230a3fd8a4d32e187f (patch)
tree088c44c088d13df65851a33f02f533ee3207335a
parente611eecd6f9f91d74beda8cfbb3b5e02abdeb5a1 (diff)
smp: Document transitivity for memory barriers.
Transitivity is guaranteed only for full memory barriers (smp_mb()). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-rw-r--r--Documentation/memory-barriers.txt58
1 files changed, 58 insertions, 0 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 631ad2f1b229..f0d3a8026a56 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -21,6 +21,7 @@ Contents:
21 - SMP barrier pairing. 21 - SMP barrier pairing.
22 - Examples of memory barrier sequences. 22 - Examples of memory barrier sequences.
23 - Read memory barriers vs load speculation. 23 - Read memory barriers vs load speculation.
24 - Transitivity
24 25
25 (*) Explicit kernel barriers. 26 (*) Explicit kernel barriers.
26 27
@@ -959,6 +960,63 @@ the speculation will be cancelled and the value reloaded:
959 retrieved : : +-------+ 960 retrieved : : +-------+
960 961
961 962
963TRANSITIVITY
964------------
965
966Transitivity is a deeply intuitive notion about ordering that is not
967always provided by real computer systems. The following example
968demonstrates transitivity (also called "cumulativity"):
969
970 CPU 1 CPU 2 CPU 3
971 ======================= ======================= =======================
972 { X = 0, Y = 0 }
973 STORE X=1 LOAD X STORE Y=1
974 <general barrier> <general barrier>
975 LOAD Y LOAD X
976
977Suppose that CPU 2's load from X returns 1 and its load from Y returns 0.
978This indicates that CPU 2's load from X in some sense follows CPU 1's
979store to X and that CPU 2's load from Y in some sense preceded CPU 3's
980store to Y. The question is then "Can CPU 3's load from X return 0?"
981
982Because CPU 2's load from X in some sense came after CPU 1's store, it
983is natural to expect that CPU 3's load from X must therefore return 1.
984This expectation is an example of transitivity: if a load executing on
985CPU A follows a load from the same variable executing on CPU B, then
986CPU A's load must either return the same value that CPU B's load did,
987or must return some later value.
988
989In the Linux kernel, use of general memory barriers guarantees
990transitivity. Therefore, in the above example, if CPU 2's load from X
991returns 1 and its load from Y returns 0, then CPU 3's load from X must
992also return 1.
993
994However, transitivity is -not- guaranteed for read or write barriers.
995For example, suppose that CPU 2's general barrier in the above example
996is changed to a read barrier as shown below:
997
998 CPU 1 CPU 2 CPU 3
999 ======================= ======================= =======================
1000 { X = 0, Y = 0 }
1001 STORE X=1 LOAD X STORE Y=1
1002 <read barrier> <general barrier>
1003 LOAD Y LOAD X
1004
1005This substitution destroys transitivity: in this example, it is perfectly
1006legal for CPU 2's load from X to return 1, its load from Y to return 0,
1007and CPU 3's load from X to return 0.
1008
1009The key point is that although CPU 2's read barrier orders its pair
1010of loads, it does not guarantee to order CPU 1's store. Therefore, if
1011this example runs on a system where CPUs 1 and 2 share a store buffer
1012or a level of cache, CPU 2 might have early access to CPU 1's writes.
1013General barriers are therefore required to ensure that all CPUs agree
1014on the combined order of CPU 1's and CPU 2's accesses.
1015
1016To reiterate, if your code requires transitivity, use general barriers
1017throughout.
1018
1019
962======================== 1020========================
963EXPLICIT KERNEL BARRIERS 1021EXPLICIT KERNEL BARRIERS
964======================== 1022========================