smp: Document transitivity for memory barriers.

Transitivity is guaranteed only for full memory barriers (smp_mb()). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> 2011-02-10 19:54:50 -0500
committer: Paul E. McKenney <paulmck@linux.vnet.ibm.com> 2011-03-04 11:05:49 -0500
commit: 241e6663b5151733294d1a230a3fd8a4d32e187f (patch)
tree: 088c44c088d13df65851a33f02f533ee3207335a /Documentation/memory-barriers.txt
parent: e611eecd6f9f91d74beda8cfbb3b5e02abdeb5a1 (diff)
1 files changed, 58 insertions, 0 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 631ad2f1b229..f0d3a8026a56 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -21,6 +21,7 @@ Contents:
     - SMP barrier pairing.
     - Examples of memory barrier sequences.
     - Read memory barriers vs load speculation.
+     - Transitivity
 (*) Explicit kernel barriers.
@@ -959,6 +960,63 @@ the speculation will be cancelled and the value reloaded:
        retrieved                               :       :       +-------+
+TRANSITIVITY
+------------
+Transitivity is a deeply intuitive notion about ordering that is not
+always provided by real computer systems.  The following example
+demonstrates transitivity (also called "cumulativity"):
+        CPU 1                   CPU 2                   CPU 3
+        ======================= ======================= =======================
+                { X = 0, Y = 0 }
+        STORE X=1               LOAD X                  STORE Y=1
+                                <general barrier>       <general barrier>
+                                LOAD Y                  LOAD X
+Suppose that CPU 2's load from X returns 1 and its load from Y returns 0.
+This indicates that CPU 2's load from X in some sense follows CPU 1's
+store to X and that CPU 2's load from Y in some sense preceded CPU 3's
+store to Y.  The question is then "Can CPU 3's load from X return 0?"
+Because CPU 2's load from X in some sense came after CPU 1's store, it
+is natural to expect that CPU 3's load from X must therefore return 1.
+This expectation is an example of transitivity: if a load executing on
+CPU A follows a load from the same variable executing on CPU B, then
+CPU A's load must either return the same value that CPU B's load did,
+or must return some later value.
+In the Linux kernel, use of general memory barriers guarantees
+transitivity.  Therefore, in the above example, if CPU 2's load from X
+returns 1 and its load from Y returns 0, then CPU 3's load from X must
+also return 1.
+However, transitivity is -not- guaranteed for read or write barriers.
+For example, suppose that CPU 2's general barrier in the above example
+is changed to a read barrier as shown below:
+        CPU 1                   CPU 2                   CPU 3
+        ======================= ======================= =======================
+                { X = 0, Y = 0 }
+        STORE X=1               LOAD X                  STORE Y=1
+                                <read barrier>          <general barrier>
+                                LOAD Y                  LOAD X
+This substitution destroys transitivity: in this example, it is perfectly
+legal for CPU 2's load from X to return 1, its load from Y to return 0,
+and CPU 3's load from X to return 0.
+The key point is that although CPU 2's read barrier orders its pair
+of loads, it does not guarantee to order CPU 1's store.  Therefore, if
+this example runs on a system where CPUs 1 and 2 share a store buffer
+or a level of cache, CPU 2 might have early access to CPU 1's writes.
+General barriers are therefore required to ensure that all CPUs agree
+on the combined order of CPU 1's and CPU 2's accesses.
+To reiterate, if your code requires transitivity, use general barriers
+throughout.
 ========================
 EXPLICIT KERNEL BARRIERS
 ========================
author	Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-02-10 19:54:50 -0500
committer	Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-03-04 11:05:49 -0500
commit	241e6663b5151733294d1a230a3fd8a4d32e187f (patch)
tree	088c44c088d13df65851a33f02f533ee3207335a /Documentation/memory-barriers.txt
parent	e611eecd6f9f91d74beda8cfbb3b5e02abdeb5a1 (diff)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 631ad2f1b229..f0d3a8026a56 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt
@@ -21,6 +21,7 @@ Contents:
21	- SMP barrier pairing.	21	- SMP barrier pairing.
22	- Examples of memory barrier sequences.	22	- Examples of memory barrier sequences.
23	- Read memory barriers vs load speculation.	23	- Read memory barriers vs load speculation.
		24	- Transitivity
24		25
25	(*) Explicit kernel barriers.	26	(*) Explicit kernel barriers.
26		27
@@ -959,6 +960,63 @@ the speculation will be cancelled and the value reloaded:
959	retrieved : : +-------+	960	retrieved : : +-------+
960		961
961		962
		963	TRANSITIVITY
		964	------------
		965
		966	Transitivity is a deeply intuitive notion about ordering that is not
		967	always provided by real computer systems. The following example
		968	demonstrates transitivity (also called "cumulativity"):
		969
		970	CPU 1 CPU 2 CPU 3
		971	======================= ======================= =======================
		972	{ X = 0, Y = 0 }
		973	STORE X=1 LOAD X STORE Y=1
		974	<general barrier> <general barrier>
		975	LOAD Y LOAD X
		976
		977	Suppose that CPU 2's load from X returns 1 and its load from Y returns 0.
		978	This indicates that CPU 2's load from X in some sense follows CPU 1's
		979	store to X and that CPU 2's load from Y in some sense preceded CPU 3's
		980	store to Y. The question is then "Can CPU 3's load from X return 0?"
		981
		982	Because CPU 2's load from X in some sense came after CPU 1's store, it
		983	is natural to expect that CPU 3's load from X must therefore return 1.
		984	This expectation is an example of transitivity: if a load executing on
		985	CPU A follows a load from the same variable executing on CPU B, then
		986	CPU A's load must either return the same value that CPU B's load did,
		987	or must return some later value.
		988
		989	In the Linux kernel, use of general memory barriers guarantees
		990	transitivity. Therefore, in the above example, if CPU 2's load from X
		991	returns 1 and its load from Y returns 0, then CPU 3's load from X must
		992	also return 1.
		993
		994	However, transitivity is -not- guaranteed for read or write barriers.
		995	For example, suppose that CPU 2's general barrier in the above example
		996	is changed to a read barrier as shown below:
		997
		998	CPU 1 CPU 2 CPU 3
		999	======================= ======================= =======================
		1000	{ X = 0, Y = 0 }
		1001	STORE X=1 LOAD X STORE Y=1
		1002	<read barrier> <general barrier>
		1003	LOAD Y LOAD X
		1004
		1005	This substitution destroys transitivity: in this example, it is perfectly
		1006	legal for CPU 2's load from X to return 1, its load from Y to return 0,
		1007	and CPU 3's load from X to return 0.
		1008
		1009	The key point is that although CPU 2's read barrier orders its pair
		1010	of loads, it does not guarantee to order CPU 1's store. Therefore, if
		1011	this example runs on a system where CPUs 1 and 2 share a store buffer
		1012	or a level of cache, CPU 2 might have early access to CPU 1's writes.
		1013	General barriers are therefore required to ensure that all CPUs agree
		1014	on the combined order of CPU 1's and CPU 2's accesses.
		1015
		1016	To reiterate, if your code requires transitivity, use general barriers
		1017	throughout.
		1018
		1019
962	========================	1020	========================
963	EXPLICIT KERNEL BARRIERS	1021	EXPLICIT KERNEL BARRIERS
964	========================	1022	========================