memory-barriers: Rework multicopy-atomicity section

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
author: Alan Stern <stern@rowland.harvard.edu> 2017-09-01 10:53:34 -0400
committer: Paul E. McKenney <paulmck@linux.vnet.ibm.com> 2017-10-09 17:23:37 -0400
commit: 0902b1f44a72558aece92f074154044861681f84
tree: eb6505ea836b4248a6effb1788ee29a5b87e18b8
parent: f1ab25a30ce81f4e9be3cb33cd9bb9fb2db64b28
1 files changed, 30 insertions, 28 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index b6882680247e..7deee1441640 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1343,13 +1343,13 @@ MULTICOPY ATOMICITY
 Multicopy atomicity is a deeply intuitive notion about ordering that is
 not always provided by real computer systems, namely that a given store
-is visible at the same time to all CPUs, or, alternatively, that all
+becomes visible at the same time to all CPUs, or, alternatively, that all
-CPUs agree on the order in which all stores took place.  However, use of
+CPUs agree on the order in which all stores become visible.  However,
-full multicopy atomicity would rule out valuable hardware optimizations,
+support of full multicopy atomicity would rule out valuable hardware
-so a weaker form called ``other multicopy atomicity'' instead guarantees
+optimizations, so a weaker form called ``other multicopy atomicity''
-that a given store is observed at the same time by all -other- CPUs.  The
+instead guarantees only that a given store becomes visible at the same
-remainder of this document discusses this weaker form, but for brevity
+time to all -other- CPUs.  The remainder of this document discusses this
-will call it simply ``multicopy atomicity''.
+weaker form, but for brevity will call it simply ``multicopy atomicity''.
 The following example demonstrates multicopy atomicity:
@@ -1360,24 +1360,26 @@ The following example demonstrates multicopy atomicity:
                                <general barrier>       <read barrier>
                                STORE Y=r1              LOAD X
-Suppose that CPU 2's load from X returns 1 which it then stores to Y and
+Suppose that CPU 2's load from X returns 1, which it then stores to Y,
-that CPU 3's load from Y returns 1.  This indicates that CPU 2's load
+and CPU 3's load from Y returns 1.  This indicates that CPU 1's store
-from X in some sense follows CPU 1's store to X and that CPU 2's store
+to X precedes CPU 2's load from X and that CPU 2's store to Y precedes
-to Y in some sense preceded CPU 3's load from Y.  The question is then
+CPU 3's load from Y.  In addition, the memory barriers guarantee that
-"Can CPU 3's load from X return 0?"
+CPU 2 executes its load before its store, and CPU 3 loads from Y before
+it loads from X.  The question is then "Can CPU 3's load from X return 0?"
-Because CPU 3's load from X in some sense came after CPU 2's load, it
+Because CPU 3's load from X in some sense comes after CPU 2's load, it
 is natural to expect that CPU 3's load from X must therefore return 1.
-This expectation is an example of multicopy atomicity: if a load executing
+This expectation follows from multicopy atomicity: if a load executing
-on CPU A follows a load from the same variable executing on CPU B, then
+on CPU B follows a load from the same variable executing on CPU A (and
-an understandable but incorrect expectation is that CPU A's load must
+CPU A did not originally store the value which it read), then on
-either return the same value that CPU B's load did, or must return some
+multicopy-atomic systems, CPU B's load must return either the same value
-later value.
+that CPU A's load did or some later value.  However, the Linux kernel
+does not require systems to be multicopy atomic.
-In the Linux kernel, the above use of a general memory barrier compensates
-for any lack of multicopy atomicity.  Therefore, in the above example,
+The use of a general memory barrier in the example above compensates
-if CPU 2's load from X returns 1 and its load from Y returns 0, and CPU 3's
+for any lack of multicopy atomicity.  In the example, if CPU 2's load
-load from Y returns 1, then CPU 3's load from X must also return 1.
+from X returns 1 and CPU 3's load from Y returns 1, then CPU 3's load
+from X must indeed also return 1.
 However, dependencies, read barriers, and write barriers are not always
 able to compensate for non-multicopy atomicity.  For example, suppose
@@ -1396,11 +1398,11 @@ this example, it is perfectly legal for CPU 2's load from X to return 1,
 CPU 3's load from Y to return 1, and its load from X to return 0.
 The key point is that although CPU 2's data dependency orders its load
-and store, it does not guarantee to order CPU 1's store.  Therefore,
+and store, it does not guarantee to order CPU 1's store.  Thus, if this
-if this example runs on a non-multicopy-atomic system where CPUs 1 and 2
+example runs on a non-multicopy-atomic system where CPUs 1 and 2 share a
-share a store buffer or a level of cache, CPU 2 might have early access
+store buffer or a level of cache, CPU 2 might have early access to CPU 1's
-to CPU 1's writes.  A general barrier is therefore required to ensure
+writes.  General barriers are therefore required to ensure that all CPUs
-that all CPUs agree on the combined order of CPU 1's and CPU 2's accesses.
+agree on the combined order of multiple accesses.
 General barriers can compensate not only for non-multicopy atomicity,
 but can also generate additional ordering that can ensure that -all-
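
For readers who want to experiment with the three-CPU example discussed in the patch, the following is a minimal userspace sketch, not kernel code. It assumes that X and Y start at 0 and that CPU 1 stores 1 to X (inferred from the prose, since the diff context omits the first lines of the example). It also assumes that a C11 seq_cst fence is an acceptable stand-in for <general barrier> and a C11 acquire fence for <read barrier>; the kernel's smp_mb() and smp_rmb() are separate primitives with their own definitions. The helper name run_wrc_litmus() is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Shared variables from the example; assumed to start at 0. */
static atomic_int X, Y;
/* Values observed by "CPU 2" (r1) and "CPU 3" (r2, r3). */
static int r1, r2, r3;

static void *cpu1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&X, 1, memory_order_relaxed);	/* STORE X=1 */
	return NULL;
}

static void *cpu2(void *arg)
{
	(void)arg;
	r1 = atomic_load_explicit(&X, memory_order_relaxed);	/* r1=LOAD X */
	/* Assumed stand-in for <general barrier>. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&Y, r1, memory_order_relaxed);	/* STORE Y=r1 */
	return NULL;
}

static void *cpu3(void *arg)
{
	(void)arg;
	r2 = atomic_load_explicit(&Y, memory_order_relaxed);	/* LOAD Y */
	/* Assumed stand-in for <read barrier>. */
	atomic_thread_fence(memory_order_acquire);
	r3 = atomic_load_explicit(&X, memory_order_relaxed);	/* LOAD X */
	return NULL;
}

/*
 * Hypothetical helper: run the three-thread test `iterations` times and
 * return how often the outcome r1 == 1, r2 == 1, r3 == 0 was observed.
 */
int run_wrc_litmus(int iterations)
{
	void *(*fn[3])(void *) = { cpu1, cpu2, cpu3 };
	int forbidden = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t t[3];

		/* Reset shared state before the threads start. */
		atomic_store(&X, 0);
		atomic_store(&Y, 0);

		for (int j = 0; j < 3; j++) {
			int rc = pthread_create(&t[j], NULL, fn[j], NULL);
			assert(rc == 0);
			(void)rc;
		}
		/* Joining makes r1, r2 and r3 safe to read below. */
		for (int j = 0; j < 3; j++)
			pthread_join(t[j], NULL);

		if (r1 == 1 && r2 == 1 && r3 == 0)
			forbidden++;
	}
	return forbidden;
}
```

Calling run_wrc_litmus(1000) from a small main() and building with -pthread should report zero forbidden outcomes, because under the C11 memory model the two fences order CPU 2's load from X before CPU 3's load from X. A run that finds nothing does not prove the ordering on its own; the guarantee comes from the memory model, and the loop only checks that the implementation honors it.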