path: root/Documentation/memory-barriers.txt
author    Linus Torvalds <torvalds@linux-foundation.org>  2015-11-03 20:22:17 -0500
committer Linus Torvalds <torvalds@linux-foundation.org>  2015-11-03 20:22:17 -0500
commit    105ff3cbf225036b75a6a46c96d1ddce8e7bdc66 (patch)
tree      d37498ac55c638a2a4b4083bdc1adac74c55f5c1 /Documentation/memory-barriers.txt
parent    d63a9788650fcd999b34584316afee6bd4378f19 (diff)
atomic: remove all traces of READ_ONCE_CTRL() and atomic*_read_ctrl()
This seems to be a mis-reading of how alpha memory ordering works, and is
not backed up by the alpha architecture manual. The helper functions don't
do anything special on any other architectures, and the arguments that
support them being safe on other architectures also argue that they are
safe on alpha.

Basically, the "control dependency" is between a previous read and a
subsequent write that is dependent on the value read. Even if the
subsequent write is actually done speculatively, there is no way that such
a speculative write could be made visible to other CPUs until it has been
committed, which requires validating the speculation.

Note that most weakly ordered architectures (very much including alpha) do
not guarantee any ordering relationship between two loads that depend on
each other on a control dependency:

	read A
	if (val == 1)
		read B

because the conditional may be predicted, and the "read B" may be
speculatively moved up to before reading the value A. So we require the
user to insert an smp_rmb() between the two accesses to be correct:

	read A;
	if (A == 1)
		smp_rmb()
	read B

Alpha is further special in that it can break that ordering even if the
*address* of B depends on the read of A, because the cacheline that is
read later may be stale unless you have a memory barrier in between the
pointer read and the read of the value behind a pointer:

	read ptr
	read offset(ptr)

whereas all other weakly ordered architectures guarantee that the data
dependency (as opposed to just a control dependency) will order the two
accesses. As a result, alpha needs a "smp_read_barrier_depends()" in
between those two reads for them to be ordered.

The control dependency that "READ_ONCE_CTRL()" and "atomic_read_ctrl()"
had was a control dependency to a subsequent *write*, however, and nobody
can finalize such a subsequent write without having actually done the
read. And were you to write such a value to a "stale" cacheline (the way
the unordered reads came to be), that would seem to lose the write
entirely.

So the things that make alpha able to re-order reads even more
aggressively than other weak architectures do not seem to be relevant for
a subsequent write. Alpha memory ordering may be strange, but there's no
real indication that it is *that* strange.

Also, the alpha architecture reference manual very explicitly talks about
the definition of "Dependence Constraints" in section 5.6.1.7, where a
preceding read dominates a subsequent write. Such a dependence constraint
admittedly does not impose a BEFORE (alpha architecture term for globally
visible ordering), but it does guarantee that there can be no "causal
loop". I don't see how you could avoid such a loop if another CPU could
see the stored value and then impact the value of the first read. Put
another way: the read and the write could not be seen as being out of
order wrt other CPUs.

So I do not see how these "x_ctrl()" functions can currently be necessary.

I may have to eat my words at some point, but in the absence of clear
proof that alpha actually needs this, or indeed even an explanation of how
alpha could _possibly_ need it, I do not believe these functions are
called for.

And if it turns out that alpha really _does_ need a barrier for this case,
that barrier still should not be "smp_read_barrier_depends()". We'd have
to make up some new speciality barrier just for alpha, along with the
documentation for why it really is necessary.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul E McKenney <paulmck@us.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation/memory-barriers.txt')
-rw-r--r--  Documentation/memory-barriers.txt  54
1 file changed, 22 insertions(+), 32 deletions(-)
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index b5fe7657456e..aef9487303d0 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -617,16 +617,16 @@ case what's actually required is:
 However, stores are not speculated. This means that ordering -is- provided
 for load-store control dependencies, as in the following example:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q) {
 		WRITE_ONCE(b, p);
 	}
 
 Control dependencies pair normally with other types of barriers. That
-said, please note that READ_ONCE_CTRL() is not optional! Without the
-READ_ONCE_CTRL(), the compiler might combine the load from 'a' with
-other loads from 'a', and the store to 'b' with other stores to 'b',
-with possible highly counterintuitive effects on ordering.
+said, please note that READ_ONCE() is not optional! Without the
+READ_ONCE(), the compiler might combine the load from 'a' with other
+loads from 'a', and the store to 'b' with other stores to 'b', with
+possible highly counterintuitive effects on ordering.
 
 Worse yet, if the compiler is able to prove (say) that the value of
 variable 'a' is always non-zero, it would be well within its rights
@@ -636,16 +636,12 @@ as follows:
 	q = a;
 	b = p;  /* BUG: Compiler and CPU can both reorder!!! */
 
-Finally, the READ_ONCE_CTRL() includes an smp_read_barrier_depends()
-that DEC Alpha needs in order to respect control depedencies. Alternatively
-use one of atomic{,64}_read_ctrl().
-
-So don't leave out the READ_ONCE_CTRL().
+So don't leave out the READ_ONCE().
 
 It is tempting to try to enforce ordering on identical stores on both
 branches of the "if" statement as follows:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q) {
 		barrier();
 		WRITE_ONCE(b, p);
@@ -659,7 +655,7 @@ branches of the "if" statement as follows:
 Unfortunately, current compilers will transform this as follows at high
 optimization levels:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	barrier();
 	WRITE_ONCE(b, p);  /* BUG: No ordering vs. load from a!!! */
 	if (q) {
@@ -689,7 +685,7 @@ memory barriers, for example, smp_store_release():
 In contrast, without explicit memory barriers, two-legged-if control
 ordering is guaranteed only when the stores differ, for example:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q) {
 		WRITE_ONCE(b, p);
 		do_something();
@@ -698,14 +694,14 @@ ordering is guaranteed only when the stores differ, for example:
 		do_something_else();
 	}
 
-The initial READ_ONCE_CTRL() is still required to prevent the compiler
-from proving the value of 'a'.
+The initial READ_ONCE() is still required to prevent the compiler from
+proving the value of 'a'.
 
 In addition, you need to be careful what you do with the local variable 'q',
 otherwise the compiler might be able to guess the value and again remove
 the needed conditional. For example:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q % MAX) {
 		WRITE_ONCE(b, p);
 		do_something();
@@ -718,7 +714,7 @@ If MAX is defined to be 1, then the compiler knows that (q % MAX) is
 equal to zero, in which case the compiler is within its rights to
 transform the above code into the following:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	WRITE_ONCE(b, p);
 	do_something_else();
 
@@ -729,7 +725,7 @@ is gone, and the barrier won't bring it back. Therefore, if you are
 relying on this ordering, you should make sure that MAX is greater than
 one, perhaps as follows:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
 	if (q % MAX) {
 		WRITE_ONCE(b, p);
@@ -746,7 +742,7 @@ of the 'if' statement.
 You must also be careful not to rely too much on boolean short-circuit
 evaluation. Consider this example:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q || 1 > 0)
 		WRITE_ONCE(b, 1);
 
@@ -754,7 +750,7 @@ Because the first condition cannot fault and the second condition is
 always true, the compiler can transform this example as following,
 defeating control dependency:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	WRITE_ONCE(b, 1);
 
 This example underscores the need to ensure that the compiler cannot
@@ -768,7 +764,7 @@ x and y both being zero:
 
 	CPU 0                     CPU 1
 	=======================   =======================
-	r1 = READ_ONCE_CTRL(x);   r2 = READ_ONCE_CTRL(y);
+	r1 = READ_ONCE(x);        r2 = READ_ONCE(y);
 	if (r1 > 0)               if (r2 > 0)
 	  WRITE_ONCE(y, 1);       WRITE_ONCE(x, 1);
 
@@ -797,11 +793,6 @@ site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.
 
 In summary:
 
-  (*) Control dependencies must be headed by READ_ONCE_CTRL(),
-      atomic{,64}_read_ctrl(). Or, as a much less preferable alternative,
-      interpose smp_read_barrier_depends() between a READ_ONCE() and the
-      control-dependent write.
-
  (*) Control dependencies can order prior loads against later stores.
      However, they do -not- guarantee any other sort of ordering:
      Not prior loads against later loads, nor prior stores against
@@ -817,14 +808,13 @@ In summary:
      between the prior load and the subsequent store, and this
      conditional must involve the prior load. If the compiler is able
      to optimize the conditional away, it will have also optimized
-     away the ordering. Careful use of READ_ONCE_CTRL() READ_ONCE(),
-     and WRITE_ONCE() can help to preserve the needed conditional.
+     away the ordering. Careful use of READ_ONCE() and WRITE_ONCE()
+     can help to preserve the needed conditional.
 
  (*) Control dependencies require that the compiler avoid reordering the
-     dependency into nonexistence. Careful use of READ_ONCE_CTRL(),
-     atomic{,64}_read_ctrl() or smp_read_barrier_depends() can help to
-     preserve your control dependency. Please see the Compiler Barrier
-     section for more information.
+     dependency into nonexistence. Careful use of READ_ONCE() or
+     atomic{,64}_read() can help to preserve your control dependency.
+     Please see the Compiler Barrier section for more information.
 
  (*) Control dependencies pair normally with other types of barriers.
 