author Peter Zijlstra <peterz@infradead.org> 2013-12-11 16:59:06 -0500
committer Ingo Molnar <mingo@kernel.org> 2013-12-16 05:36:11 -0500
commit 18c03c61444a211237f3d4782353cb38dba795df
tree 32f92af0726cfdd576370f2965418f5a859e603b
parent fb2b581968db140586e8d7db38ff278f60872313
Documentation/memory-barriers.txt: Prohibit speculative writes
No SMP architecture currently supporting Linux allows speculative writes,
so this commit updates Documentation/memory-barriers.txt to prohibit them
in Linux core code. It also records restrictions on their use.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-3-git-send-email-paulmck@linux.vnet.ibm.com
[ Paul modified the original patch from Peter. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-rw-r--r-- Documentation/memory-barriers.txt | 183
1 files changed, 175 insertions, 8 deletions
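As an illustration of the load-to-store control dependency that this patch documents, the following user-space sketch mirrors the pattern from the new text. It assumes GCC or Clang (for __typeof__ and C11 _Static_assert). The ACCESS_ONCE() definition is the volatile-cast form; the variables a and b, the helper names store_if_set(), store_by_bucket() and control_dependency_self_check(), and the value of MAX are hypothetical. Run single-threaded, it shows only the code shape the compiler must preserve, not the CPU-level ordering itself.

```c
#include <assert.h>

/* Volatile-cast form of ACCESS_ONCE(); __typeof__ assumes GCC or Clang. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical divisor for the (q % MAX) example; must be greater than 1. */
#define MAX 4
_Static_assert(MAX > 1, "MAX <= 1 lets the compiler fold (q % MAX) to 0");

/* Hypothetical shared variables standing in for 'a' and 'b'. */
static int a;
static int b;

/* Load from a, then store to b only if the loaded value is non-zero. */
static int store_if_set(int p)
{
	int q = ACCESS_ONCE(a);		/* the prior load */

	if (q)				/* run-time conditional on that load */
		ACCESS_ONCE(b) = p;	/* the later store */
	return q;
}

/* Same store on both branches; the conditional survives because MAX > 1. */
static int store_by_bucket(int p)
{
	int q = ACCESS_ONCE(a);

	if (q % MAX) {
		ACCESS_ONCE(b) = p;
		return 1;		/* stands in for do_something() */
	} else {
		ACCESS_ONCE(b) = p;
		return 0;		/* stands in for do_something_else() */
	}
}

/* Single-threaded self-check of the two helpers above. */
static void control_dependency_self_check(void)
{
	int r;

	a = 0;
	b = 0;
	r = store_if_set(7);
	assert(r == 0 && b == 0);	/* no store when a == 0 */

	a = 5;
	r = store_if_set(7);
	assert(r == 5 && b == 7);	/* store happens when a != 0 */

	r = store_by_bucket(9);
	assert(r == 1 && b == 9);	/* 5 % 4 != 0 */

	a = 8;
	r = store_by_bucket(3);
	assert(r == 0 && b == 3);	/* 8 % 4 == 0 */
}
```

The _Static_assert() plays the role that BUILD_BUG_ON(MAX <= 1) plays in the patch text: the build fails if MAX would let the compiler fold the conditional away.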
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 2d22da095a60..deafa36aeea1 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -571,11 +571,10 @@ dependency barrier to make it work correctly. Consider the following bit of
 code:
 
 	q = ACCESS_ONCE(a);
-	if (p) {
-		<data dependency barrier>
-		q = ACCESS_ONCE(b);
+	if (q) {
+		<data dependency barrier> /* BUG: No data dependency!!! */
+		p = ACCESS_ONCE(b);
 	}
-	x = *q;
 
 This will not have the desired effect because there is no actual data
 dependency, but rather a control dependency that the CPU may short-circuit
@@ -584,11 +583,176 @@ the load from b as having happened before the load from a. In such a
 case what's actually required is:
 
 	q = ACCESS_ONCE(a);
-	if (p) {
+	if (q) {
 		<read barrier>
-		q = ACCESS_ONCE(b);
+		p = ACCESS_ONCE(b);
 	}
-	x = *q;
+
+However, stores are not speculated. This means that ordering -is- provided
+in the following example:
+
+	q = ACCESS_ONCE(a);
+	if (ACCESS_ONCE(q)) {
+		ACCESS_ONCE(b) = p;
+	}
+
+Please note that ACCESS_ONCE() is not optional! Without the ACCESS_ONCE(),
+the compiler is within its rights to transform this example:
+
+	q = a;
+	if (q) {
+		b = p; /* BUG: Compiler can reorder!!! */
+		do_something();
+	} else {
+		b = p; /* BUG: Compiler can reorder!!! */
+		do_something_else();
+	}
+
+into this, which of course defeats the ordering:
+
+	b = p;
+	q = a;
+	if (q)
+		do_something();
+	else
+		do_something_else();
+
+Worse yet, if the compiler is able to prove (say) that the value of
+variable 'a' is always non-zero, it would be well within its rights
+to optimize the original example by eliminating the "if" statement
+as follows:
+
+	q = a;
+	b = p; /* BUG: Compiler can reorder!!! */
+	do_something();
+
+The solution is again ACCESS_ONCE(), which preserves the ordering between
+the load from variable 'a' and the store to variable 'b':
+
+	q = ACCESS_ONCE(a);
+	if (q) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+You could also use barrier() to prevent the compiler from moving
+the stores to variable 'b', but barrier() would not prevent the
+compiler from proving to itself that a==1 always, so ACCESS_ONCE()
+is also needed.
+
+It is important to note that control dependencies absolutely require a
+conditional. For example, the following "optimized" version of
+the above example breaks ordering:
+
+	q = ACCESS_ONCE(a);
+	ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */
+	if (q) {
+		/* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+		do_something();
+	} else {
+		/* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+		do_something_else();
+	}
+
+It is of course legal for the prior load to be part of the conditional,
+for example, as follows:
+
+	if (ACCESS_ONCE(a) > 0) {
+		ACCESS_ONCE(b) = q / 2;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = q / 3;
+		do_something_else();
+	}
+
+This will again ensure that the load from variable 'a' is ordered before the
+stores to variable 'b'.
+
+In addition, you need to be careful what you do with the local variable 'q',
+otherwise the compiler might be able to guess the value and again remove
+the needed conditional. For example:
+
+	q = ACCESS_ONCE(a);
+	if (q % MAX) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+If MAX is defined to be 1, then the compiler knows that (q % MAX) is
+equal to zero, in which case the compiler is within its rights to
+transform the above code into the following:
+
+	q = ACCESS_ONCE(a);
+	ACCESS_ONCE(b) = p;
+	do_something_else();
+
+This transformation loses the ordering between the load from variable 'a'
+and the store to variable 'b'. If you are relying on this ordering, you
+should do something like the following:
+
+	q = ACCESS_ONCE(a);
+	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
+	if (q % MAX) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+Finally, control dependencies do -not- provide transitivity. This is
+demonstrated by two related examples:
+
+	CPU 0                     CPU 1
+	=====================     =====================
+	r1 = ACCESS_ONCE(x);      r2 = ACCESS_ONCE(y);
+	if (r1 >= 0)              if (r2 >= 0)
+	  ACCESS_ONCE(y) = 1;       ACCESS_ONCE(x) = 1;
+
+	assert(!(r1 == 1 && r2 == 1));
+
+The above two-CPU example will never trigger the assert(). However,
+if control dependencies guaranteed transitivity (which they do not),
+then adding the following two CPUs would guarantee a related assertion:
+
+	CPU 2                     CPU 3
+	=====================     =====================
+	ACCESS_ONCE(x) = 2;       ACCESS_ONCE(y) = 2;
+
+	assert(!(r1 == 2 && r2 == 2 && x == 1 && y == 1)); /* FAILS!!! */
+
+But because control dependencies do -not- provide transitivity, the
+above assertion can fail after the combined four-CPU example completes.
+If you need the four-CPU example to provide ordering, you will need
+smp_mb() between the loads and stores in the CPU 0 and CPU 1 code fragments.
+
+In summary:
+
+  (*) Control dependencies can order prior loads against later stores.
+      However, they do -not- guarantee any other sort of ordering:
+      Not prior loads against later loads, nor prior stores against
+      later anything. If you need these other forms of ordering,
+      use smp_rmb(), smp_wmb(), or, in the case of prior stores and
+      later loads, smp_mb().
+
+  (*) Control dependencies require at least one run-time conditional
+      between the prior load and the subsequent store. If the compiler
+      is able to optimize the conditional away, it will have also
+      optimized away the ordering. Careful use of ACCESS_ONCE() can
+      help to preserve the needed conditional.
+
+  (*) Control dependencies require that the compiler avoid reordering the
+      dependency into nonexistence. Careful use of ACCESS_ONCE() or
+      barrier() can help to preserve your control dependency.
+
+  (*) Control dependencies do -not- provide transitivity. If you
+      need transitivity, use smp_mb().
 
 
 SMP BARRIER PAIRING
@@ -1083,7 +1247,10 @@ compiler from moving the memory accesses either side of it to the other side:
 
 	barrier();
 
-This is a general barrier - lesser varieties of compiler barrier do not exist.
+This is a general barrier -- there are no read-read or write-write variants
+of barrier(). However, ACCESS_ONCE() can be thought of as a weak form
+of barrier() that affects only the specific accesses flagged by the
+ACCESS_ONCE().
 
 The compiler barrier has no direct effect on the CPU, which may then reorder
 things however it wishes.
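
The relationship between barrier() and ACCESS_ONCE() described in the last hunk can be sketched in user space. The barrier() definition below is an empty asm statement with a "memory" clobber, and ACCESS_ONCE() is a volatile cast; both assume GCC or Clang. The variables data and flag and the two helper names are hypothetical. The sketch shows only what each primitive constrains at compile time; neither one emits a CPU barrier instruction.

```c
#include <assert.h>

/* Compiler-only barrier: empty asm with a "memory" clobber (GCC/Clang). */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Volatile-cast form of ACCESS_ONCE(); __typeof__ assumes GCC or Clang. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical shared variables. */
static int data;
static int flag;

/*
 * barrier() affects every memory access: the compiler may not move the
 * store to data below it or the store to flag above it.
 */
static void publish_with_barrier(int v)
{
	data = v;
	barrier();
	flag = 1;
}

/*
 * ACCESS_ONCE() affects only the flagged accesses: the compiler keeps the
 * two marked stores in program order relative to each other, but it may
 * still move unmarked accesses around them.
 */
static void publish_with_access_once(int v)
{
	ACCESS_ONCE(data) = v;
	ACCESS_ONCE(flag) = 1;
}
```

Calling publish_with_barrier(3) or publish_with_access_once(4) leaves data holding the argument and flag set to 1. On a weakly ordered CPU, another CPU could still observe the two stores out of order unless an SMP barrier such as smp_wmb() is added.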