summaryrefslogtreecommitdiffstats
path: root/Documentation/memory-barriers.txt
diff options
context:
space:
mode:
authorPaul E. McKenney <paulmck@linux.vnet.ibm.com>2014-08-13 18:40:02 -0400
committerPaul E. McKenney <paulmck@linux.vnet.ibm.com>2014-09-07 19:15:53 -0400
commit2456d2a617de0a37a0f8d1e44f4b270172c4f17a (patch)
tree92eef2cfcedc013b96009f099b52f99301bcb499 /Documentation/memory-barriers.txt
parentefdcd51a4d5bd355796b1a757ff0355bb09ed394 (diff)
memory-barriers: Fix description of 2-legged-if-based control dependencies
Sad to say, current compilers really will hoist identical stores from both branches of an "if" statement to precede the conditional. This commit therefore updates the description of control dependencies to reflect this ugly reality. Reported-by: Pranith Kumar <bobby.prani@gmail.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Diffstat (limited to 'Documentation/memory-barriers.txt')
-rw-r--r--Documentation/memory-barriers.txt103
1 files changed, 49 insertions, 54 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 600b45c6e2ad..22a969cdd476 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -574,30 +574,14 @@ However, stores are not speculated. This means that ordering -is- provided
574in the following example: 574in the following example:
575 575
576 q = ACCESS_ONCE(a); 576 q = ACCESS_ONCE(a);
577 if (ACCESS_ONCE(q)) {
578 ACCESS_ONCE(b) = p;
579 }
580
581Please note that ACCESS_ONCE() is not optional! Without the ACCESS_ONCE(),
582the compiler is within its rights to transform this example:
583
584 q = a;
585 if (q) { 577 if (q) {
586 b = p; /* BUG: Compiler can reorder!!! */ 578 ACCESS_ONCE(b) = p;
587 do_something();
588 } else {
589 b = p; /* BUG: Compiler can reorder!!! */
590 do_something_else();
591 } 579 }
592 580
593into this, which of course defeats the ordering: 581Please note that ACCESS_ONCE() is not optional! Without the
594 582ACCESS_ONCE(), might combine the load from 'a' with other loads from
595 b = p; 583'a', and the store to 'b' with other stores to 'b', with possible highly
596 q = a; 584counterintuitive effects on ordering.
597 if (q)
598 do_something();
599 else
600 do_something_else();
601 585
602Worse yet, if the compiler is able to prove (say) that the value of 586Worse yet, if the compiler is able to prove (say) that the value of
603variable 'a' is always non-zero, it would be well within its rights 587variable 'a' is always non-zero, it would be well within its rights
@@ -605,11 +589,12 @@ to optimize the original example by eliminating the "if" statement
605as follows: 589as follows:
606 590
607 q = a; 591 q = a;
608 b = p; /* BUG: Compiler can reorder!!! */ 592 b = p; /* BUG: Compiler and CPU can both reorder!!! */
609 do_something();
610 593
611The solution is again ACCESS_ONCE() and barrier(), which preserves the 594So don't leave out the ACCESS_ONCE().
612ordering between the load from variable 'a' and the store to variable 'b': 595
596It is tempting to try to enforce ordering on identical stores on both
597branches of the "if" statement as follows:
613 598
614 q = ACCESS_ONCE(a); 599 q = ACCESS_ONCE(a);
615 if (q) { 600 if (q) {
@@ -622,18 +607,11 @@ ordering between the load from variable 'a' and the store to variable 'b':
622 do_something_else(); 607 do_something_else();
623 } 608 }
624 609
625The initial ACCESS_ONCE() is required to prevent the compiler from 610Unfortunately, current compilers will transform this as follows at high
626proving the value of 'a', and the pair of barrier() invocations are 611optimization levels:
627required to prevent the compiler from pulling the two identical stores
628to 'b' out from the legs of the "if" statement.
629
630It is important to note that control dependencies absolutely require a
631a conditional. For example, the following "optimized" version of
632the above example breaks ordering, which is why the barrier() invocations
633are absolutely required if you have identical stores in both legs of
634the "if" statement:
635 612
636 q = ACCESS_ONCE(a); 613 q = ACCESS_ONCE(a);
614 barrier();
637 ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */ 615 ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */
638 if (q) { 616 if (q) {
639 /* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */ 617 /* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
@@ -643,21 +621,36 @@ the "if" statement:
643 do_something_else(); 621 do_something_else();
644 } 622 }
645 623
646It is of course legal for the prior load to be part of the conditional, 624Now there is no conditional between the load from 'a' and the store to
647for example, as follows: 625'b', which means that the CPU is within its rights to reorder them:
626The conditional is absolutely required, and must be present in the
627assembly code even after all compiler optimizations have been applied.
628Therefore, if you need ordering in this example, you need explicit
629memory barriers, for example, smp_store_release():
648 630
649 if (ACCESS_ONCE(a) > 0) { 631 q = ACCESS_ONCE(a);
650 barrier(); 632 if (q) {
651 ACCESS_ONCE(b) = q / 2; 633 smp_store_release(&b, p);
652 do_something(); 634 do_something();
653 } else { 635 } else {
654 barrier(); 636 smp_store_release(&b, p);
655 ACCESS_ONCE(b) = q / 3; 637 do_something_else();
638 }
639
640In contrast, without explicit memory barriers, two-legged-if control
641ordering is guaranteed only when the stores differ, for example:
642
643 q = ACCESS_ONCE(a);
644 if (q) {
645 ACCESS_ONCE(b) = p;
646 do_something();
647 } else {
648 ACCESS_ONCE(b) = r;
656 do_something_else(); 649 do_something_else();
657 } 650 }
658 651
659This will again ensure that the load from variable 'a' is ordered before the 652The initial ACCESS_ONCE() is still required to prevent the compiler from
660stores to variable 'b'. 653proving the value of 'a'.
661 654
662In addition, you need to be careful what you do with the local variable 'q', 655In addition, you need to be careful what you do with the local variable 'q',
663otherwise the compiler might be able to guess the value and again remove 656otherwise the compiler might be able to guess the value and again remove
@@ -665,12 +658,10 @@ the needed conditional. For example:
665 658
666 q = ACCESS_ONCE(a); 659 q = ACCESS_ONCE(a);
667 if (q % MAX) { 660 if (q % MAX) {
668 barrier();
669 ACCESS_ONCE(b) = p; 661 ACCESS_ONCE(b) = p;
670 do_something(); 662 do_something();
671 } else { 663 } else {
672 barrier(); 664 ACCESS_ONCE(b) = r;
673 ACCESS_ONCE(b) = p;
674 do_something_else(); 665 do_something_else();
675 } 666 }
676 667
@@ -679,15 +670,15 @@ equal to zero, in which case the compiler is within its rights to
679transform the above code into the following: 670transform the above code into the following:
680 671
681 q = ACCESS_ONCE(a); 672 q = ACCESS_ONCE(a);
682 barrier();
683 ACCESS_ONCE(b) = p; 673 ACCESS_ONCE(b) = p;
684 do_something_else(); 674 do_something_else();
685 675
686This transformation fails to require that the CPU respect the ordering 676Given this transformation, the CPU is not required to respect the ordering
687between the load from variable 'a' and the store to variable 'b'. 677between the load from variable 'a' and the store to variable 'b'. It is
688Yes, the barrier() is still there, but it affects only the compiler, 678tempting to add a barrier(), but this does not help. The conditional
689not the CPU. Therefore, if you are relying on this ordering, you should 679is gone, and the barrier won't bring it back. Therefore, if you are
690do something like the following: 680relying on this ordering, you should make sure that MAX is greater than
681one, perhaps as follows:
691 682
692 q = ACCESS_ONCE(a); 683 q = ACCESS_ONCE(a);
693 BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */ 684 BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
@@ -695,10 +686,14 @@ do something like the following:
695 ACCESS_ONCE(b) = p; 686 ACCESS_ONCE(b) = p;
696 do_something(); 687 do_something();
697 } else { 688 } else {
698 ACCESS_ONCE(b) = p; 689 ACCESS_ONCE(b) = r;
699 do_something_else(); 690 do_something_else();
700 } 691 }
701 692
693Please note once again that the stores to 'b' differ. If they were
694identical, as noted earlier, the compiler could pull this store outside
695of the 'if' statement.
696
702Finally, control dependencies do -not- provide transitivity. This is 697Finally, control dependencies do -not- provide transitivity. This is
703demonstrated by two related examples, with the initial values of 698demonstrated by two related examples, with the initial values of
704x and y both being zero: 699x and y both being zero: