author		Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2015-06-18 17:33:24 -0400
committer	Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2015-07-15 17:43:13 -0400
commit		9af194cefc3c40e75a59df4cbb06e1c1064bee7f
tree		b9a2d049506997ad053262df177b00143bec611d /Documentation/memory-barriers.txt
parent		57aecae950c55ef50934640794160cd118e73256
documentation: Replace ACCESS_ONCE() by READ_ONCE() and WRITE_ONCE()
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Diffstat (limited to 'Documentation/memory-barriers.txt')
 -rw-r--r--	Documentation/memory-barriers.txt | 346
 1 file changed, 177 insertions(+), 169 deletions(-)
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 3d06f98b2ff2..470c07c868e4 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -194,22 +194,22 @@ There are some minimal guarantees that may be expected of a CPU:
  (*) On any given CPU, dependent memory accesses will be issued in order, with
      respect to itself.  This means that for:
 
-        ACCESS_ONCE(Q) = P; smp_read_barrier_depends(); D = ACCESS_ONCE(*Q);
+        WRITE_ONCE(Q, P); smp_read_barrier_depends(); D = READ_ONCE(*Q);
 
      the CPU will issue the following memory operations:
 
         Q = LOAD P, D = LOAD *Q
 
      and always in that order.  On most systems, smp_read_barrier_depends()
-     does nothing, but it is required for DEC Alpha.  The ACCESS_ONCE()
-     is required to prevent compiler mischief.  Please note that you
-     should normally use something like rcu_dereference() instead of
-     open-coding smp_read_barrier_depends().
+     does nothing, but it is required for DEC Alpha.  The READ_ONCE()
+     and WRITE_ONCE() are required to prevent compiler mischief.  Please
+     note that you should normally use something like rcu_dereference()
+     instead of open-coding smp_read_barrier_depends().
 
  (*) Overlapping loads and stores within a particular CPU will appear to be
      ordered within that CPU.  This means that for:
 
-        a = ACCESS_ONCE(*X); ACCESS_ONCE(*X) = b;
+        a = READ_ONCE(*X); WRITE_ONCE(*X, b);
 
      the CPU will only issue the following sequence of memory operations:
 
@@ -217,7 +217,7 @@ There are some minimal guarantees that may be expected of a CPU:
 
     And for:
 
-        ACCESS_ONCE(*X) = c; d = ACCESS_ONCE(*X);
+        WRITE_ONCE(*X, c); d = READ_ONCE(*X);
 
     the CPU will only issue:
 
@@ -228,11 +228,11 @@ There are some minimal guarantees that may be expected of a CPU:
 
 And there are a number of things that _must_ or _must_not_ be assumed:
 
- (*) It _must_not_ be assumed that the compiler will do what you want with
-     memory references that are not protected by ACCESS_ONCE().  Without
-     ACCESS_ONCE(), the compiler is within its rights to do all sorts
-     of "creative" transformations, which are covered in the Compiler
-     Barrier section.
+ (*) It _must_not_ be assumed that the compiler will do what you want
+     with memory references that are not protected by READ_ONCE() and
+     WRITE_ONCE().  Without them, the compiler is within its rights to
+     do all sorts of "creative" transformations, which are covered in
+     the Compiler Barrier section.
 
  (*) It _must_not_ be assumed that independent loads and stores will be issued
     in the order given.  This means that for:
@@ -520,8 +520,8 @@ following sequence of events:
         { A == 1, B == 2, C = 3, P == &A, Q == &C }
         B = 4;
         <write barrier>
-        ACCESS_ONCE(P) = &B
-        Q = ACCESS_ONCE(P);
+        WRITE_ONCE(P, &B)
+        Q = READ_ONCE(P);
         D = *Q;
 
 There's a clear data dependency here, and it would seem that by the end of the
@@ -547,8 +547,8 @@ between the address load and the data load:
         { A == 1, B == 2, C = 3, P == &A, Q == &C }
         B = 4;
         <write barrier>
-        ACCESS_ONCE(P) = &B
-        Q = ACCESS_ONCE(P);
+        WRITE_ONCE(P, &B);
+        Q = READ_ONCE(P);
         <data dependency barrier>
         D = *Q;
 
@@ -574,8 +574,8 @@ access:
         { M[0] == 1, M[1] == 2, M[3] = 3, P == 0, Q == 3 }
         M[1] = 4;
         <write barrier>
-        ACCESS_ONCE(P) = 1
-        Q = ACCESS_ONCE(P);
+        WRITE_ONCE(P, 1);
+        Q = READ_ONCE(P);
         <data dependency barrier>
         D = M[Q];
 
@@ -596,10 +596,10 @@ A load-load control dependency requires a full read memory barrier, not
 simply a data dependency barrier to make it work correctly.  Consider the
 following bit of code:
 
-        q = ACCESS_ONCE(a);
+        q = READ_ONCE(a);
         if (q) {
                 <data dependency barrier>  /* BUG: No data dependency!!! */
-                p = ACCESS_ONCE(b);
+                p = READ_ONCE(b);
         }
 
 This will not have the desired effect because there is no actual data
@@ -608,10 +608,10 @@ by attempting to predict the outcome in advance, so that other CPUs see
 the load from b as having happened before the load from a.  In such a
 case what's actually required is:
 
-        q = ACCESS_ONCE(a);
+        q = READ_ONCE(a);
         if (q) {
                 <read barrier>
-                p = ACCESS_ONCE(b);
+                p = READ_ONCE(b);
         }
 
 However, stores are not speculated.  This means that ordering -is- provided
@@ -619,7 +619,7 @@ for load-store control dependencies, as in the following example:
 
         q = READ_ONCE_CTRL(a);
         if (q) {
-                ACCESS_ONCE(b) = p;
+                WRITE_ONCE(b, p);
         }
 
 Control dependencies pair normally with other types of barriers.  That
@@ -647,11 +647,11 @@ branches of the "if" statement as follows:
         q = READ_ONCE_CTRL(a);
         if (q) {
                 barrier();
-                ACCESS_ONCE(b) = p;
+                WRITE_ONCE(b, p);
                 do_something();
         } else {
                 barrier();
-                ACCESS_ONCE(b) = p;
+                WRITE_ONCE(b, p);
                 do_something_else();
         }
 
@@ -660,12 +660,12 @@ optimization levels:
 
         q = READ_ONCE_CTRL(a);
         barrier();
-        ACCESS_ONCE(b) = p;  /* BUG: No ordering vs. load from a!!! */
+        WRITE_ONCE(b, p);  /* BUG: No ordering vs. load from a!!! */
         if (q) {
-                /* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+                /* WRITE_ONCE(b, p); -- moved up, BUG!!! */
                 do_something();
         } else {
-                /* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+                /* WRITE_ONCE(b, p); -- moved up, BUG!!! */
                 do_something_else();
         }
 
@@ -676,7 +676,7 @@ assembly code even after all compiler optimizations have been applied.
 Therefore, if you need ordering in this example, you need explicit
 memory barriers, for example, smp_store_release():
 
-        q = ACCESS_ONCE(a);
+        q = READ_ONCE(a);
         if (q) {
                 smp_store_release(&b, p);
                 do_something();
@@ -690,10 +690,10 @@ ordering is guaranteed only when the stores differ, for example:
 
         q = READ_ONCE_CTRL(a);
         if (q) {
-                ACCESS_ONCE(b) = p;
+                WRITE_ONCE(b, p);
                 do_something();
         } else {
-                ACCESS_ONCE(b) = r;
+                WRITE_ONCE(b, r);
                 do_something_else();
         }
 
@@ -706,10 +706,10 @@ the needed conditional.  For example:
 
         q = READ_ONCE_CTRL(a);
         if (q % MAX) {
-                ACCESS_ONCE(b) = p;
+                WRITE_ONCE(b, p);
                 do_something();
         } else {
-                ACCESS_ONCE(b) = r;
+                WRITE_ONCE(b, r);
                 do_something_else();
         }
 
@@ -718,7 +718,7 @@ equal to zero, in which case the compiler is within its rights to
 transform the above code into the following:
 
         q = READ_ONCE_CTRL(a);
-        ACCESS_ONCE(b) = p;
+        WRITE_ONCE(b, p);
         do_something_else();
 
 Given this transformation, the CPU is not required to respect the ordering
@@ -731,10 +731,10 @@ one, perhaps as follows:
         q = READ_ONCE_CTRL(a);
         BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
         if (q % MAX) {
-                ACCESS_ONCE(b) = p;
+                WRITE_ONCE(b, p);
                 do_something();
         } else {
-                ACCESS_ONCE(b) = r;
+                WRITE_ONCE(b, r);
                 do_something_else();
         }
 
@@ -747,17 +747,17 @@ evaluation.  Consider this example:
 
         q = READ_ONCE_CTRL(a);
         if (q || 1 > 0)
-                ACCESS_ONCE(b) = 1;
+                WRITE_ONCE(b, 1);
 
 Because the first condition cannot fault and the second condition is
 always true, the compiler can transform this example as following,
 defeating control dependency:
 
         q = READ_ONCE_CTRL(a);
-        ACCESS_ONCE(b) = 1;
+        WRITE_ONCE(b, 1);
 
 This example underscores the need to ensure that the compiler cannot
-out-guess your code.  More generally, although ACCESS_ONCE() does force
+out-guess your code.  More generally, although READ_ONCE() does force
 the compiler to actually emit code for a given load, it does not force
 the compiler to use the results.
 
@@ -769,7 +769,7 @@ x and y both being zero:
         =======================    =======================
         r1 = READ_ONCE_CTRL(x);    r2 = READ_ONCE_CTRL(y);
         if (r1 > 0)                if (r2 > 0)
-          ACCESS_ONCE(y) = 1;        ACCESS_ONCE(x) = 1;
+          WRITE_ONCE(y, 1);          WRITE_ONCE(x, 1);
 
         assert(!(r1 == 1 && r2 == 1));
 
@@ -779,7 +779,7 @@ then adding the following CPU would guarantee a related assertion:
 
         CPU 2
         =====================
-        ACCESS_ONCE(x) = 2;
+        WRITE_ONCE(x, 2);
 
         assert(!(r1 == 2 && r2 == 1 && x == 2)); /* FAILS!!! */
 
@@ -798,8 +798,7 @@ In summary:
 
  (*) Control dependencies must be headed by READ_ONCE_CTRL().
      Or, as a much less preferable alternative, interpose
-     be headed by READ_ONCE() or an ACCESS_ONCE() read and must
-     have smp_read_barrier_depends() between this read and the
+     smp_read_barrier_depends() between a READ_ONCE() and the
      control-dependent write.
 
  (*) Control dependencies can order prior loads against later stores.
@@ -815,15 +814,16 @@ In summary:
 
  (*) Control dependencies require at least one run-time conditional
      between the prior load and the subsequent store, and this
-     conditional must involve the prior load.  If the compiler
-     is able to optimize the conditional away, it will have also
-     optimized away the ordering.  Careful use of ACCESS_ONCE() can
-     help to preserve the needed conditional.
+     conditional must involve the prior load.  If the compiler is able
+     to optimize the conditional away, it will have also optimized
+     away the ordering.  Careful use of READ_ONCE_CTRL(), READ_ONCE(),
+     and WRITE_ONCE() can help to preserve the needed conditional.
 
  (*) Control dependencies require that the compiler avoid reordering the
-     dependency into nonexistence.  Careful use of ACCESS_ONCE() or
-     barrier() can help to preserve your control dependency.  Please
-     see the Compiler Barrier section for more information.
+     dependency into nonexistence.  Careful use of READ_ONCE_CTRL()
+     or smp_read_barrier_depends() can help to preserve your control
+     dependency.  Please see the Compiler Barrier section for more
+     information.
 
  (*) Control dependencies pair normally with other types of barriers.
 
@@ -848,11 +848,11 @@ barrier, an acquire barrier, a release barrier, or a general barrier:
 
         CPU 1                 CPU 2
         ===============       ===============
-        ACCESS_ONCE(a) = 1;
+        WRITE_ONCE(a, 1);
         <write barrier>
-        ACCESS_ONCE(b) = 2;   x = ACCESS_ONCE(b);
+        WRITE_ONCE(b, 2);     x = READ_ONCE(b);
                               <read barrier>
-                              y = ACCESS_ONCE(a);
+                              y = READ_ONCE(a);
 
 Or:
 
@@ -860,7 +860,7 @@ Or:
         ===============       ===============================
         a = 1;
         <write barrier>
-        ACCESS_ONCE(b) = &a;  x = ACCESS_ONCE(b);
+        WRITE_ONCE(b, &a);    x = READ_ONCE(b);
                               <data dependency barrier>
                               y = *x;
 
@@ -868,11 +868,11 @@ Or even:
 
         CPU 1                 CPU 2
         ===============       ===============================
-        r1 = ACCESS_ONCE(y);
+        r1 = READ_ONCE(y);
         <general barrier>
-        ACCESS_ONCE(y) = 1;   if (r2 = ACCESS_ONCE(x)) {
+        WRITE_ONCE(y, 1);     if (r2 = READ_ONCE(x)) {
                               <implicit control dependency>
-                              ACCESS_ONCE(y) = 1;
+                              WRITE_ONCE(y, 1);
                               }
 
         assert(r1 == 0 || r2 == 0);
@@ -886,11 +886,11 @@ versa:
 
         CPU 1                         CPU 2
         ===================           ===================
-        ACCESS_ONCE(a) = 1;  }----   --->{ v = ACCESS_ONCE(c);
-        ACCESS_ONCE(b) = 2;  }    \ /    { w = ACCESS_ONCE(d);
+        WRITE_ONCE(a, 1);    }----   --->{ v = READ_ONCE(c);
+        WRITE_ONCE(b, 2);    }    \ /    { w = READ_ONCE(d);
         <write barrier>            \       <read barrier>
-        ACCESS_ONCE(c) = 3;  }    / \    { x = ACCESS_ONCE(a);
-        ACCESS_ONCE(d) = 4;  }----   --->{ y = ACCESS_ONCE(b);
+        WRITE_ONCE(c, 3);    }    / \    { x = READ_ONCE(a);
+        WRITE_ONCE(d, 4);    }----   --->{ y = READ_ONCE(b);
 
 
 EXAMPLES OF MEMORY BARRIER SEQUENCES
@@ -1340,10 +1340,10 @@ compiler from moving the memory accesses either side of it to the other side:
 
         barrier();
 
-This is a general barrier -- there are no read-read or write-write variants
-of barrier().  However, ACCESS_ONCE() can be thought of as a weak form
-for barrier() that affects only the specific accesses flagged by the
-ACCESS_ONCE().
+This is a general barrier -- there are no read-read or write-write
+variants of barrier().  However, READ_ONCE() and WRITE_ONCE() can be
+thought of as weak forms of barrier() that affect only the specific
+accesses flagged by the READ_ONCE() or WRITE_ONCE().
 
 The barrier() function has the following effects:
 
@@ -1355,9 +1355,10 @@ The barrier() function has the following effects:
  (*) Within a loop, forces the compiler to load the variables used
      in that loop's conditional on each pass through that loop.
 
-The ACCESS_ONCE() function can prevent any number of optimizations that,
-while perfectly safe in single-threaded code, can be fatal in concurrent
-code.  Here are some examples of these sorts of optimizations:
+The READ_ONCE() and WRITE_ONCE() functions can prevent any number of
+optimizations that, while perfectly safe in single-threaded code, can
+be fatal in concurrent code.  Here are some examples of these sorts
+of optimizations:
 
  (*) The compiler is within its rights to reorder loads and stores
     to the same variable, and in some cases, the CPU is within its
@@ -1370,11 +1371,11 @@ code.  Here are some examples of these sorts of optimizations:
      Might result in an older value of x stored in a[1] than in a[0].
      Prevent both the compiler and the CPU from doing this as follows:
 
-        a[0] = ACCESS_ONCE(x);
-        a[1] = ACCESS_ONCE(x);
+        a[0] = READ_ONCE(x);
+        a[1] = READ_ONCE(x);
 
-     In short, ACCESS_ONCE() provides cache coherence for accesses from
-     multiple CPUs to a single variable.
+     In short, READ_ONCE() and WRITE_ONCE() provide cache coherence for
+     accesses from multiple CPUs to a single variable.
 
  (*) The compiler is within its rights to merge successive loads from
      the same variable.  Such merging can cause the compiler to "optimize"
@@ -1391,9 +1392,9 @@ code.  Here are some examples of these sorts of optimizations:
         for (;;)
                 do_something_with(tmp);
 
-     Use ACCESS_ONCE() to prevent the compiler from doing this to you:
+     Use READ_ONCE() to prevent the compiler from doing this to you:
 
-        while (tmp = ACCESS_ONCE(a))
+        while (tmp = READ_ONCE(a))
                 do_something_with(tmp);
 
  (*) The compiler is within its rights to reload a variable, for example,
@@ -1415,9 +1416,9 @@ code.  Here are some examples of these sorts of optimizations:
      a was modified by some other CPU between the "while" statement and
      the call to do_something_with().
 
-     Again, use ACCESS_ONCE() to prevent the compiler from doing this:
+     Again, use READ_ONCE() to prevent the compiler from doing this:
 
-        while (tmp = ACCESS_ONCE(a))
+        while (tmp = READ_ONCE(a))
                 do_something_with(tmp);
 
      Note that if the compiler runs short of registers, it might save
@@ -1437,21 +1438,21 @@ code.  Here are some examples of these sorts of optimizations:
 
         do { } while (0);
 
-     This transformation is a win for single-threaded code because it gets
-     rid of a load and a branch.  The problem is that the compiler will
-     carry out its proof assuming that the current CPU is the only one
-     updating variable 'a'.  If variable 'a' is shared, then the compiler's
-     proof will be erroneous.  Use ACCESS_ONCE() to tell the compiler
-     that it doesn't know as much as it thinks it does:
+     This transformation is a win for single-threaded code because it
+     gets rid of a load and a branch.  The problem is that the compiler
+     will carry out its proof assuming that the current CPU is the only
+     one updating variable 'a'.  If variable 'a' is shared, then the
+     compiler's proof will be erroneous.  Use READ_ONCE() to tell the
+     compiler that it doesn't know as much as it thinks it does:
 
-        while (tmp = ACCESS_ONCE(a))
+        while (tmp = READ_ONCE(a))
                 do_something_with(tmp);
 
      But please note that the compiler is also closely watching what you
-     do with the value after the ACCESS_ONCE().  For example, suppose you
+     do with the value after the READ_ONCE().  For example, suppose you
      do the following and MAX is a preprocessor macro with the value 1:
 
-        while ((tmp = ACCESS_ONCE(a)) % MAX)
+        while ((tmp = READ_ONCE(a)) % MAX)
                 do_something_with(tmp);
 
      Then the compiler knows that the result of the "%" operator applied
@@ -1475,12 +1476,12 @@ code.  Here are some examples of these sorts of optimizations:
      surprise if some other CPU might have stored to variable 'a' in the
      meantime.
 
-     Use ACCESS_ONCE() to prevent the compiler from making this sort of
+     Use WRITE_ONCE() to prevent the compiler from making this sort of
      wrong guess:
 
-        ACCESS_ONCE(a) = 0;
+        WRITE_ONCE(a, 0);
         /* Code that does not store to variable a. */
-        ACCESS_ONCE(a) = 0;
+        WRITE_ONCE(a, 0);
 
  (*) The compiler is within its rights to reorder memory accesses unless
      you tell it not to.  For example, consider the following interaction
@@ -1509,40 +1510,43 @@ code.  Here are some examples of these sorts of optimizations:
         }
 
      If the interrupt occurs between these two statement, then
-     interrupt_handler() might be passed a garbled msg.  Use ACCESS_ONCE()
+     interrupt_handler() might be passed a garbled msg.  Use WRITE_ONCE()
      to prevent this as follows:
 
         void process_level(void)
         {
-                ACCESS_ONCE(msg) = get_message();
-                ACCESS_ONCE(flag) = true;
+                WRITE_ONCE(msg, get_message());
+                WRITE_ONCE(flag, true);
         }
 
         void interrupt_handler(void)
         {
-                if (ACCESS_ONCE(flag))
-                        process_message(ACCESS_ONCE(msg));
+                if (READ_ONCE(flag))
+                        process_message(READ_ONCE(msg));
         }
 
-     Note that the ACCESS_ONCE() wrappers in interrupt_handler()
-     are needed if this interrupt handler can itself be interrupted
-     by something that also accesses 'flag' and 'msg', for example,
-     a nested interrupt or an NMI.  Otherwise, ACCESS_ONCE() is not
-     needed in interrupt_handler() other than for documentation purposes.
-     (Note also that nested interrupts do not typically occur in modern
-     Linux kernels, in fact, if an interrupt handler returns with
-     interrupts enabled, you will get a WARN_ONCE() splat.)
-
-     You should assume that the compiler can move ACCESS_ONCE() past
-     code not containing ACCESS_ONCE(), barrier(), or similar primitives.
-
-     This effect could also be achieved using barrier(), but ACCESS_ONCE()
-     is more selective: With ACCESS_ONCE(), the compiler need only forget
-     the contents of the indicated memory locations, while with barrier()
-     the compiler must discard the value of all memory locations that
-     it has currented cached in any machine registers.  Of course,
-     the compiler must also respect the order in which the ACCESS_ONCE()s
-     occur, though the CPU of course need not do so.
+     Note that the READ_ONCE() and WRITE_ONCE() wrappers in
+     interrupt_handler() are needed if this interrupt handler can itself
+     be interrupted by something that also accesses 'flag' and 'msg',
+     for example, a nested interrupt or an NMI.  Otherwise, READ_ONCE()
+     and WRITE_ONCE() are not needed in interrupt_handler() other than
+     for documentation purposes.  (Note also that nested interrupts
+     do not typically occur in modern Linux kernels, in fact, if an
+     interrupt handler returns with interrupts enabled, you will get a
+     WARN_ONCE() splat.)
+
+     You should assume that the compiler can move READ_ONCE() and
+     WRITE_ONCE() past code not containing READ_ONCE(), WRITE_ONCE(),
+     barrier(), or similar primitives.
+
+     This effect could also be achieved using barrier(), but READ_ONCE()
+     and WRITE_ONCE() are more selective: With READ_ONCE() and
+     WRITE_ONCE(), the compiler need only forget the contents of the
+     indicated memory locations, while with barrier() the compiler must
+     discard the value of all memory locations that it has currently
+     cached in any machine registers.  Of course, the compiler must also
+     respect the order in which the READ_ONCE()s and WRITE_ONCE()s occur,
+     though the CPU of course need not do so.
 
  (*) The compiler is within its rights to invent stores to a variable,
      as in the following example:
@@ -1562,16 +1566,16 @@ code.  Here are some examples of these sorts of optimizations:
      a branch.  Unfortunately, in concurrent code, this optimization
      could cause some other CPU to see a spurious value of 42 -- even
      if variable 'a' was never zero -- when loading variable 'b'.
-     Use ACCESS_ONCE() to prevent this as follows:
+     Use WRITE_ONCE() to prevent this as follows:
 
         if (a)
-                ACCESS_ONCE(b) = a;
+                WRITE_ONCE(b, a);
         else
-                ACCESS_ONCE(b) = 42;
+                WRITE_ONCE(b, 42);
 
      The compiler can also invent loads.  These are usually less
      damaging, but they can result in cache-line bouncing and thus in
-     poor performance and scalability.  Use ACCESS_ONCE() to prevent
+     poor performance and scalability.  Use READ_ONCE() to prevent
      invented loads.
 
  (*) For aligned memory locations whose size allows them to be accessed
@@ -1590,9 +1594,9 @@ code. Here are some examples of these sorts of optimizations: | |||
1590 | This optimization can therefore be a win in single-threaded code. | 1594 | This optimization can therefore be a win in single-threaded code. |
1591 | In fact, a recent bug (since fixed) caused GCC to incorrectly use | 1595 | In fact, a recent bug (since fixed) caused GCC to incorrectly use |
1592 | this optimization in a volatile store. In the absence of such bugs, | 1596 | this optimization in a volatile store. In the absence of such bugs, |
1593 | use of ACCESS_ONCE() prevents store tearing in the following example: | 1597 | use of WRITE_ONCE() prevents store tearing in the following example: |
1594 | 1598 | ||
1595 | ACCESS_ONCE(p) = 0x00010002; | 1599 | WRITE_ONCE(p, 0x00010002); |
1596 | 1600 | ||
1597 | Use of packed structures can also result in load and store tearing, | 1601 | Use of packed structures can also result in load and store tearing, |
1598 | as in this example: | 1602 | as in this example: |
@@ -1609,22 +1613,23 @@ code. Here are some examples of these sorts of optimizations: | |||
1609 | foo2.b = foo1.b; | 1613 | foo2.b = foo1.b; |
1610 | foo2.c = foo1.c; | 1614 | foo2.c = foo1.c; |
1611 | 1615 | ||
1612 | Because there are no ACCESS_ONCE() wrappers and no volatile markings, | 1616 | Because there are no READ_ONCE() or WRITE_ONCE() wrappers and no |
1613 | the compiler would be well within its rights to implement these three | 1617 | volatile markings, the compiler would be well within its rights to |
1614 | assignment statements as a pair of 32-bit loads followed by a pair | 1618 | implement these three assignment statements as a pair of 32-bit |
1615 | of 32-bit stores. This would result in load tearing on 'foo1.b' | 1619 | loads followed by a pair of 32-bit stores. This would result in |
1616 | and store tearing on 'foo2.b'. ACCESS_ONCE() again prevents tearing | 1620 | load tearing on 'foo1.b' and store tearing on 'foo2.b'. READ_ONCE() |
1617 | in this example: | 1621 | and WRITE_ONCE() again prevent tearing in this example: |
1618 | 1622 | ||
1619 | foo2.a = foo1.a; | 1623 | foo2.a = foo1.a; |
1620 | ACCESS_ONCE(foo2.b) = ACCESS_ONCE(foo1.b); | 1624 | WRITE_ONCE(foo2.b, READ_ONCE(foo1.b)); |
1621 | foo2.c = foo1.c; | 1625 | foo2.c = foo1.c; |
1622 | 1626 | ||
1623 | All that aside, it is never necessary to use ACCESS_ONCE() on a variable | 1627 | All that aside, it is never necessary to use READ_ONCE() and |
1624 | that has been marked volatile. For example, because 'jiffies' is marked | 1628 | WRITE_ONCE() on a variable that has been marked volatile. For example, |
1625 | volatile, it is never necessary to say ACCESS_ONCE(jiffies). The reason | 1629 | because 'jiffies' is marked volatile, it is never necessary to |
1626 | for this is that ACCESS_ONCE() is implemented as a volatile cast, which | 1630 | say READ_ONCE(jiffies). The reason for this is that READ_ONCE() and |
1627 | has no effect when its argument is already marked volatile. | 1631 | WRITE_ONCE() are implemented as volatile casts, which has no effect when |
1632 | its argument is already marked volatile. | ||
1628 | 1633 | ||
1629 | Please note that these compiler barriers have no direct effect on the CPU, | 1634 | Please note that these compiler barriers have no direct effect on the CPU, |
1630 | which may then reorder things however it wishes. | 1635 | which may then reorder things however it wishes. |
@@ -1646,14 +1651,15 @@ The Linux kernel has eight basic CPU memory barriers: | |||
1646 | All memory barriers except the data dependency barriers imply a compiler | 1651 | All memory barriers except the data dependency barriers imply a compiler |
1647 | barrier. Data dependencies do not impose any additional compiler ordering. | 1652 | barrier. Data dependencies do not impose any additional compiler ordering. |
1648 | 1653 | ||
1649 | Aside: In the case of data dependencies, the compiler would be expected to | 1654 | Aside: In the case of data dependencies, the compiler would be expected |
1650 | issue the loads in the correct order (eg. `a[b]` would have to load the value | 1655 | to issue the loads in the correct order (eg. `a[b]` would have to load |
1651 | of b before loading a[b]), however there is no guarantee in the C specification | 1656 | the value of b before loading a[b]), however there is no guarantee in |
1652 | that the compiler may not speculate the value of b (eg. is equal to 1) and load | 1657 | the C specification that the compiler may not speculate the value of b |
1653 | a before b (eg. tmp = a[1]; if (b != 1) tmp = a[b]; ). There is also the | 1658 | (eg. is equal to 1) and load a before b (eg. tmp = a[1]; if (b != 1) |
1654 | problem of a compiler reloading b after having loaded a[b], thus having a newer | 1659 | tmp = a[b]; ). There is also the problem of a compiler reloading b after |
1655 | copy of b than a[b]. A consensus has not yet been reached about these problems, | 1660 | having loaded a[b], thus having a newer copy of b than a[b]. A consensus |
1656 | however the ACCESS_ONCE macro is a good place to start looking. | 1661 | has not yet been reached about these problems, however the READ_ONCE() |
1662 | macro is a good place to start looking. | ||
1657 | 1663 | ||
1658 | SMP memory barriers are reduced to compiler barriers on uniprocessor compiled | 1664 | SMP memory barriers are reduced to compiler barriers on uniprocessor compiled |
1659 | systems because it is assumed that a CPU will appear to be self-consistent, | 1665 | systems because it is assumed that a CPU will appear to be self-consistent, |
@@ -2126,12 +2132,12 @@ three CPUs; then should the following sequence of events occur: | |||
2126 | 2132 | ||
2127 | CPU 1 CPU 2 | 2133 | CPU 1 CPU 2 |
2128 | =============================== =============================== | 2134 | =============================== =============================== |
2129 | ACCESS_ONCE(*A) = a; ACCESS_ONCE(*E) = e; | 2135 | WRITE_ONCE(*A, a); WRITE_ONCE(*E, e); |
2130 | ACQUIRE M ACQUIRE Q | 2136 | ACQUIRE M ACQUIRE Q |
2131 | ACCESS_ONCE(*B) = b; ACCESS_ONCE(*F) = f; | 2137 | WRITE_ONCE(*B, b); WRITE_ONCE(*F, f); |
2132 | ACCESS_ONCE(*C) = c; ACCESS_ONCE(*G) = g; | 2138 | WRITE_ONCE(*C, c); WRITE_ONCE(*G, g); |
2133 | RELEASE M RELEASE Q | 2139 | RELEASE M RELEASE Q |
2134 | ACCESS_ONCE(*D) = d; ACCESS_ONCE(*H) = h; | 2140 | WRITE_ONCE(*D, d); WRITE_ONCE(*H, h); |
2135 | 2141 | ||
2136 | Then there is no guarantee as to what order CPU 3 will see the accesses to *A | 2142 | Then there is no guarantee as to what order CPU 3 will see the accesses to *A |
2137 | through *H occur in, other than the constraints imposed by the separate locks | 2143 | through *H occur in, other than the constraints imposed by the separate locks |
@@ -2151,18 +2157,18 @@ However, if the following occurs: | |||
2151 | 2157 | ||
2152 | CPU 1 CPU 2 | 2158 | CPU 1 CPU 2 |
2153 | =============================== =============================== | 2159 | =============================== =============================== |
2154 | ACCESS_ONCE(*A) = a; | 2160 | WRITE_ONCE(*A, a); |
2155 | ACQUIRE M [1] | 2161 | ACQUIRE M [1] |
2156 | ACCESS_ONCE(*B) = b; | 2162 | WRITE_ONCE(*B, b); |
2157 | ACCESS_ONCE(*C) = c; | 2163 | WRITE_ONCE(*C, c); |
2158 | RELEASE M [1] | 2164 | RELEASE M [1] |
2159 | ACCESS_ONCE(*D) = d; ACCESS_ONCE(*E) = e; | 2165 | WRITE_ONCE(*D, d); WRITE_ONCE(*E, e); |
2160 | ACQUIRE M [2] | 2166 | ACQUIRE M [2] |
2161 | smp_mb__after_unlock_lock(); | 2167 | smp_mb__after_unlock_lock(); |
2162 | ACCESS_ONCE(*F) = f; | 2168 | WRITE_ONCE(*F, f); |
2163 | ACCESS_ONCE(*G) = g; | 2169 | WRITE_ONCE(*G, g); |
2164 | RELEASE M [2] | 2170 | RELEASE M [2] |
2165 | ACCESS_ONCE(*H) = h; | 2171 | WRITE_ONCE(*H, h); |
2166 | 2172 | ||
2167 | CPU 3 might see: | 2173 | CPU 3 might see: |
2168 | 2174 | ||
@@ -2881,11 +2887,11 @@ A programmer might take it for granted that the CPU will perform memory | |||
2881 | operations in exactly the order specified, so that if the CPU is, for example, | 2887 | operations in exactly the order specified, so that if the CPU is, for example, |
2882 | given the following piece of code to execute: | 2888 | given the following piece of code to execute: |
2883 | 2889 | ||
2884 | a = ACCESS_ONCE(*A); | 2890 | a = READ_ONCE(*A); |
2885 | ACCESS_ONCE(*B) = b; | 2891 | WRITE_ONCE(*B, b); |
2886 | c = ACCESS_ONCE(*C); | 2892 | c = READ_ONCE(*C); |
2887 | d = ACCESS_ONCE(*D); | 2893 | d = READ_ONCE(*D); |
2888 | ACCESS_ONCE(*E) = e; | 2894 | WRITE_ONCE(*E, e); |
2889 | 2895 | ||
2890 | they would then expect that the CPU will complete the memory operation for each | 2896 | they would then expect that the CPU will complete the memory operation for each |
2891 | instruction before moving on to the next one, leading to a definite sequence of | 2897 | instruction before moving on to the next one, leading to a definite sequence of |
@@ -2932,12 +2938,12 @@ However, it is guaranteed that a CPU will be self-consistent: it will see its | |||
2932 | _own_ accesses appear to be correctly ordered, without the need for a memory | 2938 | _own_ accesses appear to be correctly ordered, without the need for a memory |
2933 | barrier. For instance with the following code: | 2939 | barrier. For instance with the following code: |
2934 | 2940 | ||
2935 | U = ACCESS_ONCE(*A); | 2941 | U = READ_ONCE(*A); |
2936 | ACCESS_ONCE(*A) = V; | 2942 | WRITE_ONCE(*A, V); |
2937 | ACCESS_ONCE(*A) = W; | 2943 | WRITE_ONCE(*A, W); |
2938 | X = ACCESS_ONCE(*A); | 2944 | X = READ_ONCE(*A); |
2939 | ACCESS_ONCE(*A) = Y; | 2945 | WRITE_ONCE(*A, Y); |
2940 | Z = ACCESS_ONCE(*A); | 2946 | Z = READ_ONCE(*A); |
2941 | 2947 | ||
2942 | and assuming no intervention by an external influence, it can be assumed that | 2948 | and assuming no intervention by an external influence, it can be assumed that |
2943 | the final result will appear to be: | 2949 | the final result will appear to be: |
@@ -2953,13 +2959,14 @@ accesses: | |||
2953 | U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y, Z=LOAD *A | 2959 | U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y, Z=LOAD *A |
2954 | 2960 | ||
2955 | in that order, but, without intervention, the sequence may have almost any | 2961 | in that order, but, without intervention, the sequence may have almost any |
2956 | combination of elements combined or discarded, provided the program's view of | 2962 | combination of elements combined or discarded, provided the program's view |
2957 | the world remains consistent. Note that ACCESS_ONCE() is -not- optional | 2963 | of the world remains consistent. Note that READ_ONCE() and WRITE_ONCE() |
2958 | in the above example, as there are architectures where a given CPU might | 2964 | are -not- optional in the above example, as there are architectures |
2959 | reorder successive loads to the same location. On such architectures, | 2965 | where a given CPU might reorder successive loads to the same location. |
2960 | ACCESS_ONCE() does whatever is necessary to prevent this, for example, on | 2966 | On such architectures, READ_ONCE() and WRITE_ONCE() do whatever is |
2961 | Itanium the volatile casts used by ACCESS_ONCE() cause GCC to emit the | 2967 | necessary to prevent this, for example, on Itanium the volatile casts |
2962 | special ld.acq and st.rel instructions that prevent such reordering. | 2968 | used by READ_ONCE() and WRITE_ONCE() cause GCC to emit the special ld.acq |
2969 | and st.rel instructions (respectively) that prevent such reordering. | ||
2963 | 2970 | ||
2964 | The compiler may also combine, discard or defer elements of the sequence before | 2971 | The compiler may also combine, discard or defer elements of the sequence before |
2965 | the CPU even sees them. | 2972 | the CPU even sees them. |
@@ -2973,13 +2980,14 @@ may be reduced to: | |||
2973 | 2980 | ||
2974 | *A = W; | 2981 | *A = W; |
2975 | 2982 | ||
2976 | since, without either a write barrier or an ACCESS_ONCE(), it can be | 2983 | since, without either a write barrier or a WRITE_ONCE(), it can be |
2977 | assumed that the effect of the storage of V to *A is lost. Similarly: | 2984 | assumed that the effect of the storage of V to *A is lost. Similarly: |
2978 | 2985 | ||
2979 | *A = Y; | 2986 | *A = Y; |
2980 | Z = *A; | 2987 | Z = *A; |
2981 | 2988 | ||
2982 | may, without a memory barrier or an ACCESS_ONCE(), be reduced to: | 2989 | may, without a memory barrier or a READ_ONCE() and WRITE_ONCE(), be |
2990 | reduced to: | ||
2983 | 2991 | ||
2984 | *A = Y; | 2992 | *A = Y; |
2985 | Z = Y; | 2993 | Z = Y; |