diff options
| -rw-r--r-- | Documentation/memory-barriers.txt | 348 |
1 files changed, 270 insertions, 78 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index c61d8b876fdb..4710845dbac4 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt | |||
| @@ -19,6 +19,7 @@ Contents: | |||
| 19 | - Control dependencies. | 19 | - Control dependencies. |
| 20 | - SMP barrier pairing. | 20 | - SMP barrier pairing. |
| 21 | - Examples of memory barrier sequences. | 21 | - Examples of memory barrier sequences. |
| 22 | - Read memory barriers vs load speculation. | ||
| 22 | 23 | ||
| 23 | (*) Explicit kernel barriers. | 24 | (*) Explicit kernel barriers. |
| 24 | 25 | ||
| @@ -248,7 +249,7 @@ And there are a number of things that _must_ or _must_not_ be assumed: | |||
| 248 | we may get either of: | 249 | we may get either of: |
| 249 | 250 | ||
| 250 | STORE *A = X; Y = LOAD *A; | 251 | STORE *A = X; Y = LOAD *A; |
| 251 | STORE *A = Y; | 252 | STORE *A = Y = X; |
| 252 | 253 | ||
| 253 | 254 | ||
| 254 | ========================= | 255 | ========================= |
| @@ -344,9 +345,12 @@ Memory barriers come in four basic varieties: | |||
| 344 | 345 | ||
| 345 | (4) General memory barriers. | 346 | (4) General memory barriers. |
| 346 | 347 | ||
| 347 | A general memory barrier is a combination of both a read memory barrier | 348 | A general memory barrier gives a guarantee that all the LOAD and STORE |
| 348 | and a write memory barrier. It is a partial ordering over both loads and | 349 | operations specified before the barrier will appear to happen before all |
| 349 | stores. | 350 | the LOAD and STORE operations specified after the barrier with respect to |
| 351 | the other components of the system. | ||
| 352 | |||
| 353 | A general memory barrier is a partial ordering over both loads and stores. | ||
| 350 | 354 | ||
| 351 | General memory barriers imply both read and write memory barriers, and so | 355 | General memory barriers imply both read and write memory barriers, and so |
| 352 | can substitute for either. | 356 | can substitute for either. |
| @@ -546,9 +550,9 @@ write barrier, though, again, a general barrier is viable: | |||
| 546 | =============== =============== | 550 | =============== =============== |
| 547 | a = 1; | 551 | a = 1; |
| 548 | <write barrier> | 552 | <write barrier> |
| 549 | b = 2; x = a; | 553 | b = 2; x = b; |
| 550 | <read barrier> | 554 | <read barrier> |
| 551 | y = b; | 555 | y = a; |
| 552 | 556 | ||
| 553 | Or: | 557 | Or: |
| 554 | 558 | ||
| @@ -563,6 +567,18 @@ Or: | |||
| 563 | Basically, the read barrier always has to be there, even though it can be of | 567 | Basically, the read barrier always has to be there, even though it can be of |
| 564 | the "weaker" type. | 568 | the "weaker" type. |
| 565 | 569 | ||
| 570 | [!] Note that the stores before the write barrier would normally be expected to | ||
| 571 | match the loads after the read barrier or data dependency barrier, and vice | ||
| 572 | versa: | ||
| 573 | |||
| 574 | CPU 1 CPU 2 | ||
| 575 | =============== =============== | ||
| 576 | a = 1; }---- --->{ v = c | ||
| 577 | b = 2; } \ / { w = d | ||
| 578 | <write barrier> \ <read barrier> | ||
| 579 | c = 3; } / \ { x = a; | ||
| 580 | d = 4; }---- --->{ y = b; | ||
| 581 | |||
| 566 | 582 | ||
| 567 | EXAMPLES OF MEMORY BARRIER SEQUENCES | 583 | EXAMPLES OF MEMORY BARRIER SEQUENCES |
| 568 | ------------------------------------ | 584 | ------------------------------------ |
| @@ -600,8 +616,8 @@ STORE B, STORE C } all occuring before the unordered set of { STORE D, STORE E | |||
| 600 | | | +------+ | 616 | | | +------+ |
| 601 | +-------+ : : | 617 | +-------+ : : |
| 602 | | | 618 | | |
| 603 | | Sequence in which stores committed to memory system | 619 | | Sequence in which stores are committed to the |
| 604 | | by CPU 1 | 620 | | memory system by CPU 1 |
| 605 | V | 621 | V |
| 606 | 622 | ||
| 607 | 623 | ||
| @@ -683,14 +699,12 @@ then the following will occur: | |||
| 683 | | : : | | | 699 | | : : | | |
| 684 | | : : | CPU 2 | | 700 | | : : | CPU 2 | |
| 685 | | +-------+ | | | 701 | | +-------+ | | |
| 686 | \ | X->9 |------>| | | 702 | | | X->9 |------>| | |
| 687 | \ +-------+ | | | 703 | | +-------+ | | |
| 688 | ----->| B->2 | | | | 704 | Makes sure all effects ---> \ ddddddddddddddddd | | |
| 689 | +-------+ | | | 705 | prior to the store of C \ +-------+ | | |
| 690 | Makes sure all effects ---> ddddddddddddddddd | | | 706 | are perceptible to ----->| B->2 |------>| | |
| 691 | prior to the store of C +-------+ | | | 707 | subsequent loads +-------+ | | |
| 692 | are perceptible to | B->2 |------>| | | ||
| 693 | successive loads +-------+ | | | ||
| 694 | : : +-------+ | 708 | : : +-------+ |
| 695 | 709 | ||
| 696 | 710 | ||
| @@ -699,73 +713,239 @@ following sequence of events: | |||
| 699 | 713 | ||
| 700 | CPU 1 CPU 2 | 714 | CPU 1 CPU 2 |
| 701 | ======================= ======================= | 715 | ======================= ======================= |
| 716 | { A = 0, B = 9 } | ||
| 702 | STORE A=1 | 717 | STORE A=1 |
| 703 | STORE B=2 | ||
| 704 | STORE C=3 | ||
| 705 | <write barrier> | 718 | <write barrier> |
| 706 | STORE D=4 | 719 | STORE B=2 |
| 707 | STORE E=5 | ||
| 708 | LOAD A | ||
| 709 | LOAD B | 720 | LOAD B |
| 710 | LOAD C | 721 | LOAD A |
| 711 | LOAD D | ||
| 712 | LOAD E | ||
| 713 | 722 | ||
| 714 | Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in | 723 | Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in |
| 715 | some effectively random order, despite the write barrier issued by CPU 1: | 724 | some effectively random order, despite the write barrier issued by CPU 1: |
| 716 | 725 | ||
| 717 | +-------+ : : | 726 | +-------+ : : : : |
| 718 | | | +------+ | 727 | | | +------+ +-------+ |
| 719 | | |------>| C=3 | } | 728 | | |------>| A=1 |------ --->| A->0 | |
| 720 | | | : +------+ } | 729 | | | +------+ \ +-------+ |
| 721 | | | : | A=1 | } | 730 | | CPU 1 | wwwwwwwwwwwwwwww \ --->| B->9 | |
| 722 | | | : +------+ } | 731 | | | +------+ | +-------+ |
| 723 | | CPU 1 | : | B=2 | }--- | 732 | | |------>| B=2 |--- | : : |
| 724 | | | +------+ } \ | 733 | | | +------+ \ | : : +-------+ |
| 725 | | | wwwwwwwwwwwww} \ | 734 | +-------+ : : \ | +-------+ | | |
| 726 | | | +------+ } \ : : +-------+ | 735 | ---------->| B->2 |------>| | |
| 727 | | | : | E=5 | } \ +-------+ | | | 736 | | +-------+ | CPU 2 | |
| 728 | | | : +------+ } \ { | C->3 |------>| | | 737 | | | A->0 |------>| | |
| 729 | | |------>| D=4 | } \ { +-------+ : | | | 738 | | +-------+ | | |
| 730 | | | +------+ \ { | E->5 | : | | | 739 | | : : +-------+ |
| 731 | +-------+ : : \ { +-------+ : | | | 740 | \ : : |
| 732 | Transfer -->{ | A->1 | : | CPU 2 | | 741 | \ +-------+ |
| 733 | from CPU 1 { +-------+ : | | | 742 | ---->| A->1 | |
| 734 | to CPU 2 { | D->4 | : | | | 743 | +-------+ |
| 735 | { +-------+ : | | | 744 | : : |
| 736 | { | B->2 |------>| | | ||
| 737 | +-------+ | | | ||
| 738 | : : +-------+ | ||
| 739 | |||
| 740 | |||
| 741 | If, however, a read barrier were to be placed between the load of C and the | ||
| 742 | load of D on CPU 2, then the partial ordering imposed by CPU 1 will be | ||
| 743 | perceived correctly by CPU 2. | ||
| 744 | 745 | ||
| 745 | +-------+ : : | 746 | |
| 746 | | | +------+ | 747 | If, however, a read barrier were to be placed between the load of E and the |
| 747 | | |------>| C=3 | } | 748 | load of A on CPU 2: |
| 748 | | | : +------+ } | 749 | |
| 749 | | | : | A=1 | }--- | 750 | CPU 1 CPU 2 |
| 750 | | | : +------+ } \ | 751 | ======================= ======================= |
| 751 | | CPU 1 | : | B=2 | } \ | 752 | { A = 0, B = 9 } |
| 752 | | | +------+ \ | 753 | STORE A=1 |
| 753 | | | wwwwwwwwwwwwwwww \ | 754 | <write barrier> |
| 754 | | | +------+ \ : : +-------+ | 755 | STORE B=2 |
| 755 | | | : | E=5 | } \ +-------+ | | | 756 | LOAD B |
| 756 | | | : +------+ }--- \ { | C->3 |------>| | | 757 | <read barrier> |
| 757 | | |------>| D=4 | } \ \ { +-------+ : | | | 758 | LOAD A |
| 758 | | | +------+ \ -->{ | B->2 | : | | | 759 | |
| 759 | +-------+ : : \ { +-------+ : | | | 760 | then the partial ordering imposed by CPU 1 will be perceived correctly by CPU |
| 760 | \ { | A->1 | : | CPU 2 | | 761 | 2: |
| 761 | \ +-------+ | | | 762 | |
| 762 | At this point the read ----> \ rrrrrrrrrrrrrrrrr | | | 763 | +-------+ : : : : |
| 763 | barrier causes all effects \ +-------+ | | | 764 | | | +------+ +-------+ |
| 764 | prior to the storage of C \ { | E->5 | : | | | 765 | | |------>| A=1 |------ --->| A->0 | |
| 765 | to be perceptible to CPU 2 -->{ +-------+ : | | | 766 | | | +------+ \ +-------+ |
| 766 | { | D->4 |------>| | | 767 | | CPU 1 | wwwwwwwwwwwwwwww \ --->| B->9 | |
| 767 | +-------+ | | | 768 | | | +------+ | +-------+ |
| 768 | : : +-------+ | 769 | | |------>| B=2 |--- | : : |
| 770 | | | +------+ \ | : : +-------+ | ||
| 771 | +-------+ : : \ | +-------+ | | | ||
| 772 | ---------->| B->2 |------>| | | ||
| 773 | | +-------+ | CPU 2 | | ||
| 774 | | : : | | | ||
| 775 | | : : | | | ||
| 776 | At this point the read ----> \ rrrrrrrrrrrrrrrrr | | | ||
| 777 | barrier causes all effects \ +-------+ | | | ||
| 778 | prior to the storage of B ---->| A->1 |------>| | | ||
| 779 | to be perceptible to CPU 2 +-------+ | | | ||
| 780 | : : +-------+ | ||
| 781 | |||
| 782 | |||
| 783 | To illustrate this more completely, consider what could happen if the code | ||
| 784 | contained a load of A either side of the read barrier: | ||
| 785 | |||
| 786 | CPU 1 CPU 2 | ||
| 787 | ======================= ======================= | ||
| 788 | { A = 0, B = 9 } | ||
| 789 | STORE A=1 | ||
| 790 | <write barrier> | ||
| 791 | STORE B=2 | ||
| 792 | LOAD B | ||
| 793 | LOAD A [first load of A] | ||
| 794 | <read barrier> | ||
| 795 | LOAD A [second load of A] | ||
| 796 | |||
| 797 | Even though the two loads of A both occur after the load of B, they may both | ||
| 798 | come up with different values: | ||
| 799 | |||
| 800 | +-------+ : : : : | ||
| 801 | | | +------+ +-------+ | ||
| 802 | | |------>| A=1 |------ --->| A->0 | | ||
| 803 | | | +------+ \ +-------+ | ||
| 804 | | CPU 1 | wwwwwwwwwwwwwwww \ --->| B->9 | | ||
| 805 | | | +------+ | +-------+ | ||
| 806 | | |------>| B=2 |--- | : : | ||
| 807 | | | +------+ \ | : : +-------+ | ||
| 808 | +-------+ : : \ | +-------+ | | | ||
| 809 | ---------->| B->2 |------>| | | ||
| 810 | | +-------+ | CPU 2 | | ||
| 811 | | : : | | | ||
| 812 | | : : | | | ||
| 813 | | +-------+ | | | ||
| 814 | | | A->0 |------>| 1st | | ||
| 815 | | +-------+ | | | ||
| 816 | At this point the read ----> \ rrrrrrrrrrrrrrrrr | | | ||
| 817 | barrier causes all effects \ +-------+ | | | ||
| 818 | prior to the storage of B ---->| A->1 |------>| 2nd | | ||
| 819 | to be perceptible to CPU 2 +-------+ | | | ||
| 820 | : : +-------+ | ||
| 821 | |||
| 822 | |||
| 823 | But it may be that the update to A from CPU 1 becomes perceptible to CPU 2 | ||
| 824 | before the read barrier completes anyway: | ||
| 825 | |||
| 826 | +-------+ : : : : | ||
| 827 | | | +------+ +-------+ | ||
| 828 | | |------>| A=1 |------ --->| A->0 | | ||
| 829 | | | +------+ \ +-------+ | ||
| 830 | | CPU 1 | wwwwwwwwwwwwwwww \ --->| B->9 | | ||
| 831 | | | +------+ | +-------+ | ||
| 832 | | |------>| B=2 |--- | : : | ||
| 833 | | | +------+ \ | : : +-------+ | ||
| 834 | +-------+ : : \ | +-------+ | | | ||
| 835 | ---------->| B->2 |------>| | | ||
| 836 | | +-------+ | CPU 2 | | ||
| 837 | | : : | | | ||
| 838 | \ : : | | | ||
| 839 | \ +-------+ | | | ||
| 840 | ---->| A->1 |------>| 1st | | ||
| 841 | +-------+ | | | ||
| 842 | rrrrrrrrrrrrrrrrr | | | ||
| 843 | +-------+ | | | ||
| 844 | | A->1 |------>| 2nd | | ||
| 845 | +-------+ | | | ||
| 846 | : : +-------+ | ||
| 847 | |||
| 848 | |||
| 849 | The guarantee is that the second load will always come up with A == 1 if the | ||
| 850 | load of B came up with B == 2. No such guarantee exists for the first load of | ||
| 851 | A; that may come up with either A == 0 or A == 1. | ||
| 852 | |||
| 853 | |||
| 854 | READ MEMORY BARRIERS VS LOAD SPECULATION | ||
| 855 | ---------------------------------------- | ||
| 856 | |||
| 857 | Many CPUs speculate with loads: that is they see that they will need to load an | ||
| 858 | item from memory, and they find a time where they're not using the bus for any | ||
| 859 | other loads, and so do the load in advance - even though they haven't actually | ||
| 860 | got to that point in the instruction execution flow yet. This permits the | ||
| 861 | actual load instruction to potentially complete immediately because the CPU | ||
| 862 | already has the value to hand. | ||
| 863 | |||
| 864 | It may turn out that the CPU didn't actually need the value - perhaps because a | ||
| 865 | branch circumvented the load - in which case it can discard the value or just | ||
| 866 | cache it for later use. | ||
| 867 | |||
| 868 | Consider: | ||
| 869 | |||
| 870 | CPU 1 CPU 2 | ||
| 871 | ======================= ======================= | ||
| 872 | LOAD B | ||
| 873 | DIVIDE } Divide instructions generally | ||
| 874 | DIVIDE } take a long time to perform | ||
| 875 | LOAD A | ||
| 876 | |||
| 877 | Which might appear as this: | ||
| 878 | |||
| 879 | : : +-------+ | ||
| 880 | +-------+ | | | ||
| 881 | --->| B->2 |------>| | | ||
| 882 | +-------+ | CPU 2 | | ||
| 883 | : :DIVIDE | | | ||
| 884 | +-------+ | | | ||
| 885 | The CPU being busy doing a ---> --->| A->0 |~~~~ | | | ||
| 886 | division speculates on the +-------+ ~ | | | ||
| 887 | LOAD of A : : ~ | | | ||
| 888 | : :DIVIDE | | | ||
| 889 | : : ~ | | | ||
| 890 | Once the divisions are complete --> : : ~-->| | | ||
| 891 | the CPU can then perform the : : | | | ||
| 892 | LOAD with immediate effect : : +-------+ | ||
| 893 | |||
| 894 | |||
| 895 | Placing a read barrier or a data dependency barrier just before the second | ||
| 896 | load: | ||
| 897 | |||
| 898 | CPU 1 CPU 2 | ||
| 899 | ======================= ======================= | ||
| 900 | LOAD B | ||
| 901 | DIVIDE | ||
| 902 | DIVIDE | ||
| 903 | <read barrier> | ||
| 904 | LOAD A | ||
| 905 | |||
| 906 | will force any value speculatively obtained to be reconsidered to an extent | ||
| 907 | dependent on the type of barrier used. If there was no change made to the | ||
| 908 | speculated memory location, then the speculated value will just be used: | ||
| 909 | |||
| 910 | : : +-------+ | ||
| 911 | +-------+ | | | ||
| 912 | --->| B->2 |------>| | | ||
| 913 | +-------+ | CPU 2 | | ||
| 914 | : :DIVIDE | | | ||
| 915 | +-------+ | | | ||
| 916 | The CPU being busy doing a ---> --->| A->0 |~~~~ | | | ||
| 917 | division speculates on the +-------+ ~ | | | ||
| 918 | LOAD of A : : ~ | | | ||
| 919 | : :DIVIDE | | | ||
| 920 | : : ~ | | | ||
| 921 | : : ~ | | | ||
| 922 | rrrrrrrrrrrrrrrr~ | | | ||
| 923 | : : ~ | | | ||
| 924 | : : ~-->| | | ||
| 925 | : : | | | ||
| 926 | : : +-------+ | ||
| 927 | |||
| 928 | |||
| 929 | but if there was an update or an invalidation from another CPU pending, then | ||
| 930 | the speculation will be cancelled and the value reloaded: | ||
| 931 | |||
| 932 | : : +-------+ | ||
| 933 | +-------+ | | | ||
| 934 | --->| B->2 |------>| | | ||
| 935 | +-------+ | CPU 2 | | ||
| 936 | : :DIVIDE | | | ||
| 937 | +-------+ | | | ||
| 938 | The CPU being busy doing a ---> --->| A->0 |~~~~ | | | ||
| 939 | division speculates on the +-------+ ~ | | | ||
| 940 | LOAD of A : : ~ | | | ||
| 941 | : :DIVIDE | | | ||
| 942 | : : ~ | | | ||
| 943 | : : ~ | | | ||
| 944 | rrrrrrrrrrrrrrrrr | | | ||
| 945 | +-------+ | | | ||
| 946 | The speculation is discarded ---> --->| A->1 |------>| | | ||
| 947 | and an updated value is +-------+ | | | ||
| 948 | retrieved : : +-------+ | ||
| 769 | 949 | ||
| 770 | 950 | ||
| 771 | ======================== | 951 | ======================== |
| @@ -901,7 +1081,7 @@ IMPLICIT KERNEL MEMORY BARRIERS | |||
| 901 | =============================== | 1081 | =============================== |
| 902 | 1082 | ||
| 903 | Some of the other functions in the linux kernel imply memory barriers, amongst | 1083 | Some of the other functions in the linux kernel imply memory barriers, amongst |
| 904 | which are locking, scheduling and memory allocation functions. | 1084 | which are locking and scheduling functions. |
| 905 | 1085 | ||
| 906 | This specification is a _minimum_ guarantee; any particular architecture may | 1086 | This specification is a _minimum_ guarantee; any particular architecture may |
| 907 | provide more substantial guarantees, but these may not be relied upon outside | 1087 | provide more substantial guarantees, but these may not be relied upon outside |
| @@ -966,6 +1146,20 @@ equivalent to a full barrier, but a LOCK followed by an UNLOCK is not. | |||
| 966 | barriers is that the effects instructions outside of a critical section may | 1146 | barriers is that the effects instructions outside of a critical section may |
| 967 | seep into the inside of the critical section. | 1147 | seep into the inside of the critical section. |
| 968 | 1148 | ||
| 1149 | A LOCK followed by an UNLOCK may not be assumed to be full memory barrier | ||
| 1150 | because it is possible for an access preceding the LOCK to happen after the | ||
| 1151 | LOCK, and an access following the UNLOCK to happen before the UNLOCK, and the | ||
| 1152 | two accesses can themselves then cross: | ||
| 1153 | |||
| 1154 | *A = a; | ||
| 1155 | LOCK | ||
| 1156 | UNLOCK | ||
| 1157 | *B = b; | ||
| 1158 | |||
| 1159 | may occur as: | ||
| 1160 | |||
| 1161 | LOCK, STORE *B, STORE *A, UNLOCK | ||
| 1162 | |||
| 969 | Locks and semaphores may not provide any guarantee of ordering on UP compiled | 1163 | Locks and semaphores may not provide any guarantee of ordering on UP compiled |
| 970 | systems, and so cannot be counted on in such a situation to actually achieve | 1164 | systems, and so cannot be counted on in such a situation to actually achieve |
| 971 | anything at all - especially with respect to I/O accesses - unless combined | 1165 | anything at all - especially with respect to I/O accesses - unless combined |
| @@ -1016,8 +1210,6 @@ Other functions that imply barriers: | |||
| 1016 | 1210 | ||
| 1017 | (*) schedule() and similar imply full memory barriers. | 1211 | (*) schedule() and similar imply full memory barriers. |
| 1018 | 1212 | ||
| 1019 | (*) Memory allocation and release functions imply full memory barriers. | ||
| 1020 | |||
| 1021 | 1213 | ||
| 1022 | ================================= | 1214 | ================================= |
| 1023 | INTER-CPU LOCKING BARRIER EFFECTS | 1215 | INTER-CPU LOCKING BARRIER EFFECTS |
