Diffstat (limited to 'Documentation/memory-barriers.txt')
 Documentation/memory-barriers.txt | 141
 1 file changed, 116 insertions(+), 25 deletions(-)
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 904ee42d078e..3729cbe60e41 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -232,7 +232,7 @@ And there are a number of things that _must_ or _must_not_ be assumed:
      with memory references that are not protected by READ_ONCE() and
      WRITE_ONCE().  Without them, the compiler is within its rights to
      do all sorts of "creative" transformations, which are covered in
-     the Compiler Barrier section.
+     the COMPILER BARRIER section.
 
  (*) It _must_not_ be assumed that independent loads and stores will be issued
      in the order given.  This means that for:
@@ -555,6 +555,30 @@ between the address load and the data load:
 This enforces the occurrence of one of the two implications, and prevents the
 third possibility from arising.
 
+A data-dependency barrier must also order against dependent writes:
+
+        CPU 1                 CPU 2
+        ===============       ===============
+        { A == 1, B == 2, C = 3, P == &A, Q == &C }
+        B = 4;
+        <write barrier>
+        WRITE_ONCE(P, &B);
+                              Q = READ_ONCE(P);
+                              <data dependency barrier>
+                              *Q = 5;
+
+The data-dependency barrier must order the read into Q with the store
+into *Q.  This prohibits this outcome:
+
+        (Q == &B) && (B == 4)
+
+Please note that this pattern should be rare.  After all, the whole point
+of dependency ordering is to -prevent- writes to the data structure, along
+with the expensive cache misses associated with those writes.  This pattern
+can be used to record rare error conditions and the like, and the ordering
+prevents such records from being lost.
+
+
 [!] Note that this extremely counterintuitive situation arises most easily on
 machines with split caches, so that, for example, one cache bank processes
 even-numbered cache lines and the other bank processes odd-numbered cache
@@ -565,21 +589,6 @@ odd-numbered bank is idle, one can see the new value of the pointer P (&B),
 but the old value of the variable B (2).
 
 
-Another example of where data dependency barriers might be required is where a
-number is read from memory and then used to calculate the index for an array
-access:
-
-        CPU 1                 CPU 2
-        ===============       ===============
-        { M[0] == 1, M[1] == 2, M[3] = 3, P == 0, Q == 3 }
-        M[1] = 4;
-        <write barrier>
-        WRITE_ONCE(P, 1);
-                              Q = READ_ONCE(P);
-                              <data dependency barrier>
-                              D = M[Q];
-
-
 The data dependency barrier is very important to the RCU system,
 for example.  See rcu_assign_pointer() and rcu_dereference() in
 include/linux/rcupdate.h.  This permits the current target of an RCU'd
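
The dependent-write ordering documented in the hunk above is what lets an RCU
reader safely store into a structure it has just dereferenced.  A minimal
kernel-style sketch follows; the errlog structure, the cur_log pointer, and
the two functions are invented here purely for illustration and are not part
of the patch:

        #include <linux/rcupdate.h>
        #include <linux/slab.h>

        struct errlog {
                int code;
                int seen;
        };

        static struct errlog __rcu *cur_log;

        /*
         * Publisher: initialize the record, then publish the pointer.
         * rcu_assign_pointer() supplies the <write barrier> from the
         * litmus test above.  (Freeing any old record is omitted.)
         */
        static void publish_log(int code)
        {
                struct errlog *p = kzalloc(sizeof(*p), GFP_KERNEL);

                if (!p)
                        return;
                p->code = code;                 /* like "B = 4"           */
                rcu_assign_pointer(cur_log, p); /* like WRITE_ONCE(P, &B) */
        }

        /*
         * Consumer: rcu_dereference() provides the data-dependency
         * ordering, so the dependent store below (like "*Q = 5") is
         * ordered after the pointer load; if the new record is seen,
         * the store cannot be lost to that record's initialization.
         */
        static void mark_log_seen(void)
        {
                struct errlog *q;

                rcu_read_lock();
                q = rcu_dereference(cur_log);   /* like Q = READ_ONCE(P)  */
                if (q)
                        WRITE_ONCE(q->seen, 1); /* dependent write        */
                rcu_read_unlock();
        }

This matches the "record rare error conditions" use case called out in the
hunk: the write side is rare, so the cache misses that dependency ordering
normally exists to avoid are an acceptable cost.
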
@@ -800,9 +809,13 @@ In summary:
       use smp_rmb(), smp_wmb(), or, in the case of prior stores and
       later loads, smp_mb().
 
-  (*) If both legs of the "if" statement begin with identical stores
-      to the same variable, a barrier() statement is required at the
-      beginning of each leg of the "if" statement.
+  (*) If both legs of the "if" statement begin with identical stores to
+      the same variable, then those stores must be ordered, either by
+      preceding both of them with smp_mb() or by using smp_store_release()
+      to carry out the stores.  Please note that it is -not- sufficient
+      to use barrier() at beginning of each leg of the "if" statement,
+      as optimizing compilers do not necessarily respect barrier()
+      in this case.
 
   (*) Control dependencies require at least one run-time conditional
       between the prior load and the subsequent store, and this
@@ -814,7 +827,7 @@ In summary:
   (*) Control dependencies require that the compiler avoid reordering the
       dependency into nonexistence.  Careful use of READ_ONCE() or
       atomic{,64}_read() can help to preserve your control dependency.
-      Please see the Compiler Barrier section for more information.
+      Please see the COMPILER BARRIER section for more information.
 
   (*) Control dependencies pair normally with other types of barriers.
 
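
The updated "identical stores" rule above is easiest to see in code.  A
minimal sketch, using illustrative variables a and b and placeholder
functions in the style of the document's CONTROL DEPENDENCIES examples:

        #include <linux/compiler.h>     /* READ_ONCE()          */
        #include <asm/barrier.h>        /* smp_store_release()  */

        extern int a, b;
        extern void do_something(void);
        extern void do_something_else(void);

        void update(void)
        {
                int q = READ_ONCE(a);

                if (q) {
                        smp_store_release(&b, 1);  /* ordered after load of a */
                        do_something();
                } else {
                        smp_store_release(&b, 1);  /* ordered after load of a */
                        do_something_else();
                }

                /*
                 * Per the hunk above, barrier() at the start of each leg
                 * is not sufficient: the compiler may hoist the two
                 * identical plain stores out of the "if" entirely,
                 * destroying the control dependency between the load of
                 * a and the store to b.
                 */
        }

The smp_mb() alternative mentioned in the hunk would instead place a full
barrier at the head of each leg, before the store.
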
@@ -1257,7 +1270,7 @@ TRANSITIVITY
 
 Transitivity is a deeply intuitive notion about ordering that is not
 always provided by real computer systems.  The following example
-demonstrates transitivity (also called "cumulativity"):
+demonstrates transitivity:
 
         CPU 1                   CPU 2                   CPU 3
         ======================= ======================= =======================
@@ -1305,8 +1318,86 @@ or a level of cache, CPU 2 might have early access to CPU 1's writes.
 General barriers are therefore required to ensure that all CPUs agree
 on the combined order of CPU 1's and CPU 2's accesses.
 
-To reiterate, if your code requires transitivity, use general barriers
-throughout.
+General barriers provide "global transitivity", so that all CPUs will
+agree on the order of operations.  In contrast, a chain of release-acquire
+pairs provides only "local transitivity", so that only those CPUs on
+the chain are guaranteed to agree on the combined order of the accesses.
+For example, switching to C code in deference to Herman Hollerith:
+
+        int u, v, x, y, z;
+
+        void cpu0(void)
+        {
+                r0 = smp_load_acquire(&x);
+                WRITE_ONCE(u, 1);
+                smp_store_release(&y, 1);
+        }
+
+        void cpu1(void)
+        {
+                r1 = smp_load_acquire(&y);
+                r4 = READ_ONCE(v);
+                r5 = READ_ONCE(u);
+                smp_store_release(&z, 1);
+        }
+
+        void cpu2(void)
+        {
+                r2 = smp_load_acquire(&z);
+                smp_store_release(&x, 1);
+        }
+
+        void cpu3(void)
+        {
+                WRITE_ONCE(v, 1);
+                smp_mb();
+                r3 = READ_ONCE(u);
+        }
+
+Because cpu0(), cpu1(), and cpu2() participate in a local transitive
+chain of smp_store_release()/smp_load_acquire() pairs, the following
+outcome is prohibited:
+
+        r0 == 1 && r1 == 1 && r2 == 1
+
+Furthermore, because of the release-acquire relationship between cpu0()
+and cpu1(), cpu1() must see cpu0()'s writes, so that the following
+outcome is prohibited:
+
+        r1 == 1 && r5 == 0
+
+However, the transitivity of release-acquire is local to the participating
+CPUs and does not apply to cpu3().  Therefore, the following outcome
+is possible:
+
+        r0 == 0 && r1 == 1 && r2 == 1 && r3 == 0 && r4 == 0
+
+As an aside, the following outcome is also possible:
+
+        r0 == 0 && r1 == 1 && r2 == 1 && r3 == 0 && r4 == 0 && r5 == 1
+
+Although cpu0(), cpu1(), and cpu2() will see their respective reads and
+writes in order, CPUs not involved in the release-acquire chain might
+well disagree on the order.  This disagreement stems from the fact that
+the weak memory-barrier instructions used to implement smp_load_acquire()
+and smp_store_release() are not required to order prior stores against
+subsequent loads in all cases.  This means that cpu3() can see cpu0()'s
+store to u as happening -after- cpu1()'s load from v, even though
+both cpu0() and cpu1() agree that these two operations occurred in the
+intended order.
+
+However, please keep in mind that smp_load_acquire() is not magic.
+In particular, it simply reads from its argument with ordering.  It does
+-not- ensure that any particular value will be read.  Therefore, the
+following outcome is possible:
+
+        r0 == 0 && r1 == 0 && r2 == 0 && r5 == 0
+
+Note that this outcome can happen even on a mythical sequentially
+consistent system where nothing is ever reordered.
+
+To reiterate, if your code requires global transitivity, use general
+barriers throughout.
 
 
 ========================
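
To make the hunk's closing advice concrete, here is a sketch (not part of
the patch) of the same chain with the release-acquire pairs strengthened to
general barriers.  With smp_mb() on every CPU in the chain, even a CPU
outside it such as cpu3() must agree on the order of the accesses, so the
r0 == 0 && r1 == 1 && r2 == 1 && r3 == 0 && r4 == 0 outcome above becomes
prohibited:

        int u, v, x, y, z;
        int r0, r1, r2, r4, r5;

        void cpu0(void)
        {
                r0 = READ_ONCE(x);
                smp_mb();               /* general barrier, not just acquire */
                WRITE_ONCE(u, 1);
                smp_mb();               /* general barrier, not just release */
                WRITE_ONCE(y, 1);
        }

        void cpu1(void)
        {
                r1 = READ_ONCE(y);
                smp_mb();               /* orders the prior load against ... */
                r4 = READ_ONCE(v);      /* ... these loads and ...           */
                r5 = READ_ONCE(u);
                smp_mb();
                WRITE_ONCE(z, 1);       /* ... this store, globally.         */
        }

        void cpu2(void)
        {
                r2 = READ_ONCE(z);
                smp_mb();
                WRITE_ONCE(x, 1);
        }

        /* cpu3() is unchanged from the hunk above. */
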
@@ -1459,7 +1550,7 @@ of optimizations:
      the following:
 
         a = 0;
-        /* Code that does not store to variable a. */
+        ... Code that does not store to variable a ...
         a = 0;
 
      The compiler sees that the value of variable 'a' is already zero, so
@@ -1471,7 +1562,7 @@ of optimizations:
      wrong guess:
 
         WRITE_ONCE(a, 0);
-        /* Code that does not store to variable a. */
+        ... Code that does not store to variable a ...
         WRITE_ONCE(a, 0);
 
  (*) The compiler is within its rights to reorder memory accesses unless
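
As a hedged illustration of the dead-store discussion above (the variable a
is the document's; the two helper functions below are invented for this
sketch), only the volatile store performed by WRITE_ONCE() keeps the second
zeroing from being optimized away:

        #include <linux/compiler.h>     /* WRITE_ONCE() */

        int a;

        /*
         * Plain C stores: nothing visible between the two assignments
         * writes to 'a', so the compiler may guess that 'a' is still
         * zero and elide the second store -- the "wrong guess" if some
         * other CPU has stored to 'a' in the meantime.
         */
        int zero_twice_plain(int loops)
        {
                int i, scratch = 0;

                a = 0;
                for (i = 0; i < loops; i++)     /* does not store to 'a' */
                        scratch += i;
                a = 0;                          /* may be elided         */
                return scratch;
        }

        /*
         * WRITE_ONCE() performs a volatile store, so the compiler must
         * emit both stores even though it cannot see any intervening
         * write to 'a'.
         */
        int zero_twice_once(int loops)
        {
                int i, scratch = 0;

                WRITE_ONCE(a, 0);
                for (i = 0; i < loops; i++)
                        scratch += i;
                WRITE_ONCE(a, 0);
                return scratch;
        }
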
