diff options
author | Alan Stern <stern@rowland.harvard.edu> | 2018-05-14 19:33:40 -0400 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2018-05-15 02:11:16 -0400 |
commit | 9d036883a17969caf8796d1fce813af0ab016986 (patch) | |
tree | 5c9bcd17f62f63a8a3d85cb5abb9bf33a6d4cceb /tools/memory-model | |
parent | 1ee2da5f9b5a8e814b397b77a08d44fed5f96a4a (diff) |
tools/memory-model: Redefine rb in terms of rcu-fence
This patch reorganizes the definition of rb in the Linux Kernel Memory
Consistency Model. The relation is now expressed in terms of
rcu-fence, which consists of a sequence of gp and rscs links separated
by rcu-link links, in which the number of occurrences of gp is >= the
number of occurrences of rscs.
Arguments similar to those published in
http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an
inter-CPU strong fence. Furthermore, the definition of rb in terms of
rcu-fence is highly analogous to the definition of pb in terms of
strong-fence, which can help explain why rcu-path expresses a form of
temporal ordering.
This change should not affect the semantics of the memory model, just
its internal organization.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: akiyks@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: npiggin@gmail.com
Link: http://lkml.kernel.org/r/1526340837-12222-2-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'tools/memory-model')
-rw-r--r-- | tools/memory-model/Documentation/explanation.txt | 168 | ||||
-rw-r--r-- | tools/memory-model/linux-kernel.cat | 33 |
2 files changed, 129 insertions, 72 deletions
diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt index 1a387d703212..1b09f3175a1f 100644 --- a/tools/memory-model/Documentation/explanation.txt +++ b/tools/memory-model/Documentation/explanation.txt | |||
@@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory Consistency Model | |||
27 | 19. AND THEN THERE WAS ALPHA | 27 | 19. AND THEN THERE WAS ALPHA |
28 | 20. THE HAPPENS-BEFORE RELATION: hb | 28 | 20. THE HAPPENS-BEFORE RELATION: hb |
29 | 21. THE PROPAGATES-BEFORE RELATION: pb | 29 | 21. THE PROPAGATES-BEFORE RELATION: pb |
30 | 22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb | 30 | 22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb |
31 | 23. ODDS AND ENDS | 31 | 23. ODDS AND ENDS |
32 | 32 | ||
33 | 33 | ||
@@ -1451,8 +1451,8 @@ they execute means that it cannot have cycles. This requirement is | |||
1451 | the content of the LKMM's "propagation" axiom. | 1451 | the content of the LKMM's "propagation" axiom. |
1452 | 1452 | ||
1453 | 1453 | ||
1454 | RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb | 1454 | RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb |
1455 | --------------------------------------------------- | 1455 | ---------------------------------------------------- |
1456 | 1456 | ||
1457 | RCU (Read-Copy-Update) is a powerful synchronization mechanism. It | 1457 | RCU (Read-Copy-Update) is a powerful synchronization mechanism. It |
1458 | rests on two concepts: grace periods and read-side critical sections. | 1458 | rests on two concepts: grace periods and read-side critical sections. |
@@ -1537,49 +1537,100 @@ relation, and the details don't matter unless you want to comb through | |||
1537 | a somewhat lengthy formal proof. Pretty much all you need to know | 1537 | a somewhat lengthy formal proof. Pretty much all you need to know |
1538 | about rcu-link is the information in the preceding paragraph. | 1538 | about rcu-link is the information in the preceding paragraph. |
1539 | 1539 | ||
1540 | The LKMM goes on to define the gp-link and rscs-link relations. They | 1540 | The LKMM also defines the gp and rscs relations. They bring grace |
1541 | bring grace periods and read-side critical sections into the picture, | 1541 | periods and read-side critical sections into the picture, in the |
1542 | in the following way: | 1542 | following way: |
1543 | 1543 | ||
1544 | E ->gp-link F means there is a synchronize_rcu() fence event S | 1544 | E ->gp F means there is a synchronize_rcu() fence event S such |
1545 | and an event X such that E ->po S, either S ->po X or S = X, | 1545 | that E ->po S and either S ->po F or S = F. In simple terms, |
1546 | and X ->rcu-link F. In other words, E and F are linked by a | 1546 | there is a grace period po-between E and F. |
1547 | grace period followed by an instance of rcu-link. | ||
1548 | 1547 | ||
1549 | E ->rscs-link F means there is a critical section delimited by | 1548 | E ->rscs F means there is a critical section delimited by an |
1550 | an rcu_read_lock() fence L and an rcu_read_unlock() fence U, | 1549 | rcu_read_lock() fence L and an rcu_read_unlock() fence U, such |
1551 | and an event X such that E ->po U, either L ->po X or L = X, | 1550 | that E ->po U and either L ->po F or L = F. You can think of |
1552 | and X ->rcu-link F. Roughly speaking, this says that some | 1551 | this as saying that E and F are in the same critical section |
1553 | event in the same critical section as E is linked by rcu-link | 1552 | (in fact, it also allows E to be po-before the start of the |
1554 | to F. | 1553 | critical section and F to be po-after the end). |
1555 | 1554 | ||
1556 | If we think of the rcu-link relation as standing for an extended | 1555 | If we think of the rcu-link relation as standing for an extended |
1557 | "before", then E ->gp-link F says that E executes before a grace | 1556 | "before", then X ->gp Y ->rcu-link Z says that X executes before a |
1558 | period which ends before F executes. (In fact it covers more than | 1557 | grace period which ends before Z executes. (In fact it covers more |
1559 | this, because it also includes cases where E executes before a grace | 1558 | than this, because it also includes cases where X executes before a |
1560 | period and some store propagates to F's CPU before F executes and | 1559 | grace period and some store propagates to Z's CPU before Z executes |
1561 | doesn't propagate to some other CPU until after the grace period | 1560 | but doesn't propagate to some other CPU until after the grace period |
1562 | ends.) Similarly, E ->rscs-link F says that E is part of (or before | 1561 | ends.) Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or |
1563 | the start of) a critical section which starts before F executes. | 1562 | before the start of) a critical section which starts before Z |
1563 | executes. | ||
1564 | |||
1565 | The LKMM goes on to define the rcu-fence relation as a sequence of gp | ||
1566 | and rscs links separated by rcu-link links, in which the number of gp | ||
1567 | links is >= the number of rscs links. For example: | ||
1568 | |||
1569 | X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V | ||
1570 | |||
1571 | would imply that X ->rcu-fence V, because this sequence contains two | ||
1572 | gp links and only one rscs link. (It also implies that X ->rcu-fence T | ||
1573 | and Z ->rcu-fence V.) On the other hand: | ||
1574 | |||
1575 | X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V | ||
1576 | |||
1577 | does not imply X ->rcu-fence V, because the sequence contains only | ||
1578 | one gp link but two rscs links. | ||
1579 | |||
1580 | The rcu-fence relation is important because the Grace Period Guarantee | ||
1581 | means that rcu-fence acts kind of like a strong fence. In particular, | ||
1582 | if W is a write and we have W ->rcu-fence Z, the Guarantee says that W | ||
1583 | will propagate to every CPU before Z executes. | ||
1584 | |||
1585 | To prove this in full generality requires some intellectual effort. | ||
1586 | We'll consider just a very simple case: | ||
1587 | |||
1588 | W ->gp X ->rcu-link Y ->rscs Z. | ||
1589 | |||
1590 | This formula means that there is a grace period G and a critical | ||
1591 | section C such that: | ||
1592 | |||
1593 | 1. W is po-before G; | ||
1594 | |||
1595 | 2. X is equal to or po-after G; | ||
1596 | |||
1597 | 3. X comes "before" Y in some sense; | ||
1598 | |||
1599 | 4. Y is po-before the end of C; | ||
1600 | |||
1601 | 5. Z is equal to or po-after the start of C. | ||
1602 | |||
1603 | From 2 - 4 we deduce that the grace period G ends before the critical | ||
1604 | section C. Then the second part of the Grace Period Guarantee says | ||
1605 | not only that G starts before C does, but also that W (which executes | ||
1606 | on G's CPU before G starts) must propagate to every CPU before C | ||
1607 | starts. In particular, W propagates to every CPU before Z executes | ||
1608 | (or finishes executing, in the case where Z is equal to the | ||
1609 | rcu_read_lock() fence event which starts C.) This sort of reasoning | ||
1610 | can be expanded to handle all the situations covered by rcu-fence. | ||
1611 | |||
1612 | Finally, the LKMM defines the RCU-before (rb) relation in terms of | ||
1613 | rcu-fence. This is done in essentially the same way as the pb | ||
1614 | relation was defined in terms of strong-fence. We will omit the | ||
1615 | details; the end result is that E ->rb F implies E must execute before | ||
1616 | F, just as E ->pb F does (and for much the same reasons). | ||
1564 | 1617 | ||
1565 | Putting this all together, the LKMM expresses the Grace Period | 1618 | Putting this all together, the LKMM expresses the Grace Period |
1566 | Guarantee by requiring that there are no cycles consisting of gp-link | 1619 | Guarantee by requiring that the rb relation does not contain a cycle. |
1567 | and rscs-link links in which the number of gp-link instances is >= the | 1620 | Equivalently, this "rcu" axiom requires that there are no events E and |
1568 | number of rscs-link instances. It does this by defining the rb | 1621 | F with E ->rcu-link F ->rcu-fence E. Or to put it a third way, the |
1569 | relation to link events E and F whenever it is possible to pass from E | 1622 | axiom requires that there are no cycles consisting of gp and rscs |
1570 | to F by a sequence of gp-link and rscs-link links with at least as | 1623 | alternating with rcu-link, where the number of gp links is >= the |
1571 | many of the former as the latter. The LKMM's "rcu" axiom then says | 1624 | number of rscs links. |
1572 | that there are no events E with E ->rb E. | 1625 | |
1573 | 1626 | Justifying the axiom isn't easy, but it is in fact a valid | |
1574 | Justifying this axiom takes some intellectual effort, but it is in | 1627 | formalization of the Grace Period Guarantee. We won't attempt to go |
1575 | fact a valid formalization of the Grace Period Guarantee. We won't | 1628 | through the detailed argument, but the following analysis gives a |
1576 | attempt to go through the detailed argument, but the following | 1629 | taste of what is involved. Suppose we have a violation of the first |
1577 | analysis gives a taste of what is involved. Suppose we have a | 1630 | part of the Guarantee: A critical section starts before a grace |
1578 | violation of the first part of the Guarantee: A critical section | 1631 | period, and some store propagates to the critical section's CPU before |
1579 | starts before a grace period, and some store propagates to the | 1632 | the end of the critical section but doesn't propagate to some other |
1580 | critical section's CPU before the end of the critical section but | 1633 | CPU until after the end of the grace period. |
1581 | doesn't propagate to some other CPU until after the end of the grace | ||
1582 | period. | ||
1583 | 1634 | ||
1584 | Putting symbols to these ideas, let L and U be the rcu_read_lock() and | 1635 | Putting symbols to these ideas, let L and U be the rcu_read_lock() and |
1585 | rcu_read_unlock() fence events delimiting the critical section in | 1636 | rcu_read_unlock() fence events delimiting the critical section in |
@@ -1606,11 +1657,14 @@ by rcu-link, yielding: | |||
1606 | 1657 | ||
1607 | S ->po X ->rcu-link Z ->po U. | 1658 | S ->po X ->rcu-link Z ->po U. |
1608 | 1659 | ||
1609 | The formulas say that S is po-between F and X, hence F ->gp-link Z | 1660 | The formulas say that S is po-between F and X, hence F ->gp X. They |
1610 | via X. They also say that Z comes before the end of the critical | 1661 | also say that Z comes before the end of the critical section and E |
1611 | section and E comes after its start, hence Z ->rscs-link F via E. But | 1662 | comes after its start, hence Z ->rscs E. From all this we obtain: |
1612 | now we have a forbidden cycle: F ->gp-link Z ->rscs-link F. Thus the | 1663 | |
1613 | "rcu" axiom rules out this violation of the Grace Period Guarantee. | 1664 | F ->gp X ->rcu-link Z ->rscs E ->rcu-link F, |
1665 | |||
1666 | a forbidden cycle. Thus the "rcu" axiom rules out this violation of | ||
1667 | the Grace Period Guarantee. | ||
1614 | 1668 | ||
1615 | For something a little more down-to-earth, let's see how the axiom | 1669 | For something a little more down-to-earth, let's see how the axiom |
1616 | works out in practice. Consider the RCU code example from above, this | 1670 | works out in practice. Consider the RCU code example from above, this |
@@ -1639,15 +1693,15 @@ time with statement labels added to the memory access instructions: | |||
1639 | If r2 = 0 at the end then P0's store at X overwrites the value that | 1693 | If r2 = 0 at the end then P0's store at X overwrites the value that |
1640 | P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X. | 1694 | P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X. |
1641 | In addition, there is a synchronize_rcu() between Y and Z, so therefore | 1695 | In addition, there is a synchronize_rcu() between Y and Z, so therefore |
1642 | we have Y ->gp-link X. | 1696 | we have Y ->gp Z. |
1643 | 1697 | ||
1644 | If r1 = 1 at the end then P1's load at Y reads from P0's store at W, | 1698 | If r1 = 1 at the end then P1's load at Y reads from P0's store at W, |
1645 | so we have W ->rcu-link Y. In addition, W and X are in the same critical | 1699 | so we have W ->rcu-link Y. In addition, W and X are in the same critical |
1646 | section, so therefore we have X ->rscs-link Y. | 1700 | section, so therefore we have X ->rscs W. |
1647 | 1701 | ||
1648 | This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link | 1702 | Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle, |
1649 | and one rscs-link, violating the "rcu" axiom. Hence the outcome is | 1703 | violating the "rcu" axiom. Hence the outcome is not allowed by the |
1650 | not allowed by the LKMM, as we would expect. | 1704 | LKMM, as we would expect. |
1651 | 1705 | ||
1652 | For contrast, let's see what can happen in a more complicated example: | 1706 | For contrast, let's see what can happen in a more complicated example: |
1653 | 1707 | ||
@@ -1683,15 +1737,11 @@ For contrast, let's see what can happen in a more complicated example: | |||
1683 | } | 1737 | } |
1684 | 1738 | ||
1685 | If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows | 1739 | If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows |
1686 | that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W | 1740 | that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W. |
1687 | via V. And just as before, this gives a cycle: | 1741 | However this cycle is not forbidden, because the sequence of relations |
1688 | 1742 | contains fewer instances of gp (one) than of rscs (two). Consequently | |
1689 | W ->rscs-link Y ->gp-link U ->rscs-link W. | 1743 | the outcome is allowed by the LKMM. The following instruction timing |
1690 | 1744 | diagram shows how it might actually occur: | |
1691 | However, this cycle has fewer gp-link instances than rscs-link | ||
1692 | instances, and consequently the outcome is not forbidden by the LKMM. | ||
1693 | The following instruction timing diagram shows how it might actually | ||
1694 | occur: | ||
1695 | 1745 | ||
1696 | P0 P1 P2 | 1746 | P0 P1 P2 |
1697 | -------------------- -------------------- -------------------- | 1747 | -------------------- -------------------- -------------------- |
diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat index cdf682859d4e..1e5c4653dd12 100644 --- a/tools/memory-model/linux-kernel.cat +++ b/tools/memory-model/linux-kernel.cat | |||
@@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po? | |||
102 | *) | 102 | *) |
103 | let rcu-link = hb* ; pb* ; prop | 103 | let rcu-link = hb* ; pb* ; prop |
104 | 104 | ||
105 | (* Chains that affect the RCU grace-period guarantee *) | ||
106 | let gp-link = gp ; rcu-link | ||
107 | let rscs-link = rscs ; rcu-link | ||
108 | |||
109 | (* | 105 | (* |
110 | * A cycle containing at least as many grace periods as RCU read-side | 106 | * Any sequence containing at least as many grace periods as RCU read-side |
111 | * critical sections is forbidden. | 107 | * critical sections (joined by rcu-link) acts as a generalized strong fence. |
112 | *) | 108 | *) |
113 | let rec rb = | 109 | let rec rcu-fence = gp | |
114 | gp-link | | 110 | (gp ; rcu-link ; rscs) | |
115 | (gp-link ; rscs-link) | | 111 | (rscs ; rcu-link ; gp) | |
116 | (rscs-link ; gp-link) | | 112 | (gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) | |
117 | (rb ; rb) | | 113 | (rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) | |
118 | (gp-link ; rb ; rscs-link) | | 114 | (rcu-fence ; rcu-link ; rcu-fence) |
119 | (rscs-link ; rb ; gp-link) | 115 | |
116 | (* rb orders instructions just as pb does *) | ||
117 | let rb = prop ; rcu-fence ; hb* ; pb* | ||
120 | 118 | ||
121 | irreflexive rb as rcu | 119 | irreflexive rb as rcu |
120 | |||
121 | (* | ||
122 | * The happens-before, propagation, and rcu constraints are all | ||
123 | * expressions of temporal ordering. They could be replaced by | ||
124 | * a single constraint on an "executes-before" relation, xb: | ||
125 | * | ||
126 | * let xb = hb | pb | rb | ||
127 | * acyclic xb as executes-before | ||
128 | *) | ||