diff options
author | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2008-01-25 15:08:25 -0500 |
---|---|---|
committer | Ingo Molnar <mingo@elte.hu> | 2008-01-25 15:08:25 -0500 |
commit | f85d6c7168887e6659f4d00fa5f34fa23dbde1bb (patch) | |
tree | 8a06a394c5963fe1428d023e03ac9896087b23b2 | |
parent | eaf649e9fe6685f4c5a392cd0e16df5fd6660b7c (diff) |
Preempt-RCU: update RCU Documentation.
This patch updates the RCU documentation to reflect preemptible RCU as
well as recent publications.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Reviewed-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-rw-r--r-- | Documentation/RCU/RTFP.txt | 210 | ||||
-rw-r--r-- | Documentation/RCU/rcu.txt | 19 | ||||
-rw-r--r-- | Documentation/RCU/torture.txt | 11 |
3 files changed, 221 insertions, 19 deletions
diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt index 6221464d1a7e..39ad8f56783a 100644 --- a/Documentation/RCU/RTFP.txt +++ b/Documentation/RCU/RTFP.txt | |||
@@ -9,8 +9,8 @@ The first thing resembling RCU was published in 1980, when Kung and Lehman | |||
9 | [Kung80] recommended use of a garbage collector to defer destruction | 9 | [Kung80] recommended use of a garbage collector to defer destruction |
10 | of nodes in a parallel binary search tree in order to simplify its | 10 | of nodes in a parallel binary search tree in order to simplify its |
11 | implementation. This works well in environments that have garbage | 11 | implementation. This works well in environments that have garbage |
12 | collectors, but current production garbage collectors incur significant | 12 | collectors, but most production garbage collectors incur significant |
13 | read-side overhead. | 13 | overhead. |
14 | 14 | ||
15 | In 1982, Manber and Ladner [Manber82,Manber84] recommended deferring | 15 | In 1982, Manber and Ladner [Manber82,Manber84] recommended deferring |
16 | destruction until all threads running at that time have terminated, again | 16 | destruction until all threads running at that time have terminated, again |
@@ -99,16 +99,25 @@ locking, reduces contention, reduces memory latency for readers, and | |||
99 | parallelizes pipeline stalls and memory latency for writers. However, | 99 | parallelizes pipeline stalls and memory latency for writers. However, |
100 | these techniques still impose significant read-side overhead in the | 100 | these techniques still impose significant read-side overhead in the |
101 | form of memory barriers. Researchers at Sun worked along similar lines | 101 | form of memory barriers. Researchers at Sun worked along similar lines |
102 | in the same timeframe [HerlihyLM02,HerlihyLMS03]. These techniques | 102 | in the same timeframe [HerlihyLM02]. These techniques can be thought |
103 | can be thought of as inside-out reference counts, where the count is | 103 | of as inside-out reference counts, where the count is represented by the |
104 | represented by the number of hazard pointers referencing a given data | 104 | number of hazard pointers referencing a given data structure (rather than |
105 | structure (rather than the more conventional counter field within the | 105 | the more conventional counter field within the data structure itself). |
106 | data structure itself). | 106 | |
107 | By the same token, RCU can be thought of as a "bulk reference count", | ||
108 | where some form of reference counter covers all reference by a given CPU | ||
109 | or thread during a set timeframe. This timeframe is related to, but | ||
110 | not necessarily exactly the same as, an RCU grace period. In classic | ||
111 | RCU, the reference counter is the per-CPU bit in the "bitmask" field, | ||
112 | and each such bit covers all references that might have been made by | ||
113 | the corresponding CPU during the prior grace period. Of course, RCU | ||
114 | can be thought of in other terms as well. | ||
107 | 115 | ||
108 | In 2003, the K42 group described how RCU could be used to create | 116 | In 2003, the K42 group described how RCU could be used to create |
109 | hot-pluggable implementations of operating-system functions. Later that | 117 | hot-pluggable implementations of operating-system functions [Appavoo03a]. |
110 | year saw a paper describing an RCU implementation of System V IPC | 118 | Later that year saw a paper describing an RCU implementation of System |
111 | [Arcangeli03], and an introduction to RCU in Linux Journal [McKenney03a]. | 119 | V IPC [Arcangeli03], and an introduction to RCU in Linux Journal |
120 | [McKenney03a]. | ||
112 | 121 | ||
113 | 2004 has seen a Linux-Journal article on use of RCU in dcache | 122 | 2004 has seen a Linux-Journal article on use of RCU in dcache |
114 | [McKenney04a], a performance comparison of locking to RCU on several | 123 | [McKenney04a], a performance comparison of locking to RCU on several |
@@ -117,10 +126,19 @@ number of operating-system kernels [PaulEdwardMcKenneyPhD], a paper | |||
117 | describing how to make RCU safe for soft-realtime applications [Sarma04c], | 126 | describing how to make RCU safe for soft-realtime applications [Sarma04c], |
118 | and a paper describing SELinux performance with RCU [JamesMorris04b]. | 127 | and a paper describing SELinux performance with RCU [JamesMorris04b]. |
119 | 128 | ||
120 | 2005 has seen further adaptation of RCU to realtime use, permitting | 129 | 2005 brought further adaptation of RCU to realtime use, permitting |
121 | preemption of RCU realtime critical sections [PaulMcKenney05a, | 130 | preemption of RCU realtime critical sections [PaulMcKenney05a, |
122 | PaulMcKenney05b]. | 131 | PaulMcKenney05b]. |
123 | 132 | ||
133 | 2006 saw the first best-paper award for an RCU paper [ThomasEHart2006a], | ||
134 | as well as further work on efficient implementations of preemptible | ||
135 | RCU [PaulEMcKenney2006b], but priority-boosting of RCU read-side critical | ||
136 | sections proved elusive. An RCU implementation permitting general | ||
137 | blocking in read-side critical sections appeared [PaulEMcKenney2006c], | ||
138 | Robert Olsson described an RCU-protected trie-hash combination | ||
139 | [RobertOlsson2006a]. | ||
140 | |||
141 | |||
124 | Bibtex Entries | 142 | Bibtex Entries |
125 | 143 | ||
126 | @article{Kung80 | 144 | @article{Kung80 |
@@ -203,6 +221,41 @@ Bibtex Entries | |||
203 | ,Address="New Orleans, LA" | 221 | ,Address="New Orleans, LA" |
204 | } | 222 | } |
205 | 223 | ||
224 | @conference{Pu95a, | ||
225 | Author = "Calton Pu and Tito Autrey and Andrew Black and Charles Consel and | ||
226 | Crispin Cowan and Jon Inouye and Lakshmi Kethana and Jonathan Walpole and | ||
227 | Ke Zhang", | ||
228 | Title = "Optimistic Incremental Specialization: Streamlining a Commercial | ||
229 | Operating System", | ||
230 | Booktitle = "15\textsuperscript{th} ACM Symposium on | ||
231 | Operating Systems Principles (SOSP'95)", | ||
232 | address = "Copper Mountain, CO", | ||
233 | month="December", | ||
234 | year="1995", | ||
235 | pages="314-321", | ||
236 | annotation=" | ||
237 | Uses a replugger, but with a flag to signal when people are | ||
238 | using the resource at hand. Only one reader at a time. | ||
239 | " | ||
240 | } | ||
241 | |||
242 | @conference{Cowan96a, | ||
243 | Author = "Crispin Cowan and Tito Autrey and Charles Krasic and | ||
244 | Calton Pu and Jonathan Walpole", | ||
245 | Title = "Fast Concurrent Dynamic Linking for an Adaptive Operating System", | ||
246 | Booktitle = "International Conference on Configurable Distributed Systems | ||
247 | (ICCDS'96)", | ||
248 | address = "Annapolis, MD", | ||
249 | month="May", | ||
250 | year="1996", | ||
251 | pages="108", | ||
252 | isbn="0-8186-7395-8", | ||
253 | annotation=" | ||
254 | Uses a replugger, but with a counter to signal when people are | ||
255 | using the resource at hand. Allows multiple readers. | ||
256 | " | ||
257 | } | ||
258 | |||
206 | @techreport{Slingwine95 | 259 | @techreport{Slingwine95 |
207 | ,author="John D. Slingwine and Paul E. McKenney" | 260 | ,author="John D. Slingwine and Paul E. McKenney" |
208 | ,title="Apparatus and Method for Achieving Reduced Overhead Mutual | 261 | ,title="Apparatus and Method for Achieving Reduced Overhead Mutual |
@@ -312,6 +365,49 @@ Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell" | |||
312 | [Viewed June 23, 2004]" | 365 | [Viewed June 23, 2004]" |
313 | } | 366 | } |
314 | 367 | ||
368 | @conference{Michael02a | ||
369 | ,author="Maged M. Michael" | ||
370 | ,title="Safe Memory Reclamation for Dynamic Lock-Free Objects Using Atomic | ||
371 | Reads and Writes" | ||
372 | ,Year="2002" | ||
373 | ,Month="August" | ||
374 | ,booktitle="{Proceedings of the 21\textsuperscript{st} Annual ACM | ||
375 | Symposium on Principles of Distributed Computing}" | ||
376 | ,pages="21-30" | ||
377 | ,annotation=" | ||
378 | Each thread keeps an array of pointers to items that it is | ||
379 | currently referencing. Sort of an inside-out garbage collection | ||
380 | mechanism, but one that requires the accessing code to explicitly | ||
381 | state its needs. Also requires read-side memory barriers on | ||
382 | most architectures. | ||
383 | " | ||
384 | } | ||
385 | |||
386 | @conference{Michael02b | ||
387 | ,author="Maged M. Michael" | ||
388 | ,title="High Performance Dynamic Lock-Free Hash Tables and List-Based Sets" | ||
389 | ,Year="2002" | ||
390 | ,Month="August" | ||
391 | ,booktitle="{Proceedings of the 14\textsuperscript{th} Annual ACM | ||
392 | Symposium on Parallel | ||
393 | Algorithms and Architecture}" | ||
394 | ,pages="73-82" | ||
395 | ,annotation=" | ||
396 | Like the title says... | ||
397 | " | ||
398 | } | ||
399 | |||
400 | @InProceedings{HerlihyLM02 | ||
401 | ,author={Maurice Herlihy and Victor Luchangco and Mark Moir} | ||
402 | ,title="The Repeat Offender Problem: A Mechanism for Supporting Dynamic-Sized, | ||
403 | Lock-Free Data Structures" | ||
404 | ,booktitle={Proceedings of 16\textsuperscript{th} International | ||
405 | Symposium on Distributed Computing} | ||
406 | ,year=2002 | ||
407 | ,month="October" | ||
408 | ,pages="339-353" | ||
409 | } | ||
410 | |||
315 | @article{Appavoo03a | 411 | @article{Appavoo03a |
316 | ,author="J. Appavoo and K. Hui and C. A. N. Soules and R. W. Wisniewski and | 412 | ,author="J. Appavoo and K. Hui and C. A. N. Soules and R. W. Wisniewski and |
317 | D. M. {Da Silva} and O. Krieger and M. A. Auslander and D. J. Edelsohn and | 413 | D. M. {Da Silva} and O. Krieger and M. A. Auslander and D. J. Edelsohn and |
@@ -447,3 +543,95 @@ Oregon Health and Sciences University" | |||
447 | Realtime turns into making RCU yet more realtime friendly. | 543 | Realtime turns into making RCU yet more realtime friendly. |
448 | " | 544 | " |
449 | } | 545 | } |
546 | |||
547 | @conference{ThomasEHart2006a | ||
548 | ,Author="Thomas E. Hart and Paul E. McKenney and Angela Demke Brown" | ||
549 | ,Title="Making Lockless Synchronization Fast: Performance Implications | ||
550 | of Memory Reclamation" | ||
551 | ,Booktitle="20\textsuperscript{th} {IEEE} International Parallel and | ||
552 | Distributed Processing Symposium" | ||
553 | ,month="April" | ||
554 | ,year="2006" | ||
555 | ,day="25-29" | ||
556 | ,address="Rhodes, Greece" | ||
557 | ,annotation=" | ||
558 | Compares QSBR (AKA "classic RCU"), HPBR, EBR, and lock-free | ||
559 | reference counting. | ||
560 | " | ||
561 | } | ||
562 | |||
563 | @Conference{PaulEMcKenney2006b | ||
564 | ,Author="Paul E. McKenney and Dipankar Sarma and Ingo Molnar and | ||
565 | Suparna Bhattacharya" | ||
566 | ,Title="Extending RCU for Realtime and Embedded Workloads" | ||
567 | ,Booktitle="{Ottawa Linux Symposium}" | ||
568 | ,Month="July" | ||
569 | ,Year="2006" | ||
570 | ,pages="v2 123-138" | ||
571 | ,note="Available: | ||
572 | \url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184} | ||
573 | \url{http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf} | ||
574 | [Viewed January 1, 2007]" | ||
575 | ,annotation=" | ||
576 | Described how to improve the -rt implementation of realtime RCU. | ||
577 | " | ||
578 | } | ||
579 | |||
580 | @unpublished{PaulEMcKenney2006c | ||
581 | ,Author="Paul E. McKenney" | ||
582 | ,Title="Sleepable {RCU}" | ||
583 | ,month="October" | ||
584 | ,day="9" | ||
585 | ,year="2006" | ||
586 | ,note="Available: | ||
587 | \url{http://lwn.net/Articles/202847/} | ||
588 | Revised: | ||
589 | \url{http://www.rdrop.com/users/paulmck/RCU/srcu.2007.01.14a.pdf} | ||
590 | [Viewed August 21, 2006]" | ||
591 | ,annotation=" | ||
592 | LWN article introducing SRCU. | ||
593 | " | ||
594 | } | ||
595 | |||
596 | @unpublished{RobertOlsson2006a | ||
597 | ,Author="Robert Olsson and Stefan Nilsson" | ||
598 | ,Title="{TRASH}: A dynamic {LC}-trie and hash data structure" | ||
599 | ,month="August" | ||
600 | ,day="18" | ||
601 | ,year="2006" | ||
602 | ,note="Available: | ||
603 | \url{http://www.nada.kth.se/~snilsson/public/papers/trash/trash.pdf} | ||
604 | [Viewed February 24, 2007]" | ||
605 | ,annotation=" | ||
606 | RCU-protected dynamic trie-hash combination. | ||
607 | " | ||
608 | } | ||
609 | |||
610 | @unpublished{ThomasEHart2007a | ||
611 | ,Author="Thomas E. Hart and Paul E. McKenney and Angela Demke Brown and Jonathan Walpole" | ||
612 | ,Title="Performance of memory reclamation for lockless synchronization" | ||
613 | ,journal="J. Parallel Distrib. Comput." | ||
614 | ,year="2007" | ||
615 | ,note="To appear in J. Parallel Distrib. Comput. | ||
616 | \url{doi=10.1016/j.jpdc.2007.04.010}" | ||
617 | ,annotation={ | ||
618 | Compares QSBR (AKA "classic RCU"), HPBR, EBR, and lock-free | ||
619 | reference counting. Journal version of ThomasEHart2006a. | ||
620 | } | ||
621 | } | ||
622 | |||
623 | @unpublished{PaulEMcKenney2007QRCUspin | ||
624 | ,Author="Paul E. McKenney" | ||
625 | ,Title="Using Promela and Spin to verify parallel algorithms" | ||
626 | ,month="August" | ||
627 | ,day="1" | ||
628 | ,year="2007" | ||
629 | ,note="Available: | ||
630 | \url{http://lwn.net/Articles/243851/} | ||
631 | [Viewed September 8, 2007]" | ||
632 | ,annotation=" | ||
633 | LWN article describing Promela and spin, and also using Oleg | ||
634 | Nesterov's QRCU as an example (with Paul McKenney's fastpath). | ||
635 | " | ||
636 | } | ||
637 | |||
diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt index f84407cba816..95821a29ae41 100644 --- a/Documentation/RCU/rcu.txt +++ b/Documentation/RCU/rcu.txt | |||
@@ -36,6 +36,14 @@ o How can the updater tell when a grace period has completed | |||
36 | executed in user mode, or executed in the idle loop, we can | 36 | executed in user mode, or executed in the idle loop, we can |
37 | safely free up that item. | 37 | safely free up that item. |
38 | 38 | ||
39 | Preemptible variants of RCU (CONFIG_PREEMPT_RCU) get the | ||
40 | same effect, but require that the readers manipulate CPU-local | ||
41 | counters. These counters allow limited types of blocking | ||
42 | within RCU read-side critical sections. SRCU also uses | ||
43 | CPU-local counters, and permits general blocking within | ||
44 | RCU read-side critical sections. These two variants of | ||
45 | RCU detect grace periods by sampling these counters. | ||
46 | |||
39 | o If I am running on a uniprocessor kernel, which can only do one | 47 | o If I am running on a uniprocessor kernel, which can only do one |
40 | thing at a time, why should I wait for a grace period? | 48 | thing at a time, why should I wait for a grace period? |
41 | 49 | ||
@@ -46,7 +54,10 @@ o How can I see where RCU is currently used in the Linux kernel? | |||
46 | Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu", | 54 | Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu", |
47 | "rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh", | 55 | "rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh", |
48 | "srcu_read_lock", "srcu_read_unlock", "synchronize_rcu", | 56 | "srcu_read_lock", "srcu_read_unlock", "synchronize_rcu", |
49 | "synchronize_net", and "synchronize_srcu". | 57 | "synchronize_net", "synchronize_srcu", and the other RCU |
58 | primitives. Or grab one of the cscope databases from: | ||
59 | |||
60 | http://www.rdrop.com/users/paulmck/RCU/linuxusage/rculocktab.html | ||
50 | 61 | ||
51 | o What guidelines should I follow when writing code that uses RCU? | 62 | o What guidelines should I follow when writing code that uses RCU? |
52 | 63 | ||
@@ -67,7 +78,11 @@ o I hear that RCU is patented? What is with that? | |||
67 | 78 | ||
68 | o I hear that RCU needs work in order to support realtime kernels? | 79 | o I hear that RCU needs work in order to support realtime kernels? |
69 | 80 | ||
70 | Yes, work in progress. | 81 | This work is largely completed. Realtime-friendly RCU can be |
82 | enabled via the CONFIG_PREEMPT_RCU kernel configuration parameter. | ||
83 | However, work is in progress for enabling priority boosting of | ||
84 | preempted RCU read-side critical sections.This is needed if you | ||
85 | have CPU-bound realtime threads. | ||
71 | 86 | ||
72 | o Where can I find more information on RCU? | 87 | o Where can I find more information on RCU? |
73 | 88 | ||
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt index 25a3c3f7d378..2967a65269d8 100644 --- a/Documentation/RCU/torture.txt +++ b/Documentation/RCU/torture.txt | |||
@@ -46,12 +46,13 @@ stat_interval The number of seconds between output of torture | |||
46 | 46 | ||
47 | shuffle_interval | 47 | shuffle_interval |
48 | The number of seconds to keep the test threads affinitied | 48 | The number of seconds to keep the test threads affinitied |
49 | to a particular subset of the CPUs. Used in conjunction | 49 | to a particular subset of the CPUs, defaults to 5 seconds. |
50 | with test_no_idle_hz. | 50 | Used in conjunction with test_no_idle_hz. |
51 | 51 | ||
52 | test_no_idle_hz Whether or not to test the ability of RCU to operate in | 52 | test_no_idle_hz Whether or not to test the ability of RCU to operate in |
53 | a kernel that disables the scheduling-clock interrupt to | 53 | a kernel that disables the scheduling-clock interrupt to |
54 | idle CPUs. Boolean parameter, "1" to test, "0" otherwise. | 54 | idle CPUs. Boolean parameter, "1" to test, "0" otherwise. |
55 | Defaults to omitting this test. | ||
55 | 56 | ||
56 | torture_type The type of RCU to test: "rcu" for the rcu_read_lock() API, | 57 | torture_type The type of RCU to test: "rcu" for the rcu_read_lock() API, |
57 | "rcu_sync" for rcu_read_lock() with synchronous reclamation, | 58 | "rcu_sync" for rcu_read_lock() with synchronous reclamation, |
@@ -82,8 +83,6 @@ be evident. ;-) | |||
82 | 83 | ||
83 | The entries are as follows: | 84 | The entries are as follows: |
84 | 85 | ||
85 | o "ggp": The number of counter flips (or batches) since boot. | ||
86 | |||
87 | o "rtc": The hexadecimal address of the structure currently visible | 86 | o "rtc": The hexadecimal address of the structure currently visible |
88 | to readers. | 87 | to readers. |
89 | 88 | ||
@@ -117,8 +116,8 @@ o "Reader Pipe": Histogram of "ages" of structures seen by readers. | |||
117 | o "Reader Batch": Another histogram of "ages" of structures seen | 116 | o "Reader Batch": Another histogram of "ages" of structures seen |
118 | by readers, but in terms of counter flips (or batches) rather | 117 | by readers, but in terms of counter flips (or batches) rather |
119 | than in terms of grace periods. The legal number of non-zero | 118 | than in terms of grace periods. The legal number of non-zero |
120 | entries is again two. The reason for this separate view is | 119 | entries is again two. The reason for this separate view is that |
121 | that it is easier to get the third entry to show up in the | 120 | it is sometimes easier to get the third entry to show up in the |
122 | "Reader Batch" list than in the "Reader Pipe" list. | 121 | "Reader Batch" list than in the "Reader Pipe" list. |
123 | 122 | ||
124 | o "Free-Block Circulation": Shows the number of torture structures | 123 | o "Free-Block Circulation": Shows the number of torture structures |