diff options
Diffstat (limited to 'Documentation/RCU')
-rw-r--r-- | Documentation/RCU/checklist.txt | 46 | ||||
-rw-r--r-- | Documentation/RCU/stallwarn.txt | 18 | ||||
-rw-r--r-- | Documentation/RCU/trace.txt | 13 |
3 files changed, 69 insertions, 8 deletions
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index 790d1a81237..0c134f8afc6 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt | |||
@@ -218,13 +218,22 @@ over a rather long period of time, but improvements are always welcome! | |||
218 | include: | 218 | include: |
219 | 219 | ||
220 | a. Keeping a count of the number of data-structure elements | 220 | a. Keeping a count of the number of data-structure elements |
221 | used by the RCU-protected data structure, including those | 221 | used by the RCU-protected data structure, including |
222 | waiting for a grace period to elapse. Enforce a limit | 222 | those waiting for a grace period to elapse. Enforce a |
223 | on this number, stalling updates as needed to allow | 223 | limit on this number, stalling updates as needed to allow |
224 | previously deferred frees to complete. | 224 | previously deferred frees to complete. Alternatively, |
225 | 225 | limit only the number awaiting deferred free rather than | |
226 | Alternatively, limit only the number awaiting deferred | 226 | the total number of elements. |
227 | free rather than the total number of elements. | 227 | |
228 | One way to stall the updates is to acquire the update-side | ||
229 | mutex. (Don't try this with a spinlock -- other CPUs | ||
230 | spinning on the lock could prevent the grace period | ||
231 | from ever ending.) Another way to stall the updates | ||
232 | is for the updates to use a wrapper function around | ||
233 | the memory allocator, so that this wrapper function | ||
234 | simulates OOM when there is too much memory awaiting an | ||
235 | RCU grace period. There are of course many other | ||
236 | variations on this theme. | ||
228 | 237 | ||
229 | b. Limiting update rate. For example, if updates occur only | 238 | b. Limiting update rate. For example, if updates occur only |
230 | once per hour, then no explicit rate limiting is required, | 239 | once per hour, then no explicit rate limiting is required, |
@@ -365,3 +374,26 @@ over a rather long period of time, but improvements are always welcome! | |||
365 | and the compiler to freely reorder code into and out of RCU | 374 | and the compiler to freely reorder code into and out of RCU |
366 | read-side critical sections. It is the responsibility of the | 375 | read-side critical sections. It is the responsibility of the |
367 | RCU update-side primitives to deal with this. | 376 | RCU update-side primitives to deal with this. |
377 | |||
378 | 17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and | ||
379 | the __rcu sparse checks to validate your RCU code. These | ||
380 | can help find problems as follows: | ||
381 | |||
382 | CONFIG_PROVE_RCU: check that accesses to RCU-protected data | ||
383 | structures are carried out under the proper RCU | ||
384 | read-side critical section, while holding the right | ||
385 | combination of locks, or whatever other conditions | ||
386 | are appropriate. | ||
387 | |||
388 | CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the | ||
389 | same object to call_rcu() (or friends) before an RCU | ||
390 | grace period has elapsed since the last time that you | ||
391 | passed that same object to call_rcu() (or friends). | ||
392 | |||
393 | __rcu sparse checks: tag the pointer to the RCU-protected data | ||
394 | structure with __rcu, and sparse will warn you if you | ||
395 | access that pointer without the services of one of the | ||
396 | variants of rcu_dereference(). | ||
397 | |||
398 | These debugging aids can help you find problems that are | ||
399 | otherwise extremely difficult to spot. | ||
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 44c6dcc93d6..862c08ef1fd 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt | |||
@@ -80,6 +80,24 @@ o A CPU looping with bottom halves disabled. This condition can | |||
80 | o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel | 80 | o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel |
81 | without invoking schedule(). | 81 | without invoking schedule(). |
82 | 82 | ||
83 | o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might | ||
84 | happen to preempt a low-priority task in the middle of an RCU | ||
85 | read-side critical section. This is especially damaging if | ||
86 | that low-priority task is not permitted to run on any other CPU, | ||
87 | in which case the next RCU grace period can never complete, which | ||
88 | will eventually cause the system to run out of memory and hang. | ||
89 | While the system is in the process of running itself out of | ||
90 | memory, you might see stall-warning messages. | ||
91 | |||
92 | o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that | ||
93 | is running at a higher priority than the RCU softirq threads. | ||
94 | This will prevent RCU callbacks from ever being invoked, | ||
95 | and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent | ||
96 | RCU grace periods from ever completing. Either way, the | ||
97 | system will eventually run out of memory and hang. In the | ||
98 | CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning | ||
99 | messages. | ||
100 | |||
83 | o A bug in the RCU implementation. | 101 | o A bug in the RCU implementation. |
84 | 102 | ||
85 | o A hardware failure. This is quite unlikely, but has occurred | 103 | o A hardware failure. This is quite unlikely, but has occurred |
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index efd8cc95c06..a851118775d 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt | |||
@@ -125,6 +125,17 @@ o "b" is the batch limit for this CPU. If more than this number | |||
125 | of RCU callbacks is ready to invoke, then the remainder will | 125 | of RCU callbacks is ready to invoke, then the remainder will |
126 | be deferred. | 126 | be deferred. |
127 | 127 | ||
128 | o "ci" is the number of RCU callbacks that have been invoked for | ||
129 | this CPU. Note that ci+ql is the number of callbacks that have | ||
130 | been registered in absence of CPU-hotplug activity. | ||
131 | |||
132 | o "co" is the number of RCU callbacks that have been orphaned due to | ||
133 | this CPU going offline. | ||
134 | |||
135 | o "ca" is the number of RCU callbacks that have been adopted due to | ||
136 | other CPUs going offline. Note that ci+co-ca+ql is the number of | ||
137 | RCU callbacks registered on this CPU. | ||
138 | |||
128 | There is also an rcu/rcudata.csv file with the same information in | 139 | There is also an rcu/rcudata.csv file with the same information in |
129 | comma-separated-variable spreadsheet format. | 140 | comma-separated-variable spreadsheet format. |
130 | 141 | ||
@@ -180,7 +191,7 @@ o "s" is the "signaled" state that drives force_quiescent_state()'s | |||
180 | 191 | ||
181 | o "jfq" is the number of jiffies remaining for this grace period | 192 | o "jfq" is the number of jiffies remaining for this grace period |
182 | before force_quiescent_state() is invoked to help push things | 193 | before force_quiescent_state() is invoked to help push things |
183 | along. Note that CPUs in dyntick-idle mode thoughout the grace | 194 | along. Note that CPUs in dyntick-idle mode throughout the grace |
184 | period will not report on their own, but rather must be check by | 195 | period will not report on their own, but rather must be check by |
185 | some other CPU via force_quiescent_state(). | 196 | some other CPU via force_quiescent_state(). |
186 | 197 | ||