author | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2010-01-14 19:10:57 -0500 |
---|---|---|
committer | Ingo Molnar <mingo@elte.hu> | 2010-01-16 04:25:22 -0500 |
commit | 4c54005ca438a8b46dd542b497d4f0dc2ca375e8 (patch) | |
tree | 4274fb9dcbd94480b93fecefcf83969db53461ba /Documentation/RCU/stallwarn.txt | |
parent | b6407e863934965cdc66cbc244d811ceeb6f4d77 (diff) |
rcu: 1Q2010 update for RCU documentation
Add expedited functions. Review documentation and update
obsolete verbiage. Also fix the advice for the RCU CPU-stall
kernel configuration parameter, and document RCU CPU-stall
warnings.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <12635142581866-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Diffstat (limited to 'Documentation/RCU/stallwarn.txt')
-rw-r--r-- | Documentation/RCU/stallwarn.txt | 58 |
1 files changed, 58 insertions, 0 deletions
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
new file mode 100644
index 000000000000..1423d2570d78
--- /dev/null
+++ b/Documentation/RCU/stallwarn.txt
@@ -0,0 +1,58 @@ | |||
1 | Using RCU's CPU Stall Detector | ||
2 | |||
3 | The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables | ||
4 | RCU's CPU stall detector, which detects conditions that unduly delay | ||
5 | RCU grace periods. The stall detector's idea of what constitutes | ||
6 | "unduly delayed" is controlled by a set of C preprocessor macros: | ||
7 | |||
8 | RCU_SECONDS_TILL_STALL_CHECK | ||
9 | |||
10 | This macro defines the period of time that RCU will wait from | ||
11 | the beginning of a grace period until it issues an RCU CPU | ||
12 | stall warning. It is normally ten seconds. | ||
13 | |||
14 | RCU_SECONDS_TILL_STALL_RECHECK | ||
15 | |||
16 | This macro defines the period of time that RCU will wait after | ||
17 | issuing a stall warning until it issues another stall warning. | ||
18 | It is normally set to thirty seconds. | ||
19 | |||
20 | RCU_STALL_RAT_DELAY | ||
21 | |||
22 | The CPU stall detector tries to make the offending CPU rat on itself, | ||
23 | as this often gives better-quality stack traces. However, if | ||
24 | the offending CPU does not detect its own stall in the number | ||
25 | of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will | ||
26 | complain. This is normally set to two jiffies. | ||
27 | |||
28 | The following problems can result in an RCU CPU stall warning: | ||
29 | |||
30 | o A CPU looping in an RCU read-side critical section. | ||
31 | |||
32 | o A CPU looping with interrupts disabled. | ||
33 | |||
34 | o A CPU looping with preemption disabled. | ||
35 | |||
36 | o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel | ||
37 | without invoking schedule(). | ||
38 | |||
39 | o A bug in the RCU implementation. | ||
40 | |||
41 | o A hardware failure. This is quite unlikely, but has occurred | ||
42 | at least once in a former life. A CPU failed in a running system, | ||
43 | becoming unresponsive, but not causing an immediate crash. | ||
44 | This resulted in a series of RCU CPU stall warnings, eventually | ||
45 | leading to the realization that the CPU had failed. | ||
46 | |||
47 | The RCU, RCU-sched, and RCU-bh implementations have CPU stall warnings. | ||
48 | SRCU does not do so directly, but its calls to synchronize_sched() will | ||
49 | result in RCU-sched detecting any CPU stalls that might be occurring. | ||
50 | |||
51 | To diagnose the cause of the stall, inspect the stack traces. The offending | ||
52 | function will usually be near the top of the stack. If you have a series | ||
53 | of stall warnings from a single extended stall, comparing the stack traces | ||
54 | can often help determine where the stall is occurring, which will usually | ||
55 | be in the function nearest the top of the stack that stays the same from | ||
56 | trace to trace. | ||
57 | |||
58 | RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE. | ||
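
The way the three macros described in the new file interact can be sketched as a small userspace model. This is not the kernel's implementation: the struct, the two function names, and the HZ value of 1000 are assumptions made for illustration, and the macro values are simply the defaults the text calls "normal". The model only shows the timing rules as documented: a first warning after RCU_SECONDS_TILL_STALL_CHECK, later warnings spaced by RCU_SECONDS_TILL_STALL_RECHECK, and other CPUs holding back by RCU_STALL_RAT_DELAY jiffies so the offending CPU can report itself first.

```c
/* Userspace sketch only; names other than the three macros are hypothetical. */
#define HZ                             1000 /* assumed jiffies per second */
#define RCU_SECONDS_TILL_STALL_CHECK   10   /* seconds, per the text */
#define RCU_SECONDS_TILL_STALL_RECHECK 30   /* seconds, per the text */
#define RCU_STALL_RAT_DELAY            2    /* jiffies, per the text */

struct stall_model {
	unsigned long gp_start;   /* jiffy at which the grace period began */
	unsigned long next_check; /* jiffy at which the next warning is due */
};

/* Record the start of a grace period and arm the first deadline. */
static void stall_model_start(struct stall_model *m, unsigned long now)
{
	m->gp_start = now;
	m->next_check = now + RCU_SECONDS_TILL_STALL_CHECK * HZ;
}

/*
 * Return 1 if the calling CPU should issue a stall warning at time "now",
 * 0 otherwise.  A CPU other than the offending one waits an extra
 * RCU_STALL_RAT_DELAY jiffies before complaining.
 */
static int stall_model_check(struct stall_model *m, unsigned long now,
			     int is_offending_cpu)
{
	unsigned long deadline = m->next_check;

	if (!is_offending_cpu)
		deadline += RCU_STALL_RAT_DELAY;

	/* Signed difference keeps the comparison safe across wraparound. */
	if ((long)(now - deadline) < 0)
		return 0;

	/* A warning is issued; push the next one out by the recheck period. */
	m->next_check = now + RCU_SECONDS_TILL_STALL_RECHECK * HZ;
	return 1;
}
```

With these assumed values, a grace period that starts at jiffy 0 yields a self-reported warning at jiffy 10000, a warning from another CPU no earlier than jiffy 10002, and a follow-up warning 30000 jiffies after whichever warning was issued.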