aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorPaul E. McKenney <paul.mckenney@linaro.org>2012-01-20 20:35:55 -0500
committerPaul E. McKenney <paulmck@linux.vnet.ibm.com>2012-02-21 12:03:52 -0500
commit24cd7fd0eaa0d9f5e197ff77a83b006a86696068 (patch)
tree9dc9c058272951fa0e3d45dce111ef946ec4f59a /Documentation
parentc13f3757d0fcdcc2b7fc5d5e38da76b8913e6648 (diff)
rcu: Update stall-warning documentation
Add documentation of CONFIG_RCU_CPU_STALL_VERBOSE, CONFIG_RCU_CPU_STALL_INFO, and RCU_STALL_DELAY_DELTA. Describe multiple stall-warning messages from a single stall, and the timing of the subsequent messages. Add headings. Remove RCU_SECONDS_TILL_STALL_RECHECK because this value is now computed at runtime from RCU_CPU_STALL_TIMEOUT, so that sysfs changes to the timeout value now directly affect the RCU_SECONDS_TILL_STALL_RECHECK value. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/RCU/stallwarn.txt87
1 files changed, 80 insertions, 7 deletions
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 083d88cbc089..523364e4e1f1 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -12,14 +12,38 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
12 This kernel configuration parameter defines the period of time 12 This kernel configuration parameter defines the period of time
13 that RCU will wait from the beginning of a grace period until it 13 that RCU will wait from the beginning of a grace period until it
14 issues an RCU CPU stall warning. This time period is normally 14 issues an RCU CPU stall warning. This time period is normally
15 ten seconds. 15 sixty seconds.
16 16
17RCU_SECONDS_TILL_STALL_RECHECK 17 This configuration parameter may be changed at runtime via the
18 /sys/module/rcutree/parameters/rcu_cpu_stall_timeout, however
19 this parameter is checked only at the beginning of a cycle.
20 So if you are 30 seconds into a 70-second stall, setting this
21 sysfs parameter to (say) five will shorten the timeout for the
22 -next- stall, or the following warning for the current stall
23 (assuming the stall lasts long enough). It will not affect the
24 timing of the next warning for the current stall.
18 25
19 This macro defines the period of time that RCU will wait after 26 Stall-warning messages may be enabled and disabled completely via
20 issuing a stall warning until it issues another stall warning 27 /sys/module/rcutree/parameters/rcu_cpu_stall_suppress.
21 for the same stall. This time period is normally set to three 28
22 times the check interval plus thirty seconds. 29CONFIG_RCU_CPU_STALL_VERBOSE
30
31 This kernel configuration parameter causes the stall warning to
32 also dump the stacks of any tasks that are blocking the current
33 RCU-preempt grace period.
34
35RCU_CPU_STALL_INFO
36
37 This kernel configuration parameter causes the stall warning to
38 print out additional per-CPU diagnostic information, including
39 information on scheduling-clock ticks and RCU's idle-CPU tracking.
40
41RCU_STALL_DELAY_DELTA
42
43 Although the lockdep facility is extremely useful, it does add
44 some overhead. Therefore, under CONFIG_PROVE_RCU, the
45 RCU_STALL_DELAY_DELTA macro allows five extra seconds before
46 giving an RCU CPU stall warning message.
23 47
24RCU_STALL_RAT_DELAY 48RCU_STALL_RAT_DELAY
25 49
@@ -64,6 +88,54 @@ INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffi
64 88
65This is rare, but does happen from time to time in real life. 89This is rare, but does happen from time to time in real life.
66 90
91If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set,
92more information is printed with the stall-warning message, for example:
93
94 INFO: rcu_preempt detected stall on CPU
95 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0
96 (t=65000 jiffies)
97
98In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is
99printed:
100
101 INFO: rcu_preempt detected stall on CPU
102 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer=-1
103 (t=65000 jiffies)
104
105The "(64628 ticks this GP)" indicates that this CPU has taken more
106than 64,000 scheduling-clock interrupts during the current stalled
107grace period. If the CPU was not yet aware of the current grace
108period (for example, if it was offline), then this part of the message
109indicates how many grace periods behind the CPU is.
110
111The "idle=" portion of the message prints the dyntick-idle state.
112The hex number before the first "/" is the low-order 12 bits of the
113dynticks counter, which will have an even-numbered value if the CPU is
114in dyntick-idle mode and an odd-numbered value otherwise. The hex
115number between the two "/"s is the value of the nesting, which will
116be a small positive number if in the idle loop and a very large positive
117number (as shown above) otherwise.
118
119For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the
120CPU is not in the process of trying to force itself into dyntick-idle
121state, the "." indicates that the CPU has not given up forcing RCU
122into dyntick-idle mode (it would be "H" otherwise), and the "timer=-1"
123indicates that the CPU has not recented forced RCU into dyntick-idle
124mode (it would otherwise indicate the number of microseconds remaining
125in this forced state).
126
127
128Multiple Warnings From One Stall
129
130If a stall lasts long enough, multiple stall-warning messages will be
131printed for it. The second and subsequent messages are printed at
132longer intervals, so that the time between (say) the first and second
133message will be about three times the interval between the beginning
134of the stall and the first message.
135
136
137What Causes RCU CPU Stall Warnings?
138
67So your kernel printed an RCU CPU stall warning. The next question is 139So your kernel printed an RCU CPU stall warning. The next question is
68"What caused it?" The following problems can result in RCU CPU stall 140"What caused it?" The following problems can result in RCU CPU stall
69warnings: 141warnings:
@@ -128,4 +200,5 @@ is occurring, which will usually be in the function nearest the top of
128that portion of the stack which remains the same from trace to trace. 200that portion of the stack which remains the same from trace to trace.
129If you can reliably trigger the stall, ftrace can be quite helpful. 201If you can reliably trigger the stall, ftrace can be quite helpful.
130 202
131RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE. 203RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE
204and with RCU's event tracing.