aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorPaul E. McKenney <paulmck@linux.vnet.ibm.com>2017-08-17 15:29:22 -0400
committerPaul E. McKenney <paulmck@linux.vnet.ibm.com>2017-10-09 17:23:36 -0400
commitd3cf5176d0b17610b7c7f1562e496ec401fe01f8 (patch)
treee72b6084e2bdbbfb4143b02654af025e20f0ec37
parent3d916a443e97169a3d88765c4e0b07ac813f439f (diff)
documentation: Update RCU CPU stall warning messages
The RCU CPU stall warnings have morphed significantly since the last update, so this commit brings the documentation up to date. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-rw-r--r--Documentation/RCU/stallwarn.txt182
1 files changed, 105 insertions, 77 deletions
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 238acbd94917..a08f928c8557 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -171,67 +171,32 @@ Interpreting RCU's CPU Stall-Detector "Splats"
171For non-RCU-tasks flavors of RCU, when a CPU detects that it is stalling, 171For non-RCU-tasks flavors of RCU, when a CPU detects that it is stalling,
172it will print a message similar to the following: 172it will print a message similar to the following:
173 173
174INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies) 174 INFO: rcu_sched detected stalls on CPUs/tasks:
175 2-...: (3 GPs behind) idle=06c/0/0 softirq=1453/1455 fqs=0
176 16-...: (0 ticks this GP) idle=81c/0/0 softirq=764/764 fqs=0
177 (detected by 32, t=2603 jiffies, g=7073, c=7072, q=625)
175 178
176This message indicates that CPU 5 detected that it was causing a stall, 179This message indicates that CPU 32 detected that CPUs 2 and 16 were both
177and that the stall was affecting RCU-sched. This message will normally be 180causing stalls, and that the stall was affecting RCU-sched. This message
178followed by a stack dump of the offending CPU. On TREE_RCU kernel builds,
179RCU and RCU-sched are implemented by the same underlying mechanism,
180while on PREEMPT_RCU kernel builds, RCU is instead implemented
181by rcu_preempt_state.
182
183On the other hand, if the offending CPU fails to print out a stall-warning
184message quickly enough, some other CPU will print a message similar to
185the following:
186
187INFO: rcu_bh_state detected stalls on CPUs/tasks: { 3 5 } (detected by 2, 2502 jiffies)
188
189This message indicates that CPU 2 detected that CPUs 3 and 5 were both
190causing stalls, and that the stall was affecting RCU-bh. This message
191will normally be followed by stack dumps for each CPU. Please note that 181will normally be followed by stack dumps for each CPU. Please note that
192PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, 182PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, and that
193and that the tasks will be indicated by PID, for example, "P3421". 183the tasks will be indicated by PID, for example, "P3421". It is even
194It is even possible for a rcu_preempt_state stall to be caused by both 184possible for a rcu_preempt_state stall to be caused by both CPUs -and-
195CPUs -and- tasks, in which case the offending CPUs and tasks will all 185tasks, in which case the offending CPUs and tasks will all be called
196be called out in the list. 186out in the list.
197
198Finally, if the grace period ends just as the stall warning starts
199printing, there will be a spurious stall-warning message:
200
201INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffies)
202
203This is rare, but does happen from time to time in real life. It is also
204possible for a zero-jiffy stall to be flagged in this case, depending
205on how the stall warning and the grace-period initialization happen to
206interact. Please note that it is not possible to entirely eliminate this
207sort of false positive without resorting to things like stop_machine(),
208which is overkill for this sort of problem.
209
210Recent kernels will print a long form of the stall-warning message:
211
212 INFO: rcu_preempt detected stall on CPU
213 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 softirq=82/543
214 (t=65000 jiffies)
215
216In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed:
217
218 INFO: rcu_preempt detected stall on CPU
219 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D
220 (t=65000 jiffies)
221 187
222The "(64628 ticks this GP)" indicates that this CPU has taken more 188CPU 2's "(3 GPs behind)" indicates that this CPU has not interacted with
223than 64,000 scheduling-clock interrupts during the current stalled 189the RCU core for the past three grace periods. In contrast, CPU 16's "(0
224grace period. If the CPU was not yet aware of the current grace 190ticks this GP)" indicates that this CPU has not taken any scheduling-clock
225period (for example, if it was offline), then this part of the message 191interrupts during the current stalled grace period.
226indicates how many grace periods behind the CPU is.
227 192
228The "idle=" portion of the message prints the dyntick-idle state. 193The "idle=" portion of the message prints the dyntick-idle state.
229The hex number before the first "/" is the low-order 12 bits of the 194The hex number before the first "/" is the low-order 12 bits of the
230dynticks counter, which will have an even-numbered value if the CPU is 195dynticks counter, which will have an even-numbered value if the CPU
231in dyntick-idle mode and an odd-numbered value otherwise. The hex 196is in dyntick-idle mode and an odd-numbered value otherwise. The hex
232number between the two "/"s is the value of the nesting, which will 197number between the two "/"s is the value of the nesting, which will be
233be a small positive number if in the idle loop and a very large positive 198a small non-negative number if in the idle loop (as shown above) and a
234number (as shown above) otherwise. 199very large positive number otherwise.
235 200
236The "softirq=" portion of the message tracks the number of RCU softirq 201The "softirq=" portion of the message tracks the number of RCU softirq
237handlers that the stalled CPU has executed. The number before the "/" 202handlers that the stalled CPU has executed. The number before the "/"
@@ -246,24 +211,72 @@ handlers are no longer able to execute on this CPU. This can happen if
246the stalled CPU is spinning with interrupts are disabled, or, in -rt 211the stalled CPU is spinning with interrupts are disabled, or, in -rt
247kernels, if a high-priority process is starving RCU's softirq handler. 212kernels, if a high-priority process is starving RCU's softirq handler.
248 213
249For CONFIG_RCU_FAST_NO_HZ kernels, the "last_accelerate:" prints the 214The "fps=" shows the number of force-quiescent-state idle/offline
250low-order 16 bits (in hex) of the jiffies counter when this CPU last 215detection passes that the grace-period kthread has made across this
251invoked rcu_try_advance_all_cbs() from rcu_needs_cpu() or last invoked 216CPU since the last time that this CPU noted the beginning of a grace
252rcu_accelerate_cbs() from rcu_prepare_for_idle(). The "nonlazy_posted:" 217period.
253prints the number of non-lazy callbacks posted since the last call to 218
254rcu_needs_cpu(). Finally, an "L" indicates that there are currently 219The "detected by" line indicates which CPU detected the stall (in this
255no non-lazy callbacks ("." is printed otherwise, as shown above) and 220case, CPU 32), how many jiffies have elapsed since the start of the
256"D" indicates that dyntick-idle processing is enabled ("." is printed 221grace period (in this case 2603), the number of the last grace period
257otherwise, for example, if disabled via the "nohz=" kernel boot parameter). 222to start and to complete (7073 and 7072, respectively), and an estimate
223of the total number of RCU callbacks queued across all CPUs (625 in
224this case).
225
226In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed
227for each CPU:
228
229 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D
230
231The "last_accelerate:" prints the low-order 16 bits (in hex) of the
232jiffies counter when this CPU last invoked rcu_try_advance_all_cbs()
233from rcu_needs_cpu() or last invoked rcu_accelerate_cbs() from
234rcu_prepare_for_idle(). The "nonlazy_posted:" prints the number
235of non-lazy callbacks posted since the last call to rcu_needs_cpu().
236Finally, an "L" indicates that there are currently no non-lazy callbacks
237("." is printed otherwise, as shown above) and "D" indicates that
238dyntick-idle processing is enabled ("." is printed otherwise, for example,
239if disabled via the "nohz=" kernel boot parameter).
240
241If the grace period ends just as the stall warning starts printing,
242there will be a spurious stall-warning message, which will include
243the following:
244
245 INFO: Stall ended before state dump start
246
247This is rare, but does happen from time to time in real life. It is also
248possible for a zero-jiffy stall to be flagged in this case, depending
249on how the stall warning and the grace-period initialization happen to
250interact. Please note that it is not possible to entirely eliminate this
251sort of false positive without resorting to things like stop_machine(),
252which is overkill for this sort of problem.
253
254If all CPUs and tasks have passed through quiescent states, but the
255grace period has nevertheless failed to end, the stall-warning splat
256will include something like the following:
257
258 All QSes seen, last rcu_preempt kthread activity 23807 (4297905177-4297881370), jiffies_till_next_fqs=3, root ->qsmask 0x0
259
260The "23807" indicates that it has been more than 23 thousand jiffies
261since the grace-period kthread ran. The "jiffies_till_next_fqs"
262indicates how frequently that kthread should run, giving the number
263of jiffies between force-quiescent-state scans, in this case three,
264which is way less than 23807. Finally, the root rcu_node structure's
265->qsmask field is printed, which will normally be zero.
258 266
259If the relevant grace-period kthread has been unable to run prior to 267If the relevant grace-period kthread has been unable to run prior to
260the stall warning, the following additional line is printed: 268the stall warning, as was the case in the "All QSes seen" line above,
269the following additional line is printed:
261 270
262 rcu_preempt kthread starved for 2023 jiffies! 271 kthread starved for 23807 jiffies! g7073 c7072 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
263 272
264Starving the grace-period kthreads of CPU time can of course result in 273Starving the grace-period kthreads of CPU time can of course result
265RCU CPU stall warnings even when all CPUs and tasks have passed through 274in RCU CPU stall warnings even when all CPUs and tasks have passed
266the required quiescent states. 275through the required quiescent states. The "g" and "c" numbers flag the
276number of the last grace period started and completed, respectively,
277the "f" precedes the ->gp_flags command to the grace-period kthread,
278the "RCU_GP_WAIT_FQS" indicates that the kthread is waiting for a short
279timeout, and the "state" precedes value of the task_struct ->state field.
267 280
268 281
269Multiple Warnings From One Stall 282Multiple Warnings From One Stall
@@ -280,13 +293,28 @@ Stall Warnings for Expedited Grace Periods
280If an expedited grace period detects a stall, it will place a message 293If an expedited grace period detects a stall, it will place a message
281like the following in dmesg: 294like the following in dmesg:
282 295
283 INFO: rcu_sched detected expedited stalls on CPUs: { 1 2 6 } 26009 jiffies s: 1043 296 INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 7-... } 21119 jiffies s: 73 root: 0x2/.
284 297
285This indicates that CPUs 1, 2, and 6 have failed to respond to a 298This indicates that CPU 7 has failed to respond to a reschedule IPI.
286reschedule IPI, that the expedited grace period has been going on for 299The three periods (".") following the CPU number indicate that the CPU
28726,009 jiffies, and that the expedited grace-period sequence counter is 300is online (otherwise the first period would instead have been "O"),
2881043. The fact that this last value is odd indicates that an expedited 301that the CPU was online at the beginning of the expedited grace period
289grace period is in flight. 302(otherwise the second period would have instead been "o"), and that
303the CPU has been online at least once since boot (otherwise, the third
304period would instead have been "N"). The number before the "jiffies"
305indicates that the expedited grace period has been going on for 21,119
306jiffies. The number following the "s:" indicates that the expedited
307grace-period sequence counter is 73. The fact that this last value is
308odd indicates that an expedited grace period is in flight. The number
309following "root:" is a bitmask that indicates which children of the root
310rcu_node structure correspond to CPUs and/or tasks that are blocking the
311current expedited grace period. If the tree had more than one level,
312additional hex numbers would be printed for the states of the other
313rcu_node structures in the tree.
314
315As with normal grace periods, PREEMPT_RCU builds can be stalled by
316tasks as well as by CPUs, and that the tasks will be indicated by PID,
317for example, "P3421".
290 318
291It is entirely possible to see stall warnings from normal and from 319It is entirely possible to see stall warnings from normal and from
292expedited grace periods at about the same time from the same run. 320expedited grace periods at about the same time during the same run.