author | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2017-05-15 18:30:32 -0400 |
---|---|---|
committer | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2017-06-08 21:52:43 -0400 |
commit | ae91aa0adb14dc33114d566feca2f7cb7a96b8b7 (patch) | |
tree | d52dfad4c7c9be2acac7c8f0c6890acd38547d60 /Documentation/RCU | |
parent | bd8cc5a062f41e334596edbe823e2fa0adddd1b7 (diff) |
rcu: Remove debugfs tracing
RCU's debugfs tracing used to be the only reasonable low-level debug
information available, but ftrace and event tracing have since surpassed
the RCU debugfs level of usefulness. This commit therefore removes
RCU's debugfs tracing.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Diffstat (limited to 'Documentation/RCU')
-rw-r--r-- | Documentation/RCU/00-INDEX | 2 | ||||
-rw-r--r-- | Documentation/RCU/Design/Requirements/Requirements.html | 2 | ||||
-rw-r--r-- | Documentation/RCU/trace.txt | 535 |
3 files changed, 1 insertions, 538 deletions
diff --git a/Documentation/RCU/00-INDEX b/Documentation/RCU/00-INDEX index 1672573b037a..f46980c060aa 100644 --- a/Documentation/RCU/00-INDEX +++ b/Documentation/RCU/00-INDEX | |||
@@ -28,8 +28,6 @@ stallwarn.txt | |||
28 | - RCU CPU stall warnings (module parameter rcu_cpu_stall_suppress) | 28 | - RCU CPU stall warnings (module parameter rcu_cpu_stall_suppress) |
29 | torture.txt | 29 | torture.txt |
30 | - RCU Torture Test Operation (CONFIG_RCU_TORTURE_TEST) | 30 | - RCU Torture Test Operation (CONFIG_RCU_TORTURE_TEST) |
31 | trace.txt | ||
32 | - CONFIG_RCU_TRACE debugfs files and formats | ||
33 | UP.txt | 31 | UP.txt |
34 | - RCU on Uniprocessor Systems | 32 | - RCU on Uniprocessor Systems |
35 | whatisRCU.txt | 33 | whatisRCU.txt |
diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html index 0e6550a8c926..95b30fa25d56 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.html +++ b/Documentation/RCU/Design/Requirements/Requirements.html | |||
@@ -2034,7 +2034,7 @@ guard against mishaps and misuse: | |||
2034 | some other synchronization mechanism, for example, reference | 2034 | some other synchronization mechanism, for example, reference |
2035 | counting. | 2035 | counting. |
2036 | <li> In kernels built with <tt>CONFIG_RCU_TRACE=y</tt>, RCU-related | 2036 | <li> In kernels built with <tt>CONFIG_RCU_TRACE=y</tt>, RCU-related |
2037 | information is provided via both debugfs and event tracing. | 2037 | information is provided via event tracing. |
2038 | <li> Open-coded use of <tt>rcu_assign_pointer()</tt> and | 2038 | <li> Open-coded use of <tt>rcu_assign_pointer()</tt> and |
2039 | <tt>rcu_dereference()</tt> to create typical linked | 2039 | <tt>rcu_dereference()</tt> to create typical linked |
2040 | data structures can be surprisingly error-prone. | 2040 | data structures can be surprisingly error-prone. |
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt deleted file mode 100644 index 6549012033f9..000000000000 --- a/Documentation/RCU/trace.txt +++ /dev/null | |||
@@ -1,535 +0,0 @@ | |||
1 | CONFIG_RCU_TRACE debugfs Files and Formats | ||
2 | |||
3 | |||
4 | The rcutree and rcutiny implementations of RCU provide debugfs trace | ||
5 | output that summarizes counters and state. This information is useful for | ||
6 | debugging RCU itself, and can sometimes also help to debug abuses of RCU. | ||
7 | The following sections describe the debugfs files and formats, first | ||
8 | for rcutree and next for rcutiny. | ||
9 | |||
10 | |||
11 | CONFIG_TREE_RCU and CONFIG_PREEMPT_RCU debugfs Files and Formats | ||
12 | |||
13 | These implementations of RCU provide several debugfs directories under the | ||
14 | top-level directory "rcu": | ||
15 | |||
16 | rcu/rcu_bh | ||
17 | rcu/rcu_preempt | ||
18 | rcu/rcu_sched | ||
19 | |||
20 | Each directory contains files for the corresponding flavor of RCU. | ||
21 | Note that rcu/rcu_preempt is only present for CONFIG_PREEMPT_RCU. | ||
22 | For CONFIG_TREE_RCU, the RCU flavor maps onto the RCU-sched flavor, | ||
23 | so that activity for both appears in rcu/rcu_sched. | ||
24 | |||
25 | In addition, the following file appears in the top-level directory: | ||
26 | rcu/rcutorture. This file displays rcutorture test progress. The output | ||
27 | of "cat rcu/rcutorture" looks as follows: | ||
28 | |||
29 | rcutorture test sequence: 0 (test in progress) | ||
30 | rcutorture update version number: 615 | ||
31 | |||
32 | The first line shows the number of rcutorture tests that have completed | ||
33 | since boot. If a test is currently running, the "(test in progress)" | ||
34 | string will appear as shown above. The second line shows the number of | ||
35 | update cycles that the current test has started, or zero if there is | ||
36 | no test in progress. | ||
37 | |||
38 | |||
39 | Within each flavor directory (rcu/rcu_bh, rcu/rcu_sched, and possibly | ||
40 | also rcu/rcu_preempt) the following files will be present: | ||
41 | |||
42 | rcudata: | ||
43 | Displays fields in struct rcu_data. | ||
44 | rcuexp: | ||
45 | Displays statistics for expedited grace periods. | ||
46 | rcugp: | ||
47 | Displays grace-period counters. | ||
48 | rcuhier: | ||
49 | Displays the struct rcu_node hierarchy. | ||
50 | rcu_pending: | ||
51 | Displays counts of the reasons rcu_pending() decided that RCU had | ||
52 | work to do. | ||
53 | rcuboost: | ||
54 | Displays RCU boosting statistics. Only present if | ||
55 | CONFIG_RCU_BOOST=y. | ||
56 | |||
57 | The output of "cat rcu/rcu_preempt/rcudata" looks as follows: | ||
58 | |||
59 | 0!c=30455 g=30456 cnq=1/0:1 dt=126535/140000000000000/0 df=2002 of=4 ql=0/0 qs=N... b=10 ci=74572 nci=0 co=1131 ca=716 | ||
60 | 1!c=30719 g=30720 cnq=1/0:0 dt=132007/140000000000000/0 df=1874 of=10 ql=0/0 qs=N... b=10 ci=123209 nci=0 co=685 ca=982 | ||
61 | 2!c=30150 g=30151 cnq=1/1:1 dt=138537/140000000000000/0 df=1707 of=8 ql=0/0 qs=N... b=10 ci=80132 nci=0 co=1328 ca=1458 | ||
62 | 3 c=31249 g=31250 cnq=1/1:0 dt=107255/140000000000000/0 df=1749 of=6 ql=0/450 qs=NRW. b=10 ci=151700 nci=0 co=509 ca=622 | ||
63 | 4!c=29502 g=29503 cnq=1/0:1 dt=83647/140000000000000/0 df=965 of=5 ql=0/0 qs=N... b=10 ci=65643 nci=0 co=1373 ca=1521 | ||
64 | 5 c=31201 g=31202 cnq=1/0:1 dt=70422/0/0 df=535 of=7 ql=0/0 qs=.... b=10 ci=58500 nci=0 co=764 ca=698 | ||
65 | 6!c=30253 g=30254 cnq=1/0:1 dt=95363/140000000000000/0 df=780 of=5 ql=0/0 qs=N... b=10 ci=100607 nci=0 co=1414 ca=1353 | ||
66 | 7 c=31178 g=31178 cnq=1/0:0 dt=91536/0/0 df=547 of=4 ql=0/0 qs=.... b=10 ci=109819 nci=0 co=1115 ca=969 | ||
67 | |||
68 | This file has one line per CPU, or eight for this 8-CPU system. | ||
69 | The fields are as follows: | ||
70 | |||
71 | o The number at the beginning of each line is the CPU number. | ||
72 | CPU numbers followed by an exclamation mark are offline, | ||
73 | but have been online at least once since boot. There will be | ||
74 | no output for CPUs that have never been online, which can be | ||
75 | a good thing in the surprisingly common case where NR_CPUS is | ||
76 | substantially larger than the number of actual CPUs. | ||
77 | |||
78 | o "c" is the count of grace periods that this CPU believes have | ||
79 | completed. Offlined CPUs and CPUs in dynticks idle mode may lag | ||
80 | quite a ways behind, for example, CPU 4 under "rcu_sched" above, | ||
81 | which has been offline through 16 RCU grace periods. It is not | ||
82 | unusual to see offline CPUs lagging by thousands of grace periods. | ||
83 | Note that although the grace-period number is an unsigned long, | ||
84 | it is printed out as a signed long to allow more human-friendly | ||
85 | representation near boot time. | ||
86 | |||
87 | o "g" is the count of grace periods that this CPU believes have | ||
88 | started. Again, offlined CPUs and CPUs in dynticks idle mode | ||
89 | may lag behind. If the "c" and "g" values are equal, this CPU | ||
90 | has already reported a quiescent state for the last RCU grace | ||
91 | period that it is aware of, otherwise, the CPU believes that it | ||
92 | owes RCU a quiescent state. | ||
93 | |||
94 | o "pq" indicates that this CPU has passed through a quiescent state | ||
95 | for the current grace period. It is possible for "pq" to be | ||
96 | "1" and "c" different than "g", which indicates that although | ||
97 | the CPU has passed through a quiescent state, either (1) this | ||
98 | CPU has not yet reported that fact, (2) some other CPU has not | ||
99 | yet reported for this grace period, or (3) both. | ||
100 | |||
101 | o "qp" indicates that RCU still expects a quiescent state from | ||
102 | this CPU. Offlined CPUs and CPUs in dyntick idle mode might | ||
103 | well have qp=1, which is OK: RCU is still ignoring them. | ||
104 | |||
105 | o "dt" is the current value of the dyntick counter that is incremented | ||
106 | when entering or leaving idle, either due to a context switch or | ||
107 | due to an interrupt. This number is even if the CPU is in idle | ||
108 | from RCU's viewpoint and odd otherwise. The number after the | ||
109 | first "/" is the interrupt nesting depth when in idle state, | ||
110 | or a large number added to the interrupt-nesting depth when | ||
111 | running a non-idle task. Some architectures do not accurately | ||
112 | count interrupt nesting when running in non-idle kernel context, | ||
113 | which can result in interesting anomalies such as negative | ||
114 | interrupt-nesting levels. The number after the second "/" | ||
115 | is the NMI nesting depth. | ||
116 | |||
117 | o "df" is the number of times that some other CPU has forced a | ||
118 | quiescent state on behalf of this CPU due to this CPU being in | ||
119 | idle state. | ||
120 | |||
121 | o "of" is the number of times that some other CPU has forced a | ||
122 | quiescent state on behalf of this CPU due to this CPU being | ||
123 | offline. In a perfect world, this might never happen, but it | ||
124 | turns out that offlining and onlining a CPU can take several grace | ||
125 | periods, and so there is likely to be an extended period of time | ||
126 | when RCU believes that the CPU is online when it really is not. | ||
127 | Please note that erring in the other direction (RCU believing a | ||
128 | CPU is offline when it is really alive and kicking) is a fatal | ||
129 | error, so it makes sense to err conservatively. | ||
130 | |||
131 | o "ql" is the number of RCU callbacks currently residing on | ||
132 | this CPU. The first number is the number of "lazy" callbacks | ||
133 | that are known to RCU to only be freeing memory, and the number | ||
134 | after the "/" is the total number of callbacks, lazy or not. | ||
135 | These counters count callbacks regardless of what phase of | ||
136 | grace-period processing that they are in (new, waiting for | ||
137 | grace period to start, waiting for grace period to end, ready | ||
138 | to invoke). | ||
139 | |||
140 | o "qs" gives an indication of the state of the callback queue | ||
141 | with four characters: | ||
142 | |||
143 | "N" Indicates that there are callbacks queued that are not | ||
144 | ready to be handled by the next grace period, and thus | ||
145 | will be handled by the grace period following the next | ||
146 | one. | ||
147 | |||
148 | "R" Indicates that there are callbacks queued that are | ||
149 | ready to be handled by the next grace period. | ||
150 | |||
151 | "W" Indicates that there are callbacks queued that are | ||
152 | waiting on the current grace period. | ||
153 | |||
154 | "D" Indicates that there are callbacks queued that have | ||
155 | already been handled by a prior grace period, and are | ||
156 | thus waiting to be invoked. Note that callbacks in | ||
157 | the process of being invoked are not counted here. | ||
158 | Callbacks in the process of being invoked are those | ||
159 | that have been removed from the rcu_data structure's | ||
160 | queues by rcu_do_batch(), but which have not yet been | ||
161 | invoked. | ||
162 | |||
163 | If there are no callbacks in a given one of the above states, | ||
164 | the corresponding character is replaced by ".". | ||
165 | |||
166 | o "b" is the batch limit for this CPU. If more than this number | ||
167 | of RCU callbacks is ready to invoke, then the remainder will | ||
168 | be deferred. | ||
169 | |||
170 | o "ci" is the number of RCU callbacks that have been invoked for | ||
171 | this CPU. Note that ci+nci+ql is the number of callbacks that have | ||
172 | been registered in absence of CPU-hotplug activity. | ||
173 | |||
174 | o "nci" is the number of RCU callbacks that have been offloaded from | ||
175 | this CPU. This will always be zero unless the kernel was built | ||
176 | with CONFIG_RCU_NOCB_CPU=y and the "rcu_nocbs=" kernel boot | ||
177 | parameter was specified. | ||
178 | |||
179 | o "co" is the number of RCU callbacks that have been orphaned due to | ||
180 | this CPU going offline. These orphaned callbacks have been moved | ||
181 | to an arbitrarily chosen online CPU. | ||
182 | |||
183 | o "ca" is the number of RCU callbacks that have been adopted by this | ||
184 | CPU due to other CPUs going offline. Note that ci+co-ca+ql is | ||
185 | the number of RCU callbacks registered on this CPU. | ||
186 | |||
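The rcudata field descriptions above can be turned into a small parser. The following is a minimal sketch, not part of the kernel tree: the function names are hypothetical, and it assumes that every field after the CPU number is a whitespace-separated key=value token, as in the sample output.

```python
import re

def parse_rcudata_line(line):
    """Split one rcudata line into CPU number, offline flag, and key=value fields."""
    m = re.match(r"\s*(\d+)(!?)\s*(.*)", line)
    if m is None:
        raise ValueError(f"unrecognized rcudata line: {line!r}")
    cpu, bang, rest = m.groups()
    # Assumption: every remaining token has the form key=value.
    fields = dict(tok.split("=", 1) for tok in rest.split())
    return {"cpu": int(cpu), "offline": bang == "!", "fields": fields}

def decode_qs(qs):
    """Map the four-character "qs" string to flags; "." means that state is empty."""
    return {name: ch == name for name, ch in zip("NRWD", qs)}

def idle_from_rcu_viewpoint(dt):
    """The first "dt" number is even when the CPU is idle from RCU's viewpoint."""
    return int(dt.split("/")[0]) % 2 == 0

sample = ("3 c=31249 g=31250 cnq=1/1:0 dt=107255/140000000000000/0 df=1749 "
          "of=6 ql=0/450 qs=NRW. b=10 ci=151700 nci=0 co=509 ca=622")
rec = parse_rcudata_line(sample)
lazy, total = (int(x) for x in rec["fields"]["ql"].split("/"))
owes_qs = rec["fields"]["c"] != rec["fields"]["g"]  # "c" != "g": CPU owes a quiescent state
print(rec["cpu"], rec["offline"], lazy, total, owes_qs)
print(decode_qs(rec["fields"]["qs"]), idle_from_rcu_viewpoint(rec["fields"]["dt"]))
```

Values are kept as strings because several fields ("dt", "ql", "cnq") are compound; callers convert only the pieces they need.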
187 | |||
188 | Kernels compiled with CONFIG_RCU_BOOST=y display the following from | ||
189 | /debug/rcu/rcu_preempt/rcudata: | ||
190 | |||
191 | 0!c=12865 g=12866 cnq=1/0:1 dt=83113/140000000000000/0 df=288 of=11 ql=0/0 qs=N... kt=0/O ktl=944 b=10 ci=60709 nci=0 co=748 ca=871 | ||
192 | 1 c=14407 g=14408 cnq=1/0:0 dt=100679/140000000000000/0 df=378 of=7 ql=0/119 qs=NRW. kt=0/W ktl=9b6 b=10 ci=109740 nci=0 co=589 ca=485 | ||
193 | 2 c=14407 g=14408 cnq=1/0:0 dt=105486/0/0 df=90 of=9 ql=0/89 qs=NRW. kt=0/W ktl=c0c b=10 ci=83113 nci=0 co=533 ca=490 | ||
194 | 3 c=14407 g=14408 cnq=1/0:0 dt=107138/0/0 df=142 of=8 ql=0/188 qs=NRW. kt=0/W ktl=b96 b=10 ci=121114 nci=0 co=426 ca=290 | ||
195 | 4 c=14405 g=14406 cnq=1/0:1 dt=50238/0/0 df=706 of=7 ql=0/0 qs=.... kt=0/W ktl=812 b=10 ci=34929 nci=0 co=643 ca=114 | ||
196 | 5!c=14168 g=14169 cnq=1/0:0 dt=45465/140000000000000/0 df=161 of=11 ql=0/0 qs=N... kt=0/O ktl=b4d b=10 ci=47712 nci=0 co=677 ca=722 | ||
197 | 6 c=14404 g=14405 cnq=1/0:0 dt=59454/0/0 df=94 of=6 ql=0/0 qs=.... kt=0/W ktl=e57 b=10 ci=55597 nci=0 co=701 ca=811 | ||
198 | 7 c=14407 g=14408 cnq=1/0:1 dt=68850/0/0 df=31 of=8 ql=0/0 qs=.... kt=0/W ktl=14bd b=10 ci=77475 nci=0 co=508 ca=1042 | ||
199 | |||
200 | This is similar to the output discussed above, but contains the following | ||
201 | additional fields: | ||
202 | |||
203 | o "kt" is the per-CPU kernel-thread state. The digit preceding | ||
204 | the first slash is zero if there is no work pending and 1 | ||
205 | otherwise. The character between the first pair of slashes is | ||
206 | as follows: | ||
207 | |||
208 | "S" The kernel thread is stopped, in other words, all | ||
209 | CPUs corresponding to this rcu_node structure are | ||
210 | offline. | ||
211 | |||
212 | "R" The kernel thread is running. | ||
213 | |||
214 | "W" The kernel thread is waiting because there is no work | ||
215 | for it to do. | ||
216 | |||
217 | "O" The kernel thread is waiting because it has been | ||
218 | forced off of its designated CPU or because its | ||
219 | ->cpus_allowed mask permits it to run on other than | ||
220 | its designated CPU. | ||
221 | |||
222 | "Y" The kernel thread is yielding to avoid hogging CPU. | ||
223 | |||
224 | "?" Unknown value, indicates a bug. | ||
225 | |||
226 | The number after the final slash is the CPU that the kthread | ||
227 | is actually running on. | ||
228 | |||
229 | This field is displayed only for CONFIG_RCU_BOOST kernels. | ||
230 | |||
231 | o "ktl" is the low-order 16 bits (in hexadecimal) of the count of | ||
232 | the number of times that this CPU's per-CPU kthread has gone | ||
233 | through its loop servicing invoke_rcu_cpu_kthread() requests. | ||
234 | |||
235 | This field is displayed only for CONFIG_RCU_BOOST kernels. | ||
236 | |||
237 | |||
238 | The output of "cat rcu/rcu_preempt/rcuexp" looks as follows: | ||
239 | |||
240 | s=21872 wd1=0 wd2=0 wd3=5 enq=0 sc=21872 | ||
241 | |||
242 | These fields are as follows: | ||
243 | |||
244 | o "s" is the sequence number, with an odd number indicating that | ||
245 | an expedited grace period is in progress. | ||
246 | |||
247 | o "wd1", "wd2", and "wd3" are the number of times that an attempt | ||
248 | to start an expedited grace period found that someone else had | ||
249 | completed an expedited grace period that satisfies the attempted | ||
250 | request. "Our work is done." | ||
251 | |||
252 | o "enq" is the number of quiescent states still outstanding. | ||
253 | |||
254 | o "sc" is the number of times that the attempt to start a | ||
255 | new expedited grace period succeeded. | ||
256 | |||
257 | |||
258 | The output of "cat rcu/rcu_preempt/rcugp" looks as follows: | ||
259 | |||
260 | completed=31249 gpnum=31250 age=1 max=18 | ||
261 | |||
262 | These fields are taken from the rcu_state structure, and are as follows: | ||
263 | |||
264 | o "completed" is the number of grace periods that have completed. | ||
265 | It is comparable to the "c" field from rcu/rcudata in that a | ||
266 | CPU whose "c" field matches the value of "completed" is aware | ||
267 | that the corresponding RCU grace period has completed. | ||
268 | |||
269 | o "gpnum" is the number of grace periods that have started. It is | ||
270 | similarly comparable to the "g" field from rcu/rcudata in that | ||
271 | a CPU whose "g" field matches the value of "gpnum" is aware that | ||
272 | the corresponding RCU grace period has started. | ||
273 | |||
274 | If these two fields are equal, then there is no grace period | ||
275 | in progress, in other words, RCU is idle. On the other hand, | ||
276 | if the two fields differ (as they are above), then an RCU grace | ||
277 | period is in progress. | ||
278 | |||
279 | o "age" is the number of jiffies that the current grace period | ||
280 | has extended for, or zero if there is no grace period currently | ||
281 | in effect. | ||
282 | |||
283 | o "max" is the age in jiffies of the longest-duration grace period | ||
284 | thus far. | ||
285 | |||
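The relationship between "completed", "gpnum", and the per-CPU "c" field can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the fields are assumed to be decimal integers as in the sample output.

```python
def parse_rcugp(line):
    """Parse an rcugp line and derive whether a grace period is in progress."""
    gp = {k: int(v) for k, v in (tok.split("=", 1) for tok in line.split())}
    # Equal counters mean RCU is idle; differing counters mean a
    # grace period is in progress.
    gp["in_progress"] = gp["completed"] != gp["gpnum"]
    return gp

def cpu_knows_completed(gp, cpu_c):
    """True if a CPU's "c" field from rcudata matches the global "completed"."""
    return cpu_c == gp["completed"]

gp = parse_rcugp("completed=31249 gpnum=31250 age=1 max=18")
print(gp["in_progress"], cpu_knows_completed(gp, 31249), cpu_knows_completed(gp, 29502))
```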
286 | The output of "cat rcu/rcu_preempt/rcuhier" looks as follows: | ||
287 | |||
288 | c=14407 g=14408 s=0 jfq=2 j=c863 nfqs=12040/nfqsng=0(12040) fqlh=1051 oqlen=0/0 | ||
289 | 3/3 ..>. 0:7 ^0 | ||
290 | e/e ..>. 0:3 ^0 d/d ..>. 4:7 ^1 | ||
291 | |||
292 | The fields are as follows: | ||
293 | |||
294 | o "c" is exactly the same as "completed" under rcu/rcu_preempt/rcugp. | ||
295 | |||
296 | o "g" is exactly the same as "gpnum" under rcu/rcu_preempt/rcugp. | ||
297 | |||
298 | o "s" is the current state of the force_quiescent_state() | ||
299 | state machine. | ||
300 | |||
301 | o "jfq" is the number of jiffies remaining for this grace period | ||
302 | before force_quiescent_state() is invoked to help push things | ||
303 | along. Note that CPUs in idle mode throughout the grace period | ||
304 | will not report on their own, but rather must be checked by some | ||
305 | other CPU via force_quiescent_state(). | ||
306 | |||
307 | o "j" is the low-order four hex digits of the jiffies counter. | ||
308 | Yes, Paul did run into a number of problems that turned out to | ||
309 | be due to the jiffies counter no longer counting. Why do you ask? | ||
310 | |||
311 | o "nfqs" is the number of calls to force_quiescent_state() since | ||
312 | boot. | ||
313 | |||
314 | o "nfqsng" is the number of useless calls to force_quiescent_state(), | ||
315 | where there wasn't actually a grace period active. This can | ||
316 | no longer happen due to grace-period processing being pushed | ||
317 | into a kthread. The number in parentheses is the difference | ||
318 | between "nfqs" and "nfqsng", or the number of times that | ||
319 | force_quiescent_state() actually did some real work. | ||
320 | |||
321 | o "fqlh" is the number of calls to force_quiescent_state() that | ||
322 | exited immediately (without even being counted in nfqs above) | ||
323 | due to contention on ->fqslock. | ||
324 | |||
325 | o Each element of the form "3/3 ..>. 0:7 ^0" represents one rcu_node | ||
326 | structure. Each line represents one level of the hierarchy, | ||
327 | from root to leaves. It is best to think of the rcu_data | ||
328 | structures as forming yet another level after the leaves. | ||
329 | Note that there might be either one, two, three, or even four | ||
330 | levels of rcu_node structures, depending on the relationship | ||
331 | between CONFIG_RCU_FANOUT, CONFIG_RCU_FANOUT_LEAF (possibly | ||
332 | adjusted using the rcu_fanout_leaf kernel boot parameter), and | ||
333 | CONFIG_NR_CPUS (possibly adjusted using the nr_cpu_ids count of | ||
334 | possible CPUs for the booting hardware). | ||
335 | |||
336 | o The numbers separated by the "/" are the qsmask followed | ||
337 | by the qsmaskinit. The qsmask will have one bit | ||
338 | set for each entity in the next lower level that has | ||
339 | not yet checked in for the current grace period ("e" | ||
340 | indicating CPUs 1, 2, and 3 of the "0:3" entry above). | ||
341 | The qsmaskinit will have one bit for each entity that is | ||
342 | currently expected to check in during each grace period. | ||
343 | The value of qsmask is initialized from that of qsmaskinit | ||
344 | at the beginning of each grace period. | ||
345 | |||
346 | o The characters separated by the ">" indicate the state | ||
347 | of the blocked-tasks lists. A "G" preceding the ">" | ||
348 | indicates that at least one task blocked in an RCU | ||
349 | read-side critical section blocks the current grace | ||
350 | period, while a "E" preceding the ">" indicates that | ||
351 | at least one task blocked in an RCU read-side critical | ||
352 | section blocks the current expedited grace period. | ||
353 | A "T" character following the ">" indicates that at | ||
354 | least one task is blocked within an RCU read-side | ||
355 | critical section, regardless of whether any current | ||
356 | grace period (expedited or normal) is inconvenienced. | ||
357 | A "." character appears if the corresponding condition | ||
358 | does not hold, so that "..>." indicates that no tasks | ||
359 | are blocked. In contrast, "GE>T" indicates maximal | ||
360 | inconvenience from blocked tasks. CONFIG_TREE_RCU | ||
361 | builds of the kernel will always show "..>.". | ||
362 | |||
363 | o The numbers separated by the ":" are the range of CPUs | ||
364 | served by this struct rcu_node. This can be helpful | ||
365 | in working out how the hierarchy is wired together. | ||
366 | |||
367 | For example, the example rcu_node structure shown above | ||
368 | has "0:7", indicating that it covers CPUs 0 through 7. | ||
369 | |||
370 | o The number after the "^" indicates the bit in the | ||
371 | next higher level rcu_node structure that this rcu_node | ||
372 | structure corresponds to. For example, the "d/d ..>. 4:7 | ||
373 | ^1" has a "1" in this position, indicating that it | ||
374 | corresponds to the "1" bit in the "3" shown in the | ||
375 | "3/3 ..>. 0:7 ^0" entry on the next level up. | ||
376 | |||
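The rcu_node element format described above can be decoded mechanically. The sketch below is an illustration with hypothetical function names. It assumes that the two masks are printed in hexadecimal and that, in a leaf rcu_node, bit i corresponds to the CPU numbered i above the low end of the node's CPU range; the text above does not spell out that bit-to-CPU mapping.

```python
def parse_rcu_node(entry):
    """Decode one rcu_node element such as 'e/e ..>. 0:3 ^0'."""
    masks, tasks, cpus, parent = entry.split()
    qsmask, qsmaskinit = (int(x, 16) for x in masks.split("/"))
    lo, hi = (int(x) for x in cpus.split(":"))
    return {
        "qsmask": qsmask,
        "qsmaskinit": qsmaskinit,
        "blocks_gp": tasks[0] == "G",    # task blocks the normal grace period
        "blocks_exp": tasks[1] == "E",   # task blocks the expedited grace period
        "any_blocked": tasks[3] == "T",  # some task blocked in a read-side section
        "lo": lo,
        "hi": hi,
        "parent_bit": int(parent.lstrip("^")),
    }

def pending_cpus(leaf):
    """CPUs of a leaf node that have not yet checked in (assumed bit i = CPU lo+i)."""
    width = leaf["hi"] - leaf["lo"] + 1
    return [leaf["lo"] + i for i in range(width) if (leaf["qsmask"] >> i) & 1]

left = parse_rcu_node("e/e ..>. 0:3 ^0")
right = parse_rcu_node("d/d ..>. 4:7 ^1")
print(pending_cpus(left), pending_cpus(right), right["parent_bit"])
```

pending_cpus() is meaningful only for leaf entries; in interior nodes each bit stands for a child rcu_node structure rather than a CPU.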
377 | |||
378 | The output of "cat rcu/rcu_sched/rcu_pending" looks as follows: | ||
379 | |||
380 | 0!np=26111 qsp=29 rpq=5386 cbr=1 cng=570 gpc=3674 gps=577 nn=15903 ndw=0 | ||
381 | 1!np=28913 qsp=35 rpq=6097 cbr=1 cng=448 gpc=3700 gps=554 nn=18113 ndw=0 | ||
382 | 2!np=32740 qsp=37 rpq=6202 cbr=0 cng=476 gpc=4627 gps=546 nn=20889 ndw=0 | ||
383 | 3 np=23679 qsp=22 rpq=5044 cbr=1 cng=415 gpc=3403 gps=347 nn=14469 ndw=0 | ||
384 | 4!np=30714 qsp=4 rpq=5574 cbr=0 cng=528 gpc=3931 gps=639 nn=20042 ndw=0 | ||
385 | 5 np=28910 qsp=2 rpq=5246 cbr=0 cng=428 gpc=4105 gps=709 nn=18422 ndw=0 | ||
386 | 6!np=38648 qsp=5 rpq=7076 cbr=0 cng=840 gpc=4072 gps=961 nn=25699 ndw=0 | ||
387 | 7 np=37275 qsp=2 rpq=6873 cbr=0 cng=868 gpc=3416 gps=971 nn=25147 ndw=0 | ||
388 | |||
389 | The fields are as follows: | ||
390 | |||
391 | o The leading number is the CPU number, with "!" indicating | ||
392 | an offline CPU. | ||
393 | |||
394 | o "np" is the number of times that __rcu_pending() has been invoked | ||
395 | for the corresponding flavor of RCU. | ||
396 | |||
397 | o "qsp" is the number of times that RCU was waiting for a | ||
398 | quiescent state from this CPU. | ||
399 | |||
400 | o "rpq" is the number of times that the CPU had passed through | ||
401 | a quiescent state, but not yet reported it to RCU. | ||
402 | |||
403 | o "cbr" is the number of times that this CPU had RCU callbacks | ||
404 | that had passed through a grace period, and were thus ready | ||
405 | to be invoked. | ||
406 | |||
407 | o "cng" is the number of times that this CPU needed another | ||
408 | grace period while RCU was idle. | ||
409 | |||
410 | o "gpc" is the number of times that an old grace period had | ||
411 | completed, but this CPU was not yet aware of it. | ||
412 | |||
413 | o "gps" is the number of times that a new grace period had started, | ||
414 | but this CPU was not yet aware of it. | ||
415 | |||
416 | o "ndw" is the number of times that a wakeup of an rcuo | ||
417 | callback-offload kthread had to be deferred in order to avoid | ||
418 | deadlock. | ||
419 | |||
420 | o "nn" is the number of times that this CPU needed nothing. | ||
421 | |||
422 | |||
423 | The output of "cat rcu/rcuboost" looks as follows: | ||
424 | |||
425 | 0:3 tasks=.... kt=W ntb=0 neb=0 nnb=0 j=c864 bt=c894 | ||
426 | balk: nt=0 egt=4695 bt=0 nb=0 ny=56 nos=0 | ||
427 | 4:7 tasks=.... kt=W ntb=0 neb=0 nnb=0 j=c864 bt=c894 | ||
428 | balk: nt=0 egt=6541 bt=0 nb=0 ny=126 nos=0 | ||
429 | |||
430 | This information is output only for rcu_preempt. Each two-line entry | ||
431 | corresponds to a leaf rcu_node structure. The fields are as follows: | ||
432 | |||
433 | o "n:m" is the CPU-number range for the corresponding two-line | ||
434 | entry. In the sample output above, the first entry covers | ||
435 | CPUs zero through three and the second entry covers CPUs four | ||
436 | through seven. | ||
437 | |||
438 | o "tasks=TNEB" gives the state of the various segments of the | ||
439 | rnp->blocked_tasks list: | ||
440 | |||
441 | "T" This indicates that there are some tasks that blocked | ||
442 | while running on one of the corresponding CPUs while | ||
443 | in an RCU read-side critical section. | ||
444 | |||
445 | "N" This indicates that some of the blocked tasks are preventing | ||
446 | the current normal (non-expedited) grace period from | ||
447 | completing. | ||
448 | |||
449 | "E" This indicates that some of the blocked tasks are preventing | ||
450 | the current expedited grace period from completing. | ||
451 | |||
452 | "B" This indicates that some of the blocked tasks are in | ||
453 | need of RCU priority boosting. | ||
454 | |||
455 | Each character is replaced with "." if the corresponding | ||
456 | condition does not hold. | ||
457 | |||
458 | o "kt" is the state of the RCU priority-boosting kernel | ||
459 | thread associated with the corresponding rcu_node structure. | ||
460 | The state can be one of the following: | ||
461 | |||
462 | "S" The kernel thread is stopped, in other words, all | ||
463 | CPUs corresponding to this rcu_node structure are | ||
464 | offline. | ||
465 | |||
466 | "R" The kernel thread is running. | ||
467 | |||
468 | "W" The kernel thread is waiting because there is no work | ||
469 | for it to do. | ||
470 | |||
471 | "Y" The kernel thread is yielding to avoid hogging CPU. | ||
472 | |||
473 | "?" Unknown value, indicates a bug. | ||
474 | |||
475 | o "ntb" is the number of tasks boosted. | ||
476 | |||
477 | o "neb" is the number of tasks boosted in order to complete an | ||
478 | expedited grace period. | ||
479 | |||
480 | o "nnb" is the number of tasks boosted in order to complete a | ||
481 | normal (non-expedited) grace period. When boosting a task | ||
482 | that was blocking both an expedited and a normal grace period, | ||
483 | it is counted against the expedited total above. | ||
484 | |||
485 | o "j" is the low-order 16 bits of the jiffies counter in | ||
486 | hexadecimal. | ||
487 | |||
488 | o "bt" is the low-order 16 bits of the value that the jiffies | ||
489 | counter will have when we next start boosting, assuming that | ||
490 | the current grace period does not end beforehand. This is | ||
491 | also in hexadecimal. | ||
492 | |||
493 | o "balk: nt" counts the number of times we didn't boost (in | ||
494 | other words, we balked) even though it was time to boost because | ||
495 | there were no blocked tasks to boost. This situation occurs | ||
496 | when there is one blocked task on one rcu_node structure and | ||
497 | none on some other rcu_node structure. | ||
498 | |||
499 | o "egt" counts the number of times we balked because although | ||
500 | there were blocked tasks, none of them were blocking the | ||
501 | current grace period, whether expedited or otherwise. | ||
502 | |||
503 | o "bt" counts the number of times we balked because boosting | ||
504 | had already been initiated for the current grace period. | ||
505 | |||
506 | o "nb" counts the number of times we balked because there | ||
507 | was at least one task blocking the current non-expedited grace | ||
508 | period that never had blocked. If it is already running, it | ||
509 | just won't help to boost its priority! | ||
510 | |||
511 | o "ny" counts the number of times we balked because it was | ||
512 | not yet time to start boosting. | ||
513 | |||
514 | o "nos" counts the number of times we balked for other | ||
515 | reasons, e.g., the grace period ended first. | ||
516 | |||
517 | |||
518 | CONFIG_TINY_RCU debugfs Files and Formats | ||
519 | |||
520 | This implementation of RCU provides a single debugfs file under the | ||
521 | top-level directory "rcu", namely rcu/rcudata, which displays fields in | ||
522 | rcu_bh_ctrlblk and rcu_sched_ctrlblk. | ||
523 | |||
524 | The output of "cat rcu/rcudata" is as follows: | ||
525 | |||
526 | rcu_sched: qlen: 0 | ||
527 | rcu_bh: qlen: 0 | ||
528 | |||
529 | This is split into rcu_sched and rcu_bh sections. The field is as | ||
530 | follows: | ||
531 | |||
532 | o "qlen" is the number of RCU callbacks currently waiting either | ||
533 | for an RCU grace period or waiting to be invoked. This is the | ||
534 | only field present for rcu_sched and rcu_bh, due to the | ||
535 | short-circuiting of grace periods in those two cases. | ||