author		Ingo Molnar <mingo@kernel.org>	2015-08-12 06:12:12 -0400
committer	Ingo Molnar <mingo@kernel.org>	2015-08-12 06:12:12 -0400
commit		9b9412dc7008f360c8e8ed10a654d3c8719f69d8 (patch)
tree		f70ad5404519008315d576de91eb1d4fb55116d5
parent		58ccab91342c1cc1fe08da9b198ac5d763706c2e (diff)
parent		3dbe43f6fba9f2a0e46e371733575a45704c22ab (diff)
Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU changes from Paul E. McKenney:
- The combination of tree geometry-initialization simplifications
and OS-jitter-reduction changes to expedited grace periods.
These two are stacked due to the large number of conflicts
that would otherwise result.
[ With one addition, a temporary commit to silence a lockdep false
positive. Additional changes to the expedited grace-period
primitives (queued for 4.4) remove the cause of this false
positive, and therefore include a revert of this temporary commit. ]
- Documentation updates.
- Torture-test updates.
- Miscellaneous fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
47 files changed, 989 insertions, 841 deletions
diff --git a/Documentation/RCU/rcu_dereference.txt b/Documentation/RCU/rcu_dereference.txt
index 1e6c0da994f5..c0bf2441a2ba 100644
--- a/Documentation/RCU/rcu_dereference.txt
+++ b/Documentation/RCU/rcu_dereference.txt
@@ -28,7 +28,7 @@ o You must use one of the rcu_dereference() family of primitives | |||
28 | o Avoid cancellation when using the "+" and "-" infix arithmetic | 28 | o Avoid cancellation when using the "+" and "-" infix arithmetic |
29 | operators. For example, for a given variable "x", avoid | 29 | operators. For example, for a given variable "x", avoid |
30 | "(x-x)". There are similar arithmetic pitfalls from other | 30 | "(x-x)". There are similar arithmetic pitfalls from other |
31 | arithmetic operatiors, such as "(x*0)", "(x/(x+1))" or "(x%1)". | 31 | arithmetic operators, such as "(x*0)", "(x/(x+1))" or "(x%1)". |
32 | The compiler is within its rights to substitute zero for all of | 32 | The compiler is within its rights to substitute zero for all of |
33 | these expressions, so that subsequent accesses no longer depend | 33 | these expressions, so that subsequent accesses no longer depend |
34 | on the rcu_dereference(), again possibly resulting in bugs due | 34 | on the rcu_dereference(), again possibly resulting in bugs due |
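For illustration only (not part of the patch), a minimal sketch of the cancellation pitfall this hunk describes; the names gp, table, and reader() are made up for this example:

	struct foo { int a; };
	struct foo __rcu *gp;
	int table[16];

	int reader(void)
	{
		struct foo *p;
		int r;

		rcu_read_lock();
		p = rcu_dereference(gp);
		/*
		 * BUG: "p - p" is always zero, so the compiler may substitute
		 * the constant 0, and the load from table[] then no longer
		 * carries a dependency on the rcu_dereference() above.
		 */
		r = table[p - p];
		/* OK: this access goes through p itself, so the address
		 * dependency is preserved. */
		r += p->a;
		rcu_read_unlock();
		return r;
	}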
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index b57c0c1cdac6..efb9454875ab 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -26,12 +26,6 @@ CONFIG_RCU_CPU_STALL_TIMEOUT | |||
26 | Stall-warning messages may be enabled and disabled completely via | 26 | Stall-warning messages may be enabled and disabled completely via |
27 | /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress. | 27 | /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress. |
28 | 28 | ||
29 | CONFIG_RCU_CPU_STALL_INFO | ||
30 | |||
31 | This kernel configuration parameter causes the stall warning to | ||
32 | print out additional per-CPU diagnostic information, including | ||
33 | information on scheduling-clock ticks and RCU's idle-CPU tracking. | ||
34 | |||
35 | RCU_STALL_DELAY_DELTA | 29 | RCU_STALL_DELAY_DELTA |
36 | 30 | ||
37 | Although the lockdep facility is extremely useful, it does add | 31 | Although the lockdep facility is extremely useful, it does add |
@@ -101,15 +95,13 @@ interact. Please note that it is not possible to entirely eliminate this | |||
101 | sort of false positive without resorting to things like stop_machine(), | 95 | sort of false positive without resorting to things like stop_machine(), |
102 | which is overkill for this sort of problem. | 96 | which is overkill for this sort of problem. |
103 | 97 | ||
104 | If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set, | 98 | Recent kernels will print a long form of the stall-warning message: |
105 | more information is printed with the stall-warning message, for example: | ||
106 | 99 | ||
107 | INFO: rcu_preempt detected stall on CPU | 100 | INFO: rcu_preempt detected stall on CPU |
108 | 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 softirq=82/543 | 101 | 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 softirq=82/543 |
109 | (t=65000 jiffies) | 102 | (t=65000 jiffies) |
110 | 103 | ||
111 | In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is | 104 | In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed: |
112 | printed: | ||
113 | 105 | ||
114 | INFO: rcu_preempt detected stall on CPU | 106 | INFO: rcu_preempt detected stall on CPU |
115 | 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D | 107 | 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D |
@@ -171,6 +163,23 @@ message will be about three times the interval between the beginning | |||
171 | of the stall and the first message. | 163 | of the stall and the first message. |
172 | 164 | ||
173 | 165 | ||
166 | Stall Warnings for Expedited Grace Periods | ||
167 | |||
168 | If an expedited grace period detects a stall, it will place a message | ||
169 | like the following in dmesg: | ||
170 | |||
171 | INFO: rcu_sched detected expedited stalls on CPUs: { 1 2 6 } 26009 jiffies s: 1043 | ||
172 | |||
173 | This indicates that CPUs 1, 2, and 6 have failed to respond to a | ||
174 | reschedule IPI, that the expedited grace period has been going on for | ||
175 | 26,009 jiffies, and that the expedited grace-period sequence counter is | ||
176 | 1043. The fact that this last value is odd indicates that an expedited | ||
177 | grace period is in flight. | ||
178 | |||
179 | It is entirely possible to see stall warnings from normal and from | ||
180 | expedited grace periods at about the same time from the same run. | ||
181 | |||
182 | |||
174 | What Causes RCU CPU Stall Warnings? | 183 | What Causes RCU CPU Stall Warnings? |
175 | 184 | ||
176 | So your kernel printed an RCU CPU stall warning. The next question is | 185 | So your kernel printed an RCU CPU stall warning. The next question is |
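One common cause, sketched here for illustration (not part of the patch; do_polling_work() is a made-up stand-in for real work), is a CPU-bound kernel loop that never passes through a quiescent state:

	void do_polling_work(void);	/* hypothetical helper */

	static int my_poll_thread(void *unused)
	{
		while (!kthread_should_stop()) {
			do_polling_work();
			/*
			 * Without this call, a CPU-bound kernel loop never
			 * reports a quiescent state, and RCU eventually emits
			 * stall warnings like the ones quoted above.
			 */
			cond_resched();
		}
		return 0;
	}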
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index 08651da15448..97f17e9decda 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -237,42 +237,26 @@ o "ktl" is the low-order 16 bits (in hexadecimal) of the count of | |||
237 | 237 | ||
238 | The output of "cat rcu/rcu_preempt/rcuexp" looks as follows: | 238 | The output of "cat rcu/rcu_preempt/rcuexp" looks as follows: |
239 | 239 | ||
240 | s=21872 d=21872 w=0 tf=0 wd1=0 wd2=0 n=0 sc=21872 dt=21872 dl=0 dx=21872 | 240 | s=21872 wd0=0 wd1=0 wd2=0 wd3=5 n=0 enq=0 sc=21872 |
241 | 241 | ||
242 | These fields are as follows: | 242 | These fields are as follows: |
243 | 243 | ||
244 | o "s" is the starting sequence number. | 244 | o "s" is the sequence number, with an odd number indicating that |
245 | an expedited grace period is in progress. | ||
245 | 246 | ||
246 | o "d" is the ending sequence number. When the starting and ending | 247 | o "wd0", "wd1", "wd2", and "wd3" are the number of times that an |
247 | numbers differ, there is an expedited grace period in progress. | 248 | attempt to start an expedited grace period found that someone |
248 | 249 | else had completed an expedited grace period that satisfies the | |
249 | o "w" is the number of times that the sequence numbers have been | ||
250 | in danger of wrapping. | ||
251 | |||
252 | o "tf" is the number of times that contention has resulted in a | ||
253 | failure to begin an expedited grace period. | ||
254 | |||
255 | o "wd1" and "wd2" are the number of times that an attempt to | ||
256 | start an expedited grace period found that someone else had | ||
257 | completed an expedited grace period that satisfies the | ||
258 | attempted request. "Our work is done." | 250 | attempted request. "Our work is done." |
259 | 251 | ||
260 | o "n" is number of times that contention was so great that | 252 | o "n" is number of times that a concurrent CPU-hotplug operation |
261 | the request was demoted from an expedited grace period to | 253 | forced a fallback to a normal grace period. |
262 | a normal grace period. | 254 | |
255 | o "enq" is the number of quiescent states still outstanding. | ||
263 | 256 | ||
264 | o "sc" is the number of times that the attempt to start a | 257 | o "sc" is the number of times that the attempt to start a |
265 | new expedited grace period succeeded. | 258 | new expedited grace period succeeded. |
266 | 259 | ||
267 | o "dt" is the number of times that we attempted to update | ||
268 | the "d" counter. | ||
269 | |||
270 | o "dl" is the number of times that we failed to update the "d" | ||
271 | counter. | ||
272 | |||
273 | o "dx" is the number of times that we succeeded in updating | ||
274 | the "d" counter. | ||
275 | |||
276 | 260 | ||
277 | The output of "cat rcu/rcu_preempt/rcugp" looks as follows: | 261 | The output of "cat rcu/rcu_preempt/rcugp" looks as follows: |
278 | 262 | ||
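A small sketch (not from the patch) of how a consumer of this debugfs file might interpret the "s=" field, given that an odd value indicates an expedited grace period in flight:

	/*
	 * Returns true if the sampled expedited sequence number indicates
	 * that an expedited grace period is currently in progress.
	 */
	static bool exp_gp_in_progress(unsigned long s)
	{
		return (s & 0x1) != 0;
	}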
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 5746b0c77f3e..adc2184009c5 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -883,7 +883,7 @@ All: lockdep-checked RCU-protected pointer access | |||
883 | 883 | ||
884 | rcu_access_pointer | 884 | rcu_access_pointer |
885 | rcu_dereference_raw | 885 | rcu_dereference_raw |
886 | rcu_lockdep_assert | 886 | RCU_LOCKDEP_WARN |
887 | rcu_sleep_check | 887 | rcu_sleep_check |
888 | RCU_NONIDLE | 888 | RCU_NONIDLE |
889 | 889 | ||
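A sketch of how the renamed primitive is typically used (illustrative only; get_foo() and global_foo are invented names). Note that RCU_LOCKDEP_WARN() takes the condition that is wrong, the opposite sense of the old rcu_lockdep_assert():

	struct foo __rcu *global_foo;

	static struct foo *get_foo(void)
	{
		RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
				 "get_foo() called outside RCU read-side critical section");
		return rcu_dereference(global_foo);
	}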
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1d6f0459cd7b..01b5b68a237a 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3135,22 +3135,35 @@ bytes respectively. Such letter suffixes can also be entirely omitted. | |||
3135 | in a given burst of a callback-flood test. | 3135 | in a given burst of a callback-flood test. |
3136 | 3136 | ||
3137 | rcutorture.fqs_duration= [KNL] | 3137 | rcutorture.fqs_duration= [KNL] |
3138 | Set duration of force_quiescent_state bursts. | 3138 | Set duration of force_quiescent_state bursts |
3139 | in microseconds. | ||
3139 | 3140 | ||
3140 | rcutorture.fqs_holdoff= [KNL] | 3141 | rcutorture.fqs_holdoff= [KNL] |
3141 | Set holdoff time within force_quiescent_state bursts. | 3142 | Set holdoff time within force_quiescent_state bursts |
3143 | in microseconds. | ||
3142 | 3144 | ||
3143 | rcutorture.fqs_stutter= [KNL] | 3145 | rcutorture.fqs_stutter= [KNL] |
3144 | Set wait time between force_quiescent_state bursts. | 3146 | Set wait time between force_quiescent_state bursts |
3147 | in seconds. | ||
3148 | |||
3149 | rcutorture.gp_cond= [KNL] | ||
3150 | Use conditional/asynchronous update-side | ||
3151 | primitives, if available. | ||
3145 | 3152 | ||
3146 | rcutorture.gp_exp= [KNL] | 3153 | rcutorture.gp_exp= [KNL] |
3147 | Use expedited update-side primitives. | 3154 | Use expedited update-side primitives, if available. |
3148 | 3155 | ||
3149 | rcutorture.gp_normal= [KNL] | 3156 | rcutorture.gp_normal= [KNL] |
3150 | Use normal (non-expedited) update-side primitives. | 3157 | Use normal (non-expedited) asynchronous |
3151 | If both gp_exp and gp_normal are set, do both. | 3158 | update-side primitives, if available. |
3152 | If neither gp_exp nor gp_normal are set, still | 3159 | |
3153 | do both. | 3160 | rcutorture.gp_sync= [KNL] |
3161 | Use normal (non-expedited) synchronous | ||
3162 | update-side primitives, if available. If all | ||
3163 | of rcutorture.gp_cond=, rcutorture.gp_exp=, | ||
3164 | rcutorture.gp_normal=, and rcutorture.gp_sync= | ||
3165 | are zero, rcutorture acts as if is interpreted | ||
3166 | they are all non-zero. | ||
3154 | 3167 | ||
3155 | rcutorture.n_barrier_cbs= [KNL] | 3168 | rcutorture.n_barrier_cbs= [KNL] |
3156 | Set callbacks/threads for rcu_barrier() testing. | 3169 | Set callbacks/threads for rcu_barrier() testing. |
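For example (illustrative, not part of the patch), a boot command line that exercises only the conditional and expedited update-side primitives added above might read:

	rcutorture.gp_cond=1 rcutorture.gp_exp=1 rcutorture.gp_normal=0 rcutorture.gp_sync=0

or, with rcutorture built as a module:

	modprobe rcutorture gp_cond=1 gp_exp=1 gp_normal=0 gp_sync=0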
@@ -3177,9 +3190,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. | |||
3177 | Set time (s) between CPU-hotplug operations, or | 3190 | Set time (s) between CPU-hotplug operations, or |
3178 | zero to disable CPU-hotplug testing. | 3191 | zero to disable CPU-hotplug testing. |
3179 | 3192 | ||
3180 | rcutorture.torture_runnable= [BOOT] | ||
3181 | Start rcutorture running at boot time. | ||
3182 | |||
3183 | rcutorture.shuffle_interval= [KNL] | 3193 | rcutorture.shuffle_interval= [KNL] |
3184 | Set task-shuffle interval (s). Shuffling tasks | 3194 | Set task-shuffle interval (s). Shuffling tasks |
3185 | allows some CPUs to go into dyntick-idle mode | 3195 | allows some CPUs to go into dyntick-idle mode |
@@ -3220,6 +3230,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted. | |||
3220 | Test RCU's dyntick-idle handling. See also the | 3230 | Test RCU's dyntick-idle handling. See also the |
3221 | rcutorture.shuffle_interval parameter. | 3231 | rcutorture.shuffle_interval parameter. |
3222 | 3232 | ||
3233 | rcutorture.torture_runnable= [BOOT] | ||
3234 | Start rcutorture running at boot time. | ||
3235 | |||
3223 | rcutorture.torture_type= [KNL] | 3236 | rcutorture.torture_type= [KNL] |
3224 | Specify the RCU implementation to test. | 3237 | Specify the RCU implementation to test. |
3225 | 3238 | ||
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 13feb697271f..318523872db5 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -194,22 +194,22 @@ There are some minimal guarantees that may be expected of a CPU: | |||
194 | (*) On any given CPU, dependent memory accesses will be issued in order, with | 194 | (*) On any given CPU, dependent memory accesses will be issued in order, with |
195 | respect to itself. This means that for: | 195 | respect to itself. This means that for: |
196 | 196 | ||
197 | ACCESS_ONCE(Q) = P; smp_read_barrier_depends(); D = ACCESS_ONCE(*Q); | 197 | WRITE_ONCE(Q, P); smp_read_barrier_depends(); D = READ_ONCE(*Q); |
198 | 198 | ||
199 | the CPU will issue the following memory operations: | 199 | the CPU will issue the following memory operations: |
200 | 200 | ||
201 | Q = LOAD P, D = LOAD *Q | 201 | Q = LOAD P, D = LOAD *Q |
202 | 202 | ||
203 | and always in that order. On most systems, smp_read_barrier_depends() | 203 | and always in that order. On most systems, smp_read_barrier_depends() |
204 | does nothing, but it is required for DEC Alpha. The ACCESS_ONCE() | 204 | does nothing, but it is required for DEC Alpha. The READ_ONCE() |
205 | is required to prevent compiler mischief. Please note that you | 205 | and WRITE_ONCE() are required to prevent compiler mischief. Please |
206 | should normally use something like rcu_dereference() instead of | 206 | note that you should normally use something like rcu_dereference() |
207 | open-coding smp_read_barrier_depends(). | 207 | instead of open-coding smp_read_barrier_depends(). |
208 | 208 | ||
209 | (*) Overlapping loads and stores within a particular CPU will appear to be | 209 | (*) Overlapping loads and stores within a particular CPU will appear to be |
210 | ordered within that CPU. This means that for: | 210 | ordered within that CPU. This means that for: |
211 | 211 | ||
212 | a = ACCESS_ONCE(*X); ACCESS_ONCE(*X) = b; | 212 | a = READ_ONCE(*X); WRITE_ONCE(*X, b); |
213 | 213 | ||
214 | the CPU will only issue the following sequence of memory operations: | 214 | the CPU will only issue the following sequence of memory operations: |
215 | 215 | ||
@@ -217,7 +217,7 @@ There are some minimal guarantees that may be expected of a CPU: | |||
217 | 217 | ||
218 | And for: | 218 | And for: |
219 | 219 | ||
220 | ACCESS_ONCE(*X) = c; d = ACCESS_ONCE(*X); | 220 | WRITE_ONCE(*X, c); d = READ_ONCE(*X); |
221 | 221 | ||
222 | the CPU will only issue: | 222 | the CPU will only issue: |
223 | 223 | ||
@@ -228,11 +228,11 @@ There are some minimal guarantees that may be expected of a CPU: | |||
228 | 228 | ||
229 | And there are a number of things that _must_ or _must_not_ be assumed: | 229 | And there are a number of things that _must_ or _must_not_ be assumed: |
230 | 230 | ||
231 | (*) It _must_not_ be assumed that the compiler will do what you want with | 231 | (*) It _must_not_ be assumed that the compiler will do what you want |
232 | memory references that are not protected by ACCESS_ONCE(). Without | 232 | with memory references that are not protected by READ_ONCE() and |
233 | ACCESS_ONCE(), the compiler is within its rights to do all sorts | 233 | WRITE_ONCE(). Without them, the compiler is within its rights to |
234 | of "creative" transformations, which are covered in the Compiler | 234 | do all sorts of "creative" transformations, which are covered in |
235 | Barrier section. | 235 | the Compiler Barrier section. |
236 | 236 | ||
237 | (*) It _must_not_ be assumed that independent loads and stores will be issued | 237 | (*) It _must_not_ be assumed that independent loads and stores will be issued |
238 | in the order given. This means that for: | 238 | in the order given. This means that for: |
@@ -520,8 +520,8 @@ following sequence of events: | |||
520 | { A == 1, B == 2, C = 3, P == &A, Q == &C } | 520 | { A == 1, B == 2, C = 3, P == &A, Q == &C } |
521 | B = 4; | 521 | B = 4; |
522 | <write barrier> | 522 | <write barrier> |
523 | ACCESS_ONCE(P) = &B | 523 | WRITE_ONCE(P, &B) |
524 | Q = ACCESS_ONCE(P); | 524 | Q = READ_ONCE(P); |
525 | D = *Q; | 525 | D = *Q; |
526 | 526 | ||
527 | There's a clear data dependency here, and it would seem that by the end of the | 527 | There's a clear data dependency here, and it would seem that by the end of the |
@@ -547,8 +547,8 @@ between the address load and the data load: | |||
547 | { A == 1, B == 2, C = 3, P == &A, Q == &C } | 547 | { A == 1, B == 2, C = 3, P == &A, Q == &C } |
548 | B = 4; | 548 | B = 4; |
549 | <write barrier> | 549 | <write barrier> |
550 | ACCESS_ONCE(P) = &B | 550 | WRITE_ONCE(P, &B); |
551 | Q = ACCESS_ONCE(P); | 551 | Q = READ_ONCE(P); |
552 | <data dependency barrier> | 552 | <data dependency barrier> |
553 | D = *Q; | 553 | D = *Q; |
554 | 554 | ||
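Written as ordinary kernel code rather than the pseudo-code above, the corrected sequence might look like the following sketch (not part of the patch; in practice one would normally use rcu_assign_pointer() and rcu_dereference(), which supply the write barrier and the data dependency barrier for you):

	int a = 1, b = 2;
	int *p = &a;

	/* CPU 1 */
	void writer(void)
	{
		b = 4;
		smp_wmb();			/* <write barrier> */
		WRITE_ONCE(p, &b);
	}

	/* CPU 2 */
	int reader(void)
	{
		int *q;

		q = READ_ONCE(p);
		smp_read_barrier_depends();	/* <data dependency barrier> */
		return *q;			/* guaranteed to see 4 if q == &b */
	}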
@@ -574,8 +574,8 @@ access: | |||
574 | { M[0] == 1, M[1] == 2, M[3] = 3, P == 0, Q == 3 } | 574 | { M[0] == 1, M[1] == 2, M[3] = 3, P == 0, Q == 3 } |
575 | M[1] = 4; | 575 | M[1] = 4; |
576 | <write barrier> | 576 | <write barrier> |
577 | ACCESS_ONCE(P) = 1 | 577 | WRITE_ONCE(P, 1); |
578 | Q = ACCESS_ONCE(P); | 578 | Q = READ_ONCE(P); |
579 | <data dependency barrier> | 579 | <data dependency barrier> |
580 | D = M[Q]; | 580 | D = M[Q]; |
581 | 581 | ||
@@ -596,10 +596,10 @@ A load-load control dependency requires a full read memory barrier, not | |||
596 | simply a data dependency barrier to make it work correctly. Consider the | 596 | simply a data dependency barrier to make it work correctly. Consider the |
597 | following bit of code: | 597 | following bit of code: |
598 | 598 | ||
599 | q = ACCESS_ONCE(a); | 599 | q = READ_ONCE(a); |
600 | if (q) { | 600 | if (q) { |
601 | <data dependency barrier> /* BUG: No data dependency!!! */ | 601 | <data dependency barrier> /* BUG: No data dependency!!! */ |
602 | p = ACCESS_ONCE(b); | 602 | p = READ_ONCE(b); |
603 | } | 603 | } |
604 | 604 | ||
605 | This will not have the desired effect because there is no actual data | 605 | This will not have the desired effect because there is no actual data |
@@ -608,10 +608,10 @@ by attempting to predict the outcome in advance, so that other CPUs see | |||
608 | the load from b as having happened before the load from a. In such a | 608 | the load from b as having happened before the load from a. In such a |
609 | case what's actually required is: | 609 | case what's actually required is: |
610 | 610 | ||
611 | q = ACCESS_ONCE(a); | 611 | q = READ_ONCE(a); |
612 | if (q) { | 612 | if (q) { |
613 | <read barrier> | 613 | <read barrier> |
614 | p = ACCESS_ONCE(b); | 614 | p = READ_ONCE(b); |
615 | } | 615 | } |
616 | 616 | ||
617 | However, stores are not speculated. This means that ordering -is- provided | 617 | However, stores are not speculated. This means that ordering -is- provided |
@@ -619,7 +619,7 @@ for load-store control dependencies, as in the following example: | |||
619 | 619 | ||
620 | q = READ_ONCE_CTRL(a); | 620 | q = READ_ONCE_CTRL(a); |
621 | if (q) { | 621 | if (q) { |
622 | ACCESS_ONCE(b) = p; | 622 | WRITE_ONCE(b, p); |
623 | } | 623 | } |
624 | 624 | ||
625 | Control dependencies pair normally with other types of barriers. That | 625 | Control dependencies pair normally with other types of barriers. That |
@@ -647,11 +647,11 @@ branches of the "if" statement as follows: | |||
647 | q = READ_ONCE_CTRL(a); | 647 | q = READ_ONCE_CTRL(a); |
648 | if (q) { | 648 | if (q) { |
649 | barrier(); | 649 | barrier(); |
650 | ACCESS_ONCE(b) = p; | 650 | WRITE_ONCE(b, p); |
651 | do_something(); | 651 | do_something(); |
652 | } else { | 652 | } else { |
653 | barrier(); | 653 | barrier(); |
654 | ACCESS_ONCE(b) = p; | 654 | WRITE_ONCE(b, p); |
655 | do_something_else(); | 655 | do_something_else(); |
656 | } | 656 | } |
657 | 657 | ||
@@ -660,12 +660,12 @@ optimization levels: | |||
660 | 660 | ||
661 | q = READ_ONCE_CTRL(a); | 661 | q = READ_ONCE_CTRL(a); |
662 | barrier(); | 662 | barrier(); |
663 | ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */ | 663 | WRITE_ONCE(b, p); /* BUG: No ordering vs. load from a!!! */ |
664 | if (q) { | 664 | if (q) { |
665 | /* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */ | 665 | /* WRITE_ONCE(b, p); -- moved up, BUG!!! */ |
666 | do_something(); | 666 | do_something(); |
667 | } else { | 667 | } else { |
668 | /* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */ | 668 | /* WRITE_ONCE(b, p); -- moved up, BUG!!! */ |
669 | do_something_else(); | 669 | do_something_else(); |
670 | } | 670 | } |
671 | 671 | ||
@@ -676,7 +676,7 @@ assembly code even after all compiler optimizations have been applied. | |||
676 | Therefore, if you need ordering in this example, you need explicit | 676 | Therefore, if you need ordering in this example, you need explicit |
677 | memory barriers, for example, smp_store_release(): | 677 | memory barriers, for example, smp_store_release(): |
678 | 678 | ||
679 | q = ACCESS_ONCE(a); | 679 | q = READ_ONCE(a); |
680 | if (q) { | 680 | if (q) { |
681 | smp_store_release(&b, p); | 681 | smp_store_release(&b, p); |
682 | do_something(); | 682 | do_something(); |
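The smp_store_release() named in this hunk pairs naturally with smp_load_acquire(); a minimal sketch (not from the patch; compute() is a made-up stand-in for real work):

	int compute(void);	/* hypothetical helper */

	int data;
	int flag;

	/* CPU 1 */
	void producer(void)
	{
		data = compute();
		smp_store_release(&flag, 1);	/* orders the store to data before flag */
	}

	/* CPU 2 */
	int consumer(void)
	{
		if (smp_load_acquire(&flag))	/* pairs with the release above */
			return data;		/* guaranteed to see compute()'s result */
		return -1;
	}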
@@ -690,10 +690,10 @@ ordering is guaranteed only when the stores differ, for example: | |||
690 | 690 | ||
691 | q = READ_ONCE_CTRL(a); | 691 | q = READ_ONCE_CTRL(a); |
692 | if (q) { | 692 | if (q) { |
693 | ACCESS_ONCE(b) = p; | 693 | WRITE_ONCE(b, p); |
694 | do_something(); | 694 | do_something(); |
695 | } else { | 695 | } else { |
696 | ACCESS_ONCE(b) = r; | 696 | WRITE_ONCE(b, r); |
697 | do_something_else(); | 697 | do_something_else(); |
698 | } | 698 | } |
699 | 699 | ||
@@ -706,10 +706,10 @@ the needed conditional. For example: | |||
706 | 706 | ||
707 | q = READ_ONCE_CTRL(a); | 707 | q = READ_ONCE_CTRL(a); |
708 | if (q % MAX) { | 708 | if (q % MAX) { |
709 | ACCESS_ONCE(b) = p; | 709 | WRITE_ONCE(b, p); |
710 | do_something(); | 710 | do_something(); |
711 | } else { | 711 | } else { |
712 | ACCESS_ONCE(b) = r; | 712 | WRITE_ONCE(b, r); |
713 | do_something_else(); | 713 | do_something_else(); |
714 | } | 714 | } |
715 | 715 | ||
@@ -718,7 +718,7 @@ equal to zero, in which case the compiler is within its rights to | |||
718 | transform the above code into the following: | 718 | transform the above code into the following: |
719 | 719 | ||
720 | q = READ_ONCE_CTRL(a); | 720 | q = READ_ONCE_CTRL(a); |
721 | ACCESS_ONCE(b) = p; | 721 | WRITE_ONCE(b, p); |
722 | do_something_else(); | 722 | do_something_else(); |
723 | 723 | ||
724 | Given this transformation, the CPU is not required to respect the ordering | 724 | Given this transformation, the CPU is not required to respect the ordering |
@@ -731,10 +731,10 @@ one, perhaps as follows: | |||
731 | q = READ_ONCE_CTRL(a); | 731 | q = READ_ONCE_CTRL(a); |
732 | BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */ | 732 | BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */ |
733 | if (q % MAX) { | 733 | if (q % MAX) { |
734 | ACCESS_ONCE(b) = p; | 734 | WRITE_ONCE(b, p); |
735 | do_something(); | 735 | do_something(); |
736 | } else { | 736 | } else { |
737 | ACCESS_ONCE(b) = r; | 737 | WRITE_ONCE(b, r); |
738 | do_something_else(); | 738 | do_something_else(); |
739 | } | 739 | } |
740 | 740 | ||
@@ -746,18 +746,18 @@ You must also be careful not to rely too much on boolean short-circuit | |||
746 | evaluation. Consider this example: | 746 | evaluation. Consider this example: |
747 | 747 | ||
748 | q = READ_ONCE_CTRL(a); | 748 | q = READ_ONCE_CTRL(a); |
749 | if (a || 1 > 0) | 749 | if (q || 1 > 0) |
750 | ACCESS_ONCE(b) = 1; | 750 | WRITE_ONCE(b, 1); |
751 | 751 | ||
752 | Because the first condition cannot fault and the second condition is | 752 | Because the first condition cannot fault and the second condition is |
753 | always true, the compiler can transform this example as following, | 753 | always true, the compiler can transform this example as following, |
754 | defeating control dependency: | 754 | defeating control dependency: |
755 | 755 | ||
756 | q = READ_ONCE_CTRL(a); | 756 | q = READ_ONCE_CTRL(a); |
757 | ACCESS_ONCE(b) = 1; | 757 | WRITE_ONCE(b, 1); |
758 | 758 | ||
759 | This example underscores the need to ensure that the compiler cannot | 759 | This example underscores the need to ensure that the compiler cannot |
760 | out-guess your code. More generally, although ACCESS_ONCE() does force | 760 | out-guess your code. More generally, although READ_ONCE() does force |
761 | the compiler to actually emit code for a given load, it does not force | 761 | the compiler to actually emit code for a given load, it does not force |
762 | the compiler to use the results. | 762 | the compiler to use the results. |
763 | 763 | ||
@@ -769,7 +769,7 @@ x and y both being zero: | |||
769 | ======================= ======================= | 769 | ======================= ======================= |
770 | r1 = READ_ONCE_CTRL(x); r2 = READ_ONCE_CTRL(y); | 770 | r1 = READ_ONCE_CTRL(x); r2 = READ_ONCE_CTRL(y); |
771 | if (r1 > 0) if (r2 > 0) | 771 | if (r1 > 0) if (r2 > 0) |
772 | ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1; | 772 | WRITE_ONCE(y, 1); WRITE_ONCE(x, 1); |
773 | 773 | ||
774 | assert(!(r1 == 1 && r2 == 1)); | 774 | assert(!(r1 == 1 && r2 == 1)); |
775 | 775 | ||
@@ -779,7 +779,7 @@ then adding the following CPU would guarantee a related assertion: | |||
779 | 779 | ||
780 | CPU 2 | 780 | CPU 2 |
781 | ===================== | 781 | ===================== |
782 | ACCESS_ONCE(x) = 2; | 782 | WRITE_ONCE(x, 2); |
783 | 783 | ||
784 | assert(!(r1 == 2 && r2 == 1 && x == 2)); /* FAILS!!! */ | 784 | assert(!(r1 == 2 && r2 == 1 && x == 2)); /* FAILS!!! */ |
785 | 785 | ||
@@ -798,8 +798,7 @@ In summary: | |||
798 | 798 | ||
799 | (*) Control dependencies must be headed by READ_ONCE_CTRL(). | 799 | (*) Control dependencies must be headed by READ_ONCE_CTRL(). |
800 | Or, as a much less preferable alternative, interpose | 800 | Or, as a much less preferable alternative, interpose |
801 | be headed by READ_ONCE() or an ACCESS_ONCE() read and must | 801 | smp_read_barrier_depends() between a READ_ONCE() and the |
802 | have smp_read_barrier_depends() between this read and the | ||
803 | control-dependent write. | 802 | control-dependent write. |
804 | 803 | ||
805 | (*) Control dependencies can order prior loads against later stores. | 804 | (*) Control dependencies can order prior loads against later stores. |
@@ -815,15 +814,16 @@ In summary: | |||
815 | 814 | ||
816 | (*) Control dependencies require at least one run-time conditional | 815 | (*) Control dependencies require at least one run-time conditional |
817 | between the prior load and the subsequent store, and this | 816 | between the prior load and the subsequent store, and this |
818 | conditional must involve the prior load. If the compiler | 817 | conditional must involve the prior load. If the compiler is able |
819 | is able to optimize the conditional away, it will have also | 818 | to optimize the conditional away, it will have also optimized |
820 | optimized away the ordering. Careful use of ACCESS_ONCE() can | 819 | away the ordering. Careful use of READ_ONCE_CTRL() READ_ONCE(), |
821 | help to preserve the needed conditional. | 820 | and WRITE_ONCE() can help to preserve the needed conditional. |
822 | 821 | ||
823 | (*) Control dependencies require that the compiler avoid reordering the | 822 | (*) Control dependencies require that the compiler avoid reordering the |
824 | dependency into nonexistence. Careful use of ACCESS_ONCE() or | 823 | dependency into nonexistence. Careful use of READ_ONCE_CTRL() |
825 | barrier() can help to preserve your control dependency. Please | 824 | or smp_read_barrier_depends() can help to preserve your control |
826 | see the Compiler Barrier section for more information. | 825 | dependency. Please see the Compiler Barrier section for more |
826 | information. | ||
827 | 827 | ||
828 | (*) Control dependencies pair normally with other types of barriers. | 828 | (*) Control dependencies pair normally with other types of barriers. |
829 | 829 | ||
@@ -848,11 +848,11 @@ barrier, an acquire barrier, a release barrier, or a general barrier: | |||
848 | 848 | ||
849 | CPU 1 CPU 2 | 849 | CPU 1 CPU 2 |
850 | =============== =============== | 850 | =============== =============== |
851 | ACCESS_ONCE(a) = 1; | 851 | WRITE_ONCE(a, 1); |
852 | <write barrier> | 852 | <write barrier> |
853 | ACCESS_ONCE(b) = 2; x = ACCESS_ONCE(b); | 853 | WRITE_ONCE(b, 2); x = READ_ONCE(b); |
854 | <read barrier> | 854 | <read barrier> |
855 | y = ACCESS_ONCE(a); | 855 | y = READ_ONCE(a); |
856 | 856 | ||
857 | Or: | 857 | Or: |
858 | 858 | ||
@@ -860,7 +860,7 @@ Or: | |||
860 | =============== =============================== | 860 | =============== =============================== |
861 | a = 1; | 861 | a = 1; |
862 | <write barrier> | 862 | <write barrier> |
863 | ACCESS_ONCE(b) = &a; x = ACCESS_ONCE(b); | 863 | WRITE_ONCE(b, &a); x = READ_ONCE(b); |
864 | <data dependency barrier> | 864 | <data dependency barrier> |
865 | y = *x; | 865 | y = *x; |
866 | 866 | ||
@@ -868,11 +868,11 @@ Or even: | |||
868 | 868 | ||
869 | CPU 1 CPU 2 | 869 | CPU 1 CPU 2 |
870 | =============== =============================== | 870 | =============== =============================== |
871 | r1 = ACCESS_ONCE(y); | 871 | r1 = READ_ONCE(y); |
872 | <general barrier> | 872 | <general barrier> |
873 | ACCESS_ONCE(y) = 1; if (r2 = ACCESS_ONCE(x)) { | 873 | WRITE_ONCE(y, 1); if (r2 = READ_ONCE(x)) { |
874 | <implicit control dependency> | 874 | <implicit control dependency> |
875 | ACCESS_ONCE(y) = 1; | 875 | WRITE_ONCE(y, 1); |
876 | } | 876 | } |
877 | 877 | ||
878 | assert(r1 == 0 || r2 == 0); | 878 | assert(r1 == 0 || r2 == 0); |
@@ -886,11 +886,11 @@ versa: | |||
886 | 886 | ||
887 | CPU 1 CPU 2 | 887 | CPU 1 CPU 2 |
888 | =================== =================== | 888 | =================== =================== |
889 | ACCESS_ONCE(a) = 1; }---- --->{ v = ACCESS_ONCE(c); | 889 | WRITE_ONCE(a, 1); }---- --->{ v = READ_ONCE(c); |
890 | ACCESS_ONCE(b) = 2; } \ / { w = ACCESS_ONCE(d); | 890 | WRITE_ONCE(b, 2); } \ / { w = READ_ONCE(d); |
891 | <write barrier> \ <read barrier> | 891 | <write barrier> \ <read barrier> |
892 | ACCESS_ONCE(c) = 3; } / \ { x = ACCESS_ONCE(a); | 892 | WRITE_ONCE(c, 3); } / \ { x = READ_ONCE(a); |
893 | ACCESS_ONCE(d) = 4; }---- --->{ y = ACCESS_ONCE(b); | 893 | WRITE_ONCE(d, 4); }---- --->{ y = READ_ONCE(b); |
894 | 894 | ||
895 | 895 | ||
896 | EXAMPLES OF MEMORY BARRIER SEQUENCES | 896 | EXAMPLES OF MEMORY BARRIER SEQUENCES |
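As an illustrative aside (not part of the patch), the write-barrier/read-barrier pairing diagrammed above can be written out with the real primitives:

	int a, b;

	/* CPU 1 */
	void cpu1(void)
	{
		WRITE_ONCE(a, 1);
		smp_wmb();		/* write barrier */
		WRITE_ONCE(b, 2);
	}

	/* CPU 2 */
	int cpu2(void)
	{
		int ra, rb;

		rb = READ_ONCE(b);
		smp_rmb();		/* read barrier, pairs with the smp_wmb() above */
		ra = READ_ONCE(a);
		return rb == 2 ? ra : -1;	/* if rb == 2, ra is guaranteed to be 1 */
	}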
@@ -1340,10 +1340,10 @@ compiler from moving the memory accesses either side of it to the other side: | |||
1340 | 1340 | ||
1341 | barrier(); | 1341 | barrier(); |
1342 | 1342 | ||
1343 | This is a general barrier -- there are no read-read or write-write variants | 1343 | This is a general barrier -- there are no read-read or write-write |
1344 | of barrier(). However, ACCESS_ONCE() can be thought of as a weak form | 1344 | variants of barrier(). However, READ_ONCE() and WRITE_ONCE() can be |
1345 | for barrier() that affects only the specific accesses flagged by the | 1345 | thought of as weak forms of barrier() that affect only the specific |
1346 | ACCESS_ONCE(). | 1346 | accesses flagged by the READ_ONCE() or WRITE_ONCE(). |
1347 | 1347 | ||
1348 | The barrier() function has the following effects: | 1348 | The barrier() function has the following effects: |
1349 | 1349 | ||
@@ -1355,9 +1355,10 @@ The barrier() function has the following effects: | |||
1355 | (*) Within a loop, forces the compiler to load the variables used | 1355 | (*) Within a loop, forces the compiler to load the variables used |
1356 | in that loop's conditional on each pass through that loop. | 1356 | in that loop's conditional on each pass through that loop. |
1357 | 1357 | ||
1358 | The ACCESS_ONCE() function can prevent any number of optimizations that, | 1358 | The READ_ONCE() and WRITE_ONCE() functions can prevent any number of |
1359 | while perfectly safe in single-threaded code, can be fatal in concurrent | 1359 | optimizations that, while perfectly safe in single-threaded code, can |
1360 | code. Here are some examples of these sorts of optimizations: | 1360 | be fatal in concurrent code. Here are some examples of these sorts |
1361 | of optimizations: | ||
1361 | 1362 | ||
1362 | (*) The compiler is within its rights to reorder loads and stores | 1363 | (*) The compiler is within its rights to reorder loads and stores |
1363 | to the same variable, and in some cases, the CPU is within its | 1364 | to the same variable, and in some cases, the CPU is within its |
@@ -1370,11 +1371,11 @@ code. Here are some examples of these sorts of optimizations: | |||
1370 | Might result in an older value of x stored in a[1] than in a[0]. | 1371 | Might result in an older value of x stored in a[1] than in a[0]. |
1371 | Prevent both the compiler and the CPU from doing this as follows: | 1372 | Prevent both the compiler and the CPU from doing this as follows: |
1372 | 1373 | ||
1373 | a[0] = ACCESS_ONCE(x); | 1374 | a[0] = READ_ONCE(x); |
1374 | a[1] = ACCESS_ONCE(x); | 1375 | a[1] = READ_ONCE(x); |
1375 | 1376 | ||
1376 | In short, ACCESS_ONCE() provides cache coherence for accesses from | 1377 | In short, READ_ONCE() and WRITE_ONCE() provide cache coherence for |
1377 | multiple CPUs to a single variable. | 1378 | accesses from multiple CPUs to a single variable. |
1378 | 1379 | ||
1379 | (*) The compiler is within its rights to merge successive loads from | 1380 | (*) The compiler is within its rights to merge successive loads from |
1380 | the same variable. Such merging can cause the compiler to "optimize" | 1381 | the same variable. Such merging can cause the compiler to "optimize" |
@@ -1391,9 +1392,9 @@ code. Here are some examples of these sorts of optimizations: | |||
1391 | for (;;) | 1392 | for (;;) |
1392 | do_something_with(tmp); | 1393 | do_something_with(tmp); |
1393 | 1394 | ||
1394 | Use ACCESS_ONCE() to prevent the compiler from doing this to you: | 1395 | Use READ_ONCE() to prevent the compiler from doing this to you: |
1395 | 1396 | ||
1396 | while (tmp = ACCESS_ONCE(a)) | 1397 | while (tmp = READ_ONCE(a)) |
1397 | do_something_with(tmp); | 1398 | do_something_with(tmp); |
1398 | 1399 | ||
1399 | (*) The compiler is within its rights to reload a variable, for example, | 1400 | (*) The compiler is within its rights to reload a variable, for example, |
@@ -1415,9 +1416,9 @@ code. Here are some examples of these sorts of optimizations: | |||
1415 | a was modified by some other CPU between the "while" statement and | 1416 | a was modified by some other CPU between the "while" statement and |
1416 | the call to do_something_with(). | 1417 | the call to do_something_with(). |
1417 | 1418 | ||
1418 | Again, use ACCESS_ONCE() to prevent the compiler from doing this: | 1419 | Again, use READ_ONCE() to prevent the compiler from doing this: |
1419 | 1420 | ||
1420 | while (tmp = ACCESS_ONCE(a)) | 1421 | while (tmp = READ_ONCE(a)) |
1421 | do_something_with(tmp); | 1422 | do_something_with(tmp); |
1422 | 1423 | ||
1423 | Note that if the compiler runs short of registers, it might save | 1424 | Note that if the compiler runs short of registers, it might save |
@@ -1437,21 +1438,21 @@ code. Here are some examples of these sorts of optimizations: | |||
1437 | 1438 | ||
1438 | do { } while (0); | 1439 | do { } while (0); |
1439 | 1440 | ||
1440 | This transformation is a win for single-threaded code because it gets | 1441 | This transformation is a win for single-threaded code because it |
1441 | rid of a load and a branch. The problem is that the compiler will | 1442 | gets rid of a load and a branch. The problem is that the compiler |
1442 | carry out its proof assuming that the current CPU is the only one | 1443 | will carry out its proof assuming that the current CPU is the only |
1443 | updating variable 'a'. If variable 'a' is shared, then the compiler's | 1444 | one updating variable 'a'. If variable 'a' is shared, then the |
1444 | proof will be erroneous. Use ACCESS_ONCE() to tell the compiler | 1445 | compiler's proof will be erroneous. Use READ_ONCE() to tell the |
1445 | that it doesn't know as much as it thinks it does: | 1446 | compiler that it doesn't know as much as it thinks it does: |
1446 | 1447 | ||
1447 | while (tmp = ACCESS_ONCE(a)) | 1448 | while (tmp = READ_ONCE(a)) |
1448 | do_something_with(tmp); | 1449 | do_something_with(tmp); |
1449 | 1450 | ||
1450 | But please note that the compiler is also closely watching what you | 1451 | But please note that the compiler is also closely watching what you |
1451 | do with the value after the ACCESS_ONCE(). For example, suppose you | 1452 | do with the value after the READ_ONCE(). For example, suppose you |
1452 | do the following and MAX is a preprocessor macro with the value 1: | 1453 | do the following and MAX is a preprocessor macro with the value 1: |
1453 | 1454 | ||
1454 | while ((tmp = ACCESS_ONCE(a)) % MAX) | 1455 | while ((tmp = READ_ONCE(a)) % MAX) |
1455 | do_something_with(tmp); | 1456 | do_something_with(tmp); |
1456 | 1457 | ||
1457 | Then the compiler knows that the result of the "%" operator applied | 1458 | Then the compiler knows that the result of the "%" operator applied |
@@ -1475,12 +1476,12 @@ code. Here are some examples of these sorts of optimizations: | |||
1475 | surprise if some other CPU might have stored to variable 'a' in the | 1476 | surprise if some other CPU might have stored to variable 'a' in the |
1476 | meantime. | 1477 | meantime. |
1477 | 1478 | ||
1478 | Use ACCESS_ONCE() to prevent the compiler from making this sort of | 1479 | Use WRITE_ONCE() to prevent the compiler from making this sort of |
1479 | wrong guess: | 1480 | wrong guess: |
1480 | 1481 | ||
1481 | ACCESS_ONCE(a) = 0; | 1482 | WRITE_ONCE(a, 0); |
1482 | /* Code that does not store to variable a. */ | 1483 | /* Code that does not store to variable a. */ |
1483 | ACCESS_ONCE(a) = 0; | 1484 | WRITE_ONCE(a, 0); |
1484 | 1485 | ||
1485 | (*) The compiler is within its rights to reorder memory accesses unless | 1486 | (*) The compiler is within its rights to reorder memory accesses unless |
1486 | you tell it not to. For example, consider the following interaction | 1487 | you tell it not to. For example, consider the following interaction |
@@ -1509,40 +1510,43 @@ code. Here are some examples of these sorts of optimizations: | |||
1509 | } | 1510 | } |
1510 | 1511 | ||
1511 | If the interrupt occurs between these two statement, then | 1512 | If the interrupt occurs between these two statement, then |
1512 | interrupt_handler() might be passed a garbled msg. Use ACCESS_ONCE() | 1513 | interrupt_handler() might be passed a garbled msg. Use WRITE_ONCE() |
1513 | to prevent this as follows: | 1514 | to prevent this as follows: |
1514 | 1515 | ||
1515 | void process_level(void) | 1516 | void process_level(void) |
1516 | { | 1517 | { |
1517 | ACCESS_ONCE(msg) = get_message(); | 1518 | WRITE_ONCE(msg, get_message()); |
1518 | ACCESS_ONCE(flag) = true; | 1519 | WRITE_ONCE(flag, true); |
1519 | } | 1520 | } |
1520 | 1521 | ||
1521 | void interrupt_handler(void) | 1522 | void interrupt_handler(void) |
1522 | { | 1523 | { |
1523 | if (ACCESS_ONCE(flag)) | 1524 | if (READ_ONCE(flag)) |
1524 | process_message(ACCESS_ONCE(msg)); | 1525 | process_message(READ_ONCE(msg)); |
1525 | } | 1526 | } |
1526 | 1527 | ||
1527 | Note that the ACCESS_ONCE() wrappers in interrupt_handler() | 1528 | Note that the READ_ONCE() and WRITE_ONCE() wrappers in |
1528 | are needed if this interrupt handler can itself be interrupted | 1529 | interrupt_handler() are needed if this interrupt handler can itself |
1529 | by something that also accesses 'flag' and 'msg', for example, | 1530 | be interrupted by something that also accesses 'flag' and 'msg', |
1530 | a nested interrupt or an NMI. Otherwise, ACCESS_ONCE() is not | 1531 | for example, a nested interrupt or an NMI. Otherwise, READ_ONCE() |
1531 | needed in interrupt_handler() other than for documentation purposes. | 1532 | and WRITE_ONCE() are not needed in interrupt_handler() other than |
1532 | (Note also that nested interrupts do not typically occur in modern | 1533 | for documentation purposes. (Note also that nested interrupts |
1533 | Linux kernels, in fact, if an interrupt handler returns with | 1534 | do not typically occur in modern Linux kernels, in fact, if an |
1534 | interrupts enabled, you will get a WARN_ONCE() splat.) | 1535 | interrupt handler returns with interrupts enabled, you will get a |
1535 | 1536 | WARN_ONCE() splat.) | |
1536 | You should assume that the compiler can move ACCESS_ONCE() past | 1537 | |
1537 | code not containing ACCESS_ONCE(), barrier(), or similar primitives. | 1538 | You should assume that the compiler can move READ_ONCE() and |
1538 | 1539 | WRITE_ONCE() past code not containing READ_ONCE(), WRITE_ONCE(), | |
1539 | This effect could also be achieved using barrier(), but ACCESS_ONCE() | 1540 | barrier(), or similar primitives. |
1540 | is more selective: With ACCESS_ONCE(), the compiler need only forget | 1541 | |
1541 | the contents of the indicated memory locations, while with barrier() | 1542 | This effect could also be achieved using barrier(), but READ_ONCE() |
1542 | the compiler must discard the value of all memory locations that | 1543 | and WRITE_ONCE() are more selective: With READ_ONCE() and |
1543 | it has currented cached in any machine registers. Of course, | 1544 | WRITE_ONCE(), the compiler need only forget the contents of the |
1544 | the compiler must also respect the order in which the ACCESS_ONCE()s | 1545 | indicated memory locations, while with barrier() the compiler must |
1545 | occur, though the CPU of course need not do so. | 1546 | discard the value of all memory locations that it has currented |
1547 | cached in any machine registers. Of course, the compiler must also | ||
1548 | respect the order in which the READ_ONCE()s and WRITE_ONCE()s occur, | ||
1549 | though the CPU of course need not do so. | ||
1546 | 1550 | ||
1547 | (*) The compiler is within its rights to invent stores to a variable, | 1551 | (*) The compiler is within its rights to invent stores to a variable, |
1548 | as in the following example: | 1552 | as in the following example: |
@@ -1562,16 +1566,16 @@ code. Here are some examples of these sorts of optimizations: | |||
1562 | a branch. Unfortunately, in concurrent code, this optimization | 1566 | a branch. Unfortunately, in concurrent code, this optimization |
1563 | could cause some other CPU to see a spurious value of 42 -- even | 1567 | could cause some other CPU to see a spurious value of 42 -- even |
1564 | if variable 'a' was never zero -- when loading variable 'b'. | 1568 | if variable 'a' was never zero -- when loading variable 'b'. |
1565 | Use ACCESS_ONCE() to prevent this as follows: | 1569 | Use WRITE_ONCE() to prevent this as follows: |
1566 | 1570 | ||
1567 | if (a) | 1571 | if (a) |
1568 | ACCESS_ONCE(b) = a; | 1572 | WRITE_ONCE(b, a); |
1569 | else | 1573 | else |
1570 | ACCESS_ONCE(b) = 42; | 1574 | WRITE_ONCE(b, 42); |
1571 | 1575 | ||
1572 | The compiler can also invent loads. These are usually less | 1576 | The compiler can also invent loads. These are usually less |
1573 | damaging, but they can result in cache-line bouncing and thus in | 1577 | damaging, but they can result in cache-line bouncing and thus in |
1574 | poor performance and scalability. Use ACCESS_ONCE() to prevent | 1578 | poor performance and scalability. Use READ_ONCE() to prevent |
1575 | invented loads. | 1579 | invented loads. |
1576 | 1580 | ||
1577 | (*) For aligned memory locations whose size allows them to be accessed | 1581 | (*) For aligned memory locations whose size allows them to be accessed |
@@ -1590,9 +1594,9 @@ code. Here are some examples of these sorts of optimizations: | |||
1590 | This optimization can therefore be a win in single-threaded code. | 1594 | This optimization can therefore be a win in single-threaded code. |
1591 | In fact, a recent bug (since fixed) caused GCC to incorrectly use | 1595 | In fact, a recent bug (since fixed) caused GCC to incorrectly use |
1592 | this optimization in a volatile store. In the absence of such bugs, | 1596 | this optimization in a volatile store. In the absence of such bugs, |
1593 | use of ACCESS_ONCE() prevents store tearing in the following example: | 1597 | use of WRITE_ONCE() prevents store tearing in the following example: |
1594 | 1598 | ||
1595 | ACCESS_ONCE(p) = 0x00010002; | 1599 | WRITE_ONCE(p, 0x00010002); |
1596 | 1600 | ||
1597 | Use of packed structures can also result in load and store tearing, | 1601 | Use of packed structures can also result in load and store tearing, |
1598 | as in this example: | 1602 | as in this example: |
@@ -1609,22 +1613,23 @@ code. Here are some examples of these sorts of optimizations: | |||
1609 | foo2.b = foo1.b; | 1613 | foo2.b = foo1.b; |
1610 | foo2.c = foo1.c; | 1614 | foo2.c = foo1.c; |
1611 | 1615 | ||
1612 | Because there are no ACCESS_ONCE() wrappers and no volatile markings, | 1616 | Because there are no READ_ONCE() or WRITE_ONCE() wrappers and no |
1613 | the compiler would be well within its rights to implement these three | 1617 | volatile markings, the compiler would be well within its rights to |
1614 | assignment statements as a pair of 32-bit loads followed by a pair | 1618 | implement these three assignment statements as a pair of 32-bit |
1615 | of 32-bit stores. This would result in load tearing on 'foo1.b' | 1619 | loads followed by a pair of 32-bit stores. This would result in |
1616 | and store tearing on 'foo2.b'. ACCESS_ONCE() again prevents tearing | 1620 | load tearing on 'foo1.b' and store tearing on 'foo2.b'. READ_ONCE() |
1617 | in this example: | 1621 | and WRITE_ONCE() again prevent tearing in this example: |
1618 | 1622 | ||
1619 | foo2.a = foo1.a; | 1623 | foo2.a = foo1.a; |
1620 | ACCESS_ONCE(foo2.b) = ACCESS_ONCE(foo1.b); | 1624 | WRITE_ONCE(foo2.b, READ_ONCE(foo1.b)); |
1621 | foo2.c = foo1.c; | 1625 | foo2.c = foo1.c; |
1622 | 1626 | ||
1623 | All that aside, it is never necessary to use ACCESS_ONCE() on a variable | 1627 | All that aside, it is never necessary to use READ_ONCE() and |
1624 | that has been marked volatile. For example, because 'jiffies' is marked | 1628 | WRITE_ONCE() on a variable that has been marked volatile. For example, |
1625 | volatile, it is never necessary to say ACCESS_ONCE(jiffies). The reason | 1629 | because 'jiffies' is marked volatile, it is never necessary to |
1626 | for this is that ACCESS_ONCE() is implemented as a volatile cast, which | 1630 | say READ_ONCE(jiffies). The reason for this is that READ_ONCE() and |
1627 | has no effect when its argument is already marked volatile. | 1631 | WRITE_ONCE() are implemented as volatile casts, which has no effect when |
1632 | its argument is already marked volatile. | ||
1628 | 1633 | ||
1629 | Please note that these compiler barriers have no direct effect on the CPU, | 1634 | Please note that these compiler barriers have no direct effect on the CPU, |
1630 | which may then reorder things however it wishes. | 1635 | which may then reorder things however it wishes. |
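For completeness, the tearing-prone structure copy discussed above, made self-contained (a sketch, not part of the patch; the packed layout of struct foo is assumed for illustration):

	struct __attribute__((__packed__)) foo {
		short a;
		int b;
		short c;
	};

	struct foo foo1, foo2;

	void copy_foo(void)
	{
		foo2.a = foo1.a;
		WRITE_ONCE(foo2.b, READ_ONCE(foo1.b));	/* no load or store tearing on 'b' */
		foo2.c = foo1.c;
	}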
@@ -1646,14 +1651,15 @@ The Linux kernel has eight basic CPU memory barriers: | |||
1646 | All memory barriers except the data dependency barriers imply a compiler | 1651 | All memory barriers except the data dependency barriers imply a compiler |
1647 | barrier. Data dependencies do not impose any additional compiler ordering. | 1652 | barrier. Data dependencies do not impose any additional compiler ordering. |
1648 | 1653 | ||
1649 | Aside: In the case of data dependencies, the compiler would be expected to | 1654 | Aside: In the case of data dependencies, the compiler would be expected |
1650 | issue the loads in the correct order (eg. `a[b]` would have to load the value | 1655 | to issue the loads in the correct order (eg. `a[b]` would have to load |
1651 | of b before loading a[b]), however there is no guarantee in the C specification | 1656 | the value of b before loading a[b]), however there is no guarantee in |
1652 | that the compiler may not speculate the value of b (eg. is equal to 1) and load | 1657 | the C specification that the compiler may not speculate the value of b |
1653 | a before b (eg. tmp = a[1]; if (b != 1) tmp = a[b]; ). There is also the | 1658 | (eg. is equal to 1) and load a before b (eg. tmp = a[1]; if (b != 1) |
1654 | problem of a compiler reloading b after having loaded a[b], thus having a newer | 1659 | tmp = a[b]; ). There is also the problem of a compiler reloading b after |
1655 | copy of b than a[b]. A consensus has not yet been reached about these problems, | 1660 | having loaded a[b], thus having a newer copy of b than a[b]. A consensus |
1656 | however the ACCESS_ONCE macro is a good place to start looking. | 1661 | has not yet been reached about these problems, however the READ_ONCE() |
1662 | macro is a good place to start looking. | ||
1657 | 1663 | ||
1658 | SMP memory barriers are reduced to compiler barriers on uniprocessor compiled | 1664 | SMP memory barriers are reduced to compiler barriers on uniprocessor compiled |
1659 | systems because it is assumed that a CPU will appear to be self-consistent, | 1665 | systems because it is assumed that a CPU will appear to be self-consistent, |
@@ -1852,11 +1858,12 @@ Similarly, the reverse case of a RELEASE followed by an ACQUIRE does not | |||
1852 | imply a full memory barrier. If it is necessary for a RELEASE-ACQUIRE | 1858 | imply a full memory barrier. If it is necessary for a RELEASE-ACQUIRE |
1853 | pair to produce a full barrier, the ACQUIRE can be followed by an | 1859 | pair to produce a full barrier, the ACQUIRE can be followed by an |
1854 | smp_mb__after_unlock_lock() invocation. This will produce a full barrier | 1860 | smp_mb__after_unlock_lock() invocation. This will produce a full barrier |
1855 | if either (a) the RELEASE and the ACQUIRE are executed by the same | 1861 | (including transitivity) if either (a) the RELEASE and the ACQUIRE are |
1856 | CPU or task, or (b) the RELEASE and ACQUIRE act on the same variable. | 1862 | executed by the same CPU or task, or (b) the RELEASE and ACQUIRE act on |
1857 | The smp_mb__after_unlock_lock() primitive is free on many architectures. | 1863 | the same variable. The smp_mb__after_unlock_lock() primitive is free |
1858 | Without smp_mb__after_unlock_lock(), the CPU's execution of the critical | 1864 | on many architectures. Without smp_mb__after_unlock_lock(), the CPU's |
1859 | sections corresponding to the RELEASE and the ACQUIRE can cross, so that: | 1865 | execution of the critical sections corresponding to the RELEASE and the |
1866 | ACQUIRE can cross, so that: | ||
1860 | 1867 | ||
1861 | *A = a; | 1868 | *A = a; |
1862 | RELEASE M | 1869 | RELEASE M |
@@ -2126,12 +2133,12 @@ three CPUs; then should the following sequence of events occur: | |||
2126 | 2133 | ||
2127 | CPU 1 CPU 2 | 2134 | CPU 1 CPU 2 |
2128 | =============================== =============================== | 2135 | =============================== =============================== |
2129 | ACCESS_ONCE(*A) = a; ACCESS_ONCE(*E) = e; | 2136 | WRITE_ONCE(*A, a); WRITE_ONCE(*E, e); |
2130 | ACQUIRE M ACQUIRE Q | 2137 | ACQUIRE M ACQUIRE Q |
2131 | ACCESS_ONCE(*B) = b; ACCESS_ONCE(*F) = f; | 2138 | WRITE_ONCE(*B, b); WRITE_ONCE(*F, f); |
2132 | ACCESS_ONCE(*C) = c; ACCESS_ONCE(*G) = g; | 2139 | WRITE_ONCE(*C, c); WRITE_ONCE(*G, g); |
2133 | RELEASE M RELEASE Q | 2140 | RELEASE M RELEASE Q |
2134 | ACCESS_ONCE(*D) = d; ACCESS_ONCE(*H) = h; | 2141 | WRITE_ONCE(*D, d); WRITE_ONCE(*H, h); |
2135 | 2142 | ||
2136 | Then there is no guarantee as to what order CPU 3 will see the accesses to *A | 2143 | Then there is no guarantee as to what order CPU 3 will see the accesses to *A |
2137 | through *H occur in, other than the constraints imposed by the separate locks | 2144 | through *H occur in, other than the constraints imposed by the separate locks |
@@ -2151,18 +2158,18 @@ However, if the following occurs: | |||
2151 | 2158 | ||
2152 | CPU 1 CPU 2 | 2159 | CPU 1 CPU 2 |
2153 | =============================== =============================== | 2160 | =============================== =============================== |
2154 | ACCESS_ONCE(*A) = a; | 2161 | WRITE_ONCE(*A, a); |
2155 | ACQUIRE M [1] | 2162 | ACQUIRE M [1] |
2156 | ACCESS_ONCE(*B) = b; | 2163 | WRITE_ONCE(*B, b); |
2157 | ACCESS_ONCE(*C) = c; | 2164 | WRITE_ONCE(*C, c); |
2158 | RELEASE M [1] | 2165 | RELEASE M [1] |
2159 | ACCESS_ONCE(*D) = d; ACCESS_ONCE(*E) = e; | 2166 | WRITE_ONCE(*D, d); WRITE_ONCE(*E, e); |
2160 | ACQUIRE M [2] | 2167 | ACQUIRE M [2] |
2161 | smp_mb__after_unlock_lock(); | 2168 | smp_mb__after_unlock_lock(); |
2162 | ACCESS_ONCE(*F) = f; | 2169 | WRITE_ONCE(*F, f); |
2163 | ACCESS_ONCE(*G) = g; | 2170 | WRITE_ONCE(*G, g); |
2164 | RELEASE M [2] | 2171 | RELEASE M [2] |
2165 | ACCESS_ONCE(*H) = h; | 2172 | WRITE_ONCE(*H, h); |
2166 | 2173 | ||
2167 | CPU 3 might see: | 2174 | CPU 3 might see: |
2168 | 2175 | ||
@@ -2881,11 +2888,11 @@ A programmer might take it for granted that the CPU will perform memory | |||
2881 | operations in exactly the order specified, so that if the CPU is, for example, | 2888 | operations in exactly the order specified, so that if the CPU is, for example, |
2882 | given the following piece of code to execute: | 2889 | given the following piece of code to execute: |
2883 | 2890 | ||
2884 | a = ACCESS_ONCE(*A); | 2891 | a = READ_ONCE(*A); |
2885 | ACCESS_ONCE(*B) = b; | 2892 | WRITE_ONCE(*B, b); |
2886 | c = ACCESS_ONCE(*C); | 2893 | c = READ_ONCE(*C); |
2887 | d = ACCESS_ONCE(*D); | 2894 | d = READ_ONCE(*D); |
2888 | ACCESS_ONCE(*E) = e; | 2895 | WRITE_ONCE(*E, e); |
2889 | 2896 | ||
2890 | they would then expect that the CPU will complete the memory operation for each | 2897 | they would then expect that the CPU will complete the memory operation for each |
2891 | instruction before moving on to the next one, leading to a definite sequence of | 2898 | instruction before moving on to the next one, leading to a definite sequence of |
@@ -2932,12 +2939,12 @@ However, it is guaranteed that a CPU will be self-consistent: it will see its | |||
2932 | _own_ accesses appear to be correctly ordered, without the need for a memory | 2939 | _own_ accesses appear to be correctly ordered, without the need for a memory |
2933 | barrier. For instance with the following code: | 2940 | barrier. For instance with the following code: |
2934 | 2941 | ||
2935 | U = ACCESS_ONCE(*A); | 2942 | U = READ_ONCE(*A); |
2936 | ACCESS_ONCE(*A) = V; | 2943 | WRITE_ONCE(*A, V); |
2937 | ACCESS_ONCE(*A) = W; | 2944 | WRITE_ONCE(*A, W); |
2938 | X = ACCESS_ONCE(*A); | 2945 | X = READ_ONCE(*A); |
2939 | ACCESS_ONCE(*A) = Y; | 2946 | WRITE_ONCE(*A, Y); |
2940 | Z = ACCESS_ONCE(*A); | 2947 | Z = READ_ONCE(*A); |
2941 | 2948 | ||
2942 | and assuming no intervention by an external influence, it can be assumed that | 2949 | and assuming no intervention by an external influence, it can be assumed that |
2943 | the final result will appear to be: | 2950 | the final result will appear to be: |
@@ -2953,13 +2960,14 @@ accesses: | |||
2953 | U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y, Z=LOAD *A | 2960 | U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y, Z=LOAD *A |
2954 | 2961 | ||
2955 | in that order, but, without intervention, the sequence may have almost any | 2962 | in that order, but, without intervention, the sequence may have almost any |
2956 | combination of elements combined or discarded, provided the program's view of | 2963 | combination of elements combined or discarded, provided the program's view |
2957 | the world remains consistent. Note that ACCESS_ONCE() is -not- optional | 2964 | of the world remains consistent. Note that READ_ONCE() and WRITE_ONCE() |
2958 | in the above example, as there are architectures where a given CPU might | 2965 | are -not- optional in the above example, as there are architectures |
2959 | reorder successive loads to the same location. On such architectures, | 2966 | where a given CPU might reorder successive loads to the same location. |
2960 | ACCESS_ONCE() does whatever is necessary to prevent this, for example, on | 2967 | On such architectures, READ_ONCE() and WRITE_ONCE() do whatever is |
2961 | Itanium the volatile casts used by ACCESS_ONCE() cause GCC to emit the | 2968 | necessary to prevent this, for example, on Itanium the volatile casts |
2962 | special ld.acq and st.rel instructions that prevent such reordering. | 2969 | used by READ_ONCE() and WRITE_ONCE() cause GCC to emit the special ld.acq |
2970 | and st.rel instructions (respectively) that prevent such reordering. | ||
2963 | 2971 | ||
2964 | The compiler may also combine, discard or defer elements of the sequence before | 2972 | The compiler may also combine, discard or defer elements of the sequence before |
2965 | the CPU even sees them. | 2973 | the CPU even sees them. |
@@ -2973,13 +2981,14 @@ may be reduced to: | |||
2973 | 2981 | ||
2974 | *A = W; | 2982 | *A = W; |
2975 | 2983 | ||
2976 | since, without either a write barrier or an ACCESS_ONCE(), it can be | 2984 | since, without either a write barrier or a WRITE_ONCE(), it can be |
2977 | assumed that the effect of the storage of V to *A is lost. Similarly: | 2985 | assumed that the effect of the storage of V to *A is lost. Similarly: |
2978 | 2986 | ||
2979 | *A = Y; | 2987 | *A = Y; |
2980 | Z = *A; | 2988 | Z = *A; |
2981 | 2989 | ||
2982 | may, without a memory barrier or an ACCESS_ONCE(), be reduced to: | 2990 | may, without a memory barrier or a READ_ONCE() and WRITE_ONCE(), be |
2991 | reduced to: | ||
2983 | 2992 | ||
2984 | *A = Y; | 2993 | *A = Y; |
2985 | Z = Y; | 2994 | Z = Y; |
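For reference, a minimal sketch of the corrected forms discussed in the hunks above, using the document's own illustrative variables A, V, W, Y and Z:

	WRITE_ONCE(*A, V);
	WRITE_ONCE(*A, W);	/* the store of V may no longer be assumed lost */
	...
	WRITE_ONCE(*A, Y);
	Z = READ_ONCE(*A);	/* the load may no longer be folded into Z = Y */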
diff --git a/MAINTAINERS b/MAINTAINERS index a9ae6c105520..20f3735fbda7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS | |||
@@ -8472,7 +8472,7 @@ M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> | |||
8472 | M: Josh Triplett <josh@joshtriplett.org> | 8472 | M: Josh Triplett <josh@joshtriplett.org> |
8473 | R: Steven Rostedt <rostedt@goodmis.org> | 8473 | R: Steven Rostedt <rostedt@goodmis.org> |
8474 | R: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> | 8474 | R: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
8475 | R: Lai Jiangshan <laijs@cn.fujitsu.com> | 8475 | R: Lai Jiangshan <jiangshanlai@gmail.com> |
8476 | L: linux-kernel@vger.kernel.org | 8476 | L: linux-kernel@vger.kernel.org |
8477 | S: Supported | 8477 | S: Supported |
8478 | T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git | 8478 | T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git |
@@ -8499,7 +8499,7 @@ M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> | |||
8499 | M: Josh Triplett <josh@joshtriplett.org> | 8499 | M: Josh Triplett <josh@joshtriplett.org> |
8500 | R: Steven Rostedt <rostedt@goodmis.org> | 8500 | R: Steven Rostedt <rostedt@goodmis.org> |
8501 | R: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> | 8501 | R: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
8502 | R: Lai Jiangshan <laijs@cn.fujitsu.com> | 8502 | R: Lai Jiangshan <jiangshanlai@gmail.com> |
8503 | L: linux-kernel@vger.kernel.org | 8503 | L: linux-kernel@vger.kernel.org |
8504 | W: http://www.rdrop.com/users/paulmck/RCU/ | 8504 | W: http://www.rdrop.com/users/paulmck/RCU/ |
8505 | S: Supported | 8505 | S: Supported |
@@ -9367,7 +9367,7 @@ F: include/linux/sl?b*.h | |||
9367 | F: mm/sl?b* | 9367 | F: mm/sl?b* |
9368 | 9368 | ||
9369 | SLEEPABLE READ-COPY UPDATE (SRCU) | 9369 | SLEEPABLE READ-COPY UPDATE (SRCU) |
9370 | M: Lai Jiangshan <laijs@cn.fujitsu.com> | 9370 | M: Lai Jiangshan <jiangshanlai@gmail.com> |
9371 | M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> | 9371 | M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> |
9372 | M: Josh Triplett <josh@joshtriplett.org> | 9372 | M: Josh Triplett <josh@joshtriplett.org> |
9373 | R: Steven Rostedt <rostedt@goodmis.org> | 9373 | R: Steven Rostedt <rostedt@goodmis.org> |
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index df919ff103c3..3d6b5269fb2e 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c | |||
@@ -54,9 +54,9 @@ static DEFINE_MUTEX(mce_chrdev_read_mutex); | |||
54 | 54 | ||
55 | #define rcu_dereference_check_mce(p) \ | 55 | #define rcu_dereference_check_mce(p) \ |
56 | ({ \ | 56 | ({ \ |
57 | rcu_lockdep_assert(rcu_read_lock_sched_held() || \ | 57 | RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held() && \ |
58 | lockdep_is_held(&mce_chrdev_read_mutex), \ | 58 | !lockdep_is_held(&mce_chrdev_read_mutex), \ |
59 | "suspicious rcu_dereference_check_mce() usage"); \ | 59 | "suspicious rcu_dereference_check_mce() usage"); \ |
60 | smp_load_acquire(&(p)); \ | 60 | smp_load_acquire(&(p)); \ |
61 | }) | 61 | }) |
62 | 62 | ||
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index f5791927aa64..c5a5231d1d11 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c | |||
@@ -136,7 +136,7 @@ enum ctx_state ist_enter(struct pt_regs *regs) | |||
136 | preempt_count_add(HARDIRQ_OFFSET); | 136 | preempt_count_add(HARDIRQ_OFFSET); |
137 | 137 | ||
138 | /* This code is a bit fragile. Test it. */ | 138 | /* This code is a bit fragile. Test it. */ |
139 | rcu_lockdep_assert(rcu_is_watching(), "ist_enter didn't work"); | 139 | RCU_LOCKDEP_WARN(!rcu_is_watching(), "ist_enter didn't work"); |
140 | 140 | ||
141 | return prev_state; | 141 | return prev_state; |
142 | } | 142 | } |
diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c index 677fb2843553..3b188f20b43f 100644 --- a/drivers/base/power/opp.c +++ b/drivers/base/power/opp.c | |||
@@ -110,8 +110,8 @@ static DEFINE_MUTEX(dev_opp_list_lock); | |||
110 | 110 | ||
111 | #define opp_rcu_lockdep_assert() \ | 111 | #define opp_rcu_lockdep_assert() \ |
112 | do { \ | 112 | do { \ |
113 | rcu_lockdep_assert(rcu_read_lock_held() || \ | 113 | RCU_LOCKDEP_WARN(!rcu_read_lock_held() && \ |
114 | lockdep_is_held(&dev_opp_list_lock), \ | 114 | !lockdep_is_held(&dev_opp_list_lock), \ |
115 | "Missing rcu_read_lock() or " \ | 115 | "Missing rcu_read_lock() or " \ |
116 | "dev_opp_list_lock protection"); \ | 116 | "dev_opp_list_lock protection"); \ |
117 | } while (0) | 117 | } while (0) |
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index fbb88740634a..674e3e226465 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h | |||
@@ -86,8 +86,8 @@ static inline struct file *__fcheck_files(struct files_struct *files, unsigned i | |||
86 | 86 | ||
87 | static inline struct file *fcheck_files(struct files_struct *files, unsigned int fd) | 87 | static inline struct file *fcheck_files(struct files_struct *files, unsigned int fd) |
88 | { | 88 | { |
89 | rcu_lockdep_assert(rcu_read_lock_held() || | 89 | RCU_LOCKDEP_WARN(!rcu_read_lock_held() && |
90 | lockdep_is_held(&files->file_lock), | 90 | !lockdep_is_held(&files->file_lock), |
91 | "suspicious rcu_dereference_check() usage"); | 91 | "suspicious rcu_dereference_check() usage"); |
92 | return __fcheck_files(files, fd); | 92 | return __fcheck_files(files, fd); |
93 | } | 93 | } |
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 4cf5f51b4c9c..ff476515f716 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h | |||
@@ -226,6 +226,37 @@ struct rcu_synchronize { | |||
226 | }; | 226 | }; |
227 | void wakeme_after_rcu(struct rcu_head *head); | 227 | void wakeme_after_rcu(struct rcu_head *head); |
228 | 228 | ||
229 | void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array, | ||
230 | struct rcu_synchronize *rs_array); | ||
231 | |||
232 | #define _wait_rcu_gp(checktiny, ...) \ | ||
233 | do { \ | ||
234 | call_rcu_func_t __crcu_array[] = { __VA_ARGS__ }; \ | ||
235 | const int __n = ARRAY_SIZE(__crcu_array); \ | ||
236 | struct rcu_synchronize __rs_array[__n]; \ | ||
237 | \ | ||
238 | __wait_rcu_gp(checktiny, __n, __crcu_array, __rs_array); \ | ||
239 | } while (0) | ||
240 | |||
241 | #define wait_rcu_gp(...) _wait_rcu_gp(false, __VA_ARGS__) | ||
242 | |||
243 | /** | ||
244 | * synchronize_rcu_mult - Wait concurrently for multiple grace periods | ||
245 | * @...: List of call_rcu() functions for the flavors to wait on. | ||
246 | * | ||
247 | * This macro waits concurrently for multiple flavors of RCU grace periods. | ||
248 | * For example, synchronize_rcu_mult(call_rcu, call_rcu_bh) would wait | ||
249 | * on concurrent RCU and RCU-bh grace periods. Waiting on a given SRCU | ||
250 | * domain requires you to write a wrapper function for that SRCU domain's | ||
251 | * call_srcu() function, supplying the corresponding srcu_struct. | ||
252 | * | ||
253 | * If Tiny RCU, tell _wait_rcu_gp() not to bother waiting for RCU | ||
254 | * or RCU-bh, given that anywhere synchronize_rcu_mult() can be called | ||
255 | * is automatically a grace period. | ||
256 | */ | ||
257 | #define synchronize_rcu_mult(...) \ | ||
258 | _wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__) | ||
259 | |||
229 | /** | 260 | /** |
230 | * call_rcu_tasks() - Queue an RCU for invocation task-based grace period | 261 | * call_rcu_tasks() - Queue an RCU for invocation task-based grace period |
231 | * @head: structure to be used for queueing the RCU updates. | 262 | * @head: structure to be used for queueing the RCU updates. |
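A sketch of the wrapper function mentioned in the synchronize_rcu_mult() comment above, assuming a hypothetical srcu_struct named my_srcu:

	static void call_my_srcu(struct rcu_head *head, rcu_callback_t func)
	{
		call_srcu(&my_srcu, head, func);
	}

	/* Wait concurrently for RCU, RCU-sched, and my_srcu grace periods. */
	synchronize_rcu_mult(call_rcu, call_rcu_sched, call_my_srcu);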
@@ -309,7 +340,7 @@ static inline void rcu_sysrq_end(void) | |||
309 | } | 340 | } |
310 | #endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */ | 341 | #endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */ |
311 | 342 | ||
312 | #ifdef CONFIG_RCU_USER_QS | 343 | #ifdef CONFIG_NO_HZ_FULL |
313 | void rcu_user_enter(void); | 344 | void rcu_user_enter(void); |
314 | void rcu_user_exit(void); | 345 | void rcu_user_exit(void); |
315 | #else | 346 | #else |
@@ -317,7 +348,7 @@ static inline void rcu_user_enter(void) { } | |||
317 | static inline void rcu_user_exit(void) { } | 348 | static inline void rcu_user_exit(void) { } |
318 | static inline void rcu_user_hooks_switch(struct task_struct *prev, | 349 | static inline void rcu_user_hooks_switch(struct task_struct *prev, |
319 | struct task_struct *next) { } | 350 | struct task_struct *next) { } |
320 | #endif /* CONFIG_RCU_USER_QS */ | 351 | #endif /* CONFIG_NO_HZ_FULL */ |
321 | 352 | ||
322 | #ifdef CONFIG_RCU_NOCB_CPU | 353 | #ifdef CONFIG_RCU_NOCB_CPU |
323 | void rcu_init_nohz(void); | 354 | void rcu_init_nohz(void); |
@@ -392,10 +423,6 @@ bool __rcu_is_watching(void); | |||
392 | * TREE_RCU and rcu_barrier_() primitives in TINY_RCU. | 423 | * TREE_RCU and rcu_barrier_() primitives in TINY_RCU. |
393 | */ | 424 | */ |
394 | 425 | ||
395 | typedef void call_rcu_func_t(struct rcu_head *head, | ||
396 | void (*func)(struct rcu_head *head)); | ||
397 | void wait_rcu_gp(call_rcu_func_t crf); | ||
398 | |||
399 | #if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU) | 426 | #if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU) |
400 | #include <linux/rcutree.h> | 427 | #include <linux/rcutree.h> |
401 | #elif defined(CONFIG_TINY_RCU) | 428 | #elif defined(CONFIG_TINY_RCU) |
@@ -469,46 +496,10 @@ int rcu_read_lock_bh_held(void); | |||
469 | * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an | 496 | * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an |
470 | * RCU-sched read-side critical section. In absence of | 497 | * RCU-sched read-side critical section. In absence of |
471 | * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side | 498 | * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side |
472 | * critical section unless it can prove otherwise. Note that disabling | 499 | * critical section unless it can prove otherwise. |
473 | * of preemption (including disabling irqs) counts as an RCU-sched | ||
474 | * read-side critical section. This is useful for debug checks in functions | ||
475 | * that required that they be called within an RCU-sched read-side | ||
476 | * critical section. | ||
477 | * | ||
478 | * Check debug_lockdep_rcu_enabled() to prevent false positives during boot | ||
479 | * and while lockdep is disabled. | ||
480 | * | ||
481 | * Note that if the CPU is in the idle loop from an RCU point of | ||
482 | * view (ie: that we are in the section between rcu_idle_enter() and | ||
483 | * rcu_idle_exit()) then rcu_read_lock_held() returns false even if the CPU | ||
484 | * did an rcu_read_lock(). The reason for this is that RCU ignores CPUs | ||
485 | * that are in such a section, considering these as in extended quiescent | ||
486 | * state, so such a CPU is effectively never in an RCU read-side critical | ||
487 | * section regardless of what RCU primitives it invokes. This state of | ||
488 | * affairs is required --- we need to keep an RCU-free window in idle | ||
489 | * where the CPU may possibly enter into low power mode. This way we can | ||
490 | * notice an extended quiescent state to other CPUs that started a grace | ||
491 | * period. Otherwise we would delay any grace period as long as we run in | ||
492 | * the idle task. | ||
493 | * | ||
494 | * Similarly, we avoid claiming an SRCU read lock held if the current | ||
495 | * CPU is offline. | ||
496 | */ | 500 | */ |
497 | #ifdef CONFIG_PREEMPT_COUNT | 501 | #ifdef CONFIG_PREEMPT_COUNT |
498 | static inline int rcu_read_lock_sched_held(void) | 502 | int rcu_read_lock_sched_held(void); |
499 | { | ||
500 | int lockdep_opinion = 0; | ||
501 | |||
502 | if (!debug_lockdep_rcu_enabled()) | ||
503 | return 1; | ||
504 | if (!rcu_is_watching()) | ||
505 | return 0; | ||
506 | if (!rcu_lockdep_current_cpu_online()) | ||
507 | return 0; | ||
508 | if (debug_locks) | ||
509 | lockdep_opinion = lock_is_held(&rcu_sched_lock_map); | ||
510 | return lockdep_opinion || preempt_count() != 0 || irqs_disabled(); | ||
511 | } | ||
512 | #else /* #ifdef CONFIG_PREEMPT_COUNT */ | 503 | #else /* #ifdef CONFIG_PREEMPT_COUNT */ |
513 | static inline int rcu_read_lock_sched_held(void) | 504 | static inline int rcu_read_lock_sched_held(void) |
514 | { | 505 | { |
@@ -545,6 +536,11 @@ static inline int rcu_read_lock_sched_held(void) | |||
545 | 536 | ||
546 | #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */ | 537 | #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */ |
547 | 538 | ||
539 | /* Deprecate rcu_lockdep_assert(): Use RCU_LOCKDEP_WARN() instead. */ | ||
540 | static inline void __attribute((deprecated)) deprecate_rcu_lockdep_assert(void) | ||
541 | { | ||
542 | } | ||
543 | |||
548 | #ifdef CONFIG_PROVE_RCU | 544 | #ifdef CONFIG_PROVE_RCU |
549 | 545 | ||
550 | /** | 546 | /** |
@@ -555,17 +551,32 @@ static inline int rcu_read_lock_sched_held(void) | |||
555 | #define rcu_lockdep_assert(c, s) \ | 551 | #define rcu_lockdep_assert(c, s) \ |
556 | do { \ | 552 | do { \ |
557 | static bool __section(.data.unlikely) __warned; \ | 553 | static bool __section(.data.unlikely) __warned; \ |
554 | deprecate_rcu_lockdep_assert(); \ | ||
558 | if (debug_lockdep_rcu_enabled() && !__warned && !(c)) { \ | 555 | if (debug_lockdep_rcu_enabled() && !__warned && !(c)) { \ |
559 | __warned = true; \ | 556 | __warned = true; \ |
560 | lockdep_rcu_suspicious(__FILE__, __LINE__, s); \ | 557 | lockdep_rcu_suspicious(__FILE__, __LINE__, s); \ |
561 | } \ | 558 | } \ |
562 | } while (0) | 559 | } while (0) |
563 | 560 | ||
561 | /** | ||
562 | * RCU_LOCKDEP_WARN - emit lockdep splat if specified condition is met | ||
563 | * @c: condition to check | ||
564 | * @s: informative message | ||
565 | */ | ||
566 | #define RCU_LOCKDEP_WARN(c, s) \ | ||
567 | do { \ | ||
568 | static bool __section(.data.unlikely) __warned; \ | ||
569 | if (debug_lockdep_rcu_enabled() && !__warned && (c)) { \ | ||
570 | __warned = true; \ | ||
571 | lockdep_rcu_suspicious(__FILE__, __LINE__, s); \ | ||
572 | } \ | ||
573 | } while (0) | ||
574 | |||
564 | #if defined(CONFIG_PROVE_RCU) && !defined(CONFIG_PREEMPT_RCU) | 575 | #if defined(CONFIG_PROVE_RCU) && !defined(CONFIG_PREEMPT_RCU) |
565 | static inline void rcu_preempt_sleep_check(void) | 576 | static inline void rcu_preempt_sleep_check(void) |
566 | { | 577 | { |
567 | rcu_lockdep_assert(!lock_is_held(&rcu_lock_map), | 578 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_lock_map), |
568 | "Illegal context switch in RCU read-side critical section"); | 579 | "Illegal context switch in RCU read-side critical section"); |
569 | } | 580 | } |
570 | #else /* #ifdef CONFIG_PROVE_RCU */ | 581 | #else /* #ifdef CONFIG_PROVE_RCU */ |
571 | static inline void rcu_preempt_sleep_check(void) | 582 | static inline void rcu_preempt_sleep_check(void) |
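The conversions throughout the rest of this patch follow one pattern: the condition passed to rcu_lockdep_assert() is negated when moving to RCU_LOCKDEP_WARN(). A minimal sketch (the message string is illustrative):

	/* Old: assert that the condition holds. */
	rcu_lockdep_assert(rcu_read_lock_held(), "need rcu_read_lock()");

	/* New: warn if the failure condition is met. */
	RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "need rcu_read_lock()");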
@@ -576,15 +587,16 @@ static inline void rcu_preempt_sleep_check(void) | |||
576 | #define rcu_sleep_check() \ | 587 | #define rcu_sleep_check() \ |
577 | do { \ | 588 | do { \ |
578 | rcu_preempt_sleep_check(); \ | 589 | rcu_preempt_sleep_check(); \ |
579 | rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map), \ | 590 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map), \ |
580 | "Illegal context switch in RCU-bh read-side critical section"); \ | 591 | "Illegal context switch in RCU-bh read-side critical section"); \ |
581 | rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map), \ | 592 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map), \ |
582 | "Illegal context switch in RCU-sched read-side critical section"); \ | 593 | "Illegal context switch in RCU-sched read-side critical section"); \ |
583 | } while (0) | 594 | } while (0) |
584 | 595 | ||
585 | #else /* #ifdef CONFIG_PROVE_RCU */ | 596 | #else /* #ifdef CONFIG_PROVE_RCU */ |
586 | 597 | ||
587 | #define rcu_lockdep_assert(c, s) do { } while (0) | 598 | #define rcu_lockdep_assert(c, s) deprecate_rcu_lockdep_assert() |
599 | #define RCU_LOCKDEP_WARN(c, s) do { } while (0) | ||
588 | #define rcu_sleep_check() do { } while (0) | 600 | #define rcu_sleep_check() do { } while (0) |
589 | 601 | ||
590 | #endif /* #else #ifdef CONFIG_PROVE_RCU */ | 602 | #endif /* #else #ifdef CONFIG_PROVE_RCU */ |
@@ -615,13 +627,13 @@ static inline void rcu_preempt_sleep_check(void) | |||
615 | ({ \ | 627 | ({ \ |
616 | /* Dependency order vs. p above. */ \ | 628 | /* Dependency order vs. p above. */ \ |
617 | typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \ | 629 | typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \ |
618 | rcu_lockdep_assert(c, "suspicious rcu_dereference_check() usage"); \ | 630 | RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_check() usage"); \ |
619 | rcu_dereference_sparse(p, space); \ | 631 | rcu_dereference_sparse(p, space); \ |
620 | ((typeof(*p) __force __kernel *)(________p1)); \ | 632 | ((typeof(*p) __force __kernel *)(________p1)); \ |
621 | }) | 633 | }) |
622 | #define __rcu_dereference_protected(p, c, space) \ | 634 | #define __rcu_dereference_protected(p, c, space) \ |
623 | ({ \ | 635 | ({ \ |
624 | rcu_lockdep_assert(c, "suspicious rcu_dereference_protected() usage"); \ | 636 | RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_protected() usage"); \ |
625 | rcu_dereference_sparse(p, space); \ | 637 | rcu_dereference_sparse(p, space); \ |
626 | ((typeof(*p) __force __kernel *)(p)); \ | 638 | ((typeof(*p) __force __kernel *)(p)); \ |
627 | }) | 639 | }) |
@@ -845,8 +857,8 @@ static inline void rcu_read_lock(void) | |||
845 | __rcu_read_lock(); | 857 | __rcu_read_lock(); |
846 | __acquire(RCU); | 858 | __acquire(RCU); |
847 | rcu_lock_acquire(&rcu_lock_map); | 859 | rcu_lock_acquire(&rcu_lock_map); |
848 | rcu_lockdep_assert(rcu_is_watching(), | 860 | RCU_LOCKDEP_WARN(!rcu_is_watching(), |
849 | "rcu_read_lock() used illegally while idle"); | 861 | "rcu_read_lock() used illegally while idle"); |
850 | } | 862 | } |
851 | 863 | ||
852 | /* | 864 | /* |
@@ -896,8 +908,8 @@ static inline void rcu_read_lock(void) | |||
896 | */ | 908 | */ |
897 | static inline void rcu_read_unlock(void) | 909 | static inline void rcu_read_unlock(void) |
898 | { | 910 | { |
899 | rcu_lockdep_assert(rcu_is_watching(), | 911 | RCU_LOCKDEP_WARN(!rcu_is_watching(), |
900 | "rcu_read_unlock() used illegally while idle"); | 912 | "rcu_read_unlock() used illegally while idle"); |
901 | __release(RCU); | 913 | __release(RCU); |
902 | __rcu_read_unlock(); | 914 | __rcu_read_unlock(); |
903 | rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */ | 915 | rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */ |
@@ -925,8 +937,8 @@ static inline void rcu_read_lock_bh(void) | |||
925 | local_bh_disable(); | 937 | local_bh_disable(); |
926 | __acquire(RCU_BH); | 938 | __acquire(RCU_BH); |
927 | rcu_lock_acquire(&rcu_bh_lock_map); | 939 | rcu_lock_acquire(&rcu_bh_lock_map); |
928 | rcu_lockdep_assert(rcu_is_watching(), | 940 | RCU_LOCKDEP_WARN(!rcu_is_watching(), |
929 | "rcu_read_lock_bh() used illegally while idle"); | 941 | "rcu_read_lock_bh() used illegally while idle"); |
930 | } | 942 | } |
931 | 943 | ||
932 | /* | 944 | /* |
@@ -936,8 +948,8 @@ static inline void rcu_read_lock_bh(void) | |||
936 | */ | 948 | */ |
937 | static inline void rcu_read_unlock_bh(void) | 949 | static inline void rcu_read_unlock_bh(void) |
938 | { | 950 | { |
939 | rcu_lockdep_assert(rcu_is_watching(), | 951 | RCU_LOCKDEP_WARN(!rcu_is_watching(), |
940 | "rcu_read_unlock_bh() used illegally while idle"); | 952 | "rcu_read_unlock_bh() used illegally while idle"); |
941 | rcu_lock_release(&rcu_bh_lock_map); | 953 | rcu_lock_release(&rcu_bh_lock_map); |
942 | __release(RCU_BH); | 954 | __release(RCU_BH); |
943 | local_bh_enable(); | 955 | local_bh_enable(); |
@@ -961,8 +973,8 @@ static inline void rcu_read_lock_sched(void) | |||
961 | preempt_disable(); | 973 | preempt_disable(); |
962 | __acquire(RCU_SCHED); | 974 | __acquire(RCU_SCHED); |
963 | rcu_lock_acquire(&rcu_sched_lock_map); | 975 | rcu_lock_acquire(&rcu_sched_lock_map); |
964 | rcu_lockdep_assert(rcu_is_watching(), | 976 | RCU_LOCKDEP_WARN(!rcu_is_watching(), |
965 | "rcu_read_lock_sched() used illegally while idle"); | 977 | "rcu_read_lock_sched() used illegally while idle"); |
966 | } | 978 | } |
967 | 979 | ||
968 | /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */ | 980 | /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */ |
@@ -979,8 +991,8 @@ static inline notrace void rcu_read_lock_sched_notrace(void) | |||
979 | */ | 991 | */ |
980 | static inline void rcu_read_unlock_sched(void) | 992 | static inline void rcu_read_unlock_sched(void) |
981 | { | 993 | { |
982 | rcu_lockdep_assert(rcu_is_watching(), | 994 | RCU_LOCKDEP_WARN(!rcu_is_watching(), |
983 | "rcu_read_unlock_sched() used illegally while idle"); | 995 | "rcu_read_unlock_sched() used illegally while idle"); |
984 | rcu_lock_release(&rcu_sched_lock_map); | 996 | rcu_lock_release(&rcu_sched_lock_map); |
985 | __release(RCU_SCHED); | 997 | __release(RCU_SCHED); |
986 | preempt_enable(); | 998 | preempt_enable(); |
@@ -1031,7 +1043,7 @@ static inline notrace void rcu_read_unlock_sched_notrace(void) | |||
1031 | #define RCU_INIT_POINTER(p, v) \ | 1043 | #define RCU_INIT_POINTER(p, v) \ |
1032 | do { \ | 1044 | do { \ |
1033 | rcu_dereference_sparse(p, __rcu); \ | 1045 | rcu_dereference_sparse(p, __rcu); \ |
1034 | p = RCU_INITIALIZER(v); \ | 1046 | WRITE_ONCE(p, RCU_INITIALIZER(v)); \ |
1035 | } while (0) | 1047 | } while (0) |
1036 | 1048 | ||
1037 | /** | 1049 | /** |
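An illustrative use of the updated RCU_INIT_POINTER(), assuming a hypothetical RCU-protected pointer gp:

	struct foo __rcu *gp;		/* hypothetical RCU-protected pointer */

	RCU_INIT_POINTER(gp, NULL);	/* the initialization is now a WRITE_ONCE() */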
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index 3df6c1ec4e25..ff968b7af3a4 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h | |||
@@ -37,6 +37,16 @@ static inline void cond_synchronize_rcu(unsigned long oldstate) | |||
37 | might_sleep(); | 37 | might_sleep(); |
38 | } | 38 | } |
39 | 39 | ||
40 | static inline unsigned long get_state_synchronize_sched(void) | ||
41 | { | ||
42 | return 0; | ||
43 | } | ||
44 | |||
45 | static inline void cond_synchronize_sched(unsigned long oldstate) | ||
46 | { | ||
47 | might_sleep(); | ||
48 | } | ||
49 | |||
40 | static inline void rcu_barrier_bh(void) | 50 | static inline void rcu_barrier_bh(void) |
41 | { | 51 | { |
42 | wait_rcu_gp(call_rcu_bh); | 52 | wait_rcu_gp(call_rcu_bh); |
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 456879143f89..5abec82f325e 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h | |||
@@ -76,6 +76,8 @@ void rcu_barrier_bh(void); | |||
76 | void rcu_barrier_sched(void); | 76 | void rcu_barrier_sched(void); |
77 | unsigned long get_state_synchronize_rcu(void); | 77 | unsigned long get_state_synchronize_rcu(void); |
78 | void cond_synchronize_rcu(unsigned long oldstate); | 78 | void cond_synchronize_rcu(unsigned long oldstate); |
79 | unsigned long get_state_synchronize_sched(void); | ||
80 | void cond_synchronize_sched(unsigned long oldstate); | ||
79 | 81 | ||
80 | extern unsigned long rcutorture_testseq; | 82 | extern unsigned long rcutorture_testseq; |
81 | extern unsigned long rcutorture_vernum; | 83 | extern unsigned long rcutorture_vernum; |
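A sketch of the usage pattern intended for these two new declarations (the intervening work is illustrative):

	unsigned long gp_snap;

	gp_snap = get_state_synchronize_sched();
	/* ... potentially long-running work that does not need a grace period ... */
	cond_synchronize_sched(gp_snap);	/* waits only if no full RCU-sched GP has elapsed */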
diff --git a/include/linux/types.h b/include/linux/types.h index 8715287c3b1f..c314989d9158 100644 --- a/include/linux/types.h +++ b/include/linux/types.h | |||
@@ -212,6 +212,9 @@ struct callback_head { | |||
212 | }; | 212 | }; |
213 | #define rcu_head callback_head | 213 | #define rcu_head callback_head |
214 | 214 | ||
215 | typedef void (*rcu_callback_t)(struct rcu_head *head); | ||
216 | typedef void (*call_rcu_func_t)(struct rcu_head *head, rcu_callback_t func); | ||
217 | |||
215 | /* clocksource cycle base type */ | 218 | /* clocksource cycle base type */ |
216 | typedef u64 cycle_t; | 219 | typedef u64 cycle_t; |
217 | 220 | ||
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h index c78e88ce5ea3..ef72c4aada56 100644 --- a/include/trace/events/rcu.h +++ b/include/trace/events/rcu.h | |||
@@ -661,7 +661,6 @@ TRACE_EVENT(rcu_torture_read, | |||
661 | * Tracepoint for _rcu_barrier() execution. The string "s" describes | 661 | * Tracepoint for _rcu_barrier() execution. The string "s" describes |
662 | * the _rcu_barrier phase: | 662 | * the _rcu_barrier phase: |
663 | * "Begin": _rcu_barrier() started. | 663 | * "Begin": _rcu_barrier() started. |
664 | * "Check": _rcu_barrier() checking for piggybacking. | ||
665 | * "EarlyExit": _rcu_barrier() piggybacked, thus early exit. | 664 | * "EarlyExit": _rcu_barrier() piggybacked, thus early exit. |
666 | * "Inc1": _rcu_barrier() piggyback check counter incremented. | 665 | * "Inc1": _rcu_barrier() piggyback check counter incremented. |
667 | * "OfflineNoCB": _rcu_barrier() found callback on never-online CPU | 666 | * "OfflineNoCB": _rcu_barrier() found callback on never-online CPU |
diff --git a/init/Kconfig b/init/Kconfig index af09b4fb43d2..ba1e6eaf4c36 100644 --- a/init/Kconfig +++ b/init/Kconfig | |||
@@ -538,15 +538,6 @@ config RCU_STALL_COMMON | |||
538 | config CONTEXT_TRACKING | 538 | config CONTEXT_TRACKING |
539 | bool | 539 | bool |
540 | 540 | ||
541 | config RCU_USER_QS | ||
542 | bool | ||
543 | help | ||
544 | This option sets hooks on kernel / userspace boundaries and | ||
545 | puts RCU in extended quiescent state when the CPU runs in | ||
546 | userspace. It means that when a CPU runs in userspace, it is | ||
547 | excluded from the global RCU state machine and thus doesn't | ||
548 | try to keep the timer tick on for RCU. | ||
549 | |||
550 | config CONTEXT_TRACKING_FORCE | 541 | config CONTEXT_TRACKING_FORCE |
551 | bool "Force context tracking" | 542 | bool "Force context tracking" |
552 | depends on CONTEXT_TRACKING | 543 | depends on CONTEXT_TRACKING |
@@ -707,6 +698,7 @@ config RCU_BOOST_DELAY | |||
707 | config RCU_NOCB_CPU | 698 | config RCU_NOCB_CPU |
708 | bool "Offload RCU callback processing from boot-selected CPUs" | 699 | bool "Offload RCU callback processing from boot-selected CPUs" |
709 | depends on TREE_RCU || PREEMPT_RCU | 700 | depends on TREE_RCU || PREEMPT_RCU |
701 | depends on RCU_EXPERT || NO_HZ_FULL | ||
710 | default n | 702 | default n |
711 | help | 703 | help |
712 | Use this option to reduce OS jitter for aggressive HPC or | 704 | Use this option to reduce OS jitter for aggressive HPC or |
diff --git a/kernel/cgroup.c b/kernel/cgroup.c index f89d9292eee6..b89f3168411b 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c | |||
@@ -107,8 +107,8 @@ static DEFINE_SPINLOCK(release_agent_path_lock); | |||
107 | struct percpu_rw_semaphore cgroup_threadgroup_rwsem; | 107 | struct percpu_rw_semaphore cgroup_threadgroup_rwsem; |
108 | 108 | ||
109 | #define cgroup_assert_mutex_or_rcu_locked() \ | 109 | #define cgroup_assert_mutex_or_rcu_locked() \ |
110 | rcu_lockdep_assert(rcu_read_lock_held() || \ | 110 | RCU_LOCKDEP_WARN(!rcu_read_lock_held() && \ |
111 | lockdep_is_held(&cgroup_mutex), \ | 111 | !lockdep_is_held(&cgroup_mutex), \ |
112 | "cgroup_mutex or RCU read lock required"); | 112 | "cgroup_mutex or RCU read lock required"); |
113 | 113 | ||
114 | /* | 114 | /* |
diff --git a/kernel/cpu.c b/kernel/cpu.c index 5644ec5582b9..910d709b578a 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c | |||
@@ -381,14 +381,14 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen) | |||
381 | * will observe it. | 381 | * will observe it. |
382 | * | 382 | * |
383 | * For CONFIG_PREEMPT we have preemptible RCU and its sync_rcu() might | 383 | * For CONFIG_PREEMPT we have preemptible RCU and its sync_rcu() might |
384 | * not imply sync_sched(), so explicitly call both. | 384 | * not imply sync_sched(), so wait for both. |
385 | * | 385 | * |
386 | * Do sync before parking smpboot threads to take care of the rcu boost case. | 386 | * Do sync before parking smpboot threads to take care of the rcu boost case. |
387 | */ | 387 | */ |
388 | #ifdef CONFIG_PREEMPT | 388 | if (IS_ENABLED(CONFIG_PREEMPT)) |
389 | synchronize_sched(); | 389 | synchronize_rcu_mult(call_rcu, call_rcu_sched); |
390 | #endif | 390 | else |
391 | synchronize_rcu(); | 391 | synchronize_rcu(); |
392 | 392 | ||
393 | smpboot_park_threads(cpu); | 393 | smpboot_park_threads(cpu); |
394 | 394 | ||
diff --git a/kernel/pid.c b/kernel/pid.c index 4fd07d5b7baf..ca368793808e 100644 --- a/kernel/pid.c +++ b/kernel/pid.c | |||
@@ -451,9 +451,8 @@ EXPORT_SYMBOL(pid_task); | |||
451 | */ | 451 | */ |
452 | struct task_struct *find_task_by_pid_ns(pid_t nr, struct pid_namespace *ns) | 452 | struct task_struct *find_task_by_pid_ns(pid_t nr, struct pid_namespace *ns) |
453 | { | 453 | { |
454 | rcu_lockdep_assert(rcu_read_lock_held(), | 454 | RCU_LOCKDEP_WARN(!rcu_read_lock_held(), |
455 | "find_task_by_pid_ns() needs rcu_read_lock()" | 455 | "find_task_by_pid_ns() needs rcu_read_lock() protection"); |
456 | " protection"); | ||
457 | return pid_task(find_pid_ns(nr, ns), PIDTYPE_PID); | 456 | return pid_task(find_pid_ns(nr, ns), PIDTYPE_PID); |
458 | } | 457 | } |
459 | 458 | ||
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index 59e32684c23b..77192953dee5 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c | |||
@@ -635,6 +635,8 @@ static struct rcu_torture_ops sched_ops = { | |||
635 | .deferred_free = rcu_sched_torture_deferred_free, | 635 | .deferred_free = rcu_sched_torture_deferred_free, |
636 | .sync = synchronize_sched, | 636 | .sync = synchronize_sched, |
637 | .exp_sync = synchronize_sched_expedited, | 637 | .exp_sync = synchronize_sched_expedited, |
638 | .get_state = get_state_synchronize_sched, | ||
639 | .cond_sync = cond_synchronize_sched, | ||
638 | .call = call_rcu_sched, | 640 | .call = call_rcu_sched, |
639 | .cb_barrier = rcu_barrier_sched, | 641 | .cb_barrier = rcu_barrier_sched, |
640 | .fqs = rcu_sched_force_quiescent_state, | 642 | .fqs = rcu_sched_force_quiescent_state, |
@@ -684,10 +686,20 @@ static struct rcu_torture_ops tasks_ops = { | |||
684 | 686 | ||
685 | #define RCUTORTURE_TASKS_OPS &tasks_ops, | 687 | #define RCUTORTURE_TASKS_OPS &tasks_ops, |
686 | 688 | ||
689 | static bool __maybe_unused torturing_tasks(void) | ||
690 | { | ||
691 | return cur_ops == &tasks_ops; | ||
692 | } | ||
693 | |||
687 | #else /* #ifdef CONFIG_TASKS_RCU */ | 694 | #else /* #ifdef CONFIG_TASKS_RCU */ |
688 | 695 | ||
689 | #define RCUTORTURE_TASKS_OPS | 696 | #define RCUTORTURE_TASKS_OPS |
690 | 697 | ||
698 | static bool torturing_tasks(void) | ||
699 | { | ||
700 | return false; | ||
701 | } | ||
702 | |||
691 | #endif /* #else #ifdef CONFIG_TASKS_RCU */ | 703 | #endif /* #else #ifdef CONFIG_TASKS_RCU */ |
692 | 704 | ||
693 | /* | 705 | /* |
@@ -823,9 +835,7 @@ rcu_torture_cbflood(void *arg) | |||
823 | } | 835 | } |
824 | if (err) { | 836 | if (err) { |
825 | VERBOSE_TOROUT_STRING("rcu_torture_cbflood disabled: Bad args or OOM"); | 837 | VERBOSE_TOROUT_STRING("rcu_torture_cbflood disabled: Bad args or OOM"); |
826 | while (!torture_must_stop()) | 838 | goto wait_for_stop; |
827 | schedule_timeout_interruptible(HZ); | ||
828 | return 0; | ||
829 | } | 839 | } |
830 | VERBOSE_TOROUT_STRING("rcu_torture_cbflood task started"); | 840 | VERBOSE_TOROUT_STRING("rcu_torture_cbflood task started"); |
831 | do { | 841 | do { |
@@ -844,6 +854,7 @@ rcu_torture_cbflood(void *arg) | |||
844 | stutter_wait("rcu_torture_cbflood"); | 854 | stutter_wait("rcu_torture_cbflood"); |
845 | } while (!torture_must_stop()); | 855 | } while (!torture_must_stop()); |
846 | vfree(rhp); | 856 | vfree(rhp); |
857 | wait_for_stop: | ||
847 | torture_kthread_stopping("rcu_torture_cbflood"); | 858 | torture_kthread_stopping("rcu_torture_cbflood"); |
848 | return 0; | 859 | return 0; |
849 | } | 860 | } |
@@ -1088,7 +1099,8 @@ static void rcu_torture_timer(unsigned long unused) | |||
1088 | p = rcu_dereference_check(rcu_torture_current, | 1099 | p = rcu_dereference_check(rcu_torture_current, |
1089 | rcu_read_lock_bh_held() || | 1100 | rcu_read_lock_bh_held() || |
1090 | rcu_read_lock_sched_held() || | 1101 | rcu_read_lock_sched_held() || |
1091 | srcu_read_lock_held(srcu_ctlp)); | 1102 | srcu_read_lock_held(srcu_ctlp) || |
1103 | torturing_tasks()); | ||
1092 | if (p == NULL) { | 1104 | if (p == NULL) { |
1093 | /* Leave because rcu_torture_writer is not yet underway */ | 1105 | /* Leave because rcu_torture_writer is not yet underway */ |
1094 | cur_ops->readunlock(idx); | 1106 | cur_ops->readunlock(idx); |
@@ -1162,7 +1174,8 @@ rcu_torture_reader(void *arg) | |||
1162 | p = rcu_dereference_check(rcu_torture_current, | 1174 | p = rcu_dereference_check(rcu_torture_current, |
1163 | rcu_read_lock_bh_held() || | 1175 | rcu_read_lock_bh_held() || |
1164 | rcu_read_lock_sched_held() || | 1176 | rcu_read_lock_sched_held() || |
1165 | srcu_read_lock_held(srcu_ctlp)); | 1177 | srcu_read_lock_held(srcu_ctlp) || |
1178 | torturing_tasks()); | ||
1166 | if (p == NULL) { | 1179 | if (p == NULL) { |
1167 | /* Wait for rcu_torture_writer to get underway */ | 1180 | /* Wait for rcu_torture_writer to get underway */ |
1168 | cur_ops->readunlock(idx); | 1181 | cur_ops->readunlock(idx); |
@@ -1507,7 +1520,7 @@ static int rcu_torture_barrier_init(void) | |||
1507 | int i; | 1520 | int i; |
1508 | int ret; | 1521 | int ret; |
1509 | 1522 | ||
1510 | if (n_barrier_cbs == 0) | 1523 | if (n_barrier_cbs <= 0) |
1511 | return 0; | 1524 | return 0; |
1512 | if (cur_ops->call == NULL || cur_ops->cb_barrier == NULL) { | 1525 | if (cur_ops->call == NULL || cur_ops->cb_barrier == NULL) { |
1513 | pr_alert("%s" TORTURE_FLAG | 1526 | pr_alert("%s" TORTURE_FLAG |
@@ -1786,12 +1799,15 @@ rcu_torture_init(void) | |||
1786 | writer_task); | 1799 | writer_task); |
1787 | if (firsterr) | 1800 | if (firsterr) |
1788 | goto unwind; | 1801 | goto unwind; |
1789 | fakewriter_tasks = kzalloc(nfakewriters * sizeof(fakewriter_tasks[0]), | 1802 | if (nfakewriters > 0) { |
1790 | GFP_KERNEL); | 1803 | fakewriter_tasks = kzalloc(nfakewriters * |
1791 | if (fakewriter_tasks == NULL) { | 1804 | sizeof(fakewriter_tasks[0]), |
1792 | VERBOSE_TOROUT_ERRSTRING("out of memory"); | 1805 | GFP_KERNEL); |
1793 | firsterr = -ENOMEM; | 1806 | if (fakewriter_tasks == NULL) { |
1794 | goto unwind; | 1807 | VERBOSE_TOROUT_ERRSTRING("out of memory"); |
1808 | firsterr = -ENOMEM; | ||
1809 | goto unwind; | ||
1810 | } | ||
1795 | } | 1811 | } |
1796 | for (i = 0; i < nfakewriters; i++) { | 1812 | for (i = 0; i < nfakewriters; i++) { |
1797 | firsterr = torture_create_kthread(rcu_torture_fakewriter, | 1813 | firsterr = torture_create_kthread(rcu_torture_fakewriter, |
@@ -1818,7 +1834,7 @@ rcu_torture_init(void) | |||
1818 | if (firsterr) | 1834 | if (firsterr) |
1819 | goto unwind; | 1835 | goto unwind; |
1820 | } | 1836 | } |
1821 | if (test_no_idle_hz) { | 1837 | if (test_no_idle_hz && shuffle_interval > 0) { |
1822 | firsterr = torture_shuffle_init(shuffle_interval * HZ); | 1838 | firsterr = torture_shuffle_init(shuffle_interval * HZ); |
1823 | if (firsterr) | 1839 | if (firsterr) |
1824 | goto unwind; | 1840 | goto unwind; |
diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c index fb33d35ee0b7..d3fcb2ec8536 100644 --- a/kernel/rcu/srcu.c +++ b/kernel/rcu/srcu.c | |||
@@ -252,14 +252,15 @@ static bool srcu_readers_active_idx_check(struct srcu_struct *sp, int idx) | |||
252 | } | 252 | } |
253 | 253 | ||
254 | /** | 254 | /** |
254 | * srcu_readers_active - returns approximate number of readers. | 255 | * srcu_readers_active - returns true if there are readers, and false |
256 | * otherwise | ||
256 | * @sp: which srcu_struct to count active readers (holding srcu_read_lock). | 257 | * @sp: which srcu_struct to count active readers (holding srcu_read_lock). |
257 | * | 258 | * |
258 | * Note that this is not an atomic primitive, and can therefore suffer | 259 | * Note that this is not an atomic primitive, and can therefore suffer |
259 | * severe errors when invoked on an active srcu_struct. That said, it | 260 | * severe errors when invoked on an active srcu_struct. That said, it |
260 | * can be useful as an error check at cleanup time. | 261 | * can be useful as an error check at cleanup time. |
261 | */ | 262 | */ |
262 | static int srcu_readers_active(struct srcu_struct *sp) | 263 | static bool srcu_readers_active(struct srcu_struct *sp) |
263 | { | 264 | { |
264 | int cpu; | 265 | int cpu; |
265 | unsigned long sum = 0; | 266 | unsigned long sum = 0; |
@@ -414,11 +415,11 @@ static void __synchronize_srcu(struct srcu_struct *sp, int trycount) | |||
414 | struct rcu_head *head = &rcu.head; | 415 | struct rcu_head *head = &rcu.head; |
415 | bool done = false; | 416 | bool done = false; |
416 | 417 | ||
417 | rcu_lockdep_assert(!lock_is_held(&sp->dep_map) && | 418 | RCU_LOCKDEP_WARN(lock_is_held(&sp->dep_map) || |
418 | !lock_is_held(&rcu_bh_lock_map) && | 419 | lock_is_held(&rcu_bh_lock_map) || |
419 | !lock_is_held(&rcu_lock_map) && | 420 | lock_is_held(&rcu_lock_map) || |
420 | !lock_is_held(&rcu_sched_lock_map), | 421 | lock_is_held(&rcu_sched_lock_map), |
421 | "Illegal synchronize_srcu() in same-type SRCU (or RCU) read-side critical section"); | 422 | "Illegal synchronize_srcu() in same-type SRCU (or in RCU) read-side critical section"); |
422 | 423 | ||
423 | might_sleep(); | 424 | might_sleep(); |
424 | init_completion(&rcu.completion); | 425 | init_completion(&rcu.completion); |
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index c291bd65d2cb..d0471056d0af 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c | |||
@@ -191,10 +191,10 @@ static void rcu_process_callbacks(struct softirq_action *unused) | |||
191 | */ | 191 | */ |
192 | void synchronize_sched(void) | 192 | void synchronize_sched(void) |
193 | { | 193 | { |
194 | rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map) && | 194 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || |
195 | !lock_is_held(&rcu_lock_map) && | 195 | lock_is_held(&rcu_lock_map) || |
196 | !lock_is_held(&rcu_sched_lock_map), | 196 | lock_is_held(&rcu_sched_lock_map), |
197 | "Illegal synchronize_sched() in RCU read-side critical section"); | 197 | "Illegal synchronize_sched() in RCU read-side critical section"); |
198 | cond_resched(); | 198 | cond_resched(); |
199 | } | 199 | } |
200 | EXPORT_SYMBOL_GPL(synchronize_sched); | 200 | EXPORT_SYMBOL_GPL(synchronize_sched); |
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 65137bc28b2b..9f75f25cc5d9 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c | |||
@@ -70,6 +70,8 @@ MODULE_ALIAS("rcutree"); | |||
70 | 70 | ||
71 | static struct lock_class_key rcu_node_class[RCU_NUM_LVLS]; | 71 | static struct lock_class_key rcu_node_class[RCU_NUM_LVLS]; |
72 | static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS]; | 72 | static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS]; |
73 | static struct lock_class_key rcu_exp_class[RCU_NUM_LVLS]; | ||
74 | static struct lock_class_key rcu_exp_sched_class[RCU_NUM_LVLS]; | ||
73 | 75 | ||
74 | /* | 76 | /* |
75 | * In order to export the rcu_state name to the tracing tools, it | 77 | * In order to export the rcu_state name to the tracing tools, it |
@@ -124,13 +126,8 @@ module_param(rcu_fanout_exact, bool, 0444); | |||
124 | static int rcu_fanout_leaf = RCU_FANOUT_LEAF; | 126 | static int rcu_fanout_leaf = RCU_FANOUT_LEAF; |
125 | module_param(rcu_fanout_leaf, int, 0444); | 127 | module_param(rcu_fanout_leaf, int, 0444); |
126 | int rcu_num_lvls __read_mostly = RCU_NUM_LVLS; | 128 | int rcu_num_lvls __read_mostly = RCU_NUM_LVLS; |
127 | static int num_rcu_lvl[] = { /* Number of rcu_nodes at specified level. */ | 129 | /* Number of rcu_nodes at specified level. */ |
128 | NUM_RCU_LVL_0, | 130 | static int num_rcu_lvl[] = NUM_RCU_LVL_INIT; |
129 | NUM_RCU_LVL_1, | ||
130 | NUM_RCU_LVL_2, | ||
131 | NUM_RCU_LVL_3, | ||
132 | NUM_RCU_LVL_4, | ||
133 | }; | ||
134 | int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */ | 131 | int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */ |
135 | 132 | ||
136 | /* | 133 | /* |
@@ -649,12 +646,12 @@ static void rcu_eqs_enter_common(long long oldval, bool user) | |||
649 | * It is illegal to enter an extended quiescent state while | 646 | * It is illegal to enter an extended quiescent state while |
650 | * in an RCU read-side critical section. | 647 | * in an RCU read-side critical section. |
651 | */ | 648 | */ |
652 | rcu_lockdep_assert(!lock_is_held(&rcu_lock_map), | 649 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_lock_map), |
653 | "Illegal idle entry in RCU read-side critical section."); | 650 | "Illegal idle entry in RCU read-side critical section."); |
654 | rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map), | 651 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map), |
655 | "Illegal idle entry in RCU-bh read-side critical section."); | 652 | "Illegal idle entry in RCU-bh read-side critical section."); |
656 | rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map), | 653 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map), |
657 | "Illegal idle entry in RCU-sched read-side critical section."); | 654 | "Illegal idle entry in RCU-sched read-side critical section."); |
658 | } | 655 | } |
659 | 656 | ||
660 | /* | 657 | /* |
@@ -701,7 +698,7 @@ void rcu_idle_enter(void) | |||
701 | } | 698 | } |
702 | EXPORT_SYMBOL_GPL(rcu_idle_enter); | 699 | EXPORT_SYMBOL_GPL(rcu_idle_enter); |
703 | 700 | ||
704 | #ifdef CONFIG_RCU_USER_QS | 701 | #ifdef CONFIG_NO_HZ_FULL |
705 | /** | 702 | /** |
706 | * rcu_user_enter - inform RCU that we are resuming userspace. | 703 | * rcu_user_enter - inform RCU that we are resuming userspace. |
707 | * | 704 | * |
@@ -714,7 +711,7 @@ void rcu_user_enter(void) | |||
714 | { | 711 | { |
715 | rcu_eqs_enter(1); | 712 | rcu_eqs_enter(1); |
716 | } | 713 | } |
717 | #endif /* CONFIG_RCU_USER_QS */ | 714 | #endif /* CONFIG_NO_HZ_FULL */ |
718 | 715 | ||
719 | /** | 716 | /** |
720 | * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle | 717 | * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle |
@@ -828,7 +825,7 @@ void rcu_idle_exit(void) | |||
828 | } | 825 | } |
829 | EXPORT_SYMBOL_GPL(rcu_idle_exit); | 826 | EXPORT_SYMBOL_GPL(rcu_idle_exit); |
830 | 827 | ||
831 | #ifdef CONFIG_RCU_USER_QS | 828 | #ifdef CONFIG_NO_HZ_FULL |
832 | /** | 829 | /** |
833 | * rcu_user_exit - inform RCU that we are exiting userspace. | 830 | * rcu_user_exit - inform RCU that we are exiting userspace. |
834 | * | 831 | * |
@@ -839,7 +836,7 @@ void rcu_user_exit(void) | |||
839 | { | 836 | { |
840 | rcu_eqs_exit(1); | 837 | rcu_eqs_exit(1); |
841 | } | 838 | } |
842 | #endif /* CONFIG_RCU_USER_QS */ | 839 | #endif /* CONFIG_NO_HZ_FULL */ |
843 | 840 | ||
844 | /** | 841 | /** |
845 | * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle | 842 | * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle |
@@ -978,9 +975,9 @@ bool notrace rcu_is_watching(void) | |||
978 | { | 975 | { |
979 | bool ret; | 976 | bool ret; |
980 | 977 | ||
981 | preempt_disable(); | 978 | preempt_disable_notrace(); |
982 | ret = __rcu_is_watching(); | 979 | ret = __rcu_is_watching(); |
983 | preempt_enable(); | 980 | preempt_enable_notrace(); |
984 | return ret; | 981 | return ret; |
985 | } | 982 | } |
986 | EXPORT_SYMBOL_GPL(rcu_is_watching); | 983 | EXPORT_SYMBOL_GPL(rcu_is_watching); |
@@ -1178,9 +1175,11 @@ static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp) | |||
1178 | j = jiffies; | 1175 | j = jiffies; |
1179 | gpa = READ_ONCE(rsp->gp_activity); | 1176 | gpa = READ_ONCE(rsp->gp_activity); |
1180 | if (j - gpa > 2 * HZ) | 1177 | if (j - gpa > 2 * HZ) |
1181 | pr_err("%s kthread starved for %ld jiffies! g%lu c%lu f%#x\n", | 1178 | pr_err("%s kthread starved for %ld jiffies! g%lu c%lu f%#x s%d ->state=%#lx\n", |
1182 | rsp->name, j - gpa, | 1179 | rsp->name, j - gpa, |
1183 | rsp->gpnum, rsp->completed, rsp->gp_flags); | 1180 | rsp->gpnum, rsp->completed, |
1181 | rsp->gp_flags, rsp->gp_state, | ||
1182 | rsp->gp_kthread ? rsp->gp_kthread->state : 0); | ||
1184 | } | 1183 | } |
1185 | 1184 | ||
1186 | /* | 1185 | /* |
@@ -1906,6 +1905,26 @@ static int rcu_gp_init(struct rcu_state *rsp) | |||
1906 | } | 1905 | } |
1907 | 1906 | ||
1908 | /* | 1907 | /* |
1908 | * Helper function for wait_event_interruptible_timeout() wakeup | ||
1909 | * at force-quiescent-state time. | ||
1910 | */ | ||
1911 | static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp) | ||
1912 | { | ||
1913 | struct rcu_node *rnp = rcu_get_root(rsp); | ||
1914 | |||
1915 | /* Someone like call_rcu() requested a force-quiescent-state scan. */ | ||
1916 | *gfp = READ_ONCE(rsp->gp_flags); | ||
1917 | if (*gfp & RCU_GP_FLAG_FQS) | ||
1918 | return true; | ||
1919 | |||
1920 | /* The current grace period has completed. */ | ||
1921 | if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp)) | ||
1922 | return true; | ||
1923 | |||
1924 | return false; | ||
1925 | } | ||
1926 | |||
1927 | /* | ||
1909 | * Do one round of quiescent-state forcing. | 1928 | * Do one round of quiescent-state forcing. |
1910 | */ | 1929 | */ |
1911 | static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in) | 1930 | static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in) |
@@ -2041,6 +2060,7 @@ static int __noreturn rcu_gp_kthread(void *arg) | |||
2041 | wait_event_interruptible(rsp->gp_wq, | 2060 | wait_event_interruptible(rsp->gp_wq, |
2042 | READ_ONCE(rsp->gp_flags) & | 2061 | READ_ONCE(rsp->gp_flags) & |
2043 | RCU_GP_FLAG_INIT); | 2062 | RCU_GP_FLAG_INIT); |
2063 | rsp->gp_state = RCU_GP_DONE_GPS; | ||
2044 | /* Locking provides needed memory barrier. */ | 2064 | /* Locking provides needed memory barrier. */ |
2045 | if (rcu_gp_init(rsp)) | 2065 | if (rcu_gp_init(rsp)) |
2046 | break; | 2066 | break; |
@@ -2068,11 +2088,8 @@ static int __noreturn rcu_gp_kthread(void *arg) | |||
2068 | TPS("fqswait")); | 2088 | TPS("fqswait")); |
2069 | rsp->gp_state = RCU_GP_WAIT_FQS; | 2089 | rsp->gp_state = RCU_GP_WAIT_FQS; |
2070 | ret = wait_event_interruptible_timeout(rsp->gp_wq, | 2090 | ret = wait_event_interruptible_timeout(rsp->gp_wq, |
2071 | ((gf = READ_ONCE(rsp->gp_flags)) & | 2091 | rcu_gp_fqs_check_wake(rsp, &gf), j); |
2072 | RCU_GP_FLAG_FQS) || | 2092 | rsp->gp_state = RCU_GP_DOING_FQS; |
2073 | (!READ_ONCE(rnp->qsmask) && | ||
2074 | !rcu_preempt_blocked_readers_cgp(rnp)), | ||
2075 | j); | ||
2076 | /* Locking provides needed memory barriers. */ | 2093 | /* Locking provides needed memory barriers. */ |
2077 | /* If grace period done, leave loop. */ | 2094 | /* If grace period done, leave loop. */ |
2078 | if (!READ_ONCE(rnp->qsmask) && | 2095 | if (!READ_ONCE(rnp->qsmask) && |
@@ -2110,7 +2127,9 @@ static int __noreturn rcu_gp_kthread(void *arg) | |||
2110 | } | 2127 | } |
2111 | 2128 | ||
2112 | /* Handle grace-period end. */ | 2129 | /* Handle grace-period end. */ |
2130 | rsp->gp_state = RCU_GP_CLEANUP; | ||
2113 | rcu_gp_cleanup(rsp); | 2131 | rcu_gp_cleanup(rsp); |
2132 | rsp->gp_state = RCU_GP_CLEANED; | ||
2114 | } | 2133 | } |
2115 | } | 2134 | } |
2116 | 2135 | ||
@@ -3161,10 +3180,10 @@ static inline int rcu_blocking_is_gp(void) | |||
3161 | */ | 3180 | */ |
3162 | void synchronize_sched(void) | 3181 | void synchronize_sched(void) |
3163 | { | 3182 | { |
3164 | rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map) && | 3183 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || |
3165 | !lock_is_held(&rcu_lock_map) && | 3184 | lock_is_held(&rcu_lock_map) || |
3166 | !lock_is_held(&rcu_sched_lock_map), | 3185 | lock_is_held(&rcu_sched_lock_map), |
3167 | "Illegal synchronize_sched() in RCU-sched read-side critical section"); | 3186 | "Illegal synchronize_sched() in RCU-sched read-side critical section"); |
3168 | if (rcu_blocking_is_gp()) | 3187 | if (rcu_blocking_is_gp()) |
3169 | return; | 3188 | return; |
3170 | if (rcu_gp_is_expedited()) | 3189 | if (rcu_gp_is_expedited()) |
@@ -3188,10 +3207,10 @@ EXPORT_SYMBOL_GPL(synchronize_sched); | |||
3188 | */ | 3207 | */ |
3189 | void synchronize_rcu_bh(void) | 3208 | void synchronize_rcu_bh(void) |
3190 | { | 3209 | { |
3191 | rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map) && | 3210 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || |
3192 | !lock_is_held(&rcu_lock_map) && | 3211 | lock_is_held(&rcu_lock_map) || |
3193 | !lock_is_held(&rcu_sched_lock_map), | 3212 | lock_is_held(&rcu_sched_lock_map), |
3194 | "Illegal synchronize_rcu_bh() in RCU-bh read-side critical section"); | 3213 | "Illegal synchronize_rcu_bh() in RCU-bh read-side critical section"); |
3195 | if (rcu_blocking_is_gp()) | 3214 | if (rcu_blocking_is_gp()) |
3196 | return; | 3215 | return; |
3197 | if (rcu_gp_is_expedited()) | 3216 | if (rcu_gp_is_expedited()) |
@@ -3253,23 +3272,247 @@ void cond_synchronize_rcu(unsigned long oldstate) | |||
3253 | } | 3272 | } |
3254 | EXPORT_SYMBOL_GPL(cond_synchronize_rcu); | 3273 | EXPORT_SYMBOL_GPL(cond_synchronize_rcu); |
3255 | 3274 | ||
3256 | static int synchronize_sched_expedited_cpu_stop(void *data) | 3275 | /** |
3276 | * get_state_synchronize_sched - Snapshot current RCU-sched state | ||
3277 | * | ||
3278 | * Returns a cookie that is used by a later call to cond_synchronize_sched() | ||
3279 | * to determine whether or not a full grace period has elapsed in the | ||
3280 | * meantime. | ||
3281 | */ | ||
3282 | unsigned long get_state_synchronize_sched(void) | ||
3257 | { | 3283 | { |
3258 | /* | 3284 | /* |
3259 | * There must be a full memory barrier on each affected CPU | 3285 | * Any prior manipulation of RCU-protected data must happen |
3260 | * between the time that try_stop_cpus() is called and the | 3286 | * before the load from ->gpnum. |
3261 | * time that it returns. | 3287 | */ |
3262 | * | 3288 | smp_mb(); /* ^^^ */ |
3263 | * In the current initial implementation of cpu_stop, the | 3289 | |
3264 | * above condition is already met when the control reaches | 3290 | /* |
3265 | * this point and the following smp_mb() is not strictly | 3291 | * Make sure this load happens before the purportedly |
3266 | * necessary. Do smp_mb() anyway for documentation and | 3292 | * time-consuming work between get_state_synchronize_sched() |
3267 | * robustness against future implementation changes. | 3293 | * and cond_synchronize_sched(). |
3294 | */ | ||
3295 | return smp_load_acquire(&rcu_sched_state.gpnum); | ||
3296 | } | ||
3297 | EXPORT_SYMBOL_GPL(get_state_synchronize_sched); | ||
3298 | |||
3299 | /** | ||
3300 | * cond_synchronize_sched - Conditionally wait for an RCU-sched grace period | ||
3301 | * | ||
3302 | * @oldstate: return value from earlier call to get_state_synchronize_sched() | ||
3303 | * | ||
3304 | * If a full RCU-sched grace period has elapsed since the earlier call to | ||
3305 | * get_state_synchronize_sched(), just return. Otherwise, invoke | ||
3306 | * synchronize_sched() to wait for a full grace period. | ||
3307 | * | ||
3308 | * Yes, this function does not take counter wrap into account. But | ||
3309 | * counter wrap is harmless. If the counter wraps, we have waited for | ||
3310 | * more than 2 billion grace periods (and way more on a 64-bit system!), | ||
3311 | * so waiting for one additional grace period should be just fine. | ||
3312 | */ | ||
3313 | void cond_synchronize_sched(unsigned long oldstate) | ||
3314 | { | ||
3315 | unsigned long newstate; | ||
3316 | |||
3317 | /* | ||
3318 | * Ensure that this load happens before any RCU-destructive | ||
3319 | * actions the caller might carry out after we return. | ||
3268 | */ | 3320 | */ |
3269 | smp_mb(); /* See above comment block. */ | 3321 | newstate = smp_load_acquire(&rcu_sched_state.completed); |
3322 | if (ULONG_CMP_GE(oldstate, newstate)) | ||
3323 | synchronize_sched(); | ||
3324 | } | ||
3325 | EXPORT_SYMBOL_GPL(cond_synchronize_sched); | ||
3326 | |||
3327 | /* Adjust sequence number for start of update-side operation. */ | ||
3328 | static void rcu_seq_start(unsigned long *sp) | ||
3329 | { | ||
3330 | WRITE_ONCE(*sp, *sp + 1); | ||
3331 | smp_mb(); /* Ensure update-side operation after counter increment. */ | ||
3332 | WARN_ON_ONCE(!(*sp & 0x1)); | ||
3333 | } | ||
3334 | |||
3335 | /* Adjust sequence number for end of update-side operation. */ | ||
3336 | static void rcu_seq_end(unsigned long *sp) | ||
3337 | { | ||
3338 | smp_mb(); /* Ensure update-side operation before counter increment. */ | ||
3339 | WRITE_ONCE(*sp, *sp + 1); | ||
3340 | WARN_ON_ONCE(*sp & 0x1); | ||
3341 | } | ||
3342 | |||
3343 | /* Take a snapshot of the update side's sequence number. */ | ||
3344 | static unsigned long rcu_seq_snap(unsigned long *sp) | ||
3345 | { | ||
3346 | unsigned long s; | ||
3347 | |||
3348 | smp_mb(); /* Caller's modifications seen first by other CPUs. */ | ||
3349 | s = (READ_ONCE(*sp) + 3) & ~0x1; | ||
3350 | smp_mb(); /* Above access must not bleed into critical section. */ | ||
3351 | return s; | ||
3352 | } | ||
3353 | |||
3354 | /* | ||
3355 | * Given a snapshot from rcu_seq_snap(), determine whether or not a | ||
3356 | * full update-side operation has occurred. | ||
3357 | */ | ||
3358 | static bool rcu_seq_done(unsigned long *sp, unsigned long s) | ||
3359 | { | ||
3360 | return ULONG_CMP_GE(READ_ONCE(*sp), s); | ||
3361 | } | ||
3362 | |||
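A sketch of how the rcu_seq_*() helpers above fit together (my_seq and the surrounding flow are hypothetical):

	unsigned long my_seq;	/* even: no update in progress, odd: update in progress */
	unsigned long s;

	s = rcu_seq_snap(&my_seq);	/* counter value a later full update will reach */
	/* ... contend for the right to perform the update ... */
	if (rcu_seq_done(&my_seq, s))
		return;			/* another task already did a full update */
	rcu_seq_start(&my_seq);		/* counter becomes odd */
	/* ... update-side work ... */
	rcu_seq_end(&my_seq);		/* counter becomes even again */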
3363 | /* Wrapper functions for expedited grace periods. */ | ||
3364 | static void rcu_exp_gp_seq_start(struct rcu_state *rsp) | ||
3365 | { | ||
3366 | rcu_seq_start(&rsp->expedited_sequence); | ||
3367 | } | ||
3368 | static void rcu_exp_gp_seq_end(struct rcu_state *rsp) | ||
3369 | { | ||
3370 | rcu_seq_end(&rsp->expedited_sequence); | ||
3371 | smp_mb(); /* Ensure that consecutive grace periods serialize. */ | ||
3372 | } | ||
3373 | static unsigned long rcu_exp_gp_seq_snap(struct rcu_state *rsp) | ||
3374 | { | ||
3375 | return rcu_seq_snap(&rsp->expedited_sequence); | ||
3376 | } | ||
3377 | static bool rcu_exp_gp_seq_done(struct rcu_state *rsp, unsigned long s) | ||
3378 | { | ||
3379 | return rcu_seq_done(&rsp->expedited_sequence, s); | ||
3380 | } | ||
3381 | |||
3382 | /* Common code for synchronize_{rcu,sched}_expedited() work-done checking. */ | ||
3383 | static bool sync_exp_work_done(struct rcu_state *rsp, struct rcu_node *rnp, | ||
3384 | struct rcu_data *rdp, | ||
3385 | atomic_long_t *stat, unsigned long s) | ||
3386 | { | ||
3387 | if (rcu_exp_gp_seq_done(rsp, s)) { | ||
3388 | if (rnp) | ||
3389 | mutex_unlock(&rnp->exp_funnel_mutex); | ||
3390 | else if (rdp) | ||
3391 | mutex_unlock(&rdp->exp_funnel_mutex); | ||
3392 | /* Ensure test happens before caller kfree(). */ | ||
3393 | smp_mb__before_atomic(); /* ^^^ */ | ||
3394 | atomic_long_inc(stat); | ||
3395 | return true; | ||
3396 | } | ||
3397 | return false; | ||
3398 | } | ||
3399 | |||
3400 | /* | ||
3401 | * Funnel-lock acquisition for expedited grace periods. Returns a | ||
3402 | * pointer to the root rcu_node structure, or NULL if some other | ||
3403 | * task did the expedited grace period for us. | ||
3404 | */ | ||
3405 | static struct rcu_node *exp_funnel_lock(struct rcu_state *rsp, unsigned long s) | ||
3406 | { | ||
3407 | struct rcu_data *rdp; | ||
3408 | struct rcu_node *rnp0; | ||
3409 | struct rcu_node *rnp1 = NULL; | ||
3410 | |||
3411 | /* | ||
3412 | * First try directly acquiring the root lock in order to reduce | ||
3413 | * latency in the common case where expedited grace periods are | ||
3414 | * rare. We check mutex_is_locked() to avoid pathological levels of | ||
3415 | * memory contention on ->exp_funnel_mutex in the heavy-load case. | ||
3416 | */ | ||
3417 | rnp0 = rcu_get_root(rsp); | ||
3418 | if (!mutex_is_locked(&rnp0->exp_funnel_mutex)) { | ||
3419 | if (mutex_trylock(&rnp0->exp_funnel_mutex)) { | ||
3420 | if (sync_exp_work_done(rsp, rnp0, NULL, | ||
3421 | &rsp->expedited_workdone0, s)) | ||
3422 | return NULL; | ||
3423 | return rnp0; | ||
3424 | } | ||
3425 | } | ||
3426 | |||
3427 | /* | ||
3428 | * Each pass through the following loop works its way | ||
3429 | * up the rcu_node tree, returning if others have done the | ||
3430 | * work, or otherwise falling through while holding the root rnp's | ||
3431 | * ->exp_funnel_mutex. The mapping from CPU to rcu_node structure | ||
3432 | * can be inexact, as it is just promoting locality and is not | ||
3433 | * strictly needed for correctness. | ||
3434 | */ | ||
3435 | rdp = per_cpu_ptr(rsp->rda, raw_smp_processor_id()); | ||
3436 | if (sync_exp_work_done(rsp, NULL, NULL, &rsp->expedited_workdone1, s)) | ||
3437 | return NULL; | ||
3438 | mutex_lock(&rdp->exp_funnel_mutex); | ||
3439 | rnp0 = rdp->mynode; | ||
3440 | for (; rnp0 != NULL; rnp0 = rnp0->parent) { | ||
3441 | if (sync_exp_work_done(rsp, rnp1, rdp, | ||
3442 | &rsp->expedited_workdone2, s)) | ||
3443 | return NULL; | ||
3444 | mutex_lock(&rnp0->exp_funnel_mutex); | ||
3445 | if (rnp1) | ||
3446 | mutex_unlock(&rnp1->exp_funnel_mutex); | ||
3447 | else | ||
3448 | mutex_unlock(&rdp->exp_funnel_mutex); | ||
3449 | rnp1 = rnp0; | ||
3450 | } | ||
3451 | if (sync_exp_work_done(rsp, rnp1, rdp, | ||
3452 | &rsp->expedited_workdone3, s)) | ||
3453 | return NULL; | ||
3454 | return rnp1; | ||
3455 | } | ||
3456 | |||
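The funnel idea in exp_funnel_lock() generalizes to any tree of mutexes: climb toward the root holding at most one mutex at a time, and bail out at each step if the snapshotted work has already completed, so concurrent requesters queue on different subtrees instead of all contending on the root. Below is a stand-alone sketch of that shape using pthreads; it ignores the per-CPU rdp starting point, memory-ordering details, and counter wrap, and assumes a non-NULL leaf.

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stddef.h>

	struct tnode {				/* toy stand-in for rcu_node */
		pthread_mutex_t funnel;
		struct tnode *parent;		/* NULL at the root */
	};

	static _Atomic unsigned long done_seq;	/* stands in for the completed sequence */

	static bool work_already_done(unsigned long snap)
	{
		return atomic_load(&done_seq) >= snap;	/* wrap ignored for brevity */
	}

	/*
	 * Climb from a leaf to the root holding at most one funnel mutex at a
	 * time.  Return the root (still held) if the caller must do the work,
	 * or NULL if someone else's completed operation already covers it.
	 */
	struct tnode *funnel_lock(struct tnode *leaf, unsigned long snap)
	{
		struct tnode *held = NULL;
		struct tnode *np;

		for (np = leaf; np != NULL; np = np->parent) {
			if (work_already_done(snap)) {
				if (held)
					pthread_mutex_unlock(&held->funnel);
				return NULL;
			}
			pthread_mutex_lock(&np->funnel);
			if (held)
				pthread_mutex_unlock(&held->funnel);
			held = np;
		}
		if (work_already_done(snap)) {	/* one last chance to bail */
			pthread_mutex_unlock(&held->funnel);
			return NULL;
		}
		return held;		/* the root; caller unlocks after the work */
	}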
3457 | /* Invoked on each online non-idle CPU for expedited quiescent state. */ | ||
3458 | static int synchronize_sched_expedited_cpu_stop(void *data) | ||
3459 | { | ||
3460 | struct rcu_data *rdp = data; | ||
3461 | struct rcu_state *rsp = rdp->rsp; | ||
3462 | |||
3463 | /* We are here: If we are last, do the wakeup. */ | ||
3464 | rdp->exp_done = true; | ||
3465 | if (atomic_dec_and_test(&rsp->expedited_need_qs)) | ||
3466 | wake_up(&rsp->expedited_wq); | ||
3270 | return 0; | 3467 | return 0; |
3271 | } | 3468 | } |
3272 | 3469 | ||
3470 | static void synchronize_sched_expedited_wait(struct rcu_state *rsp) | ||
3471 | { | ||
3472 | int cpu; | ||
3473 | unsigned long jiffies_stall; | ||
3474 | unsigned long jiffies_start; | ||
3475 | struct rcu_data *rdp; | ||
3476 | int ret; | ||
3477 | |||
3478 | jiffies_stall = rcu_jiffies_till_stall_check(); | ||
3479 | jiffies_start = jiffies; | ||
3480 | |||
3481 | for (;;) { | ||
3482 | ret = wait_event_interruptible_timeout( | ||
3483 | rsp->expedited_wq, | ||
3484 | !atomic_read(&rsp->expedited_need_qs), | ||
3485 | jiffies_stall); | ||
3486 | if (ret > 0) | ||
3487 | return; | ||
3488 | if (ret < 0) { | ||
3489 | /* Hit a signal, disable CPU stall warnings. */ | ||
3490 | wait_event(rsp->expedited_wq, | ||
3491 | !atomic_read(&rsp->expedited_need_qs)); | ||
3492 | return; | ||
3493 | } | ||
3494 | pr_err("INFO: %s detected expedited stalls on CPUs: {", | ||
3495 | rsp->name); | ||
3496 | for_each_online_cpu(cpu) { | ||
3497 | rdp = per_cpu_ptr(rsp->rda, cpu); | ||
3498 | |||
3499 | if (rdp->exp_done) | ||
3500 | continue; | ||
3501 | pr_cont(" %d", cpu); | ||
3502 | } | ||
3503 | pr_cont(" } %lu jiffies s: %lu\n", | ||
3504 | jiffies - jiffies_start, rsp->expedited_sequence); | ||
3505 | for_each_online_cpu(cpu) { | ||
3506 | rdp = per_cpu_ptr(rsp->rda, cpu); | ||
3507 | |||
3508 | if (rdp->exp_done) | ||
3509 | continue; | ||
3510 | dump_cpu_task(cpu); | ||
3511 | } | ||
3512 | jiffies_stall = 3 * rcu_jiffies_till_stall_check() + 3; | ||
3513 | } | ||
3514 | } | ||
3515 | |||
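The wait loop above leans on the three-way return convention of wait_event_interruptible_timeout(): positive when the condition became true, zero on timeout (print the stall report and retry with a longer timeout), negative when interrupted by a signal (fall back to an uninterruptible wait with stall warnings disabled). A rough user-space analog of the timeout-and-report path only, with a pthread condition variable standing in for the wait queue:

	#include <errno.h>
	#include <pthread.h>
	#include <stdbool.h>
	#include <stdio.h>
	#include <time.h>

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
	static bool all_checked_in;	/* set (under lock) by the last worker */

	void wait_with_stall_reports(void)
	{
		unsigned int base = 21;	/* initial timeout in seconds; the kernel */
					/* derives its own from                   */
					/* rcu_jiffies_till_stall_check()         */
		unsigned int timeout = base;
		struct timespec deadline;

		pthread_mutex_lock(&lock);
		while (!all_checked_in) {
			clock_gettime(CLOCK_REALTIME, &deadline);
			deadline.tv_sec += timeout;
			if (pthread_cond_timedwait(&cv, &lock, &deadline) == ETIMEDOUT &&
			    !all_checked_in) {
				fprintf(stderr, "INFO: still waiting for check-ins\n");
				timeout = 3 * base + 3;	/* later reports are rarer */
			}
		}
		pthread_mutex_unlock(&lock);
	}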
3273 | /** | 3516 | /** |
3274 | * synchronize_sched_expedited - Brute-force RCU-sched grace period | 3517 | * synchronize_sched_expedited - Brute-force RCU-sched grace period |
3275 | * | 3518 | * |
@@ -3281,58 +3524,21 @@ static int synchronize_sched_expedited_cpu_stop(void *data) | |||
3281 | * restructure your code to batch your updates, and then use a single | 3524 | * restructure your code to batch your updates, and then use a single |
3282 | * synchronize_sched() instead. | 3525 | * synchronize_sched() instead. |
3283 | * | 3526 | * |
3284 | * This implementation can be thought of as an application of ticket | 3527 | * This implementation can be thought of as an application of sequence |
3285 | * locking to RCU, with sync_sched_expedited_started and | 3528 | * locking to expedited grace periods, but using the sequence counter to |
3286 | * sync_sched_expedited_done taking on the roles of the halves | 3529 | * determine when someone else has already done the work instead of for |
3287 | * of the ticket-lock word. Each task atomically increments | 3530 | * retrying readers. |
3288 | * sync_sched_expedited_started upon entry, snapshotting the old value, | ||
3289 | * then attempts to stop all the CPUs. If this succeeds, then each | ||
3290 | * CPU will have executed a context switch, resulting in an RCU-sched | ||
3291 | * grace period. We are then done, so we use atomic_cmpxchg() to | ||
3292 | * update sync_sched_expedited_done to match our snapshot -- but | ||
3293 | * only if someone else has not already advanced past our snapshot. | ||
3294 | * | ||
3295 | * On the other hand, if try_stop_cpus() fails, we check the value | ||
3296 | * of sync_sched_expedited_done. If it has advanced past our | ||
3297 | * initial snapshot, then someone else must have forced a grace period | ||
3298 | * some time after we took our snapshot. In this case, our work is | ||
3299 | * done for us, and we can simply return. Otherwise, we try again, | ||
3300 | * but keep our initial snapshot for purposes of checking for someone | ||
3301 | * doing our work for us. | ||
3302 | * | ||
3303 | * If we fail too many times in a row, we fall back to synchronize_sched(). | ||
3304 | */ | 3531 | */ |
3305 | void synchronize_sched_expedited(void) | 3532 | void synchronize_sched_expedited(void) |
3306 | { | 3533 | { |
3307 | cpumask_var_t cm; | ||
3308 | bool cma = false; | ||
3309 | int cpu; | 3534 | int cpu; |
3310 | long firstsnap, s, snap; | 3535 | unsigned long s; |
3311 | int trycount = 0; | 3536 | struct rcu_node *rnp; |
3312 | struct rcu_state *rsp = &rcu_sched_state; | 3537 | struct rcu_state *rsp = &rcu_sched_state; |
3313 | 3538 | ||
3314 | /* | 3539 | /* Take a snapshot of the sequence number. */ |
3315 | * If we are in danger of counter wrap, just do synchronize_sched(). | 3540 | s = rcu_exp_gp_seq_snap(rsp); |
3316 | * By allowing sync_sched_expedited_started to advance no more than | ||
3317 | * ULONG_MAX/8 ahead of sync_sched_expedited_done, we are ensuring | ||
3318 | * that more than 3.5 billion CPUs would be required to force a | ||
3319 | * counter wrap on a 32-bit system. Quite a few more CPUs would of | ||
3320 | * course be required on a 64-bit system. | ||
3321 | */ | ||
3322 | if (ULONG_CMP_GE((ulong)atomic_long_read(&rsp->expedited_start), | ||
3323 | (ulong)atomic_long_read(&rsp->expedited_done) + | ||
3324 | ULONG_MAX / 8)) { | ||
3325 | wait_rcu_gp(call_rcu_sched); | ||
3326 | atomic_long_inc(&rsp->expedited_wrap); | ||
3327 | return; | ||
3328 | } | ||
3329 | 3541 | ||
3330 | /* | ||
3331 | * Take a ticket. Note that atomic_inc_return() implies a | ||
3332 | * full memory barrier. | ||
3333 | */ | ||
3334 | snap = atomic_long_inc_return(&rsp->expedited_start); | ||
3335 | firstsnap = snap; | ||
3336 | if (!try_get_online_cpus()) { | 3542 | if (!try_get_online_cpus()) { |
3337 | /* CPU hotplug operation in flight, fall back to normal GP. */ | 3543 | /* CPU hotplug operation in flight, fall back to normal GP. */ |
3338 | wait_rcu_gp(call_rcu_sched); | 3544 | wait_rcu_gp(call_rcu_sched); |
@@ -3341,100 +3547,38 @@ void synchronize_sched_expedited(void) | |||
3341 | } | 3547 | } |
3342 | WARN_ON_ONCE(cpu_is_offline(raw_smp_processor_id())); | 3548 | WARN_ON_ONCE(cpu_is_offline(raw_smp_processor_id())); |
3343 | 3549 | ||
3344 | /* Offline CPUs, idle CPUs, and any CPU we run on are quiescent. */ | 3550 | rnp = exp_funnel_lock(rsp, s); |
3345 | cma = zalloc_cpumask_var(&cm, GFP_KERNEL); | 3551 | if (rnp == NULL) { |
3346 | if (cma) { | 3552 | put_online_cpus(); |
3347 | cpumask_copy(cm, cpu_online_mask); | 3553 | return; /* Someone else did our work for us. */ |
3348 | cpumask_clear_cpu(raw_smp_processor_id(), cm); | ||
3349 | for_each_cpu(cpu, cm) { | ||
3350 | struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu); | ||
3351 | |||
3352 | if (!(atomic_add_return(0, &rdtp->dynticks) & 0x1)) | ||
3353 | cpumask_clear_cpu(cpu, cm); | ||
3354 | } | ||
3355 | if (cpumask_weight(cm) == 0) | ||
3356 | goto all_cpus_idle; | ||
3357 | } | 3554 | } |
3358 | 3555 | ||
3359 | /* | 3556 | rcu_exp_gp_seq_start(rsp); |
3360 | * Each pass through the following loop attempts to force a | ||
3361 | * context switch on each CPU. | ||
3362 | */ | ||
3363 | while (try_stop_cpus(cma ? cm : cpu_online_mask, | ||
3364 | synchronize_sched_expedited_cpu_stop, | ||
3365 | NULL) == -EAGAIN) { | ||
3366 | put_online_cpus(); | ||
3367 | atomic_long_inc(&rsp->expedited_tryfail); | ||
3368 | |||
3369 | /* Check to see if someone else did our work for us. */ | ||
3370 | s = atomic_long_read(&rsp->expedited_done); | ||
3371 | if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) { | ||
3372 | /* ensure test happens before caller kfree */ | ||
3373 | smp_mb__before_atomic(); /* ^^^ */ | ||
3374 | atomic_long_inc(&rsp->expedited_workdone1); | ||
3375 | free_cpumask_var(cm); | ||
3376 | return; | ||
3377 | } | ||
3378 | 3557 | ||
3379 | /* No joy, try again later. Or just synchronize_sched(). */ | 3558 | /* Stop each CPU that is online, non-idle, and not us. */ |
3380 | if (trycount++ < 10) { | 3559 | init_waitqueue_head(&rsp->expedited_wq); |
3381 | udelay(trycount * num_online_cpus()); | 3560 | atomic_set(&rsp->expedited_need_qs, 1); /* Extra count avoids race. */ |
3382 | } else { | 3561 | for_each_online_cpu(cpu) { |
3383 | wait_rcu_gp(call_rcu_sched); | 3562 | struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); |
3384 | atomic_long_inc(&rsp->expedited_normal); | 3563 | struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu); |
3385 | free_cpumask_var(cm); | ||
3386 | return; | ||
3387 | } | ||
3388 | 3564 | ||
3389 | /* Recheck to see if someone else did our work for us. */ | 3565 | rdp->exp_done = false; |
3390 | s = atomic_long_read(&rsp->expedited_done); | ||
3391 | if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) { | ||
3392 | /* ensure test happens before caller kfree */ | ||
3393 | smp_mb__before_atomic(); /* ^^^ */ | ||
3394 | atomic_long_inc(&rsp->expedited_workdone2); | ||
3395 | free_cpumask_var(cm); | ||
3396 | return; | ||
3397 | } | ||
3398 | 3566 | ||
3399 | /* | 3567 | /* Skip our CPU and any idle CPUs. */ |
3400 | * Refetching sync_sched_expedited_started allows later | 3568 | if (raw_smp_processor_id() == cpu || |
3401 | * callers to piggyback on our grace period. We retry | 3569 | !(atomic_add_return(0, &rdtp->dynticks) & 0x1)) |
3402 | * after they started, so our grace period works for them, | 3570 | continue; |
3403 | * and they started after our first try, so their grace | 3571 | atomic_inc(&rsp->expedited_need_qs); |
3404 | * period works for us. | 3572 | stop_one_cpu_nowait(cpu, synchronize_sched_expedited_cpu_stop, |
3405 | */ | 3573 | rdp, &rdp->exp_stop_work); |
3406 | if (!try_get_online_cpus()) { | ||
3407 | /* CPU hotplug operation in flight, use normal GP. */ | ||
3408 | wait_rcu_gp(call_rcu_sched); | ||
3409 | atomic_long_inc(&rsp->expedited_normal); | ||
3410 | free_cpumask_var(cm); | ||
3411 | return; | ||
3412 | } | ||
3413 | snap = atomic_long_read(&rsp->expedited_start); | ||
3414 | smp_mb(); /* ensure read is before try_stop_cpus(). */ | ||
3415 | } | 3574 | } |
3416 | atomic_long_inc(&rsp->expedited_stoppedcpus); | ||
3417 | 3575 | ||
3418 | all_cpus_idle: | 3576 | /* Remove extra count and, if necessary, wait for CPUs to stop. */ |
3419 | free_cpumask_var(cm); | 3577 | if (!atomic_dec_and_test(&rsp->expedited_need_qs)) |
3578 | synchronize_sched_expedited_wait(rsp); | ||
3420 | 3579 | ||
3421 | /* | 3580 | rcu_exp_gp_seq_end(rsp); |
3422 | * Everyone up to our most recent fetch is covered by our grace | 3581 | mutex_unlock(&rnp->exp_funnel_mutex); |
3423 | * period. Update the counter, but only if our work is still | ||
3424 | * relevant -- which it won't be if someone who started later | ||
3425 | * than we did already did their update. | ||
3426 | */ | ||
3427 | do { | ||
3428 | atomic_long_inc(&rsp->expedited_done_tries); | ||
3429 | s = atomic_long_read(&rsp->expedited_done); | ||
3430 | if (ULONG_CMP_GE((ulong)s, (ulong)snap)) { | ||
3431 | /* ensure test happens before caller kfree */ | ||
3432 | smp_mb__before_atomic(); /* ^^^ */ | ||
3433 | atomic_long_inc(&rsp->expedited_done_lost); | ||
3434 | break; | ||
3435 | } | ||
3436 | } while (atomic_long_cmpxchg(&rsp->expedited_done, s, snap) != s); | ||
3437 | atomic_long_inc(&rsp->expedited_done_exit); | ||
3438 | 3582 | ||
3439 | put_online_cpus(); | 3583 | put_online_cpus(); |
3440 | } | 3584 | } |
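The ->expedited_need_qs handling above is the classic "start the count at one" trick: the extra reference keeps the counter from reaching zero while CPUs are still being dispatched, and whoever performs the final decrement, dispatcher or last stopper, does the wakeup. The same structure appears again with ->barrier_cpu_count in _rcu_barrier() below. A self-contained user-space rendition of the pattern (error handling and reuse of the flag omitted):

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdbool.h>

	static atomic_int pending;
	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
	static bool all_done;		/* single-shot for brevity */

	static void check_in(void)	/* called by each worker when finished */
	{
		if (atomic_fetch_sub(&pending, 1) == 1) {	/* we were the last */
			pthread_mutex_lock(&lock);
			all_done = true;
			pthread_cond_signal(&cv);
			pthread_mutex_unlock(&lock);
		}
	}

	static void *worker(void *arg)
	{
		(void)arg;
		/* ...real per-CPU work would go here... */
		check_in();
		return NULL;
	}

	void dispatch_and_wait(int n)
	{
		pthread_t tid;
		int i;

		atomic_store(&pending, 1);	/* extra count avoids race */
		for (i = 0; i < n; i++) {
			atomic_fetch_add(&pending, 1);
			pthread_create(&tid, NULL, worker, NULL);
			pthread_detach(tid);
		}
		/* Drop the extra count; wait only if workers are still pending. */
		if (atomic_fetch_sub(&pending, 1) != 1) {
			pthread_mutex_lock(&lock);
			while (!all_done)
				pthread_cond_wait(&cv, &lock);
			pthread_mutex_unlock(&lock);
		}
	}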
@@ -3571,10 +3715,10 @@ static void rcu_barrier_callback(struct rcu_head *rhp) | |||
3571 | struct rcu_state *rsp = rdp->rsp; | 3715 | struct rcu_state *rsp = rdp->rsp; |
3572 | 3716 | ||
3573 | if (atomic_dec_and_test(&rsp->barrier_cpu_count)) { | 3717 | if (atomic_dec_and_test(&rsp->barrier_cpu_count)) { |
3574 | _rcu_barrier_trace(rsp, "LastCB", -1, rsp->n_barrier_done); | 3718 | _rcu_barrier_trace(rsp, "LastCB", -1, rsp->barrier_sequence); |
3575 | complete(&rsp->barrier_completion); | 3719 | complete(&rsp->barrier_completion); |
3576 | } else { | 3720 | } else { |
3577 | _rcu_barrier_trace(rsp, "CB", -1, rsp->n_barrier_done); | 3721 | _rcu_barrier_trace(rsp, "CB", -1, rsp->barrier_sequence); |
3578 | } | 3722 | } |
3579 | } | 3723 | } |
3580 | 3724 | ||
@@ -3586,7 +3730,7 @@ static void rcu_barrier_func(void *type) | |||
3586 | struct rcu_state *rsp = type; | 3730 | struct rcu_state *rsp = type; |
3587 | struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); | 3731 | struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); |
3588 | 3732 | ||
3589 | _rcu_barrier_trace(rsp, "IRQ", -1, rsp->n_barrier_done); | 3733 | _rcu_barrier_trace(rsp, "IRQ", -1, rsp->barrier_sequence); |
3590 | atomic_inc(&rsp->barrier_cpu_count); | 3734 | atomic_inc(&rsp->barrier_cpu_count); |
3591 | rsp->call(&rdp->barrier_head, rcu_barrier_callback); | 3735 | rsp->call(&rdp->barrier_head, rcu_barrier_callback); |
3592 | } | 3736 | } |
@@ -3599,55 +3743,24 @@ static void _rcu_barrier(struct rcu_state *rsp) | |||
3599 | { | 3743 | { |
3600 | int cpu; | 3744 | int cpu; |
3601 | struct rcu_data *rdp; | 3745 | struct rcu_data *rdp; |
3602 | unsigned long snap = READ_ONCE(rsp->n_barrier_done); | 3746 | unsigned long s = rcu_seq_snap(&rsp->barrier_sequence); |
3603 | unsigned long snap_done; | ||
3604 | 3747 | ||
3605 | _rcu_barrier_trace(rsp, "Begin", -1, snap); | 3748 | _rcu_barrier_trace(rsp, "Begin", -1, s); |
3606 | 3749 | ||
3607 | /* Take mutex to serialize concurrent rcu_barrier() requests. */ | 3750 | /* Take mutex to serialize concurrent rcu_barrier() requests. */ |
3608 | mutex_lock(&rsp->barrier_mutex); | 3751 | mutex_lock(&rsp->barrier_mutex); |
3609 | 3752 | ||
3610 | /* | 3753 | /* Did someone else do our work for us? */ |
3611 | * Ensure that all prior references, including to ->n_barrier_done, | 3754 | if (rcu_seq_done(&rsp->barrier_sequence, s)) { |
3612 | * are ordered before the _rcu_barrier() machinery. | 3755 | _rcu_barrier_trace(rsp, "EarlyExit", -1, rsp->barrier_sequence); |
3613 | */ | ||
3614 | smp_mb(); /* See above block comment. */ | ||
3615 | |||
3616 | /* | ||
3617 | * Recheck ->n_barrier_done to see if others did our work for us. | ||
3618 | * This means checking ->n_barrier_done for an even-to-odd-to-even | ||
3619 | * transition. The "if" expression below therefore rounds the old | ||
3620 | * value up to the next even number and adds two before comparing. | ||
3621 | */ | ||
3622 | snap_done = rsp->n_barrier_done; | ||
3623 | _rcu_barrier_trace(rsp, "Check", -1, snap_done); | ||
3624 | |||
3625 | /* | ||
3626 | * If the value in snap is odd, we needed to wait for the current | ||
3627 | * rcu_barrier() to complete, then wait for the next one, in other | ||
3628 | * words, we need the value of snap_done to be three larger than | ||
3629 | * the value of snap. On the other hand, if the value in snap is | ||
3630 | * even, we only had to wait for the next rcu_barrier() to complete, | ||
3631 | * in other words, we need the value of snap_done to be only two | ||
3632 | * greater than the value of snap. The "(snap + 3) & ~0x1" computes | ||
3633 | * this for us (thank you, Linus!). | ||
3634 | */ | ||
3635 | if (ULONG_CMP_GE(snap_done, (snap + 3) & ~0x1)) { | ||
3636 | _rcu_barrier_trace(rsp, "EarlyExit", -1, snap_done); | ||
3637 | smp_mb(); /* caller's subsequent code after above check. */ | 3756 | smp_mb(); /* caller's subsequent code after above check. */ |
3638 | mutex_unlock(&rsp->barrier_mutex); | 3757 | mutex_unlock(&rsp->barrier_mutex); |
3639 | return; | 3758 | return; |
3640 | } | 3759 | } |
3641 | 3760 | ||
3642 | /* | 3761 | /* Mark the start of the barrier operation. */ |
3643 | * Increment ->n_barrier_done to avoid duplicate work. Use | 3762 | rcu_seq_start(&rsp->barrier_sequence); |
3644 | * WRITE_ONCE() to prevent the compiler from speculating | 3763 | _rcu_barrier_trace(rsp, "Inc1", -1, rsp->barrier_sequence); |
3645 | * the increment to precede the early-exit check. | ||
3646 | */ | ||
3647 | WRITE_ONCE(rsp->n_barrier_done, rsp->n_barrier_done + 1); | ||
3648 | WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 1); | ||
3649 | _rcu_barrier_trace(rsp, "Inc1", -1, rsp->n_barrier_done); | ||
3650 | smp_mb(); /* Order ->n_barrier_done increment with below mechanism. */ | ||
3651 | 3764 | ||
3652 | /* | 3765 | /* |
3653 | * Initialize the count to one rather than to zero in order to | 3766 | * Initialize the count to one rather than to zero in order to |
@@ -3671,10 +3784,10 @@ static void _rcu_barrier(struct rcu_state *rsp) | |||
3671 | if (rcu_is_nocb_cpu(cpu)) { | 3784 | if (rcu_is_nocb_cpu(cpu)) { |
3672 | if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) { | 3785 | if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) { |
3673 | _rcu_barrier_trace(rsp, "OfflineNoCB", cpu, | 3786 | _rcu_barrier_trace(rsp, "OfflineNoCB", cpu, |
3674 | rsp->n_barrier_done); | 3787 | rsp->barrier_sequence); |
3675 | } else { | 3788 | } else { |
3676 | _rcu_barrier_trace(rsp, "OnlineNoCB", cpu, | 3789 | _rcu_barrier_trace(rsp, "OnlineNoCB", cpu, |
3677 | rsp->n_barrier_done); | 3790 | rsp->barrier_sequence); |
3678 | smp_mb__before_atomic(); | 3791 | smp_mb__before_atomic(); |
3679 | atomic_inc(&rsp->barrier_cpu_count); | 3792 | atomic_inc(&rsp->barrier_cpu_count); |
3680 | __call_rcu(&rdp->barrier_head, | 3793 | __call_rcu(&rdp->barrier_head, |
@@ -3682,11 +3795,11 @@ static void _rcu_barrier(struct rcu_state *rsp) | |||
3682 | } | 3795 | } |
3683 | } else if (READ_ONCE(rdp->qlen)) { | 3796 | } else if (READ_ONCE(rdp->qlen)) { |
3684 | _rcu_barrier_trace(rsp, "OnlineQ", cpu, | 3797 | _rcu_barrier_trace(rsp, "OnlineQ", cpu, |
3685 | rsp->n_barrier_done); | 3798 | rsp->barrier_sequence); |
3686 | smp_call_function_single(cpu, rcu_barrier_func, rsp, 1); | 3799 | smp_call_function_single(cpu, rcu_barrier_func, rsp, 1); |
3687 | } else { | 3800 | } else { |
3688 | _rcu_barrier_trace(rsp, "OnlineNQ", cpu, | 3801 | _rcu_barrier_trace(rsp, "OnlineNQ", cpu, |
3689 | rsp->n_barrier_done); | 3802 | rsp->barrier_sequence); |
3690 | } | 3803 | } |
3691 | } | 3804 | } |
3692 | put_online_cpus(); | 3805 | put_online_cpus(); |
@@ -3698,16 +3811,13 @@ static void _rcu_barrier(struct rcu_state *rsp) | |||
3698 | if (atomic_dec_and_test(&rsp->barrier_cpu_count)) | 3811 | if (atomic_dec_and_test(&rsp->barrier_cpu_count)) |
3699 | complete(&rsp->barrier_completion); | 3812 | complete(&rsp->barrier_completion); |
3700 | 3813 | ||
3701 | /* Increment ->n_barrier_done to prevent duplicate work. */ | ||
3702 | smp_mb(); /* Keep increment after above mechanism. */ | ||
3703 | WRITE_ONCE(rsp->n_barrier_done, rsp->n_barrier_done + 1); | ||
3704 | WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 0); | ||
3705 | _rcu_barrier_trace(rsp, "Inc2", -1, rsp->n_barrier_done); | ||
3706 | smp_mb(); /* Keep increment before caller's subsequent code. */ | ||
3707 | |||
3708 | /* Wait for all rcu_barrier_callback() callbacks to be invoked. */ | 3814 | /* Wait for all rcu_barrier_callback() callbacks to be invoked. */ |
3709 | wait_for_completion(&rsp->barrier_completion); | 3815 | wait_for_completion(&rsp->barrier_completion); |
3710 | 3816 | ||
3817 | /* Mark the end of the barrier operation. */ | ||
3818 | _rcu_barrier_trace(rsp, "Inc2", -1, rsp->barrier_sequence); | ||
3819 | rcu_seq_end(&rsp->barrier_sequence); | ||
3820 | |||
3711 | /* Other rcu_barrier() invocations can now safely proceed. */ | 3821 | /* Other rcu_barrier() invocations can now safely proceed. */ |
3712 | mutex_unlock(&rsp->barrier_mutex); | 3822 | mutex_unlock(&rsp->barrier_mutex); |
3713 | } | 3823 | } |
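The new early-exit test is the same "snapshot, then check after taking the lock" idiom used by sync_exp_work_done(): take rcu_seq_snap() before the mutex, and if the counter has already reached that target once the mutex is held, a complete operation ran in the interim and covered this caller. A compact sketch of the idiom, using a plain mutex plus C11 atomics and eliding the kernel's memory-ordering subtleties:

	#include <pthread.h>
	#include <stdatomic.h>

	#define SEQ_GE(a, b) ((long)((a) - (b)) >= 0)	/* wrap-tolerant compare */

	static pthread_mutex_t op_mutex = PTHREAD_MUTEX_INITIALIZER;
	static _Atomic unsigned long op_seq;	/* even: idle, odd: op in progress */

	static void do_expensive_flush(void)	/* stand-in for the real work */
	{
	}

	void flush_once_needed(void)
	{
		/* Target: the end of the next full operation, a la rcu_seq_snap(). */
		unsigned long s = (atomic_load(&op_seq) + 3) & ~0x1UL;

		pthread_mutex_lock(&op_mutex);
		if (SEQ_GE(atomic_load(&op_seq), s)) {
			pthread_mutex_unlock(&op_mutex);
			return;			/* someone else's op covered us */
		}
		atomic_fetch_add(&op_seq, 1);	/* start: counter goes odd */
		do_expensive_flush();
		atomic_fetch_add(&op_seq, 1);	/* end: counter goes even again */
		pthread_mutex_unlock(&op_mutex);
	}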
@@ -3770,6 +3880,7 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp) | |||
3770 | WARN_ON_ONCE(atomic_read(&rdp->dynticks->dynticks) != 1); | 3880 | WARN_ON_ONCE(atomic_read(&rdp->dynticks->dynticks) != 1); |
3771 | rdp->cpu = cpu; | 3881 | rdp->cpu = cpu; |
3772 | rdp->rsp = rsp; | 3882 | rdp->rsp = rsp; |
3883 | mutex_init(&rdp->exp_funnel_mutex); | ||
3773 | rcu_boot_init_nocb_percpu_data(rdp); | 3884 | rcu_boot_init_nocb_percpu_data(rdp); |
3774 | raw_spin_unlock_irqrestore(&rnp->lock, flags); | 3885 | raw_spin_unlock_irqrestore(&rnp->lock, flags); |
3775 | } | 3886 | } |
@@ -3961,22 +4072,22 @@ void rcu_scheduler_starting(void) | |||
3961 | * Compute the per-level fanout, either using the exact fanout specified | 4072 | * Compute the per-level fanout, either using the exact fanout specified |
3962 | * or balancing the tree, depending on the rcu_fanout_exact boot parameter. | 4073 | * or balancing the tree, depending on the rcu_fanout_exact boot parameter. |
3963 | */ | 4074 | */ |
3964 | static void __init rcu_init_levelspread(struct rcu_state *rsp) | 4075 | static void __init rcu_init_levelspread(int *levelspread, const int *levelcnt) |
3965 | { | 4076 | { |
3966 | int i; | 4077 | int i; |
3967 | 4078 | ||
3968 | if (rcu_fanout_exact) { | 4079 | if (rcu_fanout_exact) { |
3969 | rsp->levelspread[rcu_num_lvls - 1] = rcu_fanout_leaf; | 4080 | levelspread[rcu_num_lvls - 1] = rcu_fanout_leaf; |
3970 | for (i = rcu_num_lvls - 2; i >= 0; i--) | 4081 | for (i = rcu_num_lvls - 2; i >= 0; i--) |
3971 | rsp->levelspread[i] = RCU_FANOUT; | 4082 | levelspread[i] = RCU_FANOUT; |
3972 | } else { | 4083 | } else { |
3973 | int ccur; | 4084 | int ccur; |
3974 | int cprv; | 4085 | int cprv; |
3975 | 4086 | ||
3976 | cprv = nr_cpu_ids; | 4087 | cprv = nr_cpu_ids; |
3977 | for (i = rcu_num_lvls - 1; i >= 0; i--) { | 4088 | for (i = rcu_num_lvls - 1; i >= 0; i--) { |
3978 | ccur = rsp->levelcnt[i]; | 4089 | ccur = levelcnt[i]; |
3979 | rsp->levelspread[i] = (cprv + ccur - 1) / ccur; | 4090 | levelspread[i] = (cprv + ccur - 1) / ccur; |
3980 | cprv = ccur; | 4091 | cprv = ccur; |
3981 | } | 4092 | } |
3982 | } | 4093 | } |
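For the balanced (non-exact) branch, a worked example helps. Assuming the default fanouts (RCU_FANOUT_LEAF = 16, RCU_FANOUT = 64) and nr_cpu_ids = 1000, the geometry code produces a two-level tree with levelcnt = { 1, 63 }, and the division above then yields the per-level spreads. A small stand-alone program mirroring the loop:

	#include <stdio.h>

	int main(void)
	{
		const int levelcnt[2] = { 1, 63 };	/* root, then 63 leaves */
		int levelspread[2];
		int nr_cpu_ids = 1000;
		int cprv = nr_cpu_ids, ccur, i;

		for (i = 1; i >= 0; i--) {		/* leaves up toward the root */
			ccur = levelcnt[i];
			levelspread[i] = (cprv + ccur - 1) / ccur;	/* round up */
			cprv = ccur;
		}
		printf("levelspread = { %d, %d }\n", levelspread[0], levelspread[1]);
		/* Prints { 63, 16 }: the root has 63 children and each leaf
		 * rcu_node covers at most 16 CPUs. */
		return 0;
	}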
@@ -3988,23 +4099,20 @@ static void __init rcu_init_levelspread(struct rcu_state *rsp) | |||
3988 | static void __init rcu_init_one(struct rcu_state *rsp, | 4099 | static void __init rcu_init_one(struct rcu_state *rsp, |
3989 | struct rcu_data __percpu *rda) | 4100 | struct rcu_data __percpu *rda) |
3990 | { | 4101 | { |
3991 | static const char * const buf[] = { | 4102 | static const char * const buf[] = RCU_NODE_NAME_INIT; |
3992 | "rcu_node_0", | 4103 | static const char * const fqs[] = RCU_FQS_NAME_INIT; |
3993 | "rcu_node_1", | 4104 | static const char * const exp[] = RCU_EXP_NAME_INIT; |
3994 | "rcu_node_2", | 4105 | static const char * const exp_sched[] = RCU_EXP_SCHED_NAME_INIT; |
3995 | "rcu_node_3" }; /* Match MAX_RCU_LVLS */ | ||
3996 | static const char * const fqs[] = { | ||
3997 | "rcu_node_fqs_0", | ||
3998 | "rcu_node_fqs_1", | ||
3999 | "rcu_node_fqs_2", | ||
4000 | "rcu_node_fqs_3" }; /* Match MAX_RCU_LVLS */ | ||
4001 | static u8 fl_mask = 0x1; | 4106 | static u8 fl_mask = 0x1; |
4107 | |||
4108 | int levelcnt[RCU_NUM_LVLS]; /* # nodes in each level. */ | ||
4109 | int levelspread[RCU_NUM_LVLS]; /* kids/node in each level. */ | ||
4002 | int cpustride = 1; | 4110 | int cpustride = 1; |
4003 | int i; | 4111 | int i; |
4004 | int j; | 4112 | int j; |
4005 | struct rcu_node *rnp; | 4113 | struct rcu_node *rnp; |
4006 | 4114 | ||
4007 | BUILD_BUG_ON(MAX_RCU_LVLS > ARRAY_SIZE(buf)); /* Fix buf[] init! */ | 4115 | BUILD_BUG_ON(RCU_NUM_LVLS > ARRAY_SIZE(buf)); /* Fix buf[] init! */ |
4008 | 4116 | ||
4009 | /* Silence gcc 4.8 false positive about array index out of range. */ | 4117 | /* Silence gcc 4.8 false positive about array index out of range. */ |
4010 | if (rcu_num_lvls <= 0 || rcu_num_lvls > RCU_NUM_LVLS) | 4118 | if (rcu_num_lvls <= 0 || rcu_num_lvls > RCU_NUM_LVLS) |
@@ -4013,19 +4121,19 @@ static void __init rcu_init_one(struct rcu_state *rsp, | |||
4013 | /* Initialize the level-tracking arrays. */ | 4121 | /* Initialize the level-tracking arrays. */ |
4014 | 4122 | ||
4015 | for (i = 0; i < rcu_num_lvls; i++) | 4123 | for (i = 0; i < rcu_num_lvls; i++) |
4016 | rsp->levelcnt[i] = num_rcu_lvl[i]; | 4124 | levelcnt[i] = num_rcu_lvl[i]; |
4017 | for (i = 1; i < rcu_num_lvls; i++) | 4125 | for (i = 1; i < rcu_num_lvls; i++) |
4018 | rsp->level[i] = rsp->level[i - 1] + rsp->levelcnt[i - 1]; | 4126 | rsp->level[i] = rsp->level[i - 1] + levelcnt[i - 1]; |
4019 | rcu_init_levelspread(rsp); | 4127 | rcu_init_levelspread(levelspread, levelcnt); |
4020 | rsp->flavor_mask = fl_mask; | 4128 | rsp->flavor_mask = fl_mask; |
4021 | fl_mask <<= 1; | 4129 | fl_mask <<= 1; |
4022 | 4130 | ||
4023 | /* Initialize the elements themselves, starting from the leaves. */ | 4131 | /* Initialize the elements themselves, starting from the leaves. */ |
4024 | 4132 | ||
4025 | for (i = rcu_num_lvls - 1; i >= 0; i--) { | 4133 | for (i = rcu_num_lvls - 1; i >= 0; i--) { |
4026 | cpustride *= rsp->levelspread[i]; | 4134 | cpustride *= levelspread[i]; |
4027 | rnp = rsp->level[i]; | 4135 | rnp = rsp->level[i]; |
4028 | for (j = 0; j < rsp->levelcnt[i]; j++, rnp++) { | 4136 | for (j = 0; j < levelcnt[i]; j++, rnp++) { |
4029 | raw_spin_lock_init(&rnp->lock); | 4137 | raw_spin_lock_init(&rnp->lock); |
4030 | lockdep_set_class_and_name(&rnp->lock, | 4138 | lockdep_set_class_and_name(&rnp->lock, |
4031 | &rcu_node_class[i], buf[i]); | 4139 | &rcu_node_class[i], buf[i]); |
@@ -4045,14 +4153,23 @@ static void __init rcu_init_one(struct rcu_state *rsp, | |||
4045 | rnp->grpmask = 0; | 4153 | rnp->grpmask = 0; |
4046 | rnp->parent = NULL; | 4154 | rnp->parent = NULL; |
4047 | } else { | 4155 | } else { |
4048 | rnp->grpnum = j % rsp->levelspread[i - 1]; | 4156 | rnp->grpnum = j % levelspread[i - 1]; |
4049 | rnp->grpmask = 1UL << rnp->grpnum; | 4157 | rnp->grpmask = 1UL << rnp->grpnum; |
4050 | rnp->parent = rsp->level[i - 1] + | 4158 | rnp->parent = rsp->level[i - 1] + |
4051 | j / rsp->levelspread[i - 1]; | 4159 | j / levelspread[i - 1]; |
4052 | } | 4160 | } |
4053 | rnp->level = i; | 4161 | rnp->level = i; |
4054 | INIT_LIST_HEAD(&rnp->blkd_tasks); | 4162 | INIT_LIST_HEAD(&rnp->blkd_tasks); |
4055 | rcu_init_one_nocb(rnp); | 4163 | rcu_init_one_nocb(rnp); |
4164 | mutex_init(&rnp->exp_funnel_mutex); | ||
4165 | if (rsp == &rcu_sched_state) | ||
4166 | lockdep_set_class_and_name( | ||
4167 | &rnp->exp_funnel_mutex, | ||
4168 | &rcu_exp_sched_class[i], exp_sched[i]); | ||
4169 | else | ||
4170 | lockdep_set_class_and_name( | ||
4171 | &rnp->exp_funnel_mutex, | ||
4172 | &rcu_exp_class[i], exp[i]); | ||
4056 | } | 4173 | } |
4057 | } | 4174 | } |
4058 | 4175 | ||
@@ -4076,9 +4193,7 @@ static void __init rcu_init_geometry(void) | |||
4076 | { | 4193 | { |
4077 | ulong d; | 4194 | ulong d; |
4078 | int i; | 4195 | int i; |
4079 | int j; | 4196 | int rcu_capacity[RCU_NUM_LVLS]; |
4080 | int n = nr_cpu_ids; | ||
4081 | int rcu_capacity[MAX_RCU_LVLS + 1]; | ||
4082 | 4197 | ||
4083 | /* | 4198 | /* |
4084 | * Initialize any unspecified boot parameters. | 4199 | * Initialize any unspecified boot parameters. |
@@ -4101,47 +4216,49 @@ static void __init rcu_init_geometry(void) | |||
4101 | rcu_fanout_leaf, nr_cpu_ids); | 4216 | rcu_fanout_leaf, nr_cpu_ids); |
4102 | 4217 | ||
4103 | /* | 4218 | /* |
4104 | * Compute number of nodes that can be handled by an rcu_node tree | ||
4105 | * with the given number of levels. Setting rcu_capacity[0] makes | ||
4106 | * some of the arithmetic easier. | ||
4107 | */ | ||
4108 | rcu_capacity[0] = 1; | ||
4109 | rcu_capacity[1] = rcu_fanout_leaf; | ||
4110 | for (i = 2; i <= MAX_RCU_LVLS; i++) | ||
4111 | rcu_capacity[i] = rcu_capacity[i - 1] * RCU_FANOUT; | ||
4112 | |||
4113 | /* | ||
4114 | * The boot-time rcu_fanout_leaf parameter is only permitted | 4219 | * The boot-time rcu_fanout_leaf parameter is only permitted |
4115 | * to increase the leaf-level fanout, not decrease it. Of course, | 4220 | * to increase the leaf-level fanout, not decrease it. Of course, |
4116 | * the leaf-level fanout cannot exceed the number of bits in | 4221 | * the leaf-level fanout cannot exceed the number of bits in |
4117 | * the rcu_node masks. Finally, the tree must be able to accommodate | 4222 | * the rcu_node masks. Complain and fall back to the compile- |
4118 | * the configured number of CPUs. Complain and fall back to the | 4223 | * time values if these limits are exceeded. |
4119 | * compile-time values if these limits are exceeded. | ||
4120 | */ | 4224 | */ |
4121 | if (rcu_fanout_leaf < RCU_FANOUT_LEAF || | 4225 | if (rcu_fanout_leaf < RCU_FANOUT_LEAF || |
4122 | rcu_fanout_leaf > sizeof(unsigned long) * 8 || | 4226 | rcu_fanout_leaf > sizeof(unsigned long) * 8) { |
4123 | n > rcu_capacity[MAX_RCU_LVLS]) { | 4227 | rcu_fanout_leaf = RCU_FANOUT_LEAF; |
4124 | WARN_ON(1); | 4228 | WARN_ON(1); |
4125 | return; | 4229 | return; |
4126 | } | 4230 | } |
4127 | 4231 | ||
4232 | /* | ||
4233 | * Compute number of nodes that can be handled by an rcu_node tree | ||
4234 | * with the given number of levels. | ||
4235 | */ | ||
4236 | rcu_capacity[0] = rcu_fanout_leaf; | ||
4237 | for (i = 1; i < RCU_NUM_LVLS; i++) | ||
4238 | rcu_capacity[i] = rcu_capacity[i - 1] * RCU_FANOUT; | ||
4239 | |||
4240 | /* | ||
4241 | * The tree must be able to accommodate the configured number of CPUs. | ||
4242 | * If this limit is exceeded then we have a serious problem elsewhere. | ||
4243 | */ | ||
4244 | if (nr_cpu_ids > rcu_capacity[RCU_NUM_LVLS - 1]) | ||
4245 | panic("rcu_init_geometry: rcu_capacity[] is too small"); | ||
4246 | |||
4247 | /* Calculate the number of levels in the tree. */ | ||
4248 | for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) { | ||
4249 | } | ||
4250 | rcu_num_lvls = i + 1; | ||
4251 | |||
4128 | /* Calculate the number of rcu_nodes at each level of the tree. */ | 4252 | /* Calculate the number of rcu_nodes at each level of the tree. */ |
4129 | for (i = 1; i <= MAX_RCU_LVLS; i++) | 4253 | for (i = 0; i < rcu_num_lvls; i++) { |
4130 | if (n <= rcu_capacity[i]) { | 4254 | int cap = rcu_capacity[(rcu_num_lvls - 1) - i]; |
4131 | for (j = 0; j <= i; j++) | 4255 | num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap); |
4132 | num_rcu_lvl[j] = | 4256 | } |
4133 | DIV_ROUND_UP(n, rcu_capacity[i - j]); | ||
4134 | rcu_num_lvls = i; | ||
4135 | for (j = i + 1; j <= MAX_RCU_LVLS; j++) | ||
4136 | num_rcu_lvl[j] = 0; | ||
4137 | break; | ||
4138 | } | ||
4139 | 4257 | ||
4140 | /* Calculate the total number of rcu_node structures. */ | 4258 | /* Calculate the total number of rcu_node structures. */ |
4141 | rcu_num_nodes = 0; | 4259 | rcu_num_nodes = 0; |
4142 | for (i = 0; i <= MAX_RCU_LVLS; i++) | 4260 | for (i = 0; i < rcu_num_lvls; i++) |
4143 | rcu_num_nodes += num_rcu_lvl[i]; | 4261 | rcu_num_nodes += num_rcu_lvl[i]; |
4144 | rcu_num_nodes -= n; | ||
4145 | } | 4262 | } |
4146 | 4263 | ||
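To make the reworked calculation concrete, here is the same arithmetic as a stand-alone program for a hypothetical system with 6000 possible CPUs and the default fanouts (16-way leaves, 64-way interior nodes); the capacities become 16, 1024 and 65536, so three levels are needed. The numbers are illustrative, not from the patch.

	#include <stdio.h>

	#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

	int main(void)
	{
		int nr_cpu_ids = 6000, rcu_fanout_leaf = 16, rcu_fanout = 64;
		int rcu_capacity[4], num_rcu_lvl[4];
		int rcu_num_lvls, rcu_num_nodes = 0, i;

		rcu_capacity[0] = rcu_fanout_leaf;	/* 16 */
		for (i = 1; i < 4; i++)			/* 1024, 65536, ... */
			rcu_capacity[i] = rcu_capacity[i - 1] * rcu_fanout;

		/* Smallest adequate depth; the kernel panics first if none fits. */
		for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++)
			;
		rcu_num_lvls = i + 1;			/* 3 levels for 6000 CPUs */

		for (i = 0; i < rcu_num_lvls; i++) {
			int cap = rcu_capacity[(rcu_num_lvls - 1) - i];

			num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap);
			rcu_num_nodes += num_rcu_lvl[i];
			printf("level %d: %d node(s)\n", i, num_rcu_lvl[i]);
		}
		printf("rcu_num_lvls=%d rcu_num_nodes=%d\n", rcu_num_lvls, rcu_num_nodes);
		/* Prints 1, 6 and 375 nodes per level: 3 levels, 382 rcu_node
		 * structures in total. */
		return 0;
	}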
4147 | /* | 4264 | /* |
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 4adb7ca0bf47..0412030ca882 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h | |||
@@ -27,6 +27,7 @@ | |||
27 | #include <linux/threads.h> | 27 | #include <linux/threads.h> |
28 | #include <linux/cpumask.h> | 28 | #include <linux/cpumask.h> |
29 | #include <linux/seqlock.h> | 29 | #include <linux/seqlock.h> |
30 | #include <linux/stop_machine.h> | ||
30 | 31 | ||
31 | /* | 32 | /* |
32 | * Define shape of hierarchy based on NR_CPUS, CONFIG_RCU_FANOUT, and | 33 | * Define shape of hierarchy based on NR_CPUS, CONFIG_RCU_FANOUT, and |
@@ -36,8 +37,6 @@ | |||
36 | * Of course, your mileage may vary. | 37 | * Of course, your mileage may vary. |
37 | */ | 38 | */ |
38 | 39 | ||
39 | #define MAX_RCU_LVLS 4 | ||
40 | |||
41 | #ifdef CONFIG_RCU_FANOUT | 40 | #ifdef CONFIG_RCU_FANOUT |
42 | #define RCU_FANOUT CONFIG_RCU_FANOUT | 41 | #define RCU_FANOUT CONFIG_RCU_FANOUT |
43 | #else /* #ifdef CONFIG_RCU_FANOUT */ | 42 | #else /* #ifdef CONFIG_RCU_FANOUT */ |
@@ -66,38 +65,53 @@ | |||
66 | #if NR_CPUS <= RCU_FANOUT_1 | 65 | #if NR_CPUS <= RCU_FANOUT_1 |
67 | # define RCU_NUM_LVLS 1 | 66 | # define RCU_NUM_LVLS 1 |
68 | # define NUM_RCU_LVL_0 1 | 67 | # define NUM_RCU_LVL_0 1 |
69 | # define NUM_RCU_LVL_1 (NR_CPUS) | 68 | # define NUM_RCU_NODES NUM_RCU_LVL_0 |
70 | # define NUM_RCU_LVL_2 0 | 69 | # define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0 } |
71 | # define NUM_RCU_LVL_3 0 | 70 | # define RCU_NODE_NAME_INIT { "rcu_node_0" } |
72 | # define NUM_RCU_LVL_4 0 | 71 | # define RCU_FQS_NAME_INIT { "rcu_node_fqs_0" } |
72 | # define RCU_EXP_NAME_INIT { "rcu_node_exp_0" } | ||
73 | # define RCU_EXP_SCHED_NAME_INIT \ | ||
74 | { "rcu_node_exp_sched_0" } | ||
73 | #elif NR_CPUS <= RCU_FANOUT_2 | 75 | #elif NR_CPUS <= RCU_FANOUT_2 |
74 | # define RCU_NUM_LVLS 2 | 76 | # define RCU_NUM_LVLS 2 |
75 | # define NUM_RCU_LVL_0 1 | 77 | # define NUM_RCU_LVL_0 1 |
76 | # define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1) | 78 | # define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1) |
77 | # define NUM_RCU_LVL_2 (NR_CPUS) | 79 | # define NUM_RCU_NODES (NUM_RCU_LVL_0 + NUM_RCU_LVL_1) |
78 | # define NUM_RCU_LVL_3 0 | 80 | # define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1 } |
79 | # define NUM_RCU_LVL_4 0 | 81 | # define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1" } |
82 | # define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1" } | ||
83 | # define RCU_EXP_NAME_INIT { "rcu_node_exp_0", "rcu_node_exp_1" } | ||
84 | # define RCU_EXP_SCHED_NAME_INIT \ | ||
85 | { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1" } | ||
80 | #elif NR_CPUS <= RCU_FANOUT_3 | 86 | #elif NR_CPUS <= RCU_FANOUT_3 |
81 | # define RCU_NUM_LVLS 3 | 87 | # define RCU_NUM_LVLS 3 |
82 | # define NUM_RCU_LVL_0 1 | 88 | # define NUM_RCU_LVL_0 1 |
83 | # define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2) | 89 | # define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2) |
84 | # define NUM_RCU_LVL_2 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1) | 90 | # define NUM_RCU_LVL_2 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1) |
85 | # define NUM_RCU_LVL_3 (NR_CPUS) | 91 | # define NUM_RCU_NODES (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2) |
86 | # define NUM_RCU_LVL_4 0 | 92 | # define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2 } |
93 | # define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1", "rcu_node_2" } | ||
94 | # define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2" } | ||
95 | # define RCU_EXP_NAME_INIT { "rcu_node_exp_0", "rcu_node_exp_1", "rcu_node_exp_2" } | ||
96 | # define RCU_EXP_SCHED_NAME_INIT \ | ||
97 | { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1", "rcu_node_exp_sched_2" } | ||
87 | #elif NR_CPUS <= RCU_FANOUT_4 | 98 | #elif NR_CPUS <= RCU_FANOUT_4 |
88 | # define RCU_NUM_LVLS 4 | 99 | # define RCU_NUM_LVLS 4 |
89 | # define NUM_RCU_LVL_0 1 | 100 | # define NUM_RCU_LVL_0 1 |
90 | # define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_3) | 101 | # define NUM_RCU_LVL_1 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_3) |
91 | # define NUM_RCU_LVL_2 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2) | 102 | # define NUM_RCU_LVL_2 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2) |
92 | # define NUM_RCU_LVL_3 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1) | 103 | # define NUM_RCU_LVL_3 DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1) |
93 | # define NUM_RCU_LVL_4 (NR_CPUS) | 104 | # define NUM_RCU_NODES (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3) |
105 | # define NUM_RCU_LVL_INIT { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2, NUM_RCU_LVL_3 } | ||
106 | # define RCU_NODE_NAME_INIT { "rcu_node_0", "rcu_node_1", "rcu_node_2", "rcu_node_3" } | ||
107 | # define RCU_FQS_NAME_INIT { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2", "rcu_node_fqs_3" } | ||
108 | # define RCU_EXP_NAME_INIT { "rcu_node_exp_0", "rcu_node_exp_1", "rcu_node_exp_2", "rcu_node_exp_3" } | ||
109 | # define RCU_EXP_SCHED_NAME_INIT \ | ||
110 | { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1", "rcu_node_exp_sched_2", "rcu_node_exp_sched_3" } | ||
94 | #else | 111 | #else |
95 | # error "CONFIG_RCU_FANOUT insufficient for NR_CPUS" | 112 | # error "CONFIG_RCU_FANOUT insufficient for NR_CPUS" |
96 | #endif /* #if (NR_CPUS) <= RCU_FANOUT_1 */ | 113 | #endif /* #if (NR_CPUS) <= RCU_FANOUT_1 */ |
97 | 114 | ||
98 | #define RCU_SUM (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3 + NUM_RCU_LVL_4) | ||
99 | #define NUM_RCU_NODES (RCU_SUM - NR_CPUS) | ||
100 | |||
101 | extern int rcu_num_lvls; | 115 | extern int rcu_num_lvls; |
102 | extern int rcu_num_nodes; | 116 | extern int rcu_num_nodes; |
103 | 117 | ||
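As a concrete reading of the new per-branch initializers, consider a hypothetical NR_CPUS = 512 build with RCU_FANOUT = 64 and RCU_FANOUT_LEAF = 16, so RCU_FANOUT_1 = 16 and RCU_FANOUT_2 = 1024 and the two-level branch is selected. The expansion below is purely illustrative; note that the old RCU_SUM - NR_CPUS formula gave the same 33, which is why the per-CPU bottom "level" could be dropped.

	/* Illustrative expansion only; these are not new definitions. */
	#define NR_CPUS		512
	#define RCU_FANOUT_1	16
	#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

	#define NUM_RCU_LVL_0	1					/* the root */
	#define NUM_RCU_LVL_1	DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)	/* 32 leaves */
	#define NUM_RCU_NODES	(NUM_RCU_LVL_0 + NUM_RCU_LVL_1)		/* 33 rcu_node structures */
	#define NUM_RCU_LVL_INIT	{ NUM_RCU_LVL_0, NUM_RCU_LVL_1 }	/* { 1, 32 } */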
@@ -236,6 +250,8 @@ struct rcu_node { | |||
236 | int need_future_gp[2]; | 250 | int need_future_gp[2]; |
237 | /* Counts of upcoming no-CB GP requests. */ | 251 | /* Counts of upcoming no-CB GP requests. */ |
238 | raw_spinlock_t fqslock ____cacheline_internodealigned_in_smp; | 252 | raw_spinlock_t fqslock ____cacheline_internodealigned_in_smp; |
253 | |||
254 | struct mutex exp_funnel_mutex ____cacheline_internodealigned_in_smp; | ||
239 | } ____cacheline_internodealigned_in_smp; | 255 | } ____cacheline_internodealigned_in_smp; |
240 | 256 | ||
241 | /* | 257 | /* |
@@ -287,12 +303,13 @@ struct rcu_data { | |||
287 | bool gpwrap; /* Possible gpnum/completed wrap. */ | 303 | bool gpwrap; /* Possible gpnum/completed wrap. */ |
288 | struct rcu_node *mynode; /* This CPU's leaf of hierarchy */ | 304 | struct rcu_node *mynode; /* This CPU's leaf of hierarchy */ |
289 | unsigned long grpmask; /* Mask to apply to leaf qsmask. */ | 305 | unsigned long grpmask; /* Mask to apply to leaf qsmask. */ |
290 | #ifdef CONFIG_RCU_CPU_STALL_INFO | ||
291 | unsigned long ticks_this_gp; /* The number of scheduling-clock */ | 306 | unsigned long ticks_this_gp; /* The number of scheduling-clock */ |
292 | /* ticks this CPU has handled */ | 307 | /* ticks this CPU has handled */ |
293 | /* during and after the last grace */ | 308 | /* during and after the last grace */ |
294 | /* period it is aware of. */ | 309 | /* period it is aware of. */ |
295 | #endif /* #ifdef CONFIG_RCU_CPU_STALL_INFO */ | 310 | struct cpu_stop_work exp_stop_work; |
311 | /* Expedited grace-period control */ | ||
312 | /* for CPU stopping. */ | ||
296 | 313 | ||
297 | /* 2) batch handling */ | 314 | /* 2) batch handling */ |
298 | /* | 315 | /* |
@@ -355,11 +372,13 @@ struct rcu_data { | |||
355 | unsigned long n_rp_nocb_defer_wakeup; | 372 | unsigned long n_rp_nocb_defer_wakeup; |
356 | unsigned long n_rp_need_nothing; | 373 | unsigned long n_rp_need_nothing; |
357 | 374 | ||
358 | /* 6) _rcu_barrier() and OOM callbacks. */ | 375 | /* 6) _rcu_barrier(), OOM callbacks, and expediting. */ |
359 | struct rcu_head barrier_head; | 376 | struct rcu_head barrier_head; |
360 | #ifdef CONFIG_RCU_FAST_NO_HZ | 377 | #ifdef CONFIG_RCU_FAST_NO_HZ |
361 | struct rcu_head oom_head; | 378 | struct rcu_head oom_head; |
362 | #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ | 379 | #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ |
380 | struct mutex exp_funnel_mutex; | ||
381 | bool exp_done; /* Expedited QS for this CPU? */ | ||
363 | 382 | ||
364 | /* 7) Callback offloading. */ | 383 | /* 7) Callback offloading. */ |
365 | #ifdef CONFIG_RCU_NOCB_CPU | 384 | #ifdef CONFIG_RCU_NOCB_CPU |
@@ -387,9 +406,7 @@ struct rcu_data { | |||
387 | #endif /* #ifdef CONFIG_RCU_NOCB_CPU */ | 406 | #endif /* #ifdef CONFIG_RCU_NOCB_CPU */ |
388 | 407 | ||
389 | /* 8) RCU CPU stall data. */ | 408 | /* 8) RCU CPU stall data. */ |
390 | #ifdef CONFIG_RCU_CPU_STALL_INFO | ||
391 | unsigned int softirq_snap; /* Snapshot of softirq activity. */ | 409 | unsigned int softirq_snap; /* Snapshot of softirq activity. */ |
392 | #endif /* #ifdef CONFIG_RCU_CPU_STALL_INFO */ | ||
393 | 410 | ||
394 | int cpu; | 411 | int cpu; |
395 | struct rcu_state *rsp; | 412 | struct rcu_state *rsp; |
@@ -442,9 +459,9 @@ do { \ | |||
442 | */ | 459 | */ |
443 | struct rcu_state { | 460 | struct rcu_state { |
444 | struct rcu_node node[NUM_RCU_NODES]; /* Hierarchy. */ | 461 | struct rcu_node node[NUM_RCU_NODES]; /* Hierarchy. */ |
445 | struct rcu_node *level[RCU_NUM_LVLS]; /* Hierarchy levels. */ | 462 | struct rcu_node *level[RCU_NUM_LVLS + 1]; |
446 | u32 levelcnt[MAX_RCU_LVLS + 1]; /* # nodes in each level. */ | 463 | /* Hierarchy levels (+1 to */ |
447 | u8 levelspread[RCU_NUM_LVLS]; /* kids/node in each level. */ | 464 | /* shut bogus gcc warning) */ |
448 | u8 flavor_mask; /* bit in flavor mask. */ | 465 | u8 flavor_mask; /* bit in flavor mask. */ |
449 | struct rcu_data __percpu *rda; /* pointer to per-CPU rcu_data. */ | 466 | struct rcu_data __percpu *rda; /* pointer to per-CPU rcu_data. */
450 | void (*call)(struct rcu_head *head, /* call_rcu() flavor. */ | 467 | void (*call)(struct rcu_head *head, /* call_rcu() flavor. */ |
@@ -479,21 +496,18 @@ struct rcu_state { | |||
479 | struct mutex barrier_mutex; /* Guards barrier fields. */ | 496 | struct mutex barrier_mutex; /* Guards barrier fields. */ |
480 | atomic_t barrier_cpu_count; /* # CPUs waiting on. */ | 497 | atomic_t barrier_cpu_count; /* # CPUs waiting on. */ |
481 | struct completion barrier_completion; /* Wake at barrier end. */ | 498 | struct completion barrier_completion; /* Wake at barrier end. */ |
482 | unsigned long n_barrier_done; /* ++ at start and end of */ | 499 | unsigned long barrier_sequence; /* ++ at start and end of */ |
483 | /* _rcu_barrier(). */ | 500 | /* _rcu_barrier(). */ |
484 | /* End of fields guarded by barrier_mutex. */ | 501 | /* End of fields guarded by barrier_mutex. */ |
485 | 502 | ||
486 | atomic_long_t expedited_start; /* Starting ticket. */ | 503 | unsigned long expedited_sequence; /* Take a ticket. */ |
487 | atomic_long_t expedited_done; /* Done ticket. */ | 504 | atomic_long_t expedited_workdone0; /* # done by others #0. */ |
488 | atomic_long_t expedited_wrap; /* # near-wrap incidents. */ | ||
489 | atomic_long_t expedited_tryfail; /* # acquisition failures. */ | ||
490 | atomic_long_t expedited_workdone1; /* # done by others #1. */ | 505 | atomic_long_t expedited_workdone1; /* # done by others #1. */ |
491 | atomic_long_t expedited_workdone2; /* # done by others #2. */ | 506 | atomic_long_t expedited_workdone2; /* # done by others #2. */ |
507 | atomic_long_t expedited_workdone3; /* # done by others #3. */ | ||
492 | atomic_long_t expedited_normal; /* # fallbacks to normal. */ | 508 | atomic_long_t expedited_normal; /* # fallbacks to normal. */ |
493 | atomic_long_t expedited_stoppedcpus; /* # successful stop_cpus. */ | 509 | atomic_t expedited_need_qs; /* # CPUs left to check in. */ |
494 | atomic_long_t expedited_done_tries; /* # tries to update _done. */ | 510 | wait_queue_head_t expedited_wq; /* Wait for check-ins. */ |
495 | atomic_long_t expedited_done_lost; /* # times beaten to _done. */ | ||
496 | atomic_long_t expedited_done_exit; /* # times exited _done loop. */ | ||
497 | 511 | ||
498 | unsigned long jiffies_force_qs; /* Time at which to invoke */ | 512 | unsigned long jiffies_force_qs; /* Time at which to invoke */ |
499 | /* force_quiescent_state(). */ | 513 | /* force_quiescent_state(). */ |
@@ -527,7 +541,11 @@ struct rcu_state { | |||
527 | /* Values for rcu_state structure's gp_flags field. */ | 541 | /* Values for rcu_state structure's gp_flags field. */ |
528 | #define RCU_GP_WAIT_INIT 0 /* Initial state. */ | 542 | #define RCU_GP_WAIT_INIT 0 /* Initial state. */ |
529 | #define RCU_GP_WAIT_GPS 1 /* Wait for grace-period start. */ | 543 | #define RCU_GP_WAIT_GPS 1 /* Wait for grace-period start. */ |
530 | #define RCU_GP_WAIT_FQS 2 /* Wait for force-quiescent-state time. */ | 544 | #define RCU_GP_DONE_GPS 2 /* Wait done for grace-period start. */ |
545 | #define RCU_GP_WAIT_FQS 3 /* Wait for force-quiescent-state time. */ | ||
546 | #define RCU_GP_DOING_FQS 4 /* Wait done for force-quiescent-state time. */ | ||
547 | #define RCU_GP_CLEANUP 5 /* Grace-period cleanup started. */ | ||
548 | #define RCU_GP_CLEANED 6 /* Grace-period cleanup complete. */ | ||
531 | 549 | ||
532 | extern struct list_head rcu_struct_flavors; | 550 | extern struct list_head rcu_struct_flavors; |
533 | 551 | ||
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 013485fb2b06..b2bf3963a0ae 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h | |||
@@ -82,10 +82,8 @@ static void __init rcu_bootup_announce_oddness(void) | |||
82 | pr_info("\tRCU lockdep checking is enabled.\n"); | 82 | pr_info("\tRCU lockdep checking is enabled.\n"); |
83 | if (IS_ENABLED(CONFIG_RCU_TORTURE_TEST_RUNNABLE)) | 83 | if (IS_ENABLED(CONFIG_RCU_TORTURE_TEST_RUNNABLE)) |
84 | pr_info("\tRCU torture testing starts during boot.\n"); | 84 | pr_info("\tRCU torture testing starts during boot.\n"); |
85 | if (IS_ENABLED(CONFIG_RCU_CPU_STALL_INFO)) | 85 | if (RCU_NUM_LVLS >= 4) |
86 | pr_info("\tAdditional per-CPU info printed with stalls.\n"); | 86 | pr_info("\tFour(or more)-level hierarchy is enabled.\n"); |
87 | if (NUM_RCU_LVL_4 != 0) | ||
88 | pr_info("\tFour-level hierarchy is enabled.\n"); | ||
89 | if (RCU_FANOUT_LEAF != 16) | 87 | if (RCU_FANOUT_LEAF != 16) |
90 | pr_info("\tBuild-time adjustment of leaf fanout to %d.\n", | 88 | pr_info("\tBuild-time adjustment of leaf fanout to %d.\n", |
91 | RCU_FANOUT_LEAF); | 89 | RCU_FANOUT_LEAF); |
@@ -418,8 +416,6 @@ static void rcu_print_detail_task_stall(struct rcu_state *rsp) | |||
418 | rcu_print_detail_task_stall_rnp(rnp); | 416 | rcu_print_detail_task_stall_rnp(rnp); |
419 | } | 417 | } |
420 | 418 | ||
421 | #ifdef CONFIG_RCU_CPU_STALL_INFO | ||
422 | |||
423 | static void rcu_print_task_stall_begin(struct rcu_node *rnp) | 419 | static void rcu_print_task_stall_begin(struct rcu_node *rnp) |
424 | { | 420 | { |
425 | pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):", | 421 | pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):", |
@@ -431,18 +427,6 @@ static void rcu_print_task_stall_end(void) | |||
431 | pr_cont("\n"); | 427 | pr_cont("\n"); |
432 | } | 428 | } |
433 | 429 | ||
434 | #else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */ | ||
435 | |||
436 | static void rcu_print_task_stall_begin(struct rcu_node *rnp) | ||
437 | { | ||
438 | } | ||
439 | |||
440 | static void rcu_print_task_stall_end(void) | ||
441 | { | ||
442 | } | ||
443 | |||
444 | #endif /* #else #ifdef CONFIG_RCU_CPU_STALL_INFO */ | ||
445 | |||
446 | /* | 430 | /* |
447 | * Scan the current list of tasks blocked within RCU read-side critical | 431 | * Scan the current list of tasks blocked within RCU read-side critical |
448 | * sections, printing out the tid of each. | 432 | * sections, printing out the tid of each. |
@@ -538,10 +522,10 @@ EXPORT_SYMBOL_GPL(call_rcu); | |||
538 | */ | 522 | */ |
539 | void synchronize_rcu(void) | 523 | void synchronize_rcu(void) |
540 | { | 524 | { |
541 | rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map) && | 525 | RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || |
542 | !lock_is_held(&rcu_lock_map) && | 526 | lock_is_held(&rcu_lock_map) || |
543 | !lock_is_held(&rcu_sched_lock_map), | 527 | lock_is_held(&rcu_sched_lock_map), |
544 | "Illegal synchronize_rcu() in RCU read-side critical section"); | 528 | "Illegal synchronize_rcu() in RCU read-side critical section"); |
545 | if (!rcu_scheduler_active) | 529 | if (!rcu_scheduler_active) |
546 | return; | 530 | return; |
547 | if (rcu_gp_is_expedited()) | 531 | if (rcu_gp_is_expedited()) |
@@ -552,8 +536,6 @@ void synchronize_rcu(void) | |||
552 | EXPORT_SYMBOL_GPL(synchronize_rcu); | 536 | EXPORT_SYMBOL_GPL(synchronize_rcu); |
553 | 537 | ||
554 | static DECLARE_WAIT_QUEUE_HEAD(sync_rcu_preempt_exp_wq); | 538 | static DECLARE_WAIT_QUEUE_HEAD(sync_rcu_preempt_exp_wq); |
555 | static unsigned long sync_rcu_preempt_exp_count; | ||
556 | static DEFINE_MUTEX(sync_rcu_preempt_exp_mutex); | ||
557 | 539 | ||
558 | /* | 540 | /* |
559 | * Return non-zero if there are any tasks in RCU read-side critical | 541 | * Return non-zero if there are any tasks in RCU read-side critical |
@@ -573,7 +555,7 @@ static int rcu_preempted_readers_exp(struct rcu_node *rnp) | |||
573 | * for the current expedited grace period. Works only for preemptible | 555 | * for the current expedited grace period. Works only for preemptible |
574 | * RCU -- other RCU implementation use other means. | 556 | * RCU -- other RCU implementation use other means. |
575 | * | 557 | * |
576 | * Caller must hold sync_rcu_preempt_exp_mutex. | 558 | * Caller must hold the root rcu_node's exp_funnel_mutex. |
577 | */ | 559 | */ |
578 | static int sync_rcu_preempt_exp_done(struct rcu_node *rnp) | 560 | static int sync_rcu_preempt_exp_done(struct rcu_node *rnp) |
579 | { | 561 | { |
@@ -589,7 +571,7 @@ static int sync_rcu_preempt_exp_done(struct rcu_node *rnp) | |||
589 | * recursively up the tree. (Calm down, calm down, we do the recursion | 571 | * recursively up the tree. (Calm down, calm down, we do the recursion |
590 | * iteratively!) | 572 | * iteratively!) |
591 | * | 573 | * |
592 | * Caller must hold sync_rcu_preempt_exp_mutex. | 574 | * Caller must hold the root rcu_node's exp_funnel_mutex. |
593 | */ | 575 | */ |
594 | static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, | 576 | static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, |
595 | bool wake) | 577 | bool wake) |
@@ -628,7 +610,7 @@ static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, | |||
628 | * set the ->expmask bits on the leaf rcu_node structures to tell phase 2 | 610 | * set the ->expmask bits on the leaf rcu_node structures to tell phase 2 |
629 | * that work is needed here. | 611 | * that work is needed here. |
630 | * | 612 | * |
631 | * Caller must hold sync_rcu_preempt_exp_mutex. | 613 | * Caller must hold the root rcu_node's exp_funnel_mutex. |
632 | */ | 614 | */ |
633 | static void | 615 | static void |
634 | sync_rcu_preempt_exp_init1(struct rcu_state *rsp, struct rcu_node *rnp) | 616 | sync_rcu_preempt_exp_init1(struct rcu_state *rsp, struct rcu_node *rnp) |
@@ -671,7 +653,7 @@ sync_rcu_preempt_exp_init1(struct rcu_state *rsp, struct rcu_node *rnp) | |||
671 | * invoke rcu_report_exp_rnp() to clear out the upper-level ->expmask bits, | 653 | * invoke rcu_report_exp_rnp() to clear out the upper-level ->expmask bits, |
672 | * enabling rcu_read_unlock_special() to do the bit-clearing. | 654 | * enabling rcu_read_unlock_special() to do the bit-clearing. |
673 | * | 655 | * |
674 | * Caller must hold sync_rcu_preempt_exp_mutex. | 656 | * Caller must hold the root rcu_node's exp_funnel_mutex. |
675 | */ | 657 | */ |
676 | static void | 658 | static void |
677 | sync_rcu_preempt_exp_init2(struct rcu_state *rsp, struct rcu_node *rnp) | 659 | sync_rcu_preempt_exp_init2(struct rcu_state *rsp, struct rcu_node *rnp) |
@@ -719,51 +701,17 @@ sync_rcu_preempt_exp_init2(struct rcu_state *rsp, struct rcu_node *rnp) | |||
719 | void synchronize_rcu_expedited(void) | 701 | void synchronize_rcu_expedited(void) |
720 | { | 702 | { |
721 | struct rcu_node *rnp; | 703 | struct rcu_node *rnp; |
704 | struct rcu_node *rnp_unlock; | ||
722 | struct rcu_state *rsp = rcu_state_p; | 705 | struct rcu_state *rsp = rcu_state_p; |
723 | unsigned long snap; | 706 | unsigned long s; |
724 | int trycount = 0; | ||
725 | 707 | ||
726 | smp_mb(); /* Caller's modifications seen first by other CPUs. */ | 708 | s = rcu_exp_gp_seq_snap(rsp); |
727 | snap = READ_ONCE(sync_rcu_preempt_exp_count) + 1; | ||
728 | smp_mb(); /* Above access cannot bleed into critical section. */ | ||
729 | 709 | ||
730 | /* | 710 | rnp_unlock = exp_funnel_lock(rsp, s); |
731 | * Block CPU-hotplug operations. This means that any CPU-hotplug | 711 | if (rnp_unlock == NULL) |
732 | * operation that finds an rcu_node structure with tasks in the | 712 | return; /* Someone else did our work for us. */ |
733 | * process of being boosted will know that all tasks blocking | ||
734 | * this expedited grace period will already be in the process of | ||
735 | * being boosted. This simplifies the process of moving tasks | ||
736 | * from leaf to root rcu_node structures. | ||
737 | */ | ||
738 | if (!try_get_online_cpus()) { | ||
739 | /* CPU-hotplug operation in flight, fall back to normal GP. */ | ||
740 | wait_rcu_gp(call_rcu); | ||
741 | return; | ||
742 | } | ||
743 | 713 | ||
744 | /* | 714 | rcu_exp_gp_seq_start(rsp); |
745 | * Acquire lock, falling back to synchronize_rcu() if too many | ||
746 | * lock-acquisition failures. Of course, if someone does the | ||
747 | * expedited grace period for us, just leave. | ||
748 | */ | ||
749 | while (!mutex_trylock(&sync_rcu_preempt_exp_mutex)) { | ||
750 | if (ULONG_CMP_LT(snap, | ||
751 | READ_ONCE(sync_rcu_preempt_exp_count))) { | ||
752 | put_online_cpus(); | ||
753 | goto mb_ret; /* Others did our work for us. */ | ||
754 | } | ||
755 | if (trycount++ < 10) { | ||
756 | udelay(trycount * num_online_cpus()); | ||
757 | } else { | ||
758 | put_online_cpus(); | ||
759 | wait_rcu_gp(call_rcu); | ||
760 | return; | ||
761 | } | ||
762 | } | ||
763 | if (ULONG_CMP_LT(snap, READ_ONCE(sync_rcu_preempt_exp_count))) { | ||
764 | put_online_cpus(); | ||
765 | goto unlock_mb_ret; /* Others did our work for us. */ | ||
766 | } | ||
767 | 715 | ||
768 | /* force all RCU readers onto ->blkd_tasks lists. */ | 716 | /* force all RCU readers onto ->blkd_tasks lists. */ |
769 | synchronize_sched_expedited(); | 717 | synchronize_sched_expedited(); |
@@ -779,20 +727,14 @@ void synchronize_rcu_expedited(void) | |||
779 | rcu_for_each_leaf_node(rsp, rnp) | 727 | rcu_for_each_leaf_node(rsp, rnp) |
780 | sync_rcu_preempt_exp_init2(rsp, rnp); | 728 | sync_rcu_preempt_exp_init2(rsp, rnp); |
781 | 729 | ||
782 | put_online_cpus(); | ||
783 | |||
784 | /* Wait for snapshotted ->blkd_tasks lists to drain. */ | 730 | /* Wait for snapshotted ->blkd_tasks lists to drain. */ |
785 | rnp = rcu_get_root(rsp); | 731 | rnp = rcu_get_root(rsp); |
786 | wait_event(sync_rcu_preempt_exp_wq, | 732 | wait_event(sync_rcu_preempt_exp_wq, |
787 | sync_rcu_preempt_exp_done(rnp)); | 733 | sync_rcu_preempt_exp_done(rnp)); |
788 | 734 | ||
789 | /* Clean up and exit. */ | 735 | /* Clean up and exit. */ |
790 | smp_mb(); /* ensure expedited GP seen before counter increment. */ | 736 | rcu_exp_gp_seq_end(rsp); |
791 | WRITE_ONCE(sync_rcu_preempt_exp_count, sync_rcu_preempt_exp_count + 1); | 737 | mutex_unlock(&rnp_unlock->exp_funnel_mutex); |
792 | unlock_mb_ret: | ||
793 | mutex_unlock(&sync_rcu_preempt_exp_mutex); | ||
794 | mb_ret: | ||
795 | smp_mb(); /* ensure subsequent action seen after grace period. */ | ||
796 | } | 738 | } |
797 | EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); | 739 | EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); |
798 | 740 | ||
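The rewritten synchronize_rcu_expedited() above replaces the old trylock/retry loop with a sequence counter plus a funnel lock: the caller snapshots the expedited-GP sequence with rcu_exp_gp_seq_snap(), and exp_funnel_lock() returns NULL when that snapshot has already been satisfied, meaning some other caller's expedited grace period covered this one. A minimal, self-contained sketch of the sequence-snapshot arithmetic follows; the helpers are modeled on the kernel's rcu_seq_*() functions, but this is illustrative user-space code, not the kernel implementation.

    /* Sketch of the sequence-counter test behind rcu_exp_gp_seq_snap()/..._end(). */
    #include <stdio.h>

    static unsigned long exp_seq;   /* odd: expedited GP in flight, even: idle */

    static unsigned long seq_snap(unsigned long s)
    {
            /* First even value strictly after any GP already in flight. */
            return (s + 3) & ~0x1UL;
    }

    static int seq_done(unsigned long cur, unsigned long snap)
    {
            return (long)(cur - snap) >= 0;   /* wrap-safe "cur has reached snap" */
    }

    int main(void)
    {
            unsigned long snap = seq_snap(exp_seq);

            exp_seq++;      /* like rcu_exp_gp_seq_start() */
            exp_seq++;      /* like rcu_exp_gp_seq_end() */

            /* A waiter holding "snap" may now skip starting its own expedited GP. */
            printf("done=%d\n", seq_done(exp_seq, snap));
            return 0;
    }

The tree_trace.c change further down relies on the same convention: two increments per grace period, so expedited_sequence / 2 counts completed expedited grace periods.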
@@ -1061,8 +1003,7 @@ static int rcu_boost(struct rcu_node *rnp) | |||
1061 | } | 1003 | } |
1062 | 1004 | ||
1063 | /* | 1005 | /* |
1064 | * Priority-boosting kthread. One per leaf rcu_node and one for the | 1006 | * Priority-boosting kthread, one per leaf rcu_node. |
1065 | * root rcu_node. | ||
1066 | */ | 1007 | */ |
1067 | static int rcu_boost_kthread(void *arg) | 1008 | static int rcu_boost_kthread(void *arg) |
1068 | { | 1009 | { |
@@ -1680,12 +1621,10 @@ static int rcu_oom_notify(struct notifier_block *self, | |||
1680 | */ | 1621 | */ |
1681 | atomic_set(&oom_callback_count, 1); | 1622 | atomic_set(&oom_callback_count, 1); |
1682 | 1623 | ||
1683 | get_online_cpus(); | ||
1684 | for_each_online_cpu(cpu) { | 1624 | for_each_online_cpu(cpu) { |
1685 | smp_call_function_single(cpu, rcu_oom_notify_cpu, NULL, 1); | 1625 | smp_call_function_single(cpu, rcu_oom_notify_cpu, NULL, 1); |
1686 | cond_resched_rcu_qs(); | 1626 | cond_resched_rcu_qs(); |
1687 | } | 1627 | } |
1688 | put_online_cpus(); | ||
1689 | 1628 | ||
1690 | /* Unconditionally decrement: no need to wake ourselves up. */ | 1629 | /* Unconditionally decrement: no need to wake ourselves up. */ |
1691 | atomic_dec(&oom_callback_count); | 1630 | atomic_dec(&oom_callback_count); |
@@ -1706,8 +1645,6 @@ early_initcall(rcu_register_oom_notifier); | |||
1706 | 1645 | ||
1707 | #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ | 1646 | #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ |
1708 | 1647 | ||
1709 | #ifdef CONFIG_RCU_CPU_STALL_INFO | ||
1710 | |||
1711 | #ifdef CONFIG_RCU_FAST_NO_HZ | 1648 | #ifdef CONFIG_RCU_FAST_NO_HZ |
1712 | 1649 | ||
1713 | static void print_cpu_stall_fast_no_hz(char *cp, int cpu) | 1650 | static void print_cpu_stall_fast_no_hz(char *cp, int cpu) |
@@ -1796,33 +1733,6 @@ static void increment_cpu_stall_ticks(void) | |||
1796 | raw_cpu_inc(rsp->rda->ticks_this_gp); | 1733 | raw_cpu_inc(rsp->rda->ticks_this_gp); |
1797 | } | 1734 | } |
1798 | 1735 | ||
1799 | #else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */ | ||
1800 | |||
1801 | static void print_cpu_stall_info_begin(void) | ||
1802 | { | ||
1803 | pr_cont(" {"); | ||
1804 | } | ||
1805 | |||
1806 | static void print_cpu_stall_info(struct rcu_state *rsp, int cpu) | ||
1807 | { | ||
1808 | pr_cont(" %d", cpu); | ||
1809 | } | ||
1810 | |||
1811 | static void print_cpu_stall_info_end(void) | ||
1812 | { | ||
1813 | pr_cont("} "); | ||
1814 | } | ||
1815 | |||
1816 | static void zero_cpu_stall_ticks(struct rcu_data *rdp) | ||
1817 | { | ||
1818 | } | ||
1819 | |||
1820 | static void increment_cpu_stall_ticks(void) | ||
1821 | { | ||
1822 | } | ||
1823 | |||
1824 | #endif /* #else #ifdef CONFIG_RCU_CPU_STALL_INFO */ | ||
1825 | |||
1826 | #ifdef CONFIG_RCU_NOCB_CPU | 1736 | #ifdef CONFIG_RCU_NOCB_CPU |
1827 | 1737 | ||
1828 | /* | 1738 | /* |
diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c index 3ea7ffc7d5c4..6fc4c5ff3bb5 100644 --- a/kernel/rcu/tree_trace.c +++ b/kernel/rcu/tree_trace.c | |||
@@ -81,9 +81,9 @@ static void r_stop(struct seq_file *m, void *v) | |||
81 | static int show_rcubarrier(struct seq_file *m, void *v) | 81 | static int show_rcubarrier(struct seq_file *m, void *v) |
82 | { | 82 | { |
83 | struct rcu_state *rsp = (struct rcu_state *)m->private; | 83 | struct rcu_state *rsp = (struct rcu_state *)m->private; |
84 | seq_printf(m, "bcc: %d nbd: %lu\n", | 84 | seq_printf(m, "bcc: %d bseq: %lu\n", |
85 | atomic_read(&rsp->barrier_cpu_count), | 85 | atomic_read(&rsp->barrier_cpu_count), |
86 | rsp->n_barrier_done); | 86 | rsp->barrier_sequence); |
87 | return 0; | 87 | return 0; |
88 | } | 88 | } |
89 | 89 | ||
@@ -185,18 +185,15 @@ static int show_rcuexp(struct seq_file *m, void *v) | |||
185 | { | 185 | { |
186 | struct rcu_state *rsp = (struct rcu_state *)m->private; | 186 | struct rcu_state *rsp = (struct rcu_state *)m->private; |
187 | 187 | ||
188 | seq_printf(m, "s=%lu d=%lu w=%lu tf=%lu wd1=%lu wd2=%lu n=%lu sc=%lu dt=%lu dl=%lu dx=%lu\n", | 188 | seq_printf(m, "s=%lu wd0=%lu wd1=%lu wd2=%lu wd3=%lu n=%lu enq=%d sc=%lu\n", |
189 | atomic_long_read(&rsp->expedited_start), | 189 | rsp->expedited_sequence, |
190 | atomic_long_read(&rsp->expedited_done), | 190 | atomic_long_read(&rsp->expedited_workdone0), |
191 | atomic_long_read(&rsp->expedited_wrap), | ||
192 | atomic_long_read(&rsp->expedited_tryfail), | ||
193 | atomic_long_read(&rsp->expedited_workdone1), | 191 | atomic_long_read(&rsp->expedited_workdone1), |
194 | atomic_long_read(&rsp->expedited_workdone2), | 192 | atomic_long_read(&rsp->expedited_workdone2), |
193 | atomic_long_read(&rsp->expedited_workdone3), | ||
195 | atomic_long_read(&rsp->expedited_normal), | 194 | atomic_long_read(&rsp->expedited_normal), |
196 | atomic_long_read(&rsp->expedited_stoppedcpus), | 195 | atomic_read(&rsp->expedited_need_qs), |
197 | atomic_long_read(&rsp->expedited_done_tries), | 196 | rsp->expedited_sequence / 2); |
198 | atomic_long_read(&rsp->expedited_done_lost), | ||
199 | atomic_long_read(&rsp->expedited_done_exit)); | ||
200 | return 0; | 197 | return 0; |
201 | } | 198 | } |
202 | 199 | ||
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index afaecb7a799a..7a0b3bc7c5ed 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c | |||
@@ -62,6 +62,55 @@ MODULE_ALIAS("rcupdate"); | |||
62 | 62 | ||
63 | module_param(rcu_expedited, int, 0); | 63 | module_param(rcu_expedited, int, 0); |
64 | 64 | ||
65 | #if defined(CONFIG_DEBUG_LOCK_ALLOC) && defined(CONFIG_PREEMPT_COUNT) | ||
66 | /** | ||
67 | * rcu_read_lock_sched_held() - might we be in an RCU-sched read-side critical section? | ||
68 | * | ||
69 | * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an | ||
70 | * RCU-sched read-side critical section. In the absence of | ||
71 | * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side | ||
72 | * critical section unless it can prove otherwise. Note that disabling | ||
73 | * of preemption (including disabling irqs) counts as an RCU-sched | ||
74 | * read-side critical section. This is useful for debug checks in functions | ||
75 | * that require that they be called within an RCU-sched read-side | ||
76 | * critical section. | ||
77 | * | ||
78 | * Check debug_lockdep_rcu_enabled() to prevent false positives during boot | ||
79 | * and while lockdep is disabled. | ||
80 | * | ||
81 | * Note that if the CPU is in the idle loop from an RCU point of | ||
82 | * view (i.e., that we are in the section between rcu_idle_enter() and | ||
83 | * rcu_idle_exit()) then rcu_read_lock_held() returns false even if the CPU | ||
84 | * did an rcu_read_lock(). The reason for this is that RCU ignores CPUs | ||
85 | * that are in such a section, considering these to be in an extended quiescent | ||
86 | * state, so such a CPU is effectively never in an RCU read-side critical | ||
87 | * section regardless of what RCU primitives it invokes. This state of | ||
88 | * affairs is required --- we need to keep an RCU-free window in idle | ||
89 | * where the CPU may possibly enter into low power mode. This way we can | ||
90 | * notice an extended quiescent state to other CPUs that started a grace | ||
91 | * period. Otherwise we would delay any grace period as long as we run in | ||
92 | * the idle task. | ||
93 | * | ||
94 | * Similarly, we avoid claiming an SRCU read lock held if the current | ||
95 | * CPU is offline. | ||
96 | */ | ||
97 | int rcu_read_lock_sched_held(void) | ||
98 | { | ||
99 | int lockdep_opinion = 0; | ||
100 | |||
101 | if (!debug_lockdep_rcu_enabled()) | ||
102 | return 1; | ||
103 | if (!rcu_is_watching()) | ||
104 | return 0; | ||
105 | if (!rcu_lockdep_current_cpu_online()) | ||
106 | return 0; | ||
107 | if (debug_locks) | ||
108 | lockdep_opinion = lock_is_held(&rcu_sched_lock_map); | ||
109 | return lockdep_opinion || preempt_count() != 0 || irqs_disabled(); | ||
110 | } | ||
111 | EXPORT_SYMBOL(rcu_read_lock_sched_held); | ||
112 | #endif | ||
113 | |||
65 | #ifndef CONFIG_TINY_RCU | 114 | #ifndef CONFIG_TINY_RCU |
66 | 115 | ||
67 | static atomic_t rcu_expedited_nesting = | 116 | static atomic_t rcu_expedited_nesting = |
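The relocated rcu_read_lock_sched_held() is typically consumed by debug checks such as RCU_LOCKDEP_WARN() or the condition argument of rcu_dereference_check(). A hedged sketch of the kind of caller the comment block describes; struct mydev and its accessors are invented for illustration and are not part of this patch.

    #include <linux/rcupdate.h>

    struct mydev_table;                     /* hypothetical RCU-protected data */
    struct mydev {
            struct mydev_table __rcu *table;
    };

    static struct mydev_table *mydev_get_table(struct mydev *dev)
    {
            /* Warn unless the caller is in an RCU-sched read-side critical section. */
            RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held(),
                             "mydev_get_table() needs rcu_read_lock_sched()");
            return rcu_dereference_sched(dev->table);
    }

    static void mydev_use(struct mydev *dev)
    {
            struct mydev_table *t;

            rcu_read_lock_sched();          /* disables preemption */
            t = mydev_get_table(dev);
            /* ... use t; it stays valid until rcu_read_unlock_sched() ... */
            rcu_read_unlock_sched();
            (void)t;                        /* silence unused-variable warnings in this sketch */
    }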
@@ -269,20 +318,37 @@ void wakeme_after_rcu(struct rcu_head *head) | |||
269 | rcu = container_of(head, struct rcu_synchronize, head); | 318 | rcu = container_of(head, struct rcu_synchronize, head); |
270 | complete(&rcu->completion); | 319 | complete(&rcu->completion); |
271 | } | 320 | } |
321 | EXPORT_SYMBOL_GPL(wakeme_after_rcu); | ||
272 | 322 | ||
273 | void wait_rcu_gp(call_rcu_func_t crf) | 323 | void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array, |
324 | struct rcu_synchronize *rs_array) | ||
274 | { | 325 | { |
275 | struct rcu_synchronize rcu; | 326 | int i; |
276 | 327 | ||
277 | init_rcu_head_on_stack(&rcu.head); | 328 | /* Initialize and register callbacks for each flavor specified. */ |
278 | init_completion(&rcu.completion); | 329 | for (i = 0; i < n; i++) { |
279 | /* Will wake me after RCU finished. */ | 330 | if (checktiny && |
280 | crf(&rcu.head, wakeme_after_rcu); | 331 | (crcu_array[i] == call_rcu || |
281 | /* Wait for it. */ | 332 | crcu_array[i] == call_rcu_bh)) { |
282 | wait_for_completion(&rcu.completion); | 333 | might_sleep(); |
283 | destroy_rcu_head_on_stack(&rcu.head); | 334 | continue; |
335 | } | ||
336 | init_rcu_head_on_stack(&rs_array[i].head); | ||
337 | init_completion(&rs_array[i].completion); | ||
338 | (crcu_array[i])(&rs_array[i].head, wakeme_after_rcu); | ||
339 | } | ||
340 | |||
341 | /* Wait for all callbacks to be invoked. */ | ||
342 | for (i = 0; i < n; i++) { | ||
343 | if (checktiny && | ||
344 | (crcu_array[i] == call_rcu || | ||
345 | crcu_array[i] == call_rcu_bh)) | ||
346 | continue; | ||
347 | wait_for_completion(&rs_array[i].completion); | ||
348 | destroy_rcu_head_on_stack(&rs_array[i].head); | ||
349 | } | ||
284 | } | 350 | } |
285 | EXPORT_SYMBOL_GPL(wait_rcu_gp); | 351 | EXPORT_SYMBOL_GPL(__wait_rcu_gp); |
286 | 352 | ||
287 | #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD | 353 | #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD |
288 | void init_rcu_head(struct rcu_head *head) | 354 | void init_rcu_head(struct rcu_head *head) |
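__wait_rcu_gp() generalizes the old single-flavor wait_rcu_gp(): it posts one wakeme_after_rcu() callback per requested flavor, then waits for every completion, skipping call_rcu() and call_rcu_bh() when checktiny is set because Tiny RCU grace periods are implied by the might_sleep(). Callers are expected to be thin macro wrappers; a sketch of how such wrappers plausibly look (spelled from memory of this series, so treat the exact macro bodies as an approximation, not verbatim kernel code):

    /* Build the arrays on the stack and hand them to __wait_rcu_gp(). */
    #define _wait_rcu_gp(checktiny, ...) \
    do { \
            call_rcu_func_t __crcu_array[] = { __VA_ARGS__ }; \
            struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)]; \
            __wait_rcu_gp(checktiny, ARRAY_SIZE(__crcu_array), \
                          __crcu_array, __rs_array); \
    } while (0)

    #define wait_rcu_gp(...) _wait_rcu_gp(false, __VA_ARGS__)

    /* Wait for several flavors at once, e.g. synchronize_rcu_mult(call_rcu, call_rcu_sched). */
    #define synchronize_rcu_mult(...) \
            _wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__)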
@@ -523,8 +589,8 @@ EXPORT_SYMBOL_GPL(call_rcu_tasks); | |||
523 | void synchronize_rcu_tasks(void) | 589 | void synchronize_rcu_tasks(void) |
524 | { | 590 | { |
525 | /* Complain if the scheduler has not started. */ | 591 | /* Complain if the scheduler has not started. */ |
526 | rcu_lockdep_assert(!rcu_scheduler_active, | 592 | RCU_LOCKDEP_WARN(!rcu_scheduler_active, |
527 | "synchronize_rcu_tasks called too soon"); | 593 | "synchronize_rcu_tasks called too soon"); |
528 | 594 | ||
529 | /* Wait for the grace period. */ | 595 | /* Wait for the grace period. */ |
530 | wait_rcu_gp(call_rcu_tasks); | 596 | wait_rcu_gp(call_rcu_tasks); |
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 78b4bad10081..5e73c79fadd0 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c | |||
@@ -2200,8 +2200,8 @@ unsigned long to_ratio(u64 period, u64 runtime) | |||
2200 | #ifdef CONFIG_SMP | 2200 | #ifdef CONFIG_SMP |
2201 | inline struct dl_bw *dl_bw_of(int i) | 2201 | inline struct dl_bw *dl_bw_of(int i) |
2202 | { | 2202 | { |
2203 | rcu_lockdep_assert(rcu_read_lock_sched_held(), | 2203 | RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held(), |
2204 | "sched RCU must be held"); | 2204 | "sched RCU must be held"); |
2205 | return &cpu_rq(i)->rd->dl_bw; | 2205 | return &cpu_rq(i)->rd->dl_bw; |
2206 | } | 2206 | } |
2207 | 2207 | ||
@@ -2210,8 +2210,8 @@ static inline int dl_bw_cpus(int i) | |||
2210 | struct root_domain *rd = cpu_rq(i)->rd; | 2210 | struct root_domain *rd = cpu_rq(i)->rd; |
2211 | int cpus = 0; | 2211 | int cpus = 0; |
2212 | 2212 | ||
2213 | rcu_lockdep_assert(rcu_read_lock_sched_held(), | 2213 | RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held(), |
2214 | "sched RCU must be held"); | 2214 | "sched RCU must be held"); |
2215 | for_each_cpu_and(i, rd->span, cpu_active_mask) | 2215 | for_each_cpu_and(i, rd->span, cpu_active_mask) |
2216 | cpus++; | 2216 | cpus++; |
2217 | 2217 | ||
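The rcu_lockdep_assert() to RCU_LOCKDEP_WARN() conversions here, and in the workqueue.c and device_cgroup.c hunks below, are mechanical: the old macro warned when its condition was false, the new one warns when its condition is true, so each condition is inverted per De Morgan's laws. A before/after sketch with a placeholder lock (my_mutex is not from this patch):

    #include <linux/mutex.h>
    #include <linux/rcupdate.h>

    static DEFINE_MUTEX(my_mutex);          /* placeholder lock for illustration */

    static void assert_my_protection(void)
    {
            /*
             * Was:  rcu_lockdep_assert(rcu_read_lock_sched_held() ||
             *                          lockdep_is_held(&my_mutex), msg);
             */
            RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held() &&
                             !lockdep_is_held(&my_mutex),
                             "sched RCU or my_mutex should be held");
    }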
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig index 579ce1b929af..4008d9f95dd7 100644 --- a/kernel/time/Kconfig +++ b/kernel/time/Kconfig | |||
@@ -92,12 +92,10 @@ config NO_HZ_FULL | |||
92 | depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS | 92 | depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS |
93 | # We need at least one periodic CPU for timekeeping | 93 | # We need at least one periodic CPU for timekeeping |
94 | depends on SMP | 94 | depends on SMP |
95 | # RCU_USER_QS dependency | ||
96 | depends on HAVE_CONTEXT_TRACKING | 95 | depends on HAVE_CONTEXT_TRACKING |
97 | # VIRT_CPU_ACCOUNTING_GEN dependency | 96 | # VIRT_CPU_ACCOUNTING_GEN dependency |
98 | depends on HAVE_VIRT_CPU_ACCOUNTING_GEN | 97 | depends on HAVE_VIRT_CPU_ACCOUNTING_GEN |
99 | select NO_HZ_COMMON | 98 | select NO_HZ_COMMON |
100 | select RCU_USER_QS | ||
101 | select RCU_NOCB_CPU | 99 | select RCU_NOCB_CPU |
102 | select VIRT_CPU_ACCOUNTING_GEN | 100 | select VIRT_CPU_ACCOUNTING_GEN |
103 | select IRQ_WORK | 101 | select IRQ_WORK |
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 4c4f06176f74..cb91c63b4f4a 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c | |||
@@ -338,20 +338,20 @@ static void workqueue_sysfs_unregister(struct workqueue_struct *wq); | |||
338 | #include <trace/events/workqueue.h> | 338 | #include <trace/events/workqueue.h> |
339 | 339 | ||
340 | #define assert_rcu_or_pool_mutex() \ | 340 | #define assert_rcu_or_pool_mutex() \ |
341 | rcu_lockdep_assert(rcu_read_lock_sched_held() || \ | 341 | RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held() && \ |
342 | lockdep_is_held(&wq_pool_mutex), \ | 342 | !lockdep_is_held(&wq_pool_mutex), \ |
343 | "sched RCU or wq_pool_mutex should be held") | 343 | "sched RCU or wq_pool_mutex should be held") |
344 | 344 | ||
345 | #define assert_rcu_or_wq_mutex(wq) \ | 345 | #define assert_rcu_or_wq_mutex(wq) \ |
346 | rcu_lockdep_assert(rcu_read_lock_sched_held() || \ | 346 | RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held() && \ |
347 | lockdep_is_held(&wq->mutex), \ | 347 | !lockdep_is_held(&wq->mutex), \ |
348 | "sched RCU or wq->mutex should be held") | 348 | "sched RCU or wq->mutex should be held") |
349 | 349 | ||
350 | #define assert_rcu_or_wq_mutex_or_pool_mutex(wq) \ | 350 | #define assert_rcu_or_wq_mutex_or_pool_mutex(wq) \ |
351 | rcu_lockdep_assert(rcu_read_lock_sched_held() || \ | 351 | RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held() && \ |
352 | lockdep_is_held(&wq->mutex) || \ | 352 | !lockdep_is_held(&wq->mutex) && \ |
353 | lockdep_is_held(&wq_pool_mutex), \ | 353 | !lockdep_is_held(&wq_pool_mutex), \ |
354 | "sched RCU, wq->mutex or wq_pool_mutex should be held") | 354 | "sched RCU, wq->mutex or wq_pool_mutex should be held") |
355 | 355 | ||
356 | #define for_each_cpu_worker_pool(pool, cpu) \ | 356 | #define for_each_cpu_worker_pool(pool, cpu) \ |
357 | for ((pool) = &per_cpu(cpu_worker_pools, cpu)[0]; \ | 357 | for ((pool) = &per_cpu(cpu_worker_pools, cpu)[0]; \ |
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index e2894b23efb6..3e0b662cae09 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug | |||
@@ -1353,20 +1353,6 @@ config RCU_CPU_STALL_TIMEOUT | |||
1353 | RCU grace period persists, additional CPU stall warnings are | 1353 | RCU grace period persists, additional CPU stall warnings are |
1354 | printed at more widely spaced intervals. | 1354 | printed at more widely spaced intervals. |
1355 | 1355 | ||
1356 | config RCU_CPU_STALL_INFO | ||
1357 | bool "Print additional diagnostics on RCU CPU stall" | ||
1358 | depends on (TREE_RCU || PREEMPT_RCU) && DEBUG_KERNEL | ||
1359 | default y | ||
1360 | help | ||
1361 | For each stalled CPU that is aware of the current RCU grace | ||
1362 | period, print out additional per-CPU diagnostic information | ||
1363 | regarding scheduling-clock ticks, idle state, and, | ||
1364 | for RCU_FAST_NO_HZ kernels, idle-entry state. | ||
1365 | |||
1366 | Say N if you are unsure. | ||
1367 | |||
1368 | Say Y if you want to enable such diagnostics. | ||
1369 | |||
1370 | config RCU_TRACE | 1356 | config RCU_TRACE |
1371 | bool "Enable tracing for RCU" | 1357 | bool "Enable tracing for RCU" |
1372 | depends on DEBUG_KERNEL | 1358 | depends on DEBUG_KERNEL |
@@ -1379,7 +1365,7 @@ config RCU_TRACE | |||
1379 | Say N if you are unsure. | 1365 | Say N if you are unsure. |
1380 | 1366 | ||
1381 | config RCU_EQS_DEBUG | 1367 | config RCU_EQS_DEBUG |
1382 | bool "Use this when adding any sort of NO_HZ support to your arch" | 1368 | bool "Provide debugging asserts for adding NO_HZ support to an arch" |
1383 | depends on DEBUG_KERNEL | 1369 | depends on DEBUG_KERNEL |
1384 | help | 1370 | help |
1385 | This option provides consistency checks in RCU's handling of | 1371 | This option provides consistency checks in RCU's handling of |
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index d5c8e9a3a73c..a51ca0e5beef 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl | |||
@@ -5011,6 +5011,7 @@ sub process { | |||
5011 | "memory barrier without comment\n" . $herecurr); | 5011 | "memory barrier without comment\n" . $herecurr); |
5012 | } | 5012 | } |
5013 | } | 5013 | } |
5014 | |||
5014 | # check for waitqueue_active without a comment. | 5015 | # check for waitqueue_active without a comment. |
5015 | if ($line =~ /\bwaitqueue_active\s*\(/) { | 5016 | if ($line =~ /\bwaitqueue_active\s*\(/) { |
5016 | if (!ctx_has_comment($first_line, $linenr)) { | 5017 | if (!ctx_has_comment($first_line, $linenr)) { |
@@ -5018,6 +5019,24 @@ sub process { | |||
5018 | "waitqueue_active without comment\n" . $herecurr); | 5019 | "waitqueue_active without comment\n" . $herecurr); |
5019 | } | 5020 | } |
5020 | } | 5021 | } |
5022 | |||
5023 | # Check for expedited grace periods that interrupt non-idle non-nohz | ||
5024 | # online CPUs. These expedited primitives can therefore degrade real-time response | ||
5025 | # if used carelessly, and should be avoided where not absolutely | ||
5026 | # needed. It is always OK to use synchronize_rcu_expedited() and | ||
5027 | # synchronize_sched_expedited() at boot time (before real-time applications | ||
5028 | # start) and in error situations where real-time response is compromised in | ||
5029 | # any case. Note that synchronize_srcu_expedited() does -not- interrupt | ||
5030 | # other CPUs, so don't warn on uses of synchronize_srcu_expedited(). | ||
5031 | # Of course, nothing comes for free, and srcu_read_lock() and | ||
5032 | # srcu_read_unlock() do contain full memory barriers in payment for | ||
5033 | # synchronize_srcu_expedited()'s non-interruption properties. | ||
5034 | if ($line =~ /\b(synchronize_rcu_expedited|synchronize_sched_expedited)\(/) { | ||
5035 | WARN("EXPEDITED_RCU_GRACE_PERIOD", | ||
5036 | "expedited RCU grace periods should be avoided where they can degrade real-time response\n" . $herecurr); | ||
5037 | |||
5038 | } | ||
5039 | |||
5021 | # check of hardware specific defines | 5040 | # check of hardware specific defines |
5022 | if ($line =~ m@^.\s*\#\s*if.*\b(__i386__|__powerpc64__|__sun__|__s390x__)\b@ && $realfile !~ m@include/asm-@) { | 5041 | if ($line =~ m@^.\s*\#\s*if.*\b(__i386__|__powerpc64__|__sun__|__s390x__)\b@ && $realfile !~ m@include/asm-@) { |
5023 | CHK("ARCH_DEFINES", | 5042 | CHK("ARCH_DEFINES", |
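The new EXPEDITED_RCU_GRACE_PERIOD check fires on source lines like the hypothetical teardown path below; outside of boot time and error handling, the non-expedited primitives avoid sending IPIs to busy CPUs and so preserve real-time response (my_driver_teardown() and its guard macro are invented for illustration):

    #include <linux/rcupdate.h>

    static void my_driver_teardown(void)          /* hypothetical caller */
    {
    #ifdef MY_DRIVER_LATENCY_IRRELEVANT           /* illustrative guard only */
            synchronize_rcu_expedited();  /* checkpatch: EXPEDITED_RCU_GRACE_PERIOD */
    #else
            synchronize_rcu();            /* preferred where real-time response matters */
    #endif
    }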
diff --git a/security/device_cgroup.c b/security/device_cgroup.c index 188c1d26393b..73455089feef 100644 --- a/security/device_cgroup.c +++ b/security/device_cgroup.c | |||
@@ -400,9 +400,9 @@ static bool verify_new_ex(struct dev_cgroup *dev_cgroup, | |||
400 | { | 400 | { |
401 | bool match = false; | 401 | bool match = false; |
402 | 402 | ||
403 | rcu_lockdep_assert(rcu_read_lock_held() || | 403 | RCU_LOCKDEP_WARN(!rcu_read_lock_held() && |
404 | lockdep_is_held(&devcgroup_mutex), | 404 | !lockdep_is_held(&devcgroup_mutex), |
405 | "device_cgroup:verify_new_ex called without proper synchronization"); | 405 | "device_cgroup:verify_new_ex called without proper synchronization"); |
406 | 406 | ||
407 | if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW) { | 407 | if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW) { |
408 | if (behavior == DEVCG_DEFAULT_ALLOW) { | 408 | if (behavior == DEVCG_DEFAULT_ALLOW) { |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TASKS01 b/tools/testing/selftests/rcutorture/configs/rcu/TASKS01 index 2cc0e60eba6e..bafe94cbd739 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TASKS01 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TASKS01 | |||
@@ -5,6 +5,6 @@ CONFIG_PREEMPT_NONE=n | |||
5 | CONFIG_PREEMPT_VOLUNTARY=n | 5 | CONFIG_PREEMPT_VOLUNTARY=n |
6 | CONFIG_PREEMPT=y | 6 | CONFIG_PREEMPT=y |
7 | CONFIG_DEBUG_LOCK_ALLOC=y | 7 | CONFIG_DEBUG_LOCK_ALLOC=y |
8 | CONFIG_PROVE_LOCKING=n | 8 | CONFIG_PROVE_LOCKING=y |
9 | #CHECK#CONFIG_PROVE_RCU=n | 9 | #CHECK#CONFIG_PROVE_RCU=y |
10 | CONFIG_RCU_EXPERT=y | 10 | CONFIG_RCU_EXPERT=y |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE01 b/tools/testing/selftests/rcutorture/configs/rcu/TREE01 index 8e9137f66831..f572b873c620 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE01 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE01 | |||
@@ -13,7 +13,6 @@ CONFIG_MAXSMP=y | |||
13 | CONFIG_RCU_NOCB_CPU=y | 13 | CONFIG_RCU_NOCB_CPU=y |
14 | CONFIG_RCU_NOCB_CPU_ZERO=y | 14 | CONFIG_RCU_NOCB_CPU_ZERO=y |
15 | CONFIG_DEBUG_LOCK_ALLOC=n | 15 | CONFIG_DEBUG_LOCK_ALLOC=n |
16 | CONFIG_RCU_CPU_STALL_INFO=n | ||
17 | CONFIG_RCU_BOOST=n | 16 | CONFIG_RCU_BOOST=n |
18 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 17 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
19 | CONFIG_RCU_EXPERT=y | 18 | CONFIG_RCU_EXPERT=y |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE02 b/tools/testing/selftests/rcutorture/configs/rcu/TREE02 index aeea6a204d14..ef6a22c44dea 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE02 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE02 | |||
@@ -17,7 +17,6 @@ CONFIG_RCU_FANOUT_LEAF=3 | |||
17 | CONFIG_RCU_NOCB_CPU=n | 17 | CONFIG_RCU_NOCB_CPU=n |
18 | CONFIG_DEBUG_LOCK_ALLOC=y | 18 | CONFIG_DEBUG_LOCK_ALLOC=y |
19 | CONFIG_PROVE_LOCKING=n | 19 | CONFIG_PROVE_LOCKING=n |
20 | CONFIG_RCU_CPU_STALL_INFO=n | ||
21 | CONFIG_RCU_BOOST=n | 20 | CONFIG_RCU_BOOST=n |
22 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 21 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
23 | CONFIG_RCU_EXPERT=y | 22 | CONFIG_RCU_EXPERT=y |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE02-T b/tools/testing/selftests/rcutorture/configs/rcu/TREE02-T index 2ac9e68ea3d1..917d2517b5b5 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE02-T +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE02-T | |||
@@ -17,6 +17,5 @@ CONFIG_RCU_FANOUT_LEAF=3 | |||
17 | CONFIG_RCU_NOCB_CPU=n | 17 | CONFIG_RCU_NOCB_CPU=n |
18 | CONFIG_DEBUG_LOCK_ALLOC=y | 18 | CONFIG_DEBUG_LOCK_ALLOC=y |
19 | CONFIG_PROVE_LOCKING=n | 19 | CONFIG_PROVE_LOCKING=n |
20 | CONFIG_RCU_CPU_STALL_INFO=n | ||
21 | CONFIG_RCU_BOOST=n | 20 | CONFIG_RCU_BOOST=n |
22 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 21 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE03 b/tools/testing/selftests/rcutorture/configs/rcu/TREE03 index 72aa7d87ea99..7a17c503b382 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE03 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE03 | |||
@@ -13,7 +13,6 @@ CONFIG_RCU_FANOUT=2 | |||
13 | CONFIG_RCU_FANOUT_LEAF=2 | 13 | CONFIG_RCU_FANOUT_LEAF=2 |
14 | CONFIG_RCU_NOCB_CPU=n | 14 | CONFIG_RCU_NOCB_CPU=n |
15 | CONFIG_DEBUG_LOCK_ALLOC=n | 15 | CONFIG_DEBUG_LOCK_ALLOC=n |
16 | CONFIG_RCU_CPU_STALL_INFO=n | ||
17 | CONFIG_RCU_BOOST=y | 16 | CONFIG_RCU_BOOST=y |
18 | CONFIG_RCU_KTHREAD_PRIO=2 | 17 | CONFIG_RCU_KTHREAD_PRIO=2 |
19 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 18 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE04 b/tools/testing/selftests/rcutorture/configs/rcu/TREE04 index 3f5112751cda..39a2c6d7d7ec 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE04 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE04 | |||
@@ -17,6 +17,5 @@ CONFIG_RCU_FANOUT=4 | |||
17 | CONFIG_RCU_FANOUT_LEAF=4 | 17 | CONFIG_RCU_FANOUT_LEAF=4 |
18 | CONFIG_RCU_NOCB_CPU=n | 18 | CONFIG_RCU_NOCB_CPU=n |
19 | CONFIG_DEBUG_LOCK_ALLOC=n | 19 | CONFIG_DEBUG_LOCK_ALLOC=n |
20 | CONFIG_RCU_CPU_STALL_INFO=n | ||
21 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 20 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
22 | CONFIG_RCU_EXPERT=y | 21 | CONFIG_RCU_EXPERT=y |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE05 b/tools/testing/selftests/rcutorture/configs/rcu/TREE05 index c04dfea6fd21..1257d3227b1e 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE05 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE05 | |||
@@ -17,6 +17,5 @@ CONFIG_RCU_NOCB_CPU_NONE=y | |||
17 | CONFIG_DEBUG_LOCK_ALLOC=y | 17 | CONFIG_DEBUG_LOCK_ALLOC=y |
18 | CONFIG_PROVE_LOCKING=y | 18 | CONFIG_PROVE_LOCKING=y |
19 | #CHECK#CONFIG_PROVE_RCU=y | 19 | #CHECK#CONFIG_PROVE_RCU=y |
20 | CONFIG_RCU_CPU_STALL_INFO=n | ||
21 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 20 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
22 | CONFIG_RCU_EXPERT=y | 21 | CONFIG_RCU_EXPERT=y |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE06 b/tools/testing/selftests/rcutorture/configs/rcu/TREE06 index f51d2c73a68e..d3e456b74cbe 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE06 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE06 | |||
@@ -18,6 +18,5 @@ CONFIG_RCU_NOCB_CPU=n | |||
18 | CONFIG_DEBUG_LOCK_ALLOC=y | 18 | CONFIG_DEBUG_LOCK_ALLOC=y |
19 | CONFIG_PROVE_LOCKING=y | 19 | CONFIG_PROVE_LOCKING=y |
20 | #CHECK#CONFIG_PROVE_RCU=y | 20 | #CHECK#CONFIG_PROVE_RCU=y |
21 | CONFIG_RCU_CPU_STALL_INFO=n | ||
22 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=y | 21 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=y |
23 | CONFIG_RCU_EXPERT=y | 22 | CONFIG_RCU_EXPERT=y |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE07 b/tools/testing/selftests/rcutorture/configs/rcu/TREE07 index f422af4ff5a3..3956b4131f72 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE07 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE07 | |||
@@ -17,6 +17,5 @@ CONFIG_RCU_FANOUT=2 | |||
17 | CONFIG_RCU_FANOUT_LEAF=2 | 17 | CONFIG_RCU_FANOUT_LEAF=2 |
18 | CONFIG_RCU_NOCB_CPU=n | 18 | CONFIG_RCU_NOCB_CPU=n |
19 | CONFIG_DEBUG_LOCK_ALLOC=n | 19 | CONFIG_DEBUG_LOCK_ALLOC=n |
20 | CONFIG_RCU_CPU_STALL_INFO=n | ||
21 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 20 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
22 | CONFIG_RCU_EXPERT=y | 21 | CONFIG_RCU_EXPERT=y |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE08 b/tools/testing/selftests/rcutorture/configs/rcu/TREE08 index a24d2ca30646..bb9b0c1a23c2 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE08 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE08 | |||
@@ -19,7 +19,6 @@ CONFIG_RCU_NOCB_CPU_ALL=y | |||
19 | CONFIG_DEBUG_LOCK_ALLOC=n | 19 | CONFIG_DEBUG_LOCK_ALLOC=n |
20 | CONFIG_PROVE_LOCKING=y | 20 | CONFIG_PROVE_LOCKING=y |
21 | #CHECK#CONFIG_PROVE_RCU=y | 21 | #CHECK#CONFIG_PROVE_RCU=y |
22 | CONFIG_RCU_CPU_STALL_INFO=n | ||
23 | CONFIG_RCU_BOOST=n | 22 | CONFIG_RCU_BOOST=n |
24 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 23 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
25 | CONFIG_RCU_EXPERT=y | 24 | CONFIG_RCU_EXPERT=y |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T b/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T index b2b8cea69dc9..2ad13f0d29cc 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T | |||
@@ -17,6 +17,5 @@ CONFIG_RCU_FANOUT_LEAF=2 | |||
17 | CONFIG_RCU_NOCB_CPU=y | 17 | CONFIG_RCU_NOCB_CPU=y |
18 | CONFIG_RCU_NOCB_CPU_ALL=y | 18 | CONFIG_RCU_NOCB_CPU_ALL=y |
19 | CONFIG_DEBUG_LOCK_ALLOC=n | 19 | CONFIG_DEBUG_LOCK_ALLOC=n |
20 | CONFIG_RCU_CPU_STALL_INFO=n | ||
21 | CONFIG_RCU_BOOST=n | 20 | CONFIG_RCU_BOOST=n |
22 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 21 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE09 b/tools/testing/selftests/rcutorture/configs/rcu/TREE09 index aa4ed08d999d..6710e749d9de 100644 --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE09 +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE09 | |||
@@ -13,7 +13,6 @@ CONFIG_SUSPEND=n | |||
13 | CONFIG_HIBERNATION=n | 13 | CONFIG_HIBERNATION=n |
14 | CONFIG_RCU_NOCB_CPU=n | 14 | CONFIG_RCU_NOCB_CPU=n |
15 | CONFIG_DEBUG_LOCK_ALLOC=n | 15 | CONFIG_DEBUG_LOCK_ALLOC=n |
16 | CONFIG_RCU_CPU_STALL_INFO=n | ||
17 | CONFIG_RCU_BOOST=n | 16 | CONFIG_RCU_BOOST=n |
18 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n | 17 | CONFIG_DEBUG_OBJECTS_RCU_HEAD=n |
19 | #CHECK#CONFIG_RCU_EXPERT=n | 18 | #CHECK#CONFIG_RCU_EXPERT=n |
diff --git a/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt b/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt index b24c0004fc49..657f3a035488 100644 --- a/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt +++ b/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt | |||
@@ -16,7 +16,6 @@ CONFIG_PROVE_LOCKING -- Do several, covering CONFIG_DEBUG_LOCK_ALLOC=y and not. | |||
16 | CONFIG_PROVE_RCU -- Hardwired to CONFIG_PROVE_LOCKING. | 16 | CONFIG_PROVE_RCU -- Hardwired to CONFIG_PROVE_LOCKING. |
17 | CONFIG_RCU_BOOST -- one of PREEMPT_RCU. | 17 | CONFIG_RCU_BOOST -- one of PREEMPT_RCU. |
18 | CONFIG_RCU_KTHREAD_PRIO -- set to 2 for _BOOST testing. | 18 | CONFIG_RCU_KTHREAD_PRIO -- set to 2 for _BOOST testing. |
19 | CONFIG_RCU_CPU_STALL_INFO -- Now default, avoid at least twice. | ||
20 | CONFIG_RCU_FANOUT -- Cover hierarchy, but overlap with others. | 19 | CONFIG_RCU_FANOUT -- Cover hierarchy, but overlap with others. |
21 | CONFIG_RCU_FANOUT_LEAF -- Do one non-default. | 20 | CONFIG_RCU_FANOUT_LEAF -- Do one non-default. |
22 | CONFIG_RCU_FAST_NO_HZ -- Do one, but not with CONFIG_RCU_NOCB_CPU_ALL. | 21 | CONFIG_RCU_FAST_NO_HZ -- Do one, but not with CONFIG_RCU_NOCB_CPU_ALL. |