 Documentation/RCU/stallwarn.txt                             |  29
 Documentation/RCU/trace.txt                                 |  36
 include/trace/events/rcu.h                                  |   1
 kernel/rcu/tree.c                                           | 589
 kernel/rcu/tree.h                                           |  84
 kernel/rcu/tree_plugin.h                                    | 117
 kernel/rcu/tree_trace.c                                     |  19
 lib/Kconfig.debug                                           |  14
 tools/testing/selftests/rcutorture/configs/rcu/TREE01       |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE02       |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE02-T     |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE03       |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE04       |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE05       |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE06       |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE07       |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE08       |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE08-T     |   1
 tools/testing/selftests/rcutorture/configs/rcu/TREE09       |   1
 tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt |   1
 20 files changed, 430 insertions(+), 471 deletions(-)
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index b57c0c1cdac6..efb9454875ab 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -26,12 +26,6 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
 Stall-warning messages may be enabled and disabled completely via
 /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress.
 
-CONFIG_RCU_CPU_STALL_INFO
-
-	This kernel configuration parameter causes the stall warning to
-	print out additional per-CPU diagnostic information, including
-	information on scheduling-clock ticks and RCU's idle-CPU tracking.
-
 RCU_STALL_DELAY_DELTA
 
 	Although the lockdep facility is extremely useful, it does add
@@ -101,15 +95,13 @@ interact. Please note that it is not possible to entirely eliminate this
 sort of false positive without resorting to things like stop_machine(),
 which is overkill for this sort of problem.
 
-If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set,
-more information is printed with the stall-warning message, for example:
+Recent kernels will print a long form of the stall-warning message:
 
 	INFO: rcu_preempt detected stall on CPU
 	0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 softirq=82/543
 	(t=65000 jiffies)
 
-In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is
-printed:
+In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed:
 
 	INFO: rcu_preempt detected stall on CPU
 	0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D
@@ -171,6 +163,23 @@ message will be about three times the interval between the beginning
 of the stall and the first message.
 
 
+Stall Warnings for Expedited Grace Periods
+
+If an expedited grace period detects a stall, it will place a message
+like the following in dmesg:
+
+	INFO: rcu_sched detected expedited stalls on CPUs: { 1 2 6 } 26009 jiffies s: 1043
+
+This indicates that CPUs 1, 2, and 6 have failed to respond to a
+reschedule IPI, that the expedited grace period has been going on for
+26,009 jiffies, and that the expedited grace-period sequence counter is
+1043. The fact that this last value is odd indicates that an expedited
+grace period is in flight.
+
+It is entirely possible to see stall warnings from normal and from
+expedited grace periods at about the same time from the same run.
+
+
 What Causes RCU CPU Stall Warnings?
 
 So your kernel printed an RCU CPU stall warning. The next question is
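
The "s:" value in the expedited stall message above is the expedited grace-period sequence counter that this series introduces (see rcu_seq_start() and friends added to kernel/rcu/tree.c further down). The sketch below is a minimal user-space model of that even/odd convention; it deliberately omits the kernel's memory barriers, READ_ONCE()/WRITE_ONCE() annotations, and wrap-safe comparison, so it is illustrative only.

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned long seq;	/* stands in for rsp->expedited_sequence */

static void seq_start(void) { seq++; assert(seq & 0x1); }	/* odd: GP in flight */
static void seq_end(void)   { seq++; assert(!(seq & 0x1)); }	/* even: GP complete */
static unsigned long seq_snap(void) { return (seq + 3) & ~0x1UL; }
static bool seq_done(unsigned long s) { return seq >= s; }

int main(void)
{
	unsigned long s = seq_snap();	/* value that would satisfy a new request */

	printf("snapshot %lu, done? %d\n", s, seq_done(s));	/* 2, 0 */
	seq_start();	/* counter goes odd, like the "s: 1043" in the message */
	seq_end();	/* counter even again: one full expedited grace period */
	printf("sequence %lu, done? %d\n", seq, seq_done(s));	/* 2, 1 */
	return 0;
}
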
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index 08651da15448..97f17e9decda 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -237,42 +237,26 @@ o "ktl" is the low-order 16 bits (in hexadecimal) of the count of
 
 The output of "cat rcu/rcu_preempt/rcuexp" looks as follows:
 
-s=21872 d=21872 w=0 tf=0 wd1=0 wd2=0 n=0 sc=21872 dt=21872 dl=0 dx=21872
+s=21872 wd0=0 wd1=0 wd2=0 wd3=5 n=0 enq=0 sc=21872
 
 These fields are as follows:
 
-o	"s" is the starting sequence number.
+o	"s" is the sequence number, with an odd number indicating that
+	an expedited grace period is in progress.
 
-o	"d" is the ending sequence number.  When the starting and ending
-	numbers differ, there is an expedited grace period in progress.
-
-o	"w" is the number of times that the sequence numbers have been
-	in danger of wrapping.
-
-o	"tf" is the number of times that contention has resulted in a
-	failure to begin an expedited grace period.
-
-o	"wd1" and "wd2" are the number of times that an attempt to
-	start an expedited grace period found that someone else had
-	completed an expedited grace period that satisfies the
+o	"wd0", "wd1", "wd2", and "wd3" are the number of times that an
+	attempt to start an expedited grace period found that someone
+	else had completed an expedited grace period that satisfies the
 	attempted request.  "Our work is done."
 
-o	"n" is number of times that contention was so great that
-	the request was demoted from an expedited grace period to
-	a normal grace period.
+o	"n" is number of times that a concurrent CPU-hotplug operation
+	forced a fallback to a normal grace period.
+
+o	"enq" is the number of quiescent states still outstanding.
 
 o	"sc" is the number of times that the attempt to start a
 	new expedited grace period succeeded.
 
-o	"dt" is the number of times that we attempted to update
-	the "d" counter.
-
-o	"dl" is the number of times that we failed to update the "d"
-	counter.
-
-o	"dx" is the number of times that we succeeded in updating
-	the "d" counter.
-
 
 The output of "cat rcu/rcu_preempt/rcugp" looks as follows:
 
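
The "wd" counters described above are bumped by sync_exp_work_done() (added to kernel/rcu/tree.c below) whenever a requester finds that the sequence counter has already reached its snapshot. The snapshot is computed as (s + 3) & ~0x1; the stand-alone sketch below, using made-up counter values, shows why that expression yields the value a requester must wait for in both the even (idle) and odd (grace period in flight) cases.

#include <stdio.h>

static unsigned long seq_snap(unsigned long cur)
{
	return (cur + 3) & ~0x1UL;
}

int main(void)
{
	/* Counter even: no GP in flight, so the next full GP ends at cur + 2. */
	printf("cur=4 -> wait for %lu\n", seq_snap(4));	/* prints 6 */

	/*
	 * Counter odd: the in-flight GP ends at cur + 1, but it may have
	 * started before our updates, so we must wait for the one after
	 * it, which ends at cur + 3.
	 */
	printf("cur=5 -> wait for %lu\n", seq_snap(5));	/* prints 8 */
	return 0;
}
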
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index c78e88ce5ea3..ef72c4aada56 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -661,7 +661,6 @@ TRACE_EVENT(rcu_torture_read,
  * Tracepoint for _rcu_barrier() execution.  The string "s" describes
  * the _rcu_barrier phase:
  *	"Begin": _rcu_barrier() started.
- *	"Check": _rcu_barrier() checking for piggybacking.
  *	"EarlyExit": _rcu_barrier() piggybacked, thus early exit.
  *	"Inc1": _rcu_barrier() piggyback check counter incremented.
  *	"OfflineNoCB": _rcu_barrier() found callback on never-online CPU
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0a73d26357a2..9f75f25cc5d9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -70,6 +70,8 @@ MODULE_ALIAS("rcutree");
 
 static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
+static struct lock_class_key rcu_exp_class[RCU_NUM_LVLS];
+static struct lock_class_key rcu_exp_sched_class[RCU_NUM_LVLS];
 
 /*
  * In order to export the rcu_state name to the tracing tools, it
@@ -124,13 +126,8 @@ module_param(rcu_fanout_exact, bool, 0444);
 static int rcu_fanout_leaf = RCU_FANOUT_LEAF;
 module_param(rcu_fanout_leaf, int, 0444);
 int rcu_num_lvls __read_mostly = RCU_NUM_LVLS;
-static int num_rcu_lvl[] = {  /* Number of rcu_nodes at specified level. */
-	NUM_RCU_LVL_0,
-	NUM_RCU_LVL_1,
-	NUM_RCU_LVL_2,
-	NUM_RCU_LVL_3,
-	NUM_RCU_LVL_4,
-};
+/* Number of rcu_nodes at specified level. */
+static int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
 int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
 
 /*
@@ -1178,9 +1175,11 @@ static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp)
 	j = jiffies;
 	gpa = READ_ONCE(rsp->gp_activity);
 	if (j - gpa > 2 * HZ)
-		pr_err("%s kthread starved for %ld jiffies! g%lu c%lu f%#x\n",
+		pr_err("%s kthread starved for %ld jiffies! g%lu c%lu f%#x s%d ->state=%#lx\n",
 		       rsp->name, j - gpa,
-		       rsp->gpnum, rsp->completed, rsp->gp_flags);
+		       rsp->gpnum, rsp->completed,
+		       rsp->gp_flags, rsp->gp_state,
+		       rsp->gp_kthread ? rsp->gp_kthread->state : 0);
 }
 
 /*
@@ -1906,6 +1905,26 @@ static int rcu_gp_init(struct rcu_state *rsp)
 }
 
 /*
+ * Helper function for wait_event_interruptible_timeout() wakeup
+ * at force-quiescent-state time.
+ */
+static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
+{
+	struct rcu_node *rnp = rcu_get_root(rsp);
+
+	/* Someone like call_rcu() requested a force-quiescent-state scan. */
+	*gfp = READ_ONCE(rsp->gp_flags);
+	if (*gfp & RCU_GP_FLAG_FQS)
+		return true;
+
+	/* The current grace period has completed. */
+	if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp))
+		return true;
+
+	return false;
+}
+
+/*
  * Do one round of quiescent-state forcing.
  */
 static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
@@ -2041,6 +2060,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 			wait_event_interruptible(rsp->gp_wq,
 						 READ_ONCE(rsp->gp_flags) &
 						 RCU_GP_FLAG_INIT);
+			rsp->gp_state = RCU_GP_DONE_GPS;
 			/* Locking provides needed memory barrier. */
 			if (rcu_gp_init(rsp))
 				break;
@@ -2068,11 +2088,8 @@ static int __noreturn rcu_gp_kthread(void *arg)
 					       TPS("fqswait"));
 			rsp->gp_state = RCU_GP_WAIT_FQS;
 			ret = wait_event_interruptible_timeout(rsp->gp_wq,
-					((gf = READ_ONCE(rsp->gp_flags)) &
-					 RCU_GP_FLAG_FQS) ||
-					(!READ_ONCE(rnp->qsmask) &&
-					 !rcu_preempt_blocked_readers_cgp(rnp)),
-					j);
+					rcu_gp_fqs_check_wake(rsp, &gf), j);
+			rsp->gp_state = RCU_GP_DOING_FQS;
 			/* Locking provides needed memory barriers. */
 			/* If grace period done, leave loop. */
 			if (!READ_ONCE(rnp->qsmask) &&
@@ -2110,7 +2127,9 @@ static int __noreturn rcu_gp_kthread(void *arg)
 		}
 
 		/* Handle grace-period end. */
+		rsp->gp_state = RCU_GP_CLEANUP;
 		rcu_gp_cleanup(rsp);
+		rsp->gp_state = RCU_GP_CLEANED;
 	}
 }
 
@@ -3305,23 +3324,195 @@ void cond_synchronize_sched(unsigned long oldstate)
 }
 EXPORT_SYMBOL_GPL(cond_synchronize_sched);
 
-static int synchronize_sched_expedited_cpu_stop(void *data)
+/* Adjust sequence number for start of update-side operation. */
+static void rcu_seq_start(unsigned long *sp)
+{
+	WRITE_ONCE(*sp, *sp + 1);
+	smp_mb(); /* Ensure update-side operation after counter increment. */
+	WARN_ON_ONCE(!(*sp & 0x1));
+}
+
+/* Adjust sequence number for end of update-side operation. */
+static void rcu_seq_end(unsigned long *sp)
+{
+	smp_mb(); /* Ensure update-side operation before counter increment. */
+	WRITE_ONCE(*sp, *sp + 1);
+	WARN_ON_ONCE(*sp & 0x1);
+}
+
+/* Take a snapshot of the update side's sequence number. */
+static unsigned long rcu_seq_snap(unsigned long *sp)
+{
+	unsigned long s;
+
+	smp_mb(); /* Caller's modifications seen first by other CPUs. */
+	s = (READ_ONCE(*sp) + 3) & ~0x1;
+	smp_mb(); /* Above access must not bleed into critical section. */
+	return s;
+}
+
+/*
+ * Given a snapshot from rcu_seq_snap(), determine whether or not a
+ * full update-side operation has occurred.
+ */
+static bool rcu_seq_done(unsigned long *sp, unsigned long s)
+{
+	return ULONG_CMP_GE(READ_ONCE(*sp), s);
+}
+
+/* Wrapper functions for expedited grace periods.  */
+static void rcu_exp_gp_seq_start(struct rcu_state *rsp)
+{
+	rcu_seq_start(&rsp->expedited_sequence);
+}
+static void rcu_exp_gp_seq_end(struct rcu_state *rsp)
+{
+	rcu_seq_end(&rsp->expedited_sequence);
+	smp_mb(); /* Ensure that consecutive grace periods serialize. */
+}
+static unsigned long rcu_exp_gp_seq_snap(struct rcu_state *rsp)
+{
+	return rcu_seq_snap(&rsp->expedited_sequence);
+}
+static bool rcu_exp_gp_seq_done(struct rcu_state *rsp, unsigned long s)
+{
+	return rcu_seq_done(&rsp->expedited_sequence, s);
+}
+
+/* Common code for synchronize_{rcu,sched}_expedited() work-done checking. */
+static bool sync_exp_work_done(struct rcu_state *rsp, struct rcu_node *rnp,
+			       struct rcu_data *rdp,
+			       atomic_long_t *stat, unsigned long s)
 {
+	if (rcu_exp_gp_seq_done(rsp, s)) {
+		if (rnp)
+			mutex_unlock(&rnp->exp_funnel_mutex);
+		else if (rdp)
+			mutex_unlock(&rdp->exp_funnel_mutex);
+		/* Ensure test happens before caller kfree(). */
+		smp_mb__before_atomic(); /* ^^^ */
+		atomic_long_inc(stat);
+		return true;
+	}
+	return false;
+}
+
+/*
+ * Funnel-lock acquisition for expedited grace periods.  Returns a
+ * pointer to the root rcu_node structure, or NULL if some other
+ * task did the expedited grace period for us.
+ */
+static struct rcu_node *exp_funnel_lock(struct rcu_state *rsp, unsigned long s)
+{
+	struct rcu_data *rdp;
+	struct rcu_node *rnp0;
+	struct rcu_node *rnp1 = NULL;
+
 	/*
-	 * There must be a full memory barrier on each affected CPU
-	 * between the time that try_stop_cpus() is called and the
-	 * time that it returns.
-	 *
-	 * In the current initial implementation of cpu_stop, the
-	 * above condition is already met when the control reaches
-	 * this point and the following smp_mb() is not strictly
-	 * necessary.  Do smp_mb() anyway for documentation and
-	 * robustness against future implementation changes.
+	 * First try directly acquiring the root lock in order to reduce
+	 * latency in the common case where expedited grace periods are
+	 * rare.  We check mutex_is_locked() to avoid pathological levels of
+	 * memory contention on ->exp_funnel_mutex in the heavy-load case.
 	 */
-	smp_mb(); /* See above comment block. */
+	rnp0 = rcu_get_root(rsp);
+	if (!mutex_is_locked(&rnp0->exp_funnel_mutex)) {
+		if (mutex_trylock(&rnp0->exp_funnel_mutex)) {
+			if (sync_exp_work_done(rsp, rnp0, NULL,
+					       &rsp->expedited_workdone0, s))
+				return NULL;
+			return rnp0;
+		}
+	}
+
+	/*
+	 * Each pass through the following loop works its way
+	 * up the rcu_node tree, returning if others have done the
+	 * work or otherwise falls through holding the root rnp's
+	 * ->exp_funnel_mutex.  The mapping from CPU to rcu_node structure
+	 * can be inexact, as it is just promoting locality and is not
+	 * strictly needed for correctness.
+	 */
+	rdp = per_cpu_ptr(rsp->rda, raw_smp_processor_id());
+	if (sync_exp_work_done(rsp, NULL, NULL, &rsp->expedited_workdone1, s))
+		return NULL;
+	mutex_lock(&rdp->exp_funnel_mutex);
+	rnp0 = rdp->mynode;
+	for (; rnp0 != NULL; rnp0 = rnp0->parent) {
+		if (sync_exp_work_done(rsp, rnp1, rdp,
+				       &rsp->expedited_workdone2, s))
+			return NULL;
+		mutex_lock(&rnp0->exp_funnel_mutex);
+		if (rnp1)
+			mutex_unlock(&rnp1->exp_funnel_mutex);
+		else
+			mutex_unlock(&rdp->exp_funnel_mutex);
+		rnp1 = rnp0;
+	}
+	if (sync_exp_work_done(rsp, rnp1, rdp,
+			       &rsp->expedited_workdone3, s))
+		return NULL;
+	return rnp1;
+}
+
+/* Invoked on each online non-idle CPU for expedited quiescent state. */
+static int synchronize_sched_expedited_cpu_stop(void *data)
+{
+	struct rcu_data *rdp = data;
+	struct rcu_state *rsp = rdp->rsp;
+
+	/* We are here: If we are last, do the wakeup. */
+	rdp->exp_done = true;
+	if (atomic_dec_and_test(&rsp->expedited_need_qs))
+		wake_up(&rsp->expedited_wq);
 	return 0;
 }
 
+static void synchronize_sched_expedited_wait(struct rcu_state *rsp)
+{
+	int cpu;
+	unsigned long jiffies_stall;
+	unsigned long jiffies_start;
+	struct rcu_data *rdp;
+	int ret;
+
+	jiffies_stall = rcu_jiffies_till_stall_check();
+	jiffies_start = jiffies;
+
+	for (;;) {
+		ret = wait_event_interruptible_timeout(
+				rsp->expedited_wq,
+				!atomic_read(&rsp->expedited_need_qs),
+				jiffies_stall);
+		if (ret > 0)
+			return;
+		if (ret < 0) {
+			/* Hit a signal, disable CPU stall warnings. */
+			wait_event(rsp->expedited_wq,
+				   !atomic_read(&rsp->expedited_need_qs));
+			return;
+		}
+		pr_err("INFO: %s detected expedited stalls on CPUs: {",
+		       rsp->name);
+		for_each_online_cpu(cpu) {
+			rdp = per_cpu_ptr(rsp->rda, cpu);
+
+			if (rdp->exp_done)
+				continue;
+			pr_cont(" %d", cpu);
+		}
+		pr_cont(" } %lu jiffies s: %lu\n",
+			jiffies - jiffies_start, rsp->expedited_sequence);
+		for_each_online_cpu(cpu) {
+			rdp = per_cpu_ptr(rsp->rda, cpu);
+
+			if (rdp->exp_done)
+				continue;
+			dump_cpu_task(cpu);
+		}
+		jiffies_stall = 3 * rcu_jiffies_till_stall_check() + 3;
+	}
+}
+
 /**
  * synchronize_sched_expedited - Brute-force RCU-sched grace period
  *
@@ -3333,58 +3524,21 @@ static int synchronize_sched_expedited_cpu_stop(void *data)
  * restructure your code to batch your updates, and then use a single
  * synchronize_sched() instead.
  *
- * This implementation can be thought of as an application of ticket
- * locking to RCU, with sync_sched_expedited_started and
- * sync_sched_expedited_done taking on the roles of the halves
- * of the ticket-lock word.  Each task atomically increments
- * sync_sched_expedited_started upon entry, snapshotting the old value,
- * then attempts to stop all the CPUs.  If this succeeds, then each
- * CPU will have executed a context switch, resulting in an RCU-sched
- * grace period.  We are then done, so we use atomic_cmpxchg() to
- * update sync_sched_expedited_done to match our snapshot -- but
- * only if someone else has not already advanced past our snapshot.
- *
- * On the other hand, if try_stop_cpus() fails, we check the value
- * of sync_sched_expedited_done.  If it has advanced past our
- * initial snapshot, then someone else must have forced a grace period
- * some time after we took our snapshot.  In this case, our work is
- * done for us, and we can simply return.  Otherwise, we try again,
- * but keep our initial snapshot for purposes of checking for someone
- * doing our work for us.
- *
- * If we fail too many times in a row, we fall back to synchronize_sched().
+ * This implementation can be thought of as an application of sequence
+ * locking to expedited grace periods, but using the sequence counter to
+ * determine when someone else has already done the work instead of for
+ * retrying readers.
  */
 void synchronize_sched_expedited(void)
 {
-	cpumask_var_t cm;
-	bool cma = false;
 	int cpu;
-	long firstsnap, s, snap;
-	int trycount = 0;
+	unsigned long s;
+	struct rcu_node *rnp;
 	struct rcu_state *rsp = &rcu_sched_state;
 
-	/*
-	 * If we are in danger of counter wrap, just do synchronize_sched().
-	 * By allowing sync_sched_expedited_started to advance no more than
-	 * ULONG_MAX/8 ahead of sync_sched_expedited_done, we are ensuring
-	 * that more than 3.5 billion CPUs would be required to force a
-	 * counter wrap on a 32-bit system.  Quite a few more CPUs would of
-	 * course be required on a 64-bit system.
-	 */
-	if (ULONG_CMP_GE((ulong)atomic_long_read(&rsp->expedited_start),
-			 (ulong)atomic_long_read(&rsp->expedited_done) +
-			 ULONG_MAX / 8)) {
-		wait_rcu_gp(call_rcu_sched);
-		atomic_long_inc(&rsp->expedited_wrap);
-		return;
-	}
+	/* Take a snapshot of the sequence number.  */
+	s = rcu_exp_gp_seq_snap(rsp);
 
-	/*
-	 * Take a ticket.  Note that atomic_inc_return() implies a
-	 * full memory barrier.
-	 */
-	snap = atomic_long_inc_return(&rsp->expedited_start);
-	firstsnap = snap;
 	if (!try_get_online_cpus()) {
 		/* CPU hotplug operation in flight, fall back to normal GP. */
 		wait_rcu_gp(call_rcu_sched);
@@ -3393,100 +3547,38 @@ void synchronize_sched_expedited(void)
 	}
 	WARN_ON_ONCE(cpu_is_offline(raw_smp_processor_id()));
 
-	/* Offline CPUs, idle CPUs, and any CPU we run on are quiescent. */
-	cma = zalloc_cpumask_var(&cm, GFP_KERNEL);
-	if (cma) {
-		cpumask_copy(cm, cpu_online_mask);
-		cpumask_clear_cpu(raw_smp_processor_id(), cm);
-		for_each_cpu(cpu, cm) {
-			struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
-
-			if (!(atomic_add_return(0, &rdtp->dynticks) & 0x1))
-				cpumask_clear_cpu(cpu, cm);
-		}
-		if (cpumask_weight(cm) == 0)
-			goto all_cpus_idle;
+	rnp = exp_funnel_lock(rsp, s);
+	if (rnp == NULL) {
+		put_online_cpus();
+		return;  /* Someone else did our work for us. */
 	}
 
-	/*
-	 * Each pass through the following loop attempts to force a
-	 * context switch on each CPU.
-	 */
-	while (try_stop_cpus(cma ? cm : cpu_online_mask,
-			     synchronize_sched_expedited_cpu_stop,
-			     NULL) == -EAGAIN) {
-		put_online_cpus();
-		atomic_long_inc(&rsp->expedited_tryfail);
-
-		/* Check to see if someone else did our work for us. */
-		s = atomic_long_read(&rsp->expedited_done);
-		if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) {
-			/* ensure test happens before caller kfree */
-			smp_mb__before_atomic(); /* ^^^ */
-			atomic_long_inc(&rsp->expedited_workdone1);
-			free_cpumask_var(cm);
-			return;
-		}
+	rcu_exp_gp_seq_start(rsp);
 
-		/* No joy, try again later.  Or just synchronize_sched(). */
-		if (trycount++ < 10) {
-			udelay(trycount * num_online_cpus());
-		} else {
-			wait_rcu_gp(call_rcu_sched);
-			atomic_long_inc(&rsp->expedited_normal);
-			free_cpumask_var(cm);
-			return;
-		}
+	/* Stop each CPU that is online, non-idle, and not us. */
+	init_waitqueue_head(&rsp->expedited_wq);
+	atomic_set(&rsp->expedited_need_qs, 1); /* Extra count avoids race. */
+	for_each_online_cpu(cpu) {
+		struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
+		struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
 
-		/* Recheck to see if someone else did our work for us. */
-		s = atomic_long_read(&rsp->expedited_done);
-		if (ULONG_CMP_GE((ulong)s, (ulong)firstsnap)) {
-			/* ensure test happens before caller kfree */
-			smp_mb__before_atomic(); /* ^^^ */
-			atomic_long_inc(&rsp->expedited_workdone2);
-			free_cpumask_var(cm);
-			return;
-		}
+		rdp->exp_done = false;
 
-		/*
-		 * Refetching sync_sched_expedited_started allows later
-		 * callers to piggyback on our grace period.  We retry
-		 * after they started, so our grace period works for them,
-		 * and they started after our first try, so their grace
-		 * period works for us.
-		 */
-		if (!try_get_online_cpus()) {
-			/* CPU hotplug operation in flight, use normal GP. */
-			wait_rcu_gp(call_rcu_sched);
-			atomic_long_inc(&rsp->expedited_normal);
-			free_cpumask_var(cm);
-			return;
-		}
-		snap = atomic_long_read(&rsp->expedited_start);
-		smp_mb(); /* ensure read is before try_stop_cpus(). */
+		/* Skip our CPU and any idle CPUs. */
+		if (raw_smp_processor_id() == cpu ||
+		    !(atomic_add_return(0, &rdtp->dynticks) & 0x1))
+			continue;
+		atomic_inc(&rsp->expedited_need_qs);
+		stop_one_cpu_nowait(cpu, synchronize_sched_expedited_cpu_stop,
+				    rdp, &rdp->exp_stop_work);
 	}
-	atomic_long_inc(&rsp->expedited_stoppedcpus);
 
-all_cpus_idle:
-	free_cpumask_var(cm);
+	/* Remove extra count and, if necessary, wait for CPUs to stop. */
+	if (!atomic_dec_and_test(&rsp->expedited_need_qs))
+		synchronize_sched_expedited_wait(rsp);
 
-	/*
-	 * Everyone up to our most recent fetch is covered by our grace
-	 * period.  Update the counter, but only if our work is still
-	 * relevant -- which it won't be if someone who started later
-	 * than we did already did their update.
-	 */
-	do {
-		atomic_long_inc(&rsp->expedited_done_tries);
-		s = atomic_long_read(&rsp->expedited_done);
-		if (ULONG_CMP_GE((ulong)s, (ulong)snap)) {
-			/* ensure test happens before caller kfree */
-			smp_mb__before_atomic(); /* ^^^ */
-			atomic_long_inc(&rsp->expedited_done_lost);
-			break;
-		}
-	} while (atomic_long_cmpxchg(&rsp->expedited_done, s, snap) != s);
-	atomic_long_inc(&rsp->expedited_done_exit);
+	rcu_exp_gp_seq_end(rsp);
+	mutex_unlock(&rnp->exp_funnel_mutex);
 
 	put_online_cpus();
 }
@@ -3623,10 +3715,10 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
 	struct rcu_state *rsp = rdp->rsp;
 
 	if (atomic_dec_and_test(&rsp->barrier_cpu_count)) {
-		_rcu_barrier_trace(rsp, "LastCB", -1, rsp->n_barrier_done);
+		_rcu_barrier_trace(rsp, "LastCB", -1, rsp->barrier_sequence);
 		complete(&rsp->barrier_completion);
 	} else {
-		_rcu_barrier_trace(rsp, "CB", -1, rsp->n_barrier_done);
+		_rcu_barrier_trace(rsp, "CB", -1, rsp->barrier_sequence);
 	}
 }
 
@@ -3638,7 +3730,7 @@ static void rcu_barrier_func(void *type)
 	struct rcu_state *rsp = type;
 	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
 
-	_rcu_barrier_trace(rsp, "IRQ", -1, rsp->n_barrier_done);
+	_rcu_barrier_trace(rsp, "IRQ", -1, rsp->barrier_sequence);
 	atomic_inc(&rsp->barrier_cpu_count);
 	rsp->call(&rdp->barrier_head, rcu_barrier_callback);
 }
@@ -3651,55 +3743,24 @@ static void _rcu_barrier(struct rcu_state *rsp)
 {
 	int cpu;
 	struct rcu_data *rdp;
-	unsigned long snap = READ_ONCE(rsp->n_barrier_done);
-	unsigned long snap_done;
+	unsigned long s = rcu_seq_snap(&rsp->barrier_sequence);
 
-	_rcu_barrier_trace(rsp, "Begin", -1, snap);
+	_rcu_barrier_trace(rsp, "Begin", -1, s);
 
 	/* Take mutex to serialize concurrent rcu_barrier() requests. */
 	mutex_lock(&rsp->barrier_mutex);
 
-	/*
-	 * Ensure that all prior references, including to ->n_barrier_done,
-	 * are ordered before the _rcu_barrier() machinery.
-	 */
-	smp_mb();  /* See above block comment. */
-
-	/*
-	 * Recheck ->n_barrier_done to see if others did our work for us.
-	 * This means checking ->n_barrier_done for an even-to-odd-to-even
-	 * transition.  The "if" expression below therefore rounds the old
-	 * value up to the next even number and adds two before comparing.
-	 */
-	snap_done = rsp->n_barrier_done;
-	_rcu_barrier_trace(rsp, "Check", -1, snap_done);
-
-	/*
-	 * If the value in snap is odd, we needed to wait for the current
-	 * rcu_barrier() to complete, then wait for the next one, in other
-	 * words, we need the value of snap_done to be three larger than
-	 * the value of snap.  On the other hand, if the value in snap is
-	 * even, we only had to wait for the next rcu_barrier() to complete,
-	 * in other words, we need the value of snap_done to be only two
-	 * greater than the value of snap.  The "(snap + 3) & ~0x1" computes
-	 * this for us (thank you, Linus!).
-	 */
-	if (ULONG_CMP_GE(snap_done, (snap + 3) & ~0x1)) {
-		_rcu_barrier_trace(rsp, "EarlyExit", -1, snap_done);
+	/* Did someone else do our work for us? */
+	if (rcu_seq_done(&rsp->barrier_sequence, s)) {
+		_rcu_barrier_trace(rsp, "EarlyExit", -1, rsp->barrier_sequence);
 		smp_mb(); /* caller's subsequent code after above check. */
 		mutex_unlock(&rsp->barrier_mutex);
 		return;
 	}
 
-	/*
-	 * Increment ->n_barrier_done to avoid duplicate work.  Use
-	 * WRITE_ONCE() to prevent the compiler from speculating
-	 * the increment to precede the early-exit check.
-	 */
-	WRITE_ONCE(rsp->n_barrier_done, rsp->n_barrier_done + 1);
-	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 1);
-	_rcu_barrier_trace(rsp, "Inc1", -1, rsp->n_barrier_done);
-	smp_mb(); /* Order ->n_barrier_done increment with below mechanism. */
+	/* Mark the start of the barrier operation. */
+	rcu_seq_start(&rsp->barrier_sequence);
+	_rcu_barrier_trace(rsp, "Inc1", -1, rsp->barrier_sequence);
 
 	/*
 	 * Initialize the count to one rather than to zero in order to
@@ -3723,10 +3784,10 @@ static void _rcu_barrier(struct rcu_state *rsp)
 		if (rcu_is_nocb_cpu(cpu)) {
 			if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) {
 				_rcu_barrier_trace(rsp, "OfflineNoCB", cpu,
-						   rsp->n_barrier_done);
+						   rsp->barrier_sequence);
 			} else {
 				_rcu_barrier_trace(rsp, "OnlineNoCB", cpu,
-						   rsp->n_barrier_done);
+						   rsp->barrier_sequence);
 				smp_mb__before_atomic();
 				atomic_inc(&rsp->barrier_cpu_count);
 				__call_rcu(&rdp->barrier_head,
@@ -3734,11 +3795,11 @@ static void _rcu_barrier(struct rcu_state *rsp)
 			}
 		} else if (READ_ONCE(rdp->qlen)) {
 			_rcu_barrier_trace(rsp, "OnlineQ", cpu,
-					   rsp->n_barrier_done);
+					   rsp->barrier_sequence);
 			smp_call_function_single(cpu, rcu_barrier_func, rsp, 1);
 		} else {
 			_rcu_barrier_trace(rsp, "OnlineNQ", cpu,
-					   rsp->n_barrier_done);
+					   rsp->barrier_sequence);
 		}
 	}
 	put_online_cpus();
@@ -3750,16 +3811,13 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
 		complete(&rsp->barrier_completion);
 
-	/* Increment ->n_barrier_done to prevent duplicate work. */
-	smp_mb(); /* Keep increment after above mechanism. */
-	WRITE_ONCE(rsp->n_barrier_done, rsp->n_barrier_done + 1);
-	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 0);
-	_rcu_barrier_trace(rsp, "Inc2", -1, rsp->n_barrier_done);
-	smp_mb(); /* Keep increment before caller's subsequent code. */
-
 	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
 	wait_for_completion(&rsp->barrier_completion);
 
+	/* Mark the end of the barrier operation. */
+	_rcu_barrier_trace(rsp, "Inc2", -1, rsp->barrier_sequence);
+	rcu_seq_end(&rsp->barrier_sequence);
+
 	/* Other rcu_barrier() invocations can now safely proceed. */
 	mutex_unlock(&rsp->barrier_mutex);
 }
@@ -3822,6 +3880,7 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
 	WARN_ON_ONCE(atomic_read(&rdp->dynticks->dynticks) != 1);
 	rdp->cpu = cpu;
 	rdp->rsp = rsp;
+	mutex_init(&rdp->exp_funnel_mutex);
 	rcu_boot_init_nocb_percpu_data(rdp);
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
 }
@@ -4013,22 +4072,22 @@ void rcu_scheduler_starting(void)
  * Compute the per-level fanout, either using the exact fanout specified
  * or balancing the tree, depending on the rcu_fanout_exact boot parameter.
  */
-static void __init rcu_init_levelspread(struct rcu_state *rsp)
+static void __init rcu_init_levelspread(int *levelspread, const int *levelcnt)
 {
 	int i;
 
 	if (rcu_fanout_exact) {
-		rsp->levelspread[rcu_num_lvls - 1] = rcu_fanout_leaf;
+		levelspread[rcu_num_lvls - 1] = rcu_fanout_leaf;
 		for (i = rcu_num_lvls - 2; i >= 0; i--)
-			rsp->levelspread[i] = RCU_FANOUT;
+			levelspread[i] = RCU_FANOUT;
 	} else {
 		int ccur;
 		int cprv;
 
 		cprv = nr_cpu_ids;
 		for (i = rcu_num_lvls - 1; i >= 0; i--) {
-			ccur = rsp->levelcnt[i];
-			rsp->levelspread[i] = (cprv + ccur - 1) / ccur;
+			ccur = levelcnt[i];
+			levelspread[i] = (cprv + ccur - 1) / ccur;
 			cprv = ccur;
 		}
 	}
@@ -4040,23 +4099,20 @@ static void __init rcu_init_levelspread(struct rcu_state *rsp)
 static void __init rcu_init_one(struct rcu_state *rsp,
 		struct rcu_data __percpu *rda)
 {
-	static const char * const buf[] = {
-		"rcu_node_0",
-		"rcu_node_1",
-		"rcu_node_2",
-		"rcu_node_3" };  /* Match MAX_RCU_LVLS */
-	static const char * const fqs[] = {
-		"rcu_node_fqs_0",
-		"rcu_node_fqs_1",
-		"rcu_node_fqs_2",
-		"rcu_node_fqs_3" };  /* Match MAX_RCU_LVLS */
+	static const char * const buf[] = RCU_NODE_NAME_INIT;
+	static const char * const fqs[] = RCU_FQS_NAME_INIT;
+	static const char * const exp[] = RCU_EXP_NAME_INIT;
+	static const char * const exp_sched[] = RCU_EXP_SCHED_NAME_INIT;
 	static u8 fl_mask = 0x1;
+
+	int levelcnt[RCU_NUM_LVLS];		/* # nodes in each level. */
+	int levelspread[RCU_NUM_LVLS];		/* kids/node in each level. */
 	int cpustride = 1;
 	int i;
 	int j;
 	struct rcu_node *rnp;
 
-	BUILD_BUG_ON(MAX_RCU_LVLS > ARRAY_SIZE(buf));  /* Fix buf[] init! */
+	BUILD_BUG_ON(RCU_NUM_LVLS > ARRAY_SIZE(buf));  /* Fix buf[] init! */
 
 	/* Silence gcc 4.8 false positive about array index out of range. */
 	if (rcu_num_lvls <= 0 || rcu_num_lvls > RCU_NUM_LVLS)
@@ -4065,19 +4121,19 @@ static void __init rcu_init_one(struct rcu_state *rsp,
 	/* Initialize the level-tracking arrays. */
 
 	for (i = 0; i < rcu_num_lvls; i++)
-		rsp->levelcnt[i] = num_rcu_lvl[i];
+		levelcnt[i] = num_rcu_lvl[i];
 	for (i = 1; i < rcu_num_lvls; i++)
-		rsp->level[i] = rsp->level[i - 1] + rsp->levelcnt[i - 1];
-	rcu_init_levelspread(rsp);
+		rsp->level[i] = rsp->level[i - 1] + levelcnt[i - 1];
+	rcu_init_levelspread(levelspread, levelcnt);
 	rsp->flavor_mask = fl_mask;
 	fl_mask <<= 1;
 
 	/* Initialize the elements themselves, starting from the leaves. */
 
 	for (i = rcu_num_lvls - 1; i >= 0; i--) {
-		cpustride *= rsp->levelspread[i];
+		cpustride *= levelspread[i];
 		rnp = rsp->level[i];
-		for (j = 0; j < rsp->levelcnt[i]; j++, rnp++) {
+		for (j = 0; j < levelcnt[i]; j++, rnp++) {
 			raw_spin_lock_init(&rnp->lock);
 			lockdep_set_class_and_name(&rnp->lock,
 						   &rcu_node_class[i], buf[i]);
@@ -4097,14 +4153,23 @@ static void __init rcu_init_one(struct rcu_state *rsp,
 				rnp->grpmask = 0;
 				rnp->parent = NULL;
 			} else {
-				rnp->grpnum = j % rsp->levelspread[i - 1];
+				rnp->grpnum = j % levelspread[i - 1];
 				rnp->grpmask = 1UL << rnp->grpnum;
 				rnp->parent = rsp->level[i - 1] +
-					      j / rsp->levelspread[i - 1];
+					      j / levelspread[i - 1];
 			}
 			rnp->level = i;
 			INIT_LIST_HEAD(&rnp->blkd_tasks);
 			rcu_init_one_nocb(rnp);
+			mutex_init(&rnp->exp_funnel_mutex);
+			if (rsp == &rcu_sched_state)
+				lockdep_set_class_and_name(
+					&rnp->exp_funnel_mutex,
+					&rcu_exp_sched_class[i], exp_sched[i]);
+			else
+				lockdep_set_class_and_name(
+					&rnp->exp_funnel_mutex,
+					&rcu_exp_class[i], exp[i]);
 		}
 	}
 
@@ -4128,9 +4193,7 @@ static void __init rcu_init_geometry(void)
 {
 	ulong d;
 	int i;
-	int j;
-	int n = nr_cpu_ids;
-	int rcu_capacity[MAX_RCU_LVLS + 1];
+	int rcu_capacity[RCU_NUM_LVLS];
 
 	/*
 	 * Initialize any unspecified boot parameters.
@@ -4153,47 +4216,49 @@ static void __init rcu_init_geometry(void)
 		rcu_fanout_leaf, nr_cpu_ids);
 
 	/*
-	 * Compute number of nodes that can be handled an rcu_node tree
-	 * with the given number of levels.  Setting rcu_capacity[0] makes
-	 * some of the arithmetic easier.
-	 */
-	rcu_capacity[0] = 1;
-	rcu_capacity[1] = rcu_fanout_leaf;
-	for (i = 2; i <= MAX_RCU_LVLS; i++)
-		rcu_capacity[i] = rcu_capacity[i - 1] * RCU_FANOUT;
-
-	/*
 	 * The boot-time rcu_fanout_leaf parameter is only permitted
 	 * to increase the leaf-level fanout, not decrease it.  Of course,
 	 * the leaf-level fanout cannot exceed the number of bits in
-	 * the rcu_node masks.  Finally, the tree must be able to accommodate
-	 * the configured number of CPUs.  Complain and fall back to the
-	 * compile-time values if these limits are exceeded.
+	 * the rcu_node masks.  Complain and fall back to the compile-
+	 * time values if these limits are exceeded.
 	 */
 	if (rcu_fanout_leaf < RCU_FANOUT_LEAF ||
-	    rcu_fanout_leaf > sizeof(unsigned long) * 8 ||
-	    n > rcu_capacity[MAX_RCU_LVLS]) {
+	    rcu_fanout_leaf > sizeof(unsigned long) * 8) {
+		rcu_fanout_leaf = RCU_FANOUT_LEAF;
 		WARN_ON(1);
 		return;
 	}
 
+	/*
+	 * Compute number of nodes that can be handled an rcu_node tree
+	 * with the given number of levels.
+	 */
+	rcu_capacity[0] = rcu_fanout_leaf;
+	for (i = 1; i < RCU_NUM_LVLS; i++)
+		rcu_capacity[i] = rcu_capacity[i - 1] * RCU_FANOUT;
+
+	/*
+	 * The tree must be able to accommodate the configured number of CPUs.
+	 * If this limit is exceeded than we have a serious problem elsewhere.
+	 */
+	if (nr_cpu_ids > rcu_capacity[RCU_NUM_LVLS - 1])
+		panic("rcu_init_geometry: rcu_capacity[] is too small");
+
+	/* Calculate the number of levels in the tree. */
+	for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) {
+	}
+	rcu_num_lvls = i + 1;
+
 	/* Calculate the number of rcu_nodes at each level of the tree. */
-	for (i = 1; i <= MAX_RCU_LVLS; i++)
-		if (n <= rcu_capacity[i]) {
-			for (j = 0; j <= i; j++)
-				num_rcu_lvl[j] =
-					DIV_ROUND_UP(n, rcu_capacity[i - j]);
-			rcu_num_lvls = i;
-			for (j = i + 1; j <= MAX_RCU_LVLS; j++)
-				num_rcu_lvl[j] = 0;
-			break;
-		}
+	for (i = 0; i < rcu_num_lvls; i++) {
+		int cap = rcu_capacity[(rcu_num_lvls - 1) - i];
+		num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap);
+	}
 
 	/* Calculate the total number of rcu_node structures. */
 	rcu_num_nodes = 0;
-	for (i = 0; i <= MAX_RCU_LVLS; i++)
+	for (i = 0; i < rcu_num_lvls; i++)
 		rcu_num_nodes += num_rcu_lvl[i];
-	rcu_num_nodes -= n;
 }
 
 /*
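
For reference, the reworked rcu_init_geometry() sizing logic above can be modeled in user space. The sketch below uses hypothetical values (nr_cpu_ids = 96, RCU_FANOUT = 64, rcu_fanout_leaf = 16) and leaves out the boot-parameter validation and the panic() overflow check that the real function performs.

#include <stdio.h>

#define RCU_FANOUT	64
#define RCU_NUM_LVLS	4
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

int main(void)
{
	int nr_cpu_ids = 96, rcu_fanout_leaf = 16;
	int rcu_capacity[RCU_NUM_LVLS], num_rcu_lvl[RCU_NUM_LVLS];
	int i, rcu_num_lvls, rcu_num_nodes = 0;

	/* Capacity of a tree with i + 1 levels. */
	rcu_capacity[0] = rcu_fanout_leaf;
	for (i = 1; i < RCU_NUM_LVLS; i++)
		rcu_capacity[i] = rcu_capacity[i - 1] * RCU_FANOUT;

	/* Smallest number of levels that covers nr_cpu_ids. */
	for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) {
	}
	rcu_num_lvls = i + 1;			/* 2 levels for 96 CPUs */

	/* Number of rcu_node structures at each level, root first. */
	for (i = 0; i < rcu_num_lvls; i++) {
		int cap = rcu_capacity[(rcu_num_lvls - 1) - i];

		num_rcu_lvl[i] = DIV_ROUND_UP(nr_cpu_ids, cap);
		rcu_num_nodes += num_rcu_lvl[i];
	}

	/* Prints: levels=2 num_rcu_lvl[0]=1 num_rcu_lvl[1]=6 nodes=7 */
	printf("levels=%d", rcu_num_lvls);
	for (i = 0; i < rcu_num_lvls; i++)
		printf(" num_rcu_lvl[%d]=%d", i, num_rcu_lvl[i]);
	printf(" nodes=%d\n", rcu_num_nodes);
	return 0;
}
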
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 4adb7ca0bf47..0412030ca882 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -27,6 +27,7 @@
 #include <linux/threads.h>
 #include <linux/cpumask.h>
 #include <linux/seqlock.h>
+#include <linux/stop_machine.h>
 
 /*
  * Define shape of hierarchy based on NR_CPUS, CONFIG_RCU_FANOUT, and
@@ -36,8 +37,6 @@
  * Of course, your mileage may vary.
  */
 
-#define MAX_RCU_LVLS 4
-
 #ifdef CONFIG_RCU_FANOUT
 #define RCU_FANOUT CONFIG_RCU_FANOUT
 #else /* #ifdef CONFIG_RCU_FANOUT */
@@ -66,38 +65,53 @@
 #if NR_CPUS <= RCU_FANOUT_1
 # define RCU_NUM_LVLS	      1
 # define NUM_RCU_LVL_0	      1
-# define NUM_RCU_LVL_1	      (NR_CPUS)
-# define NUM_RCU_LVL_2	      0
-# define NUM_RCU_LVL_3	      0
-# define NUM_RCU_LVL_4	      0
+# define NUM_RCU_NODES	      NUM_RCU_LVL_0
+# define NUM_RCU_LVL_INIT    { NUM_RCU_LVL_0 }
+# define RCU_NODE_NAME_INIT  { "rcu_node_0" }
+# define RCU_FQS_NAME_INIT   { "rcu_node_fqs_0" }
+# define RCU_EXP_NAME_INIT   { "rcu_node_exp_0" }
+# define RCU_EXP_SCHED_NAME_INIT \
+			     { "rcu_node_exp_sched_0" }
 #elif NR_CPUS <= RCU_FANOUT_2
 # define RCU_NUM_LVLS	      2
 # define NUM_RCU_LVL_0	      1
 # define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-# define NUM_RCU_LVL_2	      (NR_CPUS)
-# define NUM_RCU_LVL_3	      0
-# define NUM_RCU_LVL_4	      0
+# define NUM_RCU_NODES	      (NUM_RCU_LVL_0 + NUM_RCU_LVL_1)
+# define NUM_RCU_LVL_INIT    { NUM_RCU_LVL_0, NUM_RCU_LVL_1 }
+# define RCU_NODE_NAME_INIT  { "rcu_node_0", "rcu_node_1" }
+# define RCU_FQS_NAME_INIT   { "rcu_node_fqs_0", "rcu_node_fqs_1" }
+# define RCU_EXP_NAME_INIT   { "rcu_node_exp_0", "rcu_node_exp_1" }
+# define RCU_EXP_SCHED_NAME_INIT \
+			     { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1" }
 #elif NR_CPUS <= RCU_FANOUT_3
 # define RCU_NUM_LVLS	      3
 # define NUM_RCU_LVL_0	      1
 # define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
 # define NUM_RCU_LVL_2	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-# define NUM_RCU_LVL_3	      (NR_CPUS)
-# define NUM_RCU_LVL_4	      0
+# define NUM_RCU_NODES	      (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2)
+# define NUM_RCU_LVL_INIT    { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2 }
+# define RCU_NODE_NAME_INIT  { "rcu_node_0", "rcu_node_1", "rcu_node_2" }
+# define RCU_FQS_NAME_INIT   { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2" }
+# define RCU_EXP_NAME_INIT   { "rcu_node_exp_0", "rcu_node_exp_1", "rcu_node_exp_2" }
+# define RCU_EXP_SCHED_NAME_INIT \
+			     { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1", "rcu_node_exp_sched_2" }
 #elif NR_CPUS <= RCU_FANOUT_4
 # define RCU_NUM_LVLS	      4
 # define NUM_RCU_LVL_0	      1
 # define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_3)
 # define NUM_RCU_LVL_2	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
 # define NUM_RCU_LVL_3	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-# define NUM_RCU_LVL_4	      (NR_CPUS)
+# define NUM_RCU_NODES	      (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3)
+# define NUM_RCU_LVL_INIT    { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2, NUM_RCU_LVL_3 }
+# define RCU_NODE_NAME_INIT  { "rcu_node_0", "rcu_node_1", "rcu_node_2", "rcu_node_3" }
+# define RCU_FQS_NAME_INIT   { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2", "rcu_node_fqs_3" }
+# define RCU_EXP_NAME_INIT   { "rcu_node_exp_0", "rcu_node_exp_1", "rcu_node_exp_2", "rcu_node_exp_3" }
+# define RCU_EXP_SCHED_NAME_INIT \
+			     { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1", "rcu_node_exp_sched_2", "rcu_node_exp_sched_3" }
 #else
 # error "CONFIG_RCU_FANOUT insufficient for NR_CPUS"
 #endif /* #if (NR_CPUS) <= RCU_FANOUT_1 */
 
-#define RCU_SUM (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3 + NUM_RCU_LVL_4)
-#define NUM_RCU_NODES (RCU_SUM - NR_CPUS)
-
 extern int rcu_num_lvls;
 extern int rcu_num_nodes;
 
@@ -236,6 +250,8 @@ struct rcu_node {
 	int need_future_gp[2];
 				/* Counts of upcoming no-CB GP requests. */
 	raw_spinlock_t fqslock ____cacheline_internodealigned_in_smp;
+
+	struct mutex exp_funnel_mutex ____cacheline_internodealigned_in_smp;
 } ____cacheline_internodealigned_in_smp;
 
 /*
@@ -287,12 +303,13 @@ struct rcu_data {
 	bool gpwrap;			/* Possible gpnum/completed wrap. */
 	struct rcu_node *mynode;	/* This CPU's leaf of hierarchy */
 	unsigned long grpmask;		/* Mask to apply to leaf qsmask. */
-#ifdef CONFIG_RCU_CPU_STALL_INFO
 	unsigned long	ticks_this_gp;	/* The number of scheduling-clock */
 					/*  ticks this CPU has handled */
 					/*  during and after the last grace */
 					/* period it is aware of. */
-#endif /* #ifdef CONFIG_RCU_CPU_STALL_INFO */
+	struct cpu_stop_work exp_stop_work;
+					/* Expedited grace-period control */
+					/*  for CPU stopping. */
 
 	/* 2) batch handling */
 	/*
@@ -355,11 +372,13 @@ struct rcu_data {
 	unsigned long n_rp_nocb_defer_wakeup;
 	unsigned long n_rp_need_nothing;
 
-	/* 6) _rcu_barrier() and OOM callbacks. */
+	/* 6) _rcu_barrier(), OOM callbacks, and expediting. */
 	struct rcu_head barrier_head;
 #ifdef CONFIG_RCU_FAST_NO_HZ
 	struct rcu_head oom_head;
 #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
+	struct mutex exp_funnel_mutex;
+	bool exp_done;			/* Expedited QS for this CPU? */
 
 	/* 7) Callback offloading. */
 #ifdef CONFIG_RCU_NOCB_CPU
@@ -387,9 +406,7 @@ struct rcu_data {
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
 
 	/* 8) RCU CPU stall data. */
-#ifdef CONFIG_RCU_CPU_STALL_INFO
 	unsigned int softirq_snap;	/* Snapshot of softirq activity. */
-#endif /* #ifdef CONFIG_RCU_CPU_STALL_INFO */
 
 	int cpu;
 	struct rcu_state *rsp;
@@ -442,9 +459,9 @@ do {									\
  */
 struct rcu_state {
 	struct rcu_node node[NUM_RCU_NODES];	/* Hierarchy. */
-	struct rcu_node *level[RCU_NUM_LVLS];	/* Hierarchy levels. */
-	u32 levelcnt[MAX_RCU_LVLS + 1];		/* # nodes in each level. */
-	u8 levelspread[RCU_NUM_LVLS];		/* kids/node in each level. */
+	struct rcu_node *level[RCU_NUM_LVLS + 1];
+						/* Hierarchy levels (+1 to */
+						/*  shut bogus gcc warning) */
 	u8 flavor_mask;				/* bit in flavor mask. */
 	struct rcu_data __percpu *rda;		/* pointer of percu rcu_data. */
 	void (*call)(struct rcu_head *head,	/* call_rcu() flavor. */
@@ -479,21 +496,18 @@ struct rcu_state {
 	struct mutex barrier_mutex;		/* Guards barrier fields. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	struct completion barrier_completion;	/* Wake at barrier end. */
-	unsigned long n_barrier_done;		/* ++ at start and end of */
+	unsigned long barrier_sequence;		/* ++ at start and end of */
 						/*  _rcu_barrier(). */
 	/* End of fields guarded by barrier_mutex. */
 
-	atomic_long_t expedited_start;		/* Starting ticket. */
-	atomic_long_t expedited_done;		/* Done ticket. */
-	atomic_long_t expedited_wrap;		/* # near-wrap incidents. */
-	atomic_long_t expedited_tryfail;	/* # acquisition failures. */
+	unsigned long expedited_sequence;	/* Take a ticket. */
+	atomic_long_t expedited_workdone0;	/* # done by others #0. */
 	atomic_long_t expedited_workdone1;	/* # done by others #1. */
 	atomic_long_t expedited_workdone2;	/* # done by others #2. */
+	atomic_long_t expedited_workdone3;	/* # done by others #3. */
 	atomic_long_t expedited_normal;		/* # fallbacks to normal. */
-	atomic_long_t expedited_stoppedcpus;	/* # successful stop_cpus. */
-	atomic_long_t expedited_done_tries;	/* # tries to update _done. */
-	atomic_long_t expedited_done_lost;	/* # times beaten to _done. */
-	atomic_long_t expedited_done_exit;	/* # times exited _done loop. */
+	atomic_t expedited_need_qs;		/* # CPUs left to check in. */
+	wait_queue_head_t expedited_wq;		/* Wait for check-ins. */
 
 	unsigned long jiffies_force_qs;		/* Time at which to invoke */
 						/*  force_quiescent_state(). */
@@ -527,7 +541,11 @@ struct rcu_state {
 /* Values for rcu_state structure's gp_flags field. */
 #define RCU_GP_WAIT_INIT 0	/* Initial state. */
 #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
-#define RCU_GP_WAIT_FQS  2	/* Wait for force-quiescent-state time. */
+#define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
+#define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
+#define RCU_GP_DOING_FQS 4	/* Wait done for force-quiescent-state time. */
+#define RCU_GP_CLEANUP   5	/* Grace-period cleanup started. */
+#define RCU_GP_CLEANED   6	/* Grace-period cleanup complete. */
 
 extern struct list_head rcu_struct_flavors;
 
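
The exp_funnel_mutex fields added above implement funnel locking: a requester first tries the root rcu_node's mutex, and on contention climbs from its leaf toward the root, holding at most one mutex at a time and bailing out as soon as a concurrent requester's grace period satisfies its snapshot. Below is a simplified user-space pthread sketch of that idea; the two-node tree, gp_done(), and do_expedited_gp() are hypothetical stand-ins, and the per-CPU rcu_data level and mutex_is_locked() shortcut of the real exp_funnel_lock() are omitted.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct node {
	pthread_mutex_t exp_funnel_mutex;
	struct node *parent;			/* NULL at the root */
};

static struct node root = { PTHREAD_MUTEX_INITIALIZER, NULL };
static struct node leaf = { PTHREAD_MUTEX_INITIALIZER, &root };

static unsigned long expedited_sequence;	/* even: idle, odd: GP in flight */

static bool gp_done(unsigned long snap)		/* rcu_seq_done()-style check */
{
	return expedited_sequence >= snap;
}

static void do_expedited_gp(void)		/* stand-in for the real machinery */
{
	expedited_sequence += 2;
}

/* Returns true if we ran the grace period, false if someone else's sufficed. */
static bool expedited_gp(struct node *my_leaf, unsigned long snap)
{
	struct node *np, *held = NULL;

	if (pthread_mutex_trylock(&root.exp_funnel_mutex) == 0) {
		held = &root;			/* fast path: uncontended root */
	} else {
		for (np = my_leaf; np; np = np->parent) {
			if (held && gp_done(snap)) {
				pthread_mutex_unlock(&held->exp_funnel_mutex);
				return false;	/* our work was done for us */
			}
			pthread_mutex_lock(&np->exp_funnel_mutex);
			if (held)
				pthread_mutex_unlock(&held->exp_funnel_mutex);
			held = np;		/* hold at most one level at a time */
		}
	}
	if (gp_done(snap)) {
		pthread_mutex_unlock(&held->exp_funnel_mutex);
		return false;
	}
	do_expedited_gp();			/* still holding the root's mutex */
	pthread_mutex_unlock(&held->exp_funnel_mutex);
	return true;
}

int main(void)
{
	unsigned long snap = (expedited_sequence + 3) & ~0x1UL;

	printf("ran the grace period ourselves: %d\n", expedited_gp(&leaf, snap));
	return 0;
}
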
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 80a7c17907fe..b2bf3963a0ae 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -82,10 +82,8 @@ static void __init rcu_bootup_announce_oddness(void)
 		pr_info("\tRCU lockdep checking is enabled.\n");
 	if (IS_ENABLED(CONFIG_RCU_TORTURE_TEST_RUNNABLE))
 		pr_info("\tRCU torture testing starts during boot.\n");
-	if (IS_ENABLED(CONFIG_RCU_CPU_STALL_INFO))
-		pr_info("\tAdditional per-CPU info printed with stalls.\n");
-	if (NUM_RCU_LVL_4 != 0)
-		pr_info("\tFour-level hierarchy is enabled.\n");
+	if (RCU_NUM_LVLS >= 4)
+		pr_info("\tFour(or more)-level hierarchy is enabled.\n");
 	if (RCU_FANOUT_LEAF != 16)
 		pr_info("\tBuild-time adjustment of leaf fanout to %d.\n",
 			RCU_FANOUT_LEAF);
@@ -418,8 +416,6 @@ static void rcu_print_detail_task_stall(struct rcu_state *rsp)
 		rcu_print_detail_task_stall_rnp(rnp);
 }
 
-#ifdef CONFIG_RCU_CPU_STALL_INFO
-
 static void rcu_print_task_stall_begin(struct rcu_node *rnp)
 {
 	pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
@@ -431,18 +427,6 @@ static void rcu_print_task_stall_end(void)
 	pr_cont("\n");
 }
 
-#else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */
-
-static void rcu_print_task_stall_begin(struct rcu_node *rnp)
-{
-}
-
-static void rcu_print_task_stall_end(void)
-{
-}
-
-#endif /* #else #ifdef CONFIG_RCU_CPU_STALL_INFO */
-
 /*
  * Scan the current list of tasks blocked within RCU read-side critical
  * sections, printing out the tid of each.
@@ -552,8 +536,6 @@ void synchronize_rcu(void)
552EXPORT_SYMBOL_GPL(synchronize_rcu); 536EXPORT_SYMBOL_GPL(synchronize_rcu);
553 537
554static DECLARE_WAIT_QUEUE_HEAD(sync_rcu_preempt_exp_wq); 538static DECLARE_WAIT_QUEUE_HEAD(sync_rcu_preempt_exp_wq);
555static unsigned long sync_rcu_preempt_exp_count;
556static DEFINE_MUTEX(sync_rcu_preempt_exp_mutex);
557 539
558/* 540/*
559 * Return non-zero if there are any tasks in RCU read-side critical 541 * Return non-zero if there are any tasks in RCU read-side critical
@@ -573,7 +555,7 @@ static int rcu_preempted_readers_exp(struct rcu_node *rnp)
573 * for the current expedited grace period. Works only for preemptible 555 * for the current expedited grace period. Works only for preemptible
574 * RCU -- other RCU implementation use other means. 556 * RCU -- other RCU implementation use other means.
575 * 557 *
576 * Caller must hold sync_rcu_preempt_exp_mutex. 558 * Caller must hold the root rcu_node's exp_funnel_mutex.
577 */ 559 */
578static int sync_rcu_preempt_exp_done(struct rcu_node *rnp) 560static int sync_rcu_preempt_exp_done(struct rcu_node *rnp)
579{ 561{
@@ -589,7 +571,7 @@ static int sync_rcu_preempt_exp_done(struct rcu_node *rnp)
589 * recursively up the tree. (Calm down, calm down, we do the recursion 571 * recursively up the tree. (Calm down, calm down, we do the recursion
590 * iteratively!) 572 * iteratively!)
591 * 573 *
592 * Caller must hold sync_rcu_preempt_exp_mutex. 574 * Caller must hold the root rcu_node's exp_funnel_mutex.
593 */ 575 */
594static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, 576static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp,
595 bool wake) 577 bool wake)
@@ -628,7 +610,7 @@ static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp,
628 * set the ->expmask bits on the leaf rcu_node structures to tell phase 2 610 * set the ->expmask bits on the leaf rcu_node structures to tell phase 2
629 * that work is needed here. 611 * that work is needed here.
630 * 612 *
631 * Caller must hold sync_rcu_preempt_exp_mutex. 613 * Caller must hold the root rcu_node's exp_funnel_mutex.
632 */ 614 */
633static void 615static void
634sync_rcu_preempt_exp_init1(struct rcu_state *rsp, struct rcu_node *rnp) 616sync_rcu_preempt_exp_init1(struct rcu_state *rsp, struct rcu_node *rnp)
@@ -671,7 +653,7 @@ sync_rcu_preempt_exp_init1(struct rcu_state *rsp, struct rcu_node *rnp)
671 * invoke rcu_report_exp_rnp() to clear out the upper-level ->expmask bits, 653 * invoke rcu_report_exp_rnp() to clear out the upper-level ->expmask bits,
672 * enabling rcu_read_unlock_special() to do the bit-clearing. 654 * enabling rcu_read_unlock_special() to do the bit-clearing.
673 * 655 *
674 * Caller must hold sync_rcu_preempt_exp_mutex. 656 * Caller must hold the root rcu_node's exp_funnel_mutex.
675 */ 657 */
676static void 658static void
677sync_rcu_preempt_exp_init2(struct rcu_state *rsp, struct rcu_node *rnp) 659sync_rcu_preempt_exp_init2(struct rcu_state *rsp, struct rcu_node *rnp)
@@ -719,51 +701,17 @@ sync_rcu_preempt_exp_init2(struct rcu_state *rsp, struct rcu_node *rnp)
719void synchronize_rcu_expedited(void) 701void synchronize_rcu_expedited(void)
720{ 702{
721 struct rcu_node *rnp; 703 struct rcu_node *rnp;
704 struct rcu_node *rnp_unlock;
722 struct rcu_state *rsp = rcu_state_p; 705 struct rcu_state *rsp = rcu_state_p;
723 unsigned long snap; 706 unsigned long s;
724 int trycount = 0;
725 707
726 smp_mb(); /* Caller's modifications seen first by other CPUs. */ 708 s = rcu_exp_gp_seq_snap(rsp);
727 snap = READ_ONCE(sync_rcu_preempt_exp_count) + 1;
728 smp_mb(); /* Above access cannot bleed into critical section. */
729 709
730 /* 710 rnp_unlock = exp_funnel_lock(rsp, s);
731 * Block CPU-hotplug operations. This means that any CPU-hotplug 711 if (rnp_unlock == NULL)
732 * operation that finds an rcu_node structure with tasks in the 712 return; /* Someone else did our work for us. */
733 * process of being boosted will know that all tasks blocking
734 * this expedited grace period will already be in the process of
735 * being boosted. This simplifies the process of moving tasks
736 * from leaf to root rcu_node structures.
737 */
738 if (!try_get_online_cpus()) {
739 /* CPU-hotplug operation in flight, fall back to normal GP. */
740 wait_rcu_gp(call_rcu);
741 return;
742 }
743 713
744 /* 714 rcu_exp_gp_seq_start(rsp);
745 * Acquire lock, falling back to synchronize_rcu() if too many
746 * lock-acquisition failures. Of course, if someone does the
747 * expedited grace period for us, just leave.
748 */
749 while (!mutex_trylock(&sync_rcu_preempt_exp_mutex)) {
750 if (ULONG_CMP_LT(snap,
751 READ_ONCE(sync_rcu_preempt_exp_count))) {
752 put_online_cpus();
753 goto mb_ret; /* Others did our work for us. */
754 }
755 if (trycount++ < 10) {
756 udelay(trycount * num_online_cpus());
757 } else {
758 put_online_cpus();
759 wait_rcu_gp(call_rcu);
760 return;
761 }
762 }
763 if (ULONG_CMP_LT(snap, READ_ONCE(sync_rcu_preempt_exp_count))) {
764 put_online_cpus();
765 goto unlock_mb_ret; /* Others did our work for us. */
766 }
767 715
768 /* force all RCU readers onto ->blkd_tasks lists. */ 716 /* force all RCU readers onto ->blkd_tasks lists. */
769 synchronize_sched_expedited(); 717 synchronize_sched_expedited();
@@ -779,20 +727,14 @@ void synchronize_rcu_expedited(void)
779 rcu_for_each_leaf_node(rsp, rnp) 727 rcu_for_each_leaf_node(rsp, rnp)
780 sync_rcu_preempt_exp_init2(rsp, rnp); 728 sync_rcu_preempt_exp_init2(rsp, rnp);
781 729
782 put_online_cpus();
783
784 /* Wait for snapshotted ->blkd_tasks lists to drain. */ 730 /* Wait for snapshotted ->blkd_tasks lists to drain. */
785 rnp = rcu_get_root(rsp); 731 rnp = rcu_get_root(rsp);
786 wait_event(sync_rcu_preempt_exp_wq, 732 wait_event(sync_rcu_preempt_exp_wq,
787 sync_rcu_preempt_exp_done(rnp)); 733 sync_rcu_preempt_exp_done(rnp));
788 734
789 /* Clean up and exit. */ 735 /* Clean up and exit. */
790 smp_mb(); /* ensure expedited GP seen before counter increment. */ 736 rcu_exp_gp_seq_end(rsp);
791 WRITE_ONCE(sync_rcu_preempt_exp_count, sync_rcu_preempt_exp_count + 1); 737 mutex_unlock(&rnp_unlock->exp_funnel_mutex);
792unlock_mb_ret:
793 mutex_unlock(&sync_rcu_preempt_exp_mutex);
794mb_ret:
795 smp_mb(); /* ensure subsequent action seen after grace period. */
796} 738}
797EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); 739EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
798 740
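The rewritten synchronize_rcu_expedited() above snapshots expedited_sequence, funnels through exp_funnel_lock() (which returns NULL once the snapshot has already been satisfied, meaning someone else's expedited grace period covers the caller), and only then bumps the sequence around the actual work. The following is a hedged, self-contained user-space sketch of that check-under-the-lock pattern, with a single pthread mutex standing in for the per-rcu_node exp_funnel_mutex funnel; every name here is illustrative, not the kernel's:

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t exp_mutex = PTHREAD_MUTEX_INITIALIZER;
    static unsigned long exp_seq;   /* even: idle, odd: period in progress */

    static unsigned long exp_seq_snap(void)  { return (exp_seq + 3) & ~0x1UL; }
    static int exp_seq_done(unsigned long s) { return (long)(exp_seq - s) >= 0; }

    static void do_expedited_grace_period(void)
    {
        /* Stand-in for forcing quiescent states on all CPUs. */
        printf("expedited grace period #%lu\n", exp_seq / 2 + 1);
    }

    static void synchronize_expedited_sketch(void)
    {
        unsigned long s = exp_seq_snap();   /* take a ticket */

        /*
         * The real funnel re-checks at every rcu_node level on the way
         * to the root; this sketch collapses that into one mutex.
         */
        pthread_mutex_lock(&exp_mutex);
        if (exp_seq_done(s)) {              /* someone else did our work */
            pthread_mutex_unlock(&exp_mutex);
            return;
        }
        exp_seq++;                          /* cf. rcu_exp_gp_seq_start() */
        do_expedited_grace_period();
        exp_seq++;                          /* cf. rcu_exp_gp_seq_end()   */
        pthread_mutex_unlock(&exp_mutex);
    }

    int main(void)
    {
        synchronize_expedited_sketch();
        synchronize_expedited_sketch();
        return 0;
    }

The point of the pattern is the re-check after the mutex is acquired: a caller that slept on the lock while another expedited grace period started and completed simply returns, which is the case the expedited_workdone* counters in the tree.h hunk are tallying.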
@@ -1703,8 +1645,6 @@ early_initcall(rcu_register_oom_notifier);
1703 1645
1704#endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ 1646#endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
1705 1647
1706#ifdef CONFIG_RCU_CPU_STALL_INFO
1707
1708#ifdef CONFIG_RCU_FAST_NO_HZ 1648#ifdef CONFIG_RCU_FAST_NO_HZ
1709 1649
1710static void print_cpu_stall_fast_no_hz(char *cp, int cpu) 1650static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
@@ -1793,33 +1733,6 @@ static void increment_cpu_stall_ticks(void)
1793 raw_cpu_inc(rsp->rda->ticks_this_gp); 1733 raw_cpu_inc(rsp->rda->ticks_this_gp);
1794} 1734}
1795 1735
1796#else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */
1797
1798static void print_cpu_stall_info_begin(void)
1799{
1800 pr_cont(" {");
1801}
1802
1803static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
1804{
1805 pr_cont(" %d", cpu);
1806}
1807
1808static void print_cpu_stall_info_end(void)
1809{
1810 pr_cont("} ");
1811}
1812
1813static void zero_cpu_stall_ticks(struct rcu_data *rdp)
1814{
1815}
1816
1817static void increment_cpu_stall_ticks(void)
1818{
1819}
1820
1821#endif /* #else #ifdef CONFIG_RCU_CPU_STALL_INFO */
1822
1823#ifdef CONFIG_RCU_NOCB_CPU 1736#ifdef CONFIG_RCU_NOCB_CPU
1824 1737
1825/* 1738/*
diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c
index 3ea7ffc7d5c4..6fc4c5ff3bb5 100644
--- a/kernel/rcu/tree_trace.c
+++ b/kernel/rcu/tree_trace.c
@@ -81,9 +81,9 @@ static void r_stop(struct seq_file *m, void *v)
81static int show_rcubarrier(struct seq_file *m, void *v) 81static int show_rcubarrier(struct seq_file *m, void *v)
82{ 82{
83 struct rcu_state *rsp = (struct rcu_state *)m->private; 83 struct rcu_state *rsp = (struct rcu_state *)m->private;
84 seq_printf(m, "bcc: %d nbd: %lu\n", 84 seq_printf(m, "bcc: %d bseq: %lu\n",
85 atomic_read(&rsp->barrier_cpu_count), 85 atomic_read(&rsp->barrier_cpu_count),
86 rsp->n_barrier_done); 86 rsp->barrier_sequence);
87 return 0; 87 return 0;
88} 88}
89 89
@@ -185,18 +185,15 @@ static int show_rcuexp(struct seq_file *m, void *v)
185{ 185{
186 struct rcu_state *rsp = (struct rcu_state *)m->private; 186 struct rcu_state *rsp = (struct rcu_state *)m->private;
187 187
188 seq_printf(m, "s=%lu d=%lu w=%lu tf=%lu wd1=%lu wd2=%lu n=%lu sc=%lu dt=%lu dl=%lu dx=%lu\n", 188 seq_printf(m, "s=%lu wd0=%lu wd1=%lu wd2=%lu wd3=%lu n=%lu enq=%d sc=%lu\n",
189 atomic_long_read(&rsp->expedited_start), 189 rsp->expedited_sequence,
190 atomic_long_read(&rsp->expedited_done), 190 atomic_long_read(&rsp->expedited_workdone0),
191 atomic_long_read(&rsp->expedited_wrap),
192 atomic_long_read(&rsp->expedited_tryfail),
193 atomic_long_read(&rsp->expedited_workdone1), 191 atomic_long_read(&rsp->expedited_workdone1),
194 atomic_long_read(&rsp->expedited_workdone2), 192 atomic_long_read(&rsp->expedited_workdone2),
193 atomic_long_read(&rsp->expedited_workdone3),
195 atomic_long_read(&rsp->expedited_normal), 194 atomic_long_read(&rsp->expedited_normal),
196 atomic_long_read(&rsp->expedited_stoppedcpus), 195 atomic_read(&rsp->expedited_need_qs),
197 atomic_long_read(&rsp->expedited_done_tries), 196 rsp->expedited_sequence / 2);
198 atomic_long_read(&rsp->expedited_done_lost),
199 atomic_long_read(&rsp->expedited_done_exit));
200 return 0; 197 return 0;
201} 198}
202 199
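With the tree_trace.c change above, the debugfs rcuexp line is built from the new counters; the trailing "sc=" field is expedited_sequence / 2, i.e. the number of completed expedited grace periods, since the sequence is bumped once at the start and once at the end of each one. The stand-alone snippet below merely reproduces the new format string with made-up, purely illustrative values:

    #include <stdio.h>

    int main(void)
    {
        /* Illustrative values only; real numbers come from the debugfs file. */
        unsigned long expedited_sequence = 42;  /* 21 completed expedited GPs */

        printf("s=%lu wd0=%lu wd1=%lu wd2=%lu wd3=%lu n=%lu enq=%d sc=%lu\n",
               expedited_sequence, 0UL, 3UL, 1UL, 0UL, 2UL, 0,
               expedited_sequence / 2);
        return 0;
    }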
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 26da2caa7d15..3e0b662cae09 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1353,20 +1353,6 @@ config RCU_CPU_STALL_TIMEOUT
1353 RCU grace period persists, additional CPU stall warnings are 1353 RCU grace period persists, additional CPU stall warnings are
1354 printed at more widely spaced intervals. 1354 printed at more widely spaced intervals.
1355 1355
1356config RCU_CPU_STALL_INFO
1357 bool "Print additional diagnostics on RCU CPU stall"
1358 depends on (TREE_RCU || PREEMPT_RCU) && DEBUG_KERNEL
1359 default y
1360 help
1361 For each stalled CPU that is aware of the current RCU grace
1362 period, print out additional per-CPU diagnostic information
1363 regarding scheduling-clock ticks, idle state, and,
1364 for RCU_FAST_NO_HZ kernels, idle-entry state.
1365
1366 Say N if you are unsure.
1367
1368 Say Y if you want to enable such diagnostics.
1369
1370config RCU_TRACE 1356config RCU_TRACE
1371 bool "Enable tracing for RCU" 1357 bool "Enable tracing for RCU"
1372 depends on DEBUG_KERNEL 1358 depends on DEBUG_KERNEL
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE01 b/tools/testing/selftests/rcutorture/configs/rcu/TREE01
index 8e9137f66831..f572b873c620 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE01
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE01
@@ -13,7 +13,6 @@ CONFIG_MAXSMP=y
13CONFIG_RCU_NOCB_CPU=y 13CONFIG_RCU_NOCB_CPU=y
14CONFIG_RCU_NOCB_CPU_ZERO=y 14CONFIG_RCU_NOCB_CPU_ZERO=y
15CONFIG_DEBUG_LOCK_ALLOC=n 15CONFIG_DEBUG_LOCK_ALLOC=n
16CONFIG_RCU_CPU_STALL_INFO=n
17CONFIG_RCU_BOOST=n 16CONFIG_RCU_BOOST=n
18CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 17CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
19CONFIG_RCU_EXPERT=y 18CONFIG_RCU_EXPERT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE02 b/tools/testing/selftests/rcutorture/configs/rcu/TREE02
index aeea6a204d14..ef6a22c44dea 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE02
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE02
@@ -17,7 +17,6 @@ CONFIG_RCU_FANOUT_LEAF=3
17CONFIG_RCU_NOCB_CPU=n 17CONFIG_RCU_NOCB_CPU=n
18CONFIG_DEBUG_LOCK_ALLOC=y 18CONFIG_DEBUG_LOCK_ALLOC=y
19CONFIG_PROVE_LOCKING=n 19CONFIG_PROVE_LOCKING=n
20CONFIG_RCU_CPU_STALL_INFO=n
21CONFIG_RCU_BOOST=n 20CONFIG_RCU_BOOST=n
22CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 21CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
23CONFIG_RCU_EXPERT=y 22CONFIG_RCU_EXPERT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE02-T b/tools/testing/selftests/rcutorture/configs/rcu/TREE02-T
index 2ac9e68ea3d1..917d2517b5b5 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE02-T
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE02-T
@@ -17,6 +17,5 @@ CONFIG_RCU_FANOUT_LEAF=3
17CONFIG_RCU_NOCB_CPU=n 17CONFIG_RCU_NOCB_CPU=n
18CONFIG_DEBUG_LOCK_ALLOC=y 18CONFIG_DEBUG_LOCK_ALLOC=y
19CONFIG_PROVE_LOCKING=n 19CONFIG_PROVE_LOCKING=n
20CONFIG_RCU_CPU_STALL_INFO=n
21CONFIG_RCU_BOOST=n 20CONFIG_RCU_BOOST=n
22CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 21CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE03 b/tools/testing/selftests/rcutorture/configs/rcu/TREE03
index 72aa7d87ea99..7a17c503b382 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE03
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE03
@@ -13,7 +13,6 @@ CONFIG_RCU_FANOUT=2
13CONFIG_RCU_FANOUT_LEAF=2 13CONFIG_RCU_FANOUT_LEAF=2
14CONFIG_RCU_NOCB_CPU=n 14CONFIG_RCU_NOCB_CPU=n
15CONFIG_DEBUG_LOCK_ALLOC=n 15CONFIG_DEBUG_LOCK_ALLOC=n
16CONFIG_RCU_CPU_STALL_INFO=n
17CONFIG_RCU_BOOST=y 16CONFIG_RCU_BOOST=y
18CONFIG_RCU_KTHREAD_PRIO=2 17CONFIG_RCU_KTHREAD_PRIO=2
19CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 18CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE04 b/tools/testing/selftests/rcutorture/configs/rcu/TREE04
index 3f5112751cda..39a2c6d7d7ec 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE04
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE04
@@ -17,6 +17,5 @@ CONFIG_RCU_FANOUT=4
17CONFIG_RCU_FANOUT_LEAF=4 17CONFIG_RCU_FANOUT_LEAF=4
18CONFIG_RCU_NOCB_CPU=n 18CONFIG_RCU_NOCB_CPU=n
19CONFIG_DEBUG_LOCK_ALLOC=n 19CONFIG_DEBUG_LOCK_ALLOC=n
20CONFIG_RCU_CPU_STALL_INFO=n
21CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 20CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
22CONFIG_RCU_EXPERT=y 21CONFIG_RCU_EXPERT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE05 b/tools/testing/selftests/rcutorture/configs/rcu/TREE05
index c04dfea6fd21..1257d3227b1e 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE05
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE05
@@ -17,6 +17,5 @@ CONFIG_RCU_NOCB_CPU_NONE=y
17CONFIG_DEBUG_LOCK_ALLOC=y 17CONFIG_DEBUG_LOCK_ALLOC=y
18CONFIG_PROVE_LOCKING=y 18CONFIG_PROVE_LOCKING=y
19#CHECK#CONFIG_PROVE_RCU=y 19#CHECK#CONFIG_PROVE_RCU=y
20CONFIG_RCU_CPU_STALL_INFO=n
21CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 20CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
22CONFIG_RCU_EXPERT=y 21CONFIG_RCU_EXPERT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE06 b/tools/testing/selftests/rcutorture/configs/rcu/TREE06
index f51d2c73a68e..d3e456b74cbe 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE06
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE06
@@ -18,6 +18,5 @@ CONFIG_RCU_NOCB_CPU=n
18CONFIG_DEBUG_LOCK_ALLOC=y 18CONFIG_DEBUG_LOCK_ALLOC=y
19CONFIG_PROVE_LOCKING=y 19CONFIG_PROVE_LOCKING=y
20#CHECK#CONFIG_PROVE_RCU=y 20#CHECK#CONFIG_PROVE_RCU=y
21CONFIG_RCU_CPU_STALL_INFO=n
22CONFIG_DEBUG_OBJECTS_RCU_HEAD=y 21CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
23CONFIG_RCU_EXPERT=y 22CONFIG_RCU_EXPERT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE07 b/tools/testing/selftests/rcutorture/configs/rcu/TREE07
index f422af4ff5a3..3956b4131f72 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE07
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE07
@@ -17,6 +17,5 @@ CONFIG_RCU_FANOUT=2
17CONFIG_RCU_FANOUT_LEAF=2 17CONFIG_RCU_FANOUT_LEAF=2
18CONFIG_RCU_NOCB_CPU=n 18CONFIG_RCU_NOCB_CPU=n
19CONFIG_DEBUG_LOCK_ALLOC=n 19CONFIG_DEBUG_LOCK_ALLOC=n
20CONFIG_RCU_CPU_STALL_INFO=n
21CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 20CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
22CONFIG_RCU_EXPERT=y 21CONFIG_RCU_EXPERT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE08 b/tools/testing/selftests/rcutorture/configs/rcu/TREE08
index a24d2ca30646..bb9b0c1a23c2 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE08
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE08
@@ -19,7 +19,6 @@ CONFIG_RCU_NOCB_CPU_ALL=y
19CONFIG_DEBUG_LOCK_ALLOC=n 19CONFIG_DEBUG_LOCK_ALLOC=n
20CONFIG_PROVE_LOCKING=y 20CONFIG_PROVE_LOCKING=y
21#CHECK#CONFIG_PROVE_RCU=y 21#CHECK#CONFIG_PROVE_RCU=y
22CONFIG_RCU_CPU_STALL_INFO=n
23CONFIG_RCU_BOOST=n 22CONFIG_RCU_BOOST=n
24CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 23CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
25CONFIG_RCU_EXPERT=y 24CONFIG_RCU_EXPERT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T b/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T
index b2b8cea69dc9..2ad13f0d29cc 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T
@@ -17,6 +17,5 @@ CONFIG_RCU_FANOUT_LEAF=2
17CONFIG_RCU_NOCB_CPU=y 17CONFIG_RCU_NOCB_CPU=y
18CONFIG_RCU_NOCB_CPU_ALL=y 18CONFIG_RCU_NOCB_CPU_ALL=y
19CONFIG_DEBUG_LOCK_ALLOC=n 19CONFIG_DEBUG_LOCK_ALLOC=n
20CONFIG_RCU_CPU_STALL_INFO=n
21CONFIG_RCU_BOOST=n 20CONFIG_RCU_BOOST=n
22CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 21CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE09 b/tools/testing/selftests/rcutorture/configs/rcu/TREE09
index aa4ed08d999d..6710e749d9de 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE09
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE09
@@ -13,7 +13,6 @@ CONFIG_SUSPEND=n
13CONFIG_HIBERNATION=n 13CONFIG_HIBERNATION=n
14CONFIG_RCU_NOCB_CPU=n 14CONFIG_RCU_NOCB_CPU=n
15CONFIG_DEBUG_LOCK_ALLOC=n 15CONFIG_DEBUG_LOCK_ALLOC=n
16CONFIG_RCU_CPU_STALL_INFO=n
17CONFIG_RCU_BOOST=n 16CONFIG_RCU_BOOST=n
18CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 17CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
19#CHECK#CONFIG_RCU_EXPERT=n 18#CHECK#CONFIG_RCU_EXPERT=n
diff --git a/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt b/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt
index b24c0004fc49..657f3a035488 100644
--- a/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt
+++ b/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt
@@ -16,7 +16,6 @@ CONFIG_PROVE_LOCKING -- Do several, covering CONFIG_DEBUG_LOCK_ALLOC=y and not.
16CONFIG_PROVE_RCU -- Hardwired to CONFIG_PROVE_LOCKING. 16CONFIG_PROVE_RCU -- Hardwired to CONFIG_PROVE_LOCKING.
17CONFIG_RCU_BOOST -- one of PREEMPT_RCU. 17CONFIG_RCU_BOOST -- one of PREEMPT_RCU.
18CONFIG_RCU_KTHREAD_PRIO -- set to 2 for _BOOST testing. 18CONFIG_RCU_KTHREAD_PRIO -- set to 2 for _BOOST testing.
19CONFIG_RCU_CPU_STALL_INFO -- Now default, avoid at least twice.
20CONFIG_RCU_FANOUT -- Cover hierarchy, but overlap with others. 19CONFIG_RCU_FANOUT -- Cover hierarchy, but overlap with others.
21CONFIG_RCU_FANOUT_LEAF -- Do one non-default. 20CONFIG_RCU_FANOUT_LEAF -- Do one non-default.
22CONFIG_RCU_FAST_NO_HZ -- Do one, but not with CONFIG_RCU_NOCB_CPU_ALL. 21CONFIG_RCU_FAST_NO_HZ -- Do one, but not with CONFIG_RCU_NOCB_CPU_ALL.