author | Thomas Gleixner <tglx@linutronix.de> | 2013-02-26 12:44:33 -0500 |
---|---|---|
committer | Thomas Gleixner <tglx@linutronix.de> | 2013-02-26 16:25:17 -0500 |
commit | 46c498c2cdee5efe44f617bcd4f388179be36115 (patch) | |
tree | a1f854a24210acb01caa1f9d345297151ef9c97e /kernel | |
parent | 1a13c0b181f218bf56a1a6b8edbaf2876b22314b (diff) |
stop_machine: Mark per cpu stopper enabled early
commit 14e568e78 (stop_machine: Use smpboot threads) introduced the
following regression:
Before this commit the stopper enabled bit was set in the online
notifier.
CPU0                            CPU1
cpu_up
                                cpu online
hotplug_notifier(ONLINE)
 stopper(CPU1)->enabled = true;
...
stop_machine()
The conversion to smpboot threads moved the enablement to the wakeup
path of the parked thread. The majority of users seem to have the
following working order:
CPU0                            CPU1
cpu_up
                                cpu online
unpark_threads()
  wakeup(stopper[CPU1])
....
                                stopper thread runs
                                  stopper(CPU1)->enabled = true;
stop_machine()
But Konrad and Sander have observed:
CPU0                            CPU1
cpu_up
                                cpu online
unpark_threads()
  wakeup(stopper[CPU1])
....
stop_machine()
                                stopper thread runs
                                  stopper(CPU1)->enabled = true;
Now the stop machinery kicks CPU0 into the stop loop, where it gets
stuck forever because the queue code saw stopper(CPU1)->enabled ==
false, so CPU0 waits for CPU1 to enter stop_machine, but the CPU1
stopper work got discarded due to enabled == false.
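
The two orderings above can be replayed in a small userspace model. This is only a sketch of the reasoning in this message, not kernel code: struct stopper_model, queue_work_model(), stop_machine_model(), working_order() and failing_order() are hypothetical names, and the model assumes that work queued to a stopper with enabled == false is dropped while the caller still expects every online CPU to enter the stop loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the per-cpu stopper state. */
struct stopper_model {
	bool enabled;      /* set once the stopper is marked usable */
	int  queued_works; /* works accepted for this CPU */
};

/*
 * Queue one work item. Work queued while enabled == false is
 * discarded, as described in the commit message.
 * Returns true if the work was accepted.
 */
static bool queue_work_model(struct stopper_model *s)
{
	if (!s->enabled)
		return false;
	s->queued_works++;
	return true;
}

/*
 * Model one stop_machine() call across all online CPUs.
 * Returns the number of CPUs that would never enter the stop loop.
 * Any value above zero means the CPUs that did enter wait forever.
 */
static int stop_machine_model(struct stopper_model stoppers[], size_t nr_online)
{
	int missing = 0;

	assert(nr_online <= NR_CPUS);
	for (size_t cpu = 0; cpu < nr_online; cpu++)
		if (!queue_work_model(&stoppers[cpu]))
			missing++;
	return missing;
}

/* Ordering seen by most users: the CPU1 stopper thread runs first. */
static int working_order(void)
{
	struct stopper_model st[NR_CPUS] = { { .enabled = true }, { .enabled = false } };

	st[1].enabled = true;                   /* stopper thread runs on CPU1 */
	return stop_machine_model(st, NR_CPUS); /* then stop_machine() on CPU0 */
}

/* Ordering observed by Konrad and Sander: stop_machine() wins the race. */
static int failing_order(void)
{
	struct stopper_model st[NR_CPUS] = { { .enabled = true }, { .enabled = false } };
	int missing = stop_machine_model(st, NR_CPUS); /* CPU1 work is dropped */

	st[1].enabled = true;                   /* too late, work is already gone */
	return missing;
}
```

In this model working_order() reports no missing CPU, while failing_order() reports one, which corresponds to CPU0 spinning in the stop loop.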
Add a pre_unpark function to the smpboot thread descriptor and call it
before waking the thread.
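
Why the hook position matters can be sketched in userspace as follows. The names here (struct hotplug_thread_model, stopper_enabled_model, stopper_mark_enabled(), unpark_thread_model()) are hypothetical, and the model assumes that the unpark callback only runs later, in the context of the woken thread, so it has no effect by the time the waking CPU continues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CPUS 2

/* Hypothetical, trimmed-down stand-in for the smpboot thread descriptor. */
struct hotplug_thread_model {
	void (*pre_unpark)(unsigned int cpu); /* optional, runs on the waking CPU */
	void (*unpark)(unsigned int cpu);     /* optional, runs later in the woken thread */
};

/* Hypothetical per-cpu "stopper enabled" flags. */
static bool stopper_enabled_model[MODEL_CPUS];

/* Callback that marks the stopper of the given CPU as enabled. */
static void stopper_mark_enabled(unsigned int cpu)
{
	stopper_enabled_model[cpu] = true;
}

/*
 * Model of the unpark step on the waking CPU. Returns whether the
 * stopper is already enabled when the waker continues, that is,
 * before the woken thread has had any chance to run.
 */
static bool unpark_thread_model(const struct hotplug_thread_model *ht,
				unsigned int cpu)
{
	if (ht->pre_unpark)
		ht->pre_unpark(cpu);
	/* The real code wakes the parked thread here; ht->unpark runs later. */
	return stopper_enabled_model[cpu];
}
```

With the callback registered as unpark, the flag is still false when the waker returns. Registered as pre_unpark, it is already true, so a stop_machine() call issued right after cpu_up no longer sees a disabled stopper.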
This fixes the problem at hand, but the stop_machine code should be
more robust. The stopper->enabled flag smells fishy at best.
Thanks to Konrad for going through a loop of debug patches and
providing the information to decode this issue.
Reported-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1302261843240.22263@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Diffstat (limited to 'kernel')
-rw-r--r-- | kernel/smpboot.c | 2 |
-rw-r--r-- | kernel/stop_machine.c | 2 |
2 files changed, 3 insertions, 1 deletions
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index d4abac261779..8eaed9aa9cf0 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -209,6 +209,8 @@ static void smpboot_unpark_thread(struct smp_hotplug_thread *ht, unsigned int cp
 {
 	struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
 
+	if (ht->pre_unpark)
+		ht->pre_unpark(cpu);
 	kthread_unpark(tsk);
 }
 
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 95d178c62d5a..c09f2955ae30 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -336,7 +336,7 @@ static struct smp_hotplug_thread cpu_stop_threads = {
 	.create			= cpu_stop_create,
 	.setup			= cpu_stop_unpark,
 	.park			= cpu_stop_park,
-	.unpark			= cpu_stop_unpark,
+	.pre_unpark		= cpu_stop_unpark,
 	.selfparking		= true,
 };
 