From 8038dad7e888581266c76df15d70ca457a3c5910 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 25 Feb 2015 10:34:39 -0800 Subject: smpboot: Add common code for notification from dying CPU RCU ignores offlined CPUs, so they cannot safely run RCU read-side code. (They -can- use SRCU, but not RCU.) This means that any use of RCU during or after the call to arch_cpu_idle_dead(). Unfortunately, commit 2ed53c0d6cc99 added a complete() call, which will contain RCU read-side critical sections if there is a task waiting to be awakened. Which, as it turns out, there almost never is. In my qemu/KVM testing, the to-be-awakened task is not yet asleep more than 99.5% of the time. In current mainline, failure is even harder to reproduce, requiring a virtualized environment that delays the outgoing CPU by at least three jiffies between the time it exits its stop_machine() task at CPU_DYING time and the time it calls arch_cpu_idle_dead() from the idle loop. However, this problem really can occur, especially in virtualized environments, and therefore really does need to be fixed This suggests moving back to the polling loop, but using a much shorter wait, with gentle exponential backoff instead of the old 100-millisecond wait. Most of the time, the loop will exit without waiting at all, and almost all of the remaining uses will wait only five microseconds. If the outgoing CPU is preempted, a loop will wait one jiffy, then increase the wait by a factor of 11/10ths, rounding up. As before, there is a five-second timeout. This commit therefore provides common-code infrastructure to do the dying-to-surviving CPU handoff in a safe manner. This code also provides an indication at CPU-online of whether the CPU to be onlined previously timed out on offline. The new cpu_check_up_prepare() function returns -EBUSY if this CPU previously took more than five seconds to go offline, or -EAGAIN if it has not yet managed to go offline. The rationale for -EAGAIN is that it might still be preempted, so an additional wait might well find it correctly offlined. Architecture-specific code can decide how to handle these conditions. Systems in which CPUs take themselves completely offline might respond to an -EBUSY return as if it was a zero (success) return. Systems in which the surviving CPU must take some action might take it at this time, or might simply mark the other CPU as unusable. Note that architectures that take the easy way out and simply pass the -EBUSY and -EAGAIN upwards will change the sysfs API. Signed-off-by: Paul E. McKenney Cc: Cc: [ paulmck: Fixed state machine for architectures that don't check earlier CPU-hotplug results as suggested by James Hogan. ] --- include/linux/cpu.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'include/linux/cpu.h') diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 4260e8594bd7..4744ef915acd 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -95,6 +95,8 @@ enum { * Called on the new cpu, just before * enabling interrupts. Must not sleep, * must not fail */ +#define CPU_BROKEN 0x000C /* CPU (unsigned)v did not die properly, + * perhaps due to preemption. */ /* Used for CPU hotplug events occurring while tasks are frozen due to a suspend * operation in progress @@ -271,4 +273,14 @@ void arch_cpu_idle_enter(void); void arch_cpu_idle_exit(void); void arch_cpu_idle_dead(void); +DECLARE_PER_CPU(bool, cpu_dead_idle); + +int cpu_report_state(int cpu); +int cpu_check_up_prepare(int cpu); +void cpu_set_state_online(int cpu); +#ifdef CONFIG_HOTPLUG_CPU +bool cpu_wait_death(unsigned int cpu, int seconds); +bool cpu_report_death(void); +#endif /* #ifdef CONFIG_HOTPLUG_CPU */ + #endif /* _LINUX_CPU_H_ */ -- cgit v1.2.2 From 88428cc5c27c63a4313e213813bc39b9899224d5 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Wed, 28 Jan 2015 14:42:09 -0800 Subject: rcu: Handle outgoing CPUs on exit from idle loop This commit informs RCU of an outgoing CPU just before that CPU invokes arch_cpu_idle_dead() during its last pass through the idle loop (via a new CPU_DYING_IDLE notifier value). This change means that RCU need not deal with outgoing CPUs passing through the scheduler after informing RCU that they are no longer online. Note that removing the CPU from the rcu_node ->qsmaskinit bit masks is done at CPU_DYING_IDLE time, and orphaning callbacks is still done at CPU_DEAD time, the reason being that at CPU_DEAD time we have another CPU that can adopt them. Signed-off-by: Paul E. McKenney --- include/linux/cpu.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/linux/cpu.h') diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 4744ef915acd..d028721748d4 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -95,6 +95,8 @@ enum { * Called on the new cpu, just before * enabling interrupts. Must not sleep, * must not fail */ +#define CPU_DYING_IDLE 0x000B /* CPU (unsigned)v dying, reached + * idle loop. */ #define CPU_BROKEN 0x000C /* CPU (unsigned)v did not die properly, * perhaps due to preemption. */ -- cgit v1.2.2 From 00df35f991914db6b8bde8cf09808e19a9cffc3d Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Sun, 12 Apr 2015 08:06:55 -0700 Subject: cpu: Defer smpboot kthread unparking until CPU known to scheduler Currently, smpboot_unpark_threads() is invoked before the incoming CPU has been added to the scheduler's runqueue structures. This might potentially cause the unparked kthread to run on the wrong CPU, since the correct CPU isn't fully set up yet. That causes a sporadic, hard to debug boot crash triggering on some systems, reported by Borislav Petkov, and bisected down to: 2a442c9c6453 ("x86: Use common outgoing-CPU-notification code") This patch places smpboot_unpark_threads() in a CPU hotplug notifier with priority set so that these kthreads are unparked just after the CPU has been added to the runqueues. Reported-and-tested-by: Borislav Petkov Signed-off-by: Paul E. McKenney Cc: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar --- include/linux/cpu.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/linux/cpu.h') diff --git a/include/linux/cpu.h b/include/linux/cpu.h index d028721748d4..091badf0f6ba 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -73,6 +73,7 @@ enum { /* migration should happen before other stuff but after perf */ CPU_PRI_PERF = 20, CPU_PRI_MIGRATION = 10, + CPU_PRI_SMPBOOT = 9, /* bring up workqueues before normal notifiers and down after */ CPU_PRI_WORKQUEUE_UP = 5, CPU_PRI_WORKQUEUE_DOWN = -5, @@ -165,6 +166,7 @@ static inline void __unregister_cpu_notifier(struct notifier_block *nb) } #endif +void smpboot_thread_init(void); int cpu_up(unsigned int cpu); void notify_cpu_starting(unsigned int cpu); extern void cpu_maps_update_begin(void); -- cgit v1.2.2 From 590ee7dbd569a012df705a5204fc5f1066f52b8c Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Mon, 13 Apr 2015 10:04:59 +0200 Subject: cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well Now that we are using smpboot_thread_init() in init/main.c as well, provide it for !CONFIG_SMP as well. This addresses a !CONFIG_SMP build failure. Cc: Paul E. McKenney Cc: Borislav Petkov Cc: Andrew Morton Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar --- include/linux/cpu.h | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'include/linux/cpu.h') diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 091badf0f6ba..c0fb6b1b4712 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -214,6 +214,10 @@ static inline void cpu_notifier_register_done(void) { } +static inline void smpboot_thread_init(void) +{ +} + #endif /* CONFIG_SMP */ extern struct bus_type cpu_subsys; -- cgit v1.2.2