From 8038dad7e888581266c76df15d70ca457a3c5910 Mon Sep 17 00:00:00 2001
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Wed, 25 Feb 2015 10:34:39 -0800
Subject: smpboot: Add common code for notification from dying CPU

RCU ignores offlined CPUs, so they cannot safely run RCU read-side code.
(They -can- use SRCU, but not RCU.)  This means that any use of RCU
during or after the call to arch_cpu_idle_dead().  Unfortunately,
commit 2ed53c0d6cc99 added a complete() call, which will contain RCU
read-side critical sections if there is a task waiting to be awakened.

Which, as it turns out, there almost never is.  In my qemu/KVM testing,
the to-be-awakened task is not yet asleep more than 99.5% of the time.
In current mainline, failure is even harder to reproduce, requiring a
virtualized environment that delays the outgoing CPU by at least three
jiffies between the time it exits its stop_machine() task at CPU_DYING
time and the time it calls arch_cpu_idle_dead() from the idle loop.
However, this problem really can occur, especially in virtualized
environments, and therefore really does need to be fixed

This suggests moving back to the polling loop, but using a much shorter
wait, with gentle exponential backoff instead of the old 100-millisecond
wait.  Most of the time, the loop will exit without waiting at all,
and almost all of the remaining uses will wait only five microseconds.
If the outgoing CPU is preempted, a loop will wait one jiffy, then
increase the wait by a factor of 11/10ths, rounding up.  As before, there
is a five-second timeout.

This commit therefore provides common-code infrastructure to do the
dying-to-surviving CPU handoff in a safe manner.  This code also
provides an indication at CPU-online of whether the CPU to be onlined
previously timed out on offline.  The new cpu_check_up_prepare() function
returns -EBUSY if this CPU previously took more than five seconds to
go offline, or -EAGAIN if it has not yet managed to go offline.  The
rationale for -EAGAIN is that it might still be preempted, so an additional
wait might well find it correctly offlined.  Architecture-specific code
can decide how to handle these conditions.  Systems in which CPUs take
themselves completely offline might respond to an -EBUSY return as if
it was a zero (success) return.  Systems in which the surviving CPU must
take some action might take it at this time, or might simply mark the
other CPU as unusable.

Note that architectures that take the easy way out and simply pass the
-EBUSY and -EAGAIN upwards will change the sysfs API.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <linux-api@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
[ paulmck: Fixed state machine for architectures that don't check earlier
  CPU-hotplug results as suggested by James Hogan. ]
---
 include/linux/cpu.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

(limited to 'include/linux/cpu.h')

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 4260e8594bd7..4744ef915acd 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -95,6 +95,8 @@ enum {
 					* Called on the new cpu, just before
 					* enabling interrupts. Must not sleep,
 					* must not fail */
+#define CPU_BROKEN		0x000C /* CPU (unsigned)v did not die properly,
+					* perhaps due to preemption. */
 
 /* Used for CPU hotplug events occurring while tasks are frozen due to a suspend
  * operation in progress
@@ -271,4 +273,14 @@ void arch_cpu_idle_enter(void);
 void arch_cpu_idle_exit(void);
 void arch_cpu_idle_dead(void);
 
+DECLARE_PER_CPU(bool, cpu_dead_idle);
+
+int cpu_report_state(int cpu);
+int cpu_check_up_prepare(int cpu);
+void cpu_set_state_online(int cpu);
+#ifdef CONFIG_HOTPLUG_CPU
+bool cpu_wait_death(unsigned int cpu, int seconds);
+bool cpu_report_death(void);
+#endif /* #ifdef CONFIG_HOTPLUG_CPU */
+
 #endif /* _LINUX_CPU_H_ */
-- 
cgit v1.2.2


From 88428cc5c27c63a4313e213813bc39b9899224d5 Mon Sep 17 00:00:00 2001
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Wed, 28 Jan 2015 14:42:09 -0800
Subject: rcu: Handle outgoing CPUs on exit from idle loop

This commit informs RCU of an outgoing CPU just before that CPU invokes
arch_cpu_idle_dead() during its last pass through the idle loop (via a
new CPU_DYING_IDLE notifier value).  This change means that RCU need not
deal with outgoing CPUs passing through the scheduler after informing
RCU that they are no longer online.  Note that removing the CPU from
the rcu_node ->qsmaskinit bit masks is done at CPU_DYING_IDLE time,
and orphaning callbacks is still done at CPU_DEAD time, the reason being
that at CPU_DEAD time we have another CPU that can adopt them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/cpu.h | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'include/linux/cpu.h')

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 4744ef915acd..d028721748d4 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -95,6 +95,8 @@ enum {
 					* Called on the new cpu, just before
 					* enabling interrupts. Must not sleep,
 					* must not fail */
+#define CPU_DYING_IDLE		0x000B /* CPU (unsigned)v dying, reached
+					* idle loop. */
 #define CPU_BROKEN		0x000C /* CPU (unsigned)v did not die properly,
 					* perhaps due to preemption. */
 
-- 
cgit v1.2.2


From 00df35f991914db6b8bde8cf09808e19a9cffc3d Mon Sep 17 00:00:00 2001
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Sun, 12 Apr 2015 08:06:55 -0700
Subject: cpu: Defer smpboot kthread unparking until CPU known to scheduler

Currently, smpboot_unpark_threads() is invoked before the incoming CPU
has been added to the scheduler's runqueue structures.  This might
potentially cause the unparked kthread to run on the wrong CPU, since the
correct CPU isn't fully set up yet.

That causes a sporadic, hard to debug boot crash triggering on some
systems, reported by Borislav Petkov, and bisected down to:

  2a442c9c6453 ("x86: Use common outgoing-CPU-notification code")

This patch places smpboot_unpark_threads() in a CPU hotplug
notifier with priority set so that these kthreads are unparked just after
the CPU has been added to the runqueues.

Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/cpu.h | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'include/linux/cpu.h')

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index d028721748d4..091badf0f6ba 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -73,6 +73,7 @@ enum {
 	/* migration should happen before other stuff but after perf */
 	CPU_PRI_PERF		= 20,
 	CPU_PRI_MIGRATION	= 10,
+	CPU_PRI_SMPBOOT		= 9,
 	/* bring up workqueues before normal notifiers and down after */
 	CPU_PRI_WORKQUEUE_UP	= 5,
 	CPU_PRI_WORKQUEUE_DOWN	= -5,
@@ -165,6 +166,7 @@ static inline void __unregister_cpu_notifier(struct notifier_block *nb)
 }
 #endif
 
+void smpboot_thread_init(void);
 int cpu_up(unsigned int cpu);
 void notify_cpu_starting(unsigned int cpu);
 extern void cpu_maps_update_begin(void);
-- 
cgit v1.2.2


From 590ee7dbd569a012df705a5204fc5f1066f52b8c Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Mon, 13 Apr 2015 10:04:59 +0200
Subject: cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well

Now that we are using smpboot_thread_init() in init/main.c as well,
provide it for !CONFIG_SMP as well.

This addresses a !CONFIG_SMP build failure.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/cpu.h | 4 ++++
 1 file changed, 4 insertions(+)

(limited to 'include/linux/cpu.h')

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 091badf0f6ba..c0fb6b1b4712 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -214,6 +214,10 @@ static inline void cpu_notifier_register_done(void)
 {
 }
 
+static inline void smpboot_thread_init(void)
+{
+}
+
 #endif /* CONFIG_SMP */
 extern struct bus_type cpu_subsys;
 
-- 
cgit v1.2.2