author		Linus Torvalds <torvalds@linux-foundation.org>	2016-03-15 16:50:29 -0400
committer	Linus Torvalds <torvalds@linux-foundation.org>	2016-03-15 16:50:29 -0400
commit		710d60cbf1b312a8075a2158cbfbbd9c66132dcc (patch)
tree		d46a9f1a14165807701f1868398a6dc76af85968
parent		df2e37c814d51692803245fcbecca360d4882e96 (diff)
parent		d10ef6f9380b8853c4b48eb104268fccfdc0b0c5 (diff)
Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu hotplug updates from Thomas Gleixner:
 "This is the first part of the ongoing cpu hotplug rework:

   - Initial implementation of the state machine

   - Runs all online and prepare down callbacks on the plugged cpu and
     not on some random processor

   - Replaces busy loop waiting with completions

   - Adds tracepoints so the states can be followed"

More detailed commentary on this work from an earlier email:
 "What's wrong with the current cpu hotplug infrastructure?

   - Asymmetry

     The hotplug notifier mechanism is asymmetric versus the bringup
     and teardown.  This is mostly caused by the notifier mechanism.

   - Largely undocumented dependencies

     While some notifiers use explicitly defined notifier priorities,
     we have quite some notifiers which use numerical priorities to
     express dependencies without any documentation why.

   - Control processor driven

     Most of the bringup/teardown of a cpu is driven by a control
     processor.  While it is understandable that preparatory steps,
     like idle thread creation, memory allocation for and
     initialization of essential facilities needs to be done before a
     cpu can boot, there is no reason why everything else must run on
     a control processor.  Before this patch series, bringup looks
     like this:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu
       bring the rest up

   - All or nothing approach

     There is no way to do partial bringups.  That's something which
     is really desired because we waste e.g. at boot substantial
     amount of time just busy waiting that the cpu comes to life.
     That's stupid as we could very well do preparatory steps and the
     initial IPI for other cpus and then go back and do the necessary
     low level synchronization with the freshly booted cpu.

   - Minimal debuggability

     Due to the notifier based design, it's impossible to switch
     between two stages of the bringup/teardown back and forth in
     order to test the correctness.  So in many hotplug notifiers the
     cancel mechanisms are either not existent or completely untested.

   - Notifier [un]registering is tedious

     To [un]register notifiers we need to protect against hotplug at
     every callsite.  There is no mechanism that bringup/teardown
     callbacks are issued on the online cpus, so every caller needs to
     do it itself.  That also includes error rollback.

  What's the new design?

     The base of the new design is a symmetric state machine, where
     both the control processor and the booting/dying cpu execute a
     well defined set of states.  Each state is symmetric in the end,
     except for some well defined exceptions, and the bringup/teardown
     can be stopped and reversed at almost all states.

     So the bringup of a cpu will look like this in the future:

       Control CPU                     Booting CPU

       do preparatory steps
       kick cpu into life

                                       do low level init

       sync with booting cpu           sync with control cpu
                                       bring itself up

     The synchronization step does not require the control cpu to
     wait.  That mechanism can be done asynchronously via a worker or
     some other mechanism.

     The teardown can be made very similar, so that the dying cpu
     cleans up and brings itself down.  Cleanups which need to be done
     after the cpu is gone, can be scheduled asynchronously as well.

     There is a long way to this, as we need to refactor the notion
     when a cpu is available.  Today we set the cpu online right after
     it comes out of the low level bringup, which is not really
     correct.

     The proper mechanism is to set it to available, i.e. cpu local
     threads, like softirqd, hotplug thread etc. can be scheduled on
     that cpu, and once it finished all booting steps, it's set to
     online, so general workloads can be scheduled on it.  The reverse
     happens on teardown.  First thing to do is to forbid scheduling
     of general workloads, then teardown all the per cpu resources and
     finally shut it off completely.

     This patch series implements the basic infrastructure for this at
     the core level.  This includes the following:

      - Basic state machine implementation with well defined states,
        so ordering and prioritization can be expressed.

      - Interfaces to [un]register state callbacks

        This invokes the bringup/teardown callback on all online cpus
        with the proper protection in place and [un]installs the
        callbacks in the state machine array.

        For callbacks which have no particular ordering requirement we
        have a dynamic state space, so that drivers don't have to
        register an explicit hotplug state.

        If a callback fails, the code automatically does a rollback to
        the previous state.

      - Sysfs interface to drive the state machine to a particular
        step.

        This is only partially functional today.  Full functionality
        and therefore testability will be achieved once we converted
        all existing hotplug notifiers over to the new scheme.

      - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying
        processor:

          Control CPU                     Booting CPU

          do preparatory steps
          kick cpu into life

                                          do low level init

          sync with booting cpu           sync with control cpu
          wait for boot
                                          bring itself up

                                          Signal completion to control cpu

     In a previous step of this work we've done a full tree mechanical
     conversion of all hotplug notifiers to the new scheme.  The
     balance is a net removal of about 4000 lines of code.

     This is not included in this series, as we decided to take a
     different approach.  Instead of mechanically converting
     everything over, we will do a proper overhaul of the usage sites
     one by one so they nicely fit into the symmetric callback scheme.

     I decided to do that after I looked at the ugliness of some of
     the converted sites and figured out that their hotplug mechanism
     is completely buggered anyway.  So there is no point to do a
     mechanical conversion first as we need to go through the usage
     sites one by one again in order to achieve a full symmetric and
     testable behaviour"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  cpu/hotplug: Document states better
  cpu/hotplug: Fix smpboot thread ordering
  cpu/hotplug: Remove redundant state check
  cpu/hotplug: Plug death reporting race
  rcu: Make CPU_DYING_IDLE an explicit call
  cpu/hotplug: Make wait for dead cpu completion based
  cpu/hotplug: Let upcoming cpu bring itself fully up
  arch/hotplug: Call into idle with a proper state
  cpu/hotplug: Move online calls to hotplugged cpu
  cpu/hotplug: Create hotplug threads
  cpu/hotplug: Split out the state walk into functions
  cpu/hotplug: Unpark smpboot threads from the state machine
  cpu/hotplug: Move scheduler cpu_online notifier to hotplug core
  cpu/hotplug: Implement setup/removal interface
  cpu/hotplug: Make target state writeable
  cpu/hotplug: Add sysfs state interface
  cpu/hotplug: Hand in target state to _cpu_up/down
  cpu/hotplug: Convert the hotplugged cpu work to a state machine
  cpu/hotplug: Convert to a state machine for the control processor
  cpu/hotplug: Add tracepoints
  ...
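The [un]registration interface described above lands in the new include/linux/cpuhotplug.h further down in this diff.  As a rough illustration only (not part of this series), the sketch below shows how a driver with no particular ordering requirement might use the dynamic state space.  The mydrv_* names are hypothetical, and the idea that a CPUHP_AP_ONLINE_DYN request returns the allocated state number on success is an assumption about the setup code rather than something spelled out in the hunks quoted here.

	/*
	 * Illustrative sketch only -- not part of this series.  Shows how a
	 * driver without ordering requirements might use the dynamic state
	 * space added by this merge.  All mydrv_* names are hypothetical.
	 */
	#include <linux/module.h>
	#include <linux/cpu.h>
	#include <linux/cpuhotplug.h>

	static int mydrv_cpu_online(unsigned int cpu)
	{
		/* Set up per-cpu resources; runs on the plugged cpu itself. */
		return 0;
	}

	static int mydrv_cpu_offline(unsigned int cpu)
	{
		/* Tear down per-cpu resources before @cpu goes away. */
		return 0;
	}

	static int mydrv_hp_state;

	static int __init mydrv_init(void)
	{
		int ret;

		/*
		 * Request a dynamic state.  The online callback is also
		 * invoked for the cpus which are already up; on failure the
		 * core rolls back to the previous state automatically.
		 */
		ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mydrv:online",
					mydrv_cpu_online, mydrv_cpu_offline);
		if (ret < 0)
			return ret;
		/* Assumed: a dynamic request hands back the allocated state. */
		mydrv_hp_state = ret;
		return 0;
	}

	static void __exit mydrv_exit(void)
	{
		/* Remove the callbacks and run the teardown on online cpus. */
		cpuhp_remove_state(mydrv_hp_state);
	}

	module_init(mydrv_init);
	module_exit(mydrv_exit);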
-rw-r--r--	arch/alpha/kernel/smp.c			2
-rw-r--r--	arch/arc/kernel/smp.c			2
-rw-r--r--	arch/arm/kernel/smp.c			2
-rw-r--r--	arch/arm64/kernel/smp.c			2
-rw-r--r--	arch/blackfin/mach-common/smp.c		2
-rw-r--r--	arch/hexagon/kernel/smp.c		2
-rw-r--r--	arch/ia64/kernel/smpboot.c		2
-rw-r--r--	arch/m32r/kernel/smpboot.c		2
-rw-r--r--	arch/metag/kernel/smp.c			2
-rw-r--r--	arch/mips/kernel/smp.c			2
-rw-r--r--	arch/mn10300/kernel/smp.c		2
-rw-r--r--	arch/parisc/kernel/smp.c		2
-rw-r--r--	arch/powerpc/kernel/smp.c		2
-rw-r--r--	arch/s390/kernel/smp.c			2
-rw-r--r--	arch/sh/kernel/smp.c			2
-rw-r--r--	arch/sparc/kernel/smp_32.c		2
-rw-r--r--	arch/sparc/kernel/smp_64.c		2
-rw-r--r--	arch/tile/kernel/smpboot.c		2
-rw-r--r--	arch/x86/kernel/smpboot.c		2
-rw-r--r--	arch/x86/xen/smp.c			2
-rw-r--r--	arch/xtensa/kernel/smp.c		2
-rw-r--r--	include/linux/cpu.h			27
-rw-r--r--	include/linux/cpuhotplug.h		93
-rw-r--r--	include/linux/notifier.h		2
-rw-r--r--	include/linux/rcupdate.h		4
-rw-r--r--	include/trace/events/cpuhp.h		66
-rw-r--r--	init/main.c				16
-rw-r--r--	kernel/cpu.c				1162
-rw-r--r--	kernel/rcu/tree.c			73
-rw-r--r--	kernel/sched/core.c			10
-rw-r--r--	kernel/sched/idle.c			9
-rw-r--r--	kernel/smp.c				1
-rw-r--r--	kernel/smpboot.c			6
-rw-r--r--	kernel/smpboot.h			6
-rw-r--r--	lib/Kconfig.debug			13
35 files changed, 1294 insertions, 236 deletions
diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
index 2f24447fef92..46bf263c3153 100644
--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
@@ -168,7 +168,7 @@ smp_callin(void)
 	      cpuid, current, current->active_mm));
 
 	preempt_disable();
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 /* Wait until hwrpb->txrdy is clear for cpu.  Return -1 on timeout.  */
diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
index 424e937da5c8..4cb3add77c75 100644
--- a/arch/arc/kernel/smp.c
+++ b/arch/arc/kernel/smp.c
@@ -142,7 +142,7 @@ void start_kernel_secondary(void)
 
 	local_irq_enable();
 	preempt_disable();
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 /*
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 37312f6749f3..baee70267f29 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -409,7 +409,7 @@ asmlinkage void secondary_start_kernel(void)
 	/*
 	 * OK, it's off to the idle thread for us
 	 */
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 void __init smp_cpus_done(unsigned int max_cpus)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index b1adc51b2c2e..460765799c64 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -195,7 +195,7 @@ asmlinkage void secondary_start_kernel(void)
 	/*
 	 * OK, it's off to the idle thread for us
 	 */
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/blackfin/mach-common/smp.c b/arch/blackfin/mach-common/smp.c
index 0030e21cfceb..23c4ef5f8bdc 100644
--- a/arch/blackfin/mach-common/smp.c
+++ b/arch/blackfin/mach-common/smp.c
@@ -333,7 +333,7 @@ void secondary_start_kernel(void)
 
 	/* We are done with local CPU inits, unblock the boot CPU. */
 	set_cpu_online(cpu, true);
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 void __init smp_prepare_boot_cpu(void)
diff --git a/arch/hexagon/kernel/smp.c b/arch/hexagon/kernel/smp.c
index ff759f26b96a..983bae7d2665 100644
--- a/arch/hexagon/kernel/smp.c
+++ b/arch/hexagon/kernel/smp.c
@@ -180,7 +180,7 @@ void start_secondary(void)
 
 	local_irq_enable();
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 
diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c
index 0e76fad27975..74fe317477e6 100644
--- a/arch/ia64/kernel/smpboot.c
+++ b/arch/ia64/kernel/smpboot.c
@@ -454,7 +454,7 @@ start_secondary (void *unused)
 	preempt_disable();
 	smp_callin();
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 	return 0;
 }
 
diff --git a/arch/m32r/kernel/smpboot.c b/arch/m32r/kernel/smpboot.c
index a468467542f4..f98d2f6519d6 100644
--- a/arch/m32r/kernel/smpboot.c
+++ b/arch/m32r/kernel/smpboot.c
@@ -432,7 +432,7 @@ int __init start_secondary(void *unused)
 	 */
 	local_flush_tlb_all();
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 	return 0;
 }
 
diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
index c3c6f0864881..bad13232de51 100644
--- a/arch/metag/kernel/smp.c
+++ b/arch/metag/kernel/smp.c
@@ -396,7 +396,7 @@ asmlinkage void secondary_start_kernel(void)
 	/*
 	 * OK, it's off to the idle thread for us
 	 */
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 void __init smp_cpus_done(unsigned int max_cpus)
diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index 8b687fee0cb0..37708d9af638 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -328,7 +328,7 @@ asmlinkage void start_secondary(void)
 	WARN_ON_ONCE(!irqs_disabled());
 	mp_ops->smp_finish();
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 static void stop_this_cpu(void *dummy)
diff --git a/arch/mn10300/kernel/smp.c b/arch/mn10300/kernel/smp.c
index f984193718b1..426173c4b0b9 100644
--- a/arch/mn10300/kernel/smp.c
+++ b/arch/mn10300/kernel/smp.c
@@ -675,7 +675,7 @@ int __init start_secondary(void *unused)
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 	init_clockevents();
 #endif
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 	return 0;
 }
 
diff --git a/arch/parisc/kernel/smp.c b/arch/parisc/kernel/smp.c
index 52e85973a283..c2a9cc55a62f 100644
--- a/arch/parisc/kernel/smp.c
+++ b/arch/parisc/kernel/smp.c
@@ -305,7 +305,7 @@ void __init smp_callin(void)
 
 	local_irq_enable();  /* Interrupts have been off until now */
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 
 	/* NOTREACHED */
 	panic("smp_callin() AAAAaaaaahhhh....\n");
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ec9ec2058d2d..cc13d4c83291 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -727,7 +727,7 @@ void start_secondary(void *unused)
 
 	local_irq_enable();
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 
 	BUG();
 }
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 3c65a8eae34d..40a6b4f9c36c 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -798,7 +798,7 @@ static void smp_start_secondary(void *cpuvoid)
 	set_cpu_online(smp_processor_id(), true);
 	inc_irq_stat(CPU_RST);
 	local_irq_enable();
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 /* Upping and downing of CPUs */
diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
index de6be008fc01..13f633add29a 100644
--- a/arch/sh/kernel/smp.c
+++ b/arch/sh/kernel/smp.c
@@ -203,7 +203,7 @@ asmlinkage void start_secondary(void)
 	set_cpu_online(cpu, true);
 	per_cpu(cpu_state, cpu) = CPU_ONLINE;
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 extern struct {
diff --git a/arch/sparc/kernel/smp_32.c b/arch/sparc/kernel/smp_32.c
index b3a5d81b20f0..fb30e7c6a5b1 100644
--- a/arch/sparc/kernel/smp_32.c
+++ b/arch/sparc/kernel/smp_32.c
@@ -364,7 +364,7 @@ static void sparc_start_secondary(void *arg)
 	local_irq_enable();
 
 	wmb();
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 
 	/* We should never reach here! */
 	BUG();
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index 19cd08d18672..8a6151a628ce 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -134,7 +134,7 @@ void smp_callin(void)
 
 	local_irq_enable();
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 void cpu_panic(void)
diff --git a/arch/tile/kernel/smpboot.c b/arch/tile/kernel/smpboot.c
index 20d52a98e171..6c0abaacec33 100644
--- a/arch/tile/kernel/smpboot.c
+++ b/arch/tile/kernel/smpboot.c
@@ -208,7 +208,7 @@ void online_secondary(void)
 	/* Set up tile-timer clock-event device on this cpu */
 	setup_tile_timer();
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 int __cpu_up(unsigned int cpu, struct task_struct *tidle)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3bf1e0b5f827..643dbdccf4bc 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -256,7 +256,7 @@ static void notrace start_secondary(void *unused)
 	x86_cpuinit.setup_percpu_clockev();
 
 	wmb();
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 int topology_update_package_map(unsigned int apicid, unsigned int cpu)
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 3f4ebf0261f2..3c6d17fd423a 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -112,7 +112,7 @@ asmlinkage __visible void cpu_bringup_and_idle(int cpu)
 	xen_pvh_secondary_vcpu_init(cpu);
 #endif
 	cpu_bringup();
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 static void xen_smp_intr_free(unsigned int cpu)
diff --git a/arch/xtensa/kernel/smp.c b/arch/xtensa/kernel/smp.c
index 4d02e38514f5..fc4ad21a5ed4 100644
--- a/arch/xtensa/kernel/smp.c
+++ b/arch/xtensa/kernel/smp.c
@@ -157,7 +157,7 @@ void secondary_start_kernel(void)
 
 	complete(&cpu_running);
 
-	cpu_startup_entry(CPUHP_ONLINE);
+	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
 static void mx_cpu_start(void *p)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index d2ca8c38f9c4..f9b1fab4388a 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -16,6 +16,7 @@
 #include <linux/node.h>
 #include <linux/compiler.h>
 #include <linux/cpumask.h>
+#include <linux/cpuhotplug.h>
 
 struct device;
 struct device_node;
@@ -27,6 +28,9 @@ struct cpu {
 	struct device dev;
 };
 
+extern void boot_cpu_init(void);
+extern void boot_cpu_state_init(void);
+
 extern int register_cpu(struct cpu *cpu, int num);
 extern struct device *get_cpu_device(unsigned cpu);
 extern bool cpu_is_hotpluggable(unsigned cpu);
@@ -74,7 +78,7 @@ enum {
 	/* migration should happen before other stuff but after perf */
 	CPU_PRI_PERF		= 20,
 	CPU_PRI_MIGRATION	= 10,
-	CPU_PRI_SMPBOOT		= 9,
+
 	/* bring up workqueues before normal notifiers and down after */
 	CPU_PRI_WORKQUEUE_UP	= 5,
 	CPU_PRI_WORKQUEUE_DOWN	= -5,
@@ -97,9 +101,7 @@ enum {
 					* Called on the new cpu, just before
 					* enabling interrupts. Must not sleep,
 					* must not fail */
-#define CPU_DYING_IDLE		0x000B /* CPU (unsigned)v dying, reached
-					* idle loop. */
-#define CPU_BROKEN		0x000C /* CPU (unsigned)v did not die properly,
+#define CPU_BROKEN		0x000B /* CPU (unsigned)v did not die properly,
 					* perhaps due to preemption. */
 
 /* Used for CPU hotplug events occurring while tasks are frozen due to a suspend
@@ -118,6 +120,7 @@ enum {
 
 
 #ifdef CONFIG_SMP
+extern bool cpuhp_tasks_frozen;
 /* Need to know about CPUs going up/down? */
 #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE)
 #define cpu_notifier(fn, pri) {				\
@@ -167,7 +170,6 @@ static inline void __unregister_cpu_notifier(struct notifier_block *nb)
 }
 #endif
 
-void smpboot_thread_init(void);
 int cpu_up(unsigned int cpu);
 void notify_cpu_starting(unsigned int cpu);
 extern void cpu_maps_update_begin(void);
@@ -177,6 +179,7 @@ extern void cpu_maps_update_done(void);
 #define cpu_notifier_register_done	cpu_maps_update_done
 
 #else	/* CONFIG_SMP */
+#define cpuhp_tasks_frozen	0
 
 #define cpu_notifier(fn, pri)		do { (void)(fn); } while (0)
 #define __cpu_notifier(fn, pri)		do { (void)(fn); } while (0)
@@ -215,10 +218,6 @@ static inline void cpu_notifier_register_done(void)
 {
 }
 
-static inline void smpboot_thread_init(void)
-{
-}
-
 #endif /* CONFIG_SMP */
 extern struct bus_type cpu_subsys;
 
@@ -265,11 +264,6 @@ static inline int disable_nonboot_cpus(void) { return 0; }
 static inline void enable_nonboot_cpus(void) {}
 #endif /* !CONFIG_PM_SLEEP_SMP */
 
-enum cpuhp_state {
-	CPUHP_OFFLINE,
-	CPUHP_ONLINE,
-};
-
 void cpu_startup_entry(enum cpuhp_state state);
 
 void cpu_idle_poll_ctrl(bool enable);
@@ -280,14 +274,15 @@ void arch_cpu_idle_enter(void);
 void arch_cpu_idle_exit(void);
 void arch_cpu_idle_dead(void);
 
-DECLARE_PER_CPU(bool, cpu_dead_idle);
-
 int cpu_report_state(int cpu);
 int cpu_check_up_prepare(int cpu);
 void cpu_set_state_online(int cpu);
 #ifdef CONFIG_HOTPLUG_CPU
 bool cpu_wait_death(unsigned int cpu, int seconds);
 bool cpu_report_death(void);
+void cpuhp_report_idle_dead(void);
+#else
+static inline void cpuhp_report_idle_dead(void) { }
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */
 
 #endif /* _LINUX_CPU_H_ */
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
new file mode 100644
index 000000000000..5d68e15e46b7
--- /dev/null
+++ b/include/linux/cpuhotplug.h
@@ -0,0 +1,93 @@
+#ifndef __CPUHOTPLUG_H
+#define __CPUHOTPLUG_H
+
+enum cpuhp_state {
+	CPUHP_OFFLINE,
+	CPUHP_CREATE_THREADS,
+	CPUHP_NOTIFY_PREPARE,
+	CPUHP_BRINGUP_CPU,
+	CPUHP_AP_IDLE_DEAD,
+	CPUHP_AP_OFFLINE,
+	CPUHP_AP_NOTIFY_STARTING,
+	CPUHP_AP_ONLINE,
+	CPUHP_TEARDOWN_CPU,
+	CPUHP_AP_ONLINE_IDLE,
+	CPUHP_AP_SMPBOOT_THREADS,
+	CPUHP_AP_NOTIFY_ONLINE,
+	CPUHP_AP_ONLINE_DYN,
+	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 30,
+	CPUHP_ONLINE,
+};
+
+int __cpuhp_setup_state(enum cpuhp_state state,	const char *name, bool invoke,
+			int (*startup)(unsigned int cpu),
+			int (*teardown)(unsigned int cpu));
+
+/**
+ * cpuhp_setup_state - Setup hotplug state callbacks with calling the callbacks
+ * @state:	The state for which the calls are installed
+ * @name:	Name of the callback (will be used in debug output)
+ * @startup:	startup callback function
+ * @teardown:	teardown callback function
+ *
+ * Installs the callback functions and invokes the startup callback on
+ * the present cpus which have already reached the @state.
+ */
+static inline int cpuhp_setup_state(enum cpuhp_state state,
+				    const char *name,
+				    int (*startup)(unsigned int cpu),
+				    int (*teardown)(unsigned int cpu))
+{
+	return __cpuhp_setup_state(state, name, true, startup, teardown);
+}
+
+/**
+ * cpuhp_setup_state_nocalls - Setup hotplug state callbacks without calling the
+ *			       callbacks
+ * @state:	The state for which the calls are installed
+ * @name:	Name of the callback.
+ * @startup:	startup callback function
+ * @teardown:	teardown callback function
+ *
+ * Same as @cpuhp_setup_state except that no calls are executed are invoked
+ * during installation of this callback. NOP if SMP=n or HOTPLUG_CPU=n.
+ */
+static inline int cpuhp_setup_state_nocalls(enum cpuhp_state state,
+					    const char *name,
+					    int (*startup)(unsigned int cpu),
+					    int (*teardown)(unsigned int cpu))
+{
+	return __cpuhp_setup_state(state, name, false, startup, teardown);
+}
+
+void __cpuhp_remove_state(enum cpuhp_state state, bool invoke);
+
+/**
+ * cpuhp_remove_state - Remove hotplug state callbacks and invoke the teardown
+ * @state:	The state for which the calls are removed
+ *
+ * Removes the callback functions and invokes the teardown callback on
+ * the present cpus which have already reached the @state.
+ */
+static inline void cpuhp_remove_state(enum cpuhp_state state)
+{
+	__cpuhp_remove_state(state, true);
+}
+
+/**
+ * cpuhp_remove_state_nocalls - Remove hotplug state callbacks without invoking
+ *				teardown
+ * @state:	The state for which the calls are removed
+ */
+static inline void cpuhp_remove_state_nocalls(enum cpuhp_state state)
+{
+	__cpuhp_remove_state(state, false);
+}
+
+#ifdef CONFIG_SMP
+void cpuhp_online_idle(enum cpuhp_state state);
+#else
+static inline void cpuhp_online_idle(enum cpuhp_state state) { }
+#endif
+
+#endif
diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index d14a4c362465..4149868de4e6 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -47,6 +47,8 @@
  * runtime initialization.
  */
 
+struct notifier_block;
+
 typedef	int (*notifier_fn_t)(struct notifier_block *nb,
 			unsigned long action, void *data);
 
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index b5d48bd56e3f..2657aff2725b 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -332,9 +332,7 @@ void rcu_init(void);
 void rcu_sched_qs(void);
 void rcu_bh_qs(void);
 void rcu_check_callbacks(int user);
-struct notifier_block;
-int rcu_cpu_notify(struct notifier_block *self,
-		   unsigned long action, void *hcpu);
+void rcu_report_dead(unsigned int cpu);
 
 #ifndef CONFIG_TINY_RCU
 void rcu_end_inkernel_boot(void);
diff --git a/include/trace/events/cpuhp.h b/include/trace/events/cpuhp.h
new file mode 100644
index 000000000000..a72bd93ec7e5
--- /dev/null
+++ b/include/trace/events/cpuhp.h
@@ -0,0 +1,66 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM cpuhp
+
+#if !defined(_TRACE_CPUHP_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_CPUHP_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(cpuhp_enter,
+
+	TP_PROTO(unsigned int cpu,
+		 int target,
+		 int idx,
+		 int (*fun)(unsigned int)),
+
+	TP_ARGS(cpu, target, idx, fun),
+
+	TP_STRUCT__entry(
+		__field( unsigned int,	cpu	)
+		__field( int,		target	)
+		__field( int,		idx	)
+		__field( void *,	fun	)
+	),
+
+	TP_fast_assign(
+		__entry->cpu	= cpu;
+		__entry->target	= target;
+		__entry->idx	= idx;
+		__entry->fun	= fun;
+	),
+
+	TP_printk("cpu: %04u target: %3d step: %3d (%pf)",
+		  __entry->cpu, __entry->target, __entry->idx, __entry->fun)
+);
+
+TRACE_EVENT(cpuhp_exit,
+
+	TP_PROTO(unsigned int cpu,
+		 int state,
+		 int idx,
+		 int ret),
+
+	TP_ARGS(cpu, state, idx, ret),
+
+	TP_STRUCT__entry(
+		__field( unsigned int,	cpu	)
+		__field( int,		state	)
+		__field( int,		idx	)
+		__field( int,		ret	)
+	),
+
+	TP_fast_assign(
+		__entry->cpu	= cpu;
+		__entry->state	= state;
+		__entry->idx	= idx;
+		__entry->ret	= ret;
+	),
+
+	TP_printk(" cpu: %04u  state: %3d step: %3d ret: %d",
+		  __entry->cpu, __entry->state, __entry->idx, __entry->ret)
+);
+
+#endif
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/init/main.c b/init/main.c
index 7c27de4577ed..8dc93df20f7f 100644
--- a/init/main.c
+++ b/init/main.c
@@ -385,7 +385,6 @@ static noinline void __init_refok rest_init(void)
 	int pid;
 
 	rcu_scheduler_starting();
-	smpboot_thread_init();
 	/*
 	 * We need to spawn init first so that it obtains pid 1, however
 	 * the init task will end up wanting to create kthreads, which, if
@@ -449,20 +448,6 @@ void __init parse_early_param(void)
 	done = 1;
 }
 
-/*
- * Activate the first processor.
- */
-
-static void __init boot_cpu_init(void)
-{
-	int cpu = smp_processor_id();
-	/* Mark the boot cpu "present", "online" etc for SMP and UP case */
-	set_cpu_online(cpu, true);
-	set_cpu_active(cpu, true);
-	set_cpu_present(cpu, true);
-	set_cpu_possible(cpu, true);
-}
-
 void __init __weak smp_setup_processor_id(void)
 {
 }
@@ -522,6 +507,7 @@ asmlinkage __visible void __init start_kernel(void)
 	setup_command_line(command_line);
 	setup_nr_cpu_ids();
 	setup_per_cpu_areas();
+	boot_cpu_state_init();
 	smp_prepare_boot_cpu();	/* arch-specific boot-cpu hooks */
 
 	build_all_zonelists(NULL, NULL);
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 5b9d39633ce9..6ea42e8da861 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -22,13 +22,88 @@
22#include <linux/lockdep.h> 22#include <linux/lockdep.h>
23#include <linux/tick.h> 23#include <linux/tick.h>
24#include <linux/irq.h> 24#include <linux/irq.h>
25#include <linux/smpboot.h>
26
25#include <trace/events/power.h> 27#include <trace/events/power.h>
28#define CREATE_TRACE_POINTS
29#include <trace/events/cpuhp.h>
26 30
27#include "smpboot.h" 31#include "smpboot.h"
28 32
33/**
34 * cpuhp_cpu_state - Per cpu hotplug state storage
35 * @state: The current cpu state
36 * @target: The target state
37 * @thread: Pointer to the hotplug thread
38 * @should_run: Thread should execute
39 * @cb_stat: The state for a single callback (install/uninstall)
40 * @cb: Single callback function (install/uninstall)
41 * @result: Result of the operation
42 * @done: Signal completion to the issuer of the task
43 */
44struct cpuhp_cpu_state {
45 enum cpuhp_state state;
46 enum cpuhp_state target;
47#ifdef CONFIG_SMP
48 struct task_struct *thread;
49 bool should_run;
50 enum cpuhp_state cb_state;
51 int (*cb)(unsigned int cpu);
52 int result;
53 struct completion done;
54#endif
55};
56
57static DEFINE_PER_CPU(struct cpuhp_cpu_state, cpuhp_state);
58
59/**
60 * cpuhp_step - Hotplug state machine step
61 * @name: Name of the step
62 * @startup: Startup function of the step
63 * @teardown: Teardown function of the step
64 * @skip_onerr: Do not invoke the functions on error rollback
65 * Will go away once the notifiers are gone
66 * @cant_stop: Bringup/teardown can't be stopped at this step
67 */
68struct cpuhp_step {
69 const char *name;
70 int (*startup)(unsigned int cpu);
71 int (*teardown)(unsigned int cpu);
72 bool skip_onerr;
73 bool cant_stop;
74};
75
76static DEFINE_MUTEX(cpuhp_state_mutex);
77static struct cpuhp_step cpuhp_bp_states[];
78static struct cpuhp_step cpuhp_ap_states[];
79
80/**
81 * cpuhp_invoke_callback _ Invoke the callbacks for a given state
82 * @cpu: The cpu for which the callback should be invoked
83 * @step: The step in the state machine
84 * @cb: The callback function to invoke
85 *
86 * Called from cpu hotplug and from the state register machinery
87 */
88static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state step,
89 int (*cb)(unsigned int))
90{
91 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
92 int ret = 0;
93
94 if (cb) {
95 trace_cpuhp_enter(cpu, st->target, step, cb);
96 ret = cb(cpu);
97 trace_cpuhp_exit(cpu, st->state, step, ret);
98 }
99 return ret;
100}
101
29#ifdef CONFIG_SMP 102#ifdef CONFIG_SMP
30/* Serializes the updates to cpu_online_mask, cpu_present_mask */ 103/* Serializes the updates to cpu_online_mask, cpu_present_mask */
31static DEFINE_MUTEX(cpu_add_remove_lock); 104static DEFINE_MUTEX(cpu_add_remove_lock);
105bool cpuhp_tasks_frozen;
106EXPORT_SYMBOL_GPL(cpuhp_tasks_frozen);
32 107
33/* 108/*
34 * The following two APIs (cpu_maps_update_begin/done) must be used when 109 * The following two APIs (cpu_maps_update_begin/done) must be used when
@@ -207,31 +282,281 @@ int __register_cpu_notifier(struct notifier_block *nb)
207 return raw_notifier_chain_register(&cpu_chain, nb); 282 return raw_notifier_chain_register(&cpu_chain, nb);
208} 283}
209 284
210static int __cpu_notify(unsigned long val, void *v, int nr_to_call, 285static int __cpu_notify(unsigned long val, unsigned int cpu, int nr_to_call,
211 int *nr_calls) 286 int *nr_calls)
212{ 287{
288 unsigned long mod = cpuhp_tasks_frozen ? CPU_TASKS_FROZEN : 0;
289 void *hcpu = (void *)(long)cpu;
290
213 int ret; 291 int ret;
214 292
215 ret = __raw_notifier_call_chain(&cpu_chain, val, v, nr_to_call, 293 ret = __raw_notifier_call_chain(&cpu_chain, val | mod, hcpu, nr_to_call,
216 nr_calls); 294 nr_calls);
217 295
218 return notifier_to_errno(ret); 296 return notifier_to_errno(ret);
219} 297}
220 298
221static int cpu_notify(unsigned long val, void *v) 299static int cpu_notify(unsigned long val, unsigned int cpu)
222{ 300{
223 return __cpu_notify(val, v, -1, NULL); 301 return __cpu_notify(val, cpu, -1, NULL);
224} 302}
225 303
226#ifdef CONFIG_HOTPLUG_CPU 304/* Notifier wrappers for transitioning to state machine */
305static int notify_prepare(unsigned int cpu)
306{
307 int nr_calls = 0;
308 int ret;
309
310 ret = __cpu_notify(CPU_UP_PREPARE, cpu, -1, &nr_calls);
311 if (ret) {
312 nr_calls--;
313 printk(KERN_WARNING "%s: attempt to bring up CPU %u failed\n",
314 __func__, cpu);
315 __cpu_notify(CPU_UP_CANCELED, cpu, nr_calls, NULL);
316 }
317 return ret;
318}
319
320static int notify_online(unsigned int cpu)
321{
322 cpu_notify(CPU_ONLINE, cpu);
323 return 0;
324}
325
326static int notify_starting(unsigned int cpu)
327{
328 cpu_notify(CPU_STARTING, cpu);
329 return 0;
330}
331
332static int bringup_wait_for_ap(unsigned int cpu)
333{
334 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
335
336 wait_for_completion(&st->done);
337 return st->result;
338}
339
340static int bringup_cpu(unsigned int cpu)
341{
342 struct task_struct *idle = idle_thread_get(cpu);
343 int ret;
344
345 /* Arch-specific enabling code. */
346 ret = __cpu_up(cpu, idle);
347 if (ret) {
348 cpu_notify(CPU_UP_CANCELED, cpu);
349 return ret;
350 }
351 ret = bringup_wait_for_ap(cpu);
352 BUG_ON(!cpu_online(cpu));
353 return ret;
354}
355
356/*
357 * Hotplug state machine related functions
358 */
359static void undo_cpu_down(unsigned int cpu, struct cpuhp_cpu_state *st,
360 struct cpuhp_step *steps)
361{
362 for (st->state++; st->state < st->target; st->state++) {
363 struct cpuhp_step *step = steps + st->state;
364
365 if (!step->skip_onerr)
366 cpuhp_invoke_callback(cpu, st->state, step->startup);
367 }
368}
369
370static int cpuhp_down_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
371 struct cpuhp_step *steps, enum cpuhp_state target)
372{
373 enum cpuhp_state prev_state = st->state;
374 int ret = 0;
375
376 for (; st->state > target; st->state--) {
377 struct cpuhp_step *step = steps + st->state;
378
379 ret = cpuhp_invoke_callback(cpu, st->state, step->teardown);
380 if (ret) {
381 st->target = prev_state;
382 undo_cpu_down(cpu, st, steps);
383 break;
384 }
385 }
386 return ret;
387}
388
389static void undo_cpu_up(unsigned int cpu, struct cpuhp_cpu_state *st,
390 struct cpuhp_step *steps)
391{
392 for (st->state--; st->state > st->target; st->state--) {
393 struct cpuhp_step *step = steps + st->state;
394
395 if (!step->skip_onerr)
396 cpuhp_invoke_callback(cpu, st->state, step->teardown);
397 }
398}
399
400static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
401 struct cpuhp_step *steps, enum cpuhp_state target)
402{
403 enum cpuhp_state prev_state = st->state;
404 int ret = 0;
405
406 while (st->state < target) {
407 struct cpuhp_step *step;
408
409 st->state++;
410 step = steps + st->state;
411 ret = cpuhp_invoke_callback(cpu, st->state, step->startup);
412 if (ret) {
413 st->target = prev_state;
414 undo_cpu_up(cpu, st, steps);
415 break;
416 }
417 }
418 return ret;
419}
420
421/*
422 * The cpu hotplug threads manage the bringup and teardown of the cpus
423 */
424static void cpuhp_create(unsigned int cpu)
425{
426 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
427
428 init_completion(&st->done);
429}
430
431static int cpuhp_should_run(unsigned int cpu)
432{
433 struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
434
435 return st->should_run;
436}
437
438/* Execute the teardown callbacks. Used to be CPU_DOWN_PREPARE */
439static int cpuhp_ap_offline(unsigned int cpu, struct cpuhp_cpu_state *st)
440{
441 enum cpuhp_state target = max((int)st->target, CPUHP_TEARDOWN_CPU);
442
443 return cpuhp_down_callbacks(cpu, st, cpuhp_ap_states, target);
444}
445
446/* Execute the online startup callbacks. Used to be CPU_ONLINE */
447static int cpuhp_ap_online(unsigned int cpu, struct cpuhp_cpu_state *st)
448{
449 return cpuhp_up_callbacks(cpu, st, cpuhp_ap_states, st->target);
450}
451
452/*
453 * Execute teardown/startup callbacks on the plugged cpu. Also used to invoke
454 * callbacks when a state gets [un]installed at runtime.
455 */
456static void cpuhp_thread_fun(unsigned int cpu)
457{
458 struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
459 int ret = 0;
460
461 /*
462 * Paired with the mb() in cpuhp_kick_ap_work and
463 * cpuhp_invoke_ap_callback, so the work set is consistent visible.
464 */
465 smp_mb();
466 if (!st->should_run)
467 return;
468
469 st->should_run = false;
470
471 /* Single callback invocation for [un]install ? */
472 if (st->cb) {
473 if (st->cb_state < CPUHP_AP_ONLINE) {
474 local_irq_disable();
475 ret = cpuhp_invoke_callback(cpu, st->cb_state, st->cb);
476 local_irq_enable();
477 } else {
478 ret = cpuhp_invoke_callback(cpu, st->cb_state, st->cb);
479 }
480 } else {
481 /* Cannot happen .... */
482 BUG_ON(st->state < CPUHP_AP_ONLINE_IDLE);
483
484 /* Regular hotplug work */
485 if (st->state < st->target)
486 ret = cpuhp_ap_online(cpu, st);
487 else if (st->state > st->target)
488 ret = cpuhp_ap_offline(cpu, st);
489 }
490 st->result = ret;
491 complete(&st->done);
492}
227 493
228static void cpu_notify_nofail(unsigned long val, void *v) 494/* Invoke a single callback on a remote cpu */
495static int cpuhp_invoke_ap_callback(int cpu, enum cpuhp_state state,
496 int (*cb)(unsigned int))
229{ 497{
230 BUG_ON(cpu_notify(val, v)); 498 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
499
500 if (!cpu_online(cpu))
501 return 0;
502
503 st->cb_state = state;
504 st->cb = cb;
505 /*
506 * Make sure the above stores are visible before should_run becomes
507 * true. Paired with the mb() above in cpuhp_thread_fun()
508 */
509 smp_mb();
510 st->should_run = true;
511 wake_up_process(st->thread);
512 wait_for_completion(&st->done);
513 return st->result;
231} 514}
515
516/* Regular hotplug invocation of the AP hotplug thread */
517static void __cpuhp_kick_ap_work(struct cpuhp_cpu_state *st)
518{
519 st->result = 0;
520 st->cb = NULL;
521 /*
522 * Make sure the above stores are visible before should_run becomes
523 * true. Paired with the mb() above in cpuhp_thread_fun()
524 */
525 smp_mb();
526 st->should_run = true;
527 wake_up_process(st->thread);
528}
529
530static int cpuhp_kick_ap_work(unsigned int cpu)
531{
532 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
533 enum cpuhp_state state = st->state;
534
535 trace_cpuhp_enter(cpu, st->target, state, cpuhp_kick_ap_work);
536 __cpuhp_kick_ap_work(st);
537 wait_for_completion(&st->done);
538 trace_cpuhp_exit(cpu, st->state, state, st->result);
539 return st->result;
540}
541
542static struct smp_hotplug_thread cpuhp_threads = {
543 .store = &cpuhp_state.thread,
544 .create = &cpuhp_create,
545 .thread_should_run = cpuhp_should_run,
546 .thread_fn = cpuhp_thread_fun,
547 .thread_comm = "cpuhp/%u",
548 .selfparking = true,
549};
550
551void __init cpuhp_threads_init(void)
552{
553 BUG_ON(smpboot_register_percpu_thread(&cpuhp_threads));
554 kthread_unpark(this_cpu_read(cpuhp_state.thread));
555}
556
557#ifdef CONFIG_HOTPLUG_CPU
232EXPORT_SYMBOL(register_cpu_notifier); 558EXPORT_SYMBOL(register_cpu_notifier);
233EXPORT_SYMBOL(__register_cpu_notifier); 559EXPORT_SYMBOL(__register_cpu_notifier);
234
235void unregister_cpu_notifier(struct notifier_block *nb) 560void unregister_cpu_notifier(struct notifier_block *nb)
236{ 561{
237 cpu_maps_update_begin(); 562 cpu_maps_update_begin();
@@ -311,57 +636,60 @@ static inline void check_for_tasks(int dead_cpu)
311 read_unlock(&tasklist_lock); 636 read_unlock(&tasklist_lock);
312} 637}
313 638
314struct take_cpu_down_param { 639static void cpu_notify_nofail(unsigned long val, unsigned int cpu)
315 unsigned long mod; 640{
316 void *hcpu; 641 BUG_ON(cpu_notify(val, cpu));
317}; 642}
643
644static int notify_down_prepare(unsigned int cpu)
645{
646 int err, nr_calls = 0;
647
648 err = __cpu_notify(CPU_DOWN_PREPARE, cpu, -1, &nr_calls);
649 if (err) {
650 nr_calls--;
651 __cpu_notify(CPU_DOWN_FAILED, cpu, nr_calls, NULL);
652 pr_warn("%s: attempt to take down CPU %u failed\n",
653 __func__, cpu);
654 }
655 return err;
656}
657
658static int notify_dying(unsigned int cpu)
659{
660 cpu_notify(CPU_DYING, cpu);
661 return 0;
662}
318 663
319/* Take this CPU down. */ 664/* Take this CPU down. */
320static int take_cpu_down(void *_param) 665static int take_cpu_down(void *_param)
321{ 666{
322 struct take_cpu_down_param *param = _param; 667 struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
323 int err; 668 enum cpuhp_state target = max((int)st->target, CPUHP_AP_OFFLINE);
669 int err, cpu = smp_processor_id();
324 670
325 /* Ensure this CPU doesn't handle any more interrupts. */ 671 /* Ensure this CPU doesn't handle any more interrupts. */
326 err = __cpu_disable(); 672 err = __cpu_disable();
327 if (err < 0) 673 if (err < 0)
328 return err; 674 return err;
329 675
330 cpu_notify(CPU_DYING | param->mod, param->hcpu); 676 /* Invoke the former CPU_DYING callbacks */
677 for (; st->state > target; st->state--) {
678 struct cpuhp_step *step = cpuhp_ap_states + st->state;
679
680 cpuhp_invoke_callback(cpu, st->state, step->teardown);
681 }
331 /* Give up timekeeping duties */ 682 /* Give up timekeeping duties */
332 tick_handover_do_timer(); 683 tick_handover_do_timer();
333 /* Park the stopper thread */ 684 /* Park the stopper thread */
334 stop_machine_park((long)param->hcpu); 685 stop_machine_park(cpu);
335 return 0; 686 return 0;
336} 687}
337 688
338/* Requires cpu_add_remove_lock to be held */ 689static int takedown_cpu(unsigned int cpu)
339static int _cpu_down(unsigned int cpu, int tasks_frozen)
340{ 690{
341 int err, nr_calls = 0; 691 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
342 void *hcpu = (void *)(long)cpu; 692 int err;
343 unsigned long mod = tasks_frozen ? CPU_TASKS_FROZEN : 0;
344 struct take_cpu_down_param tcd_param = {
345 .mod = mod,
346 .hcpu = hcpu,
347 };
348
349 if (num_online_cpus() == 1)
350 return -EBUSY;
351
352 if (!cpu_online(cpu))
353 return -EINVAL;
354
355 cpu_hotplug_begin();
356
357 err = __cpu_notify(CPU_DOWN_PREPARE | mod, hcpu, -1, &nr_calls);
358 if (err) {
359 nr_calls--;
360 __cpu_notify(CPU_DOWN_FAILED | mod, hcpu, nr_calls, NULL);
361 pr_warn("%s: attempt to take down CPU %u failed\n",
362 __func__, cpu);
363 goto out_release;
364 }
365 693
366 /* 694 /*
367 * By now we've cleared cpu_active_mask, wait for all preempt-disabled 695 * By now we've cleared cpu_active_mask, wait for all preempt-disabled
@@ -378,6 +706,8 @@ static int _cpu_down(unsigned int cpu, int tasks_frozen)
378 else 706 else
379 synchronize_rcu(); 707 synchronize_rcu();
380 708
709 /* Park the smpboot threads */
710 kthread_park(per_cpu_ptr(&cpuhp_state, cpu)->thread);
381 smpboot_park_threads(cpu); 711 smpboot_park_threads(cpu);
382 712
383 /* 713 /*
@@ -389,12 +719,12 @@ static int _cpu_down(unsigned int cpu, int tasks_frozen)
389 /* 719 /*
390 * So now all preempt/rcu users must observe !cpu_active(). 720 * So now all preempt/rcu users must observe !cpu_active().
391 */ 721 */
392 err = stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu)); 722 err = stop_machine(take_cpu_down, NULL, cpumask_of(cpu));
393 if (err) { 723 if (err) {
394 /* CPU didn't die: tell everyone. Can't complain. */ 724 /* CPU didn't die: tell everyone. Can't complain. */
395 cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu); 725 cpu_notify_nofail(CPU_DOWN_FAILED, cpu);
396 irq_unlock_sparse(); 726 irq_unlock_sparse();
397 goto out_release; 727 return err;
398 } 728 }
399 BUG_ON(cpu_online(cpu)); 729 BUG_ON(cpu_online(cpu));
400 730
@@ -405,10 +735,8 @@ static int _cpu_down(unsigned int cpu, int tasks_frozen)
405 * 735 *
406 * Wait for the stop thread to go away. 736 * Wait for the stop thread to go away.
407 */ 737 */
408 while (!per_cpu(cpu_dead_idle, cpu)) 738 wait_for_completion(&st->done);
409 cpu_relax(); 739 BUG_ON(st->state != CPUHP_AP_IDLE_DEAD);
410 smp_mb(); /* Read from cpu_dead_idle before __cpu_die(). */
411 per_cpu(cpu_dead_idle, cpu) = false;
412 740
413 /* Interrupts are moved away from the dying cpu, reenable alloc/free */ 741 /* Interrupts are moved away from the dying cpu, reenable alloc/free */
414 irq_unlock_sparse(); 742 irq_unlock_sparse();
@@ -417,20 +745,104 @@ static int _cpu_down(unsigned int cpu, int tasks_frozen)
417 /* This actually kills the CPU. */ 745 /* This actually kills the CPU. */
418 __cpu_die(cpu); 746 __cpu_die(cpu);
419 747
420 /* CPU is completely dead: tell everyone. Too late to complain. */
421 tick_cleanup_dead_cpu(cpu); 748 tick_cleanup_dead_cpu(cpu);
422 cpu_notify_nofail(CPU_DEAD | mod, hcpu); 749 return 0;
750}
423 751
752static int notify_dead(unsigned int cpu)
753{
754 cpu_notify_nofail(CPU_DEAD, cpu);
424 check_for_tasks(cpu); 755 check_for_tasks(cpu);
756 return 0;
757}
425 758
426out_release: 759static void cpuhp_complete_idle_dead(void *arg)
760{
761 struct cpuhp_cpu_state *st = arg;
762
763 complete(&st->done);
764}
765
766void cpuhp_report_idle_dead(void)
767{
768 struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
769
770 BUG_ON(st->state != CPUHP_AP_OFFLINE);
771 rcu_report_dead(smp_processor_id());
772 st->state = CPUHP_AP_IDLE_DEAD;
773 /*
774 * We cannot call complete after rcu_report_dead() so we delegate it
775 * to an online cpu.
776 */
777 smp_call_function_single(cpumask_first(cpu_online_mask),
778 cpuhp_complete_idle_dead, st, 0);
779}
780
781#else
782#define notify_down_prepare NULL
783#define takedown_cpu NULL
784#define notify_dead NULL
785#define notify_dying NULL
786#endif
787
788#ifdef CONFIG_HOTPLUG_CPU
789
790/* Requires cpu_add_remove_lock to be held */
791static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
792 enum cpuhp_state target)
793{
794 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
795 int prev_state, ret = 0;
796 bool hasdied = false;
797
798 if (num_online_cpus() == 1)
799 return -EBUSY;
800
801 if (!cpu_present(cpu))
802 return -EINVAL;
803
804 cpu_hotplug_begin();
805
806 cpuhp_tasks_frozen = tasks_frozen;
807
808 prev_state = st->state;
809 st->target = target;
810 /*
811 * If the current CPU state is in the range of the AP hotplug thread,
812 * then we need to kick the thread.
813 */
814 if (st->state > CPUHP_TEARDOWN_CPU) {
815 ret = cpuhp_kick_ap_work(cpu);
816 /*
817 * The AP side has done the error rollback already. Just
818 * return the error code..
819 */
820 if (ret)
821 goto out;
822
823 /*
824 * We might have stopped still in the range of the AP hotplug
825 * thread. Nothing to do anymore.
826 */
827 if (st->state > CPUHP_TEARDOWN_CPU)
828 goto out;
829 }
830 /*
831 * The AP brought itself down to CPUHP_TEARDOWN_CPU. So we need
832 * to do the further cleanups.
833 */
834 ret = cpuhp_down_callbacks(cpu, st, cpuhp_bp_states, target);
835
836 hasdied = prev_state != st->state && st->state == CPUHP_OFFLINE;
837out:
427 cpu_hotplug_done(); 838 cpu_hotplug_done();
428 if (!err) 839 /* This post dead nonsense must die */
429 cpu_notify_nofail(CPU_POST_DEAD | mod, hcpu); 840 if (!ret && hasdied)
430 return err; 841 cpu_notify_nofail(CPU_POST_DEAD, cpu);
842 return ret;
431} 843}
432 844
433int cpu_down(unsigned int cpu) 845static int do_cpu_down(unsigned int cpu, enum cpuhp_state target)
434{ 846{
435 int err; 847 int err;
436 848
@@ -441,100 +853,131 @@ int cpu_down(unsigned int cpu)
441 goto out; 853 goto out;
442 } 854 }
443 855
444 err = _cpu_down(cpu, 0); 856 err = _cpu_down(cpu, 0, target);
445 857
446out: 858out:
447 cpu_maps_update_done(); 859 cpu_maps_update_done();
448 return err; 860 return err;
449} 861}
862int cpu_down(unsigned int cpu)
863{
864 return do_cpu_down(cpu, CPUHP_OFFLINE);
865}
450EXPORT_SYMBOL(cpu_down); 866EXPORT_SYMBOL(cpu_down);
451#endif /*CONFIG_HOTPLUG_CPU*/ 867#endif /*CONFIG_HOTPLUG_CPU*/
452 868
453/* 869/**
454 * Unpark per-CPU smpboot kthreads at CPU-online time. 870 * notify_cpu_starting(cpu) - call the CPU_STARTING notifiers
871 * @cpu: cpu that just started
872 *
873 * This function calls the cpu_chain notifiers with CPU_STARTING.
874 * It must be called by the arch code on the new cpu, before the new cpu
875 * enables interrupts and before the "boot" cpu returns from __cpu_up().
455 */ 876 */
456static int smpboot_thread_call(struct notifier_block *nfb, 877void notify_cpu_starting(unsigned int cpu)
457 unsigned long action, void *hcpu)
458{ 878{
459 int cpu = (long)hcpu; 879 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
460 880 enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);
461 switch (action & ~CPU_TASKS_FROZEN) {
462 881
463 case CPU_DOWN_FAILED: 882 while (st->state < target) {
464 case CPU_ONLINE: 883 struct cpuhp_step *step;
465 smpboot_unpark_threads(cpu);
466 break;
467 884
468 default: 885 st->state++;
469 break; 886 step = cpuhp_ap_states + st->state;
887 cpuhp_invoke_callback(cpu, st->state, step->startup);
470 } 888 }
471
472 return NOTIFY_OK;
473} 889}
474 890
475static struct notifier_block smpboot_thread_notifier = { 891/*
476 .notifier_call = smpboot_thread_call, 892 * Called from the idle task. We need to set active here, so we can kick off
477 .priority = CPU_PRI_SMPBOOT, 893 * the stopper thread and unpark the smpboot threads. If the target state is
478}; 894 * beyond CPUHP_AP_ONLINE_IDLE we kick cpuhp thread and let it bring up the
479 895 * cpu further.
480void smpboot_thread_init(void) 896 */
897void cpuhp_online_idle(enum cpuhp_state state)
481{ 898{
482 register_cpu_notifier(&smpboot_thread_notifier); 899 struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
900 unsigned int cpu = smp_processor_id();
901
902 /* Happens for the boot cpu */
903 if (state != CPUHP_AP_ONLINE_IDLE)
904 return;
905
906 st->state = CPUHP_AP_ONLINE_IDLE;
907
908 /* The cpu is marked online, set it active now */
909 set_cpu_active(cpu, true);
910 /* Unpark the stopper thread and the hotplug thread of this cpu */
911 stop_machine_unpark(cpu);
912 kthread_unpark(st->thread);
913
914 /* Should we go further up ? */
915 if (st->target > CPUHP_AP_ONLINE_IDLE)
916 __cpuhp_kick_ap_work(st);
917 else
918 complete(&st->done);
483} 919}
484 920
485/* Requires cpu_add_remove_lock to be held */ 921/* Requires cpu_add_remove_lock to be held */
486static int _cpu_up(unsigned int cpu, int tasks_frozen) 922static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
487{ 923{
488 int ret, nr_calls = 0; 924 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
489 void *hcpu = (void *)(long)cpu;
490 unsigned long mod = tasks_frozen ? CPU_TASKS_FROZEN : 0;
491 struct task_struct *idle; 925 struct task_struct *idle;
926 int ret = 0;
492 927
493 cpu_hotplug_begin(); 928 cpu_hotplug_begin();
494 929
495 if (cpu_online(cpu) || !cpu_present(cpu)) { 930 if (!cpu_present(cpu)) {
496 ret = -EINVAL; 931 ret = -EINVAL;
497 goto out; 932 goto out;
498 } 933 }
499 934
500 idle = idle_thread_get(cpu); 935 /*
501 if (IS_ERR(idle)) { 936 * The caller of do_cpu_up might have raced with another
502 ret = PTR_ERR(idle); 937 * caller. Ignore it for now.
503 goto out; 938 */
504 } 939 if (st->state >= target)
505
506 ret = smpboot_create_threads(cpu);
507 if (ret)
508 goto out; 940 goto out;
509 941
510 ret = __cpu_notify(CPU_UP_PREPARE | mod, hcpu, -1, &nr_calls); 942 if (st->state == CPUHP_OFFLINE) {
511 if (ret) { 943 /* Let it fail before we try to bring the cpu up */
512 nr_calls--; 944 idle = idle_thread_get(cpu);
513 pr_warn("%s: attempt to bring up CPU %u failed\n", 945 if (IS_ERR(idle)) {
514 __func__, cpu); 946 ret = PTR_ERR(idle);
515 goto out_notify; 947 goto out;
948 }
516 } 949 }
517 950
518 /* Arch-specific enabling code. */ 951 cpuhp_tasks_frozen = tasks_frozen;
519 ret = __cpu_up(cpu, idle);
520
521 if (ret != 0)
522 goto out_notify;
523 BUG_ON(!cpu_online(cpu));
524 952
525 /* Now call notifier in preparation. */ 953 st->target = target;
526 cpu_notify(CPU_ONLINE | mod, hcpu); 954 /*
955 * If the current CPU state is in the range of the AP hotplug thread,
956 * then we need to kick the thread once more.
957 */
958 if (st->state > CPUHP_BRINGUP_CPU) {
959 ret = cpuhp_kick_ap_work(cpu);
960 /*
961 * The AP side has done the error rollback already. Just
962 * return the error code..
963 */
964 if (ret)
965 goto out;
966 }
527 967
528out_notify: 968 /*
529 if (ret != 0) 969 * Try to reach the target state. We max out on the BP at
530 __cpu_notify(CPU_UP_CANCELED | mod, hcpu, nr_calls, NULL); 970 * CPUHP_BRINGUP_CPU. After that the AP hotplug thread is
971 * responsible for bringing it up to the target state.
972 */
973 target = min((int)target, CPUHP_BRINGUP_CPU);
974 ret = cpuhp_up_callbacks(cpu, st, cpuhp_bp_states, target);
531out: 975out:
532 cpu_hotplug_done(); 976 cpu_hotplug_done();
533
534 return ret; 977 return ret;
535} 978}
536 979
537int cpu_up(unsigned int cpu) 980static int do_cpu_up(unsigned int cpu, enum cpuhp_state target)
538{ 981{
539 int err = 0; 982 int err = 0;
540 983
@@ -558,12 +1001,16 @@ int cpu_up(unsigned int cpu)
558 goto out; 1001 goto out;
559 } 1002 }
560 1003
561 err = _cpu_up(cpu, 0); 1004 err = _cpu_up(cpu, 0, target);
562
563out: 1005out:
564 cpu_maps_update_done(); 1006 cpu_maps_update_done();
565 return err; 1007 return err;
566} 1008}
1009
1010int cpu_up(unsigned int cpu)
1011{
1012 return do_cpu_up(cpu, CPUHP_ONLINE);
1013}
567EXPORT_SYMBOL_GPL(cpu_up); 1014EXPORT_SYMBOL_GPL(cpu_up);
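Usage stays the same for existing callers: cpu_up() simply requests the full CPUHP_ONLINE target, and cpu_down() (wrapping do_cpu_down() with CPUHP_OFFLINE, defined earlier in the patch) is its counterpart. A trivial, hedged caller sketch, assuming <linux/cpu.h> and <linux/printk.h>:

	static int example_bring_cpu_online(unsigned int cpu)
	{
		int ret = cpu_up(cpu);	/* walks the state machine up to CPUHP_ONLINE */

		if (ret)
			pr_err("CPU%u failed to come online: %d\n", cpu, ret);
		return ret;
	}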
568 1015
569#ifdef CONFIG_PM_SLEEP_SMP 1016#ifdef CONFIG_PM_SLEEP_SMP
@@ -586,7 +1033,7 @@ int disable_nonboot_cpus(void)
586 if (cpu == first_cpu) 1033 if (cpu == first_cpu)
587 continue; 1034 continue;
588 trace_suspend_resume(TPS("CPU_OFF"), cpu, true); 1035 trace_suspend_resume(TPS("CPU_OFF"), cpu, true);
589 error = _cpu_down(cpu, 1); 1036 error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
590 trace_suspend_resume(TPS("CPU_OFF"), cpu, false); 1037 trace_suspend_resume(TPS("CPU_OFF"), cpu, false);
591 if (!error) 1038 if (!error)
592 cpumask_set_cpu(cpu, frozen_cpus); 1039 cpumask_set_cpu(cpu, frozen_cpus);
@@ -636,7 +1083,7 @@ void enable_nonboot_cpus(void)
636 1083
637 for_each_cpu(cpu, frozen_cpus) { 1084 for_each_cpu(cpu, frozen_cpus) {
638 trace_suspend_resume(TPS("CPU_ON"), cpu, true); 1085 trace_suspend_resume(TPS("CPU_ON"), cpu, true);
639 error = _cpu_up(cpu, 1); 1086 error = _cpu_up(cpu, 1, CPUHP_ONLINE);
640 trace_suspend_resume(TPS("CPU_ON"), cpu, false); 1087 trace_suspend_resume(TPS("CPU_ON"), cpu, false);
641 if (!error) { 1088 if (!error) {
642 pr_info("CPU%d is up\n", cpu); 1089 pr_info("CPU%d is up\n", cpu);
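The frozen variants above are driven by the PM sleep core; roughly as follows (a hedged sketch, the real code in kernel/power/ has considerably more error handling):

	static int example_suspend_cpus_phase(void)
	{
		int error;

		error = disable_nonboot_cpus();	/* _cpu_down(cpu, 1, CPUHP_OFFLINE) per CPU */
		if (error)
			return error;

		/* ... enter the sleep state on the boot CPU ... */

		enable_nonboot_cpus();		/* _cpu_up(cpu, 1, CPUHP_ONLINE) per CPU */
		return 0;
	}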
@@ -709,26 +1156,463 @@ core_initcall(cpu_hotplug_pm_sync_init);
709 1156
710#endif /* CONFIG_PM_SLEEP_SMP */ 1157#endif /* CONFIG_PM_SLEEP_SMP */
711 1158
1159#endif /* CONFIG_SMP */
1160
1161/* Boot processor state steps */
1162static struct cpuhp_step cpuhp_bp_states[] = {
1163 [CPUHP_OFFLINE] = {
1164 .name = "offline",
1165 .startup = NULL,
1166 .teardown = NULL,
1167 },
1168#ifdef CONFIG_SMP
1169 [CPUHP_CREATE_THREADS] = {
1170 .name = "threads:create",
1171 .startup = smpboot_create_threads,
1172 .teardown = NULL,
1173 .cant_stop = true,
1174 },
1175 /*
1176 * Preparatory and dead notifiers. Will be replaced once the notifiers
1177 * are converted to states.
1178 */
1179 [CPUHP_NOTIFY_PREPARE] = {
1180 .name = "notify:prepare",
1181 .startup = notify_prepare,
1182 .teardown = notify_dead,
1183 .skip_onerr = true,
1184 .cant_stop = true,
1185 },
1186 /* Kicks the plugged cpu into life */
1187 [CPUHP_BRINGUP_CPU] = {
1188 .name = "cpu:bringup",
1189 .startup = bringup_cpu,
1190 .teardown = NULL,
1191 .cant_stop = true,
1192 },
1193 /*
1194 * Handled on control processor until the plugged processor manages
1195 * this itself.
1196 */
1197 [CPUHP_TEARDOWN_CPU] = {
1198 .name = "cpu:teardown",
1199 .startup = NULL,
1200 .teardown = takedown_cpu,
1201 .cant_stop = true,
1202 },
1203#endif
1204};
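The table above is consumed by walking a CPU's state toward the requested target and invoking .startup (or .teardown on the way down) for each step. The real walker, cpuhp_up_callbacks(), is added earlier in this patch and referenced from _cpu_up() above; the condensed sketch below only illustrates the idea, assumes (as the helper defined earlier does) that NULL callbacks are skipped by cpuhp_invoke_callback(), and omits the rollback performed on failure.

	static int example_up_walk(unsigned int cpu, struct cpuhp_cpu_state *st,
				   struct cpuhp_step *steps, enum cpuhp_state target)
	{
		enum cpuhp_state prev_state = st->state;
		int ret = 0;

		while (st->state < target) {
			st->state++;
			/* NULL startup callbacks are skipped inside the invoke helper */
			ret = cpuhp_invoke_callback(cpu, st->state,
						    steps[st->state].startup);
			if (ret) {
				st->target = prev_state;
				/* the real helper undoes the completed steps here */
				break;
			}
		}
		return ret;
	}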
1205
1206/* Application processor state steps */
1207static struct cpuhp_step cpuhp_ap_states[] = {
1208#ifdef CONFIG_SMP
1209 /* Final state before CPU kills itself */
1210 [CPUHP_AP_IDLE_DEAD] = {
1211 .name = "idle:dead",
1212 },
1213 /*
1214 * Last state before CPU enters the idle loop to die. Transient state
1215 * for synchronization.
1216 */
1217 [CPUHP_AP_OFFLINE] = {
1218 .name = "ap:offline",
1219 .cant_stop = true,
1220 },
1221 /*
1222 * Low level startup/teardown notifiers. Run with interrupts
1223 * disabled. Will be removed once the notifiers are converted to
1224 * states.
1225 */
1226 [CPUHP_AP_NOTIFY_STARTING] = {
1227 .name = "notify:starting",
1228 .startup = notify_starting,
1229 .teardown = notify_dying,
1230 .skip_onerr = true,
1231 .cant_stop = true,
1232 },
1233 /* Entry state on starting. Interrupts enabled from here on. Transient
1234 * state for synchronization */
1235 [CPUHP_AP_ONLINE] = {
1236 .name = "ap:online",
1237 },
1238 /* Handle smpboot threads park/unpark */
1239 [CPUHP_AP_SMPBOOT_THREADS] = {
1240 .name = "smpboot:threads",
1241 .startup = smpboot_unpark_threads,
1242 .teardown = NULL,
1243 },
1244 /*
1245 * Online/down_prepare notifiers. Will be removed once the notifiers
1246 * are converted to states.
1247 */
1248 [CPUHP_AP_NOTIFY_ONLINE] = {
1249 .name = "notify:online",
1250 .startup = notify_online,
1251 .teardown = notify_down_prepare,
1252 },
1253#endif
1254 /*
1255 * The dynamically registered state space is here
1256 */
1257
1258 /* CPU is fully up and running. */
1259 [CPUHP_ONLINE] = {
1260 .name = "online",
1261 .startup = NULL,
1262 .teardown = NULL,
1263 },
1264};
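For comparison with the dynamic range handled below, a subsystem that needs a fixed position in the AP section would add an enum value to cpuhp_state and a row in this table. Everything in the following sketch is hypothetical (CPUHP_AP_EXAMPLE_STARTING and the example callbacks do not exist in this patch); it only shows the shape of such an entry.

	static int example_starting_cpu(unsigned int cpu)
	{
		/* runs on the plugged CPU, interrupts still disabled */
		return 0;
	}

	static int example_dying_cpu(unsigned int cpu)
	{
		/* runs on the dying CPU, must not fail */
		return 0;
	}

	/* hypothetical row to add to cpuhp_ap_states[], at a matching enum slot */
	[CPUHP_AP_EXAMPLE_STARTING] = {
		.name		= "example:starting",
		.startup	= example_starting_cpu,
		.teardown	= example_dying_cpu,
	},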
1265
1266/* Sanity check for callbacks */
1267static int cpuhp_cb_check(enum cpuhp_state state)
1268{
1269 if (state <= CPUHP_OFFLINE || state >= CPUHP_ONLINE)
1270 return -EINVAL;
1271 return 0;
1272}
1273
1274static bool cpuhp_is_ap_state(enum cpuhp_state state)
1275{
1276 /*
1277 * The extra check for CPUHP_TEARDOWN_CPU is only for documentation
1278 * purposes as that state is handled explicitly in cpu_down.
1279 */
1280 return state > CPUHP_BRINGUP_CPU && state != CPUHP_TEARDOWN_CPU;
1281}
1282
1283static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state state)
1284{
1285 struct cpuhp_step *sp;
1286
1287 sp = cpuhp_is_ap_state(state) ? cpuhp_ap_states : cpuhp_bp_states;
1288 return sp + state;
1289}
1290
1291static void cpuhp_store_callbacks(enum cpuhp_state state,
1292 const char *name,
1293 int (*startup)(unsigned int cpu),
1294 int (*teardown)(unsigned int cpu))
1295{
1296 /* (Un)Install the callbacks for further cpu hotplug operations */
1297 struct cpuhp_step *sp;
1298
1299 mutex_lock(&cpuhp_state_mutex);
1300 sp = cpuhp_get_step(state);
1301 sp->startup = startup;
1302 sp->teardown = teardown;
1303 sp->name = name;
1304 mutex_unlock(&cpuhp_state_mutex);
1305}
1306
1307static void *cpuhp_get_teardown_cb(enum cpuhp_state state)
1308{
1309 return cpuhp_get_step(state)->teardown;
1310}
1311
1312/*
1313 * Call the startup/teardown function for a step either on the AP or
1314 * on the current CPU.
1315 */
1316static int cpuhp_issue_call(int cpu, enum cpuhp_state state,
1317 int (*cb)(unsigned int), bool bringup)
1318{
1319 int ret;
1320
1321 if (!cb)
1322 return 0;
1323 /*
1324 * The non-AP-bound callbacks can fail on bringup. On teardown,
1325 * e.g. module removal, we crash for now.
1326 */
1327#ifdef CONFIG_SMP
1328 if (cpuhp_is_ap_state(state))
1329 ret = cpuhp_invoke_ap_callback(cpu, state, cb);
1330 else
1331 ret = cpuhp_invoke_callback(cpu, state, cb);
1332#else
1333 ret = cpuhp_invoke_callback(cpu, state, cb);
1334#endif
1335 BUG_ON(ret && !bringup);
1336 return ret;
1337}
1338
1339/*
1340 * Called from __cpuhp_setup_state on a recoverable failure.
1341 *
1342 * Note: The teardown callbacks for rollback are not allowed to fail!
1343 */
1344static void cpuhp_rollback_install(int failedcpu, enum cpuhp_state state,
1345 int (*teardown)(unsigned int cpu))
1346{
1347 int cpu;
1348
1349 if (!teardown)
1350 return;
1351
1352 /* Roll back the already executed steps on the other cpus */
1353 for_each_present_cpu(cpu) {
1354 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
1355 int cpustate = st->state;
1356
1357 if (cpu >= failedcpu)
1358 break;
1359
1360 /* Did we invoke the startup call on that cpu ? */
1361 if (cpustate >= state)
1362 cpuhp_issue_call(cpu, state, teardown, false);
1363 }
1364}
1365
1366/*
1367 * Returns a free slot for dynamic state assignment in the online section. The
1368 * slots are protected by cpuhp_state_mutex and an empty slot is identified
1369 * by having no name assigned.
1370 */
1371static int cpuhp_reserve_state(enum cpuhp_state state)
1372{
1373 enum cpuhp_state i;
1374
1375 mutex_lock(&cpuhp_state_mutex);
1376 for (i = CPUHP_AP_ONLINE_DYN; i <= CPUHP_AP_ONLINE_DYN_END; i++) {
1377 if (cpuhp_ap_states[i].name)
1378 continue;
1379
1380 cpuhp_ap_states[i].name = "Reserved";
1381 mutex_unlock(&cpuhp_state_mutex);
1382 return i;
1383 }
1384 mutex_unlock(&cpuhp_state_mutex);
1385 WARN(1, "No more dynamic states available for CPU hotplug\n");
1386 return -ENOSPC;
1387}
1388
712/** 1389/**
713 * notify_cpu_starting(cpu) - call the CPU_STARTING notifiers 1390 * __cpuhp_setup_state - Setup the callbacks for a hotplug machine state
714 * @cpu: cpu that just started 1391 * @state: The state to setup
1392 * @invoke: If true, the startup function is invoked for cpus where
1393 * cpu state >= @state
1394 * @startup: startup callback function
1395 * @teardown: teardown callback function
715 * 1396 *
716 * This function calls the cpu_chain notifiers with CPU_STARTING. 1397 * Returns 0 (or the allocated state for CPUHP_AP_ONLINE_DYN) on success, otherwise a negative error code
717 * It must be called by the arch code on the new cpu, before the new cpu
718 * enables interrupts and before the "boot" cpu returns from __cpu_up().
719 */ 1398 */
720void notify_cpu_starting(unsigned int cpu) 1399int __cpuhp_setup_state(enum cpuhp_state state,
1400 const char *name, bool invoke,
1401 int (*startup)(unsigned int cpu),
1402 int (*teardown)(unsigned int cpu))
721{ 1403{
722 unsigned long val = CPU_STARTING; 1404 int cpu, ret = 0;
1405 int dyn_state = 0;
723 1406
724#ifdef CONFIG_PM_SLEEP_SMP 1407 if (cpuhp_cb_check(state) || !name)
725 if (frozen_cpus != NULL && cpumask_test_cpu(cpu, frozen_cpus)) 1408 return -EINVAL;
726 val = CPU_STARTING_FROZEN; 1409
727#endif /* CONFIG_PM_SLEEP_SMP */ 1410 get_online_cpus();
728 cpu_notify(val, (void *)(long)cpu); 1411
1412 /* Dynamic slot assignment is currently only possible for the ONLINE section */
1413 if (state == CPUHP_AP_ONLINE_DYN) {
1414 dyn_state = 1;
1415 ret = cpuhp_reserve_state(state);
1416 if (ret < 0)
1417 goto out;
1418 state = ret;
1419 }
1420
1421 cpuhp_store_callbacks(state, name, startup, teardown);
1422
1423 if (!invoke || !startup)
1424 goto out;
1425
1426 /*
1427 * Try to call the startup callback for each present cpu
1428 * depending on the hotplug state of the cpu.
1429 */
1430 for_each_present_cpu(cpu) {
1431 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
1432 int cpustate = st->state;
1433
1434 if (cpustate < state)
1435 continue;
1436
1437 ret = cpuhp_issue_call(cpu, state, startup, true);
1438 if (ret) {
1439 cpuhp_rollback_install(cpu, state, teardown);
1440 cpuhp_store_callbacks(state, NULL, NULL, NULL);
1441 goto out;
1442 }
1443 }
1444out:
1445 put_online_cpus();
1446 if (!ret && dyn_state)
1447 return state;
1448 return ret;
729} 1449}
1450EXPORT_SYMBOL(__cpuhp_setup_state);
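A hedged usage sketch for the dynamic range handled by cpuhp_reserve_state() above: a caller passes CPUHP_AP_ONLINE_DYN and, on success, gets the actually reserved state back as the return value, which it has to remember for later removal. All example_* names below are made up for illustration.

	static enum cpuhp_state example_hp_state;

	static int example_cpu_online(unsigned int cpu)
	{
		/* set up the per-cpu resources of the example driver */
		return 0;
	}

	static int example_cpu_offline(unsigned int cpu)
	{
		/* teardown must not fail, see __cpuhp_remove_state() below */
		return 0;
	}

	static int __init example_driver_init(void)
	{
		int ret;

		ret = __cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "example:online", true,
					  example_cpu_online, example_cpu_offline);
		if (ret < 0)
			return ret;

		example_hp_state = ret;		/* the dynamically reserved state */
		return 0;
	}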
730 1451
731#endif /* CONFIG_SMP */ 1452/**
1453 * __cpuhp_remove_state - Remove the callbacks for a hotplug machine state
1454 * @state: The state to remove
1455 * @invoke: If true, the teardown function is invoked for cpus where
1456 * cpu state >= @state
1457 *
1458 * The teardown callback is currently not allowed to fail. Think
1459 * about module removal!
1460 */
1461void __cpuhp_remove_state(enum cpuhp_state state, bool invoke)
1462{
1463 int (*teardown)(unsigned int cpu) = cpuhp_get_teardown_cb(state);
1464 int cpu;
1465
1466 BUG_ON(cpuhp_cb_check(state));
1467
1468 get_online_cpus();
1469
1470 if (!invoke || !teardown)
1471 goto remove;
1472
1473 /*
1474 * Call the teardown callback for each present cpu depending
1475 * on the hotplug state of the cpu. This function is not
1476 * allowed to fail currently!
1477 */
1478 for_each_present_cpu(cpu) {
1479 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
1480 int cpustate = st->state;
1481
1482 if (cpustate >= state)
1483 cpuhp_issue_call(cpu, state, teardown, false);
1484 }
1485remove:
1486 cpuhp_store_callbacks(state, NULL, NULL, NULL);
1487 put_online_cpus();
1488}
1489EXPORT_SYMBOL(__cpuhp_remove_state);
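The counterpart to the setup sketch above: with invoke set, the (non-failing) teardown callback runs on every present CPU whose state has reached the removed one before the slot is cleared.

	static void __exit example_driver_exit(void)
	{
		/* runs example_cpu_offline() on all CPUs at or above the state */
		__cpuhp_remove_state(example_hp_state, true);
	}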
1490
1491#if defined(CONFIG_SYSFS) && defined(CONFIG_HOTPLUG_CPU)
1492static ssize_t show_cpuhp_state(struct device *dev,
1493 struct device_attribute *attr, char *buf)
1494{
1495 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, dev->id);
1496
1497 return sprintf(buf, "%d\n", st->state);
1498}
1499static DEVICE_ATTR(state, 0444, show_cpuhp_state, NULL);
1500
1501static ssize_t write_cpuhp_target(struct device *dev,
1502 struct device_attribute *attr,
1503 const char *buf, size_t count)
1504{
1505 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, dev->id);
1506 struct cpuhp_step *sp;
1507 int target, ret;
1508
1509 ret = kstrtoint(buf, 10, &target);
1510 if (ret)
1511 return ret;
1512
1513#ifdef CONFIG_CPU_HOTPLUG_STATE_CONTROL
1514 if (target < CPUHP_OFFLINE || target > CPUHP_ONLINE)
1515 return -EINVAL;
1516#else
1517 if (target != CPUHP_OFFLINE && target != CPUHP_ONLINE)
1518 return -EINVAL;
1519#endif
1520
1521 ret = lock_device_hotplug_sysfs();
1522 if (ret)
1523 return ret;
1524
1525 mutex_lock(&cpuhp_state_mutex);
1526 sp = cpuhp_get_step(target);
1527 ret = !sp->name || sp->cant_stop ? -EINVAL : 0;
1528 mutex_unlock(&cpuhp_state_mutex);
1529 if (ret)
1530 return ret;
1531
1532 if (st->state < target)
1533 ret = do_cpu_up(dev->id, target);
1534 else
1535 ret = do_cpu_down(dev->id, target);
1536
1537 unlock_device_hotplug();
1538 return ret ? ret : count;
1539}
1540
1541static ssize_t show_cpuhp_target(struct device *dev,
1542 struct device_attribute *attr, char *buf)
1543{
1544 struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, dev->id);
1545
1546 return sprintf(buf, "%d\n", st->target);
1547}
1548static DEVICE_ATTR(target, 0644, show_cpuhp_target, write_cpuhp_target);
1549
1550static struct attribute *cpuhp_cpu_attrs[] = {
1551 &dev_attr_state.attr,
1552 &dev_attr_target.attr,
1553 NULL
1554};
1555
1556static struct attribute_group cpuhp_cpu_attr_group = {
1557 .attrs = cpuhp_cpu_attrs,
1558 .name = "hotplug",
1559 NULL
1560};
1561
1562static ssize_t show_cpuhp_states(struct device *dev,
1563 struct device_attribute *attr, char *buf)
1564{
1565 ssize_t cur, res = 0;
1566 int i;
1567
1568 mutex_lock(&cpuhp_state_mutex);
1569 for (i = CPUHP_OFFLINE; i <= CPUHP_ONLINE; i++) {
1570 struct cpuhp_step *sp = cpuhp_get_step(i);
1571
1572 if (sp->name) {
1573 cur = sprintf(buf, "%3d: %s\n", i, sp->name);
1574 buf += cur;
1575 res += cur;
1576 }
1577 }
1578 mutex_unlock(&cpuhp_state_mutex);
1579 return res;
1580}
1581static DEVICE_ATTR(states, 0444, show_cpuhp_states, NULL);
1582
1583static struct attribute *cpuhp_cpu_root_attrs[] = {
1584 &dev_attr_states.attr,
1585 NULL
1586};
1587
1588static struct attribute_group cpuhp_cpu_root_attr_group = {
1589 .attrs = cpuhp_cpu_root_attrs,
1590 .name = "hotplug",
1591 NULL
1592};
1593
1594static int __init cpuhp_sysfs_init(void)
1595{
1596 int cpu, ret;
1597
1598 ret = sysfs_create_group(&cpu_subsys.dev_root->kobj,
1599 &cpuhp_cpu_root_attr_group);
1600 if (ret)
1601 return ret;
1602
1603 for_each_possible_cpu(cpu) {
1604 struct device *dev = get_cpu_device(cpu);
1605
1606 if (!dev)
1607 continue;
1608 ret = sysfs_create_group(&dev->kobj, &cpuhp_cpu_attr_group);
1609 if (ret)
1610 return ret;
1611 }
1612 return 0;
1613}
1614device_initcall(cpuhp_sysfs_init);
1615#endif
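Taken together, these attributes expose the state machine through sysfs: each CPU gets a hotplug/ directory with a read-only state file and a writable target file, and the cpu subsystem root gains hotplug/states, which lists the numeric value of every named state (typically under /sys/devices/system/cpu/). Without CONFIG_CPU_HOTPLUG_STATE_CONTROL, write_cpuhp_target() only accepts CPUHP_OFFLINE and CPUHP_ONLINE, so target behaves much like the traditional online attribute; with the debug option enabled, intermediate states can be requested, subject to the cant_stop check.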
732 1616
733/* 1617/*
734 * cpu_bit_bitmap[] is a special, "compressed" data structure that 1618 * cpu_bit_bitmap[] is a special, "compressed" data structure that
@@ -789,3 +1673,25 @@ void init_cpu_online(const struct cpumask *src)
789{ 1673{
790 cpumask_copy(&__cpu_online_mask, src); 1674 cpumask_copy(&__cpu_online_mask, src);
791} 1675}
1676
1677/*
1678 * Activate the first processor.
1679 */
1680void __init boot_cpu_init(void)
1681{
1682 int cpu = smp_processor_id();
1683
1684 /* Mark the boot cpu "present", "online" etc for SMP and UP case */
1685 set_cpu_online(cpu, true);
1686 set_cpu_active(cpu, true);
1687 set_cpu_present(cpu, true);
1688 set_cpu_possible(cpu, true);
1689}
1690
1691/*
1692 * Must be called _AFTER_ setting up the per_cpu areas
1693 */
1694void __init boot_cpu_state_init(void)
1695{
1696 per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
1697}
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 55cea189783f..9a535a86e732 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2606,28 +2606,6 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
2606} 2606}
2607 2607
2608/* 2608/*
2609 * The CPU is exiting the idle loop into the arch_cpu_idle_dead()
2610 * function. We now remove it from the rcu_node tree's ->qsmaskinit
2611 * bit masks.
2612 */
2613static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp)
2614{
2615 unsigned long flags;
2616 unsigned long mask;
2617 struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
2618 struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */
2619
2620 if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
2621 return;
2622
2623 /* Remove outgoing CPU from mask in the leaf rcu_node structure. */
2624 mask = rdp->grpmask;
2625 raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
2626 rnp->qsmaskinitnext &= ~mask;
2627 raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
2628}
2629
2630/*
2631 * The CPU has been completely removed, and some other CPU is reporting 2609 * The CPU has been completely removed, and some other CPU is reporting
2632 * this fact from process context. Do the remainder of the cleanup, 2610 * this fact from process context. Do the remainder of the cleanup,
2633 * including orphaning the outgoing CPU's RCU callbacks, and also 2611 * including orphaning the outgoing CPU's RCU callbacks, and also
@@ -4246,6 +4224,46 @@ static void rcu_prepare_cpu(int cpu)
4246 rcu_init_percpu_data(cpu, rsp); 4224 rcu_init_percpu_data(cpu, rsp);
4247} 4225}
4248 4226
4227#ifdef CONFIG_HOTPLUG_CPU
4228/*
4229 * The CPU is exiting the idle loop into the arch_cpu_idle_dead()
4230 * function. We now remove it from the rcu_node tree's ->qsmaskinit
4231 * bit masks.
4235 */
4236static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp)
4237{
4238 unsigned long flags;
4239 unsigned long mask;
4240 struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
4241 struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */
4242
4243 if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
4244 return;
4245
4246 /* Remove outgoing CPU from mask in the leaf rcu_node structure. */
4247 mask = rdp->grpmask;
4248 raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
4249 rnp->qsmaskinitnext &= ~mask;
4250 raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
4251}
4252
4253void rcu_report_dead(unsigned int cpu)
4254{
4255 struct rcu_state *rsp;
4256
4257 /* QS for any half-done expedited RCU-sched GP. */
4258 preempt_disable();
4259 rcu_report_exp_rdp(&rcu_sched_state,
4260 this_cpu_ptr(rcu_sched_state.rda), true);
4261 preempt_enable();
4262 for_each_rcu_flavor(rsp)
4263 rcu_cleanup_dying_idle_cpu(cpu, rsp);
4264}
4265#endif
4266
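rcu_report_dead() replaces the CPU_DYING_IDLE notifier case removed from rcu_cpu_notify() below: instead of being invoked through the notifier chain, it is called directly on the dying CPU via cpuhp_report_idle_dead() (added earlier in this series; see the kernel/sched/idle.c hunk further down), before the CPU disappears into arch_cpu_idle_dead().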
4249/* 4267/*
4250 * Handle CPU online/offline notification events. 4268 * Handle CPU online/offline notification events.
4251 */ 4269 */
@@ -4277,17 +4295,6 @@ int rcu_cpu_notify(struct notifier_block *self,
4277 for_each_rcu_flavor(rsp) 4295 for_each_rcu_flavor(rsp)
4278 rcu_cleanup_dying_cpu(rsp); 4296 rcu_cleanup_dying_cpu(rsp);
4279 break; 4297 break;
4280 case CPU_DYING_IDLE:
4281 /* QS for any half-done expedited RCU-sched GP. */
4282 preempt_disable();
4283 rcu_report_exp_rdp(&rcu_sched_state,
4284 this_cpu_ptr(rcu_sched_state.rda), true);
4285 preempt_enable();
4286
4287 for_each_rcu_flavor(rsp) {
4288 rcu_cleanup_dying_idle_cpu(cpu, rsp);
4289 }
4290 break;
4291 case CPU_DEAD: 4298 case CPU_DEAD:
4292 case CPU_DEAD_FROZEN: 4299 case CPU_DEAD_FROZEN:
4293 case CPU_UP_CANCELED: 4300 case CPU_UP_CANCELED:
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e5725b931bee..ea8f49ae0062 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5434,16 +5434,6 @@ static int sched_cpu_active(struct notifier_block *nfb,
5434 set_cpu_rq_start_time(); 5434 set_cpu_rq_start_time();
5435 return NOTIFY_OK; 5435 return NOTIFY_OK;
5436 5436
5437 case CPU_ONLINE:
5438 /*
5439 * At this point a starting CPU has marked itself as online via
5440 * set_cpu_online(). But it might not yet have marked itself
5441 * as active, which is essential from here on.
5442 */
5443 set_cpu_active(cpu, true);
5444 stop_machine_unpark(cpu);
5445 return NOTIFY_OK;
5446
5447 case CPU_DOWN_FAILED: 5437 case CPU_DOWN_FAILED:
5448 set_cpu_active(cpu, true); 5438 set_cpu_active(cpu, true);
5449 return NOTIFY_OK; 5439 return NOTIFY_OK;
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 544a7133cbd1..bd12c6c714ec 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -4,6 +4,7 @@
4#include <linux/sched.h> 4#include <linux/sched.h>
5#include <linux/cpu.h> 5#include <linux/cpu.h>
6#include <linux/cpuidle.h> 6#include <linux/cpuidle.h>
7#include <linux/cpuhotplug.h>
7#include <linux/tick.h> 8#include <linux/tick.h>
8#include <linux/mm.h> 9#include <linux/mm.h>
9#include <linux/stackprotector.h> 10#include <linux/stackprotector.h>
@@ -193,8 +194,6 @@ exit_idle:
193 rcu_idle_exit(); 194 rcu_idle_exit();
194} 195}
195 196
196DEFINE_PER_CPU(bool, cpu_dead_idle);
197
198/* 197/*
199 * Generic idle loop implementation 198 * Generic idle loop implementation
200 * 199 *
@@ -221,10 +220,7 @@ static void cpu_idle_loop(void)
221 rmb(); 220 rmb();
222 221
223 if (cpu_is_offline(smp_processor_id())) { 222 if (cpu_is_offline(smp_processor_id())) {
224 rcu_cpu_notify(NULL, CPU_DYING_IDLE, 223 cpuhp_report_idle_dead();
225 (void *)(long)smp_processor_id());
226 smp_mb(); /* all activity before dead. */
227 this_cpu_write(cpu_dead_idle, true);
228 arch_cpu_idle_dead(); 224 arch_cpu_idle_dead();
229 } 225 }
230 226
@@ -291,5 +287,6 @@ void cpu_startup_entry(enum cpuhp_state state)
291 boot_init_stack_canary(); 287 boot_init_stack_canary();
292#endif 288#endif
293 arch_cpu_idle_prepare(); 289 arch_cpu_idle_prepare();
290 cpuhp_online_idle(state);
294 cpu_idle_loop(); 291 cpu_idle_loop();
295} 292}
diff --git a/kernel/smp.c b/kernel/smp.c
index 300d29391e07..74165443c240 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -568,6 +568,7 @@ void __init smp_init(void)
568 unsigned int cpu; 568 unsigned int cpu;
569 569
570 idle_threads_init(); 570 idle_threads_init();
571 cpuhp_threads_init();
571 572
572 /* FIXME: This should be done in userspace --RR */ 573 /* FIXME: This should be done in userspace --RR */
573 for_each_present_cpu(cpu) { 574 for_each_present_cpu(cpu) {
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index d264f59bff56..13bc43d1fb22 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -226,7 +226,7 @@ static void smpboot_unpark_thread(struct smp_hotplug_thread *ht, unsigned int cp
226 kthread_unpark(tsk); 226 kthread_unpark(tsk);
227} 227}
228 228
229void smpboot_unpark_threads(unsigned int cpu) 229int smpboot_unpark_threads(unsigned int cpu)
230{ 230{
231 struct smp_hotplug_thread *cur; 231 struct smp_hotplug_thread *cur;
232 232
@@ -235,6 +235,7 @@ void smpboot_unpark_threads(unsigned int cpu)
235 if (cpumask_test_cpu(cpu, cur->cpumask)) 235 if (cpumask_test_cpu(cpu, cur->cpumask))
236 smpboot_unpark_thread(cur, cpu); 236 smpboot_unpark_thread(cur, cpu);
237 mutex_unlock(&smpboot_threads_lock); 237 mutex_unlock(&smpboot_threads_lock);
238 return 0;
238} 239}
239 240
240static void smpboot_park_thread(struct smp_hotplug_thread *ht, unsigned int cpu) 241static void smpboot_park_thread(struct smp_hotplug_thread *ht, unsigned int cpu)
@@ -245,7 +246,7 @@ static void smpboot_park_thread(struct smp_hotplug_thread *ht, unsigned int cpu)
245 kthread_park(tsk); 246 kthread_park(tsk);
246} 247}
247 248
248void smpboot_park_threads(unsigned int cpu) 249int smpboot_park_threads(unsigned int cpu)
249{ 250{
250 struct smp_hotplug_thread *cur; 251 struct smp_hotplug_thread *cur;
251 252
@@ -253,6 +254,7 @@ void smpboot_park_threads(unsigned int cpu)
253 list_for_each_entry_reverse(cur, &hotplug_threads, list) 254 list_for_each_entry_reverse(cur, &hotplug_threads, list)
254 smpboot_park_thread(cur, cpu); 255 smpboot_park_thread(cur, cpu);
255 mutex_unlock(&smpboot_threads_lock); 256 mutex_unlock(&smpboot_threads_lock);
257 return 0;
256} 258}
257 259
258static void smpboot_destroy_threads(struct smp_hotplug_thread *ht) 260static void smpboot_destroy_threads(struct smp_hotplug_thread *ht)
diff --git a/kernel/smpboot.h b/kernel/smpboot.h
index 72415a0eb955..485b81cfab34 100644
--- a/kernel/smpboot.h
+++ b/kernel/smpboot.h
@@ -14,7 +14,9 @@ static inline void idle_threads_init(void) { }
14#endif 14#endif
15 15
16int smpboot_create_threads(unsigned int cpu); 16int smpboot_create_threads(unsigned int cpu);
17void smpboot_park_threads(unsigned int cpu); 17int smpboot_park_threads(unsigned int cpu);
18void smpboot_unpark_threads(unsigned int cpu); 18int smpboot_unpark_threads(unsigned int cpu);
19
20void __init cpuhp_threads_init(void);
19 21
20#endif 22#endif
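The signature change is what allows the smpboot helpers to be plugged straight into the state machine: cpuhp_step callbacks are of type int (*)(unsigned int cpu), and the CPUHP_CREATE_THREADS and CPUHP_AP_SMPBOOT_THREADS entries above already use smpboot_create_threads() and smpboot_unpark_threads() as .startup callbacks, with smpboot_park_threads() available for the teardown direction. Returning an unconditional 0 keeps the old behaviour while satisfying the callback type.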
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8bfd1aca7a3d..f28f7fad452f 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1442,6 +1442,19 @@ config DEBUG_BLOCK_EXT_DEVT
1442 1442
1443 Say N if you are unsure. 1443 Say N if you are unsure.
1444 1444
1445config CPU_HOTPLUG_STATE_CONTROL
1446 bool "Enable CPU hotplug state control"
1447 depends on DEBUG_KERNEL
1448 depends on HOTPLUG_CPU
1449 default n
1450 help
1451 Allows writing individual steps between "offline" and "online" to the CPU's
1452 sysfs target file so states can be stepped through one at a time. This is a debug
1453 option for now as the hotplug machinery cannot be stopped and
1454 restarted at arbitrary points yet.
1455
1456 Say N if you are unsure.
1457
1445config NOTIFIER_ERROR_INJECTION 1458config NOTIFIER_ERROR_INJECTION
1446 tristate "Notifier error injection" 1459 tristate "Notifier error injection"
1447 depends on DEBUG_KERNEL 1460 depends on DEBUG_KERNEL