author | Preeti U Murthy <preeti@linux.vnet.ibm.com> | 2015-03-30 05:29:19 -0400 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2015-04-02 08:25:39 -0400 |
commit | 345527b1edce8df719e0884500c76832a18211c3 (patch) | |
tree | 386a6b25b2437bd94cf63df6d02d95f729eab7cc /kernel/time/tick-broadcast.c | |
parent | 9eed56e889d8a0bb7870e1216d8d4326dd63ec50 (diff) |
clockevents: Fix cpu_down() race for hrtimer based broadcasting
A hotplug stress test on POWER showed that the machine hit either
softlockups or rcu_sched stall warnings. The issue was traced to
commit:
7cba160ad789 ("powernv/cpuidle: Redesign idle states management")
which exposed the cpu_down() race with hrtimer based broadcast mode:
5d1638acb9f6 ("tick: Introduce hrtimer based broadcast")
The race is the following:
Assume CPU1 is the CPU which holds the hrtimer broadcasting duty
before it is taken down.
  CPU0                                CPU1

  cpu_down()                          take_cpu_down()
                                      disable_interrupts()

  cpu_die()

  while (CPU1 != CPU_DEAD) {
        msleep(100);
        switch_to_idle();
        stop_cpu_timer();
        schedule_broadcast();
  }

  tick_cleanup_cpu_dead()
        take_over_broadcast()
Once CPU1 has disabled interrupts it can no longer handle the
broadcast hrtimer, so CPU0 is never woken from msleep() and stays
stuck forever.
Fix this by explicitly taking over broadcast duty before cpu_die().
This is a temporary workaround. What we really want is a callback
in the clockevent device which allows us to do that from the dying
CPU by pushing the hrtimer onto a different cpu. That might involve
an IPI and is definitely more complex than this immediate fix.
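The ordering problem described above can be illustrated with a small user-space toy model. This is only a sketch and not kernel code: the toy_* names, the two-CPU layout, and the reduction of "can handle the broadcast hrtimer" to a per-CPU boolean are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2

/* Toy model (hypothetical, not kernel code): per-CPU interrupt state
 * plus the CPU whose hrtimer currently backs the broadcast. */
struct toy_system {
	bool irqs_enabled[TOY_NR_CPUS];
	int broadcast_owner;
};

/* A sleeping CPU is woken only if the broadcast owner can still
 * take timer interrupts. */
static bool toy_wakeup_delivered(const struct toy_system *sys)
{
	return sys->irqs_enabled[sys->broadcast_owner];
}

/* Models take_cpu_down(): the dying CPU disables interrupts. */
static void toy_take_cpu_down(struct toy_system *sys, int cpu)
{
	sys->irqs_enabled[cpu] = false;
}

/* Models the fix: the surviving CPU pulls the broadcast duty over
 * if the dead CPU was holding it. */
static void toy_broadcast_pull(struct toy_system *sys, int this_cpu, int deadcpu)
{
	assert(this_cpu != deadcpu);
	if (sys->broadcast_owner == deadcpu)
		sys->broadcast_owner = this_cpu;
}

/* Returns true if CPU0's msleep() in the wait loop would be woken.
 * CPU1 starts out holding the broadcast duty, as in the changelog. */
static bool toy_cpu_down(bool pull_before_wait)
{
	struct toy_system sys = {
		.irqs_enabled = { true, true },
		.broadcast_owner = 1,
	};

	toy_take_cpu_down(&sys, 1);
	if (pull_before_wait)
		toy_broadcast_pull(&sys, 0, 1);
	return toy_wakeup_delivered(&sys);
}
```

In this model, toy_cpu_down(false) corresponds to the old ordering, where the takeover only happens after the wait loop and the wakeup is never delivered, while toy_cpu_down(true) corresponds to pulling the broadcast duty before waiting.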
Changelog was picked up from:
https://lkml.org/lkml/2015/2/16/213
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: nicolas.pitre@linaro.org
Cc: peterz@infradead.org
Cc: rjw@rjwysocki.net
Fixes: http://linuxppc.10917.n7.nabble.com/offlining-cpus-breakage-td88619.html
Link: http://lkml.kernel.org/r/20150330092410.24979.59887.stgit@preeti.in.ibm.com
[ Merged it to the latest timer tree, renamed the callback, tidied up the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'kernel/time/tick-broadcast.c')
-rw-r--r-- | kernel/time/tick-broadcast.c | 19 |
1 files changed, 11 insertions, 8 deletions
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 19cfb381faa9..f5e0fd5652dc 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -680,14 +680,19 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
 	clockevents_set_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
 }
 
-static void broadcast_move_bc(int deadcpu)
+void hotplug_cpu__broadcast_tick_pull(int deadcpu)
 {
-	struct clock_event_device *bc = tick_broadcast_device.evtdev;
+	struct clock_event_device *bc;
+	unsigned long flags;
 
-	if (!bc || !broadcast_needs_cpu(bc, deadcpu))
-		return;
-	/* This moves the broadcast assignment to this cpu */
-	clockevents_program_event(bc, bc->next_event, 1);
+	raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
+	bc = tick_broadcast_device.evtdev;
+
+	if (bc && broadcast_needs_cpu(bc, deadcpu)) {
+		/* This moves the broadcast assignment to this CPU: */
+		clockevents_program_event(bc, bc->next_event, 1);
+	}
+	raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
 }
 
 /*
@@ -924,8 +929,6 @@ void tick_shutdown_broadcast_oneshot(unsigned int *cpup)
 	cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
 	cpumask_clear_cpu(cpu, tick_broadcast_force_mask);
 
-	broadcast_move_bc(cpu);
-
 	raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
 }
 