author | Oleg Nesterov <oleg@redhat.com> | 2013-08-12 12:14:00 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2013-08-13 11:19:26 -0400 |
commit | e0acd0a68ec7dbf6b7a81a87a867ebd7ac9b76c4 (patch) | |
tree | 0421e55e2d74024f1ee1949ccdd4cd92765b2560 /kernel/sched/core.c | |
parent | 584d88b2cd3b60507e708d2452651e4d3caa1b81 (diff) |
sched: fix the theoretical signal_wake_up() vs schedule() race

This is only theoretical, but after try_to_wake_up(p) was changed
to check p->state under p->pi_lock, code like

	__set_current_state(TASK_INTERRUPTIBLE);
	schedule();

can miss a signal. This is a special case of wait-for-condition:
it relies on the try_to_wake_up()/schedule() interaction and thus
does not need mb() between __set_current_state() and the
if (signal_pending) check.

However, this __set_current_state() can move into the critical
section protected by rq->lock. Now that try_to_wake_up() takes
another lock, we need to ensure that it can't be reordered with
the "if (signal_pending(current))" check inside that section.

The patch is actually a one-liner: it simply adds smp_wmb() before
spin_lock_irq(rq->lock). This is what try_to_wake_up() already
does for the same reason.

We turn this wmb() into a new helper, smp_mb__before_spinlock(),
for better documentation and to allow architectures to change
the default implementation.

While at it, kill smp_mb__after_lock(); it has no callers.

Perhaps we can also add smp_mb__before/after_spinunlock() for
prepare_to_wait().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
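
The reordering described in the commit message can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions, not kernel code: each side is reduced to one store and one load, memory is treated as sequentially consistent apart from the one reordering being studied, and the names `enum op`, `wakeup_lost()`, `pos()` and `count_lost()` are hypothetical. The model enumerates every interleaving of the four operations and counts those in which the sleeper sees no signal while the waker sees a task that is not yet sleeping.

```c
#include <assert.h>
#include <stdbool.h>

/* The four memory operations of the simplified model. */
enum op {
	S_STORE_STATE,	/* sleeper: __set_current_state(TASK_INTERRUPTIBLE) */
	S_LOAD_SIGNAL,	/* sleeper: signal_pending() check inside schedule() */
	W_STORE_SIGNAL,	/* waker: mark the signal as pending */
	W_LOAD_STATE,	/* waker: p->state check in try_to_wake_up() */
};

/* Execute one global ordering; return true if the wakeup is lost. */
static bool wakeup_lost(const enum op order[4])
{
	bool state_sleeping = false;	/* models p->state != TASK_RUNNING */
	bool signal_pending = false;	/* models the pending-signal flag */
	bool sleeper_saw_signal = false;
	bool waker_saw_sleeping = false;

	for (int i = 0; i < 4; i++) {
		switch (order[i]) {
		case S_STORE_STATE:  state_sleeping = true; break;
		case S_LOAD_SIGNAL:  sleeper_saw_signal = signal_pending; break;
		case W_STORE_SIGNAL: signal_pending = true; break;
		case W_LOAD_STATE:   waker_saw_sleeping = state_sleeping; break;
		}
	}
	/* Lost: the sleeper blocks and the waker finds nobody to wake. */
	return !sleeper_saw_signal && !waker_saw_sleeping;
}

/* Position of one operation within an ordering. */
static int pos(const enum op order[4], enum op which)
{
	for (int i = 0; i < 4; i++)
		if (order[i] == which)
			return i;
	assert(!"operation missing from ordering");
	return -1;
}

/*
 * Count lost-wakeup orderings. The waker is always kept in program
 * order; the sleeper is kept in program order only if sleeper_ordered
 * is true, which models the barrier added to schedule().
 */
int count_lost(bool sleeper_ordered)
{
	int lost = 0;

	for (int a = 0; a < 4; a++)
	for (int b = 0; b < 4; b++)
	for (int c = 0; c < 4; c++)
	for (int d = 0; d < 4; d++) {
		if (a == b || a == c || a == d || b == c || b == d || c == d)
			continue;	/* not a permutation */

		enum op order[4] = { (enum op)a, (enum op)b, (enum op)c, (enum op)d };

		if (pos(order, W_STORE_SIGNAL) > pos(order, W_LOAD_STATE))
			continue;	/* waker reordered: excluded */
		if (sleeper_ordered &&
		    pos(order, S_STORE_STATE) > pos(order, S_LOAD_SIGNAL))
			continue;	/* sleeper reordered: excluded */
		if (wakeup_lost(order))
			lost++;
	}
	return lost;
}
```

In this model `count_lost(true)` returns 0 and `count_lost(false)` returns 1. The single failing ordering is the one where the sleeper's load of the signal flag runs before its own store to the task state. The real kernel situation is subtler, since it involves two different locks and the acquire semantics of each, so the model only shows why the store and the load must stay ordered.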
Diffstat (limited to 'kernel/sched/core.c')
-rw-r--r-- | kernel/sched/core.c | 14 |
1 files changed, 13 insertions, 1 deletions
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b7c32cb7bfeb..ef51b0ef4bdc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1491,7 +1491,13 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	unsigned long flags;
 	int cpu, success = 0;
 
-	smp_wmb();
+	/*
+	 * If we are going to wake up a thread waiting for CONDITION we
+	 * need to ensure that CONDITION=1 done by the caller can not be
+	 * reordered with p->state check below. This pairs with mb() in
+	 * set_current_state() the waiting thread does.
+	 */
+	smp_mb__before_spinlock();
 	raw_spin_lock_irqsave(&p->pi_lock, flags);
 	if (!(p->state & state))
 		goto out;
@@ -2394,6 +2400,12 @@ need_resched:
 	if (sched_feat(HRTICK))
 		hrtick_clear(rq);
 
+	/*
+	 * Make sure that signal_pending_state()->signal_pending() below
+	 * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
+	 * done by the caller to avoid the race with signal_wake_up().
+	 */
+	smp_mb__before_spinlock();
 	raw_spin_lock_irq(&rq->lock);
 
 	switch_count = &prev->nivcsw;
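
The second hunk can be pictured with a userspace sketch of the schedule() prologue. Everything here is a simplified model rather than the kernel implementation: `struct task_model`, `struct rq_model`, `rq_lock()`, `rq_unlock()` and `schedule_model()` are hypothetical names, the lock is a plain C11 `atomic_flag` spinlock, and the overridable `smp_mb__before_spinlock()` default is a full C11 fence chosen for the sketch. The commit message describes the kernel default as a wmb().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TASK_RUNNING		0
#define TASK_INTERRUPTIBLE	1

/* Overridable default, mirroring the idea that an architecture may
 * supply its own definition. The full fence is an assumption made
 * for this userspace model. */
#ifndef smp_mb__before_spinlock
#define smp_mb__before_spinlock() atomic_thread_fence(memory_order_seq_cst)
#endif

struct task_model {
	atomic_long state;	/* models p->state */
	atomic_bool sigpending;	/* models the pending-signal flag */
};

struct rq_model {
	atomic_flag lock;	/* models rq->lock */
};

static void rq_lock(struct rq_model *rq)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&rq->lock, memory_order_acquire))
		;
}

static void rq_unlock(struct rq_model *rq)
{
	atomic_flag_clear_explicit(&rq->lock, memory_order_release);
}

/*
 * Model of the check made under rq->lock. Returns true if the task
 * would be taken off the runqueue, false if it stays runnable.
 */
bool schedule_model(struct rq_model *rq, struct task_model *prev)
{
	bool deactivated = false;

	/* Order the caller's earlier state store against what follows. */
	smp_mb__before_spinlock();
	rq_lock(rq);

	long state = atomic_load_explicit(&prev->state, memory_order_relaxed);
	if (state != TASK_RUNNING) {
		if ((state & TASK_INTERRUPTIBLE) &&
		    atomic_load_explicit(&prev->sigpending, memory_order_relaxed))
			/* A signal is pending: do not sleep. */
			atomic_store_explicit(&prev->state, TASK_RUNNING,
					      memory_order_relaxed);
		else
			deactivated = true;
	}

	rq_unlock(rq);
	return deactivated;
}
```

With the state set to TASK_INTERRUPTIBLE and a signal pending, `schedule_model()` resets the state to TASK_RUNNING and returns false. With no signal pending it returns true and leaves the state unchanged. Keeping the barrier behind a macro guarded by `#ifndef` is one way to let a platform substitute a cheaper or stronger primitive without touching the call sites.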