author | Peter Zijlstra <peterz@infradead.org> | 2018-04-30 08:51:01 -0400 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2018-05-04 01:54:54 -0400 |
commit | b5bf9a90bbebffba888c9144c5a8a10317b04064 (patch) | |
tree | 4c059f0785c26ca66df0009544a23b68f7228be3 /kernel/sched | |
parent | 85f1abe0019fcb3ea10df7029056cf42702283a8 (diff) |
sched/core: Introduce set_special_state()
Gaurav reported a perceived problem with TASK_PARKED, which turned out
to be a broken wait-loop pattern in __kthread_parkme(), but the
reported issue can (and does) in fact happen for states that do not do
condition based sleeps.
When the 'current->state = TASK_RUNNING' store of a previous
(concurrent) try_to_wake_up() collides with the setting of a 'special'
sleep state, we can lose the sleep state.
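The collision can be replayed in a single-threaded userspace model. This is only a sketch under stated assumptions: `struct task`, `wakeup_check()` and `wakeup_store()` are hypothetical names rather than kernel interfaces, the state values are arbitrary, and splitting the wakeup into two calls merely stands in for the delay inside try_to_wake_up().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for kernel task states; the values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE, TASK_DEAD };

/* Hypothetical, minimal model of a task: only the state word matters here. */
struct task {
	enum task_state state;
};

/* A wakeup split in two so the delay between its halves can be replayed. */
struct pending_wakeup {
	struct task *p;
	bool store_pending;
};

/* First half: the waker observes a sleeping task and commits to waking it. */
static struct pending_wakeup wakeup_check(struct task *p)
{
	struct pending_wakeup w = { p, p->state != TASK_RUNNING };
	return w;
}

/* Second half: the delayed TASK_RUNNING store. */
static void wakeup_store(struct pending_wakeup w)
{
	if (w.store_pending)
		w.p->state = TASK_RUNNING;
}

/* Replays the bad interleaving and returns the state the task ends up in. */
static enum task_state replay_lost_special_state(void)
{
	struct task t = { TASK_UNINTERRUPTIBLE };

	/* Waker: sees the task in a sleep state, then gets delayed. */
	struct pending_wakeup w = wakeup_check(&t);

	/* Task: carries on by itself and enters a special state with a plain store. */
	t.state = TASK_RUNNING;
	t.state = TASK_DEAD;	/* models __set_current_state(TASK_DEAD) */
	assert(t.state == TASK_DEAD);

	/* Waker: resumes and completes its store, clobbering TASK_DEAD. */
	wakeup_store(w);

	return t.state;
}
```

In this model `replay_lost_special_state()` returns `TASK_RUNNING`: the special state written by the task is overwritten by the late store.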
Normal condition based wait-loops are immune to this problem, but
sleep states that are not condition based are subject to it.
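Why a condition based wait-loop tolerates a stray TASK_RUNNING store can be illustrated with a small scripted model. It is a sketch, not kernel code: `struct waiter`, `deliver_wakeup()` and the `script` array are hypothetical, and each scripted entry stands in for one return from schedule().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for kernel task states; the values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

/* Scripted wakeups delivered while the modelled task waits. */
enum wake_kind { STRAY_WAKEUP, REAL_WAKEUP };

/* Hypothetical model of a waiting task and the condition it waits for. */
struct waiter {
	enum task_state state;
	bool condition;
};

/* Every wakeup stores TASK_RUNNING; only a real one also sets the condition. */
static void deliver_wakeup(struct waiter *w, enum wake_kind kind)
{
	if (kind == REAL_WAKEUP)
		w->condition = true;
	w->state = TASK_RUNNING;
}

/*
 * Condition based wait-loop. Returns how many times the task had to go
 * back to sleep before it saw the condition.
 */
static size_t wait_for_condition(struct waiter *w,
				 const enum wake_kind *script, size_t n)
{
	size_t sleeps = 0;

	for (;;) {
		w->state = TASK_UNINTERRUPTIBLE;	/* models set_current_state() */
		if (w->condition)
			break;
		assert(sleeps < n);	/* the script must end with a real wakeup */
		deliver_wakeup(w, script[sleeps]);	/* models schedule() returning */
		sleeps++;
	}
	w->state = TASK_RUNNING;	/* models __set_current_state() */
	return sleeps;
}

/* Two stray wakeups followed by the real one. */
static size_t replay_stray_wakeups(void)
{
	const enum wake_kind script[] = { STRAY_WAKEUP, STRAY_WAKEUP, REAL_WAKEUP };
	struct waiter w = { TASK_RUNNING, false };
	size_t sleeps = wait_for_condition(&w, script, 3);

	assert(w.condition);	/* the loop never exits early */
	return sleeps;
}
```

Each stray store costs one extra trip around the loop, because the sleep state is set again and the condition is rechecked. A state such as TASK_DEAD has no loop around it to do that.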
There already is a fix for TASK_DEAD. Abstract that and also apply it
to TASK_STOPPED and TASK_TRACED, both of which also lack a condition
based wait-loop.
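The effect of the fix can be sketched with the same kind of single-threaded replay. The definition of set_special_state() lies outside this kernel/sched-limited diff, so the model assumes it performs the state store with pi_lock held, and that try_to_wake_up() holds pi_lock across its check and store (as the comment removed below describes). `pi_lock_held`, `wakeup_begin()`, `wakeup_finish()` and `try_set_special_state()` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for kernel task states; the values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE, TASK_DEAD };

/* Hypothetical model: 'pi_lock_held' stands in for the spinlock tsk->pi_lock. */
struct task {
	enum task_state state;
	bool pi_lock_held;
};

/* Waker, first half: take pi_lock and check that the task is asleep. */
static bool wakeup_begin(struct task *p)
{
	assert(!p->pi_lock_held);
	p->pi_lock_held = true;
	return p->state != TASK_RUNNING;
}

/* Waker, second half: the delayed store, then drop pi_lock. */
static void wakeup_finish(struct task *p, bool do_store)
{
	if (do_store)
		p->state = TASK_RUNNING;
	p->pi_lock_held = false;
}

/*
 * Model of a state store done under pi_lock. Returns false when the lock
 * is taken, which corresponds to the real task spinning on pi_lock.
 */
static bool try_set_special_state(struct task *p, enum task_state state)
{
	if (p->pi_lock_held)
		return false;
	p->pi_lock_held = true;
	p->state = state;
	p->pi_lock_held = false;
	return true;
}

/* Replays the same interleaving as before, now with the locked store. */
static enum task_state replay_with_special_state(void)
{
	struct task t = { TASK_UNINTERRUPTIBLE, false };

	/* Waker: takes pi_lock, sees a sleeping task, then gets delayed. */
	bool do_store = wakeup_begin(&t);

	/* Task: cannot store TASK_DEAD while the wakeup is in flight. */
	bool stored_early = try_set_special_state(&t, TASK_DEAD);
	assert(!stored_early);
	(void)stored_early;

	/* Waker: completes its store and releases pi_lock. */
	wakeup_finish(&t, do_store);

	/* Task: the lock is free, so TASK_DEAD now lands after the wakeup. */
	bool stored = try_set_special_state(&t, TASK_DEAD);
	assert(stored);
	(void)stored;

	return t.state;
}
```

Here `replay_with_special_state()` returns `TASK_DEAD`: the lock orders the special-state store after any in-flight wakeup store, so it can no longer be overwritten.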
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'kernel/sched')
-rw-r--r-- | kernel/sched/core.c | 17 |
1 files changed, 1 insertions, 16 deletions
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7ad60e00a6a8..ffde9eebc846 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3508,23 +3508,8 @@ static void __sched notrace __schedule(bool preempt)
 
 void __noreturn do_task_dead(void)
 {
-	/*
-	 * The setting of TASK_RUNNING by try_to_wake_up() may be delayed
-	 * when the following two conditions become true.
-	 *   - There is race condition of mmap_sem (It is acquired by
-	 *     exit_mm()), and
-	 *   - SMI occurs before setting TASK_RUNINNG.
-	 *     (or hypervisor of virtual machine switches to other guest)
-	 *  As a result, we may become TASK_RUNNING after becoming TASK_DEAD
-	 *
-	 * To avoid it, we have to wait for releasing tsk->pi_lock which
-	 * is held by try_to_wake_up()
-	 */
-	raw_spin_lock_irq(&current->pi_lock);
-	raw_spin_unlock_irq(&current->pi_lock);
-
 	/* Causes final put_task_struct in finish_task_switch(): */
-	__set_current_state(TASK_DEAD);
+	set_special_state(TASK_DEAD);
 
 	/* Tell freezer to ignore us: */
 	current->flags |= PF_NOFREEZE;