author     Lai Jiangshan <laijs@cn.fujitsu.com>        2013-03-08 18:18:28 -0500
committer  Tejun Heo <tj@kernel.org>                   2013-03-08 18:18:28 -0500
commit     eb2834285cf172856cd12f66892fc7467935ebed (patch)
tree       bf0a3125e85b7acf2dc43e0d4f82b68459e77a36 /kernel
parent     6dbe51c251a327e012439c4772097a13df43c5b8 (diff)
workqueue: fix possible pool stall bug in wq_unbind_fn()
Since multiple pools per cpu have been introduced, wq_unbind_fn() has a subtle bug which may theoretically stall work item processing. The problem is two-fold.

* wq_unbind_fn() depends on the worker executing wq_unbind_fn() itself to start unbound chain execution, which works fine when there was only a single pool. With multiple pools, only the pool which is running wq_unbind_fn() - the highpri one - is guaranteed to have such a kick-off. The other pool could stall when its busy workers block.

* The current code sets WORKER_UNBIND / POOL_DISASSOCIATED on the two pools in succession without initiating work execution in between. Because setting the flags requires grabbing assoc_mutex, which is held while new workers are created, this could lead to stalls if a pool's manager is waiting for the previous pool's work items to release memory. This is almost purely theoretical, though.

Update wq_unbind_fn() such that it sets WORKER_UNBIND / POOL_DISASSOCIATED, goes over schedule() and explicitly kicks off execution for a pool, and then moves on to the next one.

tj: Updated comments and description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
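A note for readers skimming the diff below: the hunk starts inside the for_each_std_worker_pool() loop, so the structural change - moving the schedule() / nr_running / wake-up sequence inside the per-pool loop - is easier to see as a simplified outline. The following is a stand-alone, user-space sketch of that control flow, not kernel code; struct fake_pool, kick_off() and unbind_cpu_pools() are invented stand-ins, and only wq_unbind_fn(), wake_up_worker(), nr_running and the WORKER_UNBOUND / POOL_DISASSOCIATED flags correspond to names in the patch.

/*
 * Simplified, user-space model of the patched control flow in
 * wq_unbind_fn().  The types and helpers are illustrative stand-ins,
 * not the kernel's workqueue API.
 */
#include <stdbool.h>
#include <stdio.h>

struct fake_pool {              /* hypothetical stand-in for a worker pool */
        const char *name;
        bool disassociated;     /* models POOL_DISASSOCIATED */
        int nr_running;         /* models pool->nr_running */
};

/*
 * Stand-in for wake_up_worker(): in the kernel this wakes an idle worker
 * so the pool starts unbound chain execution on its own, instead of
 * relying on the worker that is running wq_unbind_fn().
 */
static void kick_off(struct fake_pool *pool)
{
        printf("%s pool: kicked off unbound execution\n", pool->name);
}

static void unbind_cpu_pools(struct fake_pool *pools, int n)
{
        for (int i = 0; i < n; i++) {
                struct fake_pool *pool = &pools[i];

                /* 1. mark the pool disassociated (WORKER_UNBOUND /
                 *    POOL_DISASSOCIATED in the real code) */
                pool->disassociated = true;

                /* 2. the real code calls schedule() here so scheduler
                 *    callbacks on other CPUs observe the flag */

                /* 3. zap nr_running so need_more_worker() stays true
                 *    while the worklist is non-empty */
                pool->nr_running = 0;

                /* 4. kick this pool immediately; before the patch only
                 *    the pool running wq_unbind_fn() got such a kick */
                kick_off(pool);
        }
}

int main(void)
{
        struct fake_pool pools[] = {
                { .name = "normal",  .nr_running = 1 },
                { .name = "highpri", .nr_running = 1 },
        };

        unbind_cpu_pools(pools, 2);
        return 0;
}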
Diffstat (limited to 'kernel')
-rw-r--r--  kernel/workqueue.c | 44
1 file changed, 25 insertions(+), 19 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 81f2457811eb..604801b91cba 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3446,28 +3446,34 @@ static void wq_unbind_fn(struct work_struct *work)
 
 		spin_unlock_irq(&pool->lock);
 		mutex_unlock(&pool->assoc_mutex);
-	}
 
-	/*
-	 * Call schedule() so that we cross rq->lock and thus can guarantee
-	 * sched callbacks see the %WORKER_UNBOUND flag.  This is necessary
-	 * as scheduler callbacks may be invoked from other cpus.
-	 */
-	schedule();
+		/*
+		 * Call schedule() so that we cross rq->lock and thus can
+		 * guarantee sched callbacks see the %WORKER_UNBOUND flag.
+		 * This is necessary as scheduler callbacks may be invoked
+		 * from other cpus.
+		 */
+		schedule();
 
-	/*
-	 * Sched callbacks are disabled now.  Zap nr_running.  After this,
-	 * nr_running stays zero and need_more_worker() and keep_working()
-	 * are always true as long as the worklist is not empty.  Pools on
-	 * @cpu now behave as unbound (in terms of concurrency management)
-	 * pools which are served by workers tied to the CPU.
-	 *
-	 * On return from this function, the current worker would trigger
-	 * unbound chain execution of pending work items if other workers
-	 * didn't already.
-	 */
-	for_each_std_worker_pool(pool, cpu)
-		atomic_set(&pool->nr_running, 0);
+		/*
+		 * Sched callbacks are disabled now.  Zap nr_running.
+		 * After this, nr_running stays zero and need_more_worker()
+		 * and keep_working() are always true as long as the
+		 * worklist is not empty.  This pool now behaves as an
+		 * unbound (in terms of concurrency management) pool which
+		 * are served by workers tied to the pool.
+		 */
+		atomic_set(&pool->nr_running, 0);
+
+		/*
+		 * With concurrency management just turned off, a busy
+		 * worker blocking could lead to lengthy stalls.  Kick off
+		 * unbound chain execution of currently pending work items.
+		 */
+		spin_lock_irq(&pool->lock);
+		wake_up_worker(pool);
+		spin_unlock_irq(&pool->lock);
+	}
 }
 
 /*