author    NeilBrown <neilb@suse.de>	2014-12-08 12:39:16 -0500
committer Tejun Heo <tj@kernel.org>	2014-12-08 12:39:16 -0500
commit    008847f66c38712f2819cd956969519006ebc11d
tree      490309d3182856041024ec4b88e03fc1dab57680 /kernel/workqueue.c
parent    b2d829096bee7eaf7be31b6229bf722e503adfd8
workqueue: allow rescuer thread to do more work.
When there is serious memory pressure, all workers in a pool could be
blocked, and a new thread cannot be created because it requires memory
allocation.

In this situation a WQ_MEM_RECLAIM workqueue will wake up the rescuer
thread to do some work.

The rescuer will only handle requests that are already on ->worklist.
If max_requests is 1, that means it will handle a single request.

The rescuer will be woken again in 100ms to handle another
max_requests requests.

I've seen a machine (running a 3.0 based "enterprise" kernel) with
thousands of requests queued for xfslogd, which has a max_requests of
1, and is needed for retiring all 'xfs' write requests.  When one of
the worker pools gets into this state, it progresses extremely slowly
and possibly never recovers (only waited an hour or two).

With this patch we leave a pool_workqueue on the mayday list until it
is clearly no longer in need of assistance.  This allows all requests
to be handled in a timely fashion.

We keep each pool_workqueue on the mayday list until
need_to_create_worker() is false, and no work for this workqueue is
found in the pool.

I have tested this in combination with a (hackish) patch which forces
all work items to be handled by the rescuer thread.  In that context
it significantly improves performance.  A similar patch for a 3.0
kernel significantly improved performance on a heavy work load.

Thanks to Jan Kara for some design ideas, and to Dongsu Park for some
comments and testing.

tj: Inverted the lock order between wq_mayday_lock and pool->lock with
    a preceding patch and simplified this patch.  Added comment and
    updated changelog accordingly.  Dongsu spotted missing get_pwq()
    in the simplified code.

Cc: Dongsu Park <dongsu.park@profitbricks.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
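[Editorial context: the rescuer described above only exists for workqueues
created with WQ_MEM_RECLAIM.  Below is a minimal sketch of such a user,
loosely modelled on the xfslogd case from the changelog; the my_fs_* names
and surrounding boilerplate are illustrative, not taken from this patch.]

    #include <linux/workqueue.h>
    #include <linux/errno.h>

    /* Hypothetical owner of a reclaim-safe workqueue. */
    static struct workqueue_struct *my_fs_log_wq;

    static int my_fs_init(void)
    {
            /*
             * WQ_MEM_RECLAIM guarantees a pre-allocated rescuer thread,
             * so work queued here can make forward progress even when
             * new worker threads cannot be created under memory
             * pressure.  A max_active of 1 mirrors xfslogd: before this
             * patch, the rescuer would retire a single item per mayday
             * wakeup and then wait up to MAYDAY_INTERVAL (100ms) for
             * the next one.
             */
            my_fs_log_wq = alloc_workqueue("my_fs-log", WQ_MEM_RECLAIM, 1);
            if (!my_fs_log_wq)
                    return -ENOMEM;
            return 0;
    }

    static void my_fs_exit(void)
    {
            destroy_workqueue(my_fs_log_wq);
    }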
Diffstat (limited to 'kernel/workqueue.c')
 kernel/workqueue.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 3992cf6c3ee3..6202b08f1933 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2253,7 +2253,25 @@ repeat:
 			if (get_work_pwq(work) == pwq)
 				move_linked_works(work, scheduled, &n);
 
-		process_scheduled_works(rescuer);
+		if (!list_empty(scheduled)) {
+			process_scheduled_works(rescuer);
+
+			/*
+			 * The above execution of rescued work items could
+			 * have created more to rescue through
+			 * pwq_activate_first_delayed() or chained
+			 * queueing.  Let's put @pwq back on mayday list so
+			 * that such back-to-back work items, which may be
+			 * being used to relieve memory pressure, don't
+			 * incur MAYDAY_INTERVAL delay inbetween.
+			 */
+			if (need_to_create_worker(pool)) {
+				spin_lock(&wq_mayday_lock);
+				get_pwq(pwq);
+				list_move_tail(&pwq->mayday_node, &wq->maydays);
+				spin_unlock(&wq_mayday_lock);
+			}
+		}
 
 		/*
 		 * Put the reference grabbed by send_mayday().  @pool won't
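
[Editorial context: the re-arm condition in the new code is
need_to_create_worker().  Condensed and paraphrased from the helpers in
kernel/workqueue.c of this era (not part of this patch), it reduces to
roughly the following.]

    /* Pending work exists and nothing is currently running it. */
    static bool need_more_worker(struct worker_pool *pool)
    {
            return !list_empty(&pool->worklist) &&
                   !atomic_read(&pool->nr_running);
    }

    /* An idle worker exists that could start processing work. */
    static bool may_start_working(struct worker_pool *pool)
    {
            return pool->nr_idle;
    }

    static bool need_to_create_worker(struct worker_pool *pool)
    {
            return need_more_worker(pool) && !may_start_working(pool);
    }

So after this patch the pwq stays on ->maydays, and the rescuer keeps
servicing it back-to-back, until the pool either drains its worklist or
regains an idle worker, at which point normal workers can take over and
the MAYDAY_INTERVAL delay no longer gates progress.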