author Eivind Sarto <eivindsarto@gmail.com> 2014-05-27 23:39:23 -0400
committer NeilBrown <neilb@suse.de> 2014-05-29 02:59:46 -0400
commit cf170f3fa451350e431314e1a0a52014fda4b2d6
tree 575178e9fb3a2324c9cba1b20f644e01bb9baa41 /drivers/md
parent 8b32bf5e37328c0ef267bc95d73b55e52f72ac77
raid5: avoid release list until last reference of the stripe

The (lockless) release_list reduces lock contention, but there is excessive queueing and dequeuing of stripes on this list. A stripe will currently be queued on the release_list with a stripe reference count > 1. This can cause the raid5 kernel thread(s) to dequeue the stripe and decrement the refcount without doing any other useful processing of the stripe. There are two cases in which the stripe can be put on the release_list multiple times before it is actually handled by the kernel thread(s).

1) make_request() activates the stripe processing in 4k increments. When a write request is large enough to span multiple chunks of a stripe_head, the first 4k chunk adds the stripe to the plug list. The next 4k chunk that is processed for the same stripe puts the stripe on the release_list with a refcount=2. This can cause the kernel thread to process and decrement the stripe before the stripe is unplugged, which again will put it back on the release_list.

2) Whenever IO is scheduled on a stripe (pre-read and/or write), the stripe refcount is set to the number of active IOs (one for each chunk). The stripe is released as each IO completes, and can be queued and dequeued multiple times on the release_list until its refcount finally reaches zero.

This simple patch will ensure a stripe is only queued on the release_list when its refcount=1 and it is ready to be handled by the kernel thread(s).

I added some instrumentation to raid5 and counted the number of times stripes were queued on the release_list for a variety of write IO sizes. Without this patch the number of times stripes got queued on the release_list was 100-500% higher than with the patch. The excess queuing will increase with the IO size. The patch also improved throughput by 5-10%.

Signed-off-by: Eivind Sarto <esarto@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>

Diffstat (limited to 'drivers/md')
-rw-r--r-- drivers/md/raid5.c 5
1 files changed, 5 insertions, 0 deletions
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c1e8607d8340..348a857ab0ff 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -416,6 +416,11 @@ static void release_stripe(struct stripe_head *sh)
 	int hash;
 	bool wakeup;
 
+	/* Avoid release_list until the last reference.
+	 */
+	if (atomic_add_unless(&sh->count, -1, 1))
+		return;
+
 	if (unlikely(!conf->mddev->thread) ||
 		test_and_set_bit(STRIPE_ON_RELEASE_LIST, &sh->state))
 		goto slow_path;
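
The early return added above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the raid5 code: add_unless() is a C11 stand-in for the kernel's atomic_add_unless(), and struct mini_stripe, mini_release() and demo_release_all() are hypothetical names introduced only for this illustration. It assumes that every release except the last one simply drops a reference, and that the last holder leaves the count at 1 and hands the stripe on for queueing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in for the kernel's atomic_add_unless(v, a, u):
 * atomically add a to *v unless *v == u.
 * Returns true if the add was performed. */
static bool add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		/* On failure, cur is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}

/* Hypothetical miniature of a stripe: a refcount plus a counter of
 * how often the queueing path was reached. */
struct mini_stripe {
	atomic_int count;
	int queued;
};

/* Mirrors the shape of the patched release path.
 * Returns true only when the caller held the last reference. */
static bool mini_release(struct mini_stripe *sh)
{
	if (add_unless(&sh->count, -1, 1))
		return false;	/* other references remain: nothing is queued */

	sh->queued++;		/* count is still 1: stripe would be queued now */
	return true;
}

/* Take refs references, release them all, and report how many of the
 * releases reached the queueing path. */
int demo_release_all(int refs)
{
	struct mini_stripe sh = { .queued = 0 };

	assert(refs >= 1);
	atomic_init(&sh.count, refs);
	for (int i = 0; i < refs; i++)
		mini_release(&sh);
	return sh.queued;
}
```

In this model, demo_release_all(3) reaches the queueing path once: the first two releases only decrement the count, and the third finds it at 1. That is the behaviour the commit message describes as queueing the stripe only when refcount=1.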