author | NeilBrown <neilb@suse.de> | 2015-02-01 18:44:29 -0500 |
---|---|---|
committer | NeilBrown <neilb@suse.de> | 2015-02-02 00:57:17 -0500 |
commit | b1b02fe97f75b12ab34b2303bfd4e3526d903a58 (patch) | |
tree | 62450049509e3f8413f2e3dbe52bd1f29d747e7f | |
parent | 59343cd7c4809cf7598789e1cd14563780ae4239 (diff) |
md/raid5: fix another livelock caused by non-aligned writes.
If a non-page-aligned write is destined for a device which
is missing/faulty, we can deadlock.
As the target device is missing, a read-modify-write cycle
is not possible.
As the write is not for a full page, a reconstruct-write cycle
is not possible.
This should be handled by logic in fetch_block() which notices
there is a non-R5_OVERWRITE write to a missing device, and so
loads all blocks.
However, since commit 67f455486d2ea2, that code requires
STRIPE_PREREAD_ACTIVE before it will activate, and those circumstances
never set STRIPE_PREREAD_ACTIVE.
So: in handle_stripe_dirtying, if neither rmw nor rcw was possible,
set STRIPE_DELAYED, which will cause STRIPE_PREREAD_ACTIVE to be set
after a suitable delay.
Fixes: 67f455486d2ea20b2d94d6adf5b9b783d079e321
Cc: stable@vger.kernel.org (v3.16+)
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
-rw-r--r-- | drivers/md/raid5.c | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c1b0d52bfcb0..b98765f6f77f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3195,6 +3195,11 @@ static void handle_stripe_dirtying(struct r5conf *conf,
 			(unsigned long long)sh->sector,
 			rcw, qread, test_bit(STRIPE_DELAYED, &sh->state));
 	}
+
+	if (rcw > disks && rmw > disks &&
+	    !test_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
+		set_bit(STRIPE_DELAYED, &sh->state);
+
 	/* now if nothing is locked, and if we have enough data,
 	 * we can start a write request
 	 */
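
The condition added to handle_stripe_dirtying() can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct model_dev`, `model_costs()`, `model_mark_delayed()` and the `MODEL_*` bits are hypothetical names invented for the illustration. The cost counting is an assumption modeled loosely on how raid5.c scores the two strategies (one unit per readable block that must be fetched, `2 * disks` for a block that cannot be read), so a score above `disks` means that strategy is not possible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified per-device state; not the kernel's struct r5dev. */
struct model_dev {
	bool to_write;  /* a write is queued for this device */
	bool overwrite; /* the write covers the whole page (like R5_OVERWRITE) */
	bool in_sync;   /* the device is present and readable (like R5_Insync) */
	bool uptodate;  /* the block is already cached in the stripe */
};

/* Hypothetical stand-ins for the STRIPE_PREREAD_ACTIVE / STRIPE_DELAYED bits. */
enum { MODEL_PREREAD_ACTIVE = 1, MODEL_DELAYED = 2 };

/*
 * Score both write strategies (assumed scoring, see the text above):
 *  - rmw needs the old contents of every block being written, plus parity;
 *  - rcw needs every data block that is not being fully overwritten.
 * An unreadable block costs 2 * disks, pushing the total above disks.
 */
static void model_costs(const struct model_dev *devs, int disks, int pd_idx,
			int *rmw, int *rcw)
{
	assert(disks > 0 && pd_idx >= 0 && pd_idx < disks);
	*rmw = 0;
	*rcw = 0;
	for (int i = 0; i < disks; i++) {
		const struct model_dev *d = &devs[i];

		if ((d->to_write || i == pd_idx) && !d->uptodate)
			*rmw += d->in_sync ? 1 : 2 * disks;
		if (!d->overwrite && i != pd_idx && !d->uptodate)
			*rcw += d->in_sync ? 1 : 2 * disks;
	}
}

/*
 * Mirror of the check added by the patch: if neither strategy is possible
 * and pre-read is not already active, mark the stripe as delayed so that
 * pre-read gets activated later.
 */
static unsigned int model_mark_delayed(unsigned int state, int rmw, int rcw,
				       int disks)
{
	if (rcw > disks && rmw > disks && !(state & MODEL_PREREAD_ACTIVE))
		state |= MODEL_DELAYED;
	return state;
}
```

In this model, a three-device stripe with a partial-page write queued for a missing data device scores above `disks` for both strategies, so `model_mark_delayed()` sets the delayed bit. The same write to a healthy device leaves the state unchanged, as does a stripe that already has the pre-read bit set.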