author | NeilBrown <neilb@suse.de> | 2011-06-28 02:59:42 -0400 |
---|---|---|
committer | NeilBrown <neilb@suse.de> | 2011-06-28 02:59:42 -0400 |
commit | 4274215d24633df7302069e51426659d4759c5ed | |
tree | c21fff5f11201eaaea0e44cf81a38df21dd63ffd /drivers/md | |
parent | 2992c4bd5742b31a0ee00a76eee9c1c284507418 | |
md: avoid endless recovery loop when waiting for failed device to complete.
If a device fails in a way that causes pending requests to take a while
to complete, md will not be able to immediately remove it from the
array in remove_and_add_spares.
The device will then incorrectly look like a spare, and md will try to
recover it even though it has failed.
This leads to a recovery process starting and instantly aborting over
and over again.
We should check if the device is faulty before considering it to be a
spare. This will avoid trying to start a recovery that cannot
proceed.
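
As an illustration of the check described above, here is a minimal user-space sketch of the spare-counting predicate. It is not the kernel code: the `sketch_*` names, the `SK_*` flag values, and the simplified struct are hypothetical stand-ins for `mdk_rdev_t`, its flag bits, and `test_bit`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's rdev flag bits. */
enum sketch_flag { SK_IN_SYNC = 0, SK_FAULTY = 1, SK_BLOCKED = 2 };

/* Simplified model of a member device (assumption, not the kernel struct). */
struct sketch_rdev {
	int raid_disk;        /* slot in the array, or -1 if not assigned */
	unsigned long flags;  /* bitmask indexed by enum sketch_flag */
};

/* User-space replacement for the kernel's test_bit(). */
static bool sketch_test_bit(enum sketch_flag bit, unsigned long flags)
{
	return (flags >> bit) & 1UL;
}

/*
 * A device counts as a spare needing recovery only if it holds a slot,
 * is not yet in sync, has not failed, and is not blocked.
 */
static bool sketch_needs_recovery(const struct sketch_rdev *rdev)
{
	return rdev->raid_disk >= 0 &&
	       !sketch_test_bit(SK_IN_SYNC, rdev->flags) &&
	       !sketch_test_bit(SK_FAULTY, rdev->flags) &&
	       !sketch_test_bit(SK_BLOCKED, rdev->flags);
}

/* Count the devices for which a recovery pass would be worthwhile. */
static int sketch_count_spares(const struct sketch_rdev *rdevs, size_t n)
{
	int spares = 0;

	for (size_t i = 0; i < n; i++)
		if (sketch_needs_recovery(&rdevs[i]))
			spares++;
	return spares;
}
```

In this model, a failed device that still occupies a slot is no longer counted, so it cannot by itself trigger a recovery attempt.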
This bug was introduced in 2.6.26, so this patch is suitable for any
kernel since then.
Cc: stable@kernel.org
Reported-by: Jim Paradis <james.paradis@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Diffstat (limited to 'drivers/md')
-rw-r--r-- | drivers/md/md.c | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4332fc2f25d4..91e31e260b4a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7088,6 +7088,7 @@ static int remove_and_add_spares(mddev_t *mddev)
 		list_for_each_entry(rdev, &mddev->disks, same_set) {
 			if (rdev->raid_disk >= 0 &&
 			    !test_bit(In_sync, &rdev->flags) &&
+			    !test_bit(Faulty, &rdev->flags) &&
 			    !test_bit(Blocked, &rdev->flags))
 				spares++;
 			if (rdev->raid_disk < 0