aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/md/raid1.c
diff options
context:
space:
mode:
authorNeilBrown <neilb@suse.de>2011-07-27 21:31:48 -0400
committerNeilBrown <neilb@suse.de>2011-07-27 21:31:48 -0400
commitde393cdea66cbd63c90725663f400c76faf1b255 (patch)
tree6a2bf37bee98bf7de42856f904bd23c81e082f8e /drivers/md/raid1.c
parentd7a9d443bc8a75a24873c0506f50051edfedc714 (diff)
md: make it easier to wait for bad blocks to be acknowledged.
It is only safe to choose not to write to a bad block if that bad block is safely recorded in metadata - i.e. if it has been 'acknowledged'. If it hasn't we need to wait for the acknowledgement. We support that using rdev->blocked wait and md_wait_for_blocked_rdev by introducing a new device flag 'BlockedBadBlock'. This flag is only advisory. It is cleared whenever we acknowledge a bad block, so that a waiter can re-check the particular bad blocks that it is interested it. It should be set by a caller when they find they need to wait. This (set after test) is inherently racy, but as md_wait_for_blocked_rdev already has a timeout, losing the race will have minimal impact. When we clear "Blocked" was also clear "BlockedBadBlocks" incase it was set incorrectly (see above race). We also modify the way we manage 'Blocked' to fit better with the new handling of 'BlockedBadBlocks' and to make it consistent between externally managed and internally managed metadata. This requires that each raidXd loop checks if the metadata needs to be written and triggers a write (md_check_recovery) if needed. Otherwise a queued write request might cause raidXd to wait for the metadata to write, and only that thread can write it. Before writing metadata, we set FaultRecorded for all devices that are Faulty, then after writing the metadata we clear Blocked for any device for which the Fault was certainly Recorded. The 'faulty' device flag now appears in sysfs if the device is faulty *or* it has unacknowledged bad blocks. So user-space which does not understand bad blocks can continue to function correctly. User space which does, should not assume a device is faulty until it sees the 'faulty' flag, and then sees the list of unacknowledged bad blocks is empty. Signed-off-by: NeilBrown <neilb@suse.de>
Diffstat (limited to 'drivers/md/raid1.c')
-rw-r--r--drivers/md/raid1.c3
1 files changed, 3 insertions, 0 deletions
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 8c31c39b6f8c..4d40d9d54a20 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1059,6 +1059,7 @@ static void error(mddev_t *mddev, mdk_rdev_t *rdev)
1059 conf->recovery_disabled = mddev->recovery_disabled; 1059 conf->recovery_disabled = mddev->recovery_disabled;
1060 return; 1060 return;
1061 } 1061 }
1062 set_bit(Blocked, &rdev->flags);
1062 if (test_and_clear_bit(In_sync, &rdev->flags)) { 1063 if (test_and_clear_bit(In_sync, &rdev->flags)) {
1063 unsigned long flags; 1064 unsigned long flags;
1064 spin_lock_irqsave(&conf->device_lock, flags); 1065 spin_lock_irqsave(&conf->device_lock, flags);
@@ -1751,6 +1752,8 @@ read_more:
1751 generic_make_request(r1_bio->bios[r1_bio->read_disk]); 1752 generic_make_request(r1_bio->bios[r1_bio->read_disk]);
1752 } 1753 }
1753 cond_resched(); 1754 cond_resched();
1755 if (mddev->flags & ~(1<<MD_CHANGE_PENDING))
1756 md_check_recovery(mddev);
1754 } 1757 }
1755 blk_finish_plug(&plug); 1758 blk_finish_plug(&plug);
1756} 1759}