author | Menny Hamburger <Menny_Hamburger@Dell.com> | 2010-12-16 14:57:07 -0500 |
---|---|---|
committer | James Bottomley <James.Bottomley@suse.de> | 2010-12-21 13:37:27 -0500 |
commit | db422318cbca55168cf965f655471dbf8be82433 (patch) | |
tree | 2d433a285c4ff23a2be684f5b8e88ed2415d7d5e /drivers | |
parent | 35dd3039e09cd46ca3a8733ff1c817bf7b7b19ce (diff) |
[SCSI] scsi_dh: propagate SCSI device deletion
Currently, when scsi_dh_activate() returns with an error
(e.g. SCSI_DH_NOSYS) the activate_complete callback is not called and
the error is not propagated to DM mpath.
When a SCSI device attached to a device handler is deleted, userland
processes currently performing I/O on the device will have their I/O
hang forever.
- Set the SCSI_DH_NOSYS error when the handler is in the process of being
  deleted (e.g. the SCSI device is in an SDEV_CANCEL or SDEV_DEL state).
- Set the SCSI_DH_DEV_OFFLINED error when the device is in the SDEV_OFFLINE
  state.
- Call the activate_complete callback function directly from
  scsi_dh_activate() if an error has been set (when the scsi_dh internal
  data has either already been deleted or is in the process of being
  deleted).
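To illustrate the three points above, here is a minimal user-space sketch of the error path. It is not the kernel code: `mock_sdev`, `mock_dh_activate()` and `record_err()` are hypothetical names, the numeric values of the error codes are placeholders, and the NULL check on the device is an addition made for the sketch. Only the state names, the error names and the `fn(data, err)` callback shape are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder stand-ins for the kernel definitions (values are assumed). */
enum sdev_state { SDEV_RUNNING, SDEV_CANCEL, SDEV_DEL, SDEV_OFFLINE };
enum { SCSI_DH_OK = 0, SCSI_DH_DEV_OFFLINED = 1, SCSI_DH_NOSYS = 2 };

/* Same shape as the callback invoked by the patch: fn(data, err). */
typedef void (*activate_complete)(void *data, int err);

/* Hypothetical, heavily reduced model of a SCSI device. */
struct mock_sdev {
	enum sdev_state state;
	int has_handler;	/* models a non-NULL sdev->scsi_dh_data */
};

/*
 * Models the error path: pick an error from the device state and, if one
 * is set, report it through the callback before returning, so the caller
 * is never left waiting for a completion that will not arrive.
 */
static int mock_dh_activate(const struct mock_sdev *sdev,
			    activate_complete fn, void *data)
{
	int err = SCSI_DH_OK;

	if (!sdev || !sdev->has_handler ||
	    sdev->state == SDEV_CANCEL || sdev->state == SDEV_DEL)
		err = SCSI_DH_NOSYS;
	if (sdev && sdev->state == SDEV_OFFLINE)
		err = SCSI_DH_DEV_OFFLINED;

	if (err) {
		if (fn)
			fn(data, err);
		return err;
	}

	/* A real handler's activate routine would run here and call fn later. */
	if (fn)
		fn(data, SCSI_DH_OK);
	return SCSI_DH_OK;
}

/* Example callback: stores the reported error in the caller's int. */
static void record_err(void *data, int err)
{
	assert(data != NULL);
	*(int *)data = err;
}
```

Calling `mock_dh_activate()` with `record_err` on a device in the `SDEV_DEL` state both returns `SCSI_DH_NOSYS` and delivers the same value to the callback. That second delivery is what lets a caller such as DM mpath observe the failure.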
The patch was tested in an iSCSI environment with the RDAC H/W handler and
multipath. Without the patch, the following reproduction steps cause dd's
I/O to hang forever, and the only way to release it is to reboot the machine:
1) Perform I/O on a multipath device:
   dd if=/dev/dm-0 of=/dev/zero bs=8k count=1000000 &
2) Delete all slave SCSI devices contained in the mpath device:
   I) In an iSCSI environment, the easiest way to do this is by
      stopping iSCSI:
      /etc/init.d/iscsi stop
   II) Another way to delete the devices is by applying the following
       bash scriptlet:
       dm_devs=$(ls /sys/block/ | grep dm- | xargs)
       for dm_dev in $dm_devs; do
          devices=$(ls /sys/block/$dm_dev/slaves)
          for device in $devices; do
             echo 1 > /sys/block/$device/device/delete
          done
       done
NOTE: when DM mpath's fail_path uses blk_abort_queue this scsi_dh change
isn't strictly required. However, DM mpath's call to blk_abort_queue
will soon be reverted because it has proven to be unsafe due to a race
(between blk_abort_queue and scsi_request_fn) that can lead to list
corruption. Therefore we cannot rely on blk_abort_queue via fail_path,
but even if we could, this scsi_dh change is still preferable.
Signed-off-by: Menny Hamburger <Menny_Hamburger@Dell.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Babu Moger <babu.moger@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Diffstat (limited to 'drivers')
-rw-r--r-- | drivers/scsi/device_handler/scsi_dh.c | 11 |
1 files changed, 9 insertions, 2 deletions
diff --git a/drivers/scsi/device_handler/scsi_dh.c b/drivers/scsi/device_handler/scsi_dh.c
index 6fae3d285ae7..b837c5b3c8f9 100644
--- a/drivers/scsi/device_handler/scsi_dh.c
+++ b/drivers/scsi/device_handler/scsi_dh.c
@@ -442,12 +442,19 @@ int scsi_dh_activate(struct request_queue *q, activate_complete fn, void *data)
 	sdev = q->queuedata;
 	if (sdev && sdev->scsi_dh_data)
 		scsi_dh = sdev->scsi_dh_data->scsi_dh;
-	if (!scsi_dh || !get_device(&sdev->sdev_gendev))
+	if (!scsi_dh || !get_device(&sdev->sdev_gendev) ||
+	    sdev->sdev_state == SDEV_CANCEL ||
+	    sdev->sdev_state == SDEV_DEL)
 		err = SCSI_DH_NOSYS;
+	if (sdev->sdev_state == SDEV_OFFLINE)
+		err = SCSI_DH_DEV_OFFLINED;
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
-	if (err)
+	if (err) {
+		if (fn)
+			fn(data, err);
 		return err;
+	}
 
 	if (scsi_dh->activate)
 		err = scsi_dh->activate(sdev, fn, data);