aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/scsi/sd.c
Commit message (Collapse)AuthorAge
* sd: disable discard_zeroes_data for UNMAPMartin K. Petersen2014-11-12
| | | | | | | | | | | | | | | | | | | | | | | | | The T10 SBC UNMAP command does not provide any hard guarantees that blocks will return zeroes on a subsequent READ. This is due to the fact that the device server is free to silently ignore all or parts of the request. The only way to ensure that a block consistently returns zeroes after being unmapped is to use WRITE SAME with the UNMAP bit set. Should the device be unable to unmap one or more blocks described by the command it is required to manually write zeroes to them. Until now we have preferred UNMAP over the WRITE SAME variants to accommodate thinly provisioned devices that predated the final SBC-3 spec. This patch changes the heuristic so that we favor WRITE SAME(16) or (10) over UNMAP if these commands are marked as supported in the Logical Block Provisioning VPD page. The patch also disables discard_zeroes_data for devices operating in UNMAP mode. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* sd: fix up ->compat_ioctlChristoph Hellwig2014-11-12
| | | | | | | | | | No need to verify the passthrough ioctls, the real handler will take care of that. Also make sure not to block for resets on O_NONBLOCK fds. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* scsi: split scsi_nonblockable_ioctlChristoph Hellwig2014-11-12
| | | | | | | | | | | | | | | | | The calling conventions for this function are bad as it could return -ENODEV both for a device not currently online and a not recognized ioctl. Add a new scsi_ioctl_block_when_processing_errors function that wraps scsi_block_when_processing_errors with the a special case for the SG_SCSI_RESET ioctl command, and handle the SG_SCSI_RESET case itself in scsi_ioctl. All callers of scsi_ioctl now must call the above helper to check for the EH state, so that the ioctl handler itself doesn't have to. Reported-by: Robert Elliott <Elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* scsi: remove scsi_show_result()Hannes Reinecke2014-11-12
| | | | | | | | | | | | | Open-code scsi_print_result in sd.c, and cleanup logging to not print duplicate informations. Also remove the call to scsi_show_result() in ufshcd.c to be consistent with other callers of scsi_execute(). With that we can remove scsi_show_result in constants.c Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* scsi: use sdev as argument for sense code printingHannes Reinecke2014-11-12
| | | | | | | | | | | We should be using the standard dev_printk() variants for sense code printing. [hch: remove __scsi_print_sense call in xen-scsiback, Acked by Juergen] [hch: folded bracing fix from Dan Carpenter] Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* sd: remove scsi_print_sense() in sd_done()Hannes Reinecke2014-11-12
| | | | | | | | | sd_done() was calling scsi_print_sense() for a sense code of 'NO_SENSE'. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* Merge branch 'for-3.18/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds2014-10-18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer driver update from Jens Axboe: "This is the block driver pull request for 3.18. Not a lot in there this round, and nothing earth shattering. - A round of drbd fixes from the linbit team, and an improvement in asender performance. - Removal of deprecated (and unused) IRQF_DISABLED flag in rsxx and hd from Michael Opdenacker. - Disable entropy collection from flash devices by default, from Mike Snitzer. - A small collection of xen blkfront/back fixes from Roger Pau Monné and Vitaly Kuznetsov" * 'for-3.18/drivers' of git://git.kernel.dk/linux-block: block: disable entropy contributions for nonrot devices xen, blkfront: factor out flush-related checks from do_blkif_request() xen-blkback: fix leak on grant map error path xen/blkback: unmap all persistent grants when frontend gets disconnected rsxx: Remove deprecated IRQF_DISABLED block: hd: remove deprecated IRQF_DISABLED drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks drbd: compute the end before rb_insert_augmented() drbd: Add missing newline in resync progress display in /proc/drbd drbd: reduce lock contention in drbd_worker drbd: Improve asender performance drbd: Get rid of the WORK_PENDING macro drbd: Get rid of the __no_warn and __cond_lock macros drbd: Avoid inconsistent locking warning drbd: Remove superfluous newline from "resync_extents" debugfs entry. drbd: Use consistent names for all the bi_end_io callbacks drbd: Use better variable names
| * block: disable entropy contributions for nonrot devicesMike Snitzer2014-10-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set QUEUE_FLAG_NONROT. Historically, all block devices have automatically made entropy contributions. But as previously stated in commit e2e1a148 ("block: add sysfs knob for turning off disk entropy contributions"): - On SSD disks, the completion times aren't as random as they are for rotational drives. So it's questionable whether they should contribute to the random pool in the first place. - Calling add_disk_randomness() has a lot of overhead. There are more reliable sources for randomness than non-rotational block devices. From a security perspective it is better to err on the side of caution than to allow entropy contributions from unreliable "random" sources. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* | Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-blockLinus Torvalds2014-10-18
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull core block layer changes from Jens Axboe: "This is the core block IO pull request for 3.18. Apart from the new and improved flush machinery for blk-mq, this is all mostly bug fixes and cleanups. - blk-mq timeout updates and fixes from Christoph. - Removal of REQ_END, also from Christoph. We pass it through the ->queue_rq() hook for blk-mq instead, freeing up one of the request bits. The space was overly tight on 32-bit, so Martin also killed REQ_KERNEL since it's no longer used. - blk integrity updates and fixes from Martin and Gu Zheng. - Update to the flush machinery for blk-mq from Ming Lei. Now we have a per hardware context flush request, which both cleans up the code should scale better for flush intensive workloads on blk-mq. - Improve the error printing, from Rob Elliott. - Backing device improvements and cleanups from Tejun. - Fixup of a misplaced rq_complete() tracepoint from Hannes. - Make blk_get_request() return error pointers, fixing up issues where we NULL deref when a device goes bad or missing. From Joe Lawrence. - Prep work for drastically reducing the memory consumption of dm devices from Junichi Nomura. This allows creating clone bio sets without preallocating a lot of memory. - Fix a blk-mq hang on certain combinations of queue depths and hardware queues from me. - Limit memory consumption for blk-mq devices for crash dump scenarios and drivers that use crazy high depths (certain SCSI shared tag setups). We now just use a single queue and limited depth for that" * 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits) block: Remove REQ_KERNEL blk-mq: allocate cpumask on the home node bio-integrity: remove the needless fail handle of bip_slab creating block: include func name in __get_request prints block: make blk_update_request print prefix match ratelimited prefix blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio block: fix alignment_offset math that assumes io_min is a power-of-2 blk-mq: Make bt_clear_tag() easier to read blk-mq: fix potential hang if rolling wakeup depth is too high block: add bioset_create_nobvec() block: use bio_clone_fast() in blk_rq_prep_clone() block: misplaced rq_complete tracepoint sd: Honor block layer integrity handling flags block: Replace strnicmp with strncasecmp block: Add T10 Protection Information functions block: Don't merge requests if integrity flags differ block: Integrity checksum flag block: Relocate bio integrity flags block: Add a disk flag to block integrity profile block: Add prefix to block integrity profile flags ...
| * | sd: Honor block layer integrity handling flagsMartin K. Petersen2014-09-30
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A set of flags introduced in the block layer enable better control over how protection information is handled. These flags are useful for both error injection and data recovery purposes. Checking can be enabled and disabled for controller and disk, and the guard tag format is now a per-I/O property. Update sd_protect_op to communicate the relevant information to the low-level device driver via a set of flags in scsi_cmnd. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* | scsi: balance out autopm get/put calls in scsi_sysfs_add_sdev()Subhash Jadavani2014-09-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SCSI Well-known logical units generally don't have any scsi driver associated with it which means no one will call scsi_autopm_put_device() on these wlun scsi devices and this would result in keeping the corresponding scsi device always active (hence LLD can't be suspended as well). Same exact problem can be seen for other scsi device representing normal logical unit whose driver is yet to be loaded. This patch fixes the above problem with this approach: - make the scsi_autopm_put_device call at the end of scsi_sysfs_add_sdev to make it balance out the get earlier in the function. - let drivers do paired get/put calls in their probe methods. Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Dolev Raviv <draviv@codeaurora.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
* | sd: Avoid sending medium write commands if device is write protectedSujit Reddy Thumma2014-09-15
|/ | | | | | | | | | | | | The SYNCHRONIZE_CACHE command is a medium write command and hence can fail when the device is write protected. Avoid sending such commands by making sure that write-cache-enable is disabled even though the device claim to support it. Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org> Signed-off-by: Dolev Raviv <draviv@codeaurora.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* sd: fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeoutK. Y. Srinivasan2014-07-25
| | | | | | | | | | Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the basic I/O timeout of the device. Fix this bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* scsi: add a blacklist flag which enables VPD page inquiriesMartin K. Petersen2014-07-25
| | | | | | | | | | | | | | | | | Despite supporting modern SCSI features some storage devices continue to claim conformance to an older version of the SPC spec. This is done for compatibility with legacy operating systems. Linux by default will not attempt to read VPD pages on devices that claim SPC-2 or older. Introduce a blacklist flag that can be used to trigger VPD page inquiries on devices that are known to support them. Reported-by: KY Srinivasan <kys@microsoft.com> Tested-by: KY Srinivasan <kys@microsoft.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> CC: <stable@vger.kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
* scsi: move the writeable field from struct scsi_device to struct scsi_cdChristoph Hellwig2014-07-25
| | | | | | | | | | | | | We currently set the field in common code based on the device type, but then only use it in the cdrom driver which also overrides the value previously set in the generic code. Just leave this entirely to the CDROM driver to make everyones life simpler. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
* sd: split sd_init_commandChristoph Hellwig2014-07-17
| | | | | | | | | Factor out a function to initialize regular read/write commands and leave sd_init_command as a simple dispatcher to the different prepare routines. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* sd: retry discard commandsChristoph Hellwig2014-07-17
| | | | | | | | | | Currently cmd->allowed is initialized from rq->retries for discard commands, but retries is always 0 for non-BLOCK_PC requests. Set it to the standard number of retries instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* sd: retry write same commandsChristoph Hellwig2014-07-17
| | | | | | | | | | Currently cmd->allowed is initialized from rq->retries for write same commands, but retries is always 0 for non-BLOCK_PC requests. Set it to the standard number of retries instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* sd: don't use scsi_setup_blk_pc_cmnd for discard requestsChristoph Hellwig2014-07-17
| | | | | | | | | | Simplify handling of discard requests by setting up the command directly instead of initializing request fields and then calling scsi_setup_blk_pc_cmnd to propagate the information into the command. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* sd: don't use scsi_setup_blk_pc_cmnd for write same requestsChristoph Hellwig2014-07-17
| | | | | | | | | | Simplify handling of write same requests by setting up the command directly instead of initializing request fields and then calling scsi_setup_blk_pc_cmnd to propagate the information into the command. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Webb Scales <webbnh@hp.com>
* sd: don't use scsi_setup_blk_pc_cmnd for flush requestsChristoph Hellwig2014-07-17
| | | | | | | | | | | | Simplify handling of flush requests by setting up the command directly instead of initializing request fields and then calling scsi_setup_blk_pc_cmnd to propagate the information into the command. Also rename scsi_setup_flush_cmnd to sd_setup_flush_cmnd for consistency. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* scsi: set sc_data_direction in common codeChristoph Hellwig2014-07-17
| | | | | | | | | | The data direction fiel in the SCSI command is derived only from the block request structure. Move setting it up into common code instead of duplicating it in the ULDs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* scsi: restructure command initialization for TYPE_FS requestsChristoph Hellwig2014-07-17
| | | | | | | | | | | | We should call the device handler prep_fn for all TYPE_FS requests, not just simple read/write calls that are handled by the disk driver. Restructure the common I/O code to call the prep_fn handler and zero out the CDB, and just leave the call to scsi_init_io to the ULDs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
* sd: Limit transfer lengthMartin K. Petersen2014-07-17
| | | | | | | | | | | | | | | | | Until now the per-command transfer length has exclusively been gated by the max_sectors parameter in the scsi_host template. Given that the size of this parameter has been bumped to an unsigned int we have to be careful not to exceed the target device's capabilities. If the if the device specifies a Maximum Transfer Length in the Block Limits VPD we'll use that value. Otherwise we'll use 0xffffffff for devices that have use_16_for_rw set and 0xffff for the rest. We then combine the chosen disk limit with max_sectors in the host template. The smaller of the two will be used to set the max_hw_sectors queue limit. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* sd: bad return code of init_sdClément Calmels2014-07-17
| | | | | | | | | | In init_sd function, if kmem_cache_create or mempool_create_slab_pools calls fail, the error will not be correclty reported because class_register previously set the value of err to 0. Signed-off-by: Clément Calmels <clement.calmels@free.fr> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* sd: notify block layer when using temporary change to cache_typeVaughan Cao2014-07-17
| | | | | | | | | | | | | | | This is a fix for commit 39c60a0948cc06139e2fbfe084f83cb7e7deae3b "sd: fix array cache flushing bug causing performance problems" We must notify the block layer via q->flush_flags after a temporary change of the cache_type to write through. Without this, a SYNCHRONIZE CACHE command will still be generated. This patch factors out a helper that can be called from sd_revalidate_disk and cache_type_store. Signed-off-by: Vaughan Cao <vaughan.cao@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
* sd: use READ_16 or WRITE_16 when transfer length is greater than 0xffffAkinobu Mita2014-07-17
| | | | | | | | | | | | | | | This change makes the scsi disk driver handle the requests whose transfer length is greater than 0xffff with READ_16 or WRITE_16. However, this is a preparation for extending the data type of max_sectors in struct Scsi_Host and scsi_host_template. So, it is impossible to happen this condition for now, because SCSI low-level drivers can not specify max_sectors greater than 0xffff due to the data type limitation. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* usb-storage/SCSI: Add broken_fua blacklist flagAlan Stern2014-07-01
| | | | | | | | | | | | | | Some buggy JMicron USB-ATA bridges don't know how to translate the FUA bit in READs or WRITEs. This patch adds an entry in unusual_devs.h and a blacklist flag to tell the sd driver not to use FUA. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Michael Büsch <m@bues.ch> Tested-by: Michael Büsch <m@bues.ch> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net> CC: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'scsi-for-linus' of ↵Linus Torvalds2014-06-09
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This patch consists of the usual driver updates (qla2xxx, qla4xxx, lpfc, be2iscsi, fnic, ufs, NCR5380) The NCR5380 is the addition to maintained status of a long neglected driver for older hardware. In addition there are a lot of minor fixes and cleanups and some more updates to make scsi mq ready" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (130 commits) include/scsi/osd_protocol.h: remove unnecessary __constant mvsas: Recognise device/subsystem 9485/9485 as 88SE9485 Revert "be2iscsi: Fix processing cqe for cxn whose endpoint is freed" mptfusion: fix msgContext in mptctl_hp_hostinfo acornscsi: remove linked command support scsi/NCR5380: dprintk macro fusion: Remove use of DEF_SCSI_QCMD fusion: Add free msg frames to the head, not tail of list mpt2sas: Add free smids to the head, not tail of list mpt2sas: Remove use of DEF_SCSI_QCMD mpt2sas: Remove uses of serial_number mpt3sas: Remove use of DEF_SCSI_QCMD mpt3sas: Remove uses of serial_number qla2xxx: Use kmemdup instead of kmalloc + memcpy qla4xxx: Use kmemdup instead of kmalloc + memcpy qla2xxx: fix incorrect debug printk be2iscsi: Bump the driver version be2iscsi: Fix processing cqe for cxn whose endpoint is freed be2iscsi: Fix destroy MCC-CQ before MCC-EQ is destroyed be2iscsi: Fix memory corruption in MBX path ...
| * sd: medium access timeout counter fails to resetDavid Jeffery2014-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is an error with the medium access timeout feature of the sd driver. The sdkp->medium_access_timed_out value is reset to zero in sd_done() in the wrong place. Currently it is reset to zero only when a command returns sense data. This can result in cases where the medium access check falsely triggers from timed out commands which are hours or days apart. For example, an I/O command times out and is aborted. It then retries and succeeds. But with no sense data generated and returned, the medium_access_timed_out value is not reset. If no sd command returns sense data, then the next command to time out (however far in time from the first failure) will trigger the medium access timeout and put the device offline. The resetting of sdkp->medium_access_timed_out should occur before the check for sense data. To reproduce using scsi_debug, use SCSI_DEBUG_OPT_TIMEOUT or SCSI_DEBUG_OPT_MAC_TIMEOUT to force an I/O command to timeout. Then, remove the opt value so the I/O will succeed on retry. Perform more I/O as desired. Finally, repeat the process to make a new I/O command time out. Without the patch, the device will be marked offline even though many I/O commands have succeeded between the 2 instances of timed out commands. Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * scsi: reintroduce scsi_driver.init_commandChristoph Hellwig2014-05-19
| | | | | | | | | | | | | | | | | | | | | | | | Instead of letting the ULD play games with the prep_fn move back to the model of a central prep_fn with a callback to the ULD. This already cleans up and shortens the code by itself, and will be required to properly support blk-mq in the SCSI midlayer. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>
* | sd/skd: stuff discard page in request->completion_dataJens Axboe2014-04-16
| | | | | | | | | | | | | | | | Store the pointer to the page there, so we can always safely reference it from end_io context where ->bio may have been cleared. Signed-off-by: Jens Axboe <axboe@fb.com>
* | block: remove struct request buffer memberJens Axboe2014-04-15
|/ | | | | | | | | | | | | | | This was used in the olden days, back when onions were proper yellow. Basically it mapped to the current buffer to be transferred. With highmem being added more than a decade ago, most drivers map pages out of a bio, and rq->buffer isn't pointing at anything valid. Convert old style drivers to just use bio_data(). For the discard payload use case, just reference the page in the bio. Signed-off-by: Jens Axboe <axboe@fb.com>
* Merge branch 'async-scsi-resume' of ↵Linus Torvalds2014-04-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/djbw/isci Pull async SCSI resume support from Dan Williams: "Allow disks and other devices to resume in parallel. This provides a tangible speed up for a non-esoteric use case (laptop resume): https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-optimization-simpler-approach" * 'async-scsi-resume' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/isci: scsi: async sd resume
| * scsi: async sd resumeDan Williams2014-04-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | async_schedule() sd resume work to allow disks and other devices to resume in parallel. This moves the entirety of scsi_device resume to an async context to ensure that scsi_device_resume() remains ordered with respect to the completion of the start/stop command. For the duration of the resume, new command submissions (that do not originate from the scsi-core) will be deferred (BLKPREP_DEFER). It adds a new ASYNC_DOMAIN_EXCLUSIVE(scsi_sd_pm_domain) as a container of these operations. Like scsi_sd_probe_domain it is flushed at sd_remove() time to ensure async ops do not continue past the end-of-life of the sdev. The implementation explicitly refrains from reusing scsi_sd_probe_domain directly for this purpose as it is flushed at the end of dpm_resume(), potentially defeating some of the benefit. Given sdevs are quiesced it is permissible for these resume operations to bleed past the async_synchronize_full() calls made by the driver core. We defer the resolution of which pm callback to call until scsi_dev_type_{suspend|resume} time and guarantee that the callback parameter is never NULL. With this in place the type of resume operation is encoded in the async function identifier. There is a concern that async resume could trigger PSU overload. In the enterprise, storage enclosures enforce staggered spin-up regardless of what the kernel does making async scanning safe by default. Outside of that context a user can disable asynchronous scanning via a kernel command line or CONFIG_SCSI_SCAN_ASYNC. Honor that setting when deciding whether to do resume asynchronously. Inspired by Todd's analysis and initial proposal [2]: https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-optimization-simpler-approach Cc: Len Brown <len.brown@intel.com> Cc: Phillip Susi <psusi@ubuntu.com> [alan: bug fix and clean up suggestion] Acked-by: Alan Stern <stern@rowland.harvard.edu> Suggested-by: Todd Brandt <todd.e.brandt@linux.intel.com> [djbw: kick all resume work to the async queue] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | [SCSI] sd: Quiesce mode sense error messagesMartin K. Petersen2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Messages about discovered disk properties are only printed once unless they are found to have changed. Errors encountered during mode sense, however, are printed every time we revalidate. Quiesce mode sense errors so they are only printed during the first scan. [jejb: checkpatch fixes] Bugzilla: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=733565 Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* | [SCSI] sd: don't fail if the device doesn't recognize SYNCHRONIZE CACHEAlan Stern2014-03-15
|/ | | | | | | | | | | | | | | | | | Evidently some wacky USB-ATA bridges don't recognize the SYNCHRONIZE CACHE command, as shown in this email thread: http://marc.info/?t=138978356200002&r=1&w=2 The fact that we can't tell them to drain their caches shouldn't prevent the system from going into suspend. Therefore sd_sync_cache() shouldn't return an error if the device replies with an Invalid Command ASC. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Sven Neumann <s.neumann@raumfeld.com> Tested-by: Daniel Mack <zonque@gmail.com> CC: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-blockLinus Torvalds2014-01-30
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull core block IO changes from Jens Axboe: "The major piece in here is the immutable bio_ve series from Kent, the rest is fairly minor. It was supposed to go in last round, but various issues pushed it to this release instead. The pull request contains: - Various smaller blk-mq fixes from different folks. Nothing major here, just minor fixes and cleanups. - Fix for a memory leak in the error path in the block ioctl code from Christian Engelmayer. - Header export fix from CaiZhiyong. - Finally the immutable biovec changes from Kent Overstreet. This enables some nice future work on making arbitrarily sized bios possible, and splitting more efficient. Related fixes to immutable bio_vecs: - dm-cache immutable fixup from Mike Snitzer. - btrfs immutable fixup from Muthu Kumar. - bio-integrity fix from Nic Bellinger, which is also going to stable" * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits) xtensa: fixup simdisk driver to work with immutable bio_vecs block/blk-mq-cpu.c: use hotcpu_notifier() blk-mq: for_each_* macro correctness block: Fix memory leak in rw_copy_check_uvector() handling bio-integrity: Fix bio_integrity_verify segment start bug block: remove unrelated header files and export symbol blk-mq: uses page->list incorrectly blk-mq: use __smp_call_function_single directly btrfs: fix missing increment of bi_remaining Revert "block: Warn and free bio if bi_end_io is not set" block: Warn and free bio if bi_end_io is not set blk-mq: fix initializing request's start time block: blk-mq: don't export blk_mq_free_queue() block: blk-mq: make blk_sync_queue support mq block: blk-mq: support draining mq queue dm cache: increment bi_remaining when bi_end_io is restored block: fixup for generic bio chaining block: Really silence spurious compiler warnings block: Silence spurious compiler warnings block: Kill bio_pair_split() ...
| * Merge tag 'v3.13-rc6' into for-3.14/coreJens Axboe2013-12-31
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Needed to bring blk-mq uptodate, since changes have been going in since for-3.14/core was established. Fixup merge issues related to the immutable biovec changes. Signed-off-by: Jens Axboe <axboe@kernel.dk> Conflicts: block/blk-flush.c fs/btrfs/check-integrity.c fs/btrfs/extent_io.c fs/btrfs/scrub.c fs/logfs/dev_bdev.c
| * | block: Convert bio_iovec() to bvec_iterKent Overstreet2013-11-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For immutable biovecs, we'll be introducing a new bio_iovec() that uses our new bvec iterator to construct a biovec, taking into account bvec_iter->bi_bvec_done - this patch updates existing users for the new usage. Some of the existing users really do need a pointer into the bvec array - those uses are all going to be removed, but we'll need the functionality from immutable to remove them - so for now rename the existing bio_iovec() -> __bio_iovec(), and it'll be removed in a couple patches. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
* | | [SCSI] sd: Do not call do_div() with a 64-bit divisorGeert Uytterhoeven2013-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | do_div() is meant for divisions of 64-bit number by 32-bit numbers. Passing 64-bit divisor types caused issues in the past on 32-bit platforms, cfr. commit ea077b1b96e073eac5c3c5590529e964767fc5f7 ("m68k: Truncate base in do_div()"). As scsi_device.sector_size is unsigned (int), factor should be unsigned int, too. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* | | [SCSI] Fix erratic device offline during EHJames Bottomley2013-12-19
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 18a4d0a22ed6c54b67af7718c305cd010f09ddf8 (Handle disk devices which can not process medium access commands) was introduced to offline any device which cannot process medium access commands. However, commit 3eef6257de48ff84a5d98ca533685df8a3beaeb8 (Reduce error recovery time by reducing use of TURs) reduced the number of TURs by sending it only on the first failing command, which might or might not be a medium access command. So in combination this results in an erratic device offlining during EH; if the command where the TUR was sent upon happens to be a medium access command the device will be set offline, if not everything proceeds as normal. This patch moves the check to the final test, eliminating this problem. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* | [SCSI] Disable WRITE SAME for RAID and virtual host adapter driversMartin K. Petersen2013-11-28
|/ | | | | | | | | | | | | | | | | | Some host adapters do not pass commands through to the target disk directly. Instead they provide an emulated target which may or may not accurately report its capabilities. In some cases the physical device characteristics are reported even when the host adapter is processing commands on the device's behalf. This can lead to adapter firmware hangs or excessive I/O errors. This patch disables WRITE SAME for devices connected to host adapters that provide an emulated target. Driver writers can disable WRITE SAME by setting the no_write_same flag in the host adapter template. [jejb: fix up rejections due to eh_deadline patch] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* Merge tag 'scsi-for-linus' of ↵Linus Torvalds2013-11-13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull first round of SCSI updates from James Bottomley: "This patch set is driver updates for qla4xxx, scsi_debug, pm80xx, fcoe/libfc, eas2r, lpfc, be2iscsi and megaraid_sas plus some assorted bug fixes and cleanups" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (106 commits) [SCSI] scsi_error: Escalate to LUN reset if abort fails [SCSI] Add 'eh_deadline' to limit SCSI EH runtime [SCSI] remove check for 'resetting' [SCSI] dc395: Move 'last_reset' into internal host structure [SCSI] tmscsim: Move 'last_reset' into host structure [SCSI] advansys: Remove 'last_reset' references [SCSI] dpt_i2o: return SCSI_MLQUEUE_HOST_BUSY when in reset [SCSI] dpt_i2o: Remove DPTI_STATE_IOCTL [SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path [SCSI] lpfc: Fix typo on NULL assignment [SCSI] scsi_dh_alua: ALUA handler attach should succeed while TPG is transitioning [SCSI] scsi_dh_alua: ALUA check sense should retry device internal reset unit attention [SCSI] esas2r: Cleanup snprinf formatting of firmware version [SCSI] esas2r: Remove superfluous mask of pcie_cap_reg [SCSI] esas2r: Fixes for big-endian platforms [SCSI] esas2r: Directly call kernel functions for atomic bit operations [SCSI] lpfc 8.3.43: Update lpfc version to driver version 8.3.43 [SCSI] lpfc 8.3.43: Fixed not processing task management IOCB response status [SCSI] lpfc 8.3.43: Fixed spinlock hang. [SCSI] lpfc 8.3.43: Fixed invalid Total_Data_Placed value received for els and ct command responses ...
| * [SCSI] Derive the FLUSH_TIMEOUT from the basic I/O timeoutJames Bottomley2013-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than having a separate constant for specifying the timeout on FLUSH operations, use the basic I/O timeout value that is already configurable on a per target basis to derive the FLUSH timeout. Looking at the current definitions of these timeout values, the FLUSH operation is supposed to have a value that is twice the normal timeout value. This patch preserves this relationship while leveraging the flexibility of specifying the I/O timeout. Based on a prior patch by KY Srinivasan <kys@microsoft.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] sd: Add error handling during flushing cachesOliver Neukum2013-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It makes no sense to flush the cache of a device without medium. Errors during suspend must be handled according to their causes. Errors due to missing media or unplugged devices must be ignored. Errors due to devices being offlined must also be ignored. The error returns must be modified so that the generic layer understands them. [jejb: fix up whitespace and other formatting problems] Signed-off-by: Oliver Neukum <oneukum@suse.de> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] sd: Reduce buffer size for vpd requestBernd Schubert2013-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Somehow older areca firmware versions have issues with scsi_get_vpd_page() and a large buffer, the firmware seems to crash and the scsi error-handler will start endless recovery retries. Limiting the buf-size to 64-bytes fixes this issue with older firmware versions (<1.49 for my controller). Fixes a regression with areca controllers and older firmware versions introduced by commit: 66c28f97120e8a621afd5aa7a31c4b85c547d33d Reported-by: Nix <nix@esperi.org.uk> Tested-by: Nix <nix@esperi.org.uk> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Cc: stable@vger.kernel.org # delay inclusion for 2 months for testing Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* | Merge branch 'blk-mq/core' into for-3.13/coreJens Axboe2013-11-08
|\ \ | |/ |/| | | | | | | | | Signed-off-by: Jens Axboe <axboe@kernel.dk> Conflicts: block/blk-timeout.c
| * block: make rq->cmd_flags be 64-bitJens Axboe2013-10-25
| | | | | | | | | | | | | | | | We have officially run out of flags in a 32-bit space. Extend it to 64-bit even on 32-bit archs. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | [SCSI] sd: call blk_pm_runtime_init before add_diskAaron Lu2013-10-23
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sujit has found a race condition that would make q->nr_pending unbalanced, it occurs as Sujit explained: " sd_probe_async() -> add_disk() -> disk_add_event() -> schedule(disk_events_workfn) sd_revalidate_disk() blk_pm_runtime_init() return; Let's say the disk_events_workfn() calls sd_check_events() which tries to send test_unit_ready() and because of sd_revalidate_disk() trying to send another commands the test_unit_ready() might be re-queued as the tagged command queuing is disabled. So the race condition is - Thread 1 | Thread 2 sd_revalidate_disk() | sd_check_events() ...nr_pending = 0 as q->dev = NULL| scsi_queue_insert() blk_runtime_pm_init() | blk_pm_requeue_request() -> | nr_pending = -1 since | q->dev != NULL " The problem is, the test_unit_ready request doesn't get counted the first time it is queued, so the later decrement of q->nr_pending in blk_pm_requeue_request makes it unbalanced. Fix this by calling blk_pm_runtime_init before add_disk so that all requests initiated there will all be counted. Signed-off-by: Aaron Lu <aaron.lu@intel.com> Reported-and-tested-by: Sujit Reddy Thumma <sthumma@codeaurora.org> Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>