Commit message (Author, Age)
* nvme-pci: Fix an error handling path in 'nvme_probe()' (Christophe JAILLET, 2017-07-20)
| Release resources in the correct order in order not to miss a 'put_device()' if 'nvme_dev_map()' fails.
|
| Fixes: b00a726a9fd8 ("NVMe: Don't unmap controller registers on reset")
| Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
| Reviewed-by: Keith Busch <keith.busch@intel.com>
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: Remove nvme_setup_prps BUG_ON (Keith Busch, 2017-07-20)
| This patch replaces the invalid nvme SGL kernel panic with a warning, and returns an appropriate error. The warning will occur only on the first occurrence, and sgl details will be printed to help debug how the request was allowed to form.
|
| Signed-off-by: Keith Busch <keith.busch@intel.com>
| Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
| Reviewed-by: Christoph Hellwig <hch@lst.de>
| Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvme-pci: add another device ID with stripe quirk (David Wayne Fugate, 2017-07-20)
| Adds a fourth Intel controller which has the "stripe" quirk.
|
| Signed-off-by: David Wayne Fugate <david.fugate@intel.com>
| Acked-by: Keith Busch <keith.busch@intel.com>
| Acked-by: Christoph Hellwig <hch@lst.de>
| Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nvmet-fc: fix byte swapping in nvmet_fc_ls_create_association (Christoph Hellwig, 2017-07-20)
| We always need to do non-equal comparisons on the native endian versions to get the correct result.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: James Smart <james.smart@broadcom.com>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
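The point about ordered comparisons can be illustrated with a small user-space sketch. It is not the nvmet-fc code: `be32_decode`, `raw_le_load` and `desc_len_too_short` are hypothetical names chosen for this example, and the byte arrays stand in for a big-endian length field received on the wire.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative helper (not a kernel API): decode a big-endian 32-bit field. */
static uint32_t be32_decode(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* What a little-endian CPU would see if it loaded the same bytes unswapped. */
static uint32_t raw_le_load(const uint8_t b[4])
{
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8) | (uint32_t)b[0];
}

/* Convert to native order first, then do the ordered (<) comparison. */
static bool desc_len_too_short(const uint8_t wire_len[4], uint32_t min_len)
{
    return be32_decode(wire_len) < min_len;
}
```

Equality tests give the same answer on swapped and unswapped values, but `<` and `>` do not: for the wire values 256 and 2, the unswapped little-endian view orders them the wrong way round.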
* nvme: fix byte swapping in the streams code (Christoph Hellwig, 2017-07-20)
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: Jens Axboe <axboe@kernel.dk>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* nbd: kill unused ret in recv_work (Kefeng Wang, 2017-07-13)
| No need to return value in queue work, kill ret variable.
|
| Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bfq: dispatch request to prevent queue stalling after the request completion (Hou Tao, 2017-07-12)
| There are mq devices (e.g., virtio-blk, nbd and loopback) which don't invoke blk_mq_run_hw_queues() after the completion of a request. If bfq is enabled on these devices and the slice_idle attribute or strict_guarantees attribute is set as zero, it is possible that after a request completion the remaining requests of a busy bfq queue will be stalled in the bfq scheduler until a new request arrives.
|
| To fix the scheduler latency problem, we need to check whether or not all issued requests have completed and dispatch more requests to the driver if there is no request in the driver.
|
| The problem can be reproduced by running the following script on a virtio-blk device with nr_hw_queues as 1:
|
|     #!/bin/sh
|
|     dev=vdb
|     # mount point for dev
|     mp=/tmp/mnt
|     cd $mp
|
|     job=strict.job
|     cat <<EOF > $job
|     [global]
|     direct=1
|     bs=4k
|     size=256M
|     rw=write
|     ioengine=libaio
|     iodepth=128
|     runtime=5
|     time_based
|
|     [1]
|     filename=1.data
|
|     [2]
|     new_group
|     filename=2.data
|     EOF
|
|     echo bfq > /sys/block/$dev/queue/scheduler
|     echo 1 > /sys/block/$dev/queue/iosched/strict_guarantees
|     fio $job
|
| Signed-off-by: Hou Tao <houtao1@huawei.com>
| Reviewed-by: Paolo Valente <paolo.valente@linaro.org>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bfq: fix typos in comments about B-WF2Q+ algorithm (Hou Tao, 2017-07-12)
| The start time of eligible entity should be less than or equal to the current virtual time, and the entity in idle tree has a finish time being greater than the current virtual time.
|
| Signed-off-by: Hou Tao <houtao1@huawei.com>
| Reviewed-by: Paolo Valente <paolo.valente@linaro.org>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc (Linus Torvalds, 2017-07-12)
|\
| | Pull sparc fixes from David Miller:
| |
| |  - Fix symbol version generation for assembler on sparc, from Nagarathnam Muthusamy.
| |  - Fix compound page handling in gup_huge_pmd(), from Nitin Gupta.
| |
| | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
| |   sparc64: Fix gup_huge_pmd
| |   Adding the type of exported symbols
| |   sed regex in Makefile.build requires line break between exported symbols
| |   Adding asm-prototypes.h for genksyms to generate crc
| * sparc64: Fix gup_huge_pmd (Nitin Gupta, 2017-06-25)
| | The function assumes that each PMD points to head of a huge page. This is not correct as a PMD can point to start of any 8M region with a, say 256M, hugepage. The fix ensures that it points to the correct head of any PMD huge page.
| |
| | Cc: Julian Calaby <julian.calaby@gmail.com>
| | Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'sparc-Suppressing-version-generation-failed-warnings-in-sparc' (David S. Miller, 2017-06-19)
| |\
| | | Nagarathnam Muthusamy says:
| | |
| | | ====================
| | | sparc: Suppressing version generation failed warnings in sparc
| | |
| | | Compiling the sparc build of upstream kernel generates lots of warnings regarding version generation failure for functions defined in assembly files. This can be easily suppressed by adding those function prototypes to asm/asm-prototypes.h as in powerpc architecture.
| | |
| | | The following series of patches aims to clean the following warnings.
| | |
| | | WARNING: EXPORT symbol "atomic_add" [vmlinux] version generation failed, symbol will not be versioned.
| | | WARNING: EXPORT symbol "xor_niagara_4" [vmlinux] version generation failed, symbol will not be versioned.
| | | WARNING: EXPORT symbol "atomic_sub_return" [vmlinux] version generation failed, symbol will not be versioned.
| | | WARNING: EXPORT symbol "atomic64_sub" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "xor_vis_3" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_xor" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "sun4v_niagara2_getperf" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_fetch_xor" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_fetch_xor" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__bzero" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_fetch_and" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "raw_copy_to_user" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__flushw_user" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "sun4v_niagara2_setperf" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__arch_hweight64" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__ffs" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "prom_root_node" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "csum_partial_copy_nocheck" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "memcpy" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_fetch_or" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "xor_vis_4" [vmlinux] version generation failed, symbol will not be versioned. 
WARNING: EXPORT symbol "__arch_hweight32" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "ffs" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_and" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "xor_niagara_3" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_and" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "real_hard_smp_processor_id" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "clear_user_page" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "memmove" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "raw_copy_from_user" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_add_return" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__memscan_generic" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_sub" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__memscan_zero" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_add" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "strncmp" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_fetch_sub" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "xor_vis_5" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "ip_fast_csum" [vmlinux] version generation failed, symbol will not be versioned. 
WARNING: EXPORT symbol "test_and_change_bit" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__arch_hweight8" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "strlen" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "xor_niagara_2" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_fetch_or" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_or" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_fetch_sub" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_fetch_add" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "clear_bit" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "sun4v_mach_set_watchdog" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "set_bit" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__csum_partial_copy_from_user" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "memset" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__csum_partial_copy_to_user" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_fetch_add" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "change_bit" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "__clear_user" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_sub_return" [vmlinux] version generation failed, symbol will not be versioned. 
WARNING: EXPORT symbol "__arch_hweight16" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "tlb_type" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "VISenter" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "test_and_set_bit" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "xor_niagara_5" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "_clear_page" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "xor_vis_2" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_xor" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "test_and_clear_bit" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "csum_partial" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_fetch_and" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "copy_user_page" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "sun4v_chip_type" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "raw_copy_in_user" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "memcmp" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_add_return" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic_or" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "atomic64_dec_if_positive" [vmlinux] version generation failed, symbol will not be versioned. 
| | | WARNING: EXPORT symbol "sun4v_niagara_setperf" [vmlinux] version generation failed, symbol will not be versioned.
| | | WARNING: EXPORT symbol "sun4v_niagara_getperf" [vmlinux] version generation failed, symbol will not be versioned.
| | |
| | | With the fix, all these warnings will be removed during compilation.
| | | ====================
| | |
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Adding the type of exported symbols (Nagarathnam Muthusamy, 2017-06-19)
| | | Missing symbol type for few functions prevents genksyms from generating symbol versions for those functions. This patch fixes them.
| | |
| | | Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
| | | Reviewed-by: Babu Moger <babu.moger@oracle.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * sed regex in Makefile.build requires line break between exported symbols (Nagarathnam Muthusamy, 2017-06-19)
| | | The following regex in Makefile.build matches only one ___EXPORT_SYMBOL per line.
| | |
| | |     sed 's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/'
| | |
| | | ATOMIC_OPS macro in atomic_64.S expands multiple symbols in same line hence version generation is done only for the last matched symbol. This patch adds new line between the symbol expansions.
| | |
| | | Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
| | | Reviewed-by: Babu Moger <babu.moger@oracle.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Adding asm-prototypes.h for genksyms to generate crc (Nagarathnam Muthusamy, 2017-06-19)
| | | This patch adds the prototypes of assembly defined functions to asm-prototypes.h. Some prototypes are directly added as they are not present in any existing header files.
| | |
| | | Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
| | | Reviewed-by: Babu Moger <babu.moger@oracle.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| |/
* | Merge branch 'for-linus' of git://git.kernel.dk/linux-block (Linus Torvalds, 2017-07-11)
|\ \
| | | Pull more block updates from Jens Axboe:
| | |
| | | "This is a followup for block changes, that didn't make the initial pull request. It's a bit of a mixed bag, this contains:
| | |
| | |  - A followup pull request from Sagi for NVMe. Outside of fixups for NVMe, it also includes a series for ensuring that we properly quiesce hardware queues when browsing live tags.
| | |  - Set of integrity fixes from Dmitry (mostly), fixing various issues for folks using DIF/DIX.
| | |  - Fix for a bug introduced in cciss, with the req init changes. From Christoph.
| | |  - Fix for a bug in BFQ, from Paolo.
| | |  - Two followup fixes for lightnvm/pblk from Javier.
| | |  - Depth fix from Ming for blk-mq-sched.
| | |  - Also from Ming, performance fix for mtip32xx that was introduced with the dynamic initialization of commands"
| | |
| | | * 'for-linus' of git://git.kernel.dk/linux-block: (44 commits)
| | |   block: call bio_uninit in bio_endio
| | |   nvmet: avoid unneeded assignment of submit_bio return value
| | |   nvme-pci: add module parameter for io queue depth
| | |   nvme-pci: compile warnings in nvme_alloc_host_mem()
| | |   nvmet_fc: Accept variable pad lengths on Create Association LS
| | |   nvme_fc/nvmet_fc: revise Create Association descriptor length
| | |   lightnvm: pblk: remove unnecessary checks
| | |   lightnvm: pblk: control I/O flow also on tear down
| | |   cciss: initialize struct scsi_req
| | |   null_blk: fix error flow for shared tags during module_init
| | |   block: Fix __blkdev_issue_zeroout loop
| | |   nvme-rdma: unconditionally recycle the request mr
| | |   nvme: split nvme_uninit_ctrl into stop and uninit
| | |   virtio_blk: quiesce/unquiesce live IO when entering PM states
| | |   mtip32xx: quiesce request queues to make sure no submissions are inflight
| | |   nbd: quiesce request queues to make sure no submissions are inflight
| | |   nvme: kick requeue list when requeueing a request instead of when starting the queues
| | |   nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues
| | |   nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues
| | |   nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues
| | |   ...
| * | block: call bio_uninit in bio_endio (Shaohua Li, 2017-07-10)
| | | bio_free isn't a good place to free cgroup info. There are a lot of cases bio is allocated in special way (for example, in stack) and never gets called by bio_put hence bio_free, we are leaking memory. This patch moves the free to bio endio, which should be called anyway. The bio_uninit call in bio_free is kept, in case the bio never gets called bio endio.
| | |
| | | This assumes ->bi_end_io() doesn't access cgroup info, which seems true in my audit.
| | |
| | | This along with Christoph's integrity patch should fix the memory leak issue.
| | |
| | | Cc: Christoph Hellwig <hch@lst.de>
| | | Signed-off-by: Shaohua Li <shli@fb.com>
| | | Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | Merge branch 'nvme-4.13' of git://git.infradead.org/nvme into for-linus (Jens Axboe, 2017-07-10)
| |\ \
| | | | Pull followup NVMe (mostly) changes from Sagi:
| | | |
| | | | I added the quiesce/unquiesce patches in here as it's easy for me to apply changes on top. It has accumulated reviews and includes mostly nvme anyway, please tell me if you don't want to take them with this.
| | | |
| | | | This includes:
| | | |  - quiesce/unquiesce fixes in nvme and others from me
| | | |  - nvme-fc add create association padding spec updates from James
| | | |  - some more quirking from MKP
| | | |  - nvmet nit cleanup from Max
| | | |  - Fix nvme-rdma racy RDMA completion signalling from Marta
| | | |  - some centralization patches from me
| | | |  - add tagset nr_hw_queues updates on controller resets in nvme drivers from me
| | | |  - nvme-rdma fix resources recycling when doing error recovery from me
| | | |  - minor cleanups in nvme-fc from me
| | * | nvmet: avoid unneeded assignment of submit_bio return value (Max Gurtovoy, 2017-07-10)
| | | | We actually using the cookie returned from the last submit_bio call.
| | | |
| | | | Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-pci: add module parameter for io queue depth (weiping zhang, 2017-07-10)
| | | | Adjust io queue depth more easily, and make sure io queue depth >= 2.
| | | |
| | | | Signed-off-by: weiping zhang <zhangweiping@didichuxing.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
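The "depth >= 2" check can be sketched in user space as a setter that parses a string and rejects anything below 2. This is only an analogy under stated assumptions: `set_io_queue_depth` is a hypothetical name, and it uses `strtol` from the C library rather than whatever parsing and parameter hooks the driver itself uses.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical setter: parse `val` as a decimal integer and store it in
 * *depth only if it is a valid number >= 2. Returns 0 on success and
 * -EINVAL otherwise; *depth is left untouched on failure.
 */
static int set_io_queue_depth(const char *val, int *depth)
{
    char *end;
    long n;

    errno = 0;
    n = strtol(val, &end, 10);
    if (errno != 0 || end == val || *end != '\0')
        return -EINVAL;     /* not a clean decimal number */
    if (n < 2 || n > INT_MAX)
        return -EINVAL;     /* enforce the lower bound of 2 */

    *depth = (int)n;
    return 0;
}
```

Validating in the setter keeps a bad value from ever reaching the code that sizes the queues, so the previous depth stays in effect when the input is rejected.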
| | * | nvme-pci: compile warnings in nvme_alloc_host_mem() (Dan Carpenter, 2017-07-10)
| | | | "i" should be signed or it could cause a forever loop on the cleanup path. "size" can be used uninitialized.
| | | |
| | | | Fixes: 87ad72a59a38 ("nvme-pci: implement host memory buffer support")
| | | | Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
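The signedness problem is easiest to see in a reverse cleanup loop. The sketch below is illustrative only and assumes the cleanup path counts down with a `--i >= 0` condition; `release_in_reverse` and the `in_use` array are hypothetical stand-ins for freeing previously allocated buffers.

```c
/*
 * Hypothetical cleanup helper: mark the first `allocated` slots as free,
 * walking backwards, and return how many were released.
 *
 * The counter must be signed. With an unsigned `i`, the test `--i >= 0`
 * is always true (the value wraps around instead of going negative),
 * so the loop would never terminate and would index far out of bounds.
 */
static int release_in_reverse(int *in_use, int allocated)
{
    int freed = 0;
    int i = allocated;      /* signed on purpose */

    while (--i >= 0) {
        in_use[i] = 0;
        freed++;
    }
    return freed;
}
```

When nothing was allocated the loop body is skipped entirely, which is the case an unsigned counter gets wrong first.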
| | * | nvmet_fc: Accept variable pad lengths on Create Association LS (James Smart, 2017-07-10)
| | | | Target validation of the Create Association LS revised to accept any LS as long as all non-pad data has been received. This allows a (newer) target to accept the LS from older initiators with varying pad lengths.
| | | |
| | | | Signed-off-by: James Smart <james.smart@broadcom.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme_fc/nvmet_fc: revise Create Association descriptor length (James Smart, 2017-07-10)
| | | | Revises the Create Association LS for the amount of pad expected in 1.16. Add defines for the minimum lengths that a target can accept (e.g. variable pad lengths)
| | | |
| | | | Signed-off-by: James Smart <james.smart@broadcom.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-rdma: unconditionally recycle the request mr (Sagi Grimberg, 2017-07-06)
| | | | When our RDMA queue-pair is torn down with high load of I/O traffic, we have no way of knowing if the memory region was actually registered by the reg_mr work request as its completion flushes with error (hw might have done it or not). So in order to not deal with all this uncertainty, we simply recycle the MR in reinit_request.
| | | |
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme: split nvme_uninit_ctrl into stop and uninit (Sagi Grimberg, 2017-07-06)
| | | | Usually before we teardown the controller we want to:
| | | | 1. complete/cancel any ctrl inflight works
| | | | 2. remove ctrl namespaces (only for removal though, resets shouldn't remove any namespaces).
| | | |
| | | | but we do not want to destroy the controller device as we might use it for logging during the teardown stage.
| | | |
| | | | This patch adds nvme_start_ctrl() which queues inflight controller works (aen, ns scan, queue start and keep-alive if kato is set) and nvme_stop_ctrl() which cancels the works. Namespace removal is left to the callers to handle.
| | | |
| | | | Move nvme_uninit_ctrl after we are done with the controller device.
| | | |
| | | | Reviewed-by: Keith Busch <keith.busch@intel.com>
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | virtio_blk: quiesce/unquiesce live IO when entering PM states (Sagi Grimberg, 2017-07-06)
| | | | Without it, it's not guaranteed that no .queue_rq is inflight.
| | | |
| | | | Reviewed-by: Ming Lei <ming.lei@redhat.com>
| | | | Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | | | Cc: virtio-dev@lists.oasis-open.org
| | | | Cc: Jason Wang <jasowang@redhat.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | mtip32xx: quiesce request queues to make sure no submissions are inflight (Sagi Grimberg, 2017-07-06)
| | | | Unlike blk_mq_stop_hw_queues, blk_mq_quiesce_queue respects the submission path rcu grace. Quiesce the queue before iterating on live tags, or performing device io quiescing.
| | | |
| | | | While we're at it, verify that the request started in mtip_abort_cmd and mtip_queue_cmd tag iteration calls.
| | | |
| | | | Reviewed-by: Ming Lei <ming.lei@redhat.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nbd: quiesce request queues to make sure no submissions are inflight (Sagi Grimberg, 2017-07-06)
| | | | Unlike blk_mq_stop_hw_queues, blk_mq_quiesce_queue respects the submission path rcu grace. Quiesce the queue before iterating on live tags.
| | | |
| | | | Reviewed-by: Ming Lei <ming.lei@redhat.com>
| | | | Acked-by: Josef Bacik <jbacik@fb.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme: kick requeue list when requeueing a request instead of when starting the queues (Sagi Grimberg, 2017-07-06)
| | | | When we requeue a request, we can always insert the request back to the scheduler instead of doing it when restarting the queues and kicking the requeue work, so get rid of the requeue kick in nvme (core and drivers).
| | | |
| | | | Also, now there is no need to start hw queues in nvme_kill_queues. We don't stop the hw queues anymore, so no need to start them.
| | | |
| | | | Reviewed-by: Ming Lei <ming.lei@redhat.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues (Sagi Grimberg, 2017-07-06)
| | | | Unlike blk_mq_stop_hw_queues and blk_mq_start_stopped_hw_queues, quiescing/unquiescing respects the submission path rcu grace.
| | | |
| | | | Reviewed-by: Ming Lei <ming.lei@redhat.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues (Sagi Grimberg, 2017-07-06)
| | | | Unlike blk_mq_stop_hw_queues and blk_mq_start_stopped_hw_queues, quiescing/unquiescing respects the submission path rcu grace.
| | | |
| | | | Reviewed-by: Ming Lei <ming.lei@redhat.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues (Sagi Grimberg, 2017-07-06)
| | | | Unlike blk_mq_stop_hw_queues and blk_mq_start_stopped_hw_queues, quiescing/unquiescing respects the submission path rcu grace.
| | | |
| | | | Also, make sure to unquiesce before cleaning up the admin queue.
| | | |
| | | | Reviewed-by: Ming Lei <ming.lei@redhat.com>
| | | | Reviewed-By: James Smart <james.smart@broadcom.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-rdma: quiesce/unquiesce admin_q instead of start/stop its hw queues (Sagi Grimberg, 2017-07-06)
| | | | Unlike blk_mq_stop_hw_queues and blk_mq_start_stopped_hw_queues, quiescing/unquiescing respects the submission path rcu grace.
| | | |
| | | | Also make sure to kick the requeue list when appropriate.
| | | |
| | | | Reviewed-by: Ming Lei <ming.lei@redhat.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-rdma: remove race conditions from IB signalling (Marta Rybczynska, 2017-07-06)
| | | | This patch improves the way the RDMA IB signalling is done by using atomic operations for the signalling variable. This avoids race conditions on sig_count.
| | | |
| | | | The signalling interval changes slightly and is now the largest power of two not larger than queue depth / 2.
| | | |
| | | | ilog() usage idea by Bart Van Assche.
| | | |
| | | | Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
| | | | Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
| | | | Signed-off-by: Christoph Hellwig <hch@lst.de>
| | | | Cc: stable@vger.kernel.org
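The interval rule and the atomic counter described above can be sketched in portable C11. This is a user-space illustration, not the driver code: `floor_pow2`, `signal_interval` and `should_signal` are hypothetical names, C11 `<stdatomic.h>` stands in for the kernel's atomic helpers, and treating a depth below 2 as interval 1 is an assumption made for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Largest power of two that is <= v, for v >= 1. */
static unsigned int floor_pow2(unsigned int v)
{
    unsigned int p = 1;

    while (p <= v / 2)
        p <<= 1;
    return p;
}

/* Interval = largest power of two not larger than queue_depth / 2. */
static unsigned int signal_interval(unsigned int queue_depth)
{
    unsigned int half = queue_depth / 2;

    return floor_pow2(half ? half : 1);  /* assumption: tiny depths use 1 */
}

/*
 * Atomically bump the shared counter and request a signalled completion
 * once every `interval` calls. Because `interval` is a power of two, the
 * modulo reduces to a mask, and the single atomic increment means two
 * concurrent callers can never observe the same count.
 */
static bool should_signal(atomic_uint *sig_count, unsigned int interval)
{
    unsigned int n = atomic_fetch_add(sig_count, 1) + 1;

    return (n & (interval - 1)) == 0;
}
```

With a queue depth of 100, for instance, half the depth is 50 and the interval becomes 32.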
| | * | nvme-fc: use blk_mq_delay_run_hw_queue instead of open-coding it (Sagi Grimberg, 2017-07-04)
| | | | Cc: James Smart <james.smart@broadcom.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-fc: update tagset nr_hw_queues after queues reinit (Sagi Grimberg, 2017-07-04)
| | | | We might have more/less queues once we reconnect/reset. For example due to cpu going online/offline or controller constraints.
| | | |
| | | | Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-loop: update tagset nr_hw_queues after reconnecting/resetting (Sagi Grimberg, 2017-07-04)
| | | | We might have more/less queues once we reconnect/reset. For example due to cpu going online/offline.
| | | |
| | | | Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-rdma: update tagset nr_hw_queues after reconnecting/resetting (Sagi Grimberg, 2017-07-04)
| | | | We might have more/less queues once we reconnect/reset. For example due to cpu going online/offline or controller constraints.
| | | |
| | | | Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme-fc: don't override opts->nr_io_queues (Sagi Grimberg, 2017-07-04)
| | | | It's what the user passed, so it's probably a better idea to keep it intact. Also, limit the number of I/O queues to max online cpus and the lport maximum hw queues.
| | | |
| | | | Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
| | | | Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
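One way to picture "keep the user's value intact, but limit what is actually used" is a helper that computes the effective queue count from a read-only options struct. The names below (`struct connect_opts`, `effective_io_queues`, `min_uint`) are hypothetical and only mirror the description above, not the driver's data structures.

```c
/* Hypothetical stand-in for the user-supplied connect options. */
struct connect_opts {
    unsigned int nr_io_queues;  /* what the user asked for */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/*
 * Compute the number of I/O queues to create without modifying the
 * user's request: the smallest of the requested count, the number of
 * online CPUs, and the local port's maximum number of hw queues.
 */
static unsigned int effective_io_queues(const struct connect_opts *opts,
                                        unsigned int online_cpus,
                                        unsigned int lport_max_hw_queues)
{
    unsigned int n = min_uint(opts->nr_io_queues, online_cpus);

    return min_uint(n, lport_max_hw_queues);
}
```

Taking the options through a `const` pointer makes the intent explicit: the limit can change on every reconnect, while the original request remains available for the next attempt.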
| | * | nvme-pci: rename to nvme_pci_configure_admin_queue (Sagi Grimberg, 2017-07-02)
| | | | We are going to need the name for the core routine...
| | | |
| | | | Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
| | | | Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme: move ctrl cap to struct nvme_ctrl (Sagi Grimberg, 2017-07-02)
| | | | All transports use either a private cache of controller cap or an on-stack copy, move it to the generic struct nvme_ctrl. In the future it will also be maintained by the core.
| | | |
| | | | Reviewed-by: Christoph Hellwig <hch@lst.de>
| | | | Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
| | | | Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
| | | | Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme: move queue_count to the nvme_ctrlSagi Grimberg2017-07-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All transports use the queue_count in exactly the same way, so move it to the generic struct nvme_ctrl. In the future it will also be maintained by the core. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-By: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme: Quirks for PM1725 controllersMartin K. Petersen2017-07-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PM1725 controllers have a couple of quirks that need to be handled in the driver: - I/O queue depth must be limited to 64 entries on controllers that do not report MQES. - The host interface registers go offline briefly while resetting the chip. Thus a delay is needed before checking whether the controller is ready. Note that the admin queue depth is also limited to 64 on older versions of this board. Since our NVME_AQ_DEPTH is now 32 that is no longer an issue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
| * | | lightnvm: pblk: remove unnecessary checksJavier González2017-07-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove unnecessary checks when freeing dma memory in the completion path. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | lightnvm: pblk: control I/O flow also on tear downJavier González2017-07-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When removing a pblk instance, control the write I/O flow to the controller as we do in the fast path. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | cciss: initialize struct scsi_reqChristoph Hellwig2017-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The changes in "block: Make most scsi_req_init() calls implicit" mean that every driver that supports the generic scsi ioctls needs to call scsi_req_init on newly allocated requests, but that commit didn't add the call to the cciss driver. Fix that to avoid crashes when udev issues SG_IO commands. Fixes: ca18d6f7 ("block: Make most scsi_req_init() calls implicit") Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | null_blk: fix error flow for shared tags during module_initMax Gurtovoy2017-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case we use the shared tags feature, blk_mq_alloc_tag_set might fail during module initialization. In that case, fail the load with a suitable error code. Also move the tagset initialization process after defining the number of submission queues. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | block: Fix __blkdev_issue_zeroout loopDamien Le Moal2017-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BIO issuing loop in __blkdev_issue_zeroout() is allocating BIOs with a maximum number of bvec (pages) equal to min(nr_sects, (sector_t)BIO_MAX_PAGES). This works since the requested number of bvecs will always be limited to the absolute maximum number supported (BIO_MAX_PAGES), but this is inefficient as too many bvec entries may be requested due to the different units being used in the min() operation (number of sectors vs number of pages). To fix this, introduce the helper __blkdev_sectors_to_bio_pages() to correctly calculate the number of bvecs for zeroout BIOs as the issuing loop progresses. The calculation is done using consistent units and makes sure that the number of pages returned is at least 1 (for cases where the number of sectors is less than the number of sectors in a page). Also remove a trailing space after the bit shift in the internal loop min() call. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
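The unit mismatch in the zeroout message above is easier to see with a small calculation. The following is a userspace sketch of the idea only, not the kernel helper itself: the function name `sectors_to_bio_pages` is hypothetical, and the constants (512-byte sectors, 4 KiB pages, a cap of 256 pages per BIO) are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Assumed values for illustration; the real ones are set by the kernel. */
#define SECTOR_SHIFT   9                      /* 512-byte sectors */
#define PAGE_SHIFT     12                     /* 4 KiB pages */
#define PAGE_SIZE      (1ULL << PAGE_SHIFT)
#define BIO_MAX_PAGES  256                    /* assumed per-BIO bvec cap */

/*
 * Hypothetical helper: convert a sector count to the number of pages
 * needed to cover it, rounding up so a partial page still counts as one,
 * then cap the result at BIO_MAX_PAGES.
 */
static unsigned int sectors_to_bio_pages(uint64_t nr_sects)
{
    uint64_t bytes = nr_sects << SECTOR_SHIFT;
    uint64_t pages = (bytes + PAGE_SIZE - 1) >> PAGE_SHIFT;

    return pages < BIO_MAX_PAGES ? (unsigned int)pages : BIO_MAX_PAGES;
}
```

Under these assumptions, 8 sectors fit in a single page, whereas comparing the raw sector count against the page cap would have asked for 8 bvecs.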
| * | | mtip32xx: avoid to read HOST_CAP from HW in .queue_rq()Ming Lei2017-07-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is observed that reading the register from HW takes quite a long time. For example, on my box, the following difference in 'perf report --no-children fio ...' can be seen when running I/O: 1) V4.12 without patch + 9.28% fio [mtip32xx] [k] mtip_irq_handler + 8.48% fio [mtip32xx] [k] mtip_init_cmd_header 2) V4.12 with the following patch + 9.14% fio [mtip32xx] [k] mtip_irq_handler ...... + 1.14% fio [mtip32xx] [k] mtip_init_cmd_header IOPS can be increased by ~5% with this patch too. Fixes: a4e84aae8139(mtip32xx: use runtime tag to initialize command header) Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | bio-integrity: fix boolreturn.cocci warningskbuild test robot2017-07-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | block/bio-integrity.c:318:10-11: WARNING: return of 0/1 in function 'bio_integrity_prep' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci Fixes: e23947bd76f0 ("bio-integrity: fold bio_integrity_enabled to bio_integrity_prep") CC: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | bio-integrity: stop abusing bi_end_ioChristoph Hellwig2017-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | And instead call directly into the integrity code from bio_end_io. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>