Commit message (Author, Date)
* cgroup: separate out interface file creation from css creation (Tejun Heo, 2016-03-03)

  Currently, interface files are created when a css is created, depending on
  whether @visible is set. This patch separates the two into distinct steps
  to help code refactoring and eventually allow cgroups which aren't visible
  through cgroup fs.

  Move css_populate_dir() out of create_css() and drop @visible. While at
  it, rename the function to css_create() for consistency.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Acked-by: Zefan Li <lizefan@huawei.com>
* cgroup: suppress spurious de-populated events (Tejun Heo, 2016-03-03)

  During task migration, tasks may transfer between two css_sets which are
  associated with the same cgroup. If those tasks are the only tasks in the
  cgroup, this currently triggers a spurious de-populated event on the
  cgroup. Fix it by bumping up the populated count before bumping it down
  during migration to ensure that it doesn't reach zero spuriously.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Acked-by: Zefan Li <lizefan@huawei.com>
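  A hedged sketch of the ordering trick (the helper name follows cgroup.c's
  css_set_update_populated(), but treat the snippet as illustrative rather
  than the exact patch):

      /* When tasks move between two css_sets of the same cgroup, raise
       * the destination's populated count before dropping the source's,
       * so the count never transiently hits zero and no spurious
       * "empty" notification fires. */
      css_set_update_populated(to_cset, true);    /* up first ... */
      css_set_update_populated(from_cset, false); /* ... then down */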
* cgroup: re-hash init_css_set after subsystems are initialized (Tejun Heo, 2016-03-03)

  css_sets are hashed by their subsys[] contents and in cgroup_init()
  init_css_set is hashed early, before subsystem inits, when all entries in
  its subsys[] are NULL, so that cgroup_dfl_root initialization can find and
  link to it. As subsystems are initialized, init_css_set.subsys[] is filled
  up but the hashing is never updated, making init_css_set hashed in the
  wrong place. While incorrect, this doesn't cause a critical failure as
  css_set management code would create an identical css_set dynamically.

  Fix it by rehashing init_css_set after subsystems are initialized. While
  at it, drop the unnecessary @key local variable.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Acked-by: Zefan Li <lizefan@huawei.com>
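  The repair is essentially a two-liner at the end of cgroup_init(); a
  simplified sketch, assuming the css_set_table and css_set_hash() names
  from kernel/cgroup.c:

      /* All subsys[] pointers are now populated; move init_css_set
       * from the all-NULL bucket to its real one. */
      hash_del(&init_css_set.hlist);
      hash_add(css_set_table, &init_css_set.hlist,
               css_set_hash(init_css_set.subsys));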
* cgroup: reset css on destruction (Vladimir Davydov, 2016-03-01)

  An associated css can be around for quite a while after a cgroup directory
  has been removed. In general, it makes sense to reset it to defaults so as
  not to worry about any remnants. For instance, the memory cgroup needs to
  reset memory.low, otherwise pages charged to a dead cgroup might never get
  reclaimed. There's the ->css_reset callback, which would fit perfectly for
  the purpose. Currently, it's only called when a subsystem is disabled in
  the unified hierarchy and there are other subsystems dependent on it.
  Let's call it on css destruction as well.

  Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
  Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
  Signed-off-by: Tejun Heo <tj@kernel.org>
* cgroup: fix a mistake in warning message (Xiubo Li, 2016-02-27)

  The warning message mixes up the print format and its arguments: the
  format is "%d:%s" but the text labels the fields name:id, even though
  name is of type 'char *' and id is an 'int'. Change "name:id" to
  "id:name" to be consistent with "cgroup_subsys %d:%s".

  Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
  Acked-by: Zefan Li <lizefan@huawei.com>
  Signed-off-by: Tejun Heo <tj@kernel.org>
* cgroup: use ->subtree_control when testing no internal process rule (Tejun Heo, 2016-02-23)

  The no-internal-process rule is enforced by cgroup_migrate_prepare_dst()
  during process migration. It tests whether the target cgroup's
  ->child_subsys_mask is zero, which is different from the
  "subtree_control" write path, which tests ->subtree_control. This hasn't
  mattered because up until now both ->child_subsys_mask and
  ->subtree_control are zero or non-zero at the same time. However, with
  the planned addition of implicit controllers, this will no longer be
  true.

  This patch prepares for the change by making cgroup_migrate_prepare_dst()
  test ->subtree_control instead.

  Signed-off-by: Tejun Heo <tj@kernel.org>
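  A hedged sketch of the changed test (heavily simplified; the surrounding
  error handling and exact placement in cgroup_migrate_prepare_dst() are
  omitted):

      /* Reject migration into a cgroup that has controllers enabled
       * for its subtree, judged by what the user requested via
       * "subtree_control" rather than by the derived mask. */
      if (cgroup_on_dfl(dst_cgrp) && dst_cgrp->subtree_control)
              return -EBUSY;  /* was: dst_cgrp->child_subsys_mask */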
* cgroup: make css_tryget_online_from_dir() also recognize cgroup2 fs (Tejun Heo, 2016-02-23)

  The function currently returns -EBADF for a directory on the default
  hierarchy. Make it also recognize cgroup2_fs_type. This will be used for
  perf_event cgroup2 support.

  Signed-off-by: Tejun Heo <tj@kernel.org>
* cgroup: convert cgroup_subsys flag fields to bool bitfields (Tejun Heo, 2016-02-23)

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Cc: Li Zefan <lizefan@huawei.com>
  Cc: Ingo Molnar <mingo@redhat.com>
  Cc: Peter Zijlstra <peterz@infradead.org>
* cgroup: s/cgrp_dfl_root_/cgrp_dfl_/ (Tejun Heo, 2016-02-23)

  These variable names are unnecessarily unwieldy and another similar
  variable will be added. Let's shorten them.

  Signed-off-by: Tejun Heo <tj@kernel.org>
* cgroup: make cgroup subsystem masks u16 (Tejun Heo, 2016-02-22)

  After the recent do_each_subsys_mask() conversion, there's no reason to
  use ulong for subsystem masks. We'll be adding more subsystem masks to
  persistent data structures; let's reduce its size to u16, which should be
  enough for now and the foreseeable future. This doesn't create any
  noticeable behavior differences.

  v2: Johannes spotted that the initial patch missed cgroup_no_v1_mask.
      Converted.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Acked-by: Johannes Weiner <hannes@cmpxchg.org>
* cgroup: use do_each_subsys_mask() where applicable (Tejun Heo, 2016-02-22)

  There are several places in cgroup_subtree_control_write() which can use
  do_each_subsys_mask() instead of manual mask testing. Use it.

  No functional changes.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Acked-by: Johannes Weiner <hannes@cmpxchg.org>
* cgroup: convert for_each_subsys_which() to do-while style (Tejun Heo, 2016-02-22)

  for_each_subsys_which() allows iterating subsystems specified in a
  subsystem bitmask; unfortunately, it requires the mask to be an unsigned
  long l-value, which can be inconvenient and makes it awkward to use a
  smaller type for subsystem masks. This patch converts
  for_each_subsys_which() to do-while style, which allows it to drop the
  l-value requirement. The new iterator is named do_each_subsys_mask() /
  while_each_subsys_mask().

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Cc: Aleksa Sarai <cyphar@cyphar.com>
  Acked-by: Johannes Weiner <hannes@cmpxchg.org>
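  The resulting iterator pair looks roughly like this (a sketch of the
  shape; copying the mask into a local unsigned long is what lifts the
  l-value requirement):

      #define do_each_subsys_mask(ss, ssid, ss_mask) do {                \
              unsigned long __ss_mask = (ss_mask);                       \
              if (!CGROUP_SUBSYS_COUNT) /* avoid spurious warning */     \
                      break;                                             \
              for_each_set_bit(ssid, &__ss_mask, CGROUP_SUBSYS_COUNT) {  \
                      (ss) = cgroup_subsys[ssid];                        \
                      {

      #define while_each_subsys_mask()                                   \
                      }                                                  \
              }                                                          \
      } while (false)

  Callers can then iterate any integer-typed mask expression:

      do_each_subsys_mask(ss, ssid, enabled_mask) {
              pr_info("subsys %d: %s\n", ssid, ss->name);
      } while_each_subsys_mask();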
* cgroup: s/child_subsys_mask/subtree_ss_mask/ (Tejun Heo, 2016-02-22)

  For consistency with cgroup->subtree_control:

  * cgroup->child_subsys_mask -> cgroup->subtree_ss_mask
  * cgroup_calc_child_subsys_mask() -> cgroup_calc_subtree_ss_mask()
  * cgroup_refresh_child_subsys_mask() -> cgroup_refresh_subtree_ss_mask()

  No functional changes.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Acked-by: Johannes Weiner <hannes@cmpxchg.org>
* Revert "cgroup: add cgroup_subsys->css_e_css_changed()" (Tejun Heo, 2016-02-22)

  This reverts commit 56c807ba4e91f0980567b6a69de239677879b17f.

  cgroup_subsys->css_e_css_changed() was supposed to be used by cgroup
  writeback support; however, the change to per-inode cgroup association
  made it unnecessary and the callback doesn't have any user. Remove it.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Acked-by: Johannes Weiner <hannes@cmpxchg.org>
* cgroup: fix error return value of cgroup_addrm_files() (Tejun Heo, 2016-02-22)

  cgroup_addrm_files() incorrectly returned 0 after add failure. Fix it.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Acked-by: Johannes Weiner <hannes@cmpxchg.org>
* cgroup: document cgroup_no_v1= (Johannes Weiner, 2016-02-16)

  Add cgroup_no_v1= to kernel-parameters.txt, and a small blurb to the
  cgroup-v2.txt section about transitioning from cgroup to cgroup2.

  Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
  Signed-off-by: Tejun Heo <tj@kernel.org>
* cgroup: provide cgroup_no_v1= to disable controllers in v1 mounts (Johannes Weiner, 2016-02-12)

  Testing cgroup2 can be painful with system software automatically
  mounting and populating all cgroup controllers in v1 mode. Sometimes they
  can be unmounted from rc.local; sometimes even that is too late.

  Provide a command line option to disable certain controllers in v1
  mounts, so that they remain available for cgroup2 mounts.

  Example use:

    cgroup_no_v1=memory,cpu
    cgroup_no_v1=all

  Disabling will be confirmed at boot time as such:

    [    0.013770] Disabling cpu control group subsystem in v1 mounts
    [    0.016004] Disabling memory control group subsystem in v1 mounts

  Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
  Signed-off-by: Tejun Heo <tj@kernel.org>
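  A sketch of the parser behind the option (close in spirit to the patch;
  treat details such as the mask variable's width and the exact matching
  against legacy_name as illustrative):

      static unsigned long cgroup_no_v1_mask;

      static int __init cgroup_no_v1(char *str)
      {
              struct cgroup_subsys *ss;
              char *token;
              int i;

              while ((token = strsep(&str, ",")) != NULL) {
                      if (!*token)
                              continue;
                      if (!strcmp(token, "all")) {
                              cgroup_no_v1_mask = ~0UL;
                              break;
                      }
                      for_each_subsys(ss, i)
                              if (!strcmp(token, ss->name) ||
                                  !strcmp(token, ss->legacy_name))
                                      cgroup_no_v1_mask |= 1UL << i;
              }
              return 1;
      }
      __setup("cgroup_no_v1=", cgroup_no_v1);

  The v1 mount path then masks these subsystems out of what may be bound to
  a legacy hierarchy, leaving them free for cgroup2.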
* kernel/Makefile: remove the useless CFLAGS_REMOVE_cgroup-debug.o (Li Bin, 2016-01-31)

  The file cgroup-debug.c was removed by commit fe6934354f8e (cgroups: move
  the cgroup debug subsys into cgroup.c to access internal state), which
  left the CFLAGS_REMOVE_cgroup-debug.o = $(CC_FLAGS_FTRACE) line in
  kernel/Makefile useless. Remove it.

  Signed-off-by: Li Bin <huawei.libin@huawei.com>
  Acked-by: Zefan Li <lizefan@huawei.com>
  Signed-off-by: Tejun Heo <tj@kernel.org>
* Documentation: cgroup: Fix 'cgroup-legacy' -> 'cgroup-v1' (W. Trevor King, 2016-01-29)

  This should have happened in 6255c46f (cgroup: rename cgroup
  documentations, 2016-01-11).

  Signed-off-by: W. Trevor King <wking@tremily.us>
  Signed-off-by: Tejun Heo <tj@kernel.org>
* cgroup: make sure a parent css isn't freed before its children (Tejun Heo, 2016-01-22)

  There are three subsystem callbacks in the css shutdown path -
  css_offline(), css_released() and css_free(). Except for css_released(),
  cgroup core didn't guarantee the order of invocation. css_offline() or
  css_free() could be called on a parent css before its children. This
  behavior is unexpected and led to bugs in the cpu and memory controllers.

  The previous patch updated ordering for css_offline(), which fixes the
  cpu controller issue. While there currently isn't a known bug caused by
  misordering of css_free() invocations, let's fix it too for consistency.
  css_free() ordering can be trivially fixed by moving the put of the
  parent css below the css_free() invocation.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Cc: Peter Zijlstra <peterz@infradead.org>
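  A hedged sketch of the reordering in the css free path (simplified; the
  real work function does more bookkeeping around these calls):

      struct cgroup_subsys_state *parent = css->parent;

      if (css->ss->css_free)
              css->ss->css_free(css);
      css_put(parent);   /* now dropped only after ->css_free() runs,
                          * so the parent outlives the child's free */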
* cgroup: make sure a parent css isn't offlined before its children (Tejun Heo, 2016-01-22)

  There are three subsystem callbacks in the css shutdown path -
  css_offline(), css_released() and css_free(). Except for css_released(),
  cgroup core didn't guarantee the order of invocation. css_offline() or
  css_free() could be called on a parent css before its children. This
  behavior is unexpected and led to bugs in the cpu and memory controllers.

  This patch updates the offline path so that a parent css is never
  offlined before its children. Each css keeps online_cnt, which reaches
  zero iff itself and all its children are offline, and offline_css() is
  invoked only after online_cnt reaches zero.

  This fixes the memory controller bug and allows the fix for the cpu
  controller.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
  Reported-by: Brian Christiansen <brian.o.christiansen@gmail.com>
  Link: http://lkml.kernel.org/g/5698A023.9070703@de.ibm.com
  Link: http://lkml.kernel.org/g/CAKB58ikDkzc8REt31WBkD99+hxNzjK4+FBmhkgS+NVrC9vjMSg@mail.gmail.com
  Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: stable@vger.kernel.org
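  A sketch of the counting scheme (simplified from the patch; error paths
  and the existing onlining work are elided):

      static int online_css(struct cgroup_subsys_state *css)
      {
              /* ... existing onlining work ... */
              atomic_inc(&css->online_cnt);
              if (css->parent)
                      atomic_inc(&css->parent->online_cnt);
              return 0;
      }

  In the kill path, offlining walks upward only while a css's count
  (itself plus all children) drains to zero, so a parent can never be
  offlined while a child is still online:

      do {
              offline_css(css);
              css_put(css);
              css = css->parent;
      } while (css && atomic_dec_and_test(&css->online_cnt));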
* cpuset: make mm migration asynchronous (Tejun Heo, 2016-01-22)

  If "cpuset.memory_migrate" is set, when a process is moved from one
  cpuset to another with a different memory node mask, pages used by the
  process are migrated to the new set of nodes. This was performed
  synchronously in the ->attach() callback, which is synchronized against
  process management. Recently, the synchronization was changed from a
  per-process rwsem to a global percpu rwsem for simplicity and
  optimization.

  Combined with the synchronous mm migration, this led to deadlocks because
  mm migration could schedule a work item which may in turn try to create a
  new worker blocking on the process management lock held from the cgroup
  process migration path.

  Such a heavy operation shouldn't be performed synchronously from that
  deep inside cgroup migration in the first place. This patch punts the
  actual migration to an ordered workqueue and updates the cgroup process
  migration and cpuset config update paths to flush the workqueue after all
  locks are released. This way, the operations still seem synchronous to
  userland without entangling mm migration with process management
  synchronization. CPU hotplug can also invoke mm migration, but there's no
  reason for it to wait for mm migrations and thus it doesn't synchronize
  against their completions.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
  Cc: stable@vger.kernel.org # v4.4+
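  A hedged sketch of the punt (structured after the patch's approach; the
  struct and function names are close to but not guaranteed to match the
  final code):

      struct cpuset_migrate_mm_work {
              struct work_struct      work;
              struct mm_struct        *mm;
              nodemask_t              from;
              nodemask_t              to;
      };

      static void cpuset_migrate_mm_workfn(struct work_struct *work)
      {
              struct cpuset_migrate_mm_work *mwork =
                      container_of(work, struct cpuset_migrate_mm_work, work);

              /* The heavy lifting now runs off an ordered workqueue,
               * outside the cgroup process-management locks. */
              do_migrate_pages(mwork->mm, &mwork->from, &mwork->to,
                               MPOL_MF_MOVE_ALL);
              mmput(mwork->mm);
              kfree(mwork);
      }

  Callers flush the migration workqueue after dropping their locks, so the
  operation still appears synchronous to userland.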
*   Merge branch 'for-4.5/nvme' of git://git.kernel.dk/linux-block (Linus Torvalds, 2016-01-21)
|\
  Pull NVMe updates from Jens Axboe:
   "Last branch for this series is the nvme changes. It's in a separate
    branch to avoid splitting too much between core and NVMe changes, since
    NVMe is still helping drive some blk-mq changes. That said, not a huge
    amount of core changes in here. The grunt of the work is the continued
    split of the code"

  * 'for-4.5/nvme' of git://git.kernel.dk/linux-block: (67 commits)
      uapi: update install list after nvme.h rename
      NVMe: Export NVMe attributes to sysfs group
      NVMe: Shutdown controller only for power-off
      NVMe: IO queue deletion re-write
      NVMe: Remove queue freezing on resets
      NVMe: Use a retryable error code on reset
      NVMe: Fix admin queue ring wrap
      nvme: make SG_IO support optional
      nvme: fixes for NVME_IOCTL_IO_CMD on the char device
      nvme: synchronize access to ctrl->namespaces
      nvme: Move nvme_freeze/unfreeze_queues to nvme core
      PCI/AER: include header file
      NVMe: Export namespace attributes to sysfs
      NVMe: Add pci error handlers
      block: remove REQ_NO_TIMEOUT flag
      nvme: merge iod and cmd_info
      nvme: meta_sg doesn't have to be an array
      nvme: properly free resources for cancelled command
      nvme: simplify completion handling
      nvme: special case AEN requests
      ...
| * uapi: update install list after nvme.h rename (Mike Frysinger, 2016-01-13)

    Commit 9d99a8dda154 ("nvme: move hardware structures out of the uapi
    version of nvme.h") renamed nvme.h to nvme_ioctl.h, but the uapi list
    still refers to nvme.h. People trying to install the headers hit a
    failure as the header no longer exists.

    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Frysinger <vapier@gentoo.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Export NVMe attributes to sysfs group (Keith Busch, 2016-01-12)

    Adds all controller information to the attribute list exposed to sysfs,
    and appends the reset_controller attribute to it. The nvme device is
    created with this attribute list, so the driver no longer manages its
    attributes.

    Reported-by: Sujith Pandel <sujithpshankar@gmail.com>
    Cc: Sujith Pandel <sujithpshankar@gmail.com>
    Cc: David Milburn <dmilburn@redhat.com>
    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Shutdown controller only for power-off (Keith Busch, 2016-01-12)

    We don't need to shut down a controller for a reset. A controller in a
    shutdown state may take longer to become ready than one that was simply
    disabled. This patch has the driver shut down a controller only if the
    device is about to be powered off or removed. When taking the
    controller down for a reset, the controller will be disabled instead.

    Function names have been updated in this patch to reflect their changed
    semantics.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: IO queue deletion re-write (Keith Busch, 2016-01-12)

    The nvme driver deletes IO queues asynchronously since this operation
    may potentially take an undesirable amount of time with a large number
    of queues if done serially. The driver used to manage coordinating
    asynchronous deletions. This patch simplifies that by leveraging the
    block layer rather than using kthread workers and chaining more
    complicated callbacks.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Remove queue freezing on resets (Keith Busch, 2016-01-12)

    NVMe submits all commands through the block layer now. This means we
    can let requests queue at the blk-mq hardware context, since no path
    bypasses it anymore, so we don't need to freeze the queues. The driver
    can simply stop the h/w queues from running during a reset instead.

    This also fixes a WARN in percpu_ref_reinit when the queue was unfrozen
    with requeued requests.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
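    A hedged sketch of the behavioral change during reset (simplified;
    blk_mq_stop_hw_queues() was the stock blk-mq interface of that era, and
    the loop over ctrl->namespaces is illustrative):

        /* Park requests in blk-mq instead of freezing the queues
         * (previously blk_mq_freeze_queue() on each namespace). */
        list_for_each_entry(ns, &dev->ctrl.namespaces, list)
                blk_mq_stop_hw_queues(ns->queue);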
| * NVMe: Use a retryable error code on reset (Keith Busch, 2016-01-12)

    A negative status has the "do not retry" bit set, which makes it not
    retryable. Use a fake status that can potentially be retried on reset.

    An aborted command's status is overridden by the timeout handler so
    that it won't be retried, which is necessary to keep initialization
    from getting into a reset loop.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Fix admin queue ring wrap (Keith Busch, 2016-01-12)

    The tag set queue depth needs to be one less than the h/w queue depth
    so we don't wrap the circular buffer. This conforms to the
    specification-defined "Full Queue" condition.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
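    A sketch of the constraint (NVME_AQ_DEPTH is the driver's admin queue
    depth constant; the bare assignment is a simplification of the tag set
    setup):

        /* With a ring of NVME_AQ_DEPTH entries, at most NVME_AQ_DEPTH - 1
         * commands may be outstanding, otherwise head == tail becomes
         * ambiguous (the spec's "Full Queue" condition). */
        dev->admin_tagset.queue_depth = NVME_AQ_DEPTH - 1;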
| * nvme: make SG_IO support optional (Christoph Hellwig, 2016-01-12)

    Translating SCSI commands to NVMe commands is rather pointless in
    general, as applications must not expect to be able to use SCSI
    commands on a generic block device. Make the huge translation layer
    optional and hope no one will ever enable it in the future.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: fixes for NVME_IOCTL_IO_CMD on the char device (Christoph Hellwig, 2016-01-12)

    Make sure we synchronize access to the namespaces list and grab a
    reference to the namespace before doing I/O. Make sure to reject the
    ioctl if multiple namespaces are present, as it's entirely unsafe, and
    warn when using it even with a single namespace.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: synchronize access to ctrl->namespaces (Christoph Hellwig, 2016-01-12)

    Currently traversal and modification of ctrl->namespaces happens
    completely unsynchronized, which can be fixed by the addition of a
    simple mutex.

    Note: nvme_dev_ioctl will be handled in the next patch.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
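    A hedged sketch of the pattern (the namespaces_mutex name is an
    assumption based on this era of the driver; treat it as illustrative):

        mutex_lock(&ctrl->namespaces_mutex);
        list_for_each_entry(ns, &ctrl->namespaces, list) {
                /* per-namespace work that previously raced with
                 * namespace addition and removal */
        }
        mutex_unlock(&ctrl->namespaces_mutex);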
| * nvme: Move nvme_freeze/unfreeze_queues to nvme core (Sagi Grimberg, 2016-01-12)

    There is nothing PCI-specific about them, and we'll need them exported
    for other transports too.

    Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * PCI/AER: include header file (Sudip Mukherjee, 2015-12-23)

    We are having a build failure with sparc allmodconfig with the error:

      drivers/nvme/host/pci.c:15:0:
      include/linux/aer.h: In function 'pci_enable_pcie_error_reporting':
      include/linux/aer.h:49:10: error: 'EINVAL' undeclared (first use in this function)

    The file aer.h is using the error values but they are defined in
    errno.h. Include errno.h so that we have the definitions of the error
    codes.

    Fixes: a0a3408ee614 ("NVMe: Add pci error handlers")
    Cc: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Export namespace attributes to sysfs (Keith Busch, 2015-12-22)

    Exposes the NGUID, EUI-64, and NSID to sysfs entries under the disk's
    kobject.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Add pci error handlers (Keith Busch, 2015-12-22)

    Requests enabling of PCIe AER support. Shuts down the controller when
    an error is detected with I/O in the frozen state, prior to requesting
    a slot reset; resumes the controller after the reset completes.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * block: remove REQ_NO_TIMEOUT flag (Christoph Hellwig, 2015-12-22)

    This was added for the 'magic' AEN requests in the NVMe driver that
    never return. We now handle them purely inside the driver and don't
    need this core hack any more.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: merge iod and cmd_info (Christoph Hellwig, 2015-12-22)

    Merge the two per-request structures in the nvme driver into a single
    one.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: meta_sg doesn't have to be an array (Christoph Hellwig, 2015-12-22)

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: properly free resources for cancelled command (Christoph Hellwig, 2015-12-22)

    We need to move freeing of resources to the ->complete handler to
    ensure they are also freed when we cancel the command.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: simplify completion handling (Christoph Hellwig, 2015-12-22)

    Now that all commands are executed as block layer requests, we can
    remove the internal completion in the NVMe driver. Note that we can
    simply call blk_mq_complete_request to abort commands, as the block
    layer will protect against double completions internally.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: special case AEN requests (Christoph Hellwig, 2015-12-22)

    AEN requests are different from other requests in that they don't time
    out and can't easily be cancelled. Because of that we should not use
    the blk-mq infrastructure for them but just special case them in the
    completion path.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
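    A hedged sketch of the special case in the completion loop (simplified;
    the NVME_AQ_BLKMQ_DEPTH constant name follows this era of the driver
    and should be treated as an assumption):

        /* Command IDs at or above the blk-mq depth are reserved for
         * AENs on the admin queue and never map to a block request. */
        if (unlikely(nvmeq->qid == 0 &&
                     cqe.command_id >= NVME_AQ_BLKMQ_DEPTH)) {
                nvme_complete_async_event(nvmeq->dev, &cqe);
                continue;
        }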
| * nvme: switch abort to blk_execute_rq_nowait (Christoph Hellwig, 2015-12-22)

    And remove the now unused nvme_submit_cmd helper.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: switch delete SQ/CQ to blk_execute_rq_nowait (Christoph Hellwig, 2015-12-22)

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: factor out a few helpers from req_completion (Christoph Hellwig, 2015-12-22)

    We'll need them in other places later.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * nvme: fix admin queue depth (Christoph Hellwig, 2015-12-22)

    The number in tag_set->queue_depth includes the reserved tags.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Simplify metadata setup (Keith Busch, 2015-12-22)

    We no longer require the two-pass setup for block integrity.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Remove device management handles on remove (Keith Busch, 2015-12-22)

    We don't want to allow new references to open on a device that is
    removed. This ties the lifetime of these handles to the physical
    device's presence rather than to the open reference count.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
| * NVMe: Use unbounded work queue for all work (Keith Busch, 2015-12-22)

    Removes all usage of the global work queue so work can't be scheduled
    on two different work queues, and removes nvme's work queue
    singlethreadedness so controllers can be driven in parallel.

    Signed-off-by: Keith Busch <keith.busch@intel.com>
    [hch: keep the dead controller removal on the system workqueue to
     avoid deadlocks]
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@fb.com>