aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/block/xen-blkfront.c
Commit message (Collapse)AuthorAge
* xen-blkfront: free allocated pageRoger Pau Monne2012-11-26
| | | | | | | Free the page allocated for the persistent grant. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/blkback: persistent-grants fixesRoger Pau Monne2012-11-04
| | | | | | | | | | | | | | | | | | | | | | | This patch contains fixes for persistent grants implementation v2: * handle == 0 is a valid handle, so initialize grants in blkback setting the handle to BLKBACK_INVALID_HANDLE instead of 0. Reported by Konrad Rzeszutek Wilk. * new_map is a boolean, use "true" or "false" instead of 1 and 0. Reported by Konrad Rzeszutek Wilk. * blkfront announces the persistent-grants feature as feature-persistent-grants, use feature-persistent instead which is consistent with blkback and the public Xen headers. * Add a consistency check in blkfront to make sure we don't try to access segments that have not been set. Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Roger Pau Monne <roger.pau@citrix.com> [v1: The new_map int->bool had already been changed] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/blkback: Persistent grant maps for xen blk driversRoger Pau Monne2012-10-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements persistent grants for the xen-blk{front,back} mechanism. The effect of this change is to reduce the number of unmap operations performed, since they cause a (costly) TLB shootdown. This allows the I/O performance to scale better when a large number of VMs are performing I/O. Previously, the blkfront driver was supplied a bvec[] from the request queue. This was granted to dom0; dom0 performed the I/O and wrote directly into the grant-mapped memory and unmapped it; blkfront then removed foreign access for that grant. The cost of unmapping scales badly with the number of CPUs in Dom0. An experiment showed that when Dom0 has 24 VCPUs, and guests are performing parallel I/O to a ramdisk, the IPIs from performing unmap's is a bottleneck at 5 guests (at which point 650,000 IOPS are being performed in total). If more than 5 guests are used, the performance declines. By 10 guests, only 400,000 IOPS are being performed. This patch improves performance by only unmapping when the connection between blkfront and back is broken. On startup blkfront notifies blkback that it is using persistent grants, and blkback will do the same. If blkback is not capable of persistent mapping, blkfront will still use the same grants, since it is compatible with the previous protocol, and simplifies the code complexity in blkfront. To perform a read, in persistent mode, blkfront uses a separate pool of pages that it maps to dom0. When a request comes in, blkfront transmutes the request so that blkback will write into one of these free pages. Blkback keeps note of which grefs it has already mapped. When a new ring request comes to blkback, it looks to see if it has already mapped that page. If so, it will not map it again. If the page hasn't been previously mapped, it is mapped now, and a record is kept of this mapping. Blkback proceeds as usual. When blkfront is notified that blkback has completed a request, it memcpy's from the shared memory, into the bvec supplied. A record that the {gref, page} tuple is mapped, and not inflight is kept. Writes are similar, except that the memcpy is peformed from the supplied bvecs, into the shared pages, before the request is put onto the ring. Blkback stores a mapping of grefs=>{page mapped to by gref} in a red-black tree. As the grefs are not known apriori, and provide no guarantees on their ordering, we have to perform a search through this tree to find the page, for every gref we receive. This operation takes O(log n) time in the worst case. In blkfront grants are stored using a single linked list. The maximum number of grants that blkback will persistenly map is currently set to RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST, to prevent a malicios guest from attempting a DoS, by supplying fresh grefs, causing the Dom0 kernel to map excessively. If a guest is using persistent grants and exceeds the maximum number of grants to map persistenly the newly passed grefs will be mapped and unmaped. Using this approach, we can have requests that mix persistent and non-persistent grants, and we need to handle them correctly. This allows us to set the maximum number of persistent grants to a lower value than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST, although setting it will lead to unpredictable performance. In writing this patch, the question arrises as to if the additional cost of performing memcpys in the guest (to/from the pool of granted pages) outweigh the gains of not performing TLB shootdowns. The answer to that question is `no'. There appears to be very little, if any additional cost to the guest of using persistent grants. There is perhaps a small saving, from the reduced number of hypercalls performed in granting, and ending foreign access. Signed-off-by: Oliver Chick <oliver.chick@citrix.com> Signed-off-by: Roger Pau Monne <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v1: Fixed up the misuse of bool as int]
* workqueue: deprecate flush[_delayed]_work_sync()Tejun Heo2012-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | flush[_delayed]_work_sync() are now spurious. Mark them deprecated and convert all users to flush[_delayed]_work(). If you're cc'd and wondering what's going on: Now all workqueues are non-reentrant and the regular flushes guarantee that the work item is not pending or running on any CPU on return, so there's no reason to use the sync flushes at all and they're going away. This patch doesn't make any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Mattia Dongili <malattia@linux.it> Cc: Kent Yoder <key@linux.vnet.ibm.com> Cc: David Airlie <airlied@linux.ie> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: Bryan Wu <bryan.wu@canonical.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-wireless@vger.kernel.org Cc: Anton Vorontsov <cbou@mail.ru> Cc: Sangbeom Kim <sbkim73@samsung.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Petr Vandrovec <petr@vandrovec.name> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Avi Kivity <avi@redhat.com>
* Merge tag 'random_for_linus' of ↵Linus Torvalds2012-07-31
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull random subsystem patches from Ted Ts'o: "This patch series contains a major revamp of how we collect entropy from interrupts for /dev/random and /dev/urandom. The goal is to addresses weaknesses discussed in the paper "Mining your Ps and Qs: Detection of Widespread Weak Keys in Network Devices", by Nadia Heninger, Zakir Durumeric, Eric Wustrow, J. Alex Halderman, which will be published in the Proceedings of the 21st Usenix Security Symposium, August 2012. (See https://factorable.net for more information and an extended version of the paper.)" Fix up trivial conflicts due to nearby changes in drivers/{mfd/ab3100-core.c, usb/gadget/omap_udc.c} * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: (33 commits) random: mix in architectural randomness in extract_buf() dmi: Feed DMI table to /dev/random driver random: Add comment to random_initialize() random: final removal of IRQF_SAMPLE_RANDOM um: remove IRQF_SAMPLE_RANDOM which is now a no-op sparc/ldc: remove IRQF_SAMPLE_RANDOM which is now a no-op [ARM] pxa: remove IRQF_SAMPLE_RANDOM which is now a no-op board-palmz71: remove IRQF_SAMPLE_RANDOM which is now a no-op isp1301_omap: remove IRQF_SAMPLE_RANDOM which is now a no-op pxa25x_udc: remove IRQF_SAMPLE_RANDOM which is now a no-op omap_udc: remove IRQF_SAMPLE_RANDOM which is now a no-op goku_udc: remove IRQF_SAMPLE_RANDOM which was commented out uartlite: remove IRQF_SAMPLE_RANDOM which is now a no-op drivers: hv: remove IRQF_SAMPLE_RANDOM which is now a no-op xen-blkfront: remove IRQF_SAMPLE_RANDOM which is now a no-op n2_crypto: remove IRQF_SAMPLE_RANDOM which is now a no-op pda_power: remove IRQF_SAMPLE_RANDOM which is now a no-op i2c-pmcmsp: remove IRQF_SAMPLE_RANDOM which is now a no-op input/serio/hp_sdc.c: remove IRQF_SAMPLE_RANDOM which is now a no-op mfd: remove IRQF_SAMPLE_RANDOM which is now a no-op ...
| * xen-blkfront: remove IRQF_SAMPLE_RANDOM which is now a no-opTheodore Ts'o2012-07-19
| | | | | | | | | | | | | | | | | | | | | | | | | | With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a no-op; interrupt randomness is now collected unconditionally in a very low-overhead fashion; see commit 775f4b297b. The IRQF_SAMPLE_RANDOM flag was scheduled to be removed in 2009 on the feature-removal-schedule, so this patch is preparation for the final removal of this flag. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org>
* | xen/blkfront: Add WARN to deal with misbehaving backends.Konrad Rzeszutek Wilk2012-06-12
|/ | | | | | | | | | | | | | | | | | | | | | | | | | Part of the ring structure is the 'id' field which is under control of the frontend. The frontend stamps it with "some" value (this some in this implementation being a value less than BLK_RING_SIZE), and when it gets a response expects said value to be in the response structure. We have a check for the id field when spolling new requests but not when de-spolling responses. We also add an extra check in add_id_to_freelist to make sure that the 'struct request' was not NULL - as we cannot pass a NULL to __blk_end_request_all, otherwise that crashes (and all the operations that the response is dealing with end up with __blk_end_request_all). Lastly we also print the name of the operation that failed. [v1: s/BUG/WARN/ suggested by Stefano] [v2: Add extra check in add_id_to_freelist] [v3: Redid op_name per Jan's suggestion] [v4: add const * and add WARN on failure returns] Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen-blkfront: module exit handling adjustmentsJan Beulich2012-05-11
| | | | | | | | | The blkdev major must be released upon exit, or else the module can't attach to devices using the same majors upon being loaded again. Also avoid leaking the minor tracking bitmap. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen-blkfront: properly name all devicesJan Beulich2012-05-11
| | | | | | | | | - devices beyond xvdzz didn't get proper names assigned at all - extended devices with minors not representable within the kernel's major/minor bit split spilled into foreign majors Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* Merge branch 'for-3.4/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds2012-04-13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver bits from Jens Axboe: - A series of fixes for mtip32xx. Most from Asai at Micron, but also one from Greg, getting rid of the dependency on PCIE_HOTPLUG. - A few bug fixes for xen-blkfront, and blkback. - A virtio-blk fix for Vivek, making resize actually work. - Two fixes from Stephen, making larger transfers possible on cciss. This is needed for tape drive support. * 'for-3.4/drivers' of git://git.kernel.dk/linux-block: block: mtip32xx: remove HOTPLUG_PCI_PCIE dependancy mtip32xx: dump tagmap on failure mtip32xx: fix handling of commands in various scenarios mtip32xx: Shorten macro names mtip32xx: misc changes mtip32xx: Add new sysfs entry 'status' mtip32xx: make setting comp_time as common mtip32xx: Add new bitwise flag 'dd_flag' mtip32xx: fix error handling in mtip_init() virtio-blk: Call revalidate_disk() upon online disk resize xen/blkback: Make optional features be really optional. xen/blkback: Squash the discard support for 'file' and 'phy' type. mtip32xx: fix incorrect value set for drv_cleanup_done, and re-initialize and start port in mtip_restart_port() cciss: Fix scsi tape io with more than 255 scatter gather elements cciss: Initialize scsi host max_sectors for tape drive support xen-blkfront: make blkif_io_lock spinlock per-device xen/blkfront: don't put bdev right after getting it xen-blkfront: use bitmap_set() and bitmap_clear() xen/blkback: Enable blkback on HVM guests xen/blkback: use grant-table.c hypercall wrappers
| * xen-blkfront: make blkif_io_lock spinlock per-deviceSteven Noonan2012-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the global blkif_io_lock to the per-device structure. The spinlock seems to exists for two reasons: to disable IRQs when in the interrupt handlers for blkfront, and to protect the blkfront VBDs when a detachment is requested. Having a global blkif_io_lock doesn't make sense given the use case, and it drastically hinders performance due to contention. All VBDs with pending IOs have to take the lock in order to get work done, which serializes everything pretty badly. Signed-off-by: Steven Noonan <snoonan@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/blkfront: don't put bdev right after getting itAndrew Jones2012-03-20
| | | | | | | | | | | | | | | | We should hang onto bdev until we're done with it. Signed-off-by: Andrew Jones <drjones@redhat.com> [v1: Fixed up git commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen-blkfront: use bitmap_set() and bitmap_clear()Akinobu Mita2012-03-20
| | | | | | | | | | | | | | | | | | | | | | | | Use bitmap_set and bitmap_clear rather than modifying individual bits in a memory region. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | Merge tag 'stable/for-linus-3.4-rc1-tag' of ↵Linus Torvalds2012-04-06
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: "Two fixes for regressions: * one is a workaround that will be removed in v3.5 with proper fix in the tip/x86 tree, * the other is to fix drivers to load on PV (a previous patch made them only load in PVonHVM mode). The rest are just minor fixes in the various drivers and some cleanup in the core code." * tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success xen/pciback: fix XEN_PCI_OP_enable_msix result xen/smp: Remove unnecessary call to smp_processor_id() xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries' xen: only check xen_platform_pci_unplug if hvm
| * | xen: only check xen_platform_pci_unplug if hvmIgor Mammedov2012-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b9136d207f08 xen: initialize platform-pci even if xen_emul_unplug=never breaks blkfront/netfront by not loading them because of xen_platform_pci_unplug=0 and it is never set for PV guest. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | Merge tag 'stable/for-linus-3.4-tag-two' of ↵Linus Torvalds2012-03-24
|\| | | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull more xen updates from Konrad Rzeszutek Wilk: "One tiny feature that accidentally got lost in the initial git pull: * Add fast-EOI acking of interrupts (clear a bit instead of hypercall) And bug-fixes: * Fix CPU bring-up code missing a call to notify other subsystems. * Fix reading /sys/hypervisor even if PVonHVM drivers are not loaded. * In Xen ACPI processor driver: remove too verbose WARN messages, fix up the Kconfig dependency to be a module by default, and add dependency on CPU_FREQ. * Disable CPU frequency drivers from loading when booting under Xen (as we want the Xen ACPI processor to be used instead). * Cleanups in tmem code." * tag 'stable/for-linus-3.4-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/acpi: Fix Kconfig dependency on CPU_FREQ xen: initialize platform-pci even if xen_emul_unplug=never xen/smp: Fix bringup bug in AP code. xen/acpi: Remove the WARN's as they just create noise. xen/tmem: cleanup xen: support pirq_eoi_map xen/acpi-processor: Do not depend on CPU frequency scaling drivers. xen/cpufreq: Disable the cpu frequency scaling drivers from loading. provide disable_cpufreq() function to disable the API.
| * xen: initialize platform-pci even if xen_emul_unplug=neverIgor Mammedov2012-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When xen_emul_unplug=never is specified on kernel command line reading files from /sys/hypervisor is broken (returns -EBUSY). It is caused by xen_bus dependency on platform-pci and platform-pci isn't initialized when xen_emul_unplug=never is specified. Fix it by allowing platform-pci to ignore xen_emul_unplug=never, and do not intialize xen_[blk|net]front instead. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | Merge branch 'for-3.3/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds2012-01-15
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-3.3/drivers' of git://git.kernel.dk/linux-block: mtip32xx: do rebuild monitoring asynchronously xen-blkfront: Use kcalloc instead of kzalloc to allocate array mtip32xx: uninitialized variable in mtip_quiesce_io() mtip32xx: updates based on feedback xen-blkback: convert hole punching to discard request on loop devices xen/blkback: Move processing of BLKIF_OP_DISCARD from dispatch_rw_block_io xen/blk[front|back]: Enhance discard support with secure erasing support. xen/blk[front|back]: Squash blkif_request_rw and blkif_request_discard together mtip32xx: update to new ->make_request() API mtip32xx: add module.h include to avoid conflict with moduleh tree mtip32xx: mark a few more items static mtip32xx: ensure that all local functions are static mtip32xx: cleanup compat ioctl handling mtip32xx: fix warnings/errors on 32-bit compiles block: Add driver for Micron RealSSD pcie flash cards
| * xen-blkfront: Use kcalloc instead of kzalloc to allocate arrayThomas Meyer2011-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | The advantage of kcalloc is, that will prevent integer overflows which could result from the multiplication of number of elements and size and it is also a bit nicer to read. The semantic patch that makes this change is available in https://lkml.org/lkml/2011/11/25/107 Signed-off-by: Thomas Meyer <thomas@m3y3r.de> [v1: Seperated the drivers/block/cciss_scsi.c out of this patch] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/blk[front|back]: Enhance discard support with secure erasing support.Konrad Rzeszutek Wilk2011-11-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Part of the blkdev_issue_discard(xx) operation is that it can also issue a secure discard operation that will permanantly remove the sectors in question. We advertise that we can support that via the 'discard-secure' attribute and on the request, if the 'secure' bit is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE. CC: Li Dongyang <lidongyang@novell.com> [v1: Used 'flag' instead of 'secure:1' bit] [v2: Use 'reserved' uint8_t instead of adding a new value] [v3: Check for nseg when mapping instead of operation] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/blk[front|back]: Squash blkif_request_rw and blkif_request_discard togetherKonrad Rzeszutek Wilk2011-11-18
| | | | | | | | | | | | | | | | In a union type structure to deal with the overlapping attributes in a easier manner. Suggested-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | Xen: consolidate and simplify struct xenbus_driver instantiationJan Beulich2012-01-04
|/ | | | | | | | | | | | | | | | | | | | | | | | | | The 'name', 'owner', and 'mod_name' members are redundant with the identically named fields in the 'driver' sub-structure. Rather than switching each instance to specify these fields explicitly, introduce a macro to simplify this. Eliminate further redundancy by allowing the drvname argument to DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from the ID table will be used for .driver.name). Also eliminate the questionable xenbus_register_{back,front}end() wrappers - their sole remaining purpose was the checking of the 'owner' field, proper setting of which shouldn't be an issue anymore when the macro gets used. v2: Restore DRV_NAME for the driver name in xen-pciback. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* Merge branch 'for-3.2/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds2011-11-04
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-3.2/drivers' of git://git.kernel.dk/linux-block: (30 commits) virtio-blk: use ida to allocate disk index hpsa: add small delay when using PCI Power Management to reset for kump cciss: add small delay when using PCI Power Management to reset for kump xen/blkback: Fix two races in the handling of barrier requests. xen/blkback: Check for proper operation. xen/blkback: Fix the inhibition to map pages when discarding sector ranges. xen/blkback: Report VBD_WSECT (wr_sect) properly. xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests. xen-blkfront: plug device number leak in xlblk_init() error path xen-blkfront: If no barrier or flush is supported, use invalid operation. xen-blkback: use kzalloc() in favor of kmalloc()+memset() xen-blkback: fixed indentation and comments xen-blkfront: fix a deadlock while handling discard response xen-blkfront: Handle discard requests. xen-blkback: Implement discard requests ('feature-discard') xen-blkfront: add BLKIF_OP_DISCARD and discard request struct drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd() drivers/block/loop.c: emit uevent on auto release drivers/block/cpqarray.c: use pci_dev->revision loop: always allow userspace partitions and optionally support automatic scanning ... Fic up trivial header file includsion conflict in drivers/block/loop.c
| * xen-blkfront: plug device number leak in xlblk_init() error pathLaszlo Ersek2011-10-13
| | | | | | | | | | | | | | | | ... though after a failed xenbus_register_frontend() all may be lost. Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen-blkfront: If no barrier or flush is supported, use invalid operation.Konrad Rzeszutek Wilk2011-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Guard against issuing BLKIF_OP_WRITE_BARRIER or BLKIF_OP_FLUSH_CACHE by checking whether we successfully negotiated with the backend. The negotiation with the backend also sets the q->flush_flags which fortunately for us is also used when submitting an bio to us. If we don't support barriers or flushes it would be set to zero so we should never end up having to deal with REQ_FLUSH | REQ_FUA. However, other third party implementations of __make_request that might be stacked on top of us might not be so smart, so lets fix this up. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen-blkfront: fix a deadlock while handling discard responseLi Dongyang2011-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we get -EOPNOTSUPP response for a discard request, we will clear the discard flag on the request queue so we won't attempt to send discard requests to backend again, and this should be protected under rq->queue_lock. However, when we setup the request queue, we pass blkif_io_lock to blk_init_queue so rq->queue_lock is blkif_io_lock indeed, and this lock is already taken when we are in blkif_interrpt, so remove the spin_lock/spin_unlock when we clear the discard flag or we will end up with deadlock here Signed-off-by: Li Dongyang <lidongyang@novell.com> [v1: Updated description a bit and removed comment from source] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen-blkfront: Handle discard requests.Li Dongyang2011-10-13
| | | | | | | | | | | | | | | | | | | | | | If the backend advertises 'feature-discard', then interrogate the backend for alignment and granularity. Setup the request queue with the appropiate values and send the discard operation as required. Signed-off-by: Li Dongyang <lidongyang@novell.com> [v1: Amended commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | xen-blkfront: Fix one off warning about name clashStefan Bader2011-07-14
| | | | | | | | | | | | | | | | Avoid telling users to use xvde and onwards when using xvde. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | xen-blkfront: Drop name and minor adjustments for emulated scsi devicesStefan Bader2011-07-14
|/ | | | | | | | | | | | | These were intended to avoid the namespace clash when representing emulated IDE and SCSI devices. However that seems to confuse users more than expected (a disk defined as sda becomes xvde). So for now go back to the scheme which does no adjustments. This will break when mixing IDE and SCSI names in the configuration of guests but should be by now expected. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen-blkfront: Introduce BLKIF_OP_FLUSH_DISKCACHE support.Konrad Rzeszutek Wilk2011-05-12
| | | | | | | | | | | | If the backend supports the 'feature-flush-cache' mode, use that instead of the 'feature-barrier' support. Currently there are three backends that support the 'feature-flush-cache' mode: NetBSD, Solaris and Linux kernel. The 'flush' option is much light-weight version than the 'barrier' support so lets try to use as there are no filesystems in the kernel that use full barriers anymore. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen-blkfront: fix data size for xenbus_gather in blkfront_connectMarek Marczykowski2011-05-12
| | | | | | | | | | | | barrier variable is int, not long. This overflow caused another variable override: "err" (in PV code) and "binfo" (in xenlinux code - drivers/xen/blkfront/blkfront.c). The later caused incorrect device flags (RO/removable etc). Signed-off-by: Marek Marczykowski <marmarek@mimuw.edu.pl> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> [v1: Changed title] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* Merge branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvmLinus Torvalds2011-03-15
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: suspend: remove xen_hvm_suspend xen: suspend: pull pre/post suspend hooks out into suspend_info xen: suspend: move arch specific pre/post suspend hooks into generic hooks xen: suspend: refactor non-arch specific pre/post suspend hooks xen: suspend: add "arch" to pre/post suspend hooks xen: suspend: pass extra hypercall argument via suspend_info struct xen: suspend: refactor cancellation flag into a structure xen: suspend: use HYPERVISOR_suspend for PVHVM case instead of open coding xen: switch to new schedop hypercall by default. xen: use new schedop interface for suspend xen: do not respond to unknown xenstore control requests xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled xen: PV on HVM: support PV spinlocks and IPIs xen: make the ballon driver work for hvm domains xen-blkfront: handle Xen major numbers other than XENVBD xen: do not use xen_info on HVM, set pv_info name to "Xen HVM" xen: no need to delay xen_setup_shutdown_event for hvm guests anymore
| * xen-blkfront: handle Xen major numbers other than XENVBDStefano Stabellini2011-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes sure blkfront handles correctly virtual device numbers corresponding to Xen emulated IDE and SCSI disks: in those cases blkfront translates the major number to XENVBD and the minor number to a low xvd minor. Note: this behaviour is different from what old xenlinux PV guests used to do: they used to steal an IDE or SCSI major number and use it instead. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
* | xen: Union the blkif_request request specific fieldsOwen Smith2011-03-08
|/ | | | | | | | | | Prepare for extending the block device ring to allow request specific fields, by moving the request specific fields for reads, writes and barrier requests to a union member. Acked-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Owen Smith <owen.smith@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2011-01-07
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.
| * xen: don't use flush_scheduled_work()Tejun Heo2010-12-24
| | | | | | | | | | | | | | | | | | flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush info->work instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * block: remove REQ_HARDBARRIERChristoph Hellwig2010-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | REQ_HARDBARRIER is dead now, so remove the leftovers. What's left at this point is: - various checks inside the block layer. - sanity checks in bio based drivers. - now unused bio_empty_barrier helper. - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while, but Xen really needs to sort out it's barrier situaton. - setting of ordered tags in uas - dead code copied from old scsi drivers. - scsi different retry for barriers - it's dead and should have been removed when flushes were converted to FS requests. - blktrace handling of barriers - removed. Someone who knows blktrace better should add support for REQ_FLUSH and REQ_FUA, though. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | xen: Provide a variant of __RING_SIZE() that is an integer constant expressionJeremy Fitzhardinge2010-12-15
| | | | | | | | | | | | | | | | | | | | | | | | Without this, gcc 4.5 won't compile xen-netfront and xen-blkfront, where this is being used to specify array sizes. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: David Miller <davem@davemloft.net> Cc: Stable Kernel <stable@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | xen/blkfront: cope with backend that fail empty BLKIF_OP_WRITE_BARRIER requestsJeremy Fitzhardinge2010-11-02
| | | | | | | | | | | | | | | | | | Some(?) Xen block backends fail BLKIF_OP_WRITE_BARRIER requests, which Linux uses as a cache flush operation. In that case, disable use of FLUSH. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Daniel Stodden <daniel.stodden@citrix.com>
* | xen/blkfront: Implement FUA with BLKIF_OP_WRITE_BARRIERJeremy Fitzhardinge2010-11-02
| | | | | | | | | | | | | | | | The BLKIF_OP_WRITE_BARRIER is a full ordered barrier, so we can use it to implement FUA as well as a plain FLUSH. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Christoph Hellwig <hch@lst.de>
* | xen/blkfront: change blk_shadow.request to proper pointerJeremy Fitzhardinge2010-11-02
| | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* | xen/blkfront: map REQ_FLUSH into a full barrierJeremy Fitzhardinge2010-11-02
|/ | | | | | | Implement a flush as a full barrier, since we have nothing weaker. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Christoph Hellwig <hch@lst.de>
* Merge branch 'stable/xen-pcifront-0.8.2' of ↵Linus Torvalds2010-10-28
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm * 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: register xen pci notifier xen: initialize cpu masks for pv guests in xen_smp_init xen: add a missing #include to arch/x86/pci/xen.c xen: mask the MTRR feature from the cpuid xen: make hvc_xen console work for dom0. xen: add the direct mapping area for ISA bus access xen: Initialize xenbus for dom0. xen: use vcpu_ops to setup cpu masks xen: map a dummy page for local apic and ioapic in xen_set_fixmap xen: remap MSIs into pirqs when running as initial domain xen: remap GSIs as pirqs when running as initial domain xen: introduce XEN_DOM0 as a silent option xen: map MSIs into pirqs xen: support GSI -> pirq remapping in PV on HVM guests xen: add xen hvm acpi_register_gsi variant acpi: use indirect call to register gsi in different modes xen: implement xen_hvm_register_pirq xen: get the maximum number of pirqs from xen xen: support pirq != irq * 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits) X86/PCI: Remove the dependency on isapnp_disable. xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright. x86: xen: Sanitse irq handling (part two) swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it. MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer. xen/pci: Request ACS when Xen-SWIOTLB is activated. xen-pcifront: Xen PCI frontend driver. xenbus: prevent warnings on unhandled enumeration values xenbus: Xen paravirtualised PCI hotplug support. xen/x86/PCI: Add support for the Xen PCI subsystem x86: Introduce x86_msi_ops msi: Introduce default_[teardown|setup]_msi_irqs with fallback. x86/PCI: Export pci_walk_bus function. x86/PCI: make sure _PAGE_IOMAP it set on pci mappings x86/PCI: Clean up pci_cache_line_size xen: fix shared irq device passthrough xen: Provide a variant of xen_poll_irq with timeout. xen: Find an unbound irq number in reverse order (high to low). xen: statically initialize cpu_evtchn_mask_p ... Fix up trivial conflicts in drivers/pci/Makefile
| * xenbus: prevent warnings on unhandled enumeration valuesNoboru Iwamatsu2010-10-18
| | | | | | | | | | | | | | | | | | | | | | XenbusStateReconfiguring/XenbusStateReconfigured were introduced by c/s 437, but aren't handled in many switch statements. .. also pulled from the linux-2.6-sparse-tree tree. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* | Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2010-10-22
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block: (46 commits) xen-blkfront: disable barrier/flush write support Added blk-lib.c and blk-barrier.c was renamed to blk-flush.c block: remove BLKDEV_IFL_WAIT aic7xxx_old: removed unused 'req' variable block: remove the BH_Eopnotsupp flag block: remove the BLKDEV_IFL_BARRIER flag block: remove the WRITE_BARRIER flag swap: do not send discards as barriers fat: do not send discards as barriers ext4: do not send discards as barriers jbd2: replace barriers with explicit flush / FUA usage jbd2: Modify ASYNC_COMMIT code to not rely on queue draining on barrier jbd: replace barriers with explicit flush / FUA usage nilfs2: replace barriers with explicit flush / FUA usage reiserfs: replace barriers with explicit flush / FUA usage gfs2: replace barriers with explicit flush / FUA usage btrfs: replace barriers with explicit flush / FUA usage xfs: replace barriers with explicit flush / FUA usage block: pass gfp_mask and flags to sb_issue_discard dm: convey that all flushes are processed as empty ...
| * | xen-blkfront: disable barrier/flush write supportJens Axboe2010-10-22
| | | | | | | | | | | | | | | | | | | | | The driver doesn't handle empty flushes. Disable barrier/flush write support until this is fixed up. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | Merge branch 'v2.6.36-rc8' into for-2.6.37/barrierJens Axboe2010-10-19
| |\| | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: block/blk-core.c drivers/block/loop.c mm/swapfile.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush()Tejun Heo2010-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA requests. Deprecate barrier. All REQ_HARDBARRIERs are failed with -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler blk_queue_flush(). blk_queue_flush() takes combinations of REQ_FLUSH and FUA. If a device has write cache and can flush it, it should set REQ_FLUSH. If the device can handle FUA writes, it should also set REQ_FUA. All blk_queue_ordered() users are converted. * ORDERED_DRAIN is mapped to 0 which is the default value. * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH. * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Boaz Harrosh <bharrosh@panasas.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | block: kill QUEUE_ORDERED_BY_TAGTejun Heo2010-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nobody is making meaningful use of ORDERED_BY_TAG now and queue draining for barrier requests will be removed soon which will render the advantage of tag ordering moot. Kill ORDERED_BY_TAG. The following users are affected. * brd: converted to ORDERED_DRAIN. * virtio_blk: ORDERED_TAG path was already marked deprecated. Removed. * xen-blkfront: ORDERED_TAG case dropped. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | | block: autoconvert trivial BKL users to private mutexArnd Bergmann2010-10-05
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The block device drivers have all gained new lock_kernel calls from a recent pushdown, and some of the drivers were already using the BKL before. This turns the BKL into a set of per-driver mutexes. Still need to check whether this is safe to do. file=$1 name=$2 if grep -q lock_kernel ${file} ; then if grep -q 'include.*linux.mutex.h' ${file} ; then sed -i '/include.*<linux\/smp_lock.h>/d' ${file} else sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file} fi sed -i ${file} \ -e "/^#include.*linux.mutex.h/,$ { 1,/^\(static\|int\|long\)/ { /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex); } }" \ -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \ -e '/[ ]*cycle_kernel_lock();/d' else sed -i -e '/include.*\<smp_lock.h\>/d' ${file} \ -e '/cycle_kernel_lock()/d' fi Signed-off-by: Arnd Bergmann <arnd@arndb.de>