| Commit message (Collapse) | Author | Age |
|
|
|
| |
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When switching scheduler from cfq, cfq_exit_queue() does not clear
ioc->ioc_data, leaving a dangling pointer that can deceive the following
lookups when the iosched is switched back to cfq. The pattern that can
trigger that is the following:
- elevator switch from cfq to something else;
- module unloading, with elv_unregister() that calls cfq_free_io_context()
on ioc freeing the cic (via the .trim op);
- module gets reloaded and the elevator switches back to cfq;
- reallocation of a cic at the same address as before (with a valid key).
To fix it just assign NULL to ioc_data in __cfq_exit_single_io_context(),
that is called from the regular exit path and from the elevator switching
code. The only path that frees a cic and is not covered is the error handling
one, but cic's freed in this way are never cached in ioc_data.
Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
| |
SLAB_DESTROY_BY_RCU is not a direct substitute for normal call_rcu()
freeing, since it'll page freeing but NOT object freeing. So change
cfq to do the freeing on its own.
Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looking a bit closer into this regression the reason this can't be
right is that dma_addr common default is BLK_BOUNCE_HIGH and most
machines have less than 4G. So if you do:
if (b_pfn <= (min_t(u64, 0xffffffff, BLK_BOUNCE_HIGH) >> PAGE_SHIFT))
dma = 1
that will translate to:
if (BLK_BOUNCE_HIGH <= BLK_BOUNCE_HIGH)
dma = 1
So for 99% of hardware this will trigger unnecessary GFP_DMA
allocations and isa pooling operations.
Also note how the 32bit code still does b_pfn < blk_max_low_pfn.
I guess this is what you were looking after. I didn't verify but as
far as I can tell, this will stop the regression with isa dma
operations at boot for 99% of blkdev/memory combinations out there and
I guess this fixes the setups with >4G of ram and 32bit pci cards as
well (this also retains symmetry with the 32bit code).
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
| |
Fixes:
block/genhd.c:361: warning: ignoring return value of ‘class_register’, declared with attribute warn_unused_result
Signed-off-by: Roland McGrath <roland@redhat.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
| |
This is important to eg dm, that tries to decide whether to stop using
barriers or not.
Tested as working by Anders Henke <anders.henke@1und1.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
| |
Introduced between 2.6.25-rc2 and -rc3
block/blk-map.c:154:14: warning: symbol 'bio' shadows an earlier one
block/blk-map.c:110:13: originally declared here
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
| |
Intoduced between 2.6.25-rc2 and -rc3
block/blk-settings.c:319:12: warning: function 'blk_queue_dma_drain' with external linkage has definition
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
| |
This patch removes the unused export of blk_rq_map_user_iov.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
| |
This patch removes the unused exports of blk_{get,put}_queue.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
| |
This patch contains the following cleanups:
- make the needlessly global struct disk_type static
- #if 0 the unused genhd_media_change_notify()
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
| |
This patch adds a proper prototye for blk_dev_init() in block/blk.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
| |
Every file should include the headers containing the externs for its
global functions (in this case for __blk_queue_free_tags()).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
| |
4GB or upper 4GB memory
For some non-x86 systems with 4GB or upper 4GB memory,
we need increase the range of addresses that can be
used for direct DMA in 64-bit kernel.
Signed-off-by: Yang Shi <yang.shi@windriver.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Block layer alignment was used for two different purposes - memory
alignment and padding. This causes problems in lower layers because
drivers which only require memory alignment ends up with adjusted
rq->data_len. Separate out padding such that padding occurs iff
driver explicitly requests it.
Tomo: restorethe code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa
according to padding alignment.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The meaning of rq->data_len was changed to the length of an allocated
buffer from the true data length. It breaks SG_IO friends and
bsg. This patch restores the meaning of rq->data_len to the true data
length and adds rq->extra_len to store an extended length (due to
drain buffer and padding).
This patch also removes the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa.
The commit adjusts bio according to memory alignment
(queue_dma_alignment). However, memory alignment is NOT padding
alignment. This adjustment also breaks SG_IO friends and bsg. Padding
alignment needs to be fixed in a proper way (by a separate patch).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
|
|
|
|
|
|
|
|
|
|
| |
kernel-doc for block/:
- add missing parameters
- fix one function's parameter list (remove blank line)
- add 2 source files to docbook for non-exported kernel-doc functions
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
| |
Clear drain buffer before chaining if the command in question is a
write.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Draining shouldn't be done for commands where overflow may indicate
data integrity issues. Add dma_drain_needed callback to
request_queue. Drain buffer is appened iff this function returns
non-zero.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With padding and draining moved into it, block layer now may extend
requests as directed by queue parameters, so now a request has two
sizes - the original request size and the extended size which matches
the size of area pointed to by bios and later by sgs. The latter size
is what lower layers are primarily interested in when allocating,
filling up DMA tables and setting up the controller.
Both padding and draining extend the data area to accomodate
controller characteristics. As any controller which speaks SCSI can
handle underflows, feeding larger data area is safe.
So, this patch makes the primary data length field, request->data_len,
indicate the size of full data area and add a separate length field,
request->raw_data_len, for the unmodified request size. The latter is
used to report to higher layer (userland) and where the original
request size should be fed to the controller or device.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DMA start address and transfer size alignment for PC requests are
achieved using bio_copy_user() instead of bio_map_user(). This works
because bio_copy_user() always uses full pages and block DMA alignment
isn't allowed to go over PAGE_SIZE.
However, the implementation didn't update the last bio of the request
to make this padding visible to lower layers. This patch makes
blk_rq_map_user() extend the last bio such that it includes the
padding area and the size of area pointed to by the request is
properly aligned.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we fail if someone requests a valid io scheduler, but it's
modular and not currently loaded. That can happen from a driver init
asking for a different scheduler, or online switching through sysfs
as requested by a user.
This patch makes elevator_get() request_module() to attempt to load
the appropriate module, instead of requiring that done manually.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
| |
It's cumbersome to browse a radix tree from start to finish, especially
since we modify keys when a process exits. So add a hlist for the single
purpose of browsing over all known cfq_io_contexts, used for exit,
io prio change, etc.
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=9948
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
| |
That way the interface is symmetric, and calling blk_rq_unmap_user()
on the request wont oops.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
| |
blk_settings_init() can become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
|
|
|
|
|
|
|
| |
blk_ioc_init() can become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
|
|
|
|
|
|
|
| |
request_cachep needlessly became global.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
|
|
|
|
|
|
|
| |
Removes the now unused old partition statistic code.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
| |
Reports enhanced partition statistics in /proc/diskstats.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
|
|
|
|
|
|
|
|
| |
Updates the enhanced partition statistics in generic block layer
besides the disk statistics.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
| |
Rearrange fields in cache order and initialize some fields that
we didn't previously init. Remove init of ->completion_data, it's
part of a union with ->hash. Luckily clearing the rb node is the same
as setting it to null!
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
| |
It blindly copies everything in the io_context, including the lock.
That doesn't work so well for either lock ordering or lockdep.
There seems zero point in swapping io contexts on a request to request
merge, so the best point of action is to just remove it.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Since it's acquired from irq context, all locking must be of the
irq safe variant. Most are already inside the queue lock (which
already disables interrupts), but the io scheduler rmmod path
always has irqs enabled and the put_io_context() path may legally
be called with irqs enabled (even if it isn't usually). So fixup
those two.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
| |
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
| |
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
| |
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
| |
No point in passing signed integers as the byte count, they can never
be negative.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
|
|
| |
This fixes a problem in SCSI where we use the (previously
uninitialised) cmd_type via blk_pc_request() to set up the transfer in
scsi_init_sgtable().
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
|
|
|
|
|
|
|
|
| |
If the two requests belong to the same io context, we will attempt
to lock the same lock twice. But swapping contexts is pointless in
that case, so just check for rioc == nioc before doing the double
lock and copy.
Tested-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
| |
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
| |
Expose hardware sector size in sysfs queue directory.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
| |
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
| |
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
| |
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
|
| |
Adds files for barrier handling, rq execution, io context handling,
mapping data to requests, and queue settings.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
| |
Seperates the tag and sysfs handling from ll_rw_blk.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
| |
Then we retain history in blk-core.c
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|\
| |
| |
| |
| |
| |
| |
| | |
* 'for-2.6.25' of git://git.kernel.dk/linux-2.6-block:
block: implement drain buffers
__bio_clone: don't calculate hw/phys segment counts
block: allow queue dma_alignment of zero
blktrace: Add blktrace ioctls to SCSI generic devices
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
These DMA drain buffer implementations in drivers are pretty horrible
to do in terms of manipulating the scatterlist. Plus they're being
done at least in drivers/ide and drivers/ata, so we now have code
duplication.
The one use case for this, as I understand it is AHCI controllers doing
PIO mode to mmc devices but translating this to DMA at the controller
level.
So, what about adding a callback to the block layer that permits the
adding of the drain buffer for the problem devices. The idea is that
you'd do this in slave_configure after you find one of these devices.
The beauty of doing it in the block layer is that it quietly adds the
drain buffer to the end of the sg list, so it automatically gets mapped
(and unmapped) without anything unusual having to be done to the
scatterlist in driver/scsi or drivers/ata and without any alteration to
the transfer length.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Since the SCSI layer uses the request queues from the block layer, blktrace can
also be used to trace the requests to all SCSI devices (like SCSI tape drives),
not only disks. The only missing part is the ioctl interface to start and stop
tracing.
This patch adds the SETUP, START, STOP and TEARDOWN ioctls from blktrace to the
sg device files. With this change, blktrace can be used for SCSI devices like
for disks, e.g.: blktrace -d /dev/sg1 -o - | blkparse -i -
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|