path: root/block
Commit message | Author | Age
* [PATCH] CFQ: request <-> request merging rr_list fixup | Jens Axboe | 2006-10-31
  In very rare circumstances we could prune a merged request and at the same time delete the implicated cfqq from the rr_list, without re-adding it when the merged request got added. This could cause io stalls until that process issued io again. Fix it up by putting the rr_list add handling into cfq_add_rq_rb(), identical to how pruning is handled in cfq_del_rq_rb().
  This fixes a hang reproducible with fsx-linux.
  Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: check bio address after mapping through partitions. | NeilBrown | 2006-10-31
  Partitions are not limited to live within a device. So we should range check after partition mapping.
  Note that 'maxsector' was being used for two different things. I have split off the second usage into 'old_sector' so that maxsector can still be used for its primary usage later in the function.
  Cc: Jens Axboe <jens.axboe@oracle.com>
  Signed-off-by: Neil Brown <neilb@suse.de>
  Cc: <stable@kernel.org>
  Signed-off-by: Andrew Morton <akpm@osdl.org>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
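The ordering described above (remap first, range check second) can be sketched in user-space C. This is only an illustration under stated assumptions: `toy_partition`, `toy_bio` and both functions are hypothetical stand-ins, not the kernel's structures or the code touched by the patch, and sector arithmetic overflow is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real API. */
struct toy_partition {
    uint64_t start_sect;   /* offset of the partition on the whole device */
    uint64_t nr_sects;     /* size claimed by the partition table */
};

struct toy_bio {
    uint64_t sector;       /* first sector of the I/O */
    uint64_t nr_sectors;   /* length of the I/O in sectors */
};

/* True when [sector, sector + nr_sectors) fits in a device of dev_sects sectors. */
static bool toy_in_range(uint64_t sector, uint64_t nr_sectors, uint64_t dev_sects)
{
    return nr_sectors <= dev_sects && sector <= dev_sects - nr_sectors;
}

/*
 * Remap a partition-relative bio onto the whole device, then range check
 * the remapped address. The partition table may describe a partition that
 * extends past the end of the device, so checking before the remap is not
 * sufficient.
 */
static bool toy_remap_and_check(struct toy_bio *bio,
                                const struct toy_partition *part,
                                uint64_t dev_sects)
{
    assert(bio != NULL && part != NULL);
    bio->sector += part->start_sect;
    return toy_in_range(bio->sector, bio->nr_sectors, dev_sects);
}
```

With a 1000-sector device and a partition starting at sector 900 that claims 200 sectors, an I/O at partition offset 10 passes the check, while one at offset 150 is rejected even though it lies inside the partition.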
* [PATCH] CFQ: bad locking in changed_ioprio() | Jens Axboe | 2006-10-30
  When the ioprio code recently got juggled a bit, a bug was introduced. changed_ioprio() is no longer called with interrupts disabled, so using plain spin_lock() on the queue_lock is a bug.
  Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] CFQ: use irq safe locking in cfq_cic_link() | Jens Axboe | 2006-10-30
  If cfq_set_request() is called for a new process AND a non-fs io request (so that __GFP_WAIT may not be set), cfq_cic_link() may use spin_lock_irq() and spin_unlock_irq() with interrupts already disabled.
  The fix is to always use irq safe locking in cfq_cic_link().
  Acked-By: Arjan van de Ven <arjan@linux.intel.com>
  Acked-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] separate bdi congestion functions from queue congestion functions | Andrew Morton | 2006-10-20
  Separate out the concept of "queue congestion" from "backing-dev congestion". Congestion is a backing-dev concept, not a queue concept.
  The blk_* congestion functions are retained, as wrappers around the core backing-dev congestion functions.
  This proper layering is needed so that NFS can cleanly use the congestion functions, and so that CONFIG_BLOCK=n actually links.
  Cc: "Thomas Maier" <balagi@justmail.de>
  Cc: "Jens Axboe" <jens.axboe@oracle.com>
  Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
  Cc: David Howells <dhowells@redhat.com>
  Cc: Peter Osterlund <petero2@telia.com>
  Signed-off-by: Andrew Morton <akpm@osdl.org>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] export clear_queue_congested and set_queue_congested | Thomas Maier | 2006-10-20
  Export the clear_queue_congested() and set_queue_congested() functions located in ll_rw_blk.c. The functions are renamed to blk_clear_queue_congested() and blk_set_queue_congested().
  (Needed in the pktcdvd driver's bio write congestion control.)
  Signed-off-by: Thomas Maier <balagi@justmail.de>
  Cc: Peter Osterlund <petero2@telia.com>
  Cc: Jens Axboe <jens.axboe@oracle.com>
  Signed-off-by: Andrew Morton <akpm@osdl.org>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] block layer: elv_iosched_show should get elv_list_lock | Vasily Tarasov | 2006-10-12
  The elv_iosched_show function iterates over elv_list, hence elv_list_lock should be taken.
  Signed-off-by: Vasily Tarasov <vtaras@openvz.org>
  Signed-off-by: Vasily Tarasov <jens.axboe@oracle.com>
* [PATCH] block layer: elevator_find function cleanup | Vasily Tarasov | 2006-10-12
  We can easily search through the elevator list without introducing an additional elevator_type variable.
  Signed-off-by: Vasily Tarasov <vtaras@openvz.org>
  Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* [PATCH] helper function for retrieving scsi_cmd given host based block layer tag | David C Somayajulu | 2006-10-04
  This was necessitated by the need for a function to get back to a scsi_cmnd when an HBA posts its (corresponding) completion interrupt with a block layer tag as its reference.
  Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
  Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com>
  Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* [PATCH] dm: export blkdev_driver_ioctl | Alasdair G Kergon | 2006-10-03
  Export blkdev_driver_ioctl for device-mapper.
  If we get as far as the device-mapper ioctl handler, we know the ioctl is not a standard block layer BLK* one, so we don't need to check for them a second time and can call blkdev_driver_ioctl() directly.
  Signed-off-by: Alasdair G Kergon <agk@redhat.com>
  Signed-off-by: Andrew Morton <akpm@osdl.org>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] completions: lockdep annotate on stack completions | Peter Zijlstra | 2006-10-01
  All on stack DECLARE_COMPLETIONs should be replaced by: DECLARE_COMPLETION_ONSTACK
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Acked-by: Ingo Molnar <mingo@elte.hu>
  Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  Signed-off-by: Andrew Morton <akpm@osdl.org>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Only enable CONFIG_BLOCK option for embedded | Jens Axboe | 2006-09-30
  It's too easy for people to shoot themselves in the foot, and it only makes sense for embedded folks anyway.
  Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] blk_queue_start_tag() shared map race fix | Jens Axboe | 2006-09-30
  If we share the tag map between two or more queues, then we cannot use __set_bit() to set the bit. In fact we need to make sure we atomically acquire this tag, so loop using test_and_set_bit() to protect from that.
  Noticed by Mike Christie <michaelc@cs.wisc.edu>
  Signed-off-by: Jens Axboe <axboe@kernel.dk>
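The acquire loop described above can be sketched with C11 atomics in user space. This is an illustration only, assuming a one-word tag map: `toy_tag_map`, `toy_test_and_set_bit()` and `toy_get_tag()` are hypothetical names that model the behaviour of the kernel's test_and_set_bit(), and they are not the code in blk_queue_start_tag().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define TOY_TAG_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical one-word tag map shared by several queues. */
struct toy_tag_map {
    atomic_ulong bits;
};

/* Atomically set bit nr and return its previous value (0 or 1). */
static int toy_test_and_set_bit(unsigned int nr, struct toy_tag_map *map)
{
    assert(nr < TOY_TAG_BITS);
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_or(&map->bits, mask);
    return (old & mask) != 0;
}

/*
 * Find a free tag and claim it. Finding the free bit and setting it are
 * two separate steps, so another queue sharing the map may claim the same
 * bit in between; in that case the atomic set reports the bit was already
 * taken and we search again. Returns -1 when every tag is in use.
 */
static int toy_get_tag(struct toy_tag_map *map)
{
    for (;;) {
        unsigned long snapshot = atomic_load(&map->bits);
        unsigned int tag = 0;

        while (tag < TOY_TAG_BITS && (snapshot & (1UL << tag)))
            tag++;
        if (tag == TOY_TAG_BITS)
            return -1;
        if (!toy_test_and_set_bit(tag, map))
            return (int)tag;
        /* Lost the race for this bit; retry with a fresh snapshot. */
    }
}
```

A non-atomic set (the role __set_bit() plays in the message) would let two queues both believe they own the same tag, which is the race the loop avoids.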
* [PATCH] Update axboe@suse.de email address | Jens Axboe | 2006-09-30
  As people often look for the copyright in files to see who to mail, update the link to a neutral one.
  Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] BLOCK: Make it possible to disable the block layer [try #6] | David Howells | 2006-09-30
  Make it possible to disable the block layer. Not all embedded devices require it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require the block layer to be present.
  This patch does the following:
  (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev support.
  (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls an item that uses the block layer. This includes:
      (*) Block I/O tracing.
      (*) Disk partition code.
      (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
      (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the block layer to do scheduling. Some drivers that use SCSI facilities - such as USB storage - end up disabled indirectly from this.
      (*) Various block-based device drivers, such as IDE and the old CDROM drivers.
      (*) MTD blockdev handling and FTL.
      (*) JFFS - which uses set_bdev_super(), something it could avoid doing by taking a leaf out of JFFS2's book.
  (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is, however, still used in places, and so is still available.
  (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and parts of linux/fs.h.
  (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
  (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
  (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK is not enabled.
  (*) fs/no-block.c is created to hold out-of-line stubs and things that are required when CONFIG_BLOCK is not set:
      (*) Default blockdev file operations (to give error ENODEV on opening).
  (*) Makes some /proc changes:
      (*) /proc/devices does not list any blockdevs.
      (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
  (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
  (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if given command other than Q_SYNC or if a special device is specified.
  (*) In init/do_mounts.c, no reference is made to the blockdev routines if CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.
  (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return error ENOSYS by way of cond_syscall if so).
  (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if CONFIG_BLOCK is not set, since they can't then happen.
  Signed-Off-By: David Howells <dhowells@redhat.com>
  Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] blktrace: cleanup using on_each_cpu | Martin Peschke | 2006-09-30
  This patch kills a few lines of code in blktrace by making use of on_each_cpu().
  Signed-off-by: Martin Peschke <mp3@de.ibm.com>
  Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] exit_io_context: don't disable irqs | Oleg Nesterov | 2006-09-30
  We don't need to disable irqs to clear current->io_context, it is protected by ->alloc_lock. Even IF it was possible to submit I/O from IRQ on behalf of current this irq_disable() can't help: current_io_context() will re-instantiate ->io_context after irq_enable().
  We don't need task_lock() or local_irq_disable() to clear ioc->task. This can't prevent other CPUs from playing with our io_context anyway.
  Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
  Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] blktrace: support for logging metadata reads | Jens Axboe | 2006-09-30
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: use metadata read flag | Jens Axboe | 2006-09-30
  Give meta data reads preference over regular reads, as the process often needs to get that out of the way to do the io it was actually interested in.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Allow file systems to differentiate between data and meta reads | Jens Axboe | 2006-09-30
  We can use this information for making more intelligent priority decisions, and it will also be useful for blktrace.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] ll_rw_blk: allow more flexibility for read_ahead_kb store | Jens Axboe | 2006-09-30
  It can make sense to set read-ahead larger than a single request. We should not be enforcing such policy on the user. Additionally, using the BLKRASET ioctl doesn't impose such a restriction, so we now expose identical behaviour through the two.
  Issue also reported by Anton <cbou@mail.ru>
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: improve queue preemption | Jens Axboe | 2006-09-30
  Don't touch the current queues, just make sure that the wanted queue is selected next. Simplifies the logic.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Add blk_start_queueing() helper | Jens Axboe | 2006-09-30
  CFQ implements this on its own now, but it's really block layer knowledge. Tells a device queue to start dispatching requests to the driver, taking care to unplug if needed. Also fixes the issue where as/cfq will invoke a stopped queue, which we really don't want.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: kill the empty_list | Jens Axboe | 2006-09-30
  No point in having a place holder list just for empty queues, so remove it. It's not used for anything other than to keep ->cfq_list busy.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: Kill O(N) runtime of cfq_resort_rr_list() | Jens Axboe | 2006-09-30
  Currently it scales with number of processes in that priority group, which is potentially not very nice as it's called quite often. Basically we always need to do tail inserts, except for the case of a new process. So just mark/detect a queue as such.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Make sure all block/io scheduler setups are node aware | Jens Axboe | 2006-09-30
  Some were kmalloc_node(), some were still kmalloc(). Change them all to kmalloc_node().
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Audit block layer inlines | Jens Axboe | 2006-09-30
  Kill a few inlines that bring in too much code to more than one location. Shrinks kernel text by about 300 bytes on 32-bit x86.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: use new io context counting mechanism | Jens Axboe | 2006-09-30
  It's ok if the read path is a lot more costly, as long as inc/dec is really cheap. The inc/dec will happen for each created/freed io context, while the reading only happens when a disk queue exits.
  Signed-off-by: Jens Axboe <axboe@suse.de>
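One common way to get the trade-off described above (cheap inc/dec, expensive read) is a split counter, sketched here in user-space C. The log does not say how the new mechanism is implemented, so the slot-per-CPU layout is an assumption, and all `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_NR_SLOTS 4   /* stands in for the number of CPUs; an assumption */

/* Hypothetical split counter: one slot per CPU, summed only when read. */
struct toy_ioc_count {
    atomic_long slot[TOY_NR_SLOTS];
};

static void toy_count_init(struct toy_ioc_count *c)
{
    assert(c != NULL);
    for (size_t i = 0; i < TOY_NR_SLOTS; i++)
        atomic_init(&c->slot[i], 0);
}

/* Cheap path: touch only the caller's own slot. */
static void toy_count_inc(struct toy_ioc_count *c, unsigned int cpu)
{
    atomic_fetch_add_explicit(&c->slot[cpu % TOY_NR_SLOTS], 1,
                              memory_order_relaxed);
}

static void toy_count_dec(struct toy_ioc_count *c, unsigned int cpu)
{
    atomic_fetch_sub_explicit(&c->slot[cpu % TOY_NR_SLOTS], 1,
                              memory_order_relaxed);
}

/*
 * Costly path: walk every slot. A single slot may be negative when an
 * object was counted on one CPU and released on another; only the sum
 * is meaningful.
 */
static long toy_count_read(struct toy_ioc_count *c)
{
    long sum = 0;

    for (size_t i = 0; i < TOY_NR_SLOTS; i++)
        sum += atomic_load_explicit(&c->slot[i], memory_order_relaxed);
    return sum;
}
```

The design only pays off when reads are rare, which matches the message: the read happens when a disk queue exits, while inc/dec happen for every io context created or freed.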
* [PATCH] as-iosched: use new io context counting mechanism | Jens Axboe | 2006-09-30
  It's ok if the read path is a lot more costly, as long as inc/dec is really cheap. The inc/dec will happen for each created/freed io context, while the reading only happens when a disk queue exits.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: kill cfq_exit_lock | Jens Axboe | 2006-09-30
  cfq_exit_lock is protecting two things now:
  - The per-ioc rbtree of cfq_io_contexts
  - The per-cfqd linked list of cfq_io_contexts
  The per-cfqd linked list can be protected by the queue lock, as it is (by definition) per cfqd as the queue lock is.
  The per-ioc rbtree is mainly used and updated by the process itself only. The only outside use is the io priority changing. If we move the priority changing to not browsing the rbtree, we can remove any locking from the rbtree updates and lookup completely. Let the sys_ioprio syscall just mark processes as having the iopriority changed and lazily update the private cfq io contexts the next time io is queued, and we can remove this locking as well.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: cleanups, fixes, dead code removal | Jens Axboe | 2006-09-30
  A collection of little fixes and cleanups:
  - We don't use the 'queued' sysfs exported attribute, since the may_queue() logic was rewritten. So kill it.
  - Remove dead defines.
  - cfq_set_active_queue() can be rewritten cleaner with else if conditions.
  - Several places had cfq_exit_cfqq() like logic, abstract that out and use that.
  - Annotate the cfqq kmem_cache_alloc() so the allocator knows that this is a repeat allocation if it fails with __GFP_WAIT set. Allows the allocator to start freeing some memory, if needed. CFQ already loops for this condition, so might as well pass the hint down.
  - Remove cfqd->rq_starved logic. It's not needed anymore after we dropped the crq allocation in cfq_set_request().
  - Remove unneeded parameter passing.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] ll_rw_blk: cleanup __make_request() | Jens Axboe | 2006-09-30
  - Don't assign variables that are only used once.
  - Kill spin_lock() prefetching, it's opportunistic at best.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Drop useless bio passing in may_queue/set_request API | Jens Axboe | 2006-09-30
  It's not needed for anything, so kill the bio passing.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Remove ->rq_status from struct request | Jens Axboe | 2006-09-30
  After Christoph's SCSI change, the only usage left is RQ_ACTIVE and RQ_INACTIVE. The block layer sets RQ_INACTIVE right before freeing the request, so any check for RQ_INACTIVE in a driver is a bug and indicates use-after-free.
  So kill/clean the remaining users, which is straightforward.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Remove struct request_list from struct request | Jens Axboe | 2006-09-30
  It is always identical to &q->rq, and we only use it for detecting whether this request came out of our mempool or not. So replace it with an additional ->flags bit flag.
  Signed-off-by: Jens Axboe <axboe@suse.de>
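The replacement of a per-request pointer by a single flag bit can be sketched as follows. The structure and flag name are hypothetical and chosen for illustration; the log does not name the actual bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit: the request was allocated from the queue's pool. */
#define TOY_RQ_FROM_POOL (1u << 0)

/* Hypothetical request with a flags word instead of a back-pointer. */
struct toy_pool_request {
    unsigned int flags;
};

/* Record the origin once, at allocation time. */
static void toy_mark_from_pool(struct toy_pool_request *rq)
{
    assert(rq != NULL);
    rq->flags |= TOY_RQ_FROM_POOL;
}

/* The free path asks the same question the pointer comparison used to answer. */
static bool toy_is_from_pool(const struct toy_pool_request *rq)
{
    assert(rq != NULL);
    return (rq->flags & TOY_RQ_FROM_POOL) != 0;
}
```

Since the pointer always held the same value, one bit carries the same information and saves a pointer-sized field in every request.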
* [PATCH] Remove ->waiting member from struct request | Jens Axboe | 2006-09-30
  As the comments indicate in blkdev.h, we can fold it into ->end_io_data usage as that is really what ->waiting is. Fixup the users of blk_end_sync_rq().
  Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] as-iosched: kill arq | Jens Axboe | 2006-09-30
  Get rid of the as_rq request type. With the added elevator_private2, we have enough room in struct request to get rid of any arq allocation/free for each request.
  Signed-off-by: Jens Axboe <axboe@suse.de>
  Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] cfq-iosched: kill crq | Jens Axboe | 2006-09-30
  Get rid of the cfq_rq request type. With the added elevator_private2, we have enough room in struct request to get rid of any crq allocation/free for each request.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: remove the crq flag functions/variable | Jens Axboe | 2006-09-30
  There's just one flag currently (SYNC), and that one can be grabbed from the request.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] deadline-iosched: remove elevator private drq request type | Jens Axboe | 2006-09-30
  A big win, we now save an allocation/free on each request! With the previous rb/hash abstractions, we can just reuse queuelist/donelist for the FIFO data and be done with it.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] as-iosched: remove arq->is_sync member | Jens Axboe | 2006-09-30
  We can track this in struct request.
  Signed-off-by: Jens Axboe <axboe@suse.de>
  Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] as-iosched: reuse rq for fifo | Jens Axboe | 2006-09-30
  Saves some space in arq.
  Signed-off-by: Jens Axboe <axboe@suse.de>
  Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] cfq-iosched: convert to using the FIFO elevator defines | Jens Axboe | 2006-09-30
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] deadline-iosched: migrate to using the elevator rb functions | Jens Axboe | 2006-09-30
  This removes the rbtree handling from deadline.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: migrate to using the elevator rb functions | Jens Axboe | 2006-09-30
  This removes the rbtree handling from CFQ.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] as-iosched: migrate to using the elevator rb functions | Jens Axboe | 2006-09-30
  This removes the rbtree handling from AS.
  Signed-off-by: Jens Axboe <axboe@suse.de>
  Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] elevator: abstract out the rbtree sort handling | Jens Axboe | 2006-09-30
  The rbtree sort/lookup/reposition logic is mostly duplicated in cfq/deadline/as, so move it to the elevator core. The io schedulers still provide the actual rb root, as we don't want to impose any sort of specific handling on the schedulers.
  Introduce the helpers and rb_node in struct request to help migrate the IO schedulers.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] rbtree: fixed reversed RB_EMPTY_NODE and rb_next/prev | Jens Axboe | 2006-09-30
  The conditions got reversed. Also make rb_next() and rb_prev() check for the empty condition.
  Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] elevator: move the backmerging logic into the elevator core | Jens Axboe | 2006-09-30
  Right now, every IO scheduler implements its own backmerging (except for noop, which does no merging). That results in duplicated code for essentially the same operation, which is never a good thing. This patch moves the backmerging out of the io schedulers and into the elevator core. We save 1.6kb of text and as a bonus get backmerging for noop as well. Win-win!
  Signed-off-by: Jens Axboe <axboe@suse.de>
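For readers unfamiliar with the term, a back merge appends new I/O to an existing request that ends exactly where the new I/O begins. The sketch below shows that operation in user-space C with hypothetical `toy_` names. It uses a linear scan for brevity; the message does not describe the lookup structure used by the elevator core, so none is assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending request covering [sector, sector + nr_sectors). */
struct toy_rq {
    uint64_t sector;
    uint64_t nr_sectors;
};

/* First sector after the end of the request. */
static uint64_t toy_rq_end(const struct toy_rq *rq)
{
    return rq->sector + rq->nr_sectors;
}

/* Return the request that ends where the new I/O starts, or NULL if none. */
static struct toy_rq *toy_find_backmerge(struct toy_rq *rqs, size_t n,
                                         uint64_t bio_sector)
{
    assert(rqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (toy_rq_end(&rqs[i]) == bio_sector)
            return &rqs[i];
    }
    return NULL;
}

/* Grow a matching request by bio_len sectors. Returns 1 if merged, 0 otherwise. */
static int toy_try_backmerge(struct toy_rq *rqs, size_t n,
                             uint64_t bio_sector, uint64_t bio_len)
{
    struct toy_rq *rq = toy_find_backmerge(rqs, n, bio_sector);

    if (rq == NULL)
        return 0;
    rq->nr_sectors += bio_len;
    return 1;
}
```

Because the match depends only on request geometry and not on scheduling policy, the same code can serve every scheduler, which is the deduplication the message describes.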
* [PATCH] Split struct request ->flags into two parts | Jens Axboe | 2006-09-30
  Right now ->flags is a bit of a mess: some are request types, and others are just modifiers. Clean this up by splitting it into ->cmd_type and ->cmd_flags. This allows introduction of generic Linux block message types, useful for sending generic Linux commands to block devices.
  Signed-off-by: Jens Axboe <axboe@suse.de>
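The split described above separates "what kind of request is this" from "how should it be handled". A minimal sketch in user-space C follows; the field names cmd_type and cmd_flags come from the message, while the enum values, flag bits and helper are hypothetical and not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request types: exactly one applies to a request. */
enum toy_cmd_type {
    TOY_TYPE_FS = 1,        /* ordinary filesystem I/O */
    TOY_TYPE_PASSTHROUGH,   /* device-specific command */
    TOY_TYPE_LINUX_BLOCK    /* generic Linux block message */
};

/* Hypothetical modifiers: any combination may be set. */
#define TOY_FLAG_WRITE (1u << 0)
#define TOY_FLAG_SYNC  (1u << 1)

struct toy_split_request {
    enum toy_cmd_type cmd_type;   /* what the request is */
    unsigned int cmd_flags;       /* how it should be handled */
};

/* The type is compared for equality; the modifiers are tested as bits. */
static bool toy_is_sync_fs_write(const struct toy_split_request *rq)
{
    unsigned int want = TOY_FLAG_WRITE | TOY_FLAG_SYNC;

    assert(rq != NULL);
    return rq->cmd_type == TOY_TYPE_FS && (rq->cmd_flags & want) == want;
}
```

Keeping the type out of the bit field means a new type costs an enum value rather than a flag bit, and a request can no longer carry two types at once.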