aboutsummaryrefslogtreecommitdiffstats
path: root/block
Commit message (Collapse)AuthorAge
* [PATCH] Only enable CONFIG_BLOCK option for embeddedJens Axboe2006-09-30
| | | | | | | It's too easy for people to shoot themselves in the foot, and it only makes sense for embedded folks anyway. Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] blk_queue_start_tag() shared map race fixJens Axboe2006-09-30
| | | | | | | | | | | If we share the tag map between two or more queues, then we cannot use __set_bit() to set the bit. In fact we need to make sure we atomically acquire this tag, so loop using test_and_set_bit() to protect from that. Noticed by Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] Update axboe@suse.de email addressJens Axboe2006-09-30
| | | | | | | As people often look for the copyright in files to see who to mail, update the link to a neutral one. Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] BLOCK: Make it possible to disable the block layer [try #6]David Howells2006-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make it possible to disable the block layer. Not all embedded devices require it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require the block layer to be present. This patch does the following: (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev support. (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls an item that uses the block layer. This includes: (*) Block I/O tracing. (*) Disk partition code. (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS. (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the block layer to do scheduling. Some drivers that use SCSI facilities - such as USB storage - end up disabled indirectly from this. (*) Various block-based device drivers, such as IDE and the old CDROM drivers. (*) MTD blockdev handling and FTL. (*) JFFS - which uses set_bdev_super(), something it could avoid doing by taking a leaf out of JFFS2's book. (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is, however, still used in places, and so is still available. (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and parts of linux/fs.h. (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK. (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK. (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK is not enabled. (*) fs/no-block.c is created to hold out-of-line stubs and things that are required when CONFIG_BLOCK is not set: (*) Default blockdev file operations (to give error ENODEV on opening). (*) Makes some /proc changes: (*) /proc/devices does not list any blockdevs. (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK. (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK. (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if given command other than Q_SYNC or if a special device is specified. (*) In init/do_mounts.c, no reference is made to the blockdev routines if CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2. (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return error ENOSYS by way of cond_syscall if so). (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if CONFIG_BLOCK is not set, since they can't then happen. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] blktrace: cleanup using on_each_cpuMartin Peschke2006-09-30
| | | | | | | | This patch kills a few lines of code in blktrace by making use of on_each_cpu(). Signed-off-by: Martin Peschke <mp3@de.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] exit_io_context: don't disable irqsOleg Nesterov2006-09-30
| | | | | | | | | | | | | We don't need to disable irqs to clear current->io_context, it is protected by ->alloc_lock. Even IF it was possible to submit I/O from IRQ on behalf of current this irq_disable() can't help: current_io_context() will re-instantiate ->io_context after irq_enable(). We don't need task_lock() or local_irq_disable() to clear ioc->task. This can't prevent other CPUs from playing with our io_context anyway. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] blktrace: support for logging metadata readsJens Axboe2006-09-30
| | | | Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: use metadata read flagJens Axboe2006-09-30
| | | | | | | | Give meta data reads preference over regular reads, as the process often needs to get that out of the way to do the io it was actually interested in. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Allow file systems to differentiate between data and meta readsJens Axboe2006-09-30
| | | | | | | We can use this information for making more intelligent priority decisions, and it will also be useful for blktrace. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] ll_rw_blk: allow more flexibility for read_ahead_kb storeJens Axboe2006-09-30
| | | | | | | | | | | It can make sense to set read-ahead larger than a single request. We should not be enforcing such policy on the user. Additionally, using the BLKRASET ioctl doesn't impose such a restriction. So additionally we now expose identical behaviour through the two. Issue also reported by Anton <cbou@mail.ru> Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: improve queue preemptionJens Axboe2006-09-30
| | | | | | | Don't touch the current queues, just make sure that the wanted queue is selected next. Simplifies the logic. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Add blk_start_queueing() helperJens Axboe2006-09-30
| | | | | | | | | | CFQ implements this on its own now, but it's really block layer knowledge. Tells a device queue to start dispatching requests to the driver, taking care to unplug if needed. Also fixes the issue where as/cfq will invoke a stopped queue, which we really don't want. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: kill the empty_listJens Axboe2006-09-30
| | | | | | | No point in having a place holder list just for empty queues, so remove it. It's not used for anything other than to keep ->cfq_list busy. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: Kill O(N) runtime of cfq_resort_rr_list()Jens Axboe2006-09-30
| | | | | | | | | Currently it scales with number of processes in that priority group, which is potentially not very nice as it's called quite often. Basically we always need to do tail inserts, except for the case of a new process. So just mark/detect a queue as such. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Make sure all block/io scheduler setups are node awareJens Axboe2006-09-30
| | | | | | | Some were kmalloc_node(), some were still kmalloc(). Change them all to kmalloc_node(). Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Audit block layer inlinesJens Axboe2006-09-30
| | | | | | | Kill a few inlines that bring in too much code to more than one location Shrinks kernel text by about 300 bytes on 32-bit x86. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: use new io context counting mechanismJens Axboe2006-09-30
| | | | | | | | It's ok if the read path is a lot more costly, as long as inc/dec is really cheap. The inc/dec will happen for each created/freed io context, while the reading only happens when a disk queue exits. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] as-iosched: use new io context counting mechanismJens Axboe2006-09-30
| | | | | | | | It's ok if the read path is a lot more costly, as long as inc/dec is really cheap. The inc/dec will happen for each created/freed io context, while the reading only happens when a disk queue exits. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: kill cfq_exit_lockJens Axboe2006-09-30
| | | | | | | | | | | | | | | | | | | | | cfq_exit_lock is protecting two things now: - The per-ioc rbtree of cfq_io_contexts - The per-cfqd linked list of cfq_io_contexts The per-cfqd linked list can be protected by the queue lock, as it is (by definition) per cfqd as the queue lock is. The per-ioc rbtree is mainly used and updated by the process itself only. The only outside use is the io priority changing. If we move the priority changing to not browsing the rbtree, we can remove any locking from the rbtree updates and lookup completely. Let the sys_ioprio syscall just mark processes as having the iopriority changed and lazily update the private cfq io contexts the next time io is queued, and we can remove this locking as well. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: cleanups, fixes, dead code removalJens Axboe2006-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | A collection of little fixes and cleanups: - We don't use the 'queued' sysfs exported attribute, since the may_queue() logic was rewritten. So kill it. - Remove dead defines. - cfq_set_active_queue() can be rewritten cleaner with else if conditions. - Several places had cfq_exit_cfqq() like logic, abstract that out and use that. - Annotate the cfqq kmem_cache_alloc() so the allocator knows that this is a repeat allocation if it fails with __GFP_WAIT set. Allows the allocator to start freeing some memory, if needed. CFQ already loops for this condition, so might as well pass the hint down. - Remove cfqd->rq_starved logic. It's not needed anymore after we dropped the crq allocation in cfq_set_request(). - Remove uneeded parameter passing. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] ll_rw_blk: cleanup __make_request()Jens Axboe2006-09-30
| | | | | | | | - Don't assign variables that are only used once. - Kill spin_lock() prefetching, it's opportunistic at best. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Drop useless bio passing in may_queue/set_request APIJens Axboe2006-09-30
| | | | | | It's not needed for anything, so kill the bio passing. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Remove ->rq_status from struct requestJens Axboe2006-09-30
| | | | | | | | | | | After Christophs SCSI change, the only usage left is RQ_ACTIVE and RQ_INACTIVE. The block layer sets RQ_INACTIVE right before freeing the request, so any check for RQ_INACTIVE in a driver is a bug and indicates use-after-free. So kill/clean the remaining users, straight forward. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Remove struct request_list from struct requestJens Axboe2006-09-30
| | | | | | | | It is always identical to &q->rq, and we only use it for detecting whether this request came out of our mempool or not. So replace it with an additional ->flags bit flag. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Remove ->waiting member from struct requestJens Axboe2006-09-30
| | | | | | | | As the comments indicates in blkdev.h, we can fold it into ->end_io_data usage as that is really what ->waiting is. Fixup the users of blk_end_sync_rq(). Signed-off-by: Jens Axboe <axboe@kernel.dk>
* [PATCH] as-iosched: kill arqJens Axboe2006-09-30
| | | | | | | | | Get rid of the as_rq request type. With the added elevator_private2, we have enough room in struct request to get rid of any arq allocation/free for each request. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] cfq-iosched: kill crqJens Axboe2006-09-30
| | | | | | | | Get rid of the cfq_rq request type. With the added elevator_private2, we have enough room in struct request to get rid of any crq allocation/free for each request. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: remove the crq flag functions/variableJens Axboe2006-09-30
| | | | | | | There's just one flag currently (SYNC), and that one can be grabbed from the request. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] deadline-iosched: remove elevator private drq request typeJens Axboe2006-09-30
| | | | | | | | A big win, we now save an allocation/free on each request! With the previous rb/hash abstractions, we can just reuse queuelist/donelist for the FIFO data and be done with it. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] as-iosched: remove arq->is_sync memberJens Axboe2006-09-30
| | | | | | | We can track this in struct request. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] as-iosched: reuse rq for fifoJens Axboe2006-09-30
| | | | | | | Saves some space in arq. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] cfq-iosched: convert to using the FIFO elevator definesJens Axboe2006-09-30
| | | | Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] deadline-iosched: migrate to using the elevator rb functionsJens Axboe2006-09-30
| | | | | | This removes the rbtree handling from deadline. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: migrate to using the elevator rb functionsJens Axboe2006-09-30
| | | | | | This removes the rbtree handling from CFQ. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] as-iosched: migrate to using the elevator rb functionsJens Axboe2006-09-30
| | | | | | | This removes the rbtree handling from AS. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de>
* [PATCH] elevator: abstract out the rbtree sort handlingJens Axboe2006-09-30
| | | | | | | | | | | | The rbtree sort/lookup/reposition logic is mostly duplicated in cfq/deadline/as, so move it to the elevator core. The io schedulers still provide the actual rb root, as we don't want to impose any sort of specific handling on the schedulers. Introduce the helpers and rb_node in struct request to help migrate the IO schedulers. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] rbtree: fixed reversed RB_EMPTY_NODE and rb_next/prevJens Axboe2006-09-30
| | | | | | | The conditions got reserved. Also make rb_next() and rb_prev() check for the empty condition. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] elevator: move the backmerging logic into the elevator coreJens Axboe2006-09-30
| | | | | | | | | | | Right now, every IO scheduler implements its own backmerging (except for noop, which does no merging). That results in duplicated code for essentially the same operation, which is never a good thing. This patch moves the backmerging out of the io schedulers and into the elevator core. We save 1.6kb of text and as a bonus get backmerging for noop as well. Win-win! Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Split struct request ->flags into two partsJens Axboe2006-09-30
| | | | | | | | | | Right now ->flags is a bit of a mess: some are request types, and others are just modifiers. Clean this up by splitting it into ->cmd_type and ->cmd_flags. This allows introduction of generic Linux block message types, useful for sending generic Linux commands to block devices. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] ifdef blktrace debugging fieldsAlexey Dobriyan2006-09-29
| | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] block: handle subsystem_register() init errorsRandy Dunlap2006-09-29
| | | | | | | | | Check and handle init errors. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_privateTheodore Ts'o2006-09-27
| | | | | | | | | | | | | | | | | | | | | | | | | The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something something where the union will actually be used. [judith@osdl.org: powerpc build fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Judith Lebzelter <judith@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge mulgrave-w:git/linux-2.6James Bottomley2006-09-23
|\ | | | | | | | | | | | | | | Conflicts: include/linux/blkdev.h Trivial merge to incorporate tag prototypes.
| * Add a real API for dealing with blk_congestion_wait()Trond Myklebust2006-09-22
| | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | [SCSI] block: add support for shared tag mapsJames Bottomley2006-08-31
|/ | | | | | | | | | The current block queue implementation already contains most of the machinery for shared tag maps. The only remaining pieces are a way to allocate and destroy a tag map independently of the queues (so that the maps can be managed on the life cycle of the overseeing entity) Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* elv_unregister: fix possible crash on module unloadOleg Nesterov2006-08-22
| | | | | | | | | | An exiting task or process which didn't do I/O yet have no io context, elv_unregister() should check it is not NULL. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* [PATCH] cfq_cic_link: fix usage of wrong cfq_io_contextOleg Nesterov2006-08-21
| | | | | | | | Obviously, cfq_cic_link() shouldn't free a just allocated cfq_io_context? The dead key is from __cic, so drop that. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Fix current_io_context() vs set_task_ioprio() raceOleg Nesterov2006-08-21
| | | | | | | | | | | I know nothing about io scheduler, but I suspect set_task_ioprio() is not safe. current_io_context() initializes "struct io_context", then sets ->io_context. set_task_ioprio() running on another cpu may see the changes out of order, so ->set_ioprio(ioc) may use io_context which was not initialized properly. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] cfq-iosched: don't use a hard jiffies value, translate from msecsJens Axboe2006-07-25
| | | | | | | | | | | | The CIC_SEEKY() test really wants to use the minimum of either: - 2 msecs (not jiffies) - or, the pending slice time So code it like that. Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] blktrace: fix read-ahead bitMilton Miller2006-07-25
| | | | | | It should be toggling the same bit on and off, fix it up. Signed-off-by: Jens Axboe <axboe@suse.de>