aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNCPhilipp Reisner2011-03-10
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Implemented two new connection states Ahead/BehindPhilipp Reisner2011-03-10
| | | | | | | | In this connection mode, the ahead node no longer replicates application IO. The behind's disk becomes out dated. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: New configuration parameters for dealing with network congestionPhilipp Reisner2011-03-10
| | | | | | | | | | | net { on_congestion {block|pull-ahead|disconnect}; congestion-fill {sectors}; congestion-extents {al-extents}; } Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Track the numbers of sectors in flightPhilipp Reisner2011-03-10
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: Renamed write_flags_to_bio() to wire_flags_to_bio()Lars Ellenberg2011-03-10
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: restore compatibility with 32bit kernelsLars Ellenberg2011-03-10
| | | | | | | | | | | With commit drbd: further converge progress display of resync and online-verify accidentally an u64/u64 div was introduced, causing an unresolvable symbol __udivdi3 to be reference. Actually for that division, 32bit are still suficient for now, so we can revert to unsigned long instead. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: properly use max_hw_sectors to limit the our bio sizeLars Ellenberg2011-03-10
| | | | | | | | | | | | | To ease tracking of bios in some hash tables, we want it to not cross certain boundaries (128k, used to be 32k). We limit the maximum bio size using queue parameters. Historically some defines and variables we use there have been named max_segment_size, which was misguided. Rename them to max_bio_size, and use [blk_]queue_max_hw_sectors where appropriate. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: debug: limit nelink-broadcast of request on digest mismatch to 32kLars Ellenberg2011-03-10
| | | | | | | | | | | | | | We used to be limited to 32k requests, but have increased that limit to 128k now. This part of the code can only deal with 32k, it would scramble arbitrary pages for larger requests. As it is used for debugging only anyways, it is ok to simply truncate the dumped data here. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: detect modification of in-flight buffersLars Ellenberg2011-03-10
| | | | | | | | | With data-integrity digest enabled, double-check on the sending side for modifications by upper layers of buffers under write back, so we can tell it appart from corruption on the "wire". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: further converge progress display of resync and online-verifyLars Ellenberg2011-03-10
| | | | | | | | Show progressbar and ETA always, with proc_details >= 1 also show the current sector position for both resync and online-verify on both nodes. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: fix potential wrap of 32bit oos:%lu display in /proc/drbdLars Ellenberg2011-03-10
| | | | | | | | | When converting bits (4k resolution, still) to kB, we shift left. If it was a large number of bits on a 32bit box (>= 4 TiB storage), we may wrap the 32bit unsigned long base type, resulting in incorrect display. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: use the resync controller for online-verify requests as wellLars Ellenberg2011-03-10
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: factor out drbd_rs_number_requestsLars Ellenberg2011-03-10
| | | | | | | | Preparation patch to be able to use the auto-throttling resync controller for online-verify requests as well. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: factor out drbd_rs_controller_resetLars Ellenberg2011-03-10
| | | | | | | | Preparation patch to be able to use the auto-throttling resync controller for online-verify requests as well. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: show progress bar and ETA for online-verifyLars Ellenberg2011-03-10
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: advance progress step marks for online-verifyLars Ellenberg2011-03-10
| | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: factor out advancement of resync marks for progress reportingLars Ellenberg2011-03-10
| | | | | | | | This is in preparation to unify progress reporting of online-verify and resync requests. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: initialize online-verify progress tracking on verify targetLars Ellenberg2011-03-10
| | | | | | | | For partial (resumed) online verify, initialize the resync step marks once we know what the online verify start sector is. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: improve online-verify progress trackingLars Ellenberg2011-03-10
| | | | | | | | | For a partial (resumed) online-verify, initialize rs_total not to total bits, but to number of bits to check in this run, to match the meaning rs_total has for actual resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd: only reset online-verify start sector if verify completedLars Ellenberg2011-03-10
| | | | | | | | | | | | | | For network hickups during online-verify, on the next verify triggered, we by default want to resume where it left off. After any replication link interruption, there will be a (possibly empty) resync. Do not reset online-verify start sector if some resync completed, that would defeats the purpose. Only reset the start sector once a verify run is completed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/coreJens Axboe2011-03-10
|\ | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: block/blk-core.c block/blk-flush.c drivers/md/raid1.c drivers/md/raid10.c drivers/md/raid5.c fs/nilfs2/btnode.c fs/nilfs2/mdt.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * blk-throttle: Use blk_plug in throttle dispatchVivek Goyal2011-03-10
| | | | | | | | | | | | | | | | Use plug in throttle dispatch also as we are dispatching a bunch of bios in throttle context and some of them might merge. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * block: kill off REQ_UNPLUGJens Axboe2011-03-10
| | | | | | | | | | | | | | | | | | | | With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because they manually plugged on their own - in which case, they should just unplug at will. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * aio: remove request submission batchingJens Axboe2011-03-10
| | | | | | | | | | | | | | This should be useless now that we have on-stack plugging. So lets just kill it. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * fs: make aio plugShaohua Li2011-03-10
| | | | | | | | | | Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * fs: make mpage read/write_pages() plugJens Axboe2011-03-10
| | | | | | | | Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * read-ahead: use pluggingJens Axboe2011-03-10
| | | | | | | | Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * fs: make generic file read/write functions plugJens Axboe2011-03-10
| | | | | | | | Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * block: remove per-queue pluggingJens Axboe2011-03-10
| | | | | | | | | | | | | | | | Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * block: initial patch for on-stack per-task pluggingJens Axboe2011-03-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for creating a queuing context outside of the queue itself. This enables us to batch up pieces of IO before grabbing the block device queue lock and submitting them to the IO scheduler. The context is created on the stack of the process and assigned in the task structure, so that we can auto-unplug it if we hit a schedule event. The current queue plugging happens implicitly if IO is submitted to an empty device, yet callers have to remember to unplug that IO when they are going to wait for it. This is an ugly API and has caused bugs in the past. Additionally, it requires hacks in the vm (->sync_page() callback) to handle that logic. By switching to an explicit plugging scheme we make the API a lot nicer and can get rid of the ->sync_page() hack in the vm. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * scsi: convert to blk_delay_queue()Jens Axboe2011-03-10
| | | | | | | | | | | | | | | | | | It was always abuse to reuse the plugging infrastructure for this, convert it to the (new) real API for delaying queueing a bit. A default delay of 3 msec is defined, to match the previous behaviour. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * ide-cd: convert to blk_delay_queue() for a short pauseJens Axboe2011-03-10
| | | | | | | | | | | | | | | | It was always abuse to reuse the plugging infrastructure for this, convert it to the (new) real API for delaying queueing a bit. Signed-off-by: Jens Axboe <jaxboe@fusionio.com> Acked-by: David S. Miller <davem@davemloft.net>
| * block: add API for delaying work/request_fn a little bitJens Axboe2011-03-10
| | | | | | | | | | | | | | Currently we use plugging for that, but as plugging is going away, we need an alternative mechanism. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | staging: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert two staging drivers - blkvsc_drv and cyasblkdev_block - from ->media_changed() to ->check_events(). The former always indicated media changed while the latter always indicated media not changed. Not sure what the drivers are trying to achieve but keep the original behavior. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | pktcdvd: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert from ->media_changed() to ->check_events(). pktcdvd needs to forward all event related operations to the underlying device. Forward ->check_events() instead of ->media_changed() and inherit disk->[async_]events. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Peter Osterlund <petero2@telia.com>
* | umem: Drop dummy ->media_changed()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | umem doesn't implement media changed detection and there's no need to implement dummy callback anymore. Remove it. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | s390/tape_block: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert from ->media_changed() to ->check_events(). s390/tape_block buffers media changed state and clears it on revalidation. It will behave correctly with kernel event polling. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
* | i2o_block: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | Convert from ->media_changed() to ->check_events(). i2o_block buffers media changed state and clears it after reporting. It will behave correctly with kernel event polling. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
* | xsysace: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | Convert from ->media_changed() to ->check_events(). xsysace buffers media changed state and clears it on revalidation. It will behave correctly with kernel event polling. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Grant Likely <grant.likely@secretlab.ca> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | ub: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | Convert from ->media_changed() to ->check_events(). ub buffers media changed state and clears it on revalidation. It will behave correctly with kernel event polling. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Pete Zaitcev <zaitcev@redhat.com>
* | swim[3]: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert from ->media_changed() to ->check_events(). Both swim and swim3 buffer media changed state and clear it on revalidation. They will behave correctly with kernel event polling. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Laurent Vivier <laurent@lvivier.info> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | dac960: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | Convert from ->media_changed() to ->check_events(). DAC960 media change notification seems to be one way (once set, never cleared) and will generate spurious events when polled once the condition triggers. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | paride: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert paride drivers from ->media_changed() to ->check_events(). pcd and pd buffer and clear events after reporting; however, pf unconditionally reports MEDIA_CHANGE and will generate spurious events when polled. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Tim Waugh <tim@cyberelk.net>
* | gdrom,viocd: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | Convert gdrom and viocd from ->media_changed() to ->check_events(). It's unclear how the conditions are cleared and it's possible that it may generate spurious events when polled. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | floppy,{ami|ata}flop: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert the floppy drivers from ->media_changed() to ->check_events(). Both floppy and ataflop buffer media changed state bit and clear them on revalidation and will behave correctly with kernel event polling. I can't tell how amiflop clears its event and it's possible that it may generate spurious events when polled. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | ide: Convert to bdops->check_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert ->media_changed() to the new ->check_events() method. The conversion is mostly mechanical. The only notable change is that cdrom now doesn't generate any event if @slot_nr isn't CDSL_CURRENT. It used to return -EINVAL which would be treated as media changed. As media changer isn't supported anyway, this doesn't make any difference. This makes ide emit the standard disk events and allows kernel event polling. Currently, only MEDIA_CHANGE event is implemented. Adding support for EJECT_REQUEST shouldn't be difficult; however, given that ide driver is already deprecated, it probably is best to leave it alone. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-ide@vger.kernel.org
* | block: Don't check events while open is in progressTejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not all block drivers clear events immediately after reporting. Some do so in ->revalidate_disk() or other steps during ->open(). There is a slim chance event poll may happen between the clearing event check from check_disk_change() and the actual clearing of the events which would result in spurious events. Block event checks while block device open is in progress. There is no need to kick explicit event check afterwards as events are always checked during open. -v2: The original patch could have called disk_unblock_events() with an already released or %NULL @disk causing oops. Fixed by making sure references are put after disk_unblock_events() is called. It also makes the error path of __blkdev_get() a bit simpler. This problem was reported by Jens. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | block: Don't check events on close unless it was blockedTejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The block event mechanism currently always checks events when the device is being closed regardless of the open mode. The intention was to allow detection of EJECT_REQUEST when a device is closed whether disk event polling is enabled or not. This is unnecessary as, for devices of interest, events are checked from either userland or kernel and in the former case ->check_events() is performed on open of each poll attempt anyway. Furthermore, this unconditional event check on close makes the code susceptible to event loop if the block driver doesn't clear reported events correctly - an event triggers userland to open and close the device which in turn causes another event, rinse and repeat. Check events on close only if it was blocked by excl write open. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | block: Don't implicitly trigger event check on disk_unblock_events()Tejun Heo2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, disk_unblock_events() implicitly kick event check if the block count reaches zero. This behavior is not described in the comment and hinders with future changes. Make the unblocker explicitly check events by calling disk_check_events() as necessary. This patch doesn't cause any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kay Sievers <kay.sievers@vrfy.org>
* | blk-cgroup: Lower minimum weight from 100 to 10.Justin TerAvest2011-03-08
| | | | | | | | | | | | | | | | | | | | We've found that we still get good, useful isolation at weights this low. I'd like to adjust the minimum so that any other changes can take these values into account. Signed-off-by: Justin TerAvest <teravest@google.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>