aboutsummaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAge
...
| * | cifs: remove NULL termination from rename target in CIFSSMBRenameOpenFIleJeff Layton2008-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cifs: remove NULL termination from rename target in CIFSSMBRenameOpenFIle The rename destination isn't supposed to be null terminated. Also, change the name string arg to be const. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: work around samba returning -ENOENT on SetFileDisposition callJeff Layton2008-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cifs: work around samba returning -ENOENT on SetFileDisposition call Samba seems to return STATUS_OBJECT_NAME_NOT_FOUND when we try to set the delete on close bit after doing a rename by filehandle. This looks like a samba bug to me, but a lot of servers will do this. For now, pretend an -ENOENT return is a success. Samba does however seem to respect the CREATE_DELETE_ON_CLOSE bit when opening files that already exist. Windows will ignore it, but so adding it to the open flags should be harmless. We're also currently ignoring the return code on the rename by filehandle, so no need to set rc based on it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: fix inverted NULL check after kmallocJeff Layton2008-09-24
| | | | | | | | | | | | | | | | | | | | | cifs: fix inverted NULL check after kmalloc Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | [CIFS] clean up upcall handling for dns_resolver keysSteve French2008-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're given the datalen in the downcall, so there's no need to do any calls to strlen(). Just keep track of the datalen in the key. Finally, add a sanity check of the data in the downcall to make sure that it looks like a real IP address. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | [CIFS] fix busy-file renames and refactor cifs_rename logicSteve French2008-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Break out the code that does the actual renaming into a separate function and have cifs_rename call that. That function will attempt a path based rename first and then do a filehandle based one if it looks like the source is busy. The existing logic tried a path based rename first, but if we needed to remove the destination then it only attempted a filehandle based rename afterward. Not all servers support renaming by filehandle, so we need to always attempt path rename first and fall back to filehandle rename if it doesn't work. This also fixes renames of open files on windows servers (at least when the source and destination directories are the same). CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: add function to set file dispositionJeff Layton2008-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cifs: add function to set file disposition The proper way to set the delete on close bit on an already existing file is to use SET_FILE_INFO with an infolevel of SMB_FILE_DISPOSITION_INFO. Add a function to do that and have the silly-rename code use it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | [CIFS] add constants for string lengths of keynames in SPNEGO upcall stringSteve French2008-09-23
| | | | | | | | | | | | | | | Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: move rename and delete-on-close logic into helper functionJeff Layton2008-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cifs: move rename and delete-on-close logic into helper function When a file is still open on the server, we attempt to set the DELETE_ON_CLOSE bit and rename it to a new filename. When the last opener closes the file, the server should delete it. This patch moves this mechanism into a helper function and has the two places in cifs_unlink that do this procedure call it. It also fixes the open flags to be correct. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: have find_writeable_file prefer filehandles opened by same taskJeff Layton2008-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the CIFS client goes to write out pages, it needs to pick a filehandle to write to. find_writeable_file however just picks the first filehandle that it finds. This can cause problems when a lock is issued against a particular filehandle and we pick a different filehandle to write to. This patch tries to avert this situation by having find_writable_file prefer filehandles that have a pid that matches the current task. This seems to fix lock test 11 from the connectathon test suite when run against a windows server. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: don't use GFP_KERNEL with GFP_NOFSPekka Enberg2008-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GFP_KERNEL and GFP_NOFS are mutually exclusive. If you combine them, you end up with plain GFP_KERNEL which can deadlock in cases where you really want GFP_NOFS. Cc: Steve French <sfrench@samba.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | [CIFS] use common code for turning off ATTR_READONLY in cifs_unlinkSteve French2008-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already have a cifs_set_file_info function that can flip DOS attribute bits. Have cifs_unlink call it to handle turning ATTR_HIDDEN on and ATTR_READONLY off when an unlink attempt returns -EACCES. This also removes a level of indentation from cifs_unlink. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | cifs: clean up variables in cifs_unlinkJeff Layton2008-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change parameters to cifs_unlink to match the ones used in the generic VFS. Add some local variables to cut down on the amount of struct dereferencing that needs to be done, and eliminate some unneeded NULL pointer checks on the parent directory inode. Finally, rename pTcon to "tcon" to more closely match standard kernel coding style. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2008-10-10
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: choose better identifiers dlm: remove bkl dlm: fix address compare dlm: fix locking of lockspace list in dlm_scand dlm: detect available userspace daemon dlm: allow multiple lockspace creates
| * | | dlm: choose better identifiersAndrew Morton2008-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sparc32: fs/dlm/config.c:397: error: expected identifier or '(' before '{' token fs/dlm/config.c: In function 'drop_node': fs/dlm/config.c:589: warning: initialization from incompatible pointer type fs/dlm/config.c:589: warning: initialization from incompatible pointer type fs/dlm/config.c: In function 'release_node': fs/dlm/config.c:601: warning: initialization from incompatible pointer type fs/dlm/config.c:601: warning: initialization from incompatible pointer type fs/dlm/config.c: In function 'show_node': fs/dlm/config.c:717: warning: initialization from incompatible pointer type fs/dlm/config.c:717: warning: initialization from incompatible pointer type fs/dlm/config.c: In function 'store_node': fs/dlm/config.c:726: warning: initialization from incompatible pointer type fs/dlm/config.c:726: warning: initialization from incompatible pointer type Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Teigland <teigland@redhat.com>
| * | | dlm: remove bklDavid Teigland2008-09-04
| | | | | | | | | | | | | | | | | | | | | | | | BLK from recent pushdown is not needed. Signed-off-by: David Teigland <teigland@redhat.com>
| * | | dlm: fix address compareDavid Teigland2008-09-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Compare only the addr and port fields of sockaddr structures. Fixes a problem with ipv6 where sin6_scope_id does not match. Signed-off-by: David Teigland <teigland@redhat.com>
| * | | dlm: fix locking of lockspace list in dlm_scandDavid Teigland2008-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The dlm_scand thread needs to lock the list of lockspaces when going through it. Signed-off-by: David Teigland <teigland@redhat.com>
| * | | dlm: detect available userspace daemonDavid Teigland2008-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If dlm_controld (the userspace daemon that controls the setup and recovery of the dlm) fails, the kernel should shut down the lockspaces in the kernel rather than leaving them running. This is detected by having dlm_controld hold a misc device open while running, and if the kernel detects a close while the daemon is still needed, it stops the lockspaces in the kernel. Knowing that the userspace daemon isn't running also allows the lockspace create/remove routines to avoid waiting on the daemon for join/leave operations. Signed-off-by: David Teigland <teigland@redhat.com>
| * | | dlm: allow multiple lockspace createsDavid Teigland2008-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a count for lockspace create and release so that create can be called multiple times to use the lockspace from different places. Also add the new flag DLM_LSFL_NEWEXCL to create a lockspace with the previous behavior of returning -EEXIST if the lockspace already exists. Signed-off-by: David Teigland <teigland@redhat.com>
* | | | Fix barrier fail detection in XFSChristoph Hellwig2008-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we disable barriers as soon as we get a buffer in xlog_iodone that has the XBF_ORDERED flag cleared. But this can be the case not only for buffers where the barrier failed, but also the first buffer of a split log write in case of a log wraparound. Due to the disabled barriers we can easily get directory corruption on unclean shutdowns. So instead of using this check add a new buffer flag for failed barrier writes. This is a regression vs 2.6.26 caused by patch to use the right macro to check for the ORDERED flag, as we previously got true returned for every buffer. Thanks to Toei Rei for reporting the bug. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: David Chinner <david@fromorbit.com> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds2008-10-10
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: GFS2: Support for I/O barriers GFS2: Add UUID to GFS2 sb GFS2: high time to take some time over atime GFS2: The war on bloat GFS2: GFS2 will panic if you misspell any mount options GFS2: Direct IO write at end of file error GFS2: Use an IS_ERR test rather than a NULL test GFS2: Fix race relating to glock min-hold time GFS2: Fix & clean up GFS2 rename GFS2: rm on multiple nodes causes panic GFS2: Fix metafs mounts GFS2: Fix debugfs glock file iterator
| * | | | GFS2: Support for I/O barriersSteven Whitehouse2008-09-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds barrier support to GFS2. There is not a lot of change really... we just add the barrier flag when we write journal header blocks. If the underlying device refuses to support them, we fall back to the previous way of doing things (wait for the I/O and hope) since there is nothing else we can do. There is no user configuration, barriers will always be on unless the device refuses to support them. This seems a reasonable solution to me since this is a correctness issue. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | GFS2: high time to take some time over atimeSteven Whitehouse2008-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, we've used the same scheme as GFS1 for atime. This has failed since atime is a per vfsmnt flag, not a per fs flag and as such the "noatime" flag was not getting passed down to the filesystems. This patch removes all the "special casing" around atime updates and we simply use the VFS's atime code. The net result is that GFS2 will now support all the same atime related mount options of any other filesystem on a per-vfsmnt basis. We do lose the "lazy atime" updates, but we gain "relatime". We could add lazy atime to the VFS at a later date, if there is a requirement for that variant still - I suspect relatime will be enough. Also we lose about 100 lines of code after this patch has been applied, and I have a suspicion that it will speed things up a bit, even when atime is "on". So it seems like a nice clean up as well. From a user perspective, everything stays the same except the loss of the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very least, and to be honest I don't think anybody ever used it) and that a number of options which were ignored before now work correctly. Please let me know if you've got any comments. I'm pushing this out early so that you can all see what my plans are. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | GFS2: The war on bloatSteven Whitehouse2008-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following patch shrinks the gfs2_args structure which is embedded in every GFS2 superblock. It cuts down the size of the options to a single unsigned int (the 13 bits of bitfields will be rounded up to that size by the compiler) from the current 11 unsigned ints. So on x86 thats 44 bytes shrinking to 4 bytes, in each and every GFS2 superblock. Signed-off-by: Steven Whitehouse <swhitho@redhat.com>
| * | | | GFS2: GFS2 will panic if you misspell any mount optionsAbhijith Das2008-09-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gfs2 superblock pointer is NULL after a failed mount. When control eventually goes to gfs2_kill_sb, we dereference this NULL pointer. This patch ensures that the gfs2 superblock pointer is not NULL before being dereferenced in gfs2_kill_sb. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | GFS2: Direct IO write at end of file errorBob Peterson2008-09-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a problem whereby a direct_io write doesn't fall back to buffered write properly at end of file. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | GFS2: Use an IS_ERR test rather than a NULL testJulien Brunel2008-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of error, the function gfs2_inode_lookup returns an ERR pointer, but never returns a NULL pointer. So a NULL test that necessarily comes after an IS_ERR test should be deleted, and a NULL test that may come after a call to this function should be strengthened by an IS_ERR test. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @match_bad_null_test@ expression x, E; statement S1,S2; @@ x = gfs2_inode_lookup(...) ... when != x = E * if (x != NULL) S1 else S2 // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | GFS2: Fix race relating to glock min-hold timeSteven Whitehouse2008-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case that a request for a glock arrives right after the grant reply has arrived, it sometimes means that the gl_tstamp field hasn't been updated recently enough. The net result is that the min-hold time for the glock is ignored. If this happens often enough, it leads to poor performance. This patch adds an additional test, so that if the reply pending bit is set on a glock, then it will select the maximum length of time for the min-hold time, rather than looking at gl_tstamp. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | GFS2: Fix & clean up GFS2 renameSteven Whitehouse2008-08-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a locking issue in the rename code by ensuring that we hold the per sb rename lock over both directory and "other" renames which involve different parent directories. At the same time, this moved the (only called from one place) function gfs2_ok_to_move into the file that its called from, so we can mark it static. This should make a code a bit easier to follow. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Peter Staubach <staubach@redhat.com>
| * | | | GFS2: rm on multiple nodes causes panicBob Peterson2008-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a problem whereby simultaneous unlink, rmdir, rename and link operations (e.g. rm -fR *) from multiple nodes on the same GFS2 file system can cause kernel panics, hangs, and/or memory corruption. It also gets rid of all the non-rgrp calls to gfs2_glock_nq_m. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | GFS2: Fix metafs mountsSteven Whitehouse2008-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is intended to fix the issues reported in bz #457798. Instead of having the metafs as a separate filesystem, it becomes a second root of gfs2. As a result it will appear as type gfs2 in /proc/mounts, but it is still possible (for backwards compatibility purposes) to mount it as type gfs2meta. A new mount flag "meta" is introduced so that its possible to tell the two cases apart in /proc/mounts. As a result it becomes possible to mount type gfs2 with -o meta and get the same result as mounting type gfs2meta. So it is possible to mount just the metafs on its own. Currently if you do this, its then impossible to mount the "normal" root of the gfs2 filesystem without first unmounting the metafs root. I'm not sure if thats a feature or a bug :-) Either way, this is a great improvement on the previous scheme and I've verified that it works ok with bind mounts on both the "normal" root and the metafs root in various combinations. There were also a bunch of functions in super.c which didn't belong there, so this moves them into ops_fstype.c where they can be static. Hopefully the mount/umount sequence is now more obvious as a result. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Alexander Viro <aviro@redhat.com>
| * | | | GFS2: Fix debugfs glock file iteratorSteven Whitehouse2008-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to an incorrect iterator, some glocks were being missed from the glock dumps obtained via debugfs. This patch fixes the problem and ensures that we don't miss any glocks in future. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | | | | Merge branch 'for-2.6.28' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2008-10-10
|\ \ \ \ \ | |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.28' of git://git.kernel.dk/linux-2.6-block: (132 commits) doc/cdrom: Trvial documentation error, file not present block_dev: fix kernel-doc in new functions block: add some comments around the bio read-write flags block: mark bio_split_pool static block: Find bio sector offset given idx and offset block: gendisk integrity wrapper block: Switch blk_integrity_compare from bdev to gendisk block: Fix double put in blk_integrity_unregister block: Introduce integrity data ownership flag block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1 bio.h: Remove unused conditional code block: remove end_{queued|dequeued}_request() block: change elevator to use __blk_end_request() gdrom: change to use __blk_end_request() memstick: change to use __blk_end_request() virtio_blk: change to use __blk_end_request() blktrace: use BLKTRACE_BDEV_SIZE as the name size for setup structure block: add lld busy state exporting interface block: Fix blk_start_queueing() to not kick a stopped queue include blktrace_api.h in headers_install ...
| * | | | block_dev: fix kernel-doc in new functionsRandy Dunlap2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix kernel-doc in new functions: Error(mmotm-2008-1002-1617//fs/block_dev.c:895): duplicate section name 'Description' Error(mmotm-2008-1002-1617//fs/block_dev.c:924): duplicate section name 'Description' Warning(mmotm-2008-1002-1617//fs/block_dev.c:1282): No description found for parameter 'pathname' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> cc: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: mark bio_split_pool staticDenis ChengRq2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since all bio_split calls refer the same single bio_split_pool, the bio_split function can use bio_split_pool directly instead of the mempool_t parameter; then the mempool_t parameter can be removed from bio_split param list, and bio_split_pool is only referred in fs/bio.c file, can be marked static. Signed-off-by: Denis ChengRq <crquan@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: Find bio sector offset given idx and offsetMartin K. Petersen2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Helper function to find the sector offset in a bio given bvec index and page offset. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: Introduce integrity data ownership flagMartin K. Petersen2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A filesystem might supply its own integrity metadata. Introduce a flag that indicates whether the filesystem or the block layer owns the integrity buffer. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1Jens Axboe2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need bdev_get_integrity() to support the pending md/dm patches. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: cleanup some of the integrity stuff in blkdev.hJens Axboe2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't put functions that are only used in fs/bio-integrity.c in blkdev.h, it's much cleaner to just keep it in there. Also kill completely unused bdev_get_tag_size() Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: add bio_kmalloc()Jens Axboe2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not all callers need (or want!) the mempool backing guarentee, it essentially means that you can only use bio_alloc() for short allocations and not for preallocating some bio's at setup or init time. So add bio_kmalloc() which does the same thing as bio_alloc(), except it just uses kmalloc() as the backing instead of the bio mempools. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | Call flush_disk() after detecting an online resize.Andrew Patterson2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We call flush_disk() to make sure the buffer cache for the disk is flushed after a disk resize. There are two resize cases, growing and shrinking. Given that users can shrink/then grow a disk before revalidate_disk() is called, we treat the grow case identically to shrinking. We need to flush the buffer cache after an online shrink because, as James Bottomley puts it, The two use cases for shrinking I can see are 1. planned: the fs is already shrunk to within the new boundaries and all data is relocated, so invalidate is fine (any dirty buffers that might exist in the shrunk region are there only because they were relocated but not yet written to their original location). 2. unplanned: In this case, the fs is probably toast, so whether we invalidate or not isn't going to make a whole lot of difference; it's still going to try to read or write from sectors beyond the new size and get I/O errors. Immediately invalidating shrunk disks will cause errors for outstanding I/Os for reads/write beyond the new end of the disk to be generated earlier then if we waited for the normal buffer cache operation. It also removes a potential security hole where we might keep old data around from beyond the end of the shrunk disk if the disk was not invalidated. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | Added flush_disk to factor out common buffer cache flushing code.Andrew Patterson2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to be able to flush the buffer cache for for more than just when a disk is changed, so we factor out common cache flush code in check_disk_change() to an internal flush_disk() routine. This routine will then be used for both disk changes and disk resizes (in a later patch). Include the disk name in the text indicating that there are busy inodes on the device and increase the KERN severity of the message. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | Check for device resize when rescanning partitionsAndrew Patterson2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check for device resize in the rescan_partitions() routine. If the device has been resized, the bdev size is set to match. The rescan_partitions() routine is called when opening the device and when calling the BLKRRPART ioctl. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | Adjust block device size after an online resize of a disk.Andrew Patterson2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The revalidate_disk routine now checks if a disk has been resized by comparing the gendisk capacity to the bdev inode size. If they are different (usually because the disk has been resized underneath the kernel) the bdev inode size is adjusted to match the capacity. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | Wrapper for lower-level revalidate_disk routines.Andrew Patterson2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a wrapper for the lower-level revalidate_disk call-backs such as sd_revalidate_disk(). It allows us to perform pre and post operations when calling them. We will use this wrapper in a later patch to adjust block device sizes after an online resize (a _post_ operation). Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: make blk_rq_map_user take a NULL user-space bufferFUJITA Tomonori2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes blk_rq_map_user to accept a NULL user-space buffer with a READ command if rq_map_data is not NULL. Thus a caller can pass page frames to lk_rq_map_user to just set up a request and bios with page frames propely. bio_uncopy_user (called via blk_rq_unmap_user) doesn't copy data to user space with such request. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | bio: convert bio_copy_kern to use bio_copy_userFUJITA Tomonori2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bio_copy_kern and bio_copy_user are very similar. This converts bio_copy_kern to use bio_copy_user. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: introduce struct rq_map_data to use reserved pagesFUJITA Tomonori2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces struct rq_map_data to enable bio_copy_use_iov() use reserved pages. Currently, bio_copy_user_iov allocates bounce pages but drivers/scsi/sg.c wants to allocate pages by itself and use them. struct rq_map_data can be used to pass allocated pages to bio_copy_user_iov. The current users of bio_copy_user_iov simply passes NULL (they don't want to use pre-allocated pages). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iovFUJITA Tomonori2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, blk_rq_map_user and blk_rq_map_user_iov always do GFP_KERNEL allocation. This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov so sg can use it (sg always does GFP_ATOMIC allocation). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: add support for IO CPU affinityJens Axboe2008-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for controlling the IO completion CPU of either all requests on a queue, or on a per-request basis. We export a sysfs variable (rq_affinity) which, if set, migrates completions of requests to the CPU that originally submitted it. A bio helper (bio_set_completion_cpu()) is also added, so that queuers can ask for completion on that specific CPU. In testing, this has been show to cut the system time by as much as 20-40% on synthetic workloads where CPU affinity is desired. This requires a little help from the architecture, so it'll only work as designed for archs that are using the new generic smp helper infrastructure. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>