aboutsummaryrefslogtreecommitdiffstats
path: root/fs/nfs
Commit message (Collapse)AuthorAge
* NFS: show retransmit settings when displaying mount optionsChuck Lever2006-03-20
| | | | | | | | | | | | | Sometimes it's important to know the exact RPC retransmit settings the kernel is using for an NFS mount point. Add this facility to the NFS client's show_options method. Test plan: Set various retransmit settings via the mount command, and check that the settings are reflected in /proc/mounts. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: sem2mutex idmap.cIngo Molnar2006-03-20
| | | | | | | | | | | | semaphore to mutex conversion. the conversion was generated via scripts, and the result was validated automatically via a script as well. build and boot tested. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: kzalloc conversion in fs/nfsEric Sesterhenn2006-03-20
| | | | | | | | this converts fs/nfs to kzalloc() usage. compile tested with make allyesconfig Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Kill braindead gcc warningsTrond Myklebust2006-03-20
| | | | | | | | | nfs4_open_revalidate: 'res' may be used uninitialized nfs4_callback_compound: ‘hdr_res.nops’ may be used uninitialized 'op_nr’ may be used uninitialized encode_getattr_res: ‘savep’ may be used uninitialized Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Do not call rpciod_down() before call to destroy_nfsv4_state()Trond Myklebust2006-03-20
| | | | | | | The reason is that the idmapper cleanup may call flush_workqueue() on rpciod_workqueue. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Ensure that rpc_mkpipe returns a refcounted dentryTrond Myklebust2006-03-20
| | | | | | | If not, we cannot guarantee that idmap->idmap_dentry, gss_auth->dentry and clnt->cl_dentry are valid dentries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: reduce the number of false cache invalidations.Trond Myklebust2006-03-20
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: "const static" vs "static const" in nfs4Jesper Juhl2006-03-20
| | | | | | | | My previous "const static" vs "static const" cleanup missed a single case, patch below takes care of it. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Don't invalidate cached attributes if change attribute is unchangedTrond Myklebust2006-03-20
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: writes should not clobber utimes() callsTrond Myklebust2006-03-20
| | | | | | | Ensure that we flush out writes in the case when someone calls utimes() in order to set the file times. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix buglet in fs/nfs/write.cNeil Brown2006-03-20
| | | | | | | | | | I've been reading through fs/nfs/write.c trying to track down a bug that seems to be related to pages loosing a refcount and getting freed too early (you interested in detail??) and I spotted a little bug which the following patch should fix. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Avoid races between writebacks and truncationTrond Myklebust2006-03-20
| | | | | | | | | | | | | | | Currently, there is no serialisation between NFS asynchronous writebacks and truncation at the page level due to the fact that nfs_sync_inode() cannot lock the pages that it is about to write out. This means that it is possible to be flushing out data (and calling something like set_page_writeback()) while the page cache is busy evicting the page. Oops... Use the hooks provided in try_to_release_page() to ensure that dirty pages are always written back to storage before we evict them. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix a busy inodes issue...Trond Myklebust2006-03-20
| | | | | | | | | | | The nfs_open_context may live longer than the file descriptor that spawned it, so it needs to carry a reference to the vfsmount. If not, then generic_shutdown_super() may end up being called before reads and writes have been flushed out. Make a couple of functions static while we're at it... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* [PATCH] NFSv4: fix mount segfault on errors returned that are < -1000Trond Myklebust2006-03-14
| | | | | | | | | | It turns out that nfs4_proc_get_root() may return raw NFSv4 errors instead of mapping them to kernel errors. Problem spotted by Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] NFS: Fix a potential panic in O_DIRECTTrond Myklebust2006-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on an original patch by Mike O'Connor and Greg Banks of SGI. Mike states: A normal user can panic an NFS client and cause a local DoS with 'judicious'(?) use of O_DIRECT. Any O_DIRECT write to an NFS file where the user buffer starts with a valid mapped page and contains an unmapped page, will crash in this way. I haven't followed the code, but O_DIRECT reads with similar user buffers will probably also crash albeit in different ways. Details: when nfs_get_user_pages() calls get_user_pages(), it detects and correctly handles get_user_pages() returning an error, which happens if the first page covered by the user buffer's address range is unmapped. However, if the first page is mapped but some subsequent page isn't, get_user_pages() will return a positive number which is less than the number of pages requested (this behaviour is sort of analagous to a short write() call and appears to be intentional). nfs_get_user_pages() doesn't detect this and hands off the array of pages (whose last few elements are random rubbish from the newly allocated array memory) to it's caller, whence they go to nfs_direct_write_seg(), which then totally ignores the nr_pages it's given, and calculates its own idea of how many pages are in the array from the user buffer length. Needless to say, when it comes to transmit those uninitialised page* pointers, we see a crash in the network stack. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] nfsroot port= parameter fix [backport of 2.4 fix]Al Viro2006-02-07
| | | | | | | | | | | | | | | | | | | | | Direct backport of 2.4 fix that didn't get propagated to 2.6; original comment follows: <quote> When I specify the NFS port for nfsroot (e.g., nfsroot=<dir>,port=2049), the kernel uses the wrong port. In my case it tries to use 264 (0x108) instead of 2049 (0x801). This patch adds the missing htons(). Eric </quote> Patch got applied in 2.4.21-pre6. Author: Eric Lammerts (<eric@lammerts.org>, AFAICS). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* NFSv3: fix sync_retry in direct i/o NFSDirk Mueller2006-02-01
| | | | | | | Only do a sync_retry if the memcmp failed. Signed-off-by: Dirk Mueller <dmueller@suse.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* [PATCH] per-mountpoint noatime/nodiratimeChristoph Hellwig2006-01-10
| | | | | | | | | | | | | | | | | | | | Turn noatime and nodiratime into per-mount instead of per-sb flags. After all the preparations this is a rather trivial patch. The mount code needs to treat the two options as per-mount instead of per-superblock, and touch_atime needs to be changed to check the new MNT_ flags in addition to the MS_ flags that are kept for filesystems that are always noatime/nodiratime but not user settable anymore. Besides that core code only nfs needed an update because it's leaving atime updates to the server and thus sets the S_NOATIME flag on every inode, but needs to know whether it's a real noatime mount for an getattr optimization. While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that were only used by touch_atime. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_semJes Sorensen2006-01-09
| | | | | | | | | | | | | This patch converts the inode semaphore to a mutex. I have tested it on XFS and compiled as much as one can consider on an ia64. Anyway your luck with it might be different. Modified-by: Ingo Molnar <mingo@elte.hu> (finished the conversion) Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* [PATCH] nfsroot: do not silently stop parsing on an unknown optionJorn Dreyer2006-01-08
| | | | | | | | | | | | It would be helpful if the kernel did not silently stop parsing nfs options, but instead warned about any he does not recognize. The attached patch adds one printk to do just that. It took me a couple of hours to find my configuration mistake. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Fix and add EXPORT_SYMBOL(filemap_write_and_wait)OGAWA Hirofumi2006-01-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch add EXPORT_SYMBOL(filemap_write_and_wait) and use it. See mm/filemap.c: And changes the filemap_write_and_wait() and filemap_write_and_wait_range(). Current filemap_write_and_wait() doesn't wait if filemap_fdatawrite() returns error. However, even if filemap_fdatawrite() returned an error, it may have submitted the partially data pages to the device. (e.g. in the case of -ENOSPC) <quotation> Andrew Morton writes, If filemap_fdatawrite() returns an error, this might be due to some I/O problem: dead disk, unplugged cable, etc. Given the generally crappy quality of the kernel's handling of such exceptions, there's a good chance that the filemap_fdatawait() will get stuck in D state forever. </quotation> So, this patch doesn't wait if filemap_fdatawrite() returns the -EIO. Trond, could you please review the nfs part? Especially I'm not sure, nfs must use the "filemap_fdatawrite(inode->i_mapping) == 0", or not. Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* NFSv4: Fix an Oops in nfs_do_expire_all_delegationsTrond Myklebust2006-01-06
| | | | | | If the loop errors, we need to exit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Allow entries in the idmap cache to expireTrond Myklebust2006-01-06
| | | | | | | | | | If someone changes the uid/gid mapping in userland, then we do eventually want those changes to be propagated to the kernel. Currently the kernel assumes that it may cache entries forever. Add an expiration time + garbage collector for idmap entries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: get rid of some needless code obfuscation in xdr_encode_sattr().Trond Myklebust2006-01-06
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Send valid mode bits to the serverTrond Myklebust2006-01-06
| | | | | | | inode->i_mode contains a lot more than just the mode bits. Make sure that we mask away this extra stuff in SETATTR calls to the server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: get rid of cl_chattyChuck Lever2006-01-06
| | | | | | | | | | | | Clean up: Every ULP that uses the in-kernel RPC client, except the NLM client, sets cl_chatty. There's no reason why NLM shouldn't set it, so just get rid of cl_chatty and always be verbose. Test-plan: Compile with CONFIG_NFS enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv3: try get_root user-supplied security_flavorJ. Bruce Fields2006-01-06
| | | | | | | | | | | | | | | | Thanks to Ed Keizer for bug and root cause. He says: "... we could only mount the top-level Solaris share. We could not mount deeper into the tree. Investigation showed that Solaris allows UNIX authenticated FSINFO only on the top level of the share. This is a problem because we share/export our home directories one level higher than we mount them. I.e. we share the partition and not the individual home directories. This prevented access to home directories." We still may need to try auth_sys for the case where the client doesn't have appropriate credentials. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Allow user to set the port used by the NFSv4 callback channelTrond Myklebust2006-01-06
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up weak cache consistency codeTrond Myklebust2006-01-06
| | | | | | ...and ensure that nfs_update_inode() respects wcc Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure DELEGRETURN returns attributesTrond Myklebust2006-01-06
| | | | | | | | Upon return of a write delegation, the server will almost always bump the change attribute. Ensure that we pick up that change so that we don't invalidate our data cache unnecessarily. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure change attribute returned by GETATTR callback conforms to specTrond Myklebust2006-01-06
| | | | | | | | | | | According to RFC3530 we're supposed to cache the change attribute at the time the client receives a write delegation. If the inode is clean, a CB_GETATTR callback by the server to the client is supposed to return the cached change attribute. If, OTOH, the inode is dirty, the client should bump the cached change attribute by 1. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Make directIO aware of compound pages...Trond Myklebust2006-01-06
| | | | | | ...and avoid calling set_page_dirty on them Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Make stat() return updated mtimes after a write()Trond Myklebust2006-01-06
| | | | | | | | | The SuS states that a call to write() will cause mtime to be updated on the file. In order to satisfy that requirement, we need to flush out any cached writes in nfs_getattr(). Speed things up slightly by not committing the writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure that we return the delegation on the target of a rename too.Trond Myklebust2006-01-06
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: support large reads and writes on the wireChuck Lever2006-01-06
| | | | | | | | | | | | | | | | | Most NFS server implementations allow up to 64KB reads and writes on the wire. The Solaris NFS server allows up to a megabyte, for instance. Now the Linux NFS client supports transfer sizes up to 1MB, too. This will help reduce protocol and context switch overhead on read/write intensive NFS workloads, and support larger atomic read and write operations on servers that support them. Test-plan: Connectathon and iozone on mount point with wsize=rsize>32768 over TCP. Tests with NFS over UDP to verify the maximum RPC payload size cap. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: make "inode number mismatch" message more usefulChuck Lever2006-01-06
| | | | | | | | | | | To help NFS users and server developers, make the "inode number mismatch" message display more useful information. Test-plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: get rid of useless kernel log messageChuck Lever2006-01-06
| | | | | | | | | | | nfs_statfs() generates a log message when GETATTR returns an error. This is usually a useless message. Make it a dprintk. Test plan: None Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix error recovery code in fs/nfs/inode.c:__init_nfs()Chuck Lever2006-01-06
| | | | | | | Red Hat found a problem in the error recovery logic in __init_nfs. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: use generic_write_checks() to sanity check direct writesChuck Lever2006-01-06
| | | | | | | | | | | | | Replace ad hoc write parameter sanity checking in nfs_file_direct_write() with a call to generic_write_checks(). This should make the proper checks modulo the O_LARGEFILE flag, and should catch NFSv2-specific limitations by virtue of i_sb->s_maxbytes. Test plan: Posix compliance testing with both NFSv2 and NFSv3. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove requirement for machine creds for the "setclientid" operationTrond Myklebust2006-01-06
| | | | | | Use a cred from the nfs4_client->cl_state_owners list. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove requirement for machine creds for the "renew" operationTrond Myklebust2006-01-06
| | | | | | | | | | | | | | | | | In RFC3530, the RENEW operation is allowed to use either the same principal, RPC security flavour and (if RPCSEC_GSS), the same mechanism and service that was used for SETCLIENTID_CONFIRM OR Any principal, RPC security flavour and service combination that currently has an OPEN file on the server. Choose the latter since that doesn't require us to keep credentials for the same principal for the entire duration of the mount. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Send RENEW requests to the server only when we're holding stateTrond Myklebust2006-01-06
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Convert instances of kernel_thread() to kthread()Trond Myklebust2006-01-06
| | | | | | | Convert private implementations in NFSv4 state recovery and delegation code to use kthreads. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: State recovery cleanupTrond Myklebust2006-01-06
| | | | | | Use wait_on_bit() when waiting for state recovery to complete. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 leaseTrond Myklebust2006-01-06
| | | | | | Cut down on the number of unnecessary RENEW requests on the wire. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make DELEGRETURN an interruptible operation.Trond Myklebust2006-01-06
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Convert LOCK rpc call into an asynchronous RPC callTrond Myklebust2006-01-06
| | | | | | In order to allow users to interrupt/cancel it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: locking XDR cleanupTrond Myklebust2006-01-06
| | | | | | Get rid of some unnecessary intermediate structures Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctlyTrond Myklebust2006-01-06
| | | | | | | When recovering from a delegation recall or a network partition, we need to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separatelyTrond Myklebust2006-01-06
| | | | | | | | | | | | | A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always specify a access modes that have been the argument of a previous OPEN operation. IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden unless the user called OPEN(O_WRONLY) In order to fix that, we really need to track the three possible open states separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>