aboutsummaryrefslogtreecommitdiffstats
path: root/fs/nfs/nfs4proc.c
Commit message (Collapse)AuthorAge
* NFS: Fix double d_drop in nfs_instantiate() error pathChuck Lever2006-09-22
| | | | | | | | | | | | | | | | | | | If the LOOKUP or GETATTR in nfs_instantiate fail, nfs_instantiate will do a d_drop before returning. But some callers already do a d_drop in the case of an error return. Make certain we do only one d_drop in all error paths. This issue was introduced because over time, the symlink proc API diverged slightly from the create/mkdir/mknod proc API. To prevent other coding mistakes of this type, change the symlink proc API to be more like create/mkdir/mknod and move the nfs_instantiate call into the symlink proc routines so it is used in exactly the same way for create, mkdir, mknod, and symlink. Test plan: Connectathon, all versions of NFS. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Share NFS superblocks per-protocol per-server per-FSIDDavid Howells2006-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The attached patch makes NFS share superblocks between mounts from the same server and FSID over the same protocol. It does this by creating each superblock with a false root and returning the real root dentry in the vfsmount presented by get_sb(). The root dentry set starts off as an anonymous dentry if we don't already have the dentry for its inode, otherwise it simply returns the dentry we already have. We may thus end up with several trees of dentries in the superblock, and if at some later point one of anonymous tree roots is discovered by normal filesystem activity to be located in another tree within the superblock, the anonymous root is named and materialises attached to the second tree at the appropriate point. Why do it this way? Why not pass an extra argument to the mount() syscall to indicate the subpath and then pathwalk from the server root to the desired directory? You can't guarantee this will work for two reasons: (1) The root and intervening nodes may not be accessible to the client. With NFS2 and NFS3, for instance, mountd is called on the server to get the filehandle for the tip of a path. mountd won't give us handles for anything we don't have permission to access, and so we can't set up NFS inodes for such nodes, and so can't easily set up dentries (we'd have to have ghost inodes or something). With this patch we don't actually create dentries until we get handles from the server that we can use to set up their inodes, and we don't actually bind them into the tree until we know for sure where they go. (2) Inaccessible symbolic links. If we're asked to mount two exports from the server, eg: mount warthog:/warthog/aaa/xxx /mmm mount warthog:/warthog/bbb/yyy /nnn We may not be able to access anything nearer the root than xxx and yyy, but we may find out later that /mmm/www/yyy, say, is actually the same directory as the one mounted on /nnn. What we might then find out, for example, is that /warthog/bbb was actually a symbolic link to /warthog/aaa/xxx/www, but we can't actually determine that by talking to the server until /warthog is made available by NFS. This would lead to having constructed an errneous dentry tree which we can't easily fix. We can end up with a dentry marked as a directory when it should actually be a symlink, or we could end up with an apparently hardlinked directory. With this patch we need not make assumptions about the type of a dentry for which we can't retrieve information, nor need we assume we know its place in the grand scheme of things until we actually see that place. This patch reduces the possibility of aliasing in the inode and page caches for inodes that may be accessed by more than one NFS export. It also reduces the number of superblocks required for NFS where there are many NFS exports being used from a server (home directory server + autofs for example). This in turn makes it simpler to do local caching of network filesystems, as it can then be guaranteed that there won't be links from multiple inodes in separate superblocks to the same cache file. Obviously, cache aliasing between different levels of NFS protocol could still be a problem, but at least that gives us another key to use when indexing the cache. This patch makes the following changes: (1) The server record construction/destruction has been abstracted out into its own set of functions to make things easier to get right. These have been moved into fs/nfs/client.c. All the code in fs/nfs/client.c has to do with the management of connections to servers, and doesn't touch superblocks in any way; the remaining code in fs/nfs/super.c has to do with VFS superblock management. (2) The sequence of events undertaken by NFS mount is now reordered: (a) A volume representation (struct nfs_server) is allocated. (b) A server representation (struct nfs_client) is acquired. This may be allocated or shared, and is keyed on server address, port and NFS version. (c) If allocated, the client representation is initialised. The state member variable of nfs_client is used to prevent a race during initialisation from two mounts. (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find the root filehandle for the mount (fs/nfs/getroot.c). For NFS2/3 we are given the root FH in advance. (e) The volume FSID is probed for on the root FH. (f) The volume representation is initialised from the FSINFO record retrieved on the root FH. (g) sget() is called to acquire a superblock. This may be allocated or shared, keyed on client pointer and FSID. (h) If allocated, the superblock is initialised. (i) If the superblock is shared, then the new nfs_server record is discarded. (j) The root dentry for this mount is looked up from the root FH. (k) The root dentry for this mount is assigned to the vfsmount. (3) nfs_readdir_lookup() creates dentries for each of the entries readdir() returns; this function now attaches disconnected trees from alternate roots that happen to be discovered attached to a directory being read (in the same way nfs_lookup() is made to do for lookup ops). The new d_materialise_unique() function is now used to do this, thus permitting the whole thing to be done under one set of locks, and thus avoiding any race between mount and lookup operations on the same directory. (4) The client management code uses a new debug facility: NFSDBG_CLIENT which is set by echoing 1024 to /proc/net/sunrpc/nfs_debug. (5) Clone mounts are now called xdev mounts. (6) Use the dentry passed to the statfs() op as the handle for retrieving fs statistics rather than the root dentry of the superblock (which is now a dummy). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Move rpc_ops from nfs_server to nfs_clientDavid Howells2006-09-22
| | | | | | | | Move the rpc_ops from the nfs_server struct to the nfs_client struct as they're common to all server records of a particular NFS protocol version. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Make better use of inode* dereferencing macrosDavid Howells2006-09-22
| | | | | | | | Make better use of inode* dereferencing macros to hide dereferencing chains (including NFS_PROTO and NFS_CLIENT). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Add extra const qualifiersDavid Howells2006-09-22
| | | | | | | Add some extra const qualifiers into NFS. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Generalise the nfs_client structureDavid Howells2006-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generalise the nfs_client structure by: (1) Moving nfs_client to a more general place (nfs_fs_sb.h). (2) Renaming its maintenance routines to be non-NFS4 specific. (3) Move those maintenance routines to a new non-NFS4 specific file (client.c) and move the declarations to internal.h. (4) Make nfs_find/get_client() take a full sockaddr_in to include the port number (will be required for NFS2/3). (5) Make nfs_find/get_client() take the NFS protocol version (again will be required to differentiate NFS2, 3 & 4 client records). Also: (6) Make nfs_client construction proceed akin to inodes, marking them as under construction and providing a function to indicate completion. (7) Make nfs_get_client() wait interruptibly if it finds a client that it can share, but that client is currently being constructed. (8) Make nfs4_create_client() use (6) and (7) instead of locking cl_sem. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Add a server capabilities NFS RPC opDavid Howells2006-09-22
| | | | | | | Add a set_capabilities NFS RPC op so that the server capabilities can be set. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Add a lookupfh NFS RPC opDavid Howells2006-09-22
| | | | | | | | | Add a lookup filehandle NFS RPC op so that a file handle can be looked up without requiring dentries and inodes and other VFS stuff when doing an NFS4 pathwalk during mounting. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Rename nfs_server::nfs4_stateDavid Howells2006-09-22
| | | | | | | | Rename nfs_server::nfs4_state to nfs_client as it will be used to represent the client state for NFS2 and NFS3 also. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Rename struct nfs4_client to struct nfs_clientDavid Howells2006-09-22
| | | | | | | | Rename struct nfs4_client to struct nfs_client so that it can become the basis for a general client record for NFS2 and NFS3 in addition to NFS4. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Fix incorrect semaphore release in _nfs4_do_open()Trond Myklebust2006-09-19
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add v4 exception handling for the ACL functions.Trond Myklebust2006-08-24
| | | | | | | | | This is needed in order to handle any NFS4ERR_DELAY errors that might be returned by the server. It also ensures that we map the NFSv4 errors before they are returned to userland. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 71c12b3f0abc7501f6ed231a6d17bc9c05a238dc commit)
* NLM,NFSv4: Wait on local locks before we put RPC calls on the wireTrond Myklebust2006-07-05
| | | | | | | Use FL_ACCESS flag to test and/or wait for local locks before we try requesting a lock from the server Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure nfs4_lock_expired() caches delegated locksTrond Myklebust2006-07-05
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NLM,NFSv4: Don't put UNLOCK requests on the wire unless we hold a lockTrond Myklebust2006-07-05
| | | | | | | | Use the new behaviour of {flock,posix}_file_lock(F_UNLCK) to determine if we held a lock, and only send the RPC request to the server if this was the case. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Split fs/nfs/inode.cDavid Howells2006-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As fs/nfs/inode.c is rather large, heterogenous and unwieldy, the attached patch splits it up into a number of files: (*) fs/nfs/inode.c Strictly inode specific functions. (*) fs/nfs/super.c Superblock management functions for NFS and NFS4, normal access, clones and referrals. The NFS4 superblock functions _could_ move out into a separate conditionally compiled file, but it's probably not worth it as there're so many common bits. (*) fs/nfs/namespace.c Some namespace-specific functions have been moved here. (*) fs/nfs/nfs4namespace.c NFS4-specific namespace functions (this could be merged into the previous file). This file is conditionally compiled. (*) fs/nfs/internal.h Inter-file declarations, plus a few simple utility functions moved from fs/nfs/inode.c. Additionally, all the in-.c-file externs have been moved here, and those files they were moved from now includes this file. For the most part, the functions have not been changed, only some multiplexor functions have changed significantly. I've also: (*) Added some extra banner comments above some functions. (*) Rearranged the function order within the files to be more logical and better grouped (IMO), though someone may prefer a different order. (*) Reduced the number of #ifdefs in .c files. (*) Added missing __init and __exit directives. Signed-Off-By: David Howells <dhowells@redhat.com>
* NFSv4: Follow a referralManoj Naik2006-06-09
| | | | | | | | | | | Respond to a moved error on NFS lookup by setting up the referral. Note: We don't actually follow the referral during lookup/getattr, but later when we detect fsid mismatch in inode revalidation (similar to the processing done for cloning submounts). Referrals will have fake attributes until they are actually followed or traversed. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Define an fs_locations bitmapManoj Naik2006-06-09
| | | | | | | | | | | This is (similar to getattr bitmap) but includes fs_locations and mounted_on_fileid attributes. Use this bitmap for encoding in fs_locations requests. Note: We can probably do better by requesting locations as part of fsinfo itself. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: GETATTR attributes on referralManoj Naik2006-06-09
| | | | | | | | Per referral draft, only fs_locations, fsid, and mounted_on_fileid can be requested in a GETATTR on referrals. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: convert fs-locations-components to conform to RFC3530Manoj Naik2006-06-09
| | | | | | | | Use component4-style formats for decoding list of servers and pathnames in fs_locations. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Implement the fs_locations function callTrond Myklebust2006-06-09
| | | | | | | | | | | NFSv4 allows for the fact that filesystems may be replicated across several servers or that they may be migrated to a backup server in case of failure of the primary server. fs_locations is an NFSv4 operation for retrieving information about the location of migrated and/or replicated filesystems. Based on an initial implementation by Jiaying Zhang <jiayingz@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Ensure the client submounts, when it crosses a server mountpoint.Trond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: More page cache revalidation fixupsTrond Myklebust2006-06-09
| | | | | | | | Whenever the directory changes, we want to make sure that we always invalidate its page cache. Fix up update_changeattr() and nfs_mark_for_revalidate() so that they do so. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up inode metadata updatesTrond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* VFS: Fix another open intent OopsTrond Myklebust2006-04-19
| | | | | | | If the call to nfs_intent_set_file() fails to open a file in nfs4_proc_create(), we should return an error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.J. Bruce Fields2006-03-20
| | | | | | | | | | Thanks to Frank Filz for pointing out that we list system.nfs4_acl extended attribute even on filesystems where we don't actually support nfs4_acl. This is inconsistent with the e.g. ext3 POSIX ACL behaviour, and seems to annoy cp. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()Trond Myklebust2006-03-20
| | | | | | | Currently this will not happen if we exit before rpc_new_task() was called. Also fix up rpc_run_task() to do the same (for consistency). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Make nfs_fhget() return appropriate error valuesTrond Myklebust2006-03-20
| | | | | | | Currently it returns NULL, which usually gets interpreted as ENOMEM. In fact it can mean a host of issues. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCETrond Myklebust2006-03-20
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Send the delegation stateid for SETATTR callsTrond Myklebust2006-03-20
| | | | | | | In the case where we hold a delegation stateid, use that in for inside SETATTR calls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cleanup of NFS read codeTrond Myklebust2006-03-20
| | | | | | | | Same callback hierarchy inversion as for the NFS write calls. This patch is not strictly speaking needed by the O_DIRECT code, but avoids confusing differences between the asynchronous read and write code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cleanup of NFS write code in preparation for asynchronous o_directTrond Myklebust2006-03-20
| | | | | | | | | | | | | | | | | This patch inverts the callback hierarchy for NFS write calls. Instead of having the NFSv2/v3/v4-specific code set up the RPC callback ops, we allow the original caller to do so. This allows for more flexibility w.r.t. how to set up and tear down the nfs_write_data structure while still allowing the NFSv3/v4 code to perform error handling. The greater flexibility is needed by the asynchronous O_DIRECT code, which wants to be able to hold on to the original nfs_write_data structures after the WRITE RPC call has completed in order to be able to replay them if the COMMIT call determines that the server has rebooted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: add hooks to account for NFSERR_JUKEBOX errorsChuck Lever2006-03-20
| | | | | | | | | | | | | | Make an inode or an nfs_server struct available in the logic that handles JUKEBOX/DELAY type errors so the NFS client can account for them. This patch is split out from the main nfs iostat patch to highlight minor architectural changes required to support this statistic. Test plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Kill braindead gcc warningsTrond Myklebust2006-03-20
| | | | | | | | | nfs4_open_revalidate: 'res' may be used uninitialized nfs4_callback_compound: ‘hdr_res.nops’ may be used uninitialized 'op_nr’ may be used uninitialized encode_getattr_res: ‘savep’ may be used uninitialized Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: "const static" vs "static const" in nfs4Jesper Juhl2006-03-20
| | | | | | | | My previous "const static" vs "static const" cleanup missed a single case, patch below takes care of it. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* [PATCH] NFSv4: fix mount segfault on errors returned that are < -1000Trond Myklebust2006-03-14
| | | | | | | | | | It turns out that nfs4_proc_get_root() may return raw NFSv4 errors instead of mapping them to kernel errors. Problem spotted by Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* NFSv4: Ensure DELEGRETURN returns attributesTrond Myklebust2006-01-06
| | | | | | | | Upon return of a write delegation, the server will almost always bump the change attribute. Ensure that we pick up that change so that we don't invalidate our data cache unnecessarily. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove requirement for machine creds for the "setclientid" operationTrond Myklebust2006-01-06
| | | | | | Use a cred from the nfs4_client->cl_state_owners list. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove requirement for machine creds for the "renew" operationTrond Myklebust2006-01-06
| | | | | | | | | | | | | | | | | In RFC3530, the RENEW operation is allowed to use either the same principal, RPC security flavour and (if RPCSEC_GSS), the same mechanism and service that was used for SETCLIENTID_CONFIRM OR Any principal, RPC security flavour and service combination that currently has an OPEN file on the server. Choose the latter since that doesn't require us to keep credentials for the same principal for the entire duration of the mount. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Send RENEW requests to the server only when we're holding stateTrond Myklebust2006-01-06
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: State recovery cleanupTrond Myklebust2006-01-06
| | | | | | Use wait_on_bit() when waiting for state recovery to complete. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 leaseTrond Myklebust2006-01-06
| | | | | | Cut down on the number of unnecessary RENEW requests on the wire. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make DELEGRETURN an interruptible operation.Trond Myklebust2006-01-06
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Convert LOCK rpc call into an asynchronous RPC callTrond Myklebust2006-01-06
| | | | | | In order to allow users to interrupt/cancel it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: locking XDR cleanupTrond Myklebust2006-01-06
| | | | | | Get rid of some unnecessary intermediate structures Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctlyTrond Myklebust2006-01-06
| | | | | | | When recovering from a delegation recall or a network partition, we need to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separatelyTrond Myklebust2006-01-06
| | | | | | | | | | | | | A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always specify a access modes that have been the argument of a previous OPEN operation. IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden unless the user called OPEN(O_WRONLY) In order to fix that, we really need to track the three possible open states separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Make open_confirm() asynchronous tooTrond Myklebust2006-01-06
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Convert open() into an asynchronous RPC callTrond Myklebust2006-01-06
| | | | | | | | | OPEN is a stateful operation, so we must ensure that it always completes. In order to allow users to interrupt the operation, we need to make the RPC call asynchronous, and then wait on completion (or cancel). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Allocate OPEN call RPC arguments using kmalloc()Trond Myklebust2006-01-06
| | | | | | Cleanup in preparation for making OPEN calls interruptible by the user. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>