aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux/nfs_fs_sb.h
Commit message (Collapse)AuthorAge
* nfs41: nfs41: fix state manager deadlock in session resetAndy Adamson2009-12-04
| | | | | | | | | | | | | | | | If the session is reset during state recovery, the state manager thread can sleep on the slot_tbl_waitq causing a deadlock. Add a completion framework to the session. Have the state manager thread set a new session state (NFS4CLNT_SESSION_DRAINING) and wait for the session slot table to drain. Signal the state manager thread in nfs41_sequence_free_slot when the NFS4CLNT_SESSION_DRAINING bit is set and the session is drained. Reported-by: Trond Myklebust <trond@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add 'server capability' flags for NFSv4 recommended attributesTrond Myklebust2009-08-09
| | | | | | | | | | | | | If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code needs to know that, so that it don't keep trying to poll for it. However, by the same count, if the NFSv4 server does support that attribute, then we should ensure that the inode metadata is appropriately labelled as being untrusted. For instance, if we don't know the correct value of the file's uid, we should certainly not be caching ACLs or ACCESS results. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: Backchannel: Add a backchannel slot table to the sessionRicardo Labiaga2009-06-17
| | | | | | | | | | | | | Defines a new 'struct nfs4_slot_table' in the 'struct nfs4_session' for use by the backchannel. Initializes, resets, and destroys the backchannel slot table in the same manner the forechannel slot table is initialized, reset, and destroyed. The sequenceid for each slot in the backchannel slot table is initialized to 0, whereas the forechannel slotid's sequenceid is set to 1. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
* nfs41: add session setup to the state managerAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At mount, nfs_alloc_client sets the cl_state NFS4CLNT_LEASE_EXPIRED bit and nfs4_alloc_session sets the NFS4CLNT_SESSION_SETUP bit, so both bits are set when nfs4_lookup_root calls nfs4_recover_expired_lease which schedules the nfs4_state_manager and waits for it to complete. Place the session setup after the clientid establishment in nfs4_state_manager so that the session is setup right after the clientid has been established without rescheduling the state manager. Unlike nfsv4.0, the nfs_client struct is not ready to use until the session has been established. Postpone marking the nfs_client struct to NFS_CS_READY until after a successful CREATE_SESSION call so that other threads cannot use the client until the session is established. If the EXCHANGE_ID call fails and the session has not been setup (the NFS4CLNT_SESSION_SETUP bit is set), mark the client with the error and return. If the session setup CREATE_SESSION call fails with NFS4ERR_STALE_CLIENTID which could occur due to server reboot or network partition inbetween the EXCHANGE_ID and CREATE_SESSION call, reset the NFS4CLNT_LEASE_EXPIRED and NFS4CLNT_SESSION_SETUP bits and try again. If the CREATE_SESSION call fails with other errors, mark the client with the error and return. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: NFS_CS_SESSION_SETUP cl_cons_state for back channel setup] On session setup, the CREATE_SESSION reply races with the server back channel probe which needs to succeed to setup the back channel. Set a new cl_cons_state NFS_CS_SESSION_SETUP just prior to the CREATE_SESSION call and add it as a valid state to nfs_find_client so that the client back channel can find the nfs_client struct and won't drop the server backchannel probe. Use a new cl_cons_state so that NFSv4.0 back channel behaviour which only sets NFS_CS_READY is unchanged. Adjust waiting on the nfs_client_active_wq accordingly. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: rename NFS_CS_SESSION_SETUP to NFS_CS_SESSION_INITING] Signed-off-by: Andy Adamson <andros@netapp.com> [nfs41: set NFS_CL_SESSION_INITING in alloc_session] Signed-off-by: Andy Adamson <andros@netapp.com> [nfs41: move session setup into a function] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [moved nfs4_proc_create_session declaration here] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: exchange_id operationBenny Halevy2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement the exchange_id operation conforming to http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26 Unlike NFSv4.0, NFSv4.1 requires machine credentials. RPC_AUTH_GSS machine credentials will be passed into the kernel at mount time to be available for the exchange_id operation. RPC_AUTH_UNIX root mounts can use the UNIX root credential. Store the root credential in the nfs_client struct. Without a credential, NFSv4.1 state renewal fails. [nfs41: establish clientid via exchange id only if cred != NULL] Signed-off-by: Andy Adamson<andros@umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: move nfstime4 from under CONFIG_NFS_V4_1] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: do not wait a lease time in exchange id] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: pass *session in seq_args and seq_res] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [nfs41: Ignoring impid in decode_exchange_id is missing a READ_BUF] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: fix Xcode_exchange_id's xdr Xcoding pointer type] [nfs41: get rid of unused struct nfs41_exchange_id_res members] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
* nfs41: introduce nfs4_call_syncAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use nfs4_call_sync rather than rpc_call_sync to provide for a nfs41 sessions-enabled interface for sessions manipulation. The nfs41 rpc logic uses the rpc_call_prepare method to recover and create the session, as well as selecting a free slot id and the rpc_call_done to free the slot and update slot table related metadata. In the coming patches we'll add rpc prepare and done routines for setting up the sequence op and processing the sequence result. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_call_sync] As per 11-14-08 review. Squash into "nfs41: introduce nfs4_call_sync" and "nfs41: nfs4_setup_sequence" Define two functions one for v4 and one for v41 add a pointer to struct nfs4_client to the correct one. Signed-off-by: Andy Adamson <andros@netapp.com> [added BUG() in _nfs4_call_sync_session if !CONFIG_NFS_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: check for session not minorversion] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [group minorversion specific stuff together] Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> [nfs41: fixup nfs4_clear_client_minor_version] [introduce nfs4_init_client_minor_version() in this patch] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [cleaned-up patch: got rid of nfs_call_sync_t, dprintks, cosmetics, extra server defs] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: sessions client infrastructureAndy Adamson2009-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFSv4.1 Sessions basic data types, initialization, and destruction. The session is always associated with a struct nfs_client that holds the exchange_id results. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [remove extraneous rpc_clnt pointer, use the struct nfs_client cl_rpcclient. remove the rpc_clnt parameter from nfs4 nfs4_init_session] Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Use the presence of a session to determine behaviour instead of the minorversion number.] Signed-off-by: Andy Adamson <andros@netapp.com> [constified nfs4_has_session's struct nfs_client parameter] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Rename nfs4_put_session() to nfs4_destroy_session() and call it from nfs4_free_client() not nfs4_free_server(). Also get rid of nfs4_get_session() and the ref_count in nfs4_session struct as keeping track of nfs_client should be sufficient] Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> [nfs41: pass rsize and wsize into nfs4_init_session] Signed-off-by: Andy Adamson <andros@netapp.com> [separated out removal of rpc_clnt parameter from nfs4_init_session ot a patch of its own] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Pass the nfs_client pointer into nfs4_alloc_session] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: don't assign to session->clp->cl_session in nfs4_destroy_session] [nfs41: fixup nfs4_clear_client_minor_version] [introduce nfs4_clear_client_minor_version() in this patch] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Refactor nfs4_init_session] Moved session allocation into nfs4_init_client_minor_version, called from nfs4_init_client. Leave rwise and wsize initialization in nfs4_init_session, called from nfs4_init_server. Reverted moving of nfs_fsid definition to nfs_fs_sb.h Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Move NFS4_MAX_SLOT_TABLE define from under CONFIG_NFS_V4_1] [Fix comile error when CONFIG_NFS_V4_1 is not set.] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [moved nfs4_init_slot_table definition to "create_session operation"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: alloc session with GFP_KERNEL] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: nfs_client.cl_minorversionBenny Halevy2009-06-17
| | | | | | | | | | | | | | | This field is set to the nfsv4 minor version for this mount. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Note: This patch sets the referral to the same minorversion as the current mount. Revisit in future patch. Signed-off-by: Andy Adamson <andros@netapp.com> [removed cl_minorversion assignment in nfs_set_client] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [always define nfs_client.cl_minorversion] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Define and create superblock-level objectsDavid Howells2009-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define and create superblock-level cache index objects (as managed by nfs_server structs). Each superblock object is created in a server level index object and is itself an index into which inode-level objects are inserted. Ideally there would be one superblock-level object per server, and the former would be folded into the latter; however, since the "nosharecache" option exists this isn't possible. The superblock object key is a sequence consisting of: (1) Certain superblock s_flags. (2) Various connection parameters that serve to distinguish superblocks for sget(). (3) The volume FSID. (4) The security flavour. (5) The uniquifier length. (6) The uniquifier text. This is normally an empty string, unless the fsc=xyz mount option was used to explicitly specify a uniquifier. The key blob is of variable length, depending on the length of (6). The superblock object is given no coherency data to carry in the auxiliary data permitted by the cache. It is assumed that the superblock is always coherent. This patch also adds uniquification handling such that two otherwise identical superblocks, at least one of which is marked "nosharecache", won't end up trying to share the on-disk cache. It will be possible to manually provide a uniquifier through a mount option with a later patch to avoid the error otherwise produced. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
* NFS: Define and create server-level objectsDavid Howells2009-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define and create server-level cache index objects (as managed by nfs_client structs). Each server object is created in the NFS top-level index object and is itself an index into which superblock-level objects are inserted. Ideally there would be one superblock-level object per server, and the former would be folded into the latter; however, since the "nosharecache" option exists this isn't possible. The server object key is a sequence consisting of: (1) NFS version (2) Server address family (eg: AF_INET or AF_INET6) (3) Server port. (4) Server IP address. The key blob is of variable length, depending on the length of (4). The server object is given no coherency data to carry in the auxiliary data permitted by the cache. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
* NFS: Add FS-Cache option bit and debug bitDavid Howells2009-04-03
| | | | | | | | | | | | | Add FS-Cache option bit to nfs_server struct. This is set to indicate local on-disk caching is enabled for a particular superblock. Also add debug bit for local caching operations. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
* NFSv4: Simplify some cache consistency post-op GETATTRsTrond Myklebust2009-03-11
| | | | | | | | | | | Certain asynchronous operations such as write() do not expect (or care) that other metadata such as the file owner, mode, acls, ... change. All they want to do is update and/or check the change attribute, ctime, and mtime. By skipping the file owner and group update, we also avoid having to do a potential idmapper upcall for these asynchronous RPC calls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Remove nfs_client->cl_semTrond Myklebust2008-12-23
| | | | | | | | | | Now that we're using the flags to indicate state that needs to be recovered, as well as having implemented proper refcounting and spinlocking on the state and open_owners, we can get rid of nfs_client->cl_sem. The only remaining case that was dubious was the file locking, and that case is now covered by the nfsi->rwsem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up nfs_sb_active/nfs_sb_deactiveTrond Myklebust2008-10-06
| | | | | | | | | Instead of causing umount requests to block on server->active_wq while the asynchronous sillyrename deletes are executing, we can use the sb->s_active counter to obtain a reference to the super_block, and then release that reference in nfs_async_unlink_release(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Reintroduce machine credsTrond Myklebust2008-04-19
| | | | | | | | | We need to try to ensure that we always use the same credentials whenever we re-establish the clientid on the server. If not, the server won't recognise that we're the same client, and so may not allow us to recover state. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Save the values of the "mount*=" mount optionsChuck Lever2008-03-19
| | | | | | | | | | | | | Save the value of the mountproto= mountport= mountvers= and mountaddr= options so that these values can be displayed later via nfs_show_options(). This preserves the intent of the original mount options, should the file system need to be remounted based on what's displayed in /proc/mounts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Save the value of the "port=" mount optionChuck Lever2008-03-19
| | | | | | | | | | | | | | | | | | | | | | | During a remount based on the mount options displayed in /proc/mounts, we want to preserve the original behavior of the mount request. Let's save the original setting of the "port=" mount option in the mount's nfs_server structure. This allows us to simplify the default behavior of port setting for NFSv4 mounts: by default, NFSv2/3 mounts first try an RPC bind to determine the NFS server's port, unless the user specified the "port=" mount option; Users can force the client to skip the RPC bind by explicitly specifying "port=<value>". NFSv4, by contrast, assumes the NFS server port is 2049 and skips the RPC bind, unless the user specifies "port=". Users can force an RPC bind for NFSv4 by explicitly specifying "port=0". I added a couple of extra comments to clarify this behavior. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Invoke nlmclnt_init during NFS mount processingChuck Lever2008-01-30
| | | | | | | | | | | | Cache an appropriate nlm_host structure in the NFS client's mount point metadata for later use. Note that there is no need to set NFS_MOUNT_NONLM in the error case -- if nfs_start_lockd() returns a non-zero value, its callers ensure that the mount request fails outright. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix the 'proto=' mount optionTrond Myklebust2008-01-30
| | | | | | | | Currently, if you have a server mounted using networking protocol, you cannot specify a different value using the 'proto=' option on another mountpoint. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Support per-mountpoint timeout parameters.Trond Myklebust2008-01-30
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Expand server address storage in nfs_client structChuck Lever2008-01-30
| | | | | | | | | | | Prepare for managing larger addresses in the NFS client by widening the nfs_client struct's cl_addr field. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> (Modified to work with the new parameters for nfs_alloc_client) Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Increase size of cl_ipaddr field to hold IPv6 addressesChuck Lever2008-01-30
| | | | | | | | | The nfs_client's cl_ipaddr field needs to be larger to hold strings that represent IPv6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Remove the redundant nfs_client->cl_nfsversionTrond Myklebust2008-01-30
| | | | | | We can get the same information from the rpc_ops structure instead. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Stop sillyname renames and unmounts from racingSteve Dickson2008-01-30
| | | | | | | | | Added an active/deactive mechanism to the nfs_server structure allowing async operations to hold off umount until the operations are done. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Reduce the chances of an open_owner identifier collisionTrond Myklebust2007-07-10
| | | | | | Currently we just use a 32-bit counter. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Don't reuse expired nfs4_state_owner structsTrond Myklebust2007-07-10
| | | | | | That just confuses certain NFSv4 servers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: Remove redundant calls to rpciod_up()/rpciod_down()Trond Myklebust2007-07-10
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs: fix congestion control: use atomic_longsPeter Zijlstra2007-05-08
| | | | | | | | | | | | Change the atomic_t in struct nfs_server to atomic_long_t in anticipation of machines that can handle 8+TB of (4K) pages under writeback. However I suspect other things in NFS will start going *bang* by then. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] nfs: fix congestion controlPeter Zijlstra2007-03-16
| | | | | | | | | | | | | | | | | | | The current NFS client congestion logic is severly broken, it marks the backing device congested during each nfs_writepages() call but doesn't mirror this in nfs_writepage() which makes for deadlocks. Also it implements its own waitqueue. Replace this by a more regular congestion implementation that puts a cap on the number of active writeback pages and uses the bdi congestion waitqueue. Also always use an interruptible wait since it makes sense to be able to SIGKILL the process even for mounts without 'intr'. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* WorkStruct: Separate delayable and non-delayable events.David Howells2006-11-22
| | | | | | | | | | | | Separate delayable work items from non-delayable work items be splitting them into a separate structure (delayed_work), which incorporates a work_struct and the timer_list removed from work_struct. The work_struct struct is huge, and this limits it's usefulness. On a 64-bit architecture it's nearly 100 bytes in size. This reduces that by half for the non-delayable type of event. Signed-Off-By: David Howells <dhowells@redhat.com>
* NFSv4: Fix a use-after-free issue with the nfs server.Trond Myklebust2006-09-22
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Share NFS superblocks per-protocol per-server per-FSIDDavid Howells2006-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The attached patch makes NFS share superblocks between mounts from the same server and FSID over the same protocol. It does this by creating each superblock with a false root and returning the real root dentry in the vfsmount presented by get_sb(). The root dentry set starts off as an anonymous dentry if we don't already have the dentry for its inode, otherwise it simply returns the dentry we already have. We may thus end up with several trees of dentries in the superblock, and if at some later point one of anonymous tree roots is discovered by normal filesystem activity to be located in another tree within the superblock, the anonymous root is named and materialises attached to the second tree at the appropriate point. Why do it this way? Why not pass an extra argument to the mount() syscall to indicate the subpath and then pathwalk from the server root to the desired directory? You can't guarantee this will work for two reasons: (1) The root and intervening nodes may not be accessible to the client. With NFS2 and NFS3, for instance, mountd is called on the server to get the filehandle for the tip of a path. mountd won't give us handles for anything we don't have permission to access, and so we can't set up NFS inodes for such nodes, and so can't easily set up dentries (we'd have to have ghost inodes or something). With this patch we don't actually create dentries until we get handles from the server that we can use to set up their inodes, and we don't actually bind them into the tree until we know for sure where they go. (2) Inaccessible symbolic links. If we're asked to mount two exports from the server, eg: mount warthog:/warthog/aaa/xxx /mmm mount warthog:/warthog/bbb/yyy /nnn We may not be able to access anything nearer the root than xxx and yyy, but we may find out later that /mmm/www/yyy, say, is actually the same directory as the one mounted on /nnn. What we might then find out, for example, is that /warthog/bbb was actually a symbolic link to /warthog/aaa/xxx/www, but we can't actually determine that by talking to the server until /warthog is made available by NFS. This would lead to having constructed an errneous dentry tree which we can't easily fix. We can end up with a dentry marked as a directory when it should actually be a symlink, or we could end up with an apparently hardlinked directory. With this patch we need not make assumptions about the type of a dentry for which we can't retrieve information, nor need we assume we know its place in the grand scheme of things until we actually see that place. This patch reduces the possibility of aliasing in the inode and page caches for inodes that may be accessed by more than one NFS export. It also reduces the number of superblocks required for NFS where there are many NFS exports being used from a server (home directory server + autofs for example). This in turn makes it simpler to do local caching of network filesystems, as it can then be guaranteed that there won't be links from multiple inodes in separate superblocks to the same cache file. Obviously, cache aliasing between different levels of NFS protocol could still be a problem, but at least that gives us another key to use when indexing the cache. This patch makes the following changes: (1) The server record construction/destruction has been abstracted out into its own set of functions to make things easier to get right. These have been moved into fs/nfs/client.c. All the code in fs/nfs/client.c has to do with the management of connections to servers, and doesn't touch superblocks in any way; the remaining code in fs/nfs/super.c has to do with VFS superblock management. (2) The sequence of events undertaken by NFS mount is now reordered: (a) A volume representation (struct nfs_server) is allocated. (b) A server representation (struct nfs_client) is acquired. This may be allocated or shared, and is keyed on server address, port and NFS version. (c) If allocated, the client representation is initialised. The state member variable of nfs_client is used to prevent a race during initialisation from two mounts. (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find the root filehandle for the mount (fs/nfs/getroot.c). For NFS2/3 we are given the root FH in advance. (e) The volume FSID is probed for on the root FH. (f) The volume representation is initialised from the FSINFO record retrieved on the root FH. (g) sget() is called to acquire a superblock. This may be allocated or shared, keyed on client pointer and FSID. (h) If allocated, the superblock is initialised. (i) If the superblock is shared, then the new nfs_server record is discarded. (j) The root dentry for this mount is looked up from the root FH. (k) The root dentry for this mount is assigned to the vfsmount. (3) nfs_readdir_lookup() creates dentries for each of the entries readdir() returns; this function now attaches disconnected trees from alternate roots that happen to be discovered attached to a directory being read (in the same way nfs_lookup() is made to do for lookup ops). The new d_materialise_unique() function is now used to do this, thus permitting the whole thing to be done under one set of locks, and thus avoiding any race between mount and lookup operations on the same directory. (4) The client management code uses a new debug facility: NFSDBG_CLIENT which is set by echoing 1024 to /proc/net/sunrpc/nfs_debug. (5) Clone mounts are now called xdev mounts. (6) Use the dentry passed to the statfs() op as the handle for retrieving fs statistics rather than the root dentry of the superblock (which is now a dummy). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Eliminate client_sys in favour of cl_rpcclientDavid Howells2006-09-22
| | | | | | | | | | | | | | | | | | | Eliminate nfs_server::client_sys in favour of nfs_client::cl_rpcclient as we only really need one per server that we're talking to since it doesn't have any security on it. The retransmission management variables are also moved to the common struct as they're required to set up the cl_rpcclient connection. The NFS2/3 client and client_acl connections are thenceforth derived by cloning the cl_rpcclient connection and post-applying the authorisation flavour. The code for setting up the initial common connection has been moved to client.c as nfs_create_rpc_client(). All the NFS program definition tables are also moved there as that's where they're now required rather than super.c. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Move rpc_ops from nfs_server to nfs_clientDavid Howells2006-09-22
| | | | | | | | Move the rpc_ops from the nfs_server struct to the nfs_client struct as they're common to all server records of a particular NFS protocol version. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Maintain a common server record for NFS2/3 as well as for NFS4David Howells2006-09-22
| | | | | | | | Maintain a common server record for NFS2/3 as well as for NFS4 so that common stuff can be moved there from struct nfs_server. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Add extra const qualifiersDavid Howells2006-09-22
| | | | | | | Add some extra const qualifiers into NFS. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Generalise the nfs_client structureDavid Howells2006-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generalise the nfs_client structure by: (1) Moving nfs_client to a more general place (nfs_fs_sb.h). (2) Renaming its maintenance routines to be non-NFS4 specific. (3) Move those maintenance routines to a new non-NFS4 specific file (client.c) and move the declarations to internal.h. (4) Make nfs_find/get_client() take a full sockaddr_in to include the port number (will be required for NFS2/3). (5) Make nfs_find/get_client() take the NFS protocol version (again will be required to differentiate NFS2, 3 & 4 client records). Also: (6) Make nfs_client construction proceed akin to inodes, marking them as under construction and providing a function to indicate completion. (7) Make nfs_get_client() wait interruptibly if it finds a client that it can share, but that client is currently being constructed. (8) Make nfs4_create_client() use (6) and (7) instead of locking cl_sem. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Rename nfs_server::nfs4_stateDavid Howells2006-09-22
| | | | | | | | Rename nfs_server::nfs4_state to nfs_client as it will be used to represent the client state for NFS2 and NFS3 also. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Rename struct nfs4_client to struct nfs_clientDavid Howells2006-09-22
| | | | | | | | Rename struct nfs4_client to struct nfs_client so that it can become the basis for a general client record for NFS2 and NFS3 in addition to NFS4. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Store the file system "fsid" value in the NFS super block.Trond Myklebust2006-06-09
| | | | | | | This should enable us to detect if we are crossing a mountpoint in the case where the server is exporting "nohide" mounts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: report how long an NFS file system has been mountedChuck Lever2006-03-20
| | | | | | | | | | | | | Add a field in nfs_server to record a timestamp when a mount succeeds. Report the number of seconds the file system has been mounted via nfs_show_stats(). Test plan: Mount an NFS file system, watch the mountstats reports and compare with clock time. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: introduce mechanism for tracking NFS client metricsChuck Lever2006-03-20
| | | | | | | | | | | | | | | | | Add a per-superblock performance counter facility to the NFS client. This facility mimics the counters available for block devices and for networking. Expose these new counters via the new /proc/self/mountstats interface. Thanks to Andrew Morton and Trond Myklebust for their review and comments. Test plan: fsx and iozone on UP and SMP systems, with and without pre-emption. Watch for memory overwrite bugs, and performance loss (significantly more CPU required per op). Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: show retransmit settings when displaying mount optionsChuck Lever2006-03-20
| | | | | | | | | | | | | Sometimes it's important to know the exact RPC retransmit settings the kernel is using for an NFS mount point. Add this facility to the NFS client's show_options method. Test plan: Set various retransmit settings via the mount command, and check that the settings are reflected in /proc/mounts. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* [PATCH] NFS: Add support for NFSv3 ACLsAndreas Gruenbacher2005-06-22
| | | | | | | | | | | | | This adds acl support fo nfs clients via the NFSACL protocol extension, by implementing the getxattr, listxattr, setxattr, and removexattr iops for the system.posix_acl_access and system.posix_acl_default attributes. This patch implements a dumb version that uses no caching (and thus adds some overhead). (Another patch in this patchset adds caching as well.) Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Olaf Kirch <okir@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Linux-2.6.12-rc2v2.6.12-rc2Linus Torvalds2005-04-16
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!