aboutsummaryrefslogtreecommitdiffstats
path: root/fs/nfsd
Commit message (Collapse)AuthorAge
* NLM: Remove unused argument from svc_addsock() functionChuck Lever2008-10-04
| | | | | | | | | Clean up: The svc_addsock() function no longer uses its "proto" argument, so remove it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* NLM: Remove "proto" argument from lockd_up()Chuck Lever2008-10-04
| | | | | | | | | Clean up: Now that lockd_up() starts listeners for both transports, the "proto" argument is no longer needed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: common grace period controlJ. Bruce Fields2008-10-03
| | | | | | | | | | | | | | | | | | | | | Rewrite grace period code to unify management of grace period across lockd and nfsd. The current code has lockd and nfsd cooperate to compute a grace period which is satisfactory to them both, and then individually enforce it. This creates a slight race condition, since the enforcement is not coordinated. It's also more complicated than necessary. Here instead we have lockd and nfsd each inform common code when they enter the grace period, and when they're ready to leave the grace period, and allow normal locking only after both of them are ready to leave. We also expect the locks_start_grace()/locks_end_grace() interface here to be simpler to build on for future cluster/high-availability work, which may require (for example) putting individual filesystems into grace, or enforcing grace periods across multiple cluster nodes. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: use nfs client rpc callback programBenny Halevy2008-09-29
| | | | | | | | | | | | | | | since commit ff7d9756b501744540be65e172d27ee321d86103 "nfsd: use static memory for callback program and stats" do_probe_callback uses a static callback program (NFS4_CALLBACK) rather than the one set in clp->cl_callback.cb_prog as passed in by the client in setclientid (4.0) or create_session (4.1). This patches introduces rpc_create_args.prognumber that allows overriding program->number when creating rpc_clnt. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: do_probe_callback should not clear rpc statsBenny Halevy2008-09-29
| | | | | | | | | | | | | | | | | | | Now that cb_stats are static (since commit ff7d9756b501744540be65e172d27ee321d86103) there's no need to clear them. Initially I thought it might make sense to do that every callback probing but since the stats are per-program and they are shared between possibly several client callback instances, zeroing them out seems like the wrong thing to do. Note that that commit also introduced a bug since stats.program is also being cleared in the process and it is not restored after the memset as it used to be. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* knfsd: allocate readahead cache in individual chunksJeff Layton2008-09-29
| | | | | | | | | | | | | | | | | | | I had a report from someone building a large NFS server that they were unable to start more than 585 nfsd threads. It was reported against an older kernel using the slab allocator, and I tracked it down to the large allocation in nfsd_racache_init failing. It appears that the slub allocator handles large allocations better, but large contiguous allocations can often be problematic. There doesn't seem to be any reason that the racache has to be allocated as a single large chunk. This patch breaks this up so that the racache is built up from separate allocations. (Thanks also to Takashi Iwai for a bugfix.) Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Takashi Iwai <tiwai@suse.de>
* nfsd: nfs4xdr decode_stateid helper functionBenny Halevy2008-09-29
| | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: properly xdr-decode NFS4_OPEN_CLAIM_DELEGATE_CUR stateidBenny Halevy2008-09-29
| | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: don't declare p in ENCODE_SEQID_OP_HEADBenny Halevy2008-09-29
| | | | | | | | | | After using the encode_stateid helper the "p" pointer declared by ENCODE_SEQID_OP_HEAD is warned as unused. In the single site where it is still needed it can be declared separately using the ENCODE_HEAD macro. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: nfs4xdr encode_stateid helper functionBenny Halevy2008-09-29
| | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: fix nfsd4_encode_open buffer space reservationBenny Halevy2008-09-29
| | | | | | | | | nfsd4_encode_open first reservation is currently for 36 + sizeof(stateid_t) while it writes after the stateid a cinfo (20 bytes) and 5 more 4-bytes words, for a total of 40 + sizeof(stateid_t). Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: properly xdr-encode deleg stateid returned from openBenny Halevy2008-09-29
| | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: properly xdr-encode stateid4.seqid as uint32_t for cb_recallBenny Halevy2008-09-29
| | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: permit unauthenticated stat of export rootJ. Bruce Fields2008-09-29
| | | | | | | | | | | | | | | | | | | | | | | | RFC 2623 section 2.3.2 permits the server to bypass gss authentication checks for certain operations that a client may perform when mounting. In the case of a client that doesn't have some form of credentials available to it on boot, this allows it to perform the mount unattended. (Presumably real file access won't be needed until a user with credentials logs in.) Being slightly more lenient allows lots of old clients to access krb5-only exports, with the only loss being a small amount of information leaked about the root directory of the export. This affects only v2 and v3; v4 still requires authentication for all access. Thanks to Peter Staubach testing against a Solaris client, which suggesting addition of v3 getattr, to the list, and to Trond for noting that doing so exposes no additional information. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Peter Staubach <staubach@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
* SUNRPC: Add address family field to svc_serv data structureChuck Lever2008-09-29
| | | | | | | | | | Introduce and initialize an address family field in the svc_serv structure. This field will determine what family to use for the service's listener sockets and what families are advertised via the local rpcbind daemon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: fix buffer overrun decoding NFSv4 aclJ. Bruce Fields2008-09-01
| | | | | | | | | | The array we kmalloc() here is not large enough. Thanks to Johann Dahm and David Richter for bug report and testing. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: David Richter <richterd@citi.umich.edu> Tested-by: Johann Dahm <jdahm@umich.edu>
* nfsd: fix compound state allocation error handlingAndy Adamson2008-09-01
| | | | | | | | | Move the cstate_alloc call so that if it fails, the response is setup to encode the NFS error. The out label now means that the nfsd4_compound_state has not been allocated. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2008-08-12
|\ | | | | | | | | | | | | * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: fs/nfsd/export.c: Adjust error handling code involving auth_domain_put MAINTAINERS: mention lockd and sunrpc in nfs entries lockd: trivial sparse endian annotations
| * fs/nfsd/export.c: Adjust error handling code involving auth_domain_putJulia Lawall2008-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once clp is assigned, it never becomes NULL, so we can make a label for it in the error handling code. Because the call to path_lookup follows the call to auth_domain_find, its error handling code should jump to this new label. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r@ expression x,E; statement S; position p1,p2,p3; @@ ( if ((x = auth_domain_find@p1(...)) == NULL || ...) S | x = auth_domain_find@p1(...) ... when != x if (x == NULL || ...) S ) <... if@p3 (...) { ... when != auth_domain_put(x) when != if (x) { ... auth_domain_put(x); ...} return@p2 ...; } ...> ( return x; | return 0; | x = E | E = x | auth_domain_put(x) ) @exists@ position r.p1,r.p2,r.p3; expression x; int ret != 0; statement S; @@ * x = auth_domain_find@p1(...) <... * if@p3 (...) S ...> * return@p2 \(NULL\|ret\); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* | [NFSD] uninline nfsd4_op_name()Adrian Bunk2008-08-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There doesn't seem to be a compelling reason why nfsd4_op_name() is marked as "inline": It's only used in a dprintk(), and as long as it has only one caller non-ancient gcc versions anyway inline it automatically. This patch fixes the following compile error with gcc 3.4: ... CC fs/nfsd/nfs4proc.o nfs4proc.c: In function `nfsd4_proc_compound': nfs4proc.c:854: sorry, unimplemented: inlining failed in call to nfs4proc.c:897: sorry, unimplemented: called from here make[3]: *** [fs/nfsd/nfs4proc.o] Error 1 Reported-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> [ Also made it "const char *" - Linus] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] get rid of indirect users of namei.hAl Viro2008-07-26
| | | | | | | | | | | | | | fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all. Several places in the tree needed direct include. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | [PATCH] kill nameidata passing to permission(), rename to inode_permission()Al Viro2008-07-26
| | | | | | | | | | | | | | Incidentally, the name that gives hundreds of false positives on grep is not a good idea... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | [patch 5/5] vfs: remove mode parameter from vfs_symlink()Miklos Szeredi2008-07-26
| | | | | | | | | | | | | | | | | | Remove the unused mode parameter from vfs_symlink and callers. Thanks to Tetsuo Handa for noticing. CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
* | lockd: dont return EAGAIN for a permanent errorMiklos Szeredi2008-07-25
|/ | | | | | | | | | | | | | | | | | Fix nlm_fopen() to return NLM_FAILED (or NLM_LCK_DENIED_NOLOCKS) instead of NLM_LCK_DENIED. The latter means the lock request failed because of a conflicting lock (i.e. a temporary error), which is wrong in this case. Also fix the client to return ENOLCK instead of EAGAIN if a blocking lock request returns with NLM_LOCK_DENIED. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2008-07-21
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits) nfsd: nfs4xdr.c do-while is not a compound statement nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c lockd: Pass "struct sockaddr *" to new failover-by-IP function lockd: get host reference in nlmsvc_create_block() instead of callers lockd: minor svclock.c style fixes lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock lockd: nlm_release_host() checks for NULL, caller needn't file lock: reorder struct file_lock to save space on 64 bit builds nfsd: take file and mnt write in nfs4_upgrade_open nfsd: document open share bit tracking nfsd: tabulate nfs4 xdr encoding functions nfsd: dprint operation names svcrdma: Change WR context get/put to use the kmem cache svcrdma: Create a kmem cache for the WR contexts svcrdma: Add flush_scheduled_work to module exit function svcrdma: Limit ORD based on client's advertised IRD svcrdma: Remove unused wait q from svcrdma_xprt structure svcrdma: Remove unneeded spin locks from __svc_rdma_free svcrdma: Add dma map count and WARN_ON ...
| * nfsd: nfs4xdr.c do-while is not a compound statementHarvey Harrison2008-07-18
| | | | | | | | | | | | | | | | | | The WRITEMEM macro produces sparse warnings of the form: fs/nfsd/nfs4xdr.c:2668:2: warning: do-while statement is not a compound statement Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.cJ. Bruce Fields2008-07-18
| | | | | | | | | | | | | | | | Thanks to problem report and original patch from Harvey Harrison. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Harvey Harrison <harvey.harrison@gmail.com> Cc: Benny Halevy <bhalevy@panasas.com>
| * lockd: Pass "struct sockaddr *" to new failover-by-IP functionChuck Lever2008-07-15
| | | | | | | | | | | | | | | | | | | | | | | | | | Pass a more generic socket address type to nlmsvc_unlock_all_by_ip() to allow for future support of IPv6. Also provide additional sanity checking in failover_unlock_ip() when constructing the server's IP address. As an added bonus, provide clean kerneldoc comments on related NLM interfaces which were recently added. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd: take file and mnt write in nfs4_upgrade_openBenny Halevy2008-07-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | testing with newpynfs revealed this warning: Jul 3 07:32:50 buml kernel: writeable file with no mnt_want_write() Jul 3 07:32:50 buml kernel: ------------[ cut here ]------------ Jul 3 07:32:50 buml kernel: WARNING: at /usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/include/linux/fs.h:855 drop_file_write_access+0x6b/0x7e() Jul 3 07:32:50 buml kernel: Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc Jul 3 07:32:50 buml kernel: Call Trace: Jul 3 07:32:50 buml kernel: 6eaadc88: [<6002f471>] warn_on_slowpath+0x54/0x8e Jul 3 07:32:50 buml kernel: 6eaadcc8: [<601b790d>] printk+0xa0/0x793 Jul 3 07:32:50 buml kernel: 6eaadd38: [<601b6205>] __mutex_lock_slowpath+0x1db/0x1ea Jul 3 07:32:50 buml kernel: 6eaadd68: [<7107d4d5>] nfs4_preprocess_seqid_op+0x2a6/0x31c [nfsd] Jul 3 07:32:50 buml kernel: 6eaadda8: [<60078dc9>] drop_file_write_access+0x6b/0x7e Jul 3 07:32:50 buml kernel: 6eaaddc8: [<710804e4>] nfsd4_open_downgrade+0x114/0x1de [nfsd] Jul 3 07:32:50 buml kernel: 6eaade08: [<71076215>] nfsd4_proc_compound+0x1ba/0x2dc [nfsd] Jul 3 07:32:50 buml kernel: 6eaade48: [<71068221>] nfsd_dispatch+0xe5/0x1c2 [nfsd] Jul 3 07:32:50 buml kernel: 6eaade88: [<71312f81>] svc_process+0x3fd/0x714 [sunrpc] Jul 3 07:32:50 buml kernel: 6eaadea8: [<60039a81>] kernel_sigprocmask+0xf3/0x100 Jul 3 07:32:50 buml kernel: 6eaadee8: [<7106874b>] nfsd+0x182/0x29b [nfsd] Jul 3 07:32:50 buml kernel: 6eaadf48: [<60021cc9>] run_kernel_thread+0x41/0x4a Jul 3 07:32:50 buml kernel: 6eaadf58: [<710685c9>] nfsd+0x0/0x29b [nfsd] Jul 3 07:32:50 buml kernel: 6eaadf98: [<60021cb0>] run_kernel_thread+0x28/0x4a Jul 3 07:32:50 buml kernel: 6eaadfc8: [<60013829>] new_thread_handler+0x72/0x9c Jul 3 07:32:50 buml kernel: Jul 3 07:32:50 buml kernel: ---[ end trace 2426dd7cb2fba3bf ]--- Bruce Fields suggested this (Thanks!): maybe we need to be doing a mnt_want_write on open_upgrade and mnt_put_write on downgrade? This patch adds a call to mnt_want_write and file_take_write (which is doing the actual work). The counter-calls mnt_drop_write a file_release_write are now being properly called by drop_file_write_access in the exact path printed by the warning above. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * nfsd: document open share bit trackingJ. Bruce Fields2008-07-07
| | | | | | | | | | | | | | It's not immediately obvious from the code why we're doing this. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Benny Halevy <bhalevy@panasas.com>
| * nfsd: tabulate nfs4 xdr encoding functionsBenny Halevy2008-07-04
| | | | | | | | | | | | | | | | | | | | | | In preparation for minorversion 1 All encoders now return an nfserr status (typically their nfserr argument). Unsupported ops go through nfsd4_encode_operation too, so use nfsd4_encode_noop to encode nothing for their reply body. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * Merge branch 'for-bfields' of git://linux-nfs.org/~tomtucker/xprt-switch-2.6 ↵J. Bruce Fields2008-07-03
| |\ | | | | | | | | | into for-2.6.27
| * | nfsd: dprint operation namesBenny Halevy2008-07-02
| | | | | | | | | | | | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: nfs4 minorversion decoder vectorsBenny Halevy2008-07-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | Have separate vectors of operation decoders for each minorversion. Obsolete ops in newer minorversions have default implementation returning nfserr_opnotsupp. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: unsupported nfs4 ops should fail with nfserr_opnotsuppBenny Halevy2008-07-02
| | | | | | | | | | | | | | | | | | | | | nfserr_opnotsupp should be returned for unsupported nfs4 ops rather than nfserr_op_illegal. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: tabulate nfs4 xdr decoding functionsBenny Halevy2008-07-02
| | | | | | | | | | | | | | | | | | | | | In preparation for minorversion 1 Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: return nfserr_minor_vers_mismatch when compound minorversion != 0Benny Halevy2008-07-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check minorversion once before decoding any operation and reject with nfserr_minor_vers_mismatch if != 0 (this still happens in nfsd4_proc_compound). In this case return a zero length resultdata array as required by RFC3530. minorversion 1 processing will have its own vector of decoders. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: clean up mnt_want_write callsMiklos Szeredi2008-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | Multiple mnt_want_write() calls in the switch statement looks really ugly. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: treat all shutdown signals as equivalentJeff Layton2008-06-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | knfsd currently uses 2 signal masks when processing requests. A "loose" mask (SHUTDOWN_SIGS) that it uses when receiving network requests, and then a more "strict" mask (ALLOWED_SIGS, which is just SIGKILL) that it allows when doing the actual operation on the local storage. This is apparently unnecessarily complicated. The underlying filesystem should be able to sanely handle a signal in the middle of an operation. This patch removes the signal mask handling from knfsd altogether. When knfsd is started as a kthread, all signals are ignored. It then allows all of the signals in SHUTDOWN_SIGS. There's no need to set the mask as well. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: fix spurious EACCESS in reconnect_path()Neil Brown2008-06-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thanks to Frank Van Maarseveen for the original problem report: "A privileged process on an NFS client which drops privileges after using them to change the current working directory, will experience incorrect EACCES after an NFS server reboot. This problem can also occur after memory pressure on the server, particularly when the client side is quiet for some time." This occurs because the filehandle points to a directory whose parents are no longer in the dentry cache, and we're attempting to reconnect the directory to its parents without adequate permissions to perform lookups in the parent directories. We can therefore fix the problem by acquiring the necessary capabilities before attempting the reconnection. We do this only in the no_subtree_check case, since the documented behavior of the subtree_check export option requires the server to check that the user has lookup permissions on all parents. The subtree_check case still has a problem, since reconnect_path() unnecessarily requires both read and lookup permissions on all parent directories. However, a fix in that case would be more delicate, and use of subtree_check is already discouraged for other reasons. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Frank van Maarseveen <frankvm@frankvm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: rename MAY_ flagsMiklos Szeredi2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it clear, that these are not used outside nfsd, and to avoid name and number space conflicts with the VFS. [comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | knfsd: nfsd: Handle ERESTARTSYS from syscalls.NeilBrown2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | OCFS2 can return -ERESTARTSYS from write requests (and possibly elsewhere) if there is a signal pending. If nfsd is shutdown (by sending a signal to each thread) while there is still an IO load from the client, each thread could handle one last request with a signal pending. This can result in -ERESTARTSYS which is not understood by nfserrno() and so is reflected back to the client as nfserr_io aka -EIO. This is wrong. Instead, interpret ERESTARTSYS to mean "try again later" by returning nfserr_jukebox. The client will resend and - if the server is restarted - the write will (hopefully) be successful and everyone will be happy. The symptom that I narrowed down to this was: copy a large file via NFS to an OCFS2 filesystem, and restart the nfs server during the copy. The 'cp' might get an -EIO, and the file will be corrupted - presumably holes in the middle where writes appeared to fail. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: fix race in nfsd_nrthreads()Neil Brown2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | We need the nfsd_mutex before accessing nfsd_serv->sv_nrthreads or we can't even guarantee nfsd_serv will still be there. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | sunrpc: remove sv_kill_signal field from svc_serv structJeff Layton2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | Since we no longer make any distinction between shutdown signals with nfsd, then it becomes easier to just standardize on a particular signal to use to bring it down (SIGINT, in this case). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | knfsd: convert knfsd to kthread APIJeff Layton2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is rather large, but I couldn't figure out a way to break it up that would remain bisectable. It does several things: - change svc_thread_fn typedef to better match what kthread_create expects - change svc_pool_map_set_cpumask to be more kthread friendly. Make it take a task arg and and get rid of the "oldmask" - have svc_set_num_threads call kthread_create directly - eliminate __svc_create_thread Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | knfsd: remove special handling for SIGHUPJeff Layton2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The special handling for SIGHUP in knfsd is a holdover from much earlier versions of Linux where reloading the export table was more expensive. That facility is not really needed anymore and to my knowledge, is seldom-used. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | knfsd: clean up nfsd filesystem interfacesJeff Layton2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several of the nfsd filesystem interfaces allow changes to parameters that don't have any effect on a running nfsd service. They are only ever checked when nfsd is started. This patch fixes it so that changes to those procfiles return -EBUSY if nfsd is already running to make it clear that changes on the fly don't work. The patch should also close some relatively harmless races between changing the info in those interfaces and starting nfsd, since these variables are being moved under the protection of the nfsd_mutex. Finally, the nfsv4recoverydir file always returns -EINVAL if read. This patch fixes it to return the recoverydir path as expected. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | knfsd: Replace lock_kernel with a mutex for nfsd thread startup/shutdown ↵Neil Brown2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | locking. This removes the BKL from the RPC service creation codepath. The BKL really isn't adequate for this job since some of this info needs protection across sleeps. Also, add some comments to try and clarify how the locking should work and to make it clear that the BKL isn't necessary as long as there is adequate locking between tasks when touching the svc_serv fields. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: make nfs4xdr WRITEMEM safe against zero countBenny Halevy2008-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WRITEMEM zeroes the last word in the destination buffer for padding purposes, but this must not be done if no bytes are to be copied, as it would result in zeroing of the word right before the array. The current implementation works since it's always called with non zero nbytes or it follows an encoding of the string (or opaque) length which, if equal to zero, can be overwritten with zero. Nevertheless, it seems safer to check for this case. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * | nfsd: add dprintk of compound returnJ. Bruce Fields2008-06-23
| | | | | | | | | | | | | | | | | | | | | We already print each operation of the compound when debugging is turned on; printing the result could also help with remote debugging. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>