aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* NLM: Fix reclaim racesTrond Myklebust2006-06-09
| | | | | | | | | Currently it is possible for a task to remove its locks at the same time as the NLM recovery thread is trying to recover them. This quickly leads to an Oops. Protect the locks using an rw semaphore while they are being recovered. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NLM: sem to mutex conversionTrond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* locks.c: add the fl_owner to nlm_compare_locksMarc Eshel2006-06-09
| | | | | | | | | | Add the fl_owner to NLM compare locks. Since two different client can present the same pid to the server it is not enough to distinguish locks from different clients. The fl_owner field is a pointer to the struct nlm_host which is unique for each client. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mountsTrond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Split fs/nfs/inode.cDavid Howells2006-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As fs/nfs/inode.c is rather large, heterogenous and unwieldy, the attached patch splits it up into a number of files: (*) fs/nfs/inode.c Strictly inode specific functions. (*) fs/nfs/super.c Superblock management functions for NFS and NFS4, normal access, clones and referrals. The NFS4 superblock functions _could_ move out into a separate conditionally compiled file, but it's probably not worth it as there're so many common bits. (*) fs/nfs/namespace.c Some namespace-specific functions have been moved here. (*) fs/nfs/nfs4namespace.c NFS4-specific namespace functions (this could be merged into the previous file). This file is conditionally compiled. (*) fs/nfs/internal.h Inter-file declarations, plus a few simple utility functions moved from fs/nfs/inode.c. Additionally, all the in-.c-file externs have been moved here, and those files they were moved from now includes this file. For the most part, the functions have not been changed, only some multiplexor functions have changed significantly. I've also: (*) Added some extra banner comments above some functions. (*) Rearranged the function order within the files to be more logical and better grouped (IMO), though someone may prefer a different order. (*) Reduced the number of #ifdefs in .c files. (*) Added missing __init and __exit directives. Signed-Off-By: David Howells <dhowells@redhat.com>
* NFS: Fix typo in nfs_do_clone_mount()Trond Myklebust2006-06-09
| | | | | | Doh! Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix compile errors introduced by referrals patchesTrond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure that referral mounts bind to a reserved portTrond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: A root pathname is sent as a zero component4Andy Adamson2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Follow a referralManoj Naik2006-06-09
| | | | | | | | | | | Respond to a moved error on NFS lookup by setting up the referral. Note: We don't actually follow the referral during lookup/getattr, but later when we detect fsid mismatch in inode revalidation (similar to the processing done for cloning submounts). Referrals will have fake attributes until they are actually followed or traversed. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure client submounts when following a referralManoj Naik2006-06-09
| | | | | | | | Set up mountpoint when hitting a referral on moved error by getting fs_locations. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Expand clone mounts to include other serversManoj Naik2006-06-09
| | | | | Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Create NFSv4 transport and clientManoj Naik2006-06-09
| | | | | | | | Move existing code into a separate function so that it can be also used by referral code. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Define an fs_locations bitmapManoj Naik2006-06-09
| | | | | | | | | | | This is (similar to getattr bitmap) but includes fs_locations and mounted_on_fileid attributes. Use this bitmap for encoding in fs_locations requests. Note: We can probably do better by requesting locations as part of fsinfo itself. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: GETATTR attributes on referralManoj Naik2006-06-09
| | | | | | | | Per referral draft, only fs_locations, fsid, and mounted_on_fileid can be requested in a GETATTR on referrals. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Decode mounted_on_fileid attribute in getattr.Manoj Naik2006-06-09
| | | | | | | | It is ignored if fileid is also requested. This will be used on referrals (fs_locations). Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: convert fs-locations-components to conform to RFC3530Manoj Naik2006-06-09
| | | | | | | | Use component4-style formats for decoding list of servers and pathnames in fs_locations. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Implement the fs_locations function callTrond Myklebust2006-06-09
| | | | | | | | | | | NFSv4 allows for the fact that filesystems may be replicated across several servers or that they may be migrated to a backup server in case of failure of the primary server. fs_locations is an NFSv4 operation for retrieving information about the location of migrated and/or replicated filesystems. Based on an initial implementation by Jiaying Zhang <jiayingz@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* RPC: Allow struc xdr_stream to read the page section of an xdr_bufTrond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Add timeout to submountsTrond Myklebust2006-06-09
| | | | | | | Make automounted partitions expire using the mark_mounts_for_expiry() function. The timeout is controlled via a sysctl. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Ensure the client submounts, when it crosses a server mountpoint.Trond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Store the file system "fsid" value in the NFS super block.Trond Myklebust2006-06-09
| | | | | | | This should enable us to detect if we are crossing a mountpoint in the case where the server is exporting "nohide" mounts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* VFS: Remove dependency of ->umount_begin() call on MNT_FORCETrond Myklebust2006-06-09
| | | | | | | Allow filesystems to decide to perform pre-umount processing whether or not MNT_FORCE is set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* VFS: Add shrink_submounts()Trond Myklebust2006-06-09
| | | | | | | | Allow a submount to be marked as being 'shrinkable' by means of the vfsmount->mnt_flags, and then add a function 'shrink_submounts()' which attempts to recursively unmount these submounts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* VFS: Unexport do_kern_mount() and clean up simple_pin_fs()Trond Myklebust2006-06-09
| | | | | | | Replace all module uses with the new vfs_kern_mount() interface, and fix up simple_pin_fs(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* VFS: Add GPL_EXPORTED function vfs_kern_mount()Trond Myklebust2006-06-09
| | | | | | | | | | | | do_kern_mount() does not allow the kernel to use private mount interfaces without exposing the same interfaces to userland. The problem is that the filesystem is referenced by name, thus meaning that it and its mount interface must be registered in the global filesystem list. vfs_kern_mount() passes the struct file_system_type as an explicit parameter in order to overcome this limitation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Remove nfs_delete_inode()Trond Myklebust2006-06-09
| | | | | | | | Now that we have a real nfs_invalidate_page() to ensure that truncate_inode_pages() does the right thing when there are pending dirty pages, we can get rid of nfs_delete_inode(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Flesh out nfs_invalidate_page()Trond Myklebust2006-06-09
| | | | | | | In the case of a call to truncate_inode_pages(), we should really try to cancel any pending writes on the page. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: remove obviously bogus comparison from decode_getaclJ. Bruce Fields2006-06-09
| | | | | | | | | | We just set *acl_len to zero, and attrlen is unsigned, so this comparison is clearly bogus. I have no idea what I was thinking. Fixes a bug that caused getacl to fail over krb5p. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: really return status from decode_recall_args()Alexey Dobriyan2006-06-09
| | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv3: Client-side nfsacl caching fixAndreas Gruenbacher2006-06-09
| | | | | | | | | | | | | | Fix two errors in the client-side acl cache: First, when nfs3_proc_getacl requests only the default acl of a file and the access acl is not cached already, a NULL access acl entry is cached instead of ERR_PTR(-EAGAIN) ("not cached"). Second, update the cached acls in nfs3_proc_setacls: nfs_refresh_inode does not always invalidate the cached acls, and when it does not, the cached acls get out of sync. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix up inode revalidation accountingTrond Myklebust2006-06-09
| | | | | | | Currently, we are accounting for all calls to nfs_revalidate_inode(), but not to nfs_revalidate_mapping(), or nfs_lookup_verify_inode(), etc... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Separate metadata and page cache revalidation mechanismsTrond Myklebust2006-06-09
| | | | | | | | | | Separate out the function of revalidating the inode metadata, and revalidating the mapping. The former may be called by lookup(), and only really needs to check that permissions, ctime, etc haven't changed whereas the latter needs only done when we want to read data from the page cache, and may need to sync and then invalidate the mapping. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: More page cache revalidation fixupsTrond Myklebust2006-06-09
| | | | | | | | Whenever the directory changes, we want to make sure that we always invalidate its page cache. Fix up update_changeattr() and nfs_mark_for_revalidate() so that they do so. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix page cache revalidationTrond Myklebust2006-06-09
| | | | | | | | Fix up a bug in the handling of NFS_INO_REVAL_PAGECACHE: make sure that nfs_update_inode() clears it when we're sure we're not racing with other updates. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Optimize allocation of nfs_read/write_data structuresChuck Lever2006-06-09
| | | | | | | | | | | | Clean up use of page_array, and fix an off-by-one error noticed by Tom Talpey which causes kmalloc calls in cases where using the page_array is sufficient. Test plan: Normal client functional testing with r/wsize=32768. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: NFS_ROOT always uses the same XIDsChuck Lever2006-06-09
| | | | | | | | | | | | | | | | | | The XID generator uses get_random_bytes to generate an initial XID. NFS_ROOT starts up before the random driver, though, so get_random_bytes doesn't set a random XID for NFS_ROOT. This causes NFS_ROOT mount points to reuse XIDs every time the client is booted. If the client boots often enough, the server will start serving old replies out of its DRC. Use net_random() instead. Test plan: I/O intensive workloads should perform well and generate no errors. Traces taken during client reboots should show that NFS_ROOT mounts use unique XIDs after every reboot. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC: select privileged port numbers at randomChuck Lever2006-06-09
| | | | | | | | | | | | | | | | | | | Make the RPC client select privileged ephemeral source ports at random. This improves DRC behavior on the server by using the same port when reconnecting for the same mount point, but using a different port for fresh mounts. The Linux TCP implementation already does this for nonprivileged ports. Note that TCP sockets in TIME_WAIT will prevent quick reuse of a random ephemeral port number by leaving the port INUSE until the connection transitions out of TIME_WAIT. Test plan: Connectathon against every known server implementation using multiple mount points. Locking especially. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up inode metadata updatesTrond Myklebust2006-06-09
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Some NFSv4 servers have broken behaviour for the change attributeTrond Myklebust2006-06-09
| | | | | | | | | The Linux NFSv4 server violates RFC3530 in that the change attribute is not guaranteed to be updated for every change to the inode. Our optimisation for checking whether or not the inode metadata has changed or not is broken too. Grr.... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up and fix page zeroing when we have short readsTrond Myklebust2006-06-09
| | | | | | | | | | | | | The code that is supposed to zero the uninitialised partial pages when the server returns a short read is currently broken: it looks at the nfs_page wb_pgbase and wb_bytes fields instead of the equivalent nfs_read_data values when deciding where to start truncating the page. Also ensure that we are more careful about setting PG_uptodate before retrying a short read: the retry will change the nfs_read_data args.pgbase and args.count. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge branch 'upstream-fixes' of ↵Linus Torvalds2006-06-08
|\ | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: e1000: remove risky prefetch on next_skb->data e1000: fix ethtool test irq alloc as "probe" [PATCH] bcm43xx: add DMA rx poll workaround to DMA4
| * Merge branch 'upstream-fixes' of ↵Jeff Garzik2006-06-08
| |\ | | | | | | | | | git://lost.foo-projects.org/~ahkok/git/netdev-2.6 into upstream-fixes
| | * e1000: remove risky prefetch on next_skb->dataAuke Kok2006-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was brought to our attention that the prefetches break e1000 traffic on xscale/arm architectures. Remove them for now. We'll let them stay in mm for a while, or find a better solution to enable. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
| | * e1000: fix ethtool test irq alloc as "probe"Auke Kok2006-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | New code added in 2.6.17 caused setup_irq to print a warning when running ethtool -t eth0 offline. This test marks the request_irq call made by this test as a "probe" to see if the interrupt is shared or not. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
| * | Merge branch 'upstream-fixes' of ↵Jeff Garzik2006-06-08
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes
| | * | [PATCH] bcm43xx: add DMA rx poll workaround to DMA4Michael Buesch2006-06-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also add the Poll RX DMA Memory workaround to the DMA4 (xmitstatus) path. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | | | [PATCH] s390: fix in-user atomic futex operation.Martin Schwidefsky2006-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Martin Schwidefsky <schwidefsky@de.ibm.com> __futex_atomic_op needs to do an atomic operation in the user address space, not the kernel address space. Add the missing sacf 256/sacf 0 to switch to the secondary mode before doing the compare-and-swap. In addition add another fixup for catch specification exceptions if the compare-and-swap address is not aligned. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] debugfs inode leakJens Axboe2006-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Looking at the reiser4 crash, I found a leak in debugfs. In debugfs_mknod(), we create the inode before checking if the dentry already has one attached. We don't free it if that is the case. These bugs happen quite often, I'm starting to think we should disallow such coding in CodingStyle. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] elevator switching raceJens Axboe2006-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a race between shutting down one io scheduler and firing up the next, in which a new io could enter and cause the io scheduler to be invoked with bad or NULL data. To fix this, we need to maintain the queue lock for a bit longer. Unfortunately we cannot do that, since the elevator init requires to be run without the lock held. This isn't easily fixable, without also changing the mempool API. So split the initialization into two parts, and alloc-init operation and an attach operation. Then we can preallocate the io scheduler and related structures, and run the attach inside the lock after we detach the old one. This patch has survived 30 minutes of 1 second io scheduler switching with a very busy io load. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>