aboutsummaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAge
* nfs41: add support for the exclusive create flagsAlexandros Batsakis2009-12-05
| | | | | | | | | | In v4.1 the client MUST/SHOULD use the EXCLUSIVE4_1 flag instead of EXCLUSIVE4, and GUARDED when the server supports persistent sessions. For now (and until we support suppattr_exclcreat), we don't send any attributes with EXCLUSIVE4_1 relying in the subsequent SETATTR as in v4.0 Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: check if session exists and if it is persistentAlexandros Batsakis2009-12-05
| | | | | Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: V2 initial support for CB_RECALL_ANYAlexandros Batsakis2009-12-05
| | | | | | | | For now the clients returns _all_ the delegations of the specificed type it holds Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs4: V2 return/expire delegations depending on their typeAlexandros Batsakis2009-12-05
| | | | | Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs4: minor delegation cleaningAlexandros Batsakis2009-12-05
| | | | | Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: add support for callback with RPC version number 4Alexandros Batsakis2009-12-05
| | | | | | | | | The NFSv4.1 spec-29 (18.36.3) says that the server MUST use an ONC RPC (program) version number equal to 4 in callbacks sent to the client. For now we allow both versions 1 and 4. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: only state manager sets NFS4CLNT_SESSION_SETUPAndy Adamson2009-12-04
| | | | | | | | Replace sync and async handlers setting of the NFS4CLNT_SESSION_SETUP bit with setting NFS4CLNT_CHECK_LEASE, and let the state manager decide to reset the session. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: drain session cleanupAndy Adamson2009-12-04
| | | | | | | | | | | | Do not wake up the next slot_tbl_waitq task in nfs4_free_slot because we may be draining the slot. Either signal the state manager that the session is drained (the state manager wakes up tasks) OR wake up the next task. In nfs41_sequence_done, the slot dereference is only needed in the sequence operation success case. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: nfs41: fix state manager deadlock in session resetAndy Adamson2009-12-04
| | | | | | | | | | | | | | | | If the session is reset during state recovery, the state manager thread can sleep on the slot_tbl_waitq causing a deadlock. Add a completion framework to the session. Have the state manager thread set a new session state (NFS4CLNT_SESSION_DRAINING) and wait for the session slot table to drain. Signal the state manager thread in nfs41_sequence_free_slot when the NFS4CLNT_SESSION_DRAINING bit is set and the session is drained. Reported-by: Trond Myklebust <trond@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: remove nfs4_recover_sessionAndy Adamson2009-12-04
| | | | | | | | nfs4_recover_session can put rpciod to sleep. Just use nfs4_schedule_recovery. Reported-by: Trond Myklebust <trond.myklebust@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: don't clear tk_action on successAndy Adamson2009-12-04
| | | | | Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: fix switch in nfs4_recovery_handle_errorAndy Adamson2009-12-04
| | | | | | | Do not fall through and set NFS4CLNT_SESSION_RESET bit on NFS4ERR_EXPIRED Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: fix switch in nfs4_handle_exceptionAndy Adamson2009-12-04
| | | | | | | Do not fall through and call nfs4_delay on session error handling. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: free the slot on unhandled read errorsAndy Adamson2009-12-04
| | | | | | | | | nfs4_read_done returns zero on unhandled errors. nfs_readpage_result will return on a negative tk_status without freeing the slot. Call nfs4_sequence_free_slot on unhandled errors in nfs4_read_done. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: call free slot from nfs4_restart_rpcAndy Adamson2009-12-04
| | | | | | | | | | | | | | | | | | | | nfs41_sequence_free_slot can be called multiple times on SEQUENCE operation errors. No reason to inline nfs4_restart_rpc Reported-by: Trond Myklebust <trond.myklebust@netapp.com> nfs_writeback_done and nfs_readpage_retry call nfs4_restart_rpc outside the error handler, and the slot is not freed prior to restarting in the rpc_prepare state during session reset. Fix this by moving the call to nfs41_sequence_free_slot from the error path of nfs41_sequence_done into nfs4_restart_rpc, and by removing the test for NFS4CLNT_SESSION_SETUP. Always free slot and goto the rpc prepare state on async errors. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: nfs4_get_lease_time will never session resetAndy Adamson2009-12-04
| | | | | | | | Make this clear by calling rpc_restart-call. Prepare for nfs4_restart_rpc() to free slots. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: rename cl_state session SETUP bit to RESETAndy Adamson2009-12-04
| | | | | | | The bit is no longer used for session setup, only for session reset. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs41: add create session into establish_clidAndy Adamson2009-12-04
| | | | | | | | | | | | | | Reported-by: Trond Myklebust <trond.myklebust@netapp.com> Resetting the clientid from the state manager could result in not confirming the clientid due to create session not being called. Move the create session call from the NFS4CLNT_SESSION_SETUP state manager initialize session case into the NFS4CLNT_LEASE_EXPIRED case establish_clid call. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge branch 'devel' into linux-nextTrond Myklebust2009-12-03
|\
| * NFS4ERR_FILE_OPEN handling in Linux/NFSNeilBrown2009-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFS4ERR_FILE_OPEN is return by the server when an operation cannot be performed because the file is currently open and local (to the server) semantics prohibit the operation while the file is open. A typical case is a RENAME operation on an MS-Windows platform, which prevents rename while the file is open. While it is possible that such a condition is transitory, it is also very possible that the file will be held open for an extended period of time thus preventing the operation. The current behaviour of Linux/NFS is to retry the operation indefinitely. This is not appropriate - we do not expect a rename to take an arbitrary amount of time to complete. Rather, and error should be returned. The most obvious error code would be EBUSY, which is a legal at least for 'rename' and 'unlink', and accurately captures the reason for the error. This patch allows a few retries until about 2 seconds have elapsed, then returns EBUSY. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * nfs: clean up sillyrenaming in nfs_rename()Miklos Szeredi2009-12-03
| | | | | | | | | | | | | | | | | | | | | | | | The d_instantiate(new_dentry, NULL) is superfluous, the dentry is already negative. Rehashing this dummy dentry isn't needed either, d_move() works fine on an unhashed target. The re-checking for busy after a failed nfs_sillyrename() is bogus too: new_dentry->d_count < 2 would be a bug here. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * nfs: dont unhash target if renaming a directoryMiklos Szeredi2009-12-03
| | | | | | | | | | | | | | | | | | | | | | | | Move unhashing the target to after the check for existence and being a non-directory. If renaming a directory then the VFS already unhashes the target if it is not busy. If it's busy then acquiring more references during the rename makes no difference. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * nfs: fix comments in nfs_rename()Miklos Szeredi2009-12-03
| | | | | | | | | | | | | | | | | | Comments are wrong or out of date. In particular d_drop() doesn't free the inode it just unhashes the dentry. And if target is a directory then it is not checked for being busy. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * nfs: remove unnecessary check from nfs_rename()Miklos Szeredi2009-12-03
| | | | | | | | | | | | | | VFS already checks if both source and target are directories. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4.1: Handle NFSv4.1 session errors in the lock recovery codeTrond Myklebust2009-12-03
| | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Revert default r/wsize behaviorChuck Lever2009-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the "rsize=" or "wsize=" mount options are not specified, text-based mounts have slightly different behavior than legacy binary mounts. Text-based mounts use the smaller of the server's maximum and the client's maximum, but binary mounts use the smaller of the server's _preferred_ size and the client's maximum. This difference is actually pretty subtle. Most servers advertise the same value as their maximum and their preferred transfer size, so the end result is the same in most cases. The reason for this difference is that for text-based mounts, if r/wsize are not specified, they are set to the largest value supported by the client. For legacy mounts, the values are set to zero if these options are not specified. nfs_server_set_fsinfo() can negotiate the transfer size defaults correctly in any case. There's no need to specify any particular value as default in the text-based option parsing logic. Note that nfs4 doesn't use nfs_server_set_fsinfo(), but the mount.nfs4 command does set rsize and wsize to 0 if the user didn't specify these options. So, make the same change for text-based NFSv4 mounts. Thanks to James Pearson <james-p@moving-picture.com> for reporting and diagnosing the problem. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: Display compressed (shorthand) IPv6 in /proc/mountsChuck Lever2009-12-03
| | | | | | | | | | | | | | | | | | Recent changes to snprintf() introduced the %pI6c formatter, which can display an IPv6 address with standard shorthanding. Use this new formatter when displaying IPv6 server addresses in /proc/mounts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: convert proto= option to use netids rather than a protonameJeff Layton2009-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Solaris uses netids as values for the proto= option, so that when someone specifies "tcp6" they get traffic over TCP + IPv6. Until recently, this has never really been an issue for Linux since it didn't support NFS over IPv6. The netid and the protocol name were generally always the same (modulo any strange configuration in /etc/netconfig). The solaris manpage documents their proto= option as: proto= _netid_ | rdma This patch is intended to bring Linux closer to how the Solaris proto= option works, by declaring a static netid mapping in the kernel and converting the proto= and mountproto= options to follow it and display the proper values in /proc/mounts. Much of this functionality will need to be provided by a userspace mount.nfs patch. Chuck Lever has a patch to change mount.nfs in the same way. In principle, we could do *all* of this in userspace but that would mean that the options in /proc/mounts may not match the options used by userspace. The alternative to the static mapping here is to add a mechanism to upcall to userspace for netid's. I'm not opposed to that option, but it'll probably mean more overhead (and quite a bit more code). Rather than shoot for that at first, I figured it was probably better to start simply. Comments welcome. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * The rpc server does not require that service threads take the BKL.J. Bruce Fields2009-12-03
| | | | | | | | | | Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFS: BKL removal from the mount code...Trond Myklebust2009-12-03
| | | | | | | | | | | | | | None of the code in nfs_umount_begin() or nfs_remount() has any BKL dependency. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Ensure nfs4_close_context() is declared as staticTrond Myklebust2009-12-03
| | | | | | | | | | | | Fix another 'sparse' warning in fs/nfs/nfs4proc.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Ensure nfs_dns_lookup() and nfs_dns_update() are declared staticTrond Myklebust2009-12-03
| | | | | | | | | | | | Fix two 'sparse' warnings in fs/nfs/dns_resolve.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Fix up error handling in the state manager main loop.Trond Myklebust2009-12-03
| | | | | | | | | | | | | | | | The nfs4_state_manager should not be looking at the error values when deciding whether or not to loop round in order to handle a higher priority state recovery task. It should rather be looking at the clp->cl_state. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Handle NFS4ERR_GRACE when recovering an expired lease.Trond Myklebust2009-12-03
| | | | | | | | | | | | | | If our lease expires, and the server reboots while we're recovering, we need to be able to wait until the grace period is over. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Ensure the state manager handles NFS4ERR_NO_GRACE correctlyTrond Myklebust2009-12-03
| | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: The state manager shouldn't exit on errors that were handledTrond Myklebust2009-12-03
| | | | | | | | | | | | | | | | | | | | | | | | nfs4_recovery_handle_error() will correctly handle errors such as NFS4ERR_CB_PATH_DOWN, however because they are still passed back to the main loop in nfs4_state_manager(), they can cause the latter to exit prematurely. Fix this by letting nfs4_recovery_handle_error() change the error value in cases where there is no action required by the caller. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Fix up the callers of nfs4_state_end_reclaim_rebootTrond Myklebust2009-12-03
| | | | | | | | | | | | | | | | | | | | | | In practice, we need to ensure that we call nfs4_state_end_reclaim_reboot in 2 cases: - If we lose the lease while we were reclaiming state OR - After we're done with reboot recovery Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Fix a potential state manager deadlock when returning delegationsTrond Myklebust2009-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nfsv4 state manager could potentially deadlock inside __nfs_inode_return_delegation() if the server reboots, so that the calls to nfs_msync_inode() end up waiting on state recovery to complete. Also ensure that if a server reboot or network partition causes us to have to stop returning delegations, that NFS4CLNT_DELEGRETURN is set so that the state manager can resume any outstanding delegation returns after it has dealt with the state recovery situation. Finally, ensure that the state manager doesn't wait for the DELEGRETURN call to complete. It doesn't need to, and that too can cause a deadlock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | Re: acl trouble after upgrading ubuntuJ. Bruce Fields2009-12-03
|/ | | | | | | | | | | | | | Subject: [PATCH] nfs: fix acl decoding Commit 28f566942c6b1d929f5e240e69e7081b77b238d3 "NFS: use dynamically computed compound_hdr.replen for xdr_inline_pages offset" accidentally changed the amount of space to allow for the acl reply, resulting in an IO error on attempts to get an acl. Reported-by: Paul Rudin <paul@rudin.co.uk> Cc: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* CacheFiles: Update IMA counters when using dentry_openMarc Dionne2009-12-01
| | | | | | | | | | When IMA is active, using dentry_open without updating the IMA counters will result in free/open imbalance errors when fput is eventually called. Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 9p: fix build breakage introduced by FS-CacheDavid Howells2009-12-01
| | | | | | | | | | | | | | | | | | | | | While building 2.6.32-rc8-git2 for Fedora I noticed the following thinko in commit 201a15428bd54f83eccec8b7c64a04b8f9431204 ("FS-Cache: Handle pages pending storage that get evicted under OOM conditions"): fs/9p/cache.c: In function '__v9fs_fscache_release_page': fs/9p/cache.c:346: error: 'vnode' undeclared (first use in this function) fs/9p/cache.c:346: error: (Each undeclared identifier is reported only once fs/9p/cache.c:346: error: for each function it appears in.) make[2]: *** [fs/9p/cache.o] Error 1 Fix the 9P filesystem to correctly construct the argument to fscache_maybe_release_page(). Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Xiaotian Feng <dfeng@redhat.com> [from identical patch] Signed-off-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> [from identical patch] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds2009-11-30
|\ | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Fix sparse warning [CIFS] Duplicate data on appending to some Samba servers [CIFS] fix oops in cifs_lookup during net boot
| * [CIFS] Fix sparse warningSteve French2009-11-24
| | | | | | | | | | | | Also update CHANGES file Signed-off-by: Steve French <sfrench@us.ibm.com>
| * [CIFS] Duplicate data on appending to some Samba serversSteve French2009-11-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SMB writes are sent with a starting offset and length. When the server supports the newer SMB trans2 posix open (rather than using the SMB NTCreateX) a file can be opened with SMB_O_APPEND flag, and for that case Samba server assumes that the offset sent in SMBWriteX is unneeded since the write should go to the end of the file - which can cause problems if the write was cached (since the beginning part of a page could be written twice by the client mm). Jeff suggested that masking the flag on posix open on the client is easiest for the time being. Note that recent Samba server also had an unrelated problem with SMB NTCreateX and append (see samba bugzilla bug number 6898) which should not affect current Linux clients (unless cifs Unix Extensions are disabled). The cifs client did not send the O_APPEND flag on posix open before 2.6.29 so the fix is unneeded on early kernels. In the future, for the non-cached case (O_DIRECT, and forcedirectio mounts) it would be possible and useful to send O_APPEND on posix open (for Windows case: FILE_APPEND_DATA but not FILE_WRITE_DATA on SMB NTCreateX) but for cached writes although the vfs sets the offset to end of file it may fragment a write across pages - so we can't send O_APPEND on open (could result in sending part of a page twice). CC: Stable <stable@kernel.org> Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * [CIFS] fix oops in cifs_lookup during net bootSteve French2009-11-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes bugzilla.kernel.org bug number 14641 Lookup called during network boot (network root filesystem for diskless workstation) has case where nd is null in lookup. This patch fixes that in cifs_lookup. (Shirish noted that 2.6.30 and 2.6.31 stable need the same check) Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com> Acked-by: Jeff Layton <jlayton@redhat.com> Tested-by: Vladimir Stavrinov <vs@inist.ru> CC: Stable <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2009-11-30
|\ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: reject O_DIRECT flag also in fuse_create
| * | fuse: reject O_DIRECT flag also in fuse_createCsaba Henk2009-11-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The comment in fuse_open about O_DIRECT: "VFS checks this, but only _after_ ->open()" also holds for fuse_create, however, the same kind of check was missing there. As an impact of this bug, open(newfile, O_RDWR|O_CREAT|O_DIRECT) fails, but a stub newfile will remain if the fuse server handled the implied FUSE_CREATE request appropriately. Other impact: in the above situation ima_file_free() will complain to open/free imbalance if CONFIG_IMA is set. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Harshavardhana <harsha@gluster.com> Cc: stable@kernel.org
* | | jffs2: Fix memory corruption in jffs2_read_inode_range()David Woodhouse2009-11-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 2.6.23 kernel, commit a32ea1e1f925399e0d81ca3f7394a44a6dafa12c ("Fix read/truncate race") fixed a race in the generic code, and as a side effect, now do_generic_file_read() can ask us to readpage() past the i_size. This seems to be correctly handled by the block routines (e.g. block_read_full_page() fills the page with zeroes in case if somebody is trying to read past the last inode's block). JFFS2 doesn't handle this; it assumes that it won't be asked to read pages which don't exist -- and thus that there will be at least _one_ valid 'frag' on the page it's being asked to read. It will fill any holes with the following memset: memset(buf, 0, min(end, frag->ofs + frag->size) - offset); When the 'closest smaller match' returned by jffs2_lookup_node_frag() is actually on a previous page and ends before 'offset', that results in: memset(buf, 0, <huge unsigned negative>); Hopefully, in most cases the corruption is fatal, and quickly causing random oopses, like this: root@10.0.0.4:~/ltp-fs-20090531# ./testcases/kernel/fs/ftest/ftest01 Unable to handle kernel paging request for data at address 0x00000008 Faulting instruction address: 0xc01cd980 Oops: Kernel access of bad area, sig: 11 [#1] [...] NIP [c01cd980] rb_insert_color+0x38/0x184 LR [c0043978] enqueue_hrtimer+0x88/0xc4 Call Trace: [c6c63b60] [c004f9a8] tick_sched_timer+0xa0/0xe4 (unreliable) [c6c63b80] [c0043978] enqueue_hrtimer+0x88/0xc4 [c6c63b90] [c0043a48] __run_hrtimer+0x94/0xbc [c6c63bb0] [c0044628] hrtimer_interrupt+0x140/0x2b8 [c6c63c10] [c000f8e8] timer_interrupt+0x13c/0x254 [c6c63c30] [c001352c] ret_from_except+0x0/0x14 --- Exception: 901 at memset+0x38/0x5c LR = jffs2_read_inode_range+0x144/0x17c [c6c63cf0] [00000000] (null) (unreliable) This patch fixes the issue, plus fixes all LTP tests on NAND/UBI with JFFS2 filesystem that were failing since 2.6.23 (seems like the bug above also broke the truncation). Reported-By: Anton Vorontsov <avorontsov@ru.mvista.com> Tested-By: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscacheLinus Torvalds2009-11-30
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (31 commits) FS-Cache: Provide nop fscache_stat_d() if CONFIG_FSCACHE_STATS=n SLOW_WORK: Fix GFS2 to #include <linux/module.h> before using THIS_MODULE SLOW_WORK: Fix CIFS to pass THIS_MODULE to slow_work_register_user() CacheFiles: Don't log lookup/create failing with ENOBUFS CacheFiles: Catch an overly long wait for an old active object CacheFiles: Better showing of debugging information in active object problems CacheFiles: Mark parent directory locks as I_MUTEX_PARENT to keep lockdep happy CacheFiles: Handle truncate unlocking the page we're reading CacheFiles: Don't write a full page if there's only a partial page to cache FS-Cache: Actually requeue an object when requested FS-Cache: Start processing an object's operations on that object's death FS-Cache: Make sure FSCACHE_COOKIE_LOOKING_UP cleared on lookup failure FS-Cache: Add a retirement stat counter FS-Cache: Handle pages pending storage that get evicted under OOM conditions FS-Cache: Handle read request vs lookup, creation or other cache failure FS-Cache: Don't delete pending pages from the page-store tracking tree FS-Cache: Fix lock misorder in fscache_write_op() FS-Cache: The object-available state can't rely on the cookie to be available FS-Cache: Permit cache retrieval ops to be interrupted in the initial wait phase FS-Cache: Use radix tree preload correctly in tracking of pages to be stored ...
| * | FS-Cache: Provide nop fscache_stat_d() if CONFIG_FSCACHE_STATS=nDavid Howells2009-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide nop fscache_stat_d() macro if CONFIG_FSCACHE_STATS=n lest errors like the following occur: fs/fscache/cache.c: In function 'fscache_withdraw_cache': fs/fscache/cache.c:386: error: implicit declaration of function 'fscache_stat_d' fs/fscache/cache.c:386: error: 'fscache_n_cop_sync_cache' undeclared (first use in this function) fs/fscache/cache.c:386: error: (Each undeclared identifier is reported only once fs/fscache/cache.c:386: error: for each function it appears in.) fs/fscache/cache.c:392: error: 'fscache_n_cop_dissociate_pages' undeclared (first use in this function) Signed-off-by: David Howells <dhowells@redhat.com>