aboutsummaryrefslogtreecommitdiffstats
path: root/fs/gfs2/sys.c
Commit message (Collapse)AuthorAge
* Merge branch 'for-linus' of ↵Linus Torvalds2013-02-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
| * fs: change return values from -EACCES to -EPERMZhao Hongjiang2013-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to SUSv3: [EACCES] Permission denied. An attempt was made to access a file in a way forbidden by its file access permissions. [EPERM] Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resource. So -EPERM should be returned if capability checks fails. Strictly speaking this is an API change since the error code user sees is altered. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'for-linus' of ↵Linus Torvalds2013-02-25
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace and namespace infrastructure changes from Eric W Biederman: "This set of changes starts with a few small enhnacements to the user namespace. reboot support, allowing more arbitrary mappings, and support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the user namespace root. I do my best to document that if you care about limiting your unprivileged users that when you have the user namespace support enabled you will need to enable memory control groups. There is a minor bug fix to prevent overflowing the stack if someone creates way too many user namespaces. The bulk of the changes are a continuation of the kuid/kgid push down work through the filesystems. These changes make using uids and gids typesafe which ensures that these filesystems are safe to use when multiple user namespaces are in use. The filesystems converted for 3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The changes for these filesystems were a little more involved so I split the changes into smaller hopefully obviously correct changes. XFS is the only filesystem that remains. I was hoping I could get that in this release so that user namespace support would be enabled with an allyesconfig or an allmodconfig but it looks like the xfs changes need another couple of days before it they are ready." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits) cifs: Enable building with user namespaces enabled. cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t cifs: Convert struct cifs_sb_info to use kuids and kgids cifs: Modify struct smb_vol to use kuids and kgids cifs: Convert struct cifsFileInfo to use a kuid cifs: Convert struct cifs_fattr to use kuid and kgids cifs: Convert struct tcon_link to use a kuid. cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t cifs: Convert from a kuid before printing current_fsuid cifs: Use kuids and kgids SID to uid/gid mapping cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc cifs: Use BUILD_BUG_ON to validate uids and gids are the same size cifs: Override unmappable incoming uids and gids nfsd: Enable building with user namespaces enabled. nfsd: Properly compare and initialize kuids and kgids nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids nfsd: Modify nfsd4_cb_sec to use kuids and kgids nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion nfsd: Convert nfsxdr to use kuids and kgids nfsd: Convert nfs3xdr to use kuids and kgids ...
| * | gfs2: Convert gfs2_quota_refresh to take a kqidEric W. Biederman2013-02-13
| |/ | | | | | | | | | | | | | | | | | | | | - In quota_refresh_user_store convert the user supplied uid into a kqid and pass it to gfs2_quota_refresh. - In quota_refresh_group_store convert the user supplied gid into a kqid and pass it to gfs2_quota_refresh. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | GFS2: Reinstate withdraw ack systemSteven Whitehouse2013-02-13
| | | | | | | | | | | | | | | | | | This patch reinstates the ack system which withdraw should be using. It appears to have been accidentally forgotten when the lock module was merged into GFS2, due to two different sysfs files having the same name. Reported-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | GFS2: Clean up freeze codeSteven Whitehouse2013-01-29
|/ | | | | | | | | | | | | | | The freeze code has not been looked at a lot recently. Upstream has moved on, and this is an attempt to catch us back up again. There is a vfs level interface for the freeze code which can be called from our (obsolete, but kept for backward compatibility purposes) sysfs freeze interface. This means freezing this way vs. doing it from the ioctl should now work in identical fashion. As a result of this, the freeze function is only called once and we can drop our own special purpose code for counting the number of freezes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmwLinus Torvalds2012-07-24
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull GFS2 updates from Steven Whitehouse. * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw: GFS2: Eliminate 64-bit divides GFS2: Reduce file fragmentation GFS2: kernel panic with small gfs2 filesystems - 1 RG GFS2: Fixing double brelse'ing bh allocated in gfs2_meta_read when EIO occurs GFS2: Combine functions get_local_rgrp and gfs2_inplace_reserve GFS2: Add kobject release method GFS2: Size seq_file buffer more carefully GFS2: Use seq_vprintf for glocks debugfs file seq_file: Add seq_vprintf function and export it GFS2: Use lvbs for storing rgrp information with mount option GFS2: Cache last hash bucket for glock seq_files GFS2: Increase buffer size for glocks and glstats debugfs files GFS2: Fix error handling when reading an invalid block from the journal GFS2: Add "top dir" flag support GFS2: Fold quota data into the reservations struct GFS2: Extend the life of the reservations
| * GFS2: Add kobject release methodBob Peterson2012-06-13
| | | | | | | | | | | | | | | | | | | | This patch adds a kobject release function that properly maintains the kobject use count, so that accesses to the sysfs files do not cause an access to freed kernel memory after an unmount. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | quota: Split dquot_quota_sync() to writeback and cache flushing partJan Kara2012-07-22
|/ | | | | | | | | | | | Split off part of dquot_quota_sync() which writes dquots into a quota file to a separate function. In the next patch we will use the function from filesystems and we do not want to abuse ->quota_sync quotactl callback more than necessary. Acked-by: Steven Whitehouse <swhiteho@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* gfs2: fix recovery during unmountDavid Teigland2012-05-02
| | | | | | | | | Journal recovery from lock_dlm should not be ignored if there is an unmount in progress. Ignoring it will causes the recovery to get stuck. The recovery process will correctly handle an in-progess unmount. Signed-off-by: David Teigland <teigland@redhat.com>
* GFS2: dlm based recovery coordinationDavid Teigland2012-01-11
| | | | | | | | | | | | | | This new method of managing recovery is an alternative to the previous approach of using the userland gfs_controld. - use dlm slot numbers to assign journal id's - use dlm recovery callbacks to initiate journal recovery - use a dlm lock to determine the first node to mount fs - use a dlm lock to track journals that need recovery Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Fix race during filesystem mountSteven Whitehouse2011-07-12
| | | | | | | | | | | | | | | | | | | | There is a potential race during filesystem mounting which has recently been reported. It occurs when the userland gfs_controld is able to process requests fast enough that it tries to use the sysfs interface before the lock module is properly initialised. This is a pretty unusual case as normally the lock module initialisation is very quick compared with gfs_controld. This patch adds an interruptible completion which is used to ensure that userland will wait for the initialisation of the lock module to complete. There are other potential solutions to this problem, but this is the quickest at this stage and has been tested both with and without mount.gfs2 present in the system. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: David Booher <dbooher@adams.net>
* GFS2: Use UUID field in generic superblockSteven Whitehouse2011-05-10
| | | | | | | The VFS superblock structure now has a UUID field, so we can use that in preference to the UUID field in the GFS2 superblock now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Fix type mapping for demote_rq interfaceSteven Whitehouse2010-10-06
| | | | | | | | | Mostly the glock operations follow the type of the glock. The one exception is the transaction glock, so we need to check for that directly. Reported-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Improve journal allocation via sysfsSteven Whitehouse2010-09-29
| | | | | | | | | | | | | | | | Recently a feature was added to GFS2 to allow journal id allocation via sysfs. This patch builds upon that so that a negative journal id will be treated as an error code to be passed back as the return code from mount. This allows termination of the mount process if there is a failure. Also, the process has been updated so that the kernel will wait for a journal id, even in the "spectator" case. This is required in order to avoid mounting a filesystem in case there is an error while joining the cluster. In the spectator case, 0 is written into the file to indicate that all is well, and that mount should continue. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2010-08-07
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (55 commits) workqueue: mark init_workqueues() as early_initcall() workqueue: explain for_each_*cwq_cpu() iterators fscache: fix build on !CONFIG_SYSCTL slow-work: kill it gfs2: use workqueue instead of slow-work drm: use workqueue instead of slow-work cifs: use workqueue instead of slow-work fscache: drop references to slow-work fscache: convert operation to use workqueue instead of slow-work fscache: convert object to use workqueue instead of slow-work workqueue: fix how cpu number is stored in work->data workqueue: fix mayday_mask handling on UP workqueue: fix build problem on !CONFIG_SMP workqueue: fix locking in retry path of maybe_create_worker() async: use workqueue for worker pool workqueue: remove WQ_SINGLE_CPU and use WQ_UNBOUND instead workqueue: implement unbound workqueue workqueue: prepare for WQ_UNBOUND implementation libata: take advantage of cmwq and remove concurrency limitations workqueue: fix worker management invocation without pending works ... Fixed up conflicts in fs/cifs/* as per Tejun. Other trivial conflicts in include/linux/workqueue.h, kernel/trace/Kconfig and kernel/workqueue.c
| * gfs2: use workqueue instead of slow-workTejun Heo2010-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Workqueue can now handle high concurrency. Convert gfs to use workqueue instead of slow-work. * Steven pointed out that recovery path might be run from allocation path and thus requires forward progress guarantee without memory allocation. Create and use gfs_recovery_wq with rescuer. Please note that forward progress wasn't guaranteed with slow-work. * Updated to use non-reentrant workqueue. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
* | GFS2: Wait for journal id on mount if not specified on mount command lineSteven Whitehouse2010-07-29
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements a wait for the journal id in the case that it has not been specified on the command line. This is to allow the future removal of the mount.gfs2 helper. The journal id would instead be directly communicated by gfs_controld to the file system. Here is a comparison of the two systems: Current: 1. mount calls mount.gfs2 2. mount.gfs2 connects to gfs_controld to retrieve the journal id 3. mount.gfs2 adds the journal id to the mount command line and calls the mount system call 4. gfs_controld receives the status of the mount request via a uevent Proposed: 1. mount calls the mount system call (no mount.gfs2 helper) 2. gfs_controld receives a uevent for a gfs2 fs which it doesn't know about already 3. gfs_controld assigns a journal id to it via sysfs 4. the mount system call then completes as normal (sending a uevent according to status) The advantage of the proposed system is that it is completely backward compatible with the current system both at the kernel and at the userland levels. The "first" parameter can also be set the same way, with the restriction that it must be set before the journal id is assigned. In addition, if mount becomes stuck waiting for a reply from gfs_controld which never arrives, then it is killable and will abort the mount gracefully. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds2010-05-21
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: GFS2: Fix typo GFS2: stuck in inode wait, no glocks stuck GFS2: Eliminate useless err variable GFS2: Fix writing to non-page aligned gfs2_quota structures GFS2: Add some useful messages GFS2: fix quota state reporting GFS2: Various gfs2_logd improvements GFS2: glock livelock GFS2: Clean up stuffed file copying GFS2: docs update GFS2: Remove space from slab cache name
| * GFS2: Fix typoSteven Whitehouse2010-05-14
| | | | | | | | | | | | A missing ! in a test. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * GFS2: Add some useful messagesSteven Whitehouse2010-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | The following patch adds a message to indicate when barriers have been disabled due to a block device which doesn't support them. You could already tell this via the mount options in /proc/mounts, but all the other filesystems also log a message at the same time. Also, the same mechanisms are used to indicate when the lock demote interface has been used (only ever used for debugging) which is a request from our support team. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * GFS2: Various gfs2_logd improvementsBenjamin Marzinski2010-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch contains various tweaks to how log flushes and active item writeback work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reseve now waits for gfs2_logd to do the log flushing. Multiple functions were rewritten to remove the need to call gfs2_log_lock(). Instead of using one test to see if gfs2_logd had work to do, there are now seperate tests to check if there are two many buffers in the incore log or if there are two many items on the active items list. This patch is a port of a patch Steve Whitehouse wrote about a year ago, with some minor changes. Since gfs2_ail1_start always submits all the active items, it no longer needs to keep track of the first ai submitted, so this has been removed. In gfs2_log_reserve(), the order of the calls to prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has been switched. If it called wake_up first there was a small window for a race, where logd could run and return before gfs2_log_reserve was ready to get woken up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve() would be left waiting for gfs2_logd to eventualy run because it timed out. Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times out, and flushes the log, can now be set on mount with ar_commit. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-30
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* Driver core: Constify struct sysfs_ops in struct kobj_typeEmese Revfy2010-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> Acked-by: David Teigland <teigland@redhat.com> Acked-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* kobject: Constify struct kset_uevent_opsEmese Revfy2010-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | Constify struct kset_uevent_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Merge branch 'for_linus' of ↵Linus Torvalds2010-03-05
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits) quota: stop using QUOTA_OK / NO_QUOTA dquot: cleanup dquot initialize routine dquot: move dquot initialization responsibility into the filesystem dquot: cleanup dquot drop routine dquot: move dquot drop responsibility into the filesystem dquot: cleanup dquot transfer routine dquot: move dquot transfer responsibility into the filesystem dquot: cleanup inode allocation / freeing routines dquot: cleanup space allocation / freeing routines ext3: add writepage sanity checks ext3: Truncate allocated blocks if direct IO write fails to update i_size quota: Properly invalidate caches even for filesystems with blocksize < pagesize quota: generalize quota transfer interface quota: sb_quota state flags cleanup jbd: Delay discarding buffers in journal_unmap_buffer ext3: quota_write cross block boundary behaviour quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota quota: split out compat_sys_quotactl support from quota.c quota: split out netlink notification support from quota.c quota: remove invalid optimization from quota_sync_all ... Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c
| * quota: move code from sync_quota_sb into vfs_quota_syncChristoph Hellwig2010-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | Currenly sync_quota_sb does a lot of sync and truncate action that only applies to "VFS" style quotas and is actively harmful for the sync performance in XFS. Move it into vfs_quota_sync and add a wait parameter to ->quota_sync to tell if we need it or not. My audit of the GFS2 code says it's also not needed given the way GFS2 implements quotas, but I'd be happy if this can get a detailed review. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
* | GFS2: Remove loopy umount codeSteven Whitehouse2010-03-01
|/ | | | | | | | | | | | | As a consequence of the previous patch, we can now remove the loop which used to be required due to the circular dependency between the inodes and glocks. Instead we can just invalidate the inodes, and then clear up any glocks which are left. Also we no longer need the rwsem since there is no longer any danger of the inode invalidation calling back into the glock code (and from there back into the inode code). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* fs/gfs2/sys.c: use %pUB to print UUIDsJoe Perches2009-12-15
| | | | | | | Signed-off-by: Joe Perches <joe@perches.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* GFS2: Add proper error reporting to quota sync via sysfsSteven Whitehouse2009-12-03
| | | | | | For some reason, the errors were not making it to userspace. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Alter arguments of gfs2_quota/statfs_syncSteven Whitehouse2009-12-03
| | | | | | | | | These two functions are altered so that gfs2_quota_sync may in future be called directly from the VFS. The GFS2 superblock changes to a VFS super block and there is an addition of an int argument which is currently ignored. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Remove unused sysfs fileSteven Whitehouse2009-09-09
| | | | | | | | The /sys/fs/gfs2/<fsname>/lock_module/id file has been unused for some time now, so we can remove it. We still accept the mount option though, as userspace still sends that. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Add sysfs link to deviceSteven Whitehouse2009-08-17
| | | | | | | | This adds a link from the per-gfs2 sb sysfs directory to the block device upon which the filesystem is mounted. The link is called "device", strangely enough :-) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Add some more info to ueventsSteven Whitehouse2009-08-17
| | | | | | | | | | | | | With each uevent, we now always include the journal ID. We can't call it JID since that is already in use by some of the individual events relating to recovery, so we use JOURNALID instead. We don't send the JOURNALID for spectator mounts, since there isn't one. Also the ADD event now has both RDONLY and SPECTATOR information to match that of the ONLINE event. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Fix permissions on "recover" fileSteven Whitehouse2009-08-14
| | | | | | | | | | | Although this file is only ever written and not read by userspace, it seems that the utils are opening this file O_RDWR, so we need to allow that. Also fixes the whitespace which seemed to be broken. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>
* GFS2: Remove args subdir from gfs2 sysfs filesSteven Whitehouse2009-05-26
| | | | | | | | | | | | | | | | | | | | | Since we can cat /proc/mounts there is no need to have this subdirectory in the gfs2 sysfs files. In fact this does not reflect the full range of possible mount argumenmts, where as /proc/mounts does. There was only one userland user of this set of sysfs files and it will function perfectly well without these files being present (in fact that subcommand of gfs2_tool is obsolete anyway). The tune/* subdirectory is also considered mostly obsolete, but there are a few uses of this until mount arguments can be added for the last few functions for which there are no equivalents currently. However the tune/* directory is still in my sights and new code should avoid using it. Only the gfs2_quota and gfs2_tool programs are know to use tune/* at the moment. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Remove lockstruct subdir from gfs2 sysfs filesSteven Whitehouse2009-05-26
| | | | | | | | | The lockstruct sub directory contained two entries, both of which are duplicated elsewhere in the gfs2 sysfs files as well as being available via /proc/mounts. There is no userland program using either of them, so this patch removes them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Umount recovery race fixSteven Whitehouse2009-05-19
| | | | | | | | | | | | | | | | | | | This patch fixes a race condition where we can receive recovery requests part way through processing a umount. This was causing problems since the recovery thread had already gone away. Looking in more detail at the recovery code, it was really trying to implement a slight variation on a work queue, and that happens to align nicely with the recently introduced slow-work subsystem. As a result I've updated the code to use slow-work, rather than its own home grown variety of work queue. When using the wait_on_bit() function, I noticed that the wait function that was supplied as an argument was appearing in the WCHAN field, so I've updated the function names in order to produce more meaningful output. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Remove a couple of unused sysfs entriesSteven Whitehouse2009-05-13
| | | | | | | | | These two tunables are pointless and would never need to be changed anyway. There is also a race between them and umount as the deamons which they refer to might have gone away. The easiest way to fix the race is to remove the interface. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Add commit= mount optionSteven Whitehouse2009-05-13
| | | | | | | | | | | | | | | It has always been possible to adjust the gfs2 log commit interval, but only from the sysfs interface. This adds a mount option, commit=<nn>, which will be familar to ext3 users. The sysfs interface continues to be available as well, although this might be removed in the future. Also this patch cleans up some duplicated structures in the GFS2 sysfs code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Add a "demote a glock" interface to sysfsSteven Whitehouse2009-03-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a sysfs file called demote_rq to GFS2's per filesystem directory. Its possible to use this file to demote arbitrary glocks in exactly the same way as if a request had come in from a remote node. This is intended for testing issues relating to caching of data under glocks. Despite that, the interface is generic enough to send requests to any type of glock, but be careful as its not always safe to send an arbitrary message to an arbitrary glock. For that reason and to prevent DoS, this interface is restricted to root only. The messages look like this: <type>:<glocknumber> <mode> Example: echo -n "2:13324 EX" >/sys/fs/gfs2/unity:myfs/demote_rq Which means "please demote inode glock (type 2) number 13324 so that I can get an EX (exclusive) lock". The lock modes are those which would normally be sent by a remote node in its callback so if you want to unlock a glock, you use EX, to demote to shared, use SH or PR (depending on whether you like GFS2 or DLM lock modes better!). If the glock doesn't exist, you'll get -ENOENT returned. If the arguments don't make sense, you'll get -EINVAL returned. The plan is that this interface will be used in combination with the blktrace patch which I recently posted for comments although it is, of course, still useful in its own right. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Expose UUID via sysfs/ueventSteven Whitehouse2009-03-24
| | | | | | | | | | | | | Since we have a UUID, we ought to expose it to the user via sysfs and uevents. We already have the fs name in both of these places (a combination of the lock proto and lock table name) so if we add the UUID as well, we have a full set. For older filesystems (i.e. those created before mkfs.gfs2 was writing UUIDs by default) the sysfs file will appear zero length, and no UUID env var will be added to the uevents. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Merge lock_dlm module into GFS2Steven Whitehouse2009-03-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by eliminating duplicated fields between structures o Simplifcation of the code (reduces the code size by a fair bit) o The locking interface is now the DLM interface itself as proposed some time ago. o Fewer lookups of glocks when processing replies from the DLM o Fewer memory allocations/deallocations for each glock o Scope to do further optimisations in the future (but this patch is more than big enough for now!) Please note that (a) this patch relates to the lock_dlm module and not the DLM itself, that is still a separate module; and (b) that we retain the ability to build GFS2 as a standalone single node filesystem with out requiring the DLM. This patch needs a lot of testing, hence my keeping it I restarted my -git tree after the last merge window. That way, this has the maximum exposure before its merged. This is (modulo a few minor bug fixes) the same patch that I've been posting on and off the the last three months and its passed a number of different tests so far. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: change gfs2_quota_scan into a shrinkerAbhijith Das2009-03-24
| | | | | | | | Deallocation of gfs2_quota_data objects now happens on-demand through a shrinker instead of routinely deallocating through the quotad daemon. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Remove ancient, unused codeSteven Whitehouse2009-01-05
| | | | | | | Remove code that used to have something to do with initrd but has been unused for a long time. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Send some sensible sysfs stuffSteven Whitehouse2009-01-05
| | | | | | | We ought to inform the user of the locktable and lockproto for each uevent we generate. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Kill two daemons with one patchSteven Whitehouse2009-01-05
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the two daemons, gfs2_scand and gfs2_glockd and replaces them with a shrinker which is called from the VM. The net result is that GFS2 responds better when there is memory pressure, since it shrinks the glock cache at the same rate as the VFS shrinks the dcache and icache. There are no longer any time based criteria for shrinking glocks, they are kept until such time as the VM asks for more memory and then we demote just as many glocks as required. There are potential future changes to this code, including the possibility of sorting the glocks which are to be written back into inode number order, to get a better I/O ordering. It would be very useful to have an elevator based workqueue implementation for this, as that would automatically deal with the read I/O cases at the same time. This patch is my answer to Andrew Morton's remark, made during the initial review of GFS2, asking why GFS2 needs so many kernel threads, the answer being that it doesn't :-) This patch is a net loss of about 200 lines of code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Clean up & move gfs2_quotadSteven Whitehouse2009-01-05
| | | | | | | | | | | | | | | | | | | | | | This patch is a clean up of gfs2_quotad prior to giving it an extra job to do in addition to the current portfolio of updating the quota and statfs information from time to time. As a result it has been moved into quota.c allowing one of the functions it calls to be made static. Also the clean up allows the two existing functions to have separate timeouts and also to coexist with its future role of dealing with the "truncate in progress" inode flag. The (pointless) setting of gfs2_quotad_secs is removed since we arrange to only wake up quotad when one of the two timers expires. In addition the struct gfs2_quota_data is moved into a slab cache, mainly for easier debugging. It should also be possible to use a shrinker in the future, rather than the current scheme of scanning the quota data entries from time to time. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: high time to take some time over atimeSteven Whitehouse2008-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, we've used the same scheme as GFS1 for atime. This has failed since atime is a per vfsmnt flag, not a per fs flag and as such the "noatime" flag was not getting passed down to the filesystems. This patch removes all the "special casing" around atime updates and we simply use the VFS's atime code. The net result is that GFS2 will now support all the same atime related mount options of any other filesystem on a per-vfsmnt basis. We do lose the "lazy atime" updates, but we gain "relatime". We could add lazy atime to the VFS at a later date, if there is a requirement for that variant still - I suspect relatime will be enough. Also we lose about 100 lines of code after this patch has been applied, and I have a suspicion that it will speed things up a bit, even when atime is "on". So it seems like a nice clean up as well. From a user perspective, everything stays the same except the loss of the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very least, and to be honest I don't think anybody ever used it) and that a number of options which were ignored before now work correctly. Please let me know if you've got any comments. I'm pushing this out early so that you can all see what my plans are. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Remove support for unused and pointless flagSteven Whitehouse2008-07-10
| | | | | | | | The ability to mark files for direct i/o access when opened normally is both unused and pointless, so this patch removes support for that feature. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>