aboutsummaryrefslogtreecommitdiffstats
path: root/fs/ocfs2/super.c
Commit message (Collapse)AuthorAge
* ocfs2: Negotiate locking protocol versions.Joel Becker2008-02-06
| | | | | | | | | | | | | | | | | | | | | | | Currently, when ocfs2 nodes connect via TCP, they advertise their compatibility level. If the versions do not match, two nodes cannot speak to each other and they disconnect. As a result, this provides no forward or backwards compatibility. This patch implements a simple protocol negotiation at the dlm level by introducing a major/minor version number scheme for entities that communicate. Specifically, o2dlm has a major/minor version for interaction with o2dlm on other nodes, and ocfs2 itself has a major/minor version for interacting with the filesystem on other nodes. This will allow rolling upgrades of ocfs2 clusters when changes to the locking or network protocols can be done in a backwards compatible manner. In those cases, only the minor number is changed and the negotatied protocol minor is returned from dlm join. In the far less likely event that a required protocol change makes backwards compatibility impossible, we simply bump the major number. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Silence false lockdep warningsJan Kara2008-01-25
| | | | | | | | | Create separate lockdep lock classes for system file's i_mutexes. They are used to guard allocations and similar things and thus rank differently than i_mutex of a regular file or directory. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH 2/2] ocfs2: cluster aware flock()Mark Fasheh2008-01-25
| | | | | | | | Hook up ocfs2_flock(), using the new flock lock type in dlmglue.c. A new mount option, "localflocks" is added so that users can revert to old functionality as need be. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Local alloc window size changeable via mount optionSunil Mushran2008-01-25
| | | | | | | | | | | | | | Local alloc is a performance optimization in ocfs2 in which a node takes a window of bits from the global bitmap and then uses that for all small local allocations. This window size is fixed to 8MB currently. This patch allows users to specify the window size in MB including disabling it by passing in 0. If the number specified is too large, the fs will use the default value of 8MB. mount -o localalloc=X /dev/sdX /mntpoint Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Support commit= mount optionMark Fasheh2008-01-25
| | | | | | | | | Mostly taken from ext3. This allows the user to set the jbd commit interval, in seconds. The default of 5 seconds stays the same, but now users can easily increase the commit interval. Typically, this would be increased in order to benefit performance at the expense of data-safety. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Initalize bitmap_cpg of ocfs2_super to be the maximum.Tao Ma2008-01-25
| | | | | | | | | | | | | | | | | | | | This value is initialized from global_bitmap->id2.i_chain.cl_cpg. If there is only 1 group, it will be equal to the total clusters in the volume. So as for online resize, it should change for all the nodes in the cluster. It isn't easy and there is no corresponding lock for it. bitmap_cpg is only used in 2 areas: 1. Check whether the suballoc is too large for us to allocate from the global bitmap, so it is little used. And now the suballoc size is 2048, it rarely meet this situation and the check is almost useless. 2. Calculate which group a cluster belongs to. We use it during truncate to figure out which cluster group an extent belongs too. But we should be OK if we increase it though as the cluster group calculated shouldn't change and we only ever have a small bitmap_cpg on file systems with a single cluster group. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Rename ocfs2_meta_[un]lockMark Fasheh2008-01-25
| | | | | | | Call this the "inode_lock" now, since it covers both data and meta data. This patch makes no functional changes. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove data locksMark Fasheh2008-01-25
| | | | | | | | | | | | | | The meta lock now covers both meta data and data, so this just removes the now-redundant data lock. Combining locks saves us a round of lock mastery per inode and one less lock to ping between nodes during read/write. We don't lose much - since meta locks were always held before a data lock (and at the same level) ordered writeout mode (the default) ensured that flushing for the meta data lock also pushed out data anyways. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove mount/unmount votesMark Fasheh2008-01-25
| | | | | | | | | | | | The node maps that are set/unset by these votes are no longer relevant, thus we can remove the mount and umount votes. Since those are the last two remaining votes, we can also remove the entire vote infrastructure. The vote thread has been renamed to the downconvert thread, and the small amount of functionality related to managing it has been moved into fs/ocfs2/dlmglue.c. All references to votes have been removed or updated. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove fs dependency on ocfs2_heartbeat moduleMark Fasheh2008-01-25
| | | | | | | | | Now that the dlm exposes domain information to us, we don't need generic node up / node down callbacks. And since the DLM is only telling us when a node goes down unexpectedly, we no longer need to optimize away node down callbacks via the umount map. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Reset journal parameters after s_mount_opt updateMark Fasheh2007-11-27
| | | | | | | Right now we're just setting them from the existing parameters, not the new ones that a remount specified. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* Slab API: remove useless ctor parameter and reorder parametersChristoph Lameter2007-10-17
| | | | | | | | | | | | | | | | | | | | | Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void *object, struct kmem_cache *s, unsigned long flags) to ctor(struct kmem_cache *s, void *object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] fs/ocfs2/: removed unneeded initial value and function's return valueDenis Cheng2007-10-12
| | | | | Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Implement show_options()Sunil Mushran2007-10-12
| | | | | | | | Implement sops->show_options() so as to allow /proc/mounts to show the mount options. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Clear slot map when umounting a local volumeMark Fasheh2007-10-12
| | | | | | | This is technically harmless (recovery will clean it out later), but leaves a bogus entry in the slot_map which really shouldn't be there. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] ocfs2: fix mount option parsingTiger Yang2007-09-11
| | | | | | | | | | | | | For some mount option types, ocfs2_parse_options() will try to access sb->s_fs_info to get at the ocfs2 private superblock. Unfortunately, that hasn't been allocated yet and will cause a kernel crash. Fix this by storing options in a struct which can then get pushed into the ocfs2_super once it's been allocated later. If we need more options which store to the ocfs2_super in the future, we can just fields to this struct. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: set non-default s_time_gran during mountMark Fasheh2007-08-09
| | | | | | | We need to manually set this to '1' during mount, otherwise inode_setattr() will chop off the nanosecond portion of our timestamps. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix max offset calculationsMark Fasheh2007-08-09
| | | | | | | | ocfs2_max_file_offset() was over-estimating the largest file size for several cases. This wasn't really a problem before, but now that we support sparse files, it needs to be more accurate. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: use s_maxbytes directly in ocfs2_change_file_space()Mark Fasheh2007-08-09
| | | | | | | | There's no need to recalculate things via ocfs2_max_file_offset() as we've already done that to fill s_maxbytes, so use that instead. We can also un-export ocfs2_max_file_offset() then. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* mm: Remove slab destructors from kmem_cache_create().Paul Mundt2007-07-19
| | | | | | | | | | | | | | Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* ocfs2: Support xfs style space reservation ioctlsMark Fasheh2007-07-10
| | | | | | | | | | | | | | | We re-use the RESVSP/UNRESVSP ioctls from xfs which allow the user to allocate and deallocate regions to a file without zeroing data or changing i_size. Though renamed, the structure passed in from user is identical to struct xfs_flock64. The three fields that are actually used right now are l_whence, l_start and l_len. This should get ocfs2 immediate compatibility with userspace software using the pre-existing xfs ioctls. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add "preferred slot" mount optionSunil Mushran2007-07-10
| | | | | | | | | ocfs2 will attempt to assign the node the slot# provided in the mount option. Failure to assign the preferred slot is not an error. This small feature can be useful for automated testing. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* Remove SLAB_CTOR_CONSTRUCTORChristoph Lameter2007-05-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* slab allocators: Remove SLAB_DEBUG_INITIAL flagChristoph Lameter2007-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ocfs2: fix sparse warnings in fs/ocfs2Mark Fasheh2007-05-02
| | | | | | | None of these are actually harmful, but the noise makes looking for real problems difficult. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Cache extent recordsMark Fasheh2007-04-26
| | | | | | | | | | | | | | | The extent map code was ripped out earlier because of an inability to deal with holes. This patch adds back a simpler caching scheme requiring far less code. Our old extent map caching was designed back when meta data block caching in Ocfs2 didn't work very well, resulting in many disk reads. These days our metadata caching is much better, resulting in no un-necessary disk reads. As a result, extent caching doesn't have to be as fancy, nor does it have to cache as many extents. Keeping the last 3 extents seen should be sufficient to give us a small performance boost on some streaming workloads. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: temporarily remove extent map cachingMark Fasheh2007-04-26
| | | | | | | | | | | The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instead of a lengthy patch to update the code (which would likely lose the benefit of caching subtree roots), we remove most of the algorithms and implement a simple path based lookup. A less ambitious extent caching scheme will be added in a later patch. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove delete inode voteTiger Yang2007-04-26
| | | | | | | | | | | Ocfs2 currently does cluster-wide node messaging to check the open state of an inode during delete. This patch removes that mechanism in favor of an inode cluster lock which is taken at shared read when an inode is first read and dropped in clear_inode(). This allows a deleting node to test the liveness of an inode by attempting to take an exclusive lock. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] Mark struct super_operations constJosef 'Jeff' Sipek2007-02-12
| | | | | | | | | | | This patch is inspired by Arjan's "Patch series to mark struct file_operations and struct inode_operations const". Compile tested with gcc & sparse. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] Fix numerous kcalloc() calls, convert to kzalloc()Robert P. J. Day2006-12-13
| | | | | | | | | | | | | | | | | | | All kcalloc() calls of the form "kcalloc(1,...)" are converted to the equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect ordering of the first two arguments are fixed. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Adam Belay <ambx1@neo.rr.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: local mountsSunil Mushran2006-12-07
| | | | | | | | | | This allows users to format an ocfs2 file system with a special flag, OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT. When the file system sees this flag, it will not use any cluster services, nor will it require a cluster configuration, thus acting like a 'local' file system. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs/*: trivial vsnprintf() conversionAlexey Dobriyan2006-12-07
| | | | | | | | | | It would very lame to get buffer overflow via one of the following. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] slab: remove kmem_cache_tChristoph Lameter2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] slab: remove SLAB_NOFSChristoph Lameter2006-12-07
| | | | | | | | SLAB_NOFS is an alias of GFP_NOFS. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge branch 'master' of ↵David Howells2006-12-05
|\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/ata/libata-scsi.c include/linux/libata.h Futher merge of Linus's head and compilation fixups. Signed-Off-By: David Howells <dhowells@redhat.com>
| * ocfs2: core atime update functionsTiger Yang2006-12-01
| | | | | | | | | | | | | | This patch adds the core routines for updating atime in ocfs2. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
| * ocfs2: remove unused handle argument from ocfs2_meta_lock_full()Mark Fasheh2006-12-01
| | | | | | | | | | | | Now that this is unused and all callers pass NULL, we can safely remove it. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
| * ocfs2: remove unused ocfs2_handle_add_lock()Mark Fasheh2006-12-01
| | | | | | | | | | | | | | | | | | | | This gets us rid of a slab we no longer need, as well as removing the majority of what's left on ocfs2_journal_handle. ocfs2_commit_unstarted_handle() has no more real work to do, so remove that function too. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
| * ocfs2: remove unused ocfs2_handle_add_inode()Mark Fasheh2006-12-01
| | | | | | | | | | | | | | | | We can also delete the unused infrastructure which was once in place to support this functionality. ocfs2_inode_private loses ip_handle and ip_handle_list. ocfs2_journal_handle loses handle_list. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* | WorkStruct: make allyesconfigDavid Howells2006-11-22
|/ | | | | | Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>
* [PATCH] fs/*: use BUILD_BUG_ONAlexey Dobriyan2006-10-11
| | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: Remove i_generation from inode lock namesMark Fasheh2006-09-24
| | | | | | | | | | | | | | | | | | | | | | | OCFS2 puts inode meta data in the "lock value block" provided by the DLM. Typically, i_generation is encoded in the lock name so that a deleted inode on and a new one in the same block don't share the same lvb. Unfortunately, that scheme means that the read in ocfs2_read_locked_inode() is potentially thrown away as soon as the meta data lock is taken - we cannot encode the lock name without first knowing i_generation, which requires a disk read. This patch encodes i_generation in the inode meta data lvb, and removes the value from the inode meta data lock name. This way, the read can be covered by a lock, and at the same time we can distinguish between an up to date and a stale LVB. This will help cold-cache stat(2) performance in particular. Since this patch changes the protocol version, we take the opportunity to do a minor re-organization of two of the LVB fields. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: manually d_move() during ocfs2_rename()Mark Fasheh2006-09-24
| | | | | | | | | | Make use of FS_RENAME_DOES_D_MOVE to avoid a race condition that can occur during ->rename() if we d_move() outside of the parent directory cluster locks, and another node discovers the new name (created during the rename) and unlinks it. d_move() will unconditionally rehash a dentry - which will leave stale data in the system. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: limit cluster bitmap information saved at mountMark Fasheh2006-08-07
| | | | | | | | | We were storing cluster count on the ocfs2_super structure, but never actually using it so remove that. Also, we don't want to populate the uptodate cache with the unlocked block read - it is technically safe as is, but we should change it for correctness. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: clean up some osb fieldsMark Fasheh2006-06-29
| | | | | | | | Get rid of osb->uuid, osb->proc_sub_dir, and osb->osb_id. Those fields were unused, or could easily be removed. As a result, we also no longer need MAX_OSB_ID or ocfs2_globals_lock. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: fix init of uuid_net_keyMark Fasheh2006-06-29
| | | | | | ocfs2_initialize_super() should be copying from the beginning of the uuid. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Cleanup message printsSunil Mushran2006-06-29
| | | | | Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] VFS: Permit filesystem to perform statfs with a known root dentryDavid Howells2006-06-23
| | | | | | | | | | | | | | | | | | | | | Give the statfs superblock operation a dentry pointer rather than a superblock pointer. This complements the get_sb() patch. That reduced the significance of sb->s_root, allowing NFS to place a fake root there. However, NFS does require a dentry to use as a target for the statfs operation. This permits the root in the vfsmount to be used instead. linux/mount.h has been added where necessary to make allyesconfig build successfully. Interest has also been expressed for use with the FUSE and XFS filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] VFS: Permit filesystem to override root dentry on mountDavid Howells2006-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. (*) If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries with child trees. [*] Anonymous until discovered from another tree. (*) The documentation has been adjusted, including the additional bit of changing ext2_* into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: don't use MLF* in the file systemMark Fasheh2006-03-24
| | | | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>