aboutsummaryrefslogtreecommitdiffstats
path: root/fs/ocfs2
Commit message (Collapse)AuthorAge
...
* | | ocfs2: Set the xattr name+value pair in one placeJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We create two new functions on ocfs2_xa_loc, ocfs2_xa_prepare_entry() and ocfs2_xa_store_inline_value(). ocfs2_xa_prepare_entry() makes sure that the xl_entry field of ocfs2_xa_loc is ready to receive an xattr. The entry will point to an appropriately sized name+value region in storage. If an existing entry can be reused, it will be. If no entry already exists, it will be allocated. If there isn't space to allocate it, -ENOSPC will be returned. ocfs2_xa_store_inline_value() stores the data that goes into the 'value' part of the name+value pair. For values that don't fit directly, this stores the value tree root. A number of operations are added to ocfs2_xa_loc_operations to support these functions. This reflects the disparate behaviors of xattr blocks and buckets. With these functions, the overlapping ocfs2_xattr_set_entry_local() and ocfs2_xattr_set_entry_normal() can be replaced with a single call scheme. Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs2: Wrap calculation of name+value pair size.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An ocfs2 xattr entry stores the text name and value as a pair in the storage area. Obviously names and values can be variable-sized. If a value is too large for the entry storage, a tree root is stored instead. The name+value pair is also padded. Because of this, there are a million places in the code that do: if (needs_external_tree(value_size) namevalue_size = pad(name_size) + tree_root_size; else namevalue_size = pad(name_size) + pad(value_size); Let's create some convenience functions to make the code more readable. There are three forms. The first takes the raw sizes. The second takes an ocfs2_xattr_info structure. The third takes an existing ocfs2_xattr_entry. Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs2: Add a name_len field to ocfs2_xattr_info.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | Rather than calculating strlen all over the place, let's store the name length directly on ocfs2_xattr_info. Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs2: Prefix the member fields of struct ocfs2_xattr_info.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | struct ocfs2_xattr_info is a useful structure describing an xattr you'd like to set. Let's put prefixes on the member fields so it's easier to read and use. Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs2: Remove xattrs via ocfs2_xa_locJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | Add ocfs2_xa_remove_entry(), which will remove an xattr entry from its storage via the ocfs2_xa_loc descriptor. Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs2: Introduce ocfs2_xa_locJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ocfs2 extended attribute (xattr) code is very flexible. It can store xattrs in the inode itself, in an external block, or in a tree of data structures. This allows the number of xattrs to be bounded by the filesystem size. However, the code that manages each possible storage location is different. Maintaining the ocfs2 xattr code requires changing each hunk separately. This patch is the start of a series introducing the ocfs2_xa_loc structure. This structure wraps the on-disk details of an xattr entry. The goal is that the generic xattr routines can use ocfs2_xa_loc without knowing the underlying storage location. This first pass merely implements the basic structure, initializing it, and wiping the name+value pair of the entry. Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs2: Add current->comm in trace outputSunil Mushran2010-02-26
| | | | | | | | | | | | | | | | | | | | | Add current->comm to the standard mlog() output to help with debugging. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs2: Clean up the checks for CoW and direct I/O.Wengang Wang2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | When ocfs2 has to do CoW for refcounted extents, we disable direct I/O and go through the buffered I/O path. This makes the combined check easier to read. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs2: add extent block stealing for ocfs2 v5Tiger Yang2010-02-26
|/ / | | | | | | | | | | | | | | | | | | | | This patch add extent block (metadata) stealing mechanism for extent allocation. This mechanism is same as the inode stealing. if no room in slot specific extent_alloc, we will try to allocate extent block from the next slot. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Acked-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2/cluster: Make o2net connect messages KERN_NOTICESunil Mushran2010-02-08
| | | | | | | | | | | | | | | | | | | | Connect and disconnect messages are more than informational as they are required during root cause analysis for failures. This patch changes them from KERN_INFO to KERN_NOTICE. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Faseh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2/dlm: Fix printing of locknameSunil Mushran2010-02-08
| | | | | | | | | | | | | | | | | | The debug call printing the name of the lock resource was chopping off the last character. This patch fixes the problem. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Fix contiguousness check in ocfs2_try_to_merge_extent_map()Roel Kluin2010-02-05
| | | | | | | | | | | | | | | | The wrong member was compared in the continguousness check. Acked-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2/dlm: Remove BUG_ON in dlm recovery when freeing locks of a dead nodeSunil Mushran2010-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During recovery, the dlm frees the locks for the dead node. If it finds a lock in a resource for the dead node, it expects that node to also have a ref in that lock resource. If not, it BUGs. ossbz#1175 was filed with the above BUG. Now, while it is correct that we should be expecting the ref, I see no reason why we have to BUG. After all, we are freeing up the lock and clearing the ref. This patch replaces the BUG_ON with a printk(). Hopefully, that will give us more clues next time this happens. http://oss.oracle.com/bugzilla/show_bug.cgi?id=1175 Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Plugs race between the dc thread and an unlock ast messageSunil Mushran2010-02-03
| | | | | | | | | | | | | | | | | | | | | | This patch plugs a race between the downconvert thread and an unlock ast message. Specifically, after the downconvert worker has done its task, the dc thread needs to check whether an unlock ast made the downconvert moot. Reported-by: David Teigland <teigland@redhat.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@sus.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Remove overzealous BUG_ON during blocked lock processingSunil Mushran2010-02-03
| | | | | | | | | | | | | | | | | | | | | | During blocked lock processing, we should consider the possibility that the lock is no longer blocking. Joel Becker <joel.becker@oracle.com> assisted in fixing this issue. Reported-by: David Teigland <teigland@redhat.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Do not downconvert if the lock level is already compatibleSunil Mushran2010-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During upconvert, if the master were to send a BAST, dlmglue will detect the upconversion in process and send a cancel convert to the master. Upon receiving the AST for the cancel convert, it will re-process the lock resource to determine whether it needs downconverting. Say, the up was from PR to EX and the BAST was for EX. After the cancel convert, it will need to downconvert to NL. However, if the node was originally upconverting from NL to EX, then there would be no reason to downconvert (assuming the same message sequence). This patch makes dlmglue consider the possibility that the current lock level is already compatible and that downconverting is not required. Joel Becker <joel.becker@oracle.com> assisted in fixing this issue. Fixes ossbz#1178 http://oss.oracle.com/bugzilla/show_bug.cgi?id=1178 Reported-by: Coly Li <coly.li@suse.de> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Prevent a livelock in dlmglueSunil Mushran2010-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is possibility of a livelock in __ocfs2_cluster_lock(). If a node were to get an ast for an upconvert request, followed immediately by a bast, there is a small window where the fs may downconvert the lock before the process requesting the upconvert is able to take the lock. This patch adds a new flag to indicate that the upconvert is still in progress and that the dc thread should not downconvert it right now. Wengang Wang <wen.gang.wang@oracle.com> and Joel Becker <joel.becker@oracle.com> contributed heavily to this patch. Reported-by: David Teigland <teigland@redhat.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Fix setting of OCFS2_LOCK_BLOCKED during bastWengang Wang2010-02-03
| | | | | | | | | | | | | | | | | | | | During bast, set the OCFS2_LOCK_BLOCKED flag only if the lock needs to downconverted. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Use compat_ptr in reflink_arguments.Tao Ma2010-02-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although we use u64 to pass userspace pointers to the kernel to avoid compat_ioctl, it doesn't work in some ppc platform. So wrap them with compat_ptr and add compat_ioctl. The detailed discussion about compat_ptr can be found in thread http://lkml.org/lkml/2009/10/27/423. We indeed met with a bug when testing on ppc(-EFAULT is returned when using old_path). This patch try to fix this. I have tested in ppc64(with 32 bit reflink) and x86_64(with i686 reflink), both works. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2/dlm: Handle EAGAIN for compatibility - v2Sunil Mushran2010-02-02
| | | | | | | | | | | | | | | | | | | | | | | | Mainline commit aad1b15310b9bcd59fa81ab8f2b1513b59553ea8 made the dlm_begin_reco_handler() return -EAGAIN instead of EAGAIN. As this error is transmitted over the wire, we want the receiver, dlm_send_begin_reco_message(), to understand both the older EAGAIN and the newer -EAGAIN, to allow rolling upgrade of the cluster nodes. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Add parenthesis to wrap the check for O_DIRECT.Tao Ma2010-02-02
| | | | | | | | | | | | | | Add parenthesis to wrap the check for O_DIRECT. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Only bug out when page size is larger than cluster size.Tao Ma2010-02-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In CoW, we have to make sure that the page is already written out to the disk. So we have a BUG_ON(PageDirty(page)). In ppc platform we have pagesize=64K, so if the cs=4K, if the file have fragmented clusters, we will map the page many times. See this file as an example. Tree Depth: 0 Count: 19 Next Free Rec: 14 ## Offset Clusters Block# Flags 0 0 4 2164864 0x2 Refcounted 1 4 2 9302792 0x2 Refcounted ... We have to replace the extent recs one by one, so the page with index 0 will be mapped and dirtied twice. I'd like to leave the BUG_ON there while adding a check so that in case we meet with an error in other platforms, we can find it easily. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Fix memory overflow in cow_by_page.Tao Ma2010-02-02
| | | | | | | | | | | | | | | | | | | | | | In ocfs2_duplicate_clusters_by_page, we calculate map_end by shifting page_index. But actually in case we meet with a large offset(say in a i686 box, poff_t is only 32 bits and page_index=2056240), we will overflow. So change the type of page_index to loff_t. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2/dlm: Print more messages during lock migrationSunil Mushran2010-01-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When a lock resource is migrated, the dlm compares the migrated locks with that that was already existing on the new node. If the comparison fails, it BUGs. This patch prints more messages when the comparison fails inorder to help with the root cause analyis. http://oss.oracle.com/bugzilla/show_bug.cgi?id=1206 This does not fix bz1206. However, if we run into it again, we will have more information to chew on. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2/dlm: Ignore LVBs of locks in the Blocked listSunil Mushran2010-01-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During lock resource migration, o2dlm fills the packet with a LVB from the first valid lock. For sanity, it ensures that the other valid locks have the same LVB. If not, it BUGs. The valid locks are ones that have granted EX or PR lock levels and are either on the Granted or Converting lists. Locks in the Blocked list cannot have a valid LVB. This patch ensures that we skip the locks in the Blocked list. Fixes oss bugzilla#1202 http://oss.oracle.com/bugzilla/show_bug.cgi?id=1202 Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2/trivial: Remove trailing whitespacesSunil Mushran2010-01-25
| | | | | | | | | | | | | | Patch removes trailing whitespaces. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: fix a misleading variable nameWengang Wang2010-01-25
| | | | | | | | | | | | | | | | a local variable "dlm_version" is used as a fs locking version. rename it fs_version. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Sync max_inline_data_with_xattr from tools.Tao Ma2010-01-25
| | | | | | | | | | | | | | | | In ocfs2-tools, we have added ocfs2_max_inline_data_with_xattr, so add it in the kernel's ocfs2_fs.h. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | ocfs2: Fix refcnt leak on ocfs2_fast_follow_link() error pathOGAWA Hirofumi2010-01-11
|/ | | | | | | | | | | If ->follow_link handler returns an error, it should decrement nd->path refcnt. But ocfs2_fast_follow_link() doesn't decrement. This patch fixes the problem by using nd_set_link() style error handling instead of playing with nd->path. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* ocfs2: Handle O_DIRECT when writing to a refcounted cluster.Tao Ma2009-12-30
| | | | | | | | | | | | | | In case of writing to a refcounted cluster with O_DIRECT, we need to fall back to buffer write. And when it is finished, we need to flush the page and the journal as we did for other O_DIRECT writes. This patch fix oss bug 1191. http://oss.oracle.com/bugzilla/show_bug.cgi?id=1191 Signed-off-by: Tao Ma <tao.ma@oracle.com> Tested-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* Merge branch 'upstream-linus' of ↵Linus Torvalds2009-12-24
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c ocfs2/trivial: Use proper mask for 2 places in hearbeat.c Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink. Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode? ocfs2: Use FIEMAP_EXTENT_SHARED fiemap: Add new extent flag FIEMAP_EXTENT_SHARED ocfs2: replace u8 by __u8 in ocfs2_fs.h ocfs2: explicit declare uninitialized var in user_cluster_connect() ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ocfs2_get_acl_nolock() ocfs2: return -EAGAIN instead of EAGAIN in dlm ocfs2/cluster: Make fence method configurable - v2 ocfs2: Set MS_POSIXACL on remount ocfs2: Make acl use the default ocfs2: Always include ACL support
| * ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.cTao Ma2009-12-23
| | | | | | | | | | | | | | | | In ocfs2_value_metas_in_xattr_header, we should Use le16_to_cpu for ocfs2_extent_list.l_next_free_rec. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2/trivial: Use proper mask for 2 places in hearbeat.cTao Ma2009-12-23
| | | | | | | | | | | | | | | | | | | | I just noticed today that there are 2 places of "mlog(0,...)" in fs/ocfs2/cluster/heartbeat.c, but actually have no default mask prefix in that file. So change them to mlog(ML_HEARTBEAT,...). Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink.Tristan Ye2009-12-23
| | | | | | | | | | | | | | | | | | | | | | For fast symlink, it can be treated the same as inlined files since the data extent we want to return of both case all were stored in metadata block. For symlink, it can be simply treated the same as we did for regular files. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode?Tristan Ye2009-12-18
| | | | | | | | | | | | | | | | Let userspace have a chance to get the extent info of a directory just like extN did. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2: Use FIEMAP_EXTENT_SHAREDSunil Mushran2009-12-17
| | | | | | | | | | | | | | | | Adds FIEMAP_EXTENT_SHARED flag to refcounted extents. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2: replace u8 by __u8 in ocfs2_fs.hColy Li2009-12-17
| | | | | | | | | | | | | | | | This patch replaces date type 'u8' with '__u8', which follows the coding style of ocfs2_fs.h, and portable to user space for ocfs2-tools. Signed-off-by: Coly Li <coly.li@suse.de> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2: explicit declare uninitialized var in user_cluster_connect()Coly Li2009-12-17
| | | | | | | | | | | | | | This patch explicitly declares an uninitialized local variable in user_cluster_connect(), to remove a compiling warning. Signed-off-by: Coly Li <coly.li@suse.de> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ↵Jeff Liu2009-12-17
| | | | | | | | | | | | | | | | | | | | ocfs2_get_acl_nolock() osb->s_mount_opt has already been checked against OCFS2_MOUNT_POSIX_ACL_CHECK before calling ocfs2_get_acl_nolock() in ocfs2_init_acl() && ocfs2_get_acl(), so remove it. Signed-off-by: Jeff Liu <jeff.liu@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2: return -EAGAIN instead of EAGAIN in dlmTiger Yang2009-12-02
| | | | | | | | | | | | | | | | | | We used to return positive EAGAIN to indicate a retry action is needed in dlm_begin_reco_handler(). Now we return negative -EAGAIN to erase the confusion caused by this error code. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2/cluster: Make fence method configurable - v2Sunil Mushran2009-12-02
| | | | | | | | | | | | | | | | | | | | | | | | By default, o2cb fences the box by calling emergency_restart(). While this scheme works well in production, it comes in the way during testing as it does not let the tester take stack/core dumps for analysis. This patch allows user to dynamically change the fence method to panic() by: # echo "panic" > /sys/kernel/config/cluster/<clustername>/fence_method Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2: Set MS_POSIXACL on remountJan Kara2009-10-29
| | | | | | | | | | | | | | | | | | We have to set MS_POSIXACL on remount as well. Otherwise VFS would not know we started supporting ACLs after remount and thus ACLs would not work. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2: Make acl use the defaultJan Kara2009-10-29
| | | | | | | | | | | | | | | | | | | | | | | | Change acl mount options handling to match the one of XFS and BTRFS and hopefully it is also easier to use now. When admin does not specify any acl mount option, acls are enabled if and only if the filesystem has xattr feature enabled. If admin specifies 'acl' mount option, we fail the mount if the filesystem does not have xattr feature and thus acls cannot be enabled. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * ocfs2: Always include ACL supportJan Kara2009-10-29
| | | | | | | | | | | | | | | | | | To become consistent with filesystems such as XFS or BTRFS, make posix ACLs always available. This also reduces possibility of misconfiguration on admin's side. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | Merge branch 'upstream-linus' of ↵Linus Torvalds2009-12-23
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: Set i_nlink properly during reflink. ocfs2: Add reflinked file's inode to inode hash eariler. ocfs2: refcounttree.c cleanup. ocfs2: Find proper end cpos for a leaf refcount block.
| * | ocfs2: Set i_nlink properly during reflink.Tao Ma2009-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We create a file in orphan dir for reflink so that if there is any error, we don't create any wrong dentry in the dir. But actually the file in orphan dir should be i_nlink = 0 so that it can be replayed and freed successfully. This patch first set i_nlink to 0 when creating the file in orphan dir and then set it to 1(reflink now only works for regular file) when we move it to the dest dir. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | ocfs2: Add reflinked file's inode to inode hash eariler.Tao Ma2009-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to add reflinked file's inode to inode hash when we add it to the dest dir. But actually there is a race. Consider the following sequence. 1. reflink happens and create the inode in orphan dir. 2. reflink thread is scheduled out because of some io. 3. recovery begins to work and calls ocfs2_recover_orphans. It calls ocfs2_iget and get a new inode and i_count = 1. It calls iput then and delete inode. the buffer's uptodate state is cleared. This patch move insert_inode_hash to the create function so that it can be found by step 3 and prevented from deleting because i_count > 1. This resolves the bug http://oss.oracle.com/bugzilla/show_bug.cgi?id=1183. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | ocfs2: refcounttree.c cleanup.Tao Ma2009-12-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | sparse check finds some endian problem and some other minor issues. There is an obsolete function which should be removed. So this patch resolve all these. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | ocfs2: Find proper end cpos for a leaf refcount block.Tao Ma2009-12-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2 refcount tree is stored as an extent tree while the leaf ocfs2_refcount_rec points to a refcount block. The following step can trip a kernel panic. mkfs.ocfs2 -b 512 -C 1M --fs-features=refcount $DEVICE mount -t ocfs2 $DEVICE $MNT_DIR FILE_NAME=$RANDOM FILE_NAME_1=$RANDOM FILE_REF="${FILE_NAME}_ref" FILE_REF_1="${FILE_NAME}_ref_1" for((i=0;i<305;i++)) do # /mnt/1048576 is a file with 1048576 sizes. cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1 done for((i=0;i<3;i++)) do cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME done for((i=0;i<2;i++)) do cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1 done cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME for((i=0;i<11;i++)) do cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1 done reflink $MNT_DIR/$FILE_NAME $MNT_DIR/$FILE_REF # write_f is a program which will write some bytes to a file at offset. # write_f -f file_name -l offset -w write_bytes. ./write_f -f $MNT_DIR/$FILE_REF -l $[310*1048576] -w 4096 ./write_f -f $MNT_DIR/$FILE_REF -l $[306*1048576] -w 4096 ./write_f -f $MNT_DIR/$FILE_REF -l $[311*1048576] -w 4096 ./write_f -f $MNT_DIR/$FILE_NAME -l $[310*1048576] -w 4096 ./write_f -f $MNT_DIR/$FILE_NAME -l $[311*1048576] -w 4096 reflink $MNT_DIR/$FILE_NAME $MNT_DIR/$FILE_REF_1 ./write_f -f $MNT_DIR/$FILE_NAME -l $[311*1048576] -w 4096 #kernel panic here. The reason is that if the ocfs2_extent_rec is the last record in a leaf extent block, the old solution fails to find the suitable end cpos. So this patch try to walk through the b-tree, find the next sub root and get the c_pos the next sub-tree starts from. btw, I have runned tristan's test case against the patched kernel for several days and this type of kernel panic never happens again. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | ocfs: stop using do_sync_mapping_rangeChristoph Hellwig2009-12-16
| | | | | | | | | | | | | | | | | | | | | | | | do_sync_mapping_range(..., SYNC_FILE_RANGE_WRITE) is a very awkward way to perform a filemap_fdatawrite_range. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>