aboutsummaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAge
...
| * | | | | | | | | | ocfs2: Pass the locking protocol into ocfs2_cluster_connect().Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inside the stackglue, the locking protocol structure is hanging off of the ocfs2_cluster_connection. This takes it one further; the locking protocol is passed into ocfs2_cluster_connect(). Now different cluster connections can have different locking protocols with distinct asts. Note that all locking protocols have to keep their maximum protocol version in lock-step. With the protocol structure set in ocfs2_cluster_connect(), there is no need for the stackglue to have a static pointer to a specific protocol structure. We can change initialization to only pass in the maximum protocol version. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Remove the ast pointers from ocfs2_stack_pluginsJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the full ocfs2_locking_protocol hanging off of the ocfs2_cluster_connection, ast wrappers can get the ast/bast pointers there. They don't need to get them from their plugin structure. The user plugin still needs the maximum locking protocol version, though. This changes the plugin structure so that it only holds the max version, not the entire ocfs2_locking_protocol pointer. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Hang the locking proto on the cluster conn and use it in asts.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the ocfs2_cluster_connection hanging off of the ocfs2_dlm_lksb, we have access to it in the ast and bast wrapper functions. Attach the ocfs2_locking_protocol to the conn. Now, instead of refering to a static variable for ast/bast pointers, the wrappers can look at the connection. This means different connections can have different ast/bast pointers, and it reduces the need for the static pointer. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Attach the connection to the lksbJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're going to want it in the ast functions, so we convert union ocfs2_dlm_lksb to struct ocfs2_dlm_lksb and let it carry the connection. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Pass lksbs back from stackglue ast/bast functions.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The stackglue ast and bast functions tried to maintain the fiction that their arguments were void pointers. In reality, stack_user.c had to know that the argument was an ocfs2_lock_res in order to get the status off of the lksb. That's ugly. This changes stackglue to always pass the lksb as the argument to ast and bast functions. The caller can always use container_of() to get the ocfs2_lock_res or user_dlm_lock_res. The net effect to the caller is zero. They still get back the lockres in their ast. stackglue gets cleaner, and now can use the lksb itself. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2_dlmfs: Move to its own directoryJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're going to remove the tie between ocfs2_dlmfs and o2dlm. ocfs2_dlmfs doesn't belong in the fs/ocfs2/dlm directory anymore. Here we move it to fs/ocfs2/dlmfs. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2_dlmfs: Use poll() to signify BASTs.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | o2dlm's userspace filesystem is an easy way to use the DLM from userspace. It is intentionally simple. For example, it does not allow for asynchronous behavior or lock conversion. This is intentional to keep the interface simple. Because there is no asynchronous notification, there is no way for a process holding a lock to know another node needs the lock. This is the number one complaint of ocfs2_dlmfs users. Turns out, we can solve this very easily. We add poll() support to ocfs2_dlmfs. When a BAST is received, the lock's file descriptor will receive POLLIN. This is trivial to implement. Userdlm already has an appropriate waitqueue, and the lock knows when it is blocked. We add the "bast" capability to tell userspace this is available. Signed-off-by: Joel Becker <joel.becker@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2_dlmfs: Add capabilities parameter.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Over time, dlmfs has added some features that were not part of the initial ABI. Unfortunately, some of these features are not detectable via standard usage. For example, Linux's default poll always returns POLLIN, so there is no way for a caller of poll(2) to know when dlmfs added poll support. Instead, we provide this list of new capabilities. Capabilities is a read-only attribute. We do it as a module parameter so we can discover it whether dlmfs is built in, loaded, or even not loaded (via modinfo). The ABI features are local to this machine's dlmfs mount. This is distinct from the locking protocol, which is concerned with inter-node interaction. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Handle errors while setting external xattr values.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2 can store extended attribute values as large as a single file. It does this using a standard ocfs2 btree for the large value. However, the previous code did not handle all error cases cleanly. There are multiple problems to have. 1) We have trouble allocating space for a new xattr. This leaves us with an empty xattr. 2) We overwrote an existing local xattr with a value root, and now we have an error allocating the storage. This leaves us an empty xattr. where there used to be a value. The value is lost. 3) We have trouble truncating a reused value. This leaves us with the original entry pointing to the truncated original value. The value is lost. 4) We have trouble extending the storage on a reused value. This leaves us with the original value safely in place, but with more storage allocated when needed. This doesn't consider storing local xattrs (values that don't require a btree). Those only fail when the journal fails. Case (1) is easy. We just remove the xattr we added. We leak the storage because we can't safely remove it, but otherwise everything is happy. We'll print a warning about the leak. Case (4) is easy. We still have the original value in place. We can just leave the extra storage attached to this xattr. We return the error, but the old value is untouched. We print a warning about the storage. Case (2) and (3) are hard because we've lost the original values. In the old code, we ended up with values that could be partially read. That's not good. Instead, we just wipe the xattr entry and leak the storage. It stinks that the original value is lost, but now there isn't a partial value to be read. We'll print a big fat warning. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Set inline xattr entries with ocfs2_xa_set()Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2_xattr_ibody_set() is the only remaining user of ocfs2_xattr_set_entry(). ocfs2_xattr_set_entry() actually does two things: it calls ocfs2_xa_set(), and it initializes the inline xattrs. Initializing the inline space really belongs in its own call. We lift the initialization to ocfs2_xattr_ibody_init(), called from ocfs2_xattr_ibody_set() only when necessary. Now ocfs2_xattr_ibody_set() can call ocfs2_xa_set() directly. ocfs2_xattr_set_entry() goes away. Another nice fact is that ocfs2_init_dinode_xa_loc() can trust i_xattr_inline_size. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Set xattr block entries with ocfs2_xa_set()Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2_xattr_block_set() calls into ocfs2_xattr_set_entry() with just the HAS_XATTR flag. Most of the machinery of ocfs2_xattr_set_entry() is skipped. All that really happens other than the call to ocfs2_xa_set() is making sure the HAS_XATTR flag is set on the inode. But HAS_XATTR should be set when we also set di->i_xattr_loc. And that's done in ocfs2_create_xattr_block(). So let's move it there, and then ocfs2_xattr_block_set() can just call ocfs2_xa_set(). While we're there, ocfs2_create_xattr_block() can take the set_ctxt for a smaller argument list. It also learns to set HAS_XATTR_FL, because it knows for sure. ocfs2_create_empty_xatttr_block() in the reflink path fakes a set_ctxt to call ocfs2_create_xattr_block(). Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Let ocfs2_xa_prepare_entry() do space checks.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2_xattr_set_in_bucket() doesn't need to do its own hacky space checking. Let's let ocfs2_xa_prepare_entry() (via ocfs2_xa_set()) do the more accurate work. Whenever it doesn't have space, ocfs2_xattr_set_in_bucket() can try to get more space. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Gell into ocfs2_xa_set()Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2_xa_set() wraps the ocfs2_xa_prepare_entry()/ocfs2_xa_store_value() logic. Both callers can now use the same routine. ocfs2_xa_remove() moves directly into ocfs2_xa_set(). Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Allocation in ocfs2_xa_prepare_entry(), values in ocfs2_xa_store_value()Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2_xa_prepare_entry() gets all the logic to add, remove, or modify external value trees. Now, when it exits, the entry is ready to receive a value of any size. ocfs2_xa_remove() is added to handle the complete removal of an entry. It truncates the external value tree before calling ocfs2_xa_remove_entry(). ocfs2_xa_store_inline_value() becomes ocfs2_xa_store_value(). It can store any value. ocfs2_xattr_set_entry() loses all the allocation logic and just uses these functions. ocfs2_xattr_set_value_outside() disappears. ocfs2_xattr_set_in_bucket() uses these functions and makes ocfs2_xattr_set_entry_in_bucket() obsolete. That goes away, as does ocfs2_xattr_bucket_set_value_outside() and ocfs2_xattr_bucket_value_truncate(). Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Teach ocfs2_xa_loc how to do its own journal workJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're going to want to make sure our buffers get accessed and dirtied correctly. So have the xa_loc do the work. This includes storing the inode on ocfs2_xa_loc. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Provide ocfs2_xa_fill_value_buf() for external value processingJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use the ocfs2_xattr_value_buf structure to manage external values. It lets the value tree code do its work regardless of the containing storage. ocfs2_xa_fill_value_buf() initializes a value buf from an ocfs2_xa_loc entry. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Handle value tree roots in ocfs2_xa_set_inline_value()Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the xattr code would send in a fake value, containing a tree root, to the function that installed name+value pairs. Instead, we pass the real value to ocfs2_xa_set_inline_value(), and it notices that the value cannot fit. Thus, it installs a tree root. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Set the xattr name+value pair in one placeJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We create two new functions on ocfs2_xa_loc, ocfs2_xa_prepare_entry() and ocfs2_xa_store_inline_value(). ocfs2_xa_prepare_entry() makes sure that the xl_entry field of ocfs2_xa_loc is ready to receive an xattr. The entry will point to an appropriately sized name+value region in storage. If an existing entry can be reused, it will be. If no entry already exists, it will be allocated. If there isn't space to allocate it, -ENOSPC will be returned. ocfs2_xa_store_inline_value() stores the data that goes into the 'value' part of the name+value pair. For values that don't fit directly, this stores the value tree root. A number of operations are added to ocfs2_xa_loc_operations to support these functions. This reflects the disparate behaviors of xattr blocks and buckets. With these functions, the overlapping ocfs2_xattr_set_entry_local() and ocfs2_xattr_set_entry_normal() can be replaced with a single call scheme. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Wrap calculation of name+value pair size.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An ocfs2 xattr entry stores the text name and value as a pair in the storage area. Obviously names and values can be variable-sized. If a value is too large for the entry storage, a tree root is stored instead. The name+value pair is also padded. Because of this, there are a million places in the code that do: if (needs_external_tree(value_size) namevalue_size = pad(name_size) + tree_root_size; else namevalue_size = pad(name_size) + pad(value_size); Let's create some convenience functions to make the code more readable. There are three forms. The first takes the raw sizes. The second takes an ocfs2_xattr_info structure. The third takes an existing ocfs2_xattr_entry. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Add a name_len field to ocfs2_xattr_info.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than calculating strlen all over the place, let's store the name length directly on ocfs2_xattr_info. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Prefix the member fields of struct ocfs2_xattr_info.Joel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct ocfs2_xattr_info is a useful structure describing an xattr you'd like to set. Let's put prefixes on the member fields so it's easier to read and use. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Remove xattrs via ocfs2_xa_locJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add ocfs2_xa_remove_entry(), which will remove an xattr entry from its storage via the ocfs2_xa_loc descriptor. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Introduce ocfs2_xa_locJoel Becker2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ocfs2 extended attribute (xattr) code is very flexible. It can store xattrs in the inode itself, in an external block, or in a tree of data structures. This allows the number of xattrs to be bounded by the filesystem size. However, the code that manages each possible storage location is different. Maintaining the ocfs2 xattr code requires changing each hunk separately. This patch is the start of a series introducing the ocfs2_xa_loc structure. This structure wraps the on-disk details of an xattr entry. The goal is that the generic xattr routines can use ocfs2_xa_loc without knowing the underlying storage location. This first pass merely implements the basic structure, initializing it, and wiping the name+value pair of the entry. Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Add current->comm in trace outputSunil Mushran2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add current->comm to the standard mlog() output to help with debugging. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: Clean up the checks for CoW and direct I/O.Wengang Wang2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ocfs2 has to do CoW for refcounted extents, we disable direct I/O and go through the buffered I/O path. This makes the combined check easier to read. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
| * | | | | | | | | | ocfs2: add extent block stealing for ocfs2 v5Tiger Yang2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch add extent block (metadata) stealing mechanism for extent allocation. This mechanism is same as the inode stealing. if no room in slot specific extent_alloc, we will try to allocate extent block from the next slot. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Acked-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
* | | | | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2010-03-03
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix large stack use fuse: cleanup in fuse_notify_inval_...()
| * | | | | | | | | | | fuse: fix large stack useFang Wenqi2010-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc 4.4 warns about: fs/fuse/dev.c: In function ‘fuse_notify_inval_entry’: fs/fuse/dev.c:925: warning: the frame size of 1060 bytes is larger than 1024 bytes The problem is we declare two structures and a large array on the stack, I move the array alway from the stack and allocate memory for it dynamically. Signed-off-by: Fang Wenqi <antonf@turbolinux.com.cn> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | | | | | | | | | | fuse: cleanup in fuse_notify_inval_...()Miklos Szeredi2010-02-05
| | |_|_|_|/ / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Small cleanup in fuse_notify_inval_inode() and fuse_notify_inval_entry(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
* | | | | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2010-03-03
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.
| * | | | | | | | | | | percpu: add __percpu sparse annotations to fsTejun Heo2010-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add __percpu sparse annotations to fs. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
* | | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds2010-03-03
|\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: GFS2: print glock numbers in hex GFS2: ordered writes are backwards GFS2: Remove old, unused linked list code from quota GFS2: Remove loopy umount code GFS2: Metadata address space clean up
| * | | | | | | | | | | | GFS2: print glock numbers in hexBob Peterson2010-03-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes glock numbers from printing in decimal to hex. Since DLM prints corresponding resource IDs in hex, it makes debugging easier. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | | | | | | | | GFS2: ordered writes are backwardsDave Chinner2010-03-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we queue data buffers for ordered write, the buffers are added to the head of the ordered write list. When the log needs to push these buffers to disk, it also walks the list from the head. The result is that the the ordered buffers are submitted to disk in reverse order. For large writes, this means that whenever the log flushes large streams of reverse sequential order buffers are pushed down into the block layers. The elevators don't handle this particularly well, so IO rates tend to be significantly lower than if the IO was issued in ascending block order. Queue new ordered buffers to the tail of the ordered buffer list to ensure that IO is dispatched in the order it was submitted. This should significantly improve large sequential write speeds. On a disk capable of 85MB/s, speeds increase from 50MB/s to 65MB/s for noop and from 38MB/s to 50MB/s for cfq. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | | | | | | | | GFS2: Remove loopy umount codeSteven Whitehouse2010-03-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a consequence of the previous patch, we can now remove the loop which used to be required due to the circular dependency between the inodes and glocks. Instead we can just invalidate the inodes, and then clear up any glocks which are left. Also we no longer need the rwsem since there is no longer any danger of the inode invalidation calling back into the glock code (and from there back into the inode code). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | | | | | | | | GFS2: Metadata address space clean upSteven Whitehouse2010-03-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the start of GFS2, an "extra" inode has been used to store the metadata belonging to each inode. The only reason for using this inode was to have an extra address space, the other fields were unused. This means that the memory usage was rather inefficient. The reason for keeping each inode's metadata in a separate address space is that when glocks are requested on remote nodes, we need to be able to efficiently locate the data and metadata which relating to that glock (inode) in order to sync or sync and invalidate it (depending on the remotely requested lock mode). This patch adds a new type of glock, which has in addition to its normal fields, has an address space. This applies to all inode and rgrp glocks (but to no other glock types which remain as before). As a result, we no longer need to have the second inode. This results in three major improvements: 1. A saving of approx 25% of memory used in caching inodes 2. A removal of the circular dependency between inodes and glocks 3. No confusion between "normal" and "metadata" inodes in super.c Although the first of these is the more immediately apparent, the second is just as important as it now enables a number of clean ups at umount time. Those will be the subject of future patches. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | | | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds2010-03-03
|\ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] pSesInfo->sesSem is used as mutex. Rename it to session_mutex and [CIFS] Use unsigned ea length for clarity cifs: set server_eof in cifs_fattr_to_inode [CIFS] Minor cleanup to EA patch cifs: merge CIFSSMBQueryEA with CIFSSMBQAllEAs cifs: verify lengths of QueryAllEAs reply cifs: increase maximum buffer size in CIFSSMBQAllEAs cifs: rename name_len to list_len in CIFSSMBQAllEAs cifs: clean up indentation in CIFSSMBQAllEAs cifs: add parens around smb_var in BCC macros
| * | | | | | | | | | | | | [CIFS] pSesInfo->sesSem is used as mutex. Rename it to session_mutex andSteve French2010-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | convert it to a real mutex. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | [CIFS] Use unsigned ea length for claritySteve French2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jeff correctly noted that using unsigned ea length is more intuitive. CC: Jeff Lyaton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | cifs: set server_eof in cifs_fattr_to_inodeJeff Layton2010-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | [CIFS] Minor cleanup to EA patchSteve French2010-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | cifs: merge CIFSSMBQueryEA with CIFSSMBQAllEAsJeff Layton2010-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an "ea_name" parameter to CIFSSMBQAllEAs. When it's set make it behave like CIFSSMBQueryEA does now. The current callers of CIFSSMBQueryEA are converted to use CIFSSMBQAllEAs, and the old CIFSSMBQueryEA function is removed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | cifs: verify lengths of QueryAllEAs replyJeff Layton2010-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure the lengths in a QUERY_ALL_EAS reply don't make the parser walk off the end of the SMB. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | cifs: increase maximum buffer size in CIFSSMBQAllEAsJeff Layton2010-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's 4000 now, but there's no reason to limit it to that. We should be able to handle a response up to CIFSMaxBufSize. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | cifs: rename name_len to list_len in CIFSSMBQAllEAsJeff Layton2010-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ...for clarity and so we can reuse the name for the real name_len. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | cifs: clean up indentation in CIFSSMBQAllEAsJeff Layton2010-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a label that we can goto on error, and reduce some of the if/then/else indentation in this function. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | | | | | | | | | | | cifs: add parens around smb_var in BCC macrosJeff Layton2010-02-23
| | |_|_|_|_|/ / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ...to remove ambiguity about how these values are interpreted when passing in more complex values as arguments. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
* | | | | | | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2010-03-02
|\ \ \ \ \ \ \ \ \ \ \ \ \ | |_|_|_|_|_|_|_|_|/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (38 commits) SELinux: Make selinux_kernel_create_files_as() shouldn't just always return 0 TOMOYO: Protect find_task_by_vpid() with RCU. Security: add static to security_ops and default_security_ops variable selinux: libsepol: remove dead code in check_avtab_hierarchy_callback() TOMOYO: Remove __func__ from tomoyo_is_correct_path/domain security: fix a couple of sparse warnings TOMOYO: Remove unneeded parameter. TOMOYO: Use shorter names. TOMOYO: Use enum for index numbers. TOMOYO: Add garbage collector. TOMOYO: Add refcounter on domain structure. TOMOYO: Merge headers. TOMOYO: Add refcounter on string data. TOMOYO: Reduce lines by using common path for addition and deletion. selinux: fix memory leak in sel_make_bools TOMOYO: Extract bitfield syslog: clean up needless comment syslog: use defined constants instead of raw numbers syslog: distinguish between /proc/kmsg and syscalls selinux: allow MLS->non-MLS and vice versa upon policy reload ...
| * | | | | | | | | | | | Merge branch 'next' into for-linusJames Morris2010-02-28
| |\ \ \ \ \ \ \ \ \ \ \ \ | | |_|/ / / / / / / / / / | |/| | | | | | | | | | |
| | * | | | | | | | | | | syslog: use defined constants instead of raw numbersKees Cook2010-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now the syslog "type" action are just raw numbers which makes the source difficult to follow. This patch replaces the raw numbers with defined constants for some level of sanity. Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>