author: Linus Torvalds <torvalds@linux-foundation.org> 2013-11-13 01:34:18 -0500
committer: Linus Torvalds <torvalds@linux-foundation.org> 2013-11-13 01:34:18 -0500
commit: 9bc9ccd7db1c9f043f75380b5a5b94912046a60e
tree: dd0a1b3396ae9414f668b0110cc39d11268ad3ed /Documentation/filesystems
parent: f0230294271f511b41797305b685365a9e569a09
parent: bdd3536618443809d18868563eeafa63b9d29603

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs updates from Al Viro:
 "All kinds of stuff this time around; some more notable parts:

   - RCU'd vfsmounts handling
   - new primitives for coredump handling
   - files_lock is gone
   - Bruce's delegations handling series
   - exportfs fixes

  plus misc stuff all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
  ecryptfs: ->f_op is never NULL
  locks: break delegations on any attribute modification
  locks: break delegations on link
  locks: break delegations on rename
  locks: helper functions for delegation breaking
  locks: break delegations on unlink
  namei: minor vfs_unlink cleanup
  locks: implement delegations
  locks: introduce new FL_DELEG lock flag
  vfs: take i_mutex on renamed file
  vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
  vfs: don't use PARENT/CHILD lock classes for non-directories
  vfs: pull ext4's double-i_mutex-locking into common code
  exportfs: fix quadratic behavior in filehandle lookup
  exportfs: better variable name
  exportfs: move most of reconnect_path to helper function
  exportfs: eliminate unused "noprogress" counter
  exportfs: stop retrying once we race with rename/remove
  exportfs: clear DISCONNECTED on all parents sooner
  exportfs: more detailed comment for path_reconnect
  ...

Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- Documentation/filesystems/directory-locking 31
-rw-r--r-- Documentation/filesystems/porting 8
2 files changed, 30 insertions, 9 deletions
diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking
index ff7b611abf33..09bbf9a54f80 100644
--- a/Documentation/filesystems/directory-locking
+++ b/Documentation/filesystems/directory-locking
@@ -2,6 +2,10 @@
 kinds of locks - per-inode (->i_mutex) and per-filesystem
 (->s_vfs_rename_mutex).
 
+	When taking the i_mutex on multiple non-directory objects, we
+always acquire the locks in order by increasing address. We'll call
+that "inode pointer" order in the following.
+
 	For our purposes all operations fall in 5 classes:
 
 1) read access. Locking rules: caller locks directory we are accessing.
@@ -12,8 +16,9 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem
 locks victim and calls the method.
 
 4) rename() that is _not_ cross-directory. Locking rules: caller locks
-the parent, finds source and target, if target already exists - locks it
-and then calls the method.
+the parent and finds source and target. If target already exists, lock
+it. If source is a non-directory, lock it. If that means we need to
+lock both, lock them in inode pointer order.
 
 5) link creation. Locking rules:
 	* lock parent
@@ -30,7 +35,9 @@ rules:
 		fail with -ENOTEMPTY
 	* if new parent is equal to or is a descendent of source
 		fail with -ELOOP
-	* if target exists - lock it.
+	* If target exists, lock it. If source is a non-directory, lock
+	  it. In case that means we need to lock both source and target,
+	  do so in inode pointer order.
 	* call the method.
 
 
@@ -56,9 +63,11 @@ objects - A < B iff A is an ancestor of B.
     renames will be blocked on filesystem lock and we don't start changing
     the order until we had acquired all locks).
 
-(3) any operation holds at most one lock on non-directory object and
-    that lock is acquired after all other locks. (Proof: see descriptions
-    of operations).
+(3) locks on non-directory objects are acquired only after locks on
+    directory objects, and are acquired in inode pointer order.
+    (Proof: all operations but renames take lock on at most one
+    non-directory object, except renames, which take locks on source and
+    target in inode pointer order in the case they are not directories.)
 
 	Now consider the minimal deadlock. Each process is blocked on
 attempt to acquire some lock and already holds at least one lock. Let's
@@ -66,9 +75,13 @@ consider the set of contended locks. First of all, filesystem lock is
 not contended, since any process blocked on it is not holding any locks.
 Thus all processes are blocked on ->i_mutex.
 
-	Non-directory objects are not contended due to (3). Thus link
-creation can't be a part of deadlock - it can't be blocked on source
-and it means that it doesn't hold any locks.
+	By (3), any process holding a non-directory lock can only be
+waiting on another non-directory lock with a larger address. Therefore
+the process holding the "largest" such lock can always make progress, and
+non-directory objects are not included in the set of contended locks.
+
+	Thus link creation can't be a part of deadlock - it can't be
+blocked on source and it means that it doesn't hold any locks.
 
 	Any contended object is either held by cross-directory rename or
 has a child that is also contended. Indeed, suppose that it is held by
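
Note on the directory-locking change above: the "inode pointer" order rule can be illustrated with a small userspace sketch. This is not the kernel code from this merge. It uses pthread mutexes in place of ->i_mutex, and the names demo_inode, lock_pair_in_pointer_order() and unlock_pair() are hypothetical, chosen only for illustration. It assumes that comparing addresses through uintptr_t gives a consistent total order on the platform.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a non-directory inode; not struct inode. */
struct demo_inode {
	pthread_mutex_t i_mutex;
};

static void lock_one(struct demo_inode *inode)
{
	int rc = pthread_mutex_lock(&inode->i_mutex);

	assert(rc == 0);
	(void)rc;
}

/*
 * Lock up to two objects in order of increasing address.  Either
 * pointer may be NULL, and both may refer to the same object, in
 * which case it is locked only once.  Returns the object that was
 * locked first (the one with the lower address), or NULL if neither
 * was given.
 */
static struct demo_inode *lock_pair_in_pointer_order(struct demo_inode *a,
						     struct demo_inode *b)
{
	if ((uintptr_t)a > (uintptr_t)b) {
		struct demo_inode *tmp = a;

		a = b;
		b = tmp;
	}
	if (a)
		lock_one(a);
	if (b && b != a)
		lock_one(b);
	return a ? a : b;
}

/* Release order does not matter for deadlock avoidance. */
static void unlock_pair(struct demo_inode *a, struct demo_inode *b)
{
	if (a)
		pthread_mutex_unlock(&a->i_mutex);
	if (b && b != a)
		pthread_mutex_unlock(&b->i_mutex);
}
```

Because every caller takes the lower address first, two threads that lock the same pair with the arguments swapped still acquire the mutexes in the same sequence, which is the property rule (3) in the patched text relies on.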
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index f0890581f7f6..fe2b7ae6f962 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -455,3 +455,11 @@ in your dentry operations instead.
 	vfs_follow_link has been removed. Filesystems must use nd_set_link
 	from ->follow_link for normal symlinks, or nd_jump_link for magic
 	/proc/<pid> style links.
+--
+[mandatory]
+	iget5_locked()/ilookup5()/ilookup5_nowait() test() callback used to be
+	called with both ->i_lock and inode_hash_lock held; the former is *not*
+	taken anymore, so verify that your callbacks do not rely on it (none
+	of the in-tree instances did). inode_hash_lock is still held,
+	of course, so they are still serialized wrt removal from inode hash,
+	as well as wrt set() callback of iget5_locked().
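
Note on the porting entry above: one way to stay within the new rule is for test() to read only fields that set() writes once and that never change while the inode is hashed, so that the callback needs nothing beyond the serialization inode_hash_lock already provides. The following is a userspace sketch under that assumption. demo_inode, demo_key, demo_test() and demo_set() are hypothetical names, and the callbacks only mirror the general shape of the kernel ones (an inode pointer plus an opaque argument); they are not real kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical filesystem-private inode; not the kernel's struct inode. */
struct demo_inode {
	uint64_t object_id;	/* written once by demo_set(), then read-only */
	unsigned int state;	/* mutable; the kind of field that would
				 * need ->i_lock in the kernel */
};

/* Hypothetical lookup key passed as the opaque argument. */
struct demo_key {
	uint64_t object_id;
};

/*
 * Match callback: nonzero if the inode is the one being looked up.
 * It touches only object_id, so it does not depend on any per-inode
 * lock being held by the caller.
 */
static int demo_test(struct demo_inode *inode, void *opaque)
{
	const struct demo_key *key = opaque;

	return inode->object_id == key->object_id;
}

/* Initialisation callback: records the key in a new inode. */
static int demo_set(struct demo_inode *inode, void *opaque)
{
	const struct demo_key *key = opaque;

	inode->object_id = key->object_id;
	return 0;
}
```

A test() callback that also inspected a field like state would be the case the porting note asks filesystem authors to check for.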