aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems
diff options
context:
space:
mode:
authorJ. Bruce Fields <bfields@redhat.com>2012-03-05 11:40:41 -0500
committerAl Viro <viro@zeniv.linux.org.uk>2013-11-09 00:16:40 -0500
commit6cedba8962f440c72447f811d0d530a8a9dc637a (patch)
tree740e74df27113d34d06da011b3c5561d19eb3349 /Documentation/filesystems
parent40bd22c9f8617ddd5da06044c81f72a2cf700791 (diff)
vfs: take i_mutex on renamed file
A read delegation is used by NFSv4 as a guarantee that a client can perform local read opens without informing the server. The open operation takes the last component of the pathname as an argument, thus is also a lookup operation, and giving the client the above guarantee means informing the client before we allow anything that would change the set of names pointing to the inode. Therefore, we need to break delegations on rename, link, and unlink. We also need to prevent new delegations from being acquired while one of these operations is in progress. We could add some completely new locking for that purpose, but it's simpler to use the i_mutex, since that's already taken by all the operations we care about. The single exception is rename. So, modify rename to take the i_mutex on the file that is being renamed. Also fix up lockdep and Documentation/filesystems/directory-locking to reflect the change. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/directory-locking31
1 files changed, 22 insertions, 9 deletions
diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking
index ff7b611abf33..09bbf9a54f80 100644
--- a/Documentation/filesystems/directory-locking
+++ b/Documentation/filesystems/directory-locking
@@ -2,6 +2,10 @@
2kinds of locks - per-inode (->i_mutex) and per-filesystem 2kinds of locks - per-inode (->i_mutex) and per-filesystem
3(->s_vfs_rename_mutex). 3(->s_vfs_rename_mutex).
4 4
5 When taking the i_mutex on multiple non-directory objects, we
6always acquire the locks in order by increasing address. We'll call
7that "inode pointer" order in the following.
8
5 For our purposes all operations fall in 5 classes: 9 For our purposes all operations fall in 5 classes:
6 10
71) read access. Locking rules: caller locks directory we are accessing. 111) read access. Locking rules: caller locks directory we are accessing.
@@ -12,8 +16,9 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem
12locks victim and calls the method. 16locks victim and calls the method.
13 17
144) rename() that is _not_ cross-directory. Locking rules: caller locks 184) rename() that is _not_ cross-directory. Locking rules: caller locks
15the parent, finds source and target, if target already exists - locks it 19the parent and finds source and target. If target already exists, lock
16and then calls the method. 20it. If source is a non-directory, lock it. If that means we need to
21lock both, lock them in inode pointer order.
17 22
185) link creation. Locking rules: 235) link creation. Locking rules:
19 * lock parent 24 * lock parent
@@ -30,7 +35,9 @@ rules:
30 fail with -ENOTEMPTY 35 fail with -ENOTEMPTY
31 * if new parent is equal to or is a descendent of source 36 * if new parent is equal to or is a descendent of source
32 fail with -ELOOP 37 fail with -ELOOP
33 * if target exists - lock it. 38 * If target exists, lock it. If source is a non-directory, lock
39 it. In case that means we need to lock both source and target,
40 do so in inode pointer order.
34 * call the method. 41 * call the method.
35 42
36 43
@@ -56,9 +63,11 @@ objects - A < B iff A is an ancestor of B.
56 renames will be blocked on filesystem lock and we don't start changing 63 renames will be blocked on filesystem lock and we don't start changing
57 the order until we had acquired all locks). 64 the order until we had acquired all locks).
58 65
59(3) any operation holds at most one lock on non-directory object and 66(3) locks on non-directory objects are acquired only after locks on
60 that lock is acquired after all other locks. (Proof: see descriptions 67 directory objects, and are acquired in inode pointer order.
61 of operations). 68 (Proof: all operations but renames take lock on at most one
69 non-directory object, except renames, which take locks on source and
70 target in inode pointer order in the case they are not directories.)
62 71
63 Now consider the minimal deadlock. Each process is blocked on 72 Now consider the minimal deadlock. Each process is blocked on
64attempt to acquire some lock and already holds at least one lock. Let's 73attempt to acquire some lock and already holds at least one lock. Let's
@@ -66,9 +75,13 @@ consider the set of contended locks. First of all, filesystem lock is
66not contended, since any process blocked on it is not holding any locks. 75not contended, since any process blocked on it is not holding any locks.
67Thus all processes are blocked on ->i_mutex. 76Thus all processes are blocked on ->i_mutex.
68 77
69 Non-directory objects are not contended due to (3). Thus link 78 By (3), any process holding a non-directory lock can only be
70creation can't be a part of deadlock - it can't be blocked on source 79waiting on another non-directory lock with a larger address. Therefore
71and it means that it doesn't hold any locks. 80the process holding the "largest" such lock can always make progress, and
81non-directory objects are not included in the set of contended locks.
82
83 Thus link creation can't be a part of deadlock - it can't be
84blocked on source and it means that it doesn't hold any locks.
72 85
73 Any contended object is either held by cross-directory rename or 86 Any contended object is either held by cross-directory rename or
74has a child that is also contended. Indeed, suppose that it is held by 87has a child that is also contended. Indeed, suppose that it is held by