Diffstat (limited to 'fs/notify/mark.c')
| -rw-r--r-- | fs/notify/mark.c | 50 |
1 files changed, 22 insertions, 28 deletions
diff --git a/fs/notify/mark.c b/fs/notify/mark.c index fc6b49bf7360..923fe4a5f503 100644 --- a/fs/notify/mark.c +++ b/fs/notify/mark.c | |||
| @@ -20,28 +20,29 @@ | |||
| 20 | * fsnotify inode mark locking/lifetime/and refcnting | 20 | * fsnotify inode mark locking/lifetime/and refcnting |
| 21 | * | 21 | * |
| 22 | * REFCNT: | 22 | * REFCNT: |
| 23 | * The mark->refcnt tells how many "things" in the kernel currently are | 23 | * The group->recnt and mark->refcnt tell how many "things" in the kernel |
| 24 | * referencing this object. The object typically will live inside the kernel | 24 | * currently are referencing the objects. Both kind of objects typically will |
| 25 | * with a refcnt of 2, one for each list it is on (i_list, g_list). Any task | 25 | * live inside the kernel with a refcnt of 2, one for its creation and one for |
| 26 | * which can find this object holding the appropriete locks, can take a reference | 26 | * the reference a group and a mark hold to each other. |
| 27 | * and the object itself is guaranteed to survive until the reference is dropped. | 27 | * If you are holding the appropriate locks, you can take a reference and the |
| 28 | * object itself is guaranteed to survive until the reference is dropped. | ||
| 28 | * | 29 | * |
| 29 | * LOCKING: | 30 | * LOCKING: |
| 30 | * There are 3 spinlocks involved with fsnotify inode marks and they MUST | 31 | * There are 3 locks involved with fsnotify inode marks and they MUST be taken |
| 31 | * be taken in order as follows: | 32 | * in order as follows: |
| 32 | * | 33 | * |
| 34 | * group->mark_mutex | ||
| 33 | * mark->lock | 35 | * mark->lock |
| 34 | * group->mark_lock | ||
| 35 | * inode->i_lock | 36 | * inode->i_lock |
| 36 | * | 37 | * |
| 37 | * mark->lock protects 2 things, mark->group and mark->inode. You must hold | 38 | * group->mark_mutex protects the marks_list anchored inside a given group and |
| 38 | * that lock to dereference either of these things (they could be NULL even with | 39 | * each mark is hooked via the g_list. It also protects the groups private |
| 39 | * the lock) | 40 | * data (i.e group limits). |
| 40 | * | 41 | |
| 41 | * group->mark_lock protects the marks_list anchored inside a given group | 42 | * mark->lock protects the marks attributes like its masks and flags. |
| 42 | * and each mark is hooked via the g_list. It also sorta protects the | 43 | * Furthermore it protects the access to a reference of the group that the mark |
| 43 | * free_g_list, which when used is anchored by a private list on the stack of the | 44 | * is assigned to as well as the access to a reference of the inode/vfsmount |
| 44 | * task which held the group->mark_lock. | 45 | * that is being watched by the mark. |
| 45 | * | 46 | * |
| 46 | * inode->i_lock protects the i_fsnotify_marks list anchored inside a | 47 | * inode->i_lock protects the i_fsnotify_marks list anchored inside a |
| 47 | * given inode and each mark is hooked via the i_list. (and sorta the | 48 | * given inode and each mark is hooked via the i_list. (and sorta the |
| @@ -64,18 +65,11 @@ | |||
| 64 | * inode. We take i_lock and walk the i_fsnotify_marks safely. For each | 65 | * inode. We take i_lock and walk the i_fsnotify_marks safely. For each |
| 65 | * mark on the list we take a reference (so the mark can't disappear under us). | 66 | * mark on the list we take a reference (so the mark can't disappear under us). |
| 66 | * We remove that mark form the inode's list of marks and we add this mark to a | 67 | * We remove that mark form the inode's list of marks and we add this mark to a |
| 67 | * private list anchored on the stack using i_free_list; At this point we no | 68 | * private list anchored on the stack using i_free_list; we walk i_free_list |
| 68 | * longer fear anything finding the mark using the inode's list of marks. | 69 | * and before we destroy the mark we make sure that we dont race with a |
| 69 | * | 70 | * concurrent destroy_group by getting a ref to the marks group and taking the |
| 70 | * We can safely and locklessly run the private list on the stack of everything | 71 | * groups mutex. |
| 71 | * we just unattached from the original inode. For each mark on the private list | 72 | |
| 72 | * we grab the mark-> and can thus dereference mark->group and mark->inode. If | ||
| 73 | * we see the group and inode are not NULL we take those locks. Now holding all | ||
| 74 | * 3 locks we can completely remove the mark from other tasks finding it in the | ||
| 75 | * future. Remember, 10 things might already be referencing this mark, but they | ||
| 76 | * better be holding a ref. We drop our reference we took before we unhooked it | ||
| 77 | * from the inode. When the ref hits 0 we can free the mark. | ||
| 78 | * | ||
| 79 | * Very similarly for freeing by group, except we use free_g_list. | 73 | * Very similarly for freeing by group, except we use free_g_list. |
| 80 | * | 74 | * |
| 81 | * This has the very interesting property of being able to run concurrently with | 75 | * This has the very interesting property of being able to run concurrently with |
