diff options
Diffstat (limited to 'fs/ntfs/ChangeLog')
-rw-r--r-- | fs/ntfs/ChangeLog | 42 |
1 files changed, 42 insertions, 0 deletions
diff --git a/fs/ntfs/ChangeLog b/fs/ntfs/ChangeLog index 3d2cac4061d6..9709fac6531d 100644 --- a/fs/ntfs/ChangeLog +++ b/fs/ntfs/ChangeLog | |||
@@ -132,6 +132,48 @@ ToDo/Notes: | |||
132 | the already mapped runlist fragment which causes | 132 | the already mapped runlist fragment which causes |
133 | ntfs_mapping_pairs_decompress() to fail and return error. Update | 133 | ntfs_mapping_pairs_decompress() to fail and return error. Update |
134 | ntfs_attr_find_vcn_nolock() accordingly. | 134 | ntfs_attr_find_vcn_nolock() accordingly. |
135 | - Fix a nasty deadlock that appeared in recent kernels. | ||
136 | The situation: VFS inode X on a mounted ntfs volume is dirty. For | ||
137 | same inode X, the ntfs_inode is dirty and thus corresponding on-disk | ||
138 | inode, i.e. mft record, which is in a dirty PAGE_CACHE_PAGE belonging | ||
139 | to the table of inodes, i.e. $MFT, inode 0. | ||
140 | What happens: | ||
141 | Process 1: sys_sync()/umount()/whatever... calls | ||
142 | __sync_single_inode() for $MFT -> do_writepages() -> write_page for | ||
143 | the dirty page containing the on-disk inode X, the page is now locked | ||
144 | -> ntfs_write_mst_block() which clears PageUptodate() on the page to | ||
145 | prevent anyone else getting hold of it whilst it does the write out. | ||
146 | This is necessary as the on-disk inode needs "fixups" applied before | ||
147 | the write to disk which are removed again after the write and | ||
148 | PageUptodate is then set again. It then analyses the page looking | ||
149 | for dirty on-disk inodes and when it finds one it calls | ||
150 | ntfs_may_write_mft_record() to see if it is safe to write this | ||
151 | on-disk inode. This then calls ilookup5() to check if the | ||
152 | corresponding VFS inode is in icache(). This in turn calls ifind() | ||
153 | which waits on the inode lock via wait_on_inode whilst holding the | ||
154 | global inode_lock. | ||
155 | Process 2: pdflush results in a call to __sync_single_inode for the | ||
156 | same VFS inode X on the ntfs volume. This locks the inode (I_LOCK) | ||
157 | then calls write-inode -> ntfs_write_inode -> map_mft_record() -> | ||
158 | read_cache_page() for the page (in page cache of table of inodes | ||
159 | $MFT, inode 0) containing the on-disk inode. This page has | ||
160 | PageUptodate() clear because of Process 1 (see above) so | ||
161 | read_cache_page() blocks when it tries to take the page lock for the | ||
162 | page so it can call ntfs_read_page(). | ||
163 | Thus Process 1 is holding the page lock on the page containing the | ||
164 | on-disk inode X and it is waiting on the inode X to be unlocked in | ||
165 | ifind() so it can write the page out and then unlock the page. | ||
166 | And Process 2 is holding the inode lock on inode X and is waiting for | ||
167 | the page to be unlocked so it can call ntfs_readpage() or discover | ||
168 | that Process 1 set PageUptodate() again and use the page. | ||
169 | Thus we have a deadlock due to ifind() waiting on the inode lock. | ||
170 | The solution: The fix is to use the newly introduced | ||
171 | ilookup5_nowait() which does not wait on the inode's lock and hence | ||
172 | avoids the deadlock. This is safe as we do not care about the VFS | ||
173 | inode and only use the fact that it is in the VFS inode cache and the | ||
174 | fact that the vfs and ntfs inodes are one struct in memory to find | ||
175 | the ntfs inode in memory if present. Also, the ntfs inode has its | ||
176 | own locking so it does not matter if the vfs inode is locked. | ||
135 | 177 | ||
136 | 2.1.22 - Many bug and race fixes and error handling improvements. | 178 | 2.1.22 - Many bug and race fixes and error handling improvements. |
137 | 179 | ||