author: Andreas Rohner <andreas.rohner@gmx.net> 2017-11-17 18:29:35 -0500
committer: Linus Torvalds <torvalds@linux-foundation.org> 2017-11-17 19:10:03 -0500
commit: 31ccb1f7ba3cfe29631587d451cf5bb8ab593550
tree: 0f42708142e8de5815d26a220513c70e692100a4
parent: 7554e9c4cfa208acf3164a86c05aaa967b043425

nilfs2: fix race condition that causes file system corruption

There is a race condition between nilfs_dirty_inode() and nilfs_set_file_dirty().

When a file is opened, nilfs_dirty_inode() is called to update the access timestamp in the inode. It calls __nilfs_mark_inode_dirty() in a separate transaction. __nilfs_mark_inode_dirty() caches the ifile buffer_head in the i_bh field of the inode info structure and marks it as dirty.

After some data was written to the file in another transaction, the function nilfs_set_file_dirty() is called, which adds the inode to the ns_dirty_files list.

Then the segment construction calls nilfs_segctor_collect_dirty_files(), which goes through the ns_dirty_files list and checks the i_bh field. If there is a cached buffer_head in i_bh it is not marked as dirty again.

Since nilfs_dirty_inode() and nilfs_set_file_dirty() use separate transactions, it is possible that a segment construction that writes out the ifile occurs in-between the two. If this happens the inode is not on the ns_dirty_files list, but its ifile block is still marked as dirty and written out.

In the next segment construction, the data for the file is written out and nilfs_bmap_propagate() updates the b-tree. Eventually the bmap root is written into the i_bh block, which is not dirty, because it was written out in another segment construction.

As a result the bmap update can be lost, which leads to file system corruption. Either the virtual block address points to an unallocated DAT block, or the DAT entry will be reused for something different.

The error can remain undetected for a long time. A typical error message would be one of the "bad btree" errors or a warning that a DAT entry could not be found.

This bug can be reproduced reliably by a simple benchmark that creates and overwrites millions of 4k files.

Link: http://lkml.kernel.org/r/1509367935-3086-2-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
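
Illustrative note (not part of the commit): the ordering described in the message can be sketched with a small user-space toy model. Everything below is hypothetical and heavily simplified. struct toy_inode and the toy_* functions are invented for this sketch and are not kernel code; the "disk" is a single integer, and the always_redirty flag stands in for the behaviour this patch introduces.

```c
#include <stdbool.h>

/* Toy stand-in for the inode info structure; NOT the real nilfs_inode_info. */
struct toy_inode {
	bool bh_cached;  /* models ii->i_bh != NULL */
	bool bh_dirty;   /* models the dirty state of the cached ifile block */
	bool queued;     /* models membership in the ns_dirty_files list */
	int bmap_root;   /* in-memory bmap root */
	int disk_root;   /* bmap root last written out with the ifile block */
};

/* Models nilfs_dirty_inode(): cache the ifile block and mark it dirty. */
static void toy_dirty_inode(struct toy_inode *ii)
{
	ii->bh_cached = true;
	ii->bh_dirty = true;
}

/* Models nilfs_set_file_dirty() after a data write: queue the inode. */
static void toy_set_file_dirty(struct toy_inode *ii, int new_root)
{
	ii->bmap_root = new_root;
	ii->queued = true;
}

/* Models one segment construction, including the dirty-file collection. */
static void toy_segment_construction(struct toy_inode *ii, bool always_redirty)
{
	if (ii->queued) {
		if (!ii->bh_cached) {
			/* Pre-patch behaviour: dirty only a freshly fetched block. */
			ii->bh_cached = true;
			ii->bh_dirty = true;
		}
		if (always_redirty)
			ii->bh_dirty = true; /* behaviour added by the patch */
		ii->queued = false;
	}
	/* The ifile block, carrying the bmap root, is written only if dirty. */
	if (ii->bh_dirty) {
		ii->disk_root = ii->bmap_root;
		ii->bh_dirty = false;
	}
}

/* Replays the problematic ordering and returns the root that reached disk. */
int toy_run(bool always_redirty)
{
	struct toy_inode ii = { .bmap_root = 1, .disk_root = 1 };

	toy_dirty_inode(&ii);                          /* first transaction */
	toy_segment_construction(&ii, always_redirty); /* runs in between */
	toy_set_file_dirty(&ii, 2);                    /* second transaction */
	toy_segment_construction(&ii, always_redirty);
	return ii.disk_root;
}
```

In this model, toy_run(false) returns 1: the second construction finds a cached but clean block, so the new root never reaches the disk. toy_run(true) returns 2, because the block is marked dirty again for every queued inode.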

-rw-r--r-- fs/nilfs2/segment.c | 6
1 file changed, 4 insertions, 2 deletions

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 472f0b53a724..f572538dcc4f 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -1954,8 +1954,6 @@ static int nilfs_segctor_collect_dirty_files(struct nilfs_sc_info *sci,
 					  err, ii->vfs_inode.i_ino);
 				return err;
 			}
-			mark_buffer_dirty(ibh);
-			nilfs_mdt_mark_dirty(ifile);
 			spin_lock(&nilfs->ns_inode_lock);
 			if (likely(!ii->i_bh))
 				ii->i_bh = ibh;
@@ -1964,6 +1962,10 @@ static int nilfs_segctor_collect_dirty_files(struct nilfs_sc_info *sci,
 			goto retry;
 		}
 
+		// Always redirty the buffer to avoid race condition
+		mark_buffer_dirty(ii->i_bh);
+		nilfs_mdt_mark_dirty(ifile);
+
 		clear_bit(NILFS_I_QUEUED, &ii->i_state);
 		set_bit(NILFS_I_BUSY, &ii->i_state);
 		list_move_tail(&ii->i_dirty, &sci->sc_dirty_files);