field | value | date |
---|---|---|
author | Dmitry Monakhov <dmonakhov@openvz.org> | 2012-10-05 11:31:55 -0400 |
committer | Theodore Ts'o <tytso@mit.edu> | 2012-10-05 11:31:55 -0400 |
commit | c278531d39f3158bfee93dc67da0b77e09776de2 | |
tree | b83341e04d54b3f1cd8171f43ec77bbfba06e571 /fs/ext4/extents.c | |
parent | 041bbb6d369811e948ae01f3d00414264076be35 | |
ext4: fix ext4_flush_completed_IO wait semantics
BUG #1) All places where we call ext4_flush_completed_IO are broken
because buffered io and DIO/AIO go through three stages:
1) submitted io,
2) completed io (in i_completed_io_list), conversion pending,
3) finished io (conversion done).
Calling ext4_flush_completed_IO flushes only the requests
that are in stage (2), which is wrong because:
1) punch_hole and truncate _must_ wait for all outstanding unwritten io
regardless of its state.
2) fsync and nolock_dio_read should also wait because there is
a time window between end_page_writeback() and ext4_add_complete_io().
As a result, integrity fsync is broken in the case of a buffered write
to a fallocated region:
fsync                                      blkdev_completion
 ->filemap_write_and_wait_range
                                           ->ext4_end_bio
                                             ->end_page_writeback
  <-- filemap_write_and_wait_range return
 ->ext4_flush_completed_IO
    sees empty i_completed_io_list but a
    pending conversion still exists
                                             ->ext4_add_complete_io
BUG #2) The race window becomes wider due to the 'ext4: completed_io
locking cleanup V4' patch series.
This patch makes the following changes:
1) ext4_flush_completed_io() now first tries to flush completed io and then
waits for any outstanding unwritten io via ext4_unwritten_wait().
2) Rename the function to the more appropriate ext4_flush_unwritten_io.
3) Assert that all callers of ext4_flush_unwritten_io hold i_mutex to
prevent an endless wait.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
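
The difference between the old and new wait semantics described in the commit message can be sketched with a toy, single-threaded model. This is only an illustration under stated assumptions, not the kernel implementation: `struct toy_inode` and every `toy_*` function are hypothetical names, the three counters stand in for the three io stages, `mutex_held` stands in for i_mutex, and "waiting" is modeled by driving completions by hand because there is no second thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one inode's unwritten-extent io (hypothetical, not kernel code). */
struct toy_inode {
	int submitted;   /* stage 1: io in flight, not yet on the completed list */
	int completed;   /* stage 2: on the completed list, conversion pending */
	int finished;    /* stage 3: conversion done */
	bool mutex_held; /* stands in for i_mutex */
};

/* Completion path: move one request from stage 1 to stage 2. */
static void toy_complete_one(struct toy_inode *ino)
{
	assert(ino->submitted > 0);
	ino->submitted--;
	ino->completed++;
}

/* Old semantics: convert only what is already in stage 2. */
static void toy_flush_completed(struct toy_inode *ino)
{
	ino->finished += ino->completed;
	ino->completed = 0;
}

/* Requests whose conversion is still outstanding (stages 1 and 2). */
static int toy_unwritten(const struct toy_inode *ino)
{
	return ino->submitted + ino->completed;
}

/* New semantics: flush stage 2, then wait until nothing unwritten is left. */
static void toy_flush_unwritten(struct toy_inode *ino)
{
	/* With the lock held no new io is submitted, so the loop terminates. */
	assert(ino->mutex_held);
	toy_flush_completed(ino);
	while (toy_unwritten(ino) > 0) {
		if (ino->submitted > 0)
			toy_complete_one(ino); /* models a completion arriving */
		toy_flush_completed(ino);
	}
}
```

Starting from one request in stage 1 and one in stage 2, `toy_flush_completed()` leaves the stage 1 request unconverted, which is the window the fsync timeline above falls into. `toy_flush_unwritten()` returns only once both requests have reached stage 3.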
Diffstat (limited to 'fs/ext4/extents.c')
-rw-r--r-- | fs/ext4/extents.c | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index c1fcf489e056..1c94cca35ed1 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4268,7 +4268,7 @@ void ext4_ext_truncate(struct inode *inode)
 	 * finish any pending end_io work so we won't run the risk of
 	 * converting any truncated blocks to initialized later
 	 */
-	ext4_flush_completed_IO(inode);
+	ext4_flush_unwritten_io(inode);
 
 	/*
 	 * probably first extent we're gonna free will be last in block
@@ -4847,10 +4847,10 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
 
 	/* Wait all existing dio workers, newcomers will block on i_mutex */
 	ext4_inode_block_unlocked_dio(inode);
-	inode_dio_wait(inode);
-	err = ext4_flush_completed_IO(inode);
+	err = ext4_flush_unwritten_io(inode);
 	if (err)
 		goto out_dio;
+	inode_dio_wait(inode);
 
 	credits = ext4_writepage_trans_blocks(inode);
 	handle = ext4_journal_start(inode, credits);