author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-07-23 15:27:27 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-07-23 15:27:27 -0400 |
commit | a66d2c8f7ec1284206ca7c14569e2a607583f1e3 (patch) | |
tree | 08cf68bcef3559b370843cab8191e5cc0f740bde /fs/ext4 | |
parent | a6be1fcbc57f95bb47ef3c8e4ee3d83731b8f21e (diff) | |
parent | 8cae6f7158ec1fa44c8a04a43db7d8020ec60437 (diff) |
Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull the big VFS changes from Al Viro:
 "This one is *big* and changes quite a few things around VFS. What's in there:

   - the first of two really major architecture changes - death to open
     intents.

     The former is finally there; it was very long in making, but with
     Miklos getting through really hard and messy final push in
     fs/namei.c, we finally have it. Unlike his variant, this one
     doesn't introduce struct opendata; what we have instead is
     ->atomic_open() taking preallocated struct file * and passing
     everything via its fields.

     Instead of returning struct file *, it returns -E... on error, 0
     on success and 1 in "deal with it yourself" case (e.g. symlink
     found on server, etc.).

     See comments before fs/namei.c:atomic_open(). That made a lot of
     goodies finally possible and quite a few are in that pile:
     ->lookup(), ->d_revalidate() and ->create() do not get struct
     nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
     flags instead, ->create() gets "do we want it exclusive" flag.

     With the introduction of new helper (kern_path_locked()) we are rid
     of all struct nameidata instances outside of fs/namei.c; it's still
     visible in namei.h, but not for long. Come the next cycle,
     declaration will move either to fs/internal.h or to fs/namei.c
     itself. [me, miklos, hch]

   - The second major change: behaviour of final fput(). Now we have
     __fput() done without any locks held by caller *and* not from deep
     in call stack.

     That obviously lifts a lot of constraints on the locking in there.
     Moreover, it's legal now to call fput() from atomic contexts (which
     has immediately simplified life for aio.c). We also don't need
     anti-recursion logics in __scm_destroy() anymore.

     There is a price, though - the damn thing has become partially
     asynchronous. For fput() from normal process we are guaranteed
     that pending __fput() will be done before the caller returns to
     userland, exits or gets stopped for ptrace.

     For kernel threads and atomic contexts it's done via
     schedule_work(), so theoretically we might need a way to make sure
     it's finished; so far only one such place had been found, but there
     might be more.

     There's flush_delayed_fput() (do all pending __fput()) and there's
     __fput_sync() (fput() analog doing __fput() immediately). I hope
     we won't need them often; see warnings in fs/file_table.c for
     details. [me, based on task_work series from Oleg merged last
     cycle]

   - sync series from Jan

   - large part of "death to sync_supers()" work from Artem; the only
     bits missing here are exofs and ext4 ones. As far as I understand,
     those are going via the exofs and ext4 trees resp.; once they are
     in, we can put ->write_super() to the rest, along with the thread
     calling it.

   - preparatory bits from unionmount series (from dhowells).

   - assorted cleanups and fixes all over the place, as usual.

  This is not the last pile for this cycle; there's at least jlayton's
  ESTALE work and fsfreeze series (the latter - in dire need of fixes,
  so I'm not sure it'll make the cut this cycle). I'll probably throw
  symlink/hardlink restrictions stuff from Kees into the next pile, too.

  Plus there's a lot of misc patches I hadn't thrown into that one -
  it's large enough as it is..."

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
  ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
  btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
  switch dentry_open() to struct path, make it grab references itself
  spufs: shift dget/mntget towards dentry_open()
  zoran: don't bother with struct file * in zoran_map
  ecryptfs: don't reinvent the wheels, please - use struct completion
  don't expose I_NEW inodes via dentry->d_inode
  tidy up namei.c a bit
  unobfuscate follow_up() a bit
  ext3: pass custom EOF to generic_file_llseek_size()
  ext4: use core vfs llseek code for dir seeks
  vfs: allow custom EOF in generic_file_llseek code
  vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
  vfs: Remove unnecessary flushing of block devices
  vfs: Make sys_sync writeout also block device inodes
  vfs: Create function for iterating over block devices
  vfs: Reorder operations during sys_sync
  quota: Move quota syncing to ->sync_fs method
  quota: Split dquot_quota_sync() to writeback and cache flushing part
  vfs: Move noop_backing_dev_info check from sync into writeback
  ...
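
The pull message describes a three-way return convention for ->atomic_open(): a negative errno on error, 0 on success, and 1 for the "deal with it yourself" case. The following is only a userspace sketch of how a caller might dispatch on such a result. Every name in it (`model_open_file`, `model_atomic_open`, `model_dispatch`, the `MODEL_*` constants) is hypothetical and none of it is kernel code; the real logic lives in fs/namei.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a preallocated struct file; not the kernel layout. */
struct model_open_file {
	int opened;	/* set once the open has been completed */
};

/* Hypothetical situations a filesystem might run into during lookup+open. */
enum model_case { MODEL_REGULAR, MODEL_SYMLINK, MODEL_DENIED };

/* Hypothetical outcomes as seen by the caller. */
enum model_outcome { MODEL_OPENED, MODEL_CALLER_FINISHES, MODEL_FAILED };

/*
 * Follows the convention quoted in the pull message: negative errno on
 * error, 0 on success, 1 for "deal with it yourself". Results are passed
 * back through the fields of the preallocated object, not a new pointer.
 */
static int model_atomic_open(struct model_open_file *file, enum model_case what)
{
	assert(file != NULL);

	switch (what) {
	case MODEL_REGULAR:
		file->opened = 1;
		return 0;
	case MODEL_SYMLINK:
		return 1;	/* e.g. symlink found on server */
	default:
		return -EACCES;
	}
}

/* Caller-side dispatch on the three-way result. */
static enum model_outcome model_dispatch(enum model_case what)
{
	struct model_open_file file = { 0 };
	int ret = model_atomic_open(&file, what);

	if (ret < 0)
		return MODEL_FAILED;
	if (ret > 0)
		return MODEL_CALLER_FINISHES;
	return file.opened ? MODEL_OPENED : MODEL_FAILED;
}
```

Calling `model_dispatch()` with each `model_case` value exercises the three branches. The point of the sketch is that one integer return can replace a pointer-or-error return once the file object is allocated by the caller.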
Diffstat (limited to 'fs/ext4')
-rw-r--r-- | fs/ext4/dir.c | 75 |
-rw-r--r-- | fs/ext4/file.c | 9 |
-rw-r--r-- | fs/ext4/fsync.c | 11 |
-rw-r--r-- | fs/ext4/ioctl.c | 4 |
-rw-r--r-- | fs/ext4/namei.c | 8 |
-rw-r--r-- | fs/ext4/super.c | 5 |
6 files changed, 32 insertions, 80 deletions
diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index aa39e600d159..8e07d2a5a139 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -324,74 +324,27 @@ static inline loff_t ext4_get_htree_eof(struct file *filp)
 
 
 /*
- * ext4_dir_llseek() based on generic_file_llseek() to handle both
- * non-htree and htree directories, where the "offset" is in terms
- * of the filename hash value instead of the byte offset.
+ * ext4_dir_llseek() calls generic_file_llseek_size to handle htree
+ * directories, where the "offset" is in terms of the filename hash
+ * value instead of the byte offset.
  *
- * NOTE: offsets obtained *before* ext4_set_inode_flag(dir, EXT4_INODE_INDEX)
- * will be invalid once the directory was converted into a dx directory
+ * Because we may return a 64-bit hash that is well beyond offset limits,
+ * we need to pass the max hash as the maximum allowable offset in
+ * the htree directory case.
+ *
+ * For non-htree, ext4_llseek already chooses the proper max offset.
  */
 loff_t ext4_dir_llseek(struct file *file, loff_t offset, int origin)
 {
 	struct inode *inode = file->f_mapping->host;
-	loff_t ret = -EINVAL;
 	int dx_dir = is_dx_dir(inode);
+	loff_t htree_max = ext4_get_htree_eof(file);
 
-	mutex_lock(&inode->i_mutex);
-
-	/* NOTE: relative offsets with dx directories might not work
-	 * as expected, as it is difficult to figure out the
-	 * correct offset between dx hashes */
-
-	switch (origin) {
-	case SEEK_END:
-		if (unlikely(offset > 0))
-			goto out_err; /* not supported for directories */
-
-		/* so only negative offsets are left, does that have a
-		 * meaning for directories at all? */
-		if (dx_dir)
-			offset += ext4_get_htree_eof(file);
-		else
-			offset += inode->i_size;
-		break;
-	case SEEK_CUR:
-		/*
-		 * Here we special-case the lseek(fd, 0, SEEK_CUR)
-		 * position-querying operation. Avoid rewriting the "same"
-		 * f_pos value back to the file because a concurrent read(),
-		 * write() or lseek() might have altered it
-		 */
-		if (offset == 0) {
-			offset = file->f_pos;
-			goto out_ok;
-		}
-
-		offset += file->f_pos;
-		break;
-	}
-
-	if (unlikely(offset < 0))
-		goto out_err;
-
-	if (!dx_dir) {
-		if (offset > inode->i_sb->s_maxbytes)
-			goto out_err;
-	} else if (offset > ext4_get_htree_eof(file))
-		goto out_err;
-
-	/* Special lock needed here? */
-	if (offset != file->f_pos) {
-		file->f_pos = offset;
-		file->f_version = 0;
-	}
-
-out_ok:
-	ret = offset;
-out_err:
-	mutex_unlock(&inode->i_mutex);
-
-	return ret;
+	if (likely(dx_dir))
+		return generic_file_llseek_size(file, offset, origin,
+						htree_max, htree_max);
+	else
+		return ext4_llseek(file, offset, origin);
 }
 
 /*
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8c7642a00054..782eecb57e43 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -211,9 +211,9 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
 }
 
 /*
- * ext4_llseek() copied from generic_file_llseek() to handle both
- * block-mapped and extent-mapped maxbytes values. This should
- * otherwise be identical with generic_file_llseek().
+ * ext4_llseek() handles both block-mapped and extent-mapped maxbytes values
+ * by calling generic_file_llseek_size() with the appropriate maxbytes
+ * value for each.
  */
 loff_t ext4_llseek(struct file *file, loff_t offset, int origin)
 {
@@ -225,7 +225,8 @@ loff_t ext4_llseek(struct file *file, loff_t offset, int origin)
 	else
 		maxbytes = inode->i_sb->s_maxbytes;
 
-	return generic_file_llseek_size(file, offset, origin, maxbytes);
+	return generic_file_llseek_size(file, offset, origin,
+					maxbytes, i_size_read(inode));
 }
 
 const struct file_operations ext4_file_operations = {
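
Both ext4_dir_llseek() and ext4_llseek() now pass two limits to generic_file_llseek_size(): a maximum allowable offset and an explicit end-of-file value (the htree hash limit for indexed directories, i_size for regular files). The following userspace sketch models only that arithmetic, to show why SEEK_END and the bounds check need separate inputs. It is not the kernel implementation: `model_seek_file`, `model_llseek_size` and the `MODEL_SEEK_*` constants are hypothetical names, and locking, f_version handling, SEEK_DATA/SEEK_HOLE and overflow checks are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical seek origins for the model; not the kernel's SEEK_* values. */
enum { MODEL_SEEK_SET, MODEL_SEEK_CUR, MODEL_SEEK_END };

/* Hypothetical stand-in for struct file: only the position is modelled. */
struct model_seek_file {
	int64_t pos;	/* plays the role of file->f_pos */
};

/*
 * Simplified model of a seek helper that takes a maximum offset and a
 * caller-supplied EOF as two separate arguments.
 */
static int64_t model_llseek_size(struct model_seek_file *file, int64_t offset,
				 int origin, int64_t maxsize, int64_t eof)
{
	assert(file != NULL);

	switch (origin) {
	case MODEL_SEEK_END:
		offset += eof;		/* relative to the supplied EOF */
		break;
	case MODEL_SEEK_CUR:
		if (offset == 0)
			return file->pos;	/* position query, no update */
		offset += file->pos;
		break;
	case MODEL_SEEK_SET:
		break;
	default:
		return -EINVAL;
	}

	/* The bounds check uses maxsize, independent of eof. */
	if (offset < 0 || offset > maxsize)
		return -EINVAL;

	file->pos = offset;
	return offset;
}
```

In the htree case the diff passes the same value (`htree_max`) for both limits, while the regular-file case passes `maxbytes` and `i_size_read(inode)`. In the model this corresponds to calling `model_llseek_size()` with `maxsize == eof` versus `maxsize > eof`.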
diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c
index bb6c7d811313..2a1dcea4f12e 100644
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -135,14 +135,7 @@ static int ext4_sync_parent(struct inode *inode)
 	inode = igrab(inode);
 	while (ext4_test_inode_state(inode, EXT4_STATE_NEWENTRY)) {
 		ext4_clear_inode_state(inode, EXT4_STATE_NEWENTRY);
-		dentry = NULL;
-		spin_lock(&inode->i_lock);
-		if (!list_empty(&inode->i_dentry)) {
-			dentry = list_first_entry(&inode->i_dentry,
-						  struct dentry, d_alias);
-			dget(dentry);
-		}
-		spin_unlock(&inode->i_lock);
+		dentry = d_find_any_alias(inode);
 		if (!dentry)
 			break;
 		next = igrab(dentry->d_parent->d_inode);
@@ -232,7 +225,7 @@ int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 
 	if (!journal) {
 		ret = __sync_inode(inode, datasync);
-		if (!ret && !list_empty(&inode->i_dentry))
+		if (!ret && !hlist_empty(&inode->i_dentry))
 			ret = ext4_sync_parent(inode);
 		goto out;
 	}
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 6ec6f9ee2fec..7f7dad787603 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -389,7 +389,7 @@ group_add_out:
 		if (err)
 			return err;
 
-		err = mnt_want_write(filp->f_path.mnt);
+		err = mnt_want_write_file(filp);
 		if (err)
 			goto resizefs_out;
 
@@ -401,7 +401,7 @@ group_add_out:
 		}
 		if (err == 0)
 			err = err2;
-		mnt_drop_write(filp->f_path.mnt);
+		mnt_drop_write_file(filp);
 resizefs_out:
 		ext4_resize_end(sb);
 		return err;
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 5845cd97bf8b..d0d3f0e87f99 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1312,7 +1312,7 @@ errout:
 	return NULL;
 }
 
-static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
 {
 	struct inode *inode;
 	struct ext4_dir_entry_2 *de;
@@ -2072,8 +2072,8 @@ static int ext4_add_nondir(handle_t *handle,
 	int err = ext4_add_entry(handle, dentry, inode);
 	if (!err) {
 		ext4_mark_inode_dirty(handle, inode);
-		d_instantiate(dentry, inode);
 		unlock_new_inode(inode);
+		d_instantiate(dentry, inode);
 		return 0;
 	}
 	drop_nlink(inode);
@@ -2091,7 +2091,7 @@ static int ext4_add_nondir(handle_t *handle,
  * with d_instantiate().
  */
 static int ext4_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		       struct nameidata *nd)
+		       bool excl)
 {
 	handle_t *handle;
 	struct inode *inode;
@@ -2249,8 +2249,8 @@ out_clear_inode:
 	err = ext4_mark_inode_dirty(handle, dir);
 	if (err)
 		goto out_clear_inode;
-	d_instantiate(dentry, inode);
 	unlock_new_inode(inode);
+	d_instantiate(dentry, inode);
 out_stop:
 	brelse(dir_block);
 	ext4_journal_stop(handle);
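
Two hunks above swap the order of d_instantiate() and unlock_new_inode(), which appears to correspond to the shortlog entry "don't expose I_NEW inodes via dentry->d_inode". The following single-threaded userspace sketch illustrates the window the reordering closes, on the assumption that the goal is for an inode never to be reachable through a dentry while it is still marked new. All names (`model_inode`, `model_dentry`, `MODEL_I_NEW` and the `model_*` functions) are hypothetical, and the flag value is arbitrary rather than the kernel's.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_I_NEW 0x1u	/* hypothetical flag bit, not the kernel's value */

struct model_inode { unsigned int state; };
struct model_dentry { struct model_inode *inode; };

/* Model of clearing the "new" state once the inode is fully set up. */
static void model_unlock_new_inode(struct model_inode *inode)
{
	assert(inode != NULL);
	inode->state &= ~MODEL_I_NEW;
}

/* Model of attaching the inode to the dentry, making it reachable. */
static void model_d_instantiate(struct model_dentry *dentry,
				struct model_inode *inode)
{
	dentry->inode = inode;
}

/* What an observer walking through the dentry would see right now. */
static int model_exposes_new(const struct model_dentry *dentry)
{
	return dentry->inode && (dentry->inode->state & MODEL_I_NEW);
}

/* Old order: attach first, clear the flag afterwards. */
static int model_old_order(struct model_dentry *d, struct model_inode *i)
{
	int exposed;

	model_d_instantiate(d, i);
	exposed = model_exposes_new(d);	/* window: new inode is visible */
	model_unlock_new_inode(i);
	return exposed;
}

/* New order from the diff: clear the flag first, then attach. */
static int model_new_order(struct model_dentry *d, struct model_inode *i)
{
	int exposed;

	model_unlock_new_inode(i);
	exposed = model_exposes_new(d);	/* dentry has no inode yet */
	model_d_instantiate(d, i);
	return exposed || model_exposes_new(d);
}
```

The model only samples the state between the two calls; it says nothing about the locking or memory ordering the kernel relies on.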
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index eb7aa3e4ef05..d8759401ecae 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4325,6 +4325,11 @@ static int ext4_sync_fs(struct super_block *sb, int wait)
 
 	trace_ext4_sync_fs(sb, wait);
 	flush_workqueue(sbi->dio_unwritten_wq);
+	/*
+	 * Writeback quota in non-journalled quota case - journalled quota has
+	 * no dirty dquots
+	 */
+	dquot_writeback_dquots(sb, -1);
 	if (jbd2_journal_start_commit(sbi->s_journal, &target)) {
 		if (wait)
 			jbd2_log_wait_commit(sbi->s_journal, target);