author | Nick Piggin <npiggin@suse.de> | 2009-01-04 15:00:53 -0500 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2009-01-04 16:33:20 -0500 |
commit | 54566b2c1594c2326a645a3551f9d989f7ba3c5e (patch) | |
tree | b373f3283fe5e197d0df29cd6b645c35adf1076c /fs/ext3 | |
parent | e687d691cb3790d25e31c74f5941fd7c565e9df5 (diff) |
fs: symlink write_begin allocation context fix
With the write_begin/write_end aops, page_symlink was broken because it
could no longer pass a GFP_NOFS type mask into the point where the
allocations happened. They are done in write_begin, which would always
assume that the filesystem can be entered from reclaim. This bug could
cause filesystem deadlocks.
The funny thing with having a gfp_t mask there is that it doesn't really
allow the caller to arbitrarily tinker with the context in which it can be
called. It couldn't ever be GFP_ATOMIC, for example, because it needs to
take the page lock. The only thing any callers care about is __GFP_FS
anyway, so turn that into a single flag.
Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on
this flag in their write_begin function. Change __grab_cache_page to
accept a nofs argument as well, to honour that flag (while we're there,
change the name to grab_cache_page_write_begin which is more instructive
and does away with random leading underscores).
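
The flag handling described above can be sketched in userspace C. This is a minimal model, not the kernel implementation: the MODEL_* constants are hypothetical stand-ins for __GFP_FS, GFP_KERNEL, GFP_NOFS and AOP_FLAG_NOFS with illustrative values, and model_write_begin_gfp_mask() is an invented helper that only shows the mask arithmetic a grab_cache_page_write_begin-style function would presumably perform before allocating a pagecache page.

```c
/* Userspace model only; the real definitions live in kernel headers. */
typedef unsigned int gfp_t;

/* Hypothetical stand-ins with illustrative values. */
#define MODEL_GFP_WAIT      0x10u  /* stands for __GFP_WAIT */
#define MODEL_GFP_IO        0x40u  /* stands for __GFP_IO */
#define MODEL_GFP_FS        0x80u  /* stands for __GFP_FS */
#define MODEL_GFP_NOFS      (MODEL_GFP_WAIT | MODEL_GFP_IO)
#define MODEL_GFP_KERNEL    (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_AOP_FLAG_NOFS 0x0004u /* stands for AOP_FLAG_NOFS */

/*
 * Hypothetical helper: derive the allocation mask for the pagecache page
 * from the mapping's default mask and the AOP flags given to write_begin.
 * When the NOFS flag is set, the "may enter the filesystem" bit is cleared
 * so that reclaim cannot recurse into the filesystem.
 */
static gfp_t model_write_begin_gfp_mask(gfp_t mapping_mask,
					unsigned int aop_flags)
{
	gfp_t gfp_notmask = 0;

	if (aop_flags & MODEL_AOP_FLAG_NOFS)
		gfp_notmask = MODEL_GFP_FS;

	return mapping_mask & ~gfp_notmask;
}
```

With a mapping whose default mask allows filesystem entry, passing the NOFS flag yields a mask without the FS bit, and passing no flag leaves the mask unchanged.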
This is really a more flexible way to go in the end anyway -- if a
filesystem happens to want any extra allocations aside from the pagecache
ones in its write_begin function, it may now use GFP_KERNEL (rather than
GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a
random example).
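
As a rough illustration of that flexibility, a filesystem's write_begin could pick the mask for its own private allocations from the flags it receives. The sketch below is a userspace model under stated assumptions: the MODEL_* constants are hypothetical stand-ins with illustrative values, and model_private_alloc_mask() is an invented name, not a function from this commit or from any filesystem.

```c
/* Userspace model only; none of these names exist in the kernel. */
typedef unsigned int gfp_t;

/* Hypothetical stand-ins with illustrative values. */
#define MODEL_GFP_WAIT      0x10u
#define MODEL_GFP_IO        0x40u
#define MODEL_GFP_FS        0x80u
#define MODEL_GFP_NOFS      (MODEL_GFP_WAIT | MODEL_GFP_IO)
#define MODEL_GFP_KERNEL    (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_AOP_FLAG_NOFS 0x0004u

/*
 * Hypothetical helper: choose the mask for a filesystem's extra
 * (non-pagecache) allocations inside write_begin. The common case gets the
 * less restrictive GFP_KERNEL-like mask; only callers that asked for NOFS
 * pay the cost of the restricted mask.
 */
static gfp_t model_private_alloc_mask(unsigned int aop_flags)
{
	if (aop_flags & MODEL_AOP_FLAG_NOFS)
		return MODEL_GFP_NOFS;

	return MODEL_GFP_KERNEL;
}
```

The design point is that the restriction travels with the call rather than being hard-coded into every allocation site.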
[kosaki.motohiro@jp.fujitsu.com: fix ubifs]
[kosaki.motohiro@jp.fujitsu.com: fix fuse]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Cleaned up the calling convention: just pass in the AOP flags
untouched to the grab_cache_page_write_begin() function. That
just simplifies everybody, and may even allow future expansion of the
logic. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/ext3')
-rw-r--r-- | fs/ext3/inode.c | 2 |
-rw-r--r-- | fs/ext3/namei.c | 3 |
2 files changed, 2 insertions, 3 deletions
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index c4bdccf976b5..5fa453b49a64 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1161,7 +1161,7 @@ static int ext3_write_begin(struct file *file, struct address_space *mapping,
 	to = from + len;
 
 retry:
-	page = __grab_cache_page(mapping, index);
+	page = grab_cache_page_write_begin(mapping, index, flags);
 	if (!page)
 		return -ENOMEM;
 	*pagep = page;
diff --git a/fs/ext3/namei.c b/fs/ext3/namei.c
index 297ea8dfac7c..1dd2abe6313e 100644
--- a/fs/ext3/namei.c
+++ b/fs/ext3/namei.c
@@ -2175,8 +2175,7 @@ retry:
 		 * We have a transaction open. All is sweetness. It also sets
 		 * i_size in generic_commit_write().
 		 */
-		err = __page_symlink(inode, symname, l,
-				mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS);
+		err = __page_symlink(inode, symname, l, 1);
 		if (err) {
 			drop_nlink(inode);
 			unlock_new_inode(inode);
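
In the fs/ext3/namei.c hunk, the gfp mask argument to __page_symlink is replaced by the constant 1, which reads as a "nofs" boolean. The body of __page_symlink is outside this fs/ext3-limited diff, so the following is only a userspace sketch of how such a boolean might be turned into AOP flags for write_begin, and how a caller holding only a gfp mask could derive it. All MODEL_* names and both helper functions are hypothetical, with illustrative values.

```c
/* Userspace model only; the real code is not shown in this diff. */
typedef unsigned int gfp_t;

/* Hypothetical stand-ins with illustrative values. */
#define MODEL_GFP_WAIT                 0x10u
#define MODEL_GFP_IO                   0x40u
#define MODEL_GFP_FS                   0x80u
#define MODEL_GFP_NOFS                 (MODEL_GFP_WAIT | MODEL_GFP_IO)
#define MODEL_GFP_KERNEL               (MODEL_GFP_NOFS | MODEL_GFP_FS)
#define MODEL_AOP_FLAG_UNINTERRUPTIBLE 0x0001u
#define MODEL_AOP_FLAG_NOFS            0x0004u

/*
 * Hypothetical helper: build the AOP flags a symlink writer would pass to
 * write_begin. A non-zero nofs requests allocations that must not re-enter
 * the filesystem.
 */
static unsigned int model_symlink_aop_flags(int nofs)
{
	unsigned int flags = MODEL_AOP_FLAG_UNINTERRUPTIBLE;

	if (nofs)
		flags |= MODEL_AOP_FLAG_NOFS;

	return flags;
}

/*
 * Hypothetical helper: derive the nofs boolean from a mapping's gfp mask,
 * for callers that previously passed the mask itself.
 */
static int model_nofs_from_mapping(gfp_t mapping_mask)
{
	return !(mapping_mask & MODEL_GFP_FS);
}
```

ext3 calls this path with a transaction open, which is presumably why it passes 1 unconditionally instead of consulting the mapping's mask.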