aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorNick Piggin <npiggin@suse.de>2009-01-06 17:40:28 -0500
committerLinus Torvalds <torvalds@linux-foundation.org>2009-01-06 18:59:09 -0500
commit67d58ac47d25f7e2a105248a4aea6113131ab874 (patch)
tree5bf9440696b72ec0962d352c6d60b81929b79602
parent856bf4d717feb8c55d4e2f817b71ebb70cfbc67b (diff)
mm: pagecache gfp flags fix
Frustratingly, gfp_t is really divided into two classes of flags. One are the context dependent ones (can we sleep? can we enter filesystem? block subsystem? should we use some extra reserves, etc.). The other ones are the type of memory required and depend on how the algorithm is implemented rather than the point at which the memory is allocated (highmem? dma memory? etc). Some of the functions which allocate a page and add it to page cache take a gfp_t, but sometimes those functions or their callers aren't really doing the right thing: when allocating pagecache page, the memory type should be mapping_gfp_mask(mapping). When allocating radix tree nodes, the memory type should be kernel mapped (not highmem) memory. The gfp_t argument should only really be needed for context dependent options. This patch doesn't really solve that tangle in a nice way, but it does attempt to fix a couple of bugs. - find_or_create_page changes its radix-tree allocation to only include the main context dependent flags in order so the pagecache page may be allocated from arbitrary types of memory without affecting the radix-tree. In practice, slab allocations don't come from highmem anyway, and radix-tree only uses slab allocations. So there isn't a practical change (unless some fs uses GFP_DMA for pages). - grab_cache_page_nowait() is changed to allocate radix-tree nodes with GFP_NOFS, because it is not supposed to reenter the filesystem. This bug could cause lock recursion if a filesystem is not expecting the function to reenter the fs (as-per documentation). Filesystems should be careful about exactly what semantics they want and what they get when fiddling with gfp_t masks to allocate pagecache. One should be as liberal as possible with the type of memory that can be used, and same for the the context specific flags. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r--mm/filemap.c11
1 files changed, 9 insertions, 2 deletions
diff --git a/mm/filemap.c b/mm/filemap.c
index f3555fb806d3..2f55a1e2baf7 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -741,7 +741,14 @@ repeat:
741 page = __page_cache_alloc(gfp_mask); 741 page = __page_cache_alloc(gfp_mask);
742 if (!page) 742 if (!page)
743 return NULL; 743 return NULL;
744 err = add_to_page_cache_lru(page, mapping, index, gfp_mask); 744 /*
745 * We want a regular kernel memory (not highmem or DMA etc)
746 * allocation for the radix tree nodes, but we need to honour
747 * the context-specific requirements the caller has asked for.
748 * GFP_RECLAIM_MASK collects those requirements.
749 */
750 err = add_to_page_cache_lru(page, mapping, index,
751 (gfp_mask & GFP_RECLAIM_MASK));
745 if (unlikely(err)) { 752 if (unlikely(err)) {
746 page_cache_release(page); 753 page_cache_release(page);
747 page = NULL; 754 page = NULL;
@@ -950,7 +957,7 @@ grab_cache_page_nowait(struct address_space *mapping, pgoff_t index)
950 return NULL; 957 return NULL;
951 } 958 }
952 page = __page_cache_alloc(mapping_gfp_mask(mapping) & ~__GFP_FS); 959 page = __page_cache_alloc(mapping_gfp_mask(mapping) & ~__GFP_FS);
953 if (page && add_to_page_cache_lru(page, mapping, index, GFP_KERNEL)) { 960 if (page && add_to_page_cache_lru(page, mapping, index, GFP_NOFS)) {
954 page_cache_release(page); 961 page_cache_release(page);
955 page = NULL; 962 page = NULL;
956 } 963 }