From 68589bc353037f233fe510ad9ff432338c95db66 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Tue, 14 Nov 2006 02:03:32 -0800 Subject: [PATCH] hugetlb: prepare_hugepage_range check offset too (David:) If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example, because the given file offset is not hugepage aligned - then do_mmap_pgoff will go to the unmap_and_free_vma backout path. But at this stage the vma hasn't been marked as hugepage, and the backout path will call unmap_region() on it. That will eventually call down to the non-hugepage version of unmap_page_range(). On ppc64, at least, that will cause serious problems if there are any existing hugepage pagetable entries in the vicinity - for example if there are any other hugepage mappings under the same PUD. unmap_page_range() will trigger a bad_pud() on the hugepage pud entries. I suspect this will also cause bad problems on ia64, though I don't have a machine to test it on. (Hugh:) prepare_hugepage_range() should check file offset alignment when it checks virtual address and length, to stop MAP_FIXED with a bad huge offset from unmapping before it fails further down. PowerPC should apply the same prepare_hugepage_range alignment checks as ia64 and all the others do. Then none of the alignment checks in hugetlbfs_file_mmap are required (nor is the check for too small a mapping); but even so, move up setting of VM_HUGETLB and add a comment to warn of what David Gibson discovered - if hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region when unwinding from error will go the non-huge way, which may cause bad behaviour on architectures (powerpc and ia64) which segregate their huge mappings into a separate region of the address space. Signed-off-by: Hugh Dickins Cc: "Luck, Tony" Cc: "David S. Miller" Acked-by: Adam Litke Acked-by: David Gibson Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/mmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index 497e502dfd6b..bdace87d7c01 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1379,7 +1379,7 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, * Check if the given range is hugepage aligned, and * can be made suitable for hugepages. */ - ret = prepare_hugepage_range(addr, len); + ret = prepare_hugepage_range(addr, len, pgoff); } else { /* * Ensure that a normal request is not falling in a -- cgit v1.2.2 From cb07c9a1864a8eac9f3123e428100d5b2a16e65a Mon Sep 17 00:00:00 2001 From: David Gibson Date: Tue, 14 Nov 2006 02:03:38 -0800 Subject: [PATCH] hugetlb: check for brk() entering a hugepage region Unlike mmap(), the codepath for brk() creates a vma without first checking that it doesn't touch a region exclusively reserved for hugepages. On powerpc, this can allow it to create a normal page vma in a hugepage region, causing oopses and other badness. Add a test to prevent this. With this patch, brk() will simply fail if it attempts to move the break into a hugepage reserved region. Signed-off-by: David Gibson Cc: Adam Litke Cc: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/mmap.c | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index bdace87d7c01..2526463c99a7 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1880,6 +1880,10 @@ unsigned long do_brk(unsigned long addr, unsigned long len) if ((addr + len) > TASK_SIZE || (addr + len) < addr) return -EINVAL; + error = is_hugepage_only_range(current->mm, addr, len); + if (error) + return error; + flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags; error = arch_mmap_check(addr, len, flags); -- cgit v1.2.2 From cd2579d7aa7bfc966cc271a88e77f8cfc3b0b7ba Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Tue, 14 Nov 2006 13:43:38 +0000 Subject: [PATCH] hugetlb: fix error return for brk() entering a hugepage region Commit cb07c9a1864a8eac9f3123e428100d5b2a16e65a causes the wrong return value. is_hugepage_only_range() is a boolean, so we should return -EINVAL rather than 1. Also - we can use "mm" instead of looking up "current->mm" again. Signed-off-by: Hugh Dickins Signed-off-by: Linus Torvalds --- mm/mmap.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index 2526463c99a7..7b40abd7cba2 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1880,9 +1880,8 @@ unsigned long do_brk(unsigned long addr, unsigned long len) if ((addr + len) > TASK_SIZE || (addr + len) < addr) return -EINVAL; - error = is_hugepage_only_range(current->mm, addr, len); - if (error) - return error; + if (is_hugepage_only_range(mm, addr, len)) + return -EINVAL; flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags; -- cgit v1.2.2