author Mike Kravetz <mike.kravetz@oracle.com> 2016-06-08 18:33:42 -0400
committer Linus Torvalds <torvalds@linux-foundation.org> 2016-06-09 17:23:11 -0400
commit 67961f9db8c477026ea20ce05761bde6f8bf85b0
tree 9fc05141d49c6eaa6f81aa68341e691b78fd430c /mm
parent c8ae067f2635be0f8c7e5db1bb74b757d623e05b
mm/hugetlb: fix huge page reserve accounting for private mappings

When creating a private mapping of a hugetlbfs file, it is possible to
unmap pages via ftruncate or fallocate hole punch.  If subsequent faults
repopulate these mappings, the reserve counts will go negative.  This is
because the code currently assumes all faults to private mappings will
consume reserves.  The problem can be recreated as follows:
- mmap(MAP_PRIVATE) a file in hugetlbfs filesystem
- write fault in pages in the mapping
- fallocate(FALLOC_FL_PUNCH_HOLE) some pages in the mapping
- write fault in pages in the hole

This will result in negative huge page reserve counts and negative
subpool usage counts for the hugetlbfs.  Note that this can also be
recreated with ftruncate, but fallocate is more straightforward.

This patch modifies the routines vma_needs_reserves and vma_has_reserves
to examine the reserve map associated with private mappings similar to
that for shared mappings.  However, the reserve map semantics for private
and shared mappings are very different.  This results in subtly different
code that is explained in the comments.

Link: http://lkml.kernel.org/r/1464720957-15698-1-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

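The reproduction steps listed in the commit message could be sketched in userspace roughly as follows. This is only an illustration and not part of the patch: the function name `punch_and_refault`, the two-page mapping length, and the explicit `ftruncate` call are assumptions of this sketch, and the caller must supply a path on a mounted hugetlbfs together with that mount's huge page size.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper following the four steps from the commit message.
 * path:  file to create on a mounted hugetlbfs (chosen by the caller)
 * hpage: huge page size of that mount, in bytes (chosen by the caller)
 * Returns 0 if every step succeeded, -1 otherwise.
 */
int punch_and_refault(const char *path, size_t hpage)
{
	const size_t len = 2 * hpage;	/* two huge pages: an assumption */
	int ret = -1;
	char *p;
	int fd;

	fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0)
		return -1;

	/* Size the file up front; not one of the listed steps. */
	if (ftruncate(fd, (off_t)len) != 0)
		goto out_close;

	/* Step 1: private mapping of the hugetlbfs file. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	/* Step 2: write fault in every page of the mapping. */
	memset(p, 1, len);

	/* Step 3: punch a hole over the first huge page. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      0, (off_t)hpage) == 0) {
		/* Step 4: write fault in the page inside the hole. */
		memset(p, 2, hpage);
		ret = 0;
	}

	munmap(p, len);
out_close:
	close(fd);
	unlink(path);
	return ret;
}
```

A caller would pass something like a file under its hugetlbfs mount point and the matching page size. Comparing the `HugePages_Rsvd` line of `/proc/meminfo` before and after the call is one way to watch the reserve count; note that step 2 can terminate the process with SIGBUS if no huge pages are available.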
Diffstat (limited to 'mm')
-rw-r--r-- mm/hugetlb.c 42
1 file changed, 40 insertions, 2 deletions
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d26162e81fea..388c2bb9b55c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -832,8 +832,27 @@ static bool vma_has_reserves(struct vm_area_struct *vma, long chg)
 	 * Only the process that called mmap() has reserves for
 	 * private mappings.
 	 */
-	if (is_vma_resv_set(vma, HPAGE_RESV_OWNER))
-		return true;
+	if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
+		/*
+		 * Like the shared case above, a hole punch or truncate
+		 * could have been performed on the private mapping.
+		 * Examine the value of chg to determine if reserves
+		 * actually exist or were previously consumed.
+		 * Very Subtle - The value of chg comes from a previous
+		 * call to vma_needs_reserves().  The reserve map for
+		 * private mappings has different (opposite) semantics
+		 * than that of shared mappings.  vma_needs_reserves()
+		 * has already taken this difference in semantics into
+		 * account.  Therefore, the meaning of chg is the same
+		 * as in the shared case above.  Code could easily be
+		 * combined, but keeping it separate draws attention to
+		 * subtle differences.
+		 */
+		if (chg)
+			return false;
+		else
+			return true;
+	}
 
 	return false;
 }
@@ -1816,6 +1835,25 @@ static long __vma_reservation_common(struct hstate *h,
 
 	if (vma->vm_flags & VM_MAYSHARE)
 		return ret;
+	else if (is_vma_resv_set(vma, HPAGE_RESV_OWNER) && ret >= 0) {
+		/*
+		 * In most cases, reserves always exist for private mappings.
+		 * However, a file associated with mapping could have been
+		 * hole punched or truncated after reserves were consumed.
+		 * As subsequent fault on such a range will not use reserves.
+		 * Subtle - The reserve map for private mappings has the
+		 * opposite meaning than that of shared mappings.  If NO
+		 * entry is in the reserve map, it means a reservation exists.
+		 * If an entry exists in the reserve map, it means the
+		 * reservation has already been consumed.  As a result, the
+		 * return value of this routine is the opposite of the
+		 * value returned from reserve map manipulation routines above.
+		 */
+		if (ret)
+			return 0;
+		else
+			return 1;
+	}
 	else
 		return ret < 0 ? ret : 0;
 }
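
The inverted reserve map semantics described in the two new comments can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: `struct toy_vma`, `toy_needs_reservation` and `toy_has_reserves_private` are hypothetical names, the model covers only the branches visible in this diff, and `ret` is assumed to be the raw result of the reserve map lookup (negative on error, 0 when an entry already covers the page, 1 when no entry exists).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two VMA properties the diff tests. */
struct toy_vma {
	bool may_share;		/* models vma->vm_flags & VM_MAYSHARE */
	bool resv_owner;	/* models is_vma_resv_set(vma, HPAGE_RESV_OWNER) */
};

/*
 * Models the tail of __vma_reservation_common() after the patch.
 * Shared mappings pass the reserve map result through unchanged.
 * For the owner of a private mapping the meaning is flipped:
 * no map entry (ret == 1) means a reservation still exists, so no
 * new reservation is needed (0); an existing entry (ret == 0) means
 * the reservation was already consumed, so one is needed (1).
 */
long toy_needs_reservation(const struct toy_vma *vma, long ret)
{
	if (vma->may_share)
		return ret;
	else if (vma->resv_owner && ret >= 0)
		return ret ? 0 : 1;
	else
		return ret < 0 ? ret : 0;
}

/*
 * Models only the private-mapping branch of vma_has_reserves() after
 * the patch: the owner has a reserve only when chg is zero.
 */
bool toy_has_reserves_private(const struct toy_vma *vma, long chg)
{
	if (vma->resv_owner)
		return chg == 0;
	return false;
}
```

In this model, a first fault by the owner (no map entry, `ret == 1`) yields `chg == 0` and a reserve is used, whereas a fault into a punched hole (entry present, `ret == 0`) yields `chg == 1` and no reserve is consumed, which is the case the patch is meant to handle.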