author     Hugh Dickins <hughd@google.com>                    2012-03-28 17:42:40 -0400
committer  Linus Torvalds <torvalds@linux-foundation.org>    2012-03-28 20:14:35 -0400
commit     623e3db9f9b7d6e7b2a99180f9cf0825c936ab7a (patch)
tree       d8eaa8f1665a048c4318ccd0759775e057792823 /mm/truncate.c
parent     3748b2f15b06ea1861df39d5e9693dcd6e9542b1 (diff)
mm for fs: add truncate_pagecache_range()
Holepunching filesystems ext4 and xfs are using truncate_inode_pages_range
but forgetting to unmap pages first (ocfs2 remembers). This is not really
a bug, since races already require truncate_inode_page() to handle that
case once the page is locked; but it can be very inefficient if the file
being punched happens to be mapped into many vmas.
Provide a drop-in replacement truncate_pagecache_range() which does the
unmapping pass first, handling the awkward mismatch between arguments to
truncate_inode_pages_range() and arguments to unmap_mapping_range().
Note that holepunching does not unmap privately COWed pages in the range:
POSIX requires that we do so when truncating, but it's hard to justify,
difficult to implement without an i_size cutoff, and no filesystem is
attempting to implement it.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/truncate.c')
 mm/truncate.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+), 0 deletions(-)
diff --git a/mm/truncate.c b/mm/truncate.c
index 18aded3a89fc..61a183b89df6 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -626,3 +626,43 @@ int vmtruncate_range(struct inode *inode, loff_t lstart, loff_t lend)
 
 	return 0;
 }
+
+/**
+ * truncate_pagecache_range - unmap and remove pagecache that is hole-punched
+ * @inode: inode
+ * @lstart: offset of beginning of hole
+ * @lend: offset of last byte of hole
+ *
+ * This function should typically be called before the filesystem
+ * releases resources associated with the freed range (eg. deallocates
+ * blocks).  This way, pagecache will always stay logically coherent
+ * with on-disk format, and the filesystem would not have to deal with
+ * situations such as writepage being called for a page that has already
+ * had its underlying blocks deallocated.
+ */
+void truncate_pagecache_range(struct inode *inode, loff_t lstart, loff_t lend)
+{
+	struct address_space *mapping = inode->i_mapping;
+	loff_t unmap_start = round_up(lstart, PAGE_SIZE);
+	loff_t unmap_end = round_down(1 + lend, PAGE_SIZE) - 1;
+	/*
+	 * This rounding is currently just for example: unmap_mapping_range
+	 * expands its hole outwards, whereas we want it to contract the hole
+	 * inwards.  However, existing callers of truncate_pagecache_range are
+	 * doing their own page rounding first; and truncate_inode_pages_range
+	 * currently BUGs if lend is not pagealigned-1 (it handles partial
+	 * page at start of hole, but not partial page at end of hole).  Note
+	 * unmap_mapping_range allows holelen 0 for all, and we allow lend -1.
+	 */
+
+	/*
+	 * Unlike in truncate_pagecache, unmap_mapping_range is called only
+	 * once (before truncating pagecache), and without "even_cows" flag:
+	 * hole-punching should not remove private COWed pages from the hole.
+	 */
+	if ((u64)unmap_end > (u64)unmap_start)
+		unmap_mapping_range(mapping, unmap_start,
+				    1 + unmap_end - unmap_start, 0);
+	truncate_inode_pages_range(mapping, lstart, lend);
+}
+EXPORT_SYMBOL(truncate_pagecache_range);