author | Ross Zwisler <ross.zwisler@linux.intel.com> | 2016-01-22 18:10:34 -0500 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2016-01-22 20:02:18 -0500 |
commit | de14b9cb5e02b5daaea139590393af5ccccc4229 (patch) | |
tree | 650e132091098741b1f5ecc1a7b135ef41099f86 /fs | |
parent | d4bbe7068b60e9263f08c54e6c2a0166c0f37317 (diff) |
dax: fix conversion of holes to PMDs

When we get a DAX PMD fault for a write it is possible that there could
be some number of 4k zero pages already present for the same range that
were inserted to service reads from a hole. These 4k zero pages need to
be unmapped from the VMAs and removed from the struct address_space
radix tree before the real DAX PMD entry can be inserted.
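The access pattern described above can be sketched from userspace. This is a minimal illustration, not the reproducer used by the reporter: the helper name `touch_hole_then_write`, the 2 MiB size, and the 4 KiB read stride are assumptions. The PMD fault path is only exercised if the file lives on a DAX-mounted filesystem and the mapping happens to be suitably aligned; on an ordinary filesystem the same code simply goes through the page cache.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define HUGE_SZ (2UL * 1024 * 1024) /* assumed PMD size (x86-64) */
#define SMALL_PG 4096UL             /* assumed base page size */

/*
 * Hypothetical helper: create a sparse file, read from the hole through
 * a shared mapping, then write to the same range.
 * Returns 0 on success, -1 on error.
 */
int touch_hole_then_write(const char *path)
{
	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Extending an empty file with ftruncate() leaves it as one hole. */
	if (ftruncate(fd, (off_t)HUGE_SZ) != 0) {
		close(fd);
		return -1;
	}

	volatile unsigned char *p = mmap(NULL, HUGE_SZ,
					 PROT_READ | PROT_WRITE,
					 MAP_SHARED, fd, 0);
	close(fd); /* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return -1;

	/* Read faults over the hole: every byte read must be zero. */
	unsigned char seen = 0;
	for (size_t off = 0; off < HUGE_SZ; off += SMALL_PG)
		seen |= p[off];

	/* Write fault on the same range: storage now has to be allocated. */
	p[0] = 0xab;

	int ok = (seen == 0 && p[0] == 0xab);
	munmap((void *)p, HUGE_SZ);
	return ok ? 0 : -1;
}
```

Calling `touch_hole_then_write()` with a path on the filesystem under test performs the read-then-write sequence once; whether the write is actually served by a PMD fault depends on the kernel, the mount options, and the mapping alignment.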
For PTE faults this same use case also exists and is handled by a
combination of unmap_mapping_range() to unmap the VMAs and
delete_from_page_cache() to remove the page from the address_space radix
tree.

For PMD faults we do have a call to unmap_mapping_range() (protected by
a buffer_new() check), but nothing clears out the radix tree entry. The
buffer_new() check is also incorrect as the current ext4 and XFS
filesystem code will never return a buffer_head with BH_New set, even
when allocating new blocks over a hole. Instead the filesystem will
zero the blocks manually and return a buffer_head with only BH_Mapped
set.
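Why the old guard never fired can be shown with a small userspace model. The flag names echo BH_Mapped and BH_New, but the bit positions, `struct toy_bh`, and all helper names here are made up for illustration; they are not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative flag bits; not the kernel's actual values. */
enum toy_bh_state {
	TOY_BH_MAPPED = 1u << 0, /* block has an on-disk mapping */
	TOY_BH_NEW    = 1u << 1, /* block was freshly allocated */
};

struct toy_bh {
	unsigned int state;
};

/*
 * Model of the behaviour described above: the filesystem allocates over
 * the hole, zeroes the blocks itself, and reports only "mapped".
 */
struct toy_bh toy_get_block_over_hole(void)
{
	struct toy_bh bh = { .state = TOY_BH_MAPPED };
	return bh;
}

bool toy_buffer_new(const struct toy_bh *bh)
{
	return (bh->state & TOY_BH_NEW) != 0;
}

/* Old logic: clear out zero pages only when the buffer is flagged new. */
bool old_guard_unmaps(const struct toy_bh *bh)
{
	return toy_buffer_new(bh);
}

/* New logic: always clear out zero pages before inserting the PMD. */
bool new_guard_unmaps(const struct toy_bh *bh)
{
	(void)bh; /* the decision no longer depends on buffer state */
	return true;
}
```

In this model `old_guard_unmaps()` returns false for a buffer produced by `toy_get_block_over_hole()`, so stale zero pages would be left in place, while `new_guard_unmaps()` returns true unconditionally.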
Fix this situation by removing the buffer_new() check and adding a call
to truncate_inode_pages_range() to clear out the radix tree entries
before we insert the DAX PMD.
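The two calls describe the same byte range in different ways: unmap_mapping_range() takes a start and a length, while truncate_inode_pages_range() takes a start and an inclusive end. The arithmetic can be sketched as below, assuming 4 KiB pages and 2 MiB PMDs (the x86-64 values); `struct byte_range`, `pmd_byte_range`, and the `TOY_` constants are hypothetical names used only for this example.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                /* assumed: 4 KiB pages */
#define TOY_PMD_SIZE (UINT64_C(1) << 21) /* assumed: 2 MiB PMDs */

struct byte_range {
	uint64_t lstart; /* first byte covered by the PMD */
	uint64_t lend;   /* last byte covered by the PMD (inclusive) */
};

/* Convert a page offset into the byte range one PMD would cover. */
struct byte_range pmd_byte_range(uint64_t pgoff)
{
	struct byte_range r;

	r.lstart = pgoff << TOY_PAGE_SHIFT;
	r.lend = r.lstart + TOY_PMD_SIZE - 1;
	return r;
}
```

For a page offset of 512, this gives `lstart = 0x200000` and `lend = 0x3fffff`, so the range spans exactly `TOY_PMD_SIZE` bytes.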
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs')
-rw-r--r-- | fs/dax.c | 20 |
1 files changed, 10 insertions, 10 deletions
@@ -589,6 +589,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	bool write = flags & FAULT_FLAG_WRITE;
 	struct block_device *bdev;
 	pgoff_t size, pgoff;
+	loff_t lstart, lend;
 	sector_t block;
 	int result = 0;
 
@@ -643,15 +644,13 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		goto fallback;
 	}
 
-	/*
-	 * If we allocated new storage, make sure no process has any
-	 * zero pages covering this hole
-	 */
-	if (buffer_new(&bh)) {
-		i_mmap_unlock_read(mapping);
-		unmap_mapping_range(mapping, pgoff << PAGE_SHIFT, PMD_SIZE, 0);
-		i_mmap_lock_read(mapping);
-	}
+	/* make sure no process has any zero pages covering this hole */
+	lstart = pgoff << PAGE_SHIFT;
+	lend = lstart + PMD_SIZE - 1; /* inclusive */
+	i_mmap_unlock_read(mapping);
+	unmap_mapping_range(mapping, lstart, PMD_SIZE, 0);
+	truncate_inode_pages_range(mapping, lstart, lend);
+	i_mmap_lock_read(mapping);
 
 	/*
 	 * If a truncate happened while we were allocating blocks, we may
@@ -665,7 +664,8 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		goto out;
 	}
 	if ((pgoff | PG_PMD_COLOUR) >= size) {
-		dax_pmd_dbg(&bh, address, "pgoff unaligned");
+		dax_pmd_dbg(&bh, address,
+				"offset + huge page size > file size");
 		goto fallback;
 	}
 
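The reworded debug message in the last hunk matches what the `(pgoff | PG_PMD_COLOUR) >= size` test checks. A small sketch follows, assuming PG_PMD_COLOUR is `(PMD_SIZE >> PAGE_SHIFT) - 1` (511 with 4 KiB pages and 2 MiB PMDs) and that `size` is the file size in whole pages; the `TOY_` constants and `pmd_extends_past_eof` are hypothetical names for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                /* assumed: 4 KiB pages */
#define TOY_PMD_SIZE (UINT64_C(1) << 21) /* assumed: 2 MiB PMDs */
#define TOY_PG_PMD_COLOUR ((TOY_PMD_SIZE >> TOY_PAGE_SHIFT) - 1) /* 511 */

/*
 * OR-ing in the colour mask yields the index of the last 4 KiB page in
 * the 2 MiB-aligned block containing pgoff. If that index is at or past
 * the file size in pages, a huge page there would reach beyond EOF.
 */
bool pmd_extends_past_eof(uint64_t pgoff, uint64_t size_pages)
{
	return (pgoff | TOY_PG_PMD_COLOUR) >= size_pages;
}
```

With a 512-page (2 MiB) file, `pmd_extends_past_eof(0, 512)` is false, so the huge page fits; with a 511-page file it is true and the fault would fall back to PTEs. The test therefore says nothing about alignment, which is why "pgoff unaligned" was misleading.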