| Commit message | Author | Age |
|
The extent_io writepage calls needed an extra check for discarding
pages that started on the last byte in the file.
btrfs_truncate_page needed checks to make sure the page was still part
of the file after reading it, and most importantly, needed to wait for
all IO to the page to finish before freeing the corresponding extents on
disk.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
When duplicate copies exist, writes are allowed to fail to one of those
copies. This changeset includes a few changes that allow the FS to
continue even when some IOs fail.
It also adds verification of the parent generation number for btree blocks.
This generation is stored in the pointer to a block, and it ensures
that missed writes are detected.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
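The parent generation check described in the entry above can be illustrated with a minimal sketch. The struct layouts and the names block_ptr, block_header, verify_parent_generation and pick_valid_copy are hypothetical and are not taken from the btrfs source; falling back to a second copy is an assumption based on the mention of duplicate copies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the btrfs on-disk structures. */
struct block_ptr {
    uint64_t blocknr;     /* where the child block lives */
    uint64_t generation;  /* generation the parent expects the child to have */
};

struct block_header {
    uint64_t blocknr;
    uint64_t generation;  /* generation recorded when the block was written */
};

/*
 * Return true when the block read from disk matches what the parent
 * pointer expects. A stale generation suggests that the most recent
 * write never reached this copy.
 */
static bool verify_parent_generation(const struct block_ptr *ptr,
                                     const struct block_header *hdr)
{
    return hdr->blocknr == ptr->blocknr &&
           hdr->generation == ptr->generation;
}

/*
 * Assumed fallback policy: try each copy in turn and return the index of
 * the first one that passes the check, or -1 if none does.
 */
static int pick_valid_copy(const struct block_ptr *ptr,
                           const struct block_header *copies, int ncopies)
{
    for (int i = 0; i < ncopies; i++) {
        if (verify_parent_generation(ptr, &copies[i]))
            return i;
    }
    return -1;
}
```

The point of keeping the generation in the pointer rather than only in the block is that a block left over from an older write can still look internally valid; only the parent knows which generation it should be.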
|
Once part of a delalloc request fails the cow checks, just cow the
entire range.
It is possible for the back references to all be from the same root,
but still have snapshots against an extent. The checks are now more strict,
forcing cow any time there are multiple refs against the data extent.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Before, nodatacow only checked to make sure multiple roots didn't have
references on a single extent. This check makes sure that multiple
inodes don't have references.
nodatacow needed an extra check to see if the block group was currently
readonly. This way cows forced by the chunk relocation code are honored.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
This required a few structural changes to the code that manages bdev pointers:
The VFS super block now gets an anon-bdev instead of a pointer to the
lowest bdev. This allows us to avoid swapping the super block bdev pointer
around at run time.
The code to read in the super block no longer goes through the extent
buffer interface. Things got ugly keeping the mapping constant.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Add a new ioctl to clone file data
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
The generic O_DIRECT code assumes all the bios have the same bdev,
which isn't true for multi-device btrfs.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
btrfs_invalidatepage is not allowed to leave pages around on the lru.
Any such pages will trigger an oops later on because the VM will see
page->private and assume it is a buffer head.
This also forces extra flushes of the async work queues before
dropping all the pages on the btree inode during unmount. Left over
items on the work queues are one possible cause of busy state ranges
during truncate_inode_pages.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
retries
The data read retry code needs to find the logical disk block before it
can resubmit new bios. But, finding this block isn't allowed to take
the fs_mutex because that will deadlock with a number of different callers.
This changes the retry code to use the extent map cache instead, but
that requires the extent map cache to have the extent we're looking for.
This is a problem because btrfs_drop_extent_cache just drops the entire
extent instead of the little tiny part it is invalidating.
The bulk of the code in this patch changes btrfs_drop_extent_cache to
invalidate only a portion of the extent cache, and changes btrfs_get_extent
to deal with the results.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
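To illustrate the idea of invalidating only part of a cached extent, as the entry above describes, here is a minimal sketch. The struct cached_extent and the helper drop_range are hypothetical names and do not reflect the real btrfs_drop_extent_cache signature; the sketch assumes a half-open byte range with drop_start < drop_end.

```c
#include <stdint.h>

/* Hypothetical simplified cached extent covering [start, start + len). */
struct cached_extent {
    uint64_t start;
    uint64_t len;
};

/*
 * Remove [drop_start, drop_end) from ext. The surviving pieces (at most
 * two) are written to out[] and their count is returned.
 */
static int drop_range(const struct cached_extent *ext,
                      uint64_t drop_start, uint64_t drop_end,
                      struct cached_extent out[2])
{
    uint64_t ext_end = ext->start + ext->len;
    int n = 0;

    /* No overlap: the extent survives untouched. */
    if (drop_end <= ext->start || drop_start >= ext_end) {
        out[n++] = *ext;
        return n;
    }
    /* Keep the part in front of the dropped range. */
    if (drop_start > ext->start) {
        out[n].start = ext->start;
        out[n].len = drop_start - ext->start;
        n++;
    }
    /* Keep the part behind the dropped range. */
    if (drop_end < ext_end) {
        out[n].start = drop_end;
        out[n].len = ext_end - drop_end;
        n++;
    }
    return n;
}
```

Because the pieces in front of and behind the dropped range stay cached, a later lookup for a block outside that range can still be answered from the cache, which is what the retry path relies on.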
|
This isn't required anymore because we don't reallocate blocks that
have already been written in this transaction.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
This significantly improves streaming write performance by allowing
concurrency in the data checksumming.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
This allows checksumming to happen in parallel among many cpus, and
keeps us from bogging down pdflush with the checksumming code.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
On huge machines, delayed allocation may try to allocate massive extents.
This change allows btrfs_alloc_extent to return something smaller than
the caller asked for, and the data allocation routines will loop over
the allocations until they fill the whole delayed alloc.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
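The allocation loop described in the entry above can be sketched as follows. The allocator alloc_extent, the caller fill_delalloc and the 128 MiB cap MAX_EXTENT_BYTES are all hypothetical; the commit does not state a size limit, and the real btrfs_alloc_extent interface is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical cap on what a single allocation may return. */
#define MAX_EXTENT_BYTES (128ULL * 1024 * 1024)

struct extent {
    uint64_t start;
    uint64_t len;
};

/*
 * Stand-in allocator: hands out space linearly from *next_free and may
 * return less than the caller asked for.
 */
static struct extent alloc_extent(uint64_t *next_free, uint64_t want)
{
    struct extent e;

    e.start = *next_free;
    e.len = want < MAX_EXTENT_BYTES ? want : MAX_EXTENT_BYTES;
    *next_free += e.len;
    return e;
}

/*
 * Keep allocating until the whole delayed allocation is covered.
 * Returns the number of extents used, or -1 if out[] is too small.
 */
static int fill_delalloc(uint64_t *next_free, uint64_t bytes,
                         struct extent *out, int max_out)
{
    int n = 0;

    while (bytes > 0 && n < max_out) {
        out[n] = alloc_extent(next_free, bytes);
        bytes -= out[n].len;  /* shrink the request by what we actually got */
        n++;
    }
    return bytes == 0 ? n : -1;
}
```

The caller only needs to track the remaining byte count, so it does not have to know in advance how large an extent the allocator is willing to return.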
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
This adds basic O_DIRECT read and write support. In the write case, we
just do a normal buffered write followed by a cache flush. O_DIRECT +
O_SYNC are required to trigger metadata syncs.
In the read case, there is a basic btrfs_get_block call for use by
the generic O_DIRECT code. This does honor multi-volume mapping rules
but it skips all checksumming.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Before, it was done by the bio end_io routine. The work queue code is able
to scale much better with faster IO subsystems.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
This allows intelligent versions of unplug and congestion functions.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Remove the btrfs read_inode method, and use save_mount_options
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
When we checksum file data during writepage, the checksumming is done one
page at a time, making it difficult to do bulk metadata modifications
to insert checksums for large ranges of the file at once.
This patch changes btrfs to checksum on a per-bio basis instead. The
bios are checksummed before they are handed off to the block layer, so
each bio is contiguous and only has pages from the same inode.
Checksumming on a bio basis allows us to insert and modify the file
checksum items in large groups. It also allows the checksumming to
be done more easily by async worker threads.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
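As a rough illustration of checksumming per bio rather than per page, as the entry above describes, the sketch below computes one checksum for every page of a contiguous buffer in a single pass, so the results can be inserted as one batch. The struct fake_bio, the 4096-byte PAGE_BYTES and the helper checksum_bio are hypothetical, and CRC-32C is used only as an example algorithm because the commit message does not name one.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096  /* assumed page size for this sketch */

/* Bitwise CRC-32C (Castagnoli polynomial); chosen for illustration only. */
static uint32_t crc32c(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical stand-in for a bio: contiguous pages from one inode. */
struct fake_bio {
    uint64_t file_offset;  /* offset of the first page in the file */
    size_t npages;
    const uint8_t *data;   /* npages * PAGE_BYTES contiguous bytes */
};

/*
 * Checksum every page of the bio in one pass. sums[] must hold
 * bio->npages entries; the caller can then insert them as a group.
 */
static void checksum_bio(const struct fake_bio *bio, uint32_t *sums)
{
    for (size_t i = 0; i < bio->npages; i++)
        sums[i] = crc32c(bio->data + i * PAGE_BYTES, PAGE_BYTES);
}
```

Since the function touches only the bio's own data and its output array, it is the kind of work that could be handed to a worker thread without extra locking on the caller's side.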
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Now that delayed allocation accounting works, i_blocks accounting is changed
to only modify i_blocks when extents are inserted or removed.
The fillattr call is changed to include the delayed allocation byte count
in the i_blocks result.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
---
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
---
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
The end_bio routines are changed to take a pointer to the extent state
struct, and the state tree is walked in order to set/clear appropriate
bits as IO completes. This greatly reduces the number of rbtree searches
done by the end_bio handlers, and reduces lock contention.
The extent_io releasepage function is changed to avoid expensive searches
for locked state.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|