aboutsummaryrefslogtreecommitdiffstats
path: root/fs/xfs/libxfs
Commit message (Collapse)AuthorAge
...
* xfs: introduce the CoW forkDarrick J. Wong2016-10-04
| | | | | | | | | Introduce a new in-core fork for storing copy-on-write delalloc reservations and allocated extents that are in the process of being written out. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: don't allow reflinked dir/dev/fifo/socket/pipe filesDarrick J. Wong2016-10-04
| | | | | | | | Only non-rt files can be reflinked, so check that when we load an inode. Also, don't leak the attr fork if there's a failure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: add reflink feature flag to geometryDarrick J. Wong2016-10-04
| | | | | | | | Report the reflink feature in the XFS geometry so that xfs_info and friends know the filesystem has this feature. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: return work remaining at the end of a bunmapi operationDarrick J. Wong2016-10-04
| | | | | | | | | | Return the range of file blocks that bunmapi didn't free. This hint is used by CoW and reflink to figure out what part of an extent actually got freed so that it can set up the appropriate atomic remapping of just the freed range. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: implement deferred bmbt map/unmap operationsDarrick J. Wong2016-10-04
| | | | | | | | | Implement deferred versions of the inode block map/unmap functions. These will be used in subsequent patches to make reflink operations atomic. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: pass bmapi flags through to bmap_del_extentDarrick J. Wong2016-10-04
| | | | | | | | | | Pass BMAPI_ flags from bunmapi into bmap_del_extent and extend BMAPI_REMAP (which means "don't touch the allocator or the quota accounting") to apply to bunmapi as well. This will be used to implement the unmap operation, which will be used by swapext. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: map an inode's offset to an exact physical blockDarrick J. Wong2016-10-04
| | | | | | | | | | Teach the bmap routine to know how to map a range of file blocks to a specific range of physical blocks, instead of simply allocating fresh blocks. This enables reflink to map a file to blocks that are already in use. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: log bmap intent itemsDarrick J. Wong2016-10-04
| | | | | | | | | Provide a mechanism for higher levels to create BUI/BUD items, submit them to the log, and a stub function to deal with recovered BUI items. These parts will be connected to the rmapbt in a later patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: create bmbt update intent log itemsDarrick J. Wong2016-10-04
| | | | | | | | | | | | Create bmbt update intent/done log items to record redo information in the log. Because we roll transactions multiple times for reflink operations, we also have to track the status of the metadata updates that will be recorded in the post-roll transactions in case we crash before committing the final transaction. This mechanism enables log recovery to finish what was already started. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: introduce reflink utility functionsDarrick J. Wong2016-10-03
| | | | | | | | These functions will be used by the other reflink functions to find the maximum length of a range of shared blocks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.coM> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: reserve AG space for the refcount btree rootDarrick J. Wong2016-10-03
| | | | | | | | Reduce the max AG usable space size so that we always have space for the refcount btree root. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: adjust refcount when unmapping file blocksDarrick J. Wong2016-10-03
| | | | | | | | | When we're unmapping blocks from a reflinked file, decrease the refcount of the affected blocks and free the extents that are no longer in use. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: connect refcount adjust functions to upper layersDarrick J. Wong2016-10-03
| | | | | | | | Plumb in the upper level interface to schedule and finish deferred refcount operations via the deferred ops mechanism. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: adjust refcount of an extent of blocks in refcount btreeDarrick J. Wong2016-10-03
| | | | | | | | Provide functions to adjust the reference counts for an extent of physical blocks stored in the refcount btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* xfs: log refcount intent itemsDarrick J. Wong2016-10-03
| | | | | | | | | Provide a mechanism for higher levels to create CUI/CUD items, submit them to the log, and a stub function to deal with recovered CUI items. These parts will be connected to the refcountbt in a later patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: create refcount update intent log itemsDarrick J. Wong2016-10-03
| | | | | | | | | | | | | Create refcount update intent/done log items to record redo information in the log. Because we need to roll transactions between updating the bmbt mapping and updating the reverse mapping, we also have to track the status of the metadata updates that will be recorded in the post-roll transactions, just in case we crash before committing the final transaction. This mechanism enables log recovery to finish what was already started. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: add refcount btree operationsDarrick J. Wong2016-10-03
| | | | | | | | | | | | | Implement the generic btree operations required to manipulate refcount btree blocks. The implementation is similar to the bmapbt, though it will only allocate and free blocks from the AG. Since the refcount root and level fields are separate from the existing roots and levels array, they need a separate logging flag. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> [hch: fix logging of AGF refcount btree fields] Signed-off-by: Christoph Hellwig <hch@lst.de>
* xfs: account for the refcount btree in the alloc/free log reservationDarrick J. Wong2016-10-03
| | | | | | | | | | | | | | | | | | | | | Every time we allocate or free a data extent, we might need to split the refcount btree. Reserve some blocks in the transaction to handle this possibility. Even though the deferred refcount code can roll a transaction to avoid overloading the transaction, we can still exceed the reservation. Certain pathological workloads (1k blocks, no cowextsize hint, random directio writes), cause a perfect storm wherein a refcount adjustment of a large range of blocks causes full tree splits in two separate extents in two separate refcount tree blocks; allocating new refcount tree blocks causes rmap btree splits; and all the allocation activity causes the freespace btrees to split, blowing the reservation. (Reproduced by generic/167 over NFS atop XFS) Signed-off-by: Christoph Hellwig <hch@lst.de> [darrick.wong@oracle.com: add commit message] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: define the on-disk refcount btree formatDarrick J. Wong2016-10-03
| | | | | | | | | | Start constructing the refcount btree implementation by establishing the on-disk format and everything needed to read, write, and manipulate the refcount btree blocks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: refcount btree add more reserved blocksDarrick J. Wong2016-10-03
| | | | | | | | | Since XFS reserves a small amount of space in each AG as the minimum free space needed for an operation, save some more space in case we touch the refcount btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: introduce refcount btree definitionsDarrick J. Wong2016-10-03
| | | | | | | | Add new per-AG refcount btree definitions to the per-AG structures. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
* Merge branch 'xfs-4.9-log-recovery-fixes' into for-nextDave Chinner2016-10-02
|\
| * xfs: remote attribute blocks aren't really userdataDave Chinner2016-09-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding a new remote attribute, we write the attribute to the new extent before the allocation transaction is committed. This means we cannot reuse busy extents as that violates crash consistency semantics. Hence we currently treat remote attribute extent allocation like userdata because it has the same overwrite ordering constraints as userdata. Unfortunately, this also allows the allocator to incorrectly apply extent size hints to the remote attribute extent allocation. This results in interesting failures, such as transaction block reservation overruns and in-memory inode attribute fork corruption. To fix this, we need to separate the busy extent reuse configuration from the userdata configuration. This changes the definition of XFS_BMAPI_METADATA slightly - it now means that allocation is metadata and reuse of busy extents is acceptible due to the metadata ordering semantics of the journal. If this flag is not set, it means the allocation is that has unordered data writeback, and hence busy extent reuse is not allowed. It no longer implies the allocation is for user data, just that the data write will not be strictly ordered. This matches the semantics for both user data and remote attribute block allocation. As such, This patch changes the "userdata" field to a "datatype" field, and adds a "no busy reuse" flag to the field. When we detect an unordered data extent allocation, we immediately set the no reuse flag. We then set the "user data" flags based on the inode fork we are allocating the extent to. Hence we only set userdata flags on data fork allocations now and consider attribute fork remote extents to be an unordered metadata extent. The result is that remote attribute extents now have the expected allocation semantics, and the data fork allocation behaviour is completely unchanged. It should be noted that there may be other ways to fix this (e.g. use ordered metadata buffers for the remote attribute extent data write) but they are more invasive and difficult to validate both from a design and implementation POV. Hence this patch takes the simple, obvious route to fixing the problem... Reported-and-tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | Merge branch 'xfs-4.9-delalloc-rework' into for-nextDave Chinner2016-10-02
|\ \
| * | xfs: rewrite and optimize the delalloc write pathChristoph Hellwig2016-09-18
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently xfs_iomap_write_delay does up to lookups in the inode extent tree, which is rather costly especially with the new iomap based write path and small write sizes. But it turns out that the low-level xfs_bmap_search_extents gives us all the information we need in the regular delalloc buffered write path: - it will return us an extent covering the block we are looking up if it exists. In that case we can simply return that extent to the caller and are done - it will tell us if we are beyoned the last current allocated block with an eof return parameter. In that case we can create a delalloc reservation and use the also returned information about the last extent in the file as the hint to size our delalloc reservation. - it can tell us that we are writing into a hole, but that there is an extent beyoned this hole. In this case we can create a delalloc reservation that covers the requested size (possible capped to the next existing allocation). All that can be done in one single routine instead of bouncing up and down a few layers. This reduced the CPU overhead of the block mapping routines and also simplified the code a lot. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: set up per-AG free space reservationsDarrick J. Wong2016-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One unfortunate quirk of the reference count and reverse mapping btrees -- they can expand in size when blocks are written to *other* allocation groups if, say, one large extent becomes a lot of tiny extents. Since we don't want to start throwing errors in the middle of CoWing, we need to reserve some blocks to handle future expansion. The transaction block reservation counters aren't sufficient here because we have to have a reserve of blocks in every AG, not just somewhere in the filesystem. Therefore, create two per-AG block reservation pools. One feeds the AGFL so that rmapbt expansion always succeeds, and the other feeds all other metadata so that refcountbt expansion never fails. Use the count of how many reserved blocks we need to have on hand to create a virtual reservation in the AG. Through selective clamping of the maximum length of allocation requests and of the length of the longest free extent, we can make it look like there's less free space in the AG unless the reservation owner is asking for blocks. In other words, play some accounting tricks in-core to make sure that we always have blocks available. On the plus side, there's nothing to clean up if we crash, which is contrast to the strategy that the rough draft used (actually removing extents from the freespace btrees). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: defer should allow ->finish_item to request a new transactionDarrick J. Wong2016-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When xfs_defer_finish calls ->finish_item, it's possible that (refcount) won't be able to finish all the work in a single transaction. When this happens, the ->finish_item handler should shorten the log done item's list count, update the work item to reflect where work should continue, and return -EAGAIN so that defer_finish knows to retain the pending item on the pending list, roll the transaction, and restart processing where we left off. Plumb in the code and document how this mechanism is supposed to work. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* | xfs: count the blocks in a btreeDarrick J. Wong2016-09-18
| | | | | | | | | | | | | | | | | | | | | | Provide a helper method to count the number of blocks in a short form btree. The refcount and rmap btrees need to know the number of blocks already in use to set up their per-AG block reservations during mount. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: create a standard btree size calculator codeDarrick J. Wong2016-09-18
| | | | | | | | | | | | | | | | | | | | Create a helper to generate AG btree height calculator functions. This will be used (much) later when we get to the refcount btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: remove xfs_btree_bigkeyDarrick J. Wong2016-09-18
| | | | | | | | | | | | | | | | | | | | Remove the xfs_btree_bigkey mess and simply make xfs_btree_key big enough to hold both keys in-core. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: convert RUI log formats to use variable length arraysDarrick J. Wong2016-09-18
|/ | | | | | | | | | Use variable length array declarations for RUI log items, and replace the open coded sizeof formulae with a single function. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: track log done items directly in the deferred pending work itemDarrick J. Wong2016-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Christoph reports slab corruption when a deferred refcount update aborts during _defer_finish(). The cause of this was broken log item state tracking in xfs_defer_pending -- upon an abort, _defer_trans_abort() will call abort_intent on all intent items, including the ones that have already had a done item attached. This is incorrect because each intent item has 2 refcount: the first is released when the intent item is committed to the log; and the second is released when the _done_ item is committed to the log, or by the intent creator if there is no done item. In other words, once we log the done item, responsibility for releasing the intent item's second refcount is transferred to the done item and /must not/ be performed by anything else. The dfp_committed flag should have been tracking whether or not we had a done item so that _defer_trans_abort could decide if it needs to abort the intent item, but due to a thinko this was not the case. Rip it out and track the done item directly so that we do the right thing w.r.t. intent item freeing. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reported-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: fix superblock inprogress checkDave Chinner2016-08-26
| | | | | | | | | | | | | | | | | | | | From inspection, the superblock sb_inprogress check is done in the verifier and triggered only for the primary superblock via a "bp->b_bn == XFS_SB_DADDR" check. Unfortunately, the primary superblock is an uncached buffer, and hence it is configured by xfs_buf_read_uncached() with: bp->b_bn = XFS_BUF_DADDR_NULL; /* always null for uncached buffers */ And so this check never triggers. Fix it. cc: <stable@vger.kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: simple btree query range should look right if LE lookup failsDarrick J. Wong2016-08-26
| | | | | | | | | | | | | | | If the initial LOOKUP_LE in the simple query range fails to find anything, we should attempt to increment the btree cursor to see if there actually /are/ records for what we're trying to find. Without this patch, a bnobt range query of (0, $agsize) returns no results because the leftmost record never has a startblock of zero. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: fix some key handling problems in _btree_simple_query_rangeDarrick J. Wong2016-08-26
| | | | | | | | | | | | We only need the record's high key for the first record that we look at; for all records, we /definitely/ need the regular record key. Therefore, fix how the simple range query function gets its keys. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: don't log the entire end of the AGFDarrick J. Wong2016-08-26
| | | | | | | | | | | | When we're logging the last non-spare field in the AGF, we don't need to log the spare fields, so plumb in a new AGF logging flag to help us avoid that. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: don't perform lookups on zero-height btreesDarrick J. Wong2016-08-26
| | | | | | | | | | | | If the caller passes in a cursor to a zero-height btree (which is impossible), we never set block to anything but NULL, which causes the later dereference of it to crash. Instead, just return -EFSCORRUPTED. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: remove OWN_AG rmap when allocating a block from the AGFLDarrick J. Wong2016-08-16
| | | | | | | | | | | | | | | | When we're really tight on space, xfs_alloc_ag_vextent_small() can allocate a block from the AGFL and give it to the caller. Since the caller is never the AGFL-fixing method, we must remove the OWN_AG reverse mapping because it will clash with whatever rmap the caller wants to set up. This bug was discovered by running generic/299 repeatedly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: store rmapbt block count in the AGFDarrick J. Wong2016-08-16
| | | | | | | | | | | | Track the number of blocks used for the rmapbt in the AGF. When we get to the AG reservation code we need this counter to quickly make our reservation during mount. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: move (and rename) the deferred bmap-free tracepointsDarrick J. Wong2016-08-02
| | | | | | | | | | Rename the deferred bmap-free to extent_free and make them only trigger when we're really running deferred ops. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: remove the extents array from the rmap update done log itemDarrick J. Wong2016-08-02
| | | | | | | | | | Nothing ever uses the extent array in the rmap update done redo item, so remove it before it is fixed in the on-disk log format. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: in btree_lshift, only allocate temporary cursor when neededDarrick J. Wong2016-08-02
| | | | | | | | | | | | | | We only need the temporary cursor in _btree_lshift if we're shifting in an overlapped btree. Therefore, factor that into a single block of code so we avoid unnecessary cursor duplication. Also fix use of the wrong cursor when checking for corruption in xfs_btree_rshift(). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: remove unnecesary lshift/rshift key initializationDarrick J. Wong2016-08-02
| | | | | | | | | | | In the lshift/rshift functions we don't use the key variable for anything now, so remove the variable and its initializer. The update_keys functions figure out the key for a block on their own. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: remove the get*keys and update_keys btree ops pointersDarrick J. Wong2016-08-02
| | | | | | | | | | | | | These are internal btree functions; we don't need them to be dispatched via function pointers. Make them static again and just check the overlapped flag to figure out what we need to do. The strategy behind this patch was suggested by Christoph. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: enable the rmap btree functionalityDarrick J. Wong2016-08-02
| | | | | | | | | | | | | | Originally-From: Dave Chinner <dchinner@redhat.com> Add the feature flag to the supported matrix so that the kernel can mount and use rmap btree enabled filesystems Signed-off-by: Dave Chinner <dchinner@redhat.com> [darrick.wong@oracle.com: move the experimental tag] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: don't update rmapbt when fixing agflDarrick J. Wong2016-08-02
| | | | | | | | | | | | | Allow a caller of xfs_alloc_fix_freelist to disable rmapbt updates when fixing the AG freelist. xfs_repair needs this during phase 5 to be able to adjust the freelist while it's reconstructing the rmap btree; the missing entries will be added back at the very end of phase 5 once the AGFL contents settle down. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: add rmap btree geometry feature flagDarrick J. Wong2016-08-02
| | | | | | | | | | | | Originally-From: Dave Chinner <dchinner@redhat.com> So xfs_info and other userspace utilities know the filesystem is using this feature. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: propagate bmap updates to rmapbtDarrick J. Wong2016-08-02
| | | | | | | | | | | | | | | | | | | | | | When we map, unmap, or convert an extent in a file's data or attr fork, schedule a respective update in the rmapbt. Previous versions of this patch required a 1:1 correspondence between bmap and rmap, but this is no longer true as we now have ability to make interval queries against the rmapbt. We use the deferred operations code to handle redo operations atomically and deadlock free. This plumbs in all five rmap actions (map, unmap, convert extent, alloc, free); we'll use the first three now for file data, and reflink will want the last two. We also add an error injection site to test log recovery. Finally, we need to fix the bmap shift extent code to adjust the rmaps correctly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: enable the xfs_defer mechanism to process rmaps to updateDarrick J. Wong2016-08-02
| | | | | | | | | | | Connect the xfs_defer mechanism with the pieces that we'll need to handle deferred rmap updates. We'll wire up the existing code to our new deferred mechanism later. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: create rmap update intent log itemsDarrick J. Wong2016-08-02
| | | | | | | | | | | | | | | Create rmap update intent/done log items to record redo information in the log. Because we need to roll transactions between updating the bmbt mapping and updating the reverse mapping, we also have to track the status of the metadata updates that will be recorded in the post-roll transactions, just in case we crash before committing the final transaction. This mechanism enables log recovery to finish what was already started. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>