author Brian Foster <bfoster@redhat.com> 2019-04-12 10:39:21 -0400
committer Darrick J. Wong <darrick.wong@oracle.com> 2019-04-14 21:15:57 -0400
commit 1ca89fbc48e1ea5044997328e403f8a13513e8c3
tree a9625a1edd3c50291a9eed4d01b368e5a734a534
parent 22fedd80b652213e694b788e9389892b67b86286
xfs: don't account extra agfl blocks as available
The block allocation AG selection code has parameters that allow a caller to perform multiple allocations from a single AG and transaction (under certain conditions). The parameters specify the total block allocation count required by the transaction and the AG selection code selects and locks an AG that will be able to satisfy the overall requirement. If the available block accounting calculation turns out to be inaccurate and a subsequent allocation call fails with -ENOSPC, the resulting transaction cancel leads to filesystem shutdown because the transaction is dirty.

This exact problem can be reproduced with a highly parallel space consumer and fsstress workload run long enough against a large filesystem under -ENOSPC conditions. A bmbt block allocation request made for inode extent to bmap format conversion after an extent allocation is expected to be satisfied by the same AG and the same transaction as the extent allocation. The bmbt block allocation fails, however, because the block availability of the AG has changed since the AG was selected (outside of the blocks used for the extent itself).

The inconsistent block availability calculation is caused by the deferred block freeing behavior of the AGFL. This immediately removes extra blocks from the AGFL to free up AGFL slots, but rather than immediately freeing such blocks as was done in the past, the block free is deferred such that said blocks are not available for allocation until the current transaction commits. The AG selection logic currently considers all AGFL blocks as available and executes shortly before any extra AGFL blocks are freed. This means the block availability of the current AG can change before the first allocation even occurs, but in practice a failure is more likely to manifest via a subsequent allocation because extent allocation usually has a contiguity requirement larger than a single block that can't be satisfied from the AGFL.

In general, XFS prefers operational robustness to absolute allocation efficiency. In other words, we prefer to return -ENOSPC slightly earlier, at the expense of not being able to allocate every last block in an AG, to avoid this kind of problem. As such, update the AG block availability calculation to consider extra AGFL blocks as unavailable since they are immediately removed following the calculation and will not become available until the current transaction commits.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Diffstat (limited to 'fs/xfs')
-rw-r--r-- fs/xfs/libxfs/xfs_alloc.c | 10
1 files changed, 8 insertions, 2 deletions
diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index bc3367b8b7bb..857a53e58b94 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2042,6 +2042,7 @@ xfs_alloc_space_available(
 	xfs_extlen_t		alloc_len, longest;
 	xfs_extlen_t		reservation; /* blocks that are still reserved */
 	int			available;
+	xfs_extlen_t		agflcount;
 
 	if (flags & XFS_ALLOC_FLAG_FREEING)
 		return true;
@@ -2054,8 +2055,13 @@ xfs_alloc_space_available(
 	if (longest < alloc_len)
 		return false;
 
-	/* do we have enough free space remaining for the allocation? */
-	available = (int)(pag->pagf_freeblks + pag->pagf_flcount -
+	/*
+	 * Do we have enough free space remaining for the allocation? Don't
+	 * account extra agfl blocks because we are about to defer free them,
+	 * making them unavailable until the current transaction commits.
+	 */
+	agflcount = min_t(xfs_extlen_t, pag->pagf_flcount, min_free);
+	available = (int)(pag->pagf_freeblks + agflcount -
 			  reservation - min_free - args->minleft);
 	if (available < (int)max(args->total, alloc_len))
 		return false;
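
The effect of the change can be seen with a small userspace model of the calculation. This is only a sketch: `struct ag_counters`, `available_old()`, `available_new()` and `enough_space()` are hypothetical names that mirror the arithmetic in the hunk above, not kernel interfaces, and the numbers used afterwards are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-AG counters read by the check. */
struct ag_counters {
	uint32_t freeblks;	/* models pag->pagf_freeblks */
	uint32_t flcount;	/* models pag->pagf_flcount (blocks on the AGFL) */
};

/* Old calculation: every AGFL block is counted as available. */
int available_old(const struct ag_counters *ag, uint32_t reservation,
		  uint32_t min_free, uint32_t minleft)
{
	return (int)(ag->freeblks + ag->flcount -
		     reservation - min_free - minleft);
}

/*
 * New calculation: AGFL blocks beyond min_free are about to be defer
 * freed, so only min(flcount, min_free) of them are counted.
 */
int available_new(const struct ag_counters *ag, uint32_t reservation,
		  uint32_t min_free, uint32_t minleft)
{
	uint32_t agflcount = ag->flcount < min_free ? ag->flcount : min_free;

	return (int)(ag->freeblks + agflcount -
		     reservation - min_free - minleft);
}

/* Final comparison against the larger of total and alloc_len. */
bool enough_space(int available, uint32_t total, uint32_t alloc_len)
{
	uint32_t need = total > alloc_len ? total : alloc_len;

	return available >= (int)need;
}
```

With illustrative values of 10 free blocks, 12 AGFL blocks, `min_free` of 8, and no reservation or `minleft`, the old formula reports 10 + 12 - 8 = 14 available blocks while the new one reports 10 + 8 - 8 = 10. A transaction needing 12 blocks in total would therefore have selected this AG under the old check and be rejected under the new one, which is the earlier -ENOSPC the commit message describes. When the AGFL holds no more than `min_free` blocks, both formulas give the same result.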