aboutsummaryrefslogtreecommitdiffstats
path: root/fs/xfs
diff options
context:
space:
mode:
authorDave Chinner <dchinner@redhat.com>2011-05-05 22:54:08 -0400
committerAlex Elder <aelder@sgi.com>2011-05-09 13:17:04 -0400
commite4d3c4a43b595d5124ae824d300626e6489ae857 (patch)
treee9bef16559fd47aa0bbea1d16ed13c2695f618b2 /fs/xfs
parentfd5670f22fce247754243cf2ed41941e5762d990 (diff)
xfs: fix race condition in AIL push trigger
The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. One is caused by a race condition in determining whether there is a psh in progress or not. The XFS_AIL_PUSHING_BIT is used to determine whether a push is currently in progress. When the AIL push work completes, it checked whether the target changed and cleared the PUSHING bit to allow a new push to be requeued. The race condition is as follows: Thread 1 push work smp_wmb() smp_rmb() check ailp->xa_target unchanged update ailp->xa_target test/set PUSHING bit does not queue clear PUSHING bit does not requeue Now that the push target is updated, new attempts to push the AIL will not trigger as the push target will be the same, and hence despite trying to push the AIL we won't ever wake it again. The fix is to ensure that the AIL push work clears the PUSHING bit before it checks if the target is unchanged. As a result, both push triggers operate on the same test/set bit criteria, so even if we race in the push work and miss the target update, the thread requesting the push will still set the PUSHING bit and queue the push work to occur. For safety sake, the same queue check is done if the push work detects the target change, though only one of the two will will queue new work due to the use of test_and_set_bit() checks. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
Diffstat (limited to 'fs/xfs')
-rw-r--r--fs/xfs/xfs_trans_ail.c16
1 files changed, 10 insertions, 6 deletions
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index d7eebbf71362..5fc2380092c8 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -487,15 +487,19 @@ out_done:
487 ailp->xa_last_pushed_lsn = 0; 487 ailp->xa_last_pushed_lsn = 0;
488 488
489 /* 489 /*
490 * Check for an updated push target before clearing the 490 * We clear the XFS_AIL_PUSHING_BIT first before checking
491 * XFS_AIL_PUSHING_BIT. If the target changed, we've got more 491 * whether the target has changed. If the target has changed,
492 * work to do. Wait a bit longer before starting that work. 492 * this pushes the requeue race directly onto the result of the
493 * atomic test/set bit, so we are guaranteed that either the
494 * the pusher that changed the target or ourselves will requeue
495 * the work (but not both).
493 */ 496 */
497 clear_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags);
494 smp_rmb(); 498 smp_rmb();
495 if (XFS_LSN_CMP(ailp->xa_target, target) == 0) { 499 if (XFS_LSN_CMP(ailp->xa_target, target) == 0 ||
496 clear_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags); 500 test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags))
497 return; 501 return;
498 } 502
499 tout = 50; 503 tout = 50;
500 } else if (XFS_LSN_CMP(lsn, target) >= 0) { 504 } else if (XFS_LSN_CMP(lsn, target) >= 0) {
501 /* 505 /*