xfs: Use delayed write for inodes rather than async V2

We currently do background inode flush asynchronously, resulting in inodes being written in whatever order the background writeback issues them. Not only that, there are also blocking and non-blocking asynchronous inode flushes, depending on where the flush comes from.

This patch completely removes asynchronous inode writeback. It removes all the strange writeback modes and replaces them with either a synchronous flush or a non-blocking delayed write flush. That is, inode flushes will only issue IO directly if they are synchronous, and background flushing may do nothing if the operation would block (e.g. on a pinned inode or buffer lock).

Delayed write flushes will now result in the inode buffer sitting in the delwri queue of the buffer cache to be flushed by either an AIL push or by the xfsbufd timing out the buffer. This will allow accumulation of dirty inode buffers in memory and allow optimisation of inode cluster writeback at the xfsbufd level where we have much greater queue depths than the block layer elevators. We will also get adjacent inode cluster buffer IO merging for free when a later patch in the series allows sorting of the delayed write buffers before dispatch.

This effectively means that any inode that is written back by background writeback will be seen as flush locked during AIL pushing, and will result in the buffers being pushed from there. This writeback path is currently non-optimal, but the next patch in the series will fix that problem.

A side effect of this delayed write mechanism is that background inode reclaim will no longer directly flush inodes, nor can it wait on the flush lock. The result is that inode reclaim must leave the inode in the reclaimable state until it is clean. Hence attempts to reclaim a dirty inode in the background will simply skip the inode until it is clean and this allows other mechanisms (i.e. xfsbufd) to do more optimal writeback of the dirty buffers. As a result, the inode reclaim code has been rewritten so that it no longer relies on the ambiguous return values of xfs_iflush() to determine whether it is safe to reclaim an inode.

Portions of this patch are derived from patches by Christoph Hellwig.

Version 2:
- cleanup reclaim code as suggested by Christoph
- log background reclaim inode flush errors
- just pass sync flags to xfs_iflush

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
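
The two flush modes described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: `toy_inode`, `toy_iflush()`, `toy_buffer_written()`, `toy_can_reclaim()` and `TOY_SYNC_WAIT` are hypothetical names invented for illustration, and the real `xfs_iflush()` deals with locking, clustering and error handling that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SYNC_WAIT flag used by the patch. */
#define TOY_SYNC_WAIT 0x1

enum toy_flush_result {
    TOY_FLUSH_WRITTEN,  /* synchronous flush: IO issued directly */
    TOY_FLUSH_QUEUED,   /* delayed write: buffer now sits on the delwri queue */
    TOY_FLUSH_SKIPPED   /* non-blocking flush would have blocked, did nothing */
};

/* Toy inode: only the state this illustration needs. */
struct toy_inode {
    bool dirty;       /* in-memory inode not yet on disk */
    bool pinned;      /* still pinned by the log */
    bool buf_locked;  /* backing buffer lock held by someone else */
    bool in_delwri;   /* backing buffer queued for delayed write */
};

/* Model of the two remaining modes: synchronous or non-blocking delwri. */
enum toy_flush_result toy_iflush(struct toy_inode *ip, int flags)
{
    assert(ip->dirty);  /* only dirty inodes need flushing */

    if (flags & TOY_SYNC_WAIT) {
        /* A synchronous flush may wait for the pin and the buffer lock. */
        ip->pinned = false;
        ip->buf_locked = false;
        ip->in_delwri = false;
        ip->dirty = false;
        return TOY_FLUSH_WRITTEN;
    }

    /* A background flush must not block: give up and retry later. */
    if (ip->pinned || ip->buf_locked)
        return TOY_FLUSH_SKIPPED;

    /* Otherwise queue the buffer; something else writes it out later. */
    ip->in_delwri = true;
    return TOY_FLUSH_QUEUED;
}

/* Model of the delwri buffer being written (e.g. by an AIL push or xfsbufd). */
void toy_buffer_written(struct toy_inode *ip)
{
    ip->in_delwri = false;
    ip->dirty = false;
}

/* Background reclaim leaves the inode alone until it is clean. */
bool toy_can_reclaim(const struct toy_inode *ip)
{
    return !ip->dirty;
}
```

In this model a pinned inode flushed with flags of 0 returns `TOY_FLUSH_SKIPPED` and stays unreclaimable, which mirrors the "skip the inode until it is clean" behaviour the message describes.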
author: Dave Chinner <david@fromorbit.com> 2010-02-05 20:39:36 -0500
committer: Dave Chinner <david@fromorbit.com> 2010-02-05 20:39:36 -0500
commit: c854363e80b49dd04a4de18ebc379eb8c8806674 (patch)
tree: 8c8d0dec26d961631a3cd8b6c402b5d1444336e5 /fs/xfs/xfs_mount.c
parent: 777df5afdb26c71634edd60582be620ff94e87a0 (diff)
1 files changed, 12 insertions, 1 deletions
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 5061149b2cc4..6afaaeb2950a 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1468,7 +1468,18 @@ xfs_unmountfs(
         * need to force the log first.
         */
        xfs_log_force(mp, XFS_LOG_SYNC);
-        xfs_reclaim_inodes(mp, XFS_IFLUSH_ASYNC);
+        /*
+         * Do a delwri reclaim pass first so that as many dirty inodes are
+         * queued up for IO as possible. Then flush the buffers before making
+         * a synchronous path to catch all the remaining inodes are reclaimed.
+         * This makes the reclaim process as quick as possible by avoiding
+         * synchronous writeout and blocking on inodes already in the delwri
+         * state as much as possible.
+         */
+        xfs_reclaim_inodes(mp, 0);
+        XFS_bflush(mp->m_ddev_targp);
+        xfs_reclaim_inodes(mp, SYNC_WAIT);
        xfs_qm_unmount(mp);
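
The three-step sequence this hunk adds to `xfs_unmountfs()` (delwri reclaim pass, buffer flush, synchronous reclaim pass) can be sketched as a standalone userspace model. Everything prefixed `demo_` or `DEMO_` is a hypothetical stand-in chosen for illustration; it is not the kernel code and ignores locking, inode clusters and IO errors.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SYNC_WAIT. */
#define DEMO_SYNC_WAIT 0x1

enum demo_state { DEMO_CLEAN, DEMO_DIRTY, DEMO_DELWRI, DEMO_RECLAIMED };

struct demo_inode {
    enum demo_state state;
    bool pinned;  /* a pinned inode cannot be flushed without blocking */
};

/* One reclaim pass over n inodes; returns how many were reclaimed. */
size_t demo_reclaim_inodes(struct demo_inode *v, size_t n, int flags)
{
    size_t reclaimed = 0;

    for (size_t i = 0; i < n; i++) {
        struct demo_inode *ip = &v[i];

        if (ip->state == DEMO_RECLAIMED)
            continue;

        if (flags & DEMO_SYNC_WAIT) {
            /* Synchronous pass: wait out the pin and write the inode. */
            ip->pinned = false;
            ip->state = DEMO_CLEAN;
        } else if (ip->state == DEMO_DIRTY && !ip->pinned) {
            /* Non-blocking pass: queue for delayed write, do not wait. */
            ip->state = DEMO_DELWRI;
        }

        /* Only clean inodes are reclaimed; the rest are skipped. */
        if (ip->state == DEMO_CLEAN) {
            ip->state = DEMO_RECLAIMED;
            reclaimed++;
        }
    }
    return reclaimed;
}

/* Stand-in for XFS_bflush(): write every queued delwri buffer. */
size_t demo_bflush(struct demo_inode *v, size_t n)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (v[i].state == DEMO_DELWRI) {
            v[i].state = DEMO_CLEAN;
            written++;
        }
    }
    return written;
}
```

Calling `demo_reclaim_inodes(v, n, 0)`, then `demo_bflush(v, n)`, then `demo_reclaim_inodes(v, n, DEMO_SYNC_WAIT)` follows the same order as the patch. In the model, the first pass only queues work, the flush writes it in one batch, and the final pass has to wait only for inodes that could not be queued, such as pinned ones.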