diff options
author | Dave Chinner <dchinner@redhat.com> | 2011-04-07 22:45:07 -0400 |
---|---|---|
committer | Dave Chinner <david@fromorbit.com> | 2011-04-07 22:45:07 -0400 |
commit | a7b339f1b8698667eada006e717cdb4523be2ed5 (patch) | |
tree | 77c44400c32284bdcf15829e10d01eb15ddd1d41 /fs/xfs/xfs_mount.h | |
parent | 89e4cb550a492cfca038a555fcc1bdac58822ec3 (diff) |
xfs: introduce background inode reclaim work
Background inode reclaim needs to run more frequently that the XFS
syncd work is run as 30s is too long between optimal reclaim runs.
Add a new periodic work item to the xfs syncd workqueue to run a
fast, non-blocking inode reclaim scan.
Background inode reclaim is kicked by the act of marking inodes for
reclaim. When an AG is first marked as having reclaimable inodes,
the background reclaim work is kicked. It will continue to run
periodically untill it detects that there are no more reclaimable
inodes. It will be kicked again when the first inode is queued for
reclaim.
To ensure shrinker based inode reclaim throttles to the inode
cleaning and reclaim rate but still reclaim inodes efficiently, make it kick the
background inode reclaim so that when we are low on memory we are
trying to reclaim inodes as efficiently as possible. This kick shoul
d not be necessary, but it will protect against failures to kick the
background reclaim when inodes are first dirtied.
To provide the rate throttling, make the shrinker pass do
synchronous inode reclaim so that it blocks on inodes under IO. This
means that the shrinker will reclaim inodes rather than just
skipping over them, but it does not adversely affect the rate of
reclaim because most dirty inodes are already under IO due to the
background reclaim work the shrinker kicked.
These two modifications solve one of the two OOM killer invocations
Chris Mason reported recently when running a stress testing script.
The particular workload trigger for the OOM killer invocation is
where there are more threads than CPUs all unlinking files in an
extremely memory constrained environment. Unlike other solutions,
this one does not have a performance impact on performance when
memory is not constrained or the number of concurrent threads
operating is <= to the number of CPUs.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Diffstat (limited to 'fs/xfs/xfs_mount.h')
-rw-r--r-- | fs/xfs/xfs_mount.h | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index a0ad90e95299..19af0ab0d0c6 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h | |||
@@ -204,6 +204,7 @@ typedef struct xfs_mount { | |||
204 | #endif | 204 | #endif |
205 | struct xfs_mru_cache *m_filestream; /* per-mount filestream data */ | 205 | struct xfs_mru_cache *m_filestream; /* per-mount filestream data */ |
206 | struct delayed_work m_sync_work; /* background sync work */ | 206 | struct delayed_work m_sync_work; /* background sync work */ |
207 | struct delayed_work m_reclaim_work; /* background inode reclaim */ | ||
207 | struct work_struct m_flush_work; /* background inode flush */ | 208 | struct work_struct m_flush_work; /* background inode flush */ |
208 | __int64_t m_update_flags; /* sb flags we need to update | 209 | __int64_t m_update_flags; /* sb flags we need to update |
209 | on the next remount,rw */ | 210 | on the next remount,rw */ |