author Mel Gorman <mgorman@suse.de> 2014-06-04 19:07:14 -0400
committer Linus Torvalds <torvalds@linux-foundation.org> 2014-06-04 19:53:59 -0400
commit 4f9b16a64753d0bb607454347036dc997fd03b82
tree 26169e829082644a3e6ee6bd4a9991b061425c50 /Documentation/sysctl
parent 944d9fec8d7aee3f2e16573e9b6a16634b33f403
mm: disable zone_reclaim_mode by default
When it was introduced, zone_reclaim_mode made sense as NUMA distances punished and workloads were generally partitioned to fit into a NUMA node. NUMA machines are now common but few of the workloads are NUMA-aware and it's routine to see major performance degradation due to zone_reclaim_mode being enabled but relatively few can identify the problem. Those that require zone_reclaim_mode are likely to be able to detect when it needs to be enabled and tune appropriately so lets have a sensible default for the bulk of users.

This patch (of 2):

zone_reclaim_mode causes processes to prefer reclaiming memory from local node instead of spilling over to other nodes. This made sense initially when NUMA machines were almost exclusively HPC and the workload was partitioned into nodes. The NUMA penalties were sufficiently high to justify reclaiming the memory. On current machines and workloads it is often the case that zone_reclaim_mode destroys performance but not all users know how to detect this. Favour the common case and disable it by default. Users that are sophisticated enough to know they need zone_reclaim_mode will detect it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation/sysctl')
-rw-r--r-- Documentation/sysctl/vm.txt | 17
1 files changed, 9 insertions, 8 deletions
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index dd9d0e33b443..5b6da0fb5fbf 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -772,16 +772,17 @@ This is value ORed together of
 2 = Zone reclaim writes dirty pages out
 4 = Zone reclaim swaps pages
 
-zone_reclaim_mode is set during bootup to 1 if it is determined that pages
-from remote zones will cause a measurable performance reduction. The
-page allocator will then reclaim easily reusable pages (those page
-cache pages that are currently not used) before allocating off node pages.
-
-It may be beneficial to switch off zone reclaim if the system is
-used for a file server and all of memory should be used for caching files
-from disk. In that case the caching effect is more important than
+zone_reclaim_mode is disabled by default. For file servers or workloads
+that benefit from having their data cached, zone_reclaim_mode should be
+left disabled as the caching effect is likely to be more important than
 data locality.
 
+zone_reclaim may be enabled if it's known that the workload is partitioned
+such that each partition fits within a NUMA node and that accessing remote
+memory would cause a measurable performance reduction. The page allocator
+will then reclaim easily reusable pages (those page cache pages that are
+currently not used) before allocating off node pages.
+
 Allowing zone reclaim to write out pages stops processes that are
 writing large amounts of data from dirtying pages on other nodes. Zone
 reclaim will write out dirty pages if a zone fills up and so effectively
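
The documentation touched by this patch describes zone_reclaim_mode as a value built by ORing flag bits together. As a rough illustration of how an administrator might check the current setting, here is a minimal Python sketch. It assumes the sysctl is exposed at /proc/sys/vm/zone_reclaim_mode and that bit 1 means "Zone reclaim on" (that bit is listed in vm.txt just above the hunk shown here). The helper names decode_zone_reclaim_mode and read_zone_reclaim_mode are hypothetical and not part of the kernel or of this commit.

```python
from pathlib import Path

# Flag bits as listed in Documentation/sysctl/vm.txt. The entries for 2 and 4
# appear in the hunk above; the entry for 1 comes from the lines just before it.
ZONE_RECLAIM_BITS = {
    1: "Zone reclaim on",
    2: "Zone reclaim writes dirty pages out",
    4: "Zone reclaim swaps pages",
}


def decode_zone_reclaim_mode(value):
    """Return the descriptions of every flag bit set in value (hypothetical helper)."""
    return [desc for bit, desc in ZONE_RECLAIM_BITS.items() if value & bit]


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Read the current setting, or return None if the file is not readable.

    The default path is an assumption; the file may be absent on kernels
    built without NUMA support.
    """
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("zone_reclaim_mode is not available on this system")
    elif mode == 0:
        print("zone_reclaim_mode = 0 (disabled)")
    else:
        flags = "; ".join(decode_zone_reclaim_mode(mode))
        print(f"zone_reclaim_mode = {mode}: {flags}")
```

A value of 0 corresponds to the new default introduced by this commit. A non-zero value on a machine running a workload that is not NUMA-partitioned may be worth investigating, in line with the commit message.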