author		Mel Gorman <mgorman@suse.de>	2014-06-04 19:10:49 -0400
committer	Linus Torvalds <torvalds@linux-foundation.org>	2014-06-04 19:54:12 -0400
commit		1a501907bbea8e6ebb0b16cf6db9e9cbf1d2c813 (patch)
tree		9412cf055064d717f84b79f71ca59ac28828d7ca /mm
parent		d23da150a37c9fe3cc83dbaf71b3e37fd434ed52 (diff)
mm: vmscan: use proportional scanning during direct reclaim and full scan at DEF_PRIORITY
Commit "mm: vmscan: obey proportional scanning requirements for kswapd" ensured that file/anon lists were scanned proportionally for reclaim from kswapd but ignored it for direct reclaim. The intent was to minimse direct reclaim latency but Yuanhan Liu pointer out that it substitutes one long stall for many small stalls and distorts aging for normal workloads like streaming readers/writers. Hugh Dickins pointed out that a side-effect of the same commit was that when one LRU list dropped to zero that the entirety of the other list was shrunk leading to excessive reclaim in memcgs. This patch scans the file/anon lists proportionally for direct reclaim to similarly age page whether reclaimed by kswapd or direct reclaim but takes care to abort reclaim if one LRU drops to zero after reclaiming the requested number of pages. Based on ext4 and using the Intel VM scalability test 3.15.0-rc5 3.15.0-rc5 shrinker proportion Unit lru-file-readonce elapsed 5.3500 ( 0.00%) 5.4200 ( -1.31%) Unit lru-file-readonce time_range 0.2700 ( 0.00%) 0.1400 ( 48.15%) Unit lru-file-readonce time_stddv 0.1148 ( 0.00%) 0.0536 ( 53.33%) Unit lru-file-readtwice elapsed 8.1700 ( 0.00%) 8.1700 ( 0.00%) Unit lru-file-readtwice time_range 0.4300 ( 0.00%) 0.2300 ( 46.51%) Unit lru-file-readtwice time_stddv 0.1650 ( 0.00%) 0.0971 ( 41.16%) The test cases are running multiple dd instances reading sparse files. The results are within the noise for the small test machine. The impact of the patch is more noticable from the vmstats 3.15.0-rc5 3.15.0-rc5 shrinker proportion Minor Faults 35154 36784 Major Faults 611 1305 Swap Ins 394 1651 Swap Outs 4394 5891 Allocation stalls 118616 44781 Direct pages scanned 4935171 4602313 Kswapd pages scanned 15921292 16258483 Kswapd pages reclaimed 15913301 16248305 Direct pages reclaimed 4933368 4601133 Kswapd efficiency 99% 99% Kswapd velocity 670088.047 682555.961 Direct efficiency 99% 99% Direct velocity 207709.217 193212.133 Percentage direct scans 23% 22% Page writes by reclaim 4858.000 6232.000 Page writes file 464 341 Page writes anon 4394 5891 Note that there are fewer allocation stalls even though the amount of direct reclaim scanning is very approximately the same. Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Jan Kara <jack@suse.cz> Cc: Rik van Riel <riel@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm')
-rw-r--r--	mm/vmscan.c	36
1 file changed, 25 insertions, 11 deletions
diff --git a/mm/vmscan.c b/mm/vmscan.c
index cc29fca8d989..9149444f947d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2064,13 +2064,27 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
 	struct blk_plug plug;
-	bool scan_adjusted = false;
+	bool scan_adjusted;
 
 	get_scan_count(lruvec, sc, nr);
 
 	/* Record the original scan target for proportional adjustments later */
 	memcpy(targets, nr, sizeof(nr));
 
+	/*
+	 * Global reclaiming within direct reclaim at DEF_PRIORITY is a normal
+	 * event that can occur when there is little memory pressure e.g.
+	 * multiple streaming readers/writers. Hence, we do not abort scanning
+	 * when the requested number of pages are reclaimed when scanning at
+	 * DEF_PRIORITY on the assumption that the fact we are direct
+	 * reclaiming implies that kswapd is not keeping up and it is best to
+	 * do a batch of work at once. For memcg reclaim one check is made to
+	 * abort proportional reclaim if either the file or anon lru has already
+	 * dropped to zero at the first pass.
+	 */
+	scan_adjusted = (global_reclaim(sc) && !current_is_kswapd() &&
+			 sc->priority == DEF_PRIORITY);
+
 	blk_start_plug(&plug);
 	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
 					nr[LRU_INACTIVE_FILE]) {
@@ -2091,17 +2105,8 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 			continue;
 
 		/*
-		 * For global direct reclaim, reclaim only the number of pages
-		 * requested. Less care is taken to scan proportionally as it
-		 * is more important to minimise direct reclaim stall latency
-		 * than it is to properly age the LRU lists.
-		 */
-		if (global_reclaim(sc) && !current_is_kswapd())
-			break;
-
-		/*
 		 * For kswapd and memcg, reclaim at least the number of pages
-		 * requested. Ensure that the anon and file LRUs shrink
+		 * requested. Ensure that the anon and file LRUs are scanned
 		 * proportionally what was requested by get_scan_count(). We
 		 * stop reclaiming one LRU and reduce the amount scanning
 		 * proportional to the original scan target.
@@ -2109,6 +2114,15 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 		nr_file = nr[LRU_INACTIVE_FILE] + nr[LRU_ACTIVE_FILE];
 		nr_anon = nr[LRU_INACTIVE_ANON] + nr[LRU_ACTIVE_ANON];
 
+		/*
+		 * It's just vindictive to attack the larger once the smaller
+		 * has gone to zero.  And given the way we stop scanning the
+		 * smaller below, this makes sure that we only make one nudge
+		 * towards proportionality once we've got nr_to_reclaim.
+		 */
+		if (!nr_file || !nr_anon)
+			break;
+
 		if (nr_file > nr_anon) {
 			unsigned long scan_target = targets[LRU_INACTIVE_ANON] +
 						targets[LRU_ACTIVE_ANON] + 1;