path: root/mm/vmscan.c
author     Johannes Weiner <hannes@cmpxchg.org>          2010-03-05 16:42:19 -0500
committer  Linus Torvalds <torvalds@linux-foundation.org>  2010-03-06 14:26:27 -0500
commit     dfc8d636cdb95f7b792d5ba8c9f3b295809c125d (patch)
tree       90070c49adb5a8833d8fc034bc94cc696797e22e /mm/vmscan.c
parent     e7c84ee22b8321fa0130a53d4c9806474d62eff0 (diff)
vmscan: factor out page reference checks
The used-once mapped file page detection patchset.

It is meant to help workloads with large amounts of shortly used file mappings, like rtorrent hashing a file or git when dealing with loose objects (git gc on a bigger site?).

Right now, the VM activates referenced mapped file pages on first encounter on the inactive list, and it takes a full memory cycle to reclaim them again.  When those pages dominate memory, the system no longer has a meaningful notion of 'working set' and is required to give up the active list to make reclaim progress.  Obviously, this results in rather bad scanning latencies and the wrong pages being reclaimed.

This patch makes the VM more careful about activating mapped file pages in the first place.  The minimum granted lifetime without another memory access becomes an inactive list cycle instead of the full memory cycle, which is more natural given the mentioned loads.

This test resembles a hashing rtorrent process.  Sequentially, 32MB chunks of a file are mapped into memory, hashed (sha1) and unmapped again.  While this happens, every 5 seconds a process is launched and its execution time taken:

    python2.4 -c 'import pydoc'
    old: max=2.31s mean=1.26s (0.34)
    new: max=1.25s mean=0.32s (0.32)

    find /etc -type f
    old: max=2.52s mean=1.44s (0.43)
    new: max=1.92s mean=0.12s (0.17)

    vim -c ':quit'
    old: max=6.14s mean=4.03s (0.49)
    new: max=3.48s mean=2.41s (0.25)

    mplayer --help
    old: max=8.08s mean=5.74s (1.02)
    new: max=3.79s mean=1.32s (0.81)

    overall hash time (stdev):
    old: time=1192.30 (12.85) thruput=25.78mb/s (0.27)
    new: time=1060.27 (32.58) thruput=29.02mb/s (0.88) (-11%)

I also tested kernbench with regular IO streaming in the background to see whether the delayed activation of frequently used mapped file pages had a negative impact on performance in the presence of pressure on the inactive list.  The patch made no significant difference in timing, neither for kernbench nor for the streaming IO throughput.
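Purely for illustration (this is not part of the patch), the hashing workload described above could be approximated by a small userspace program along the following lines.  The file name and chunk handling are assumptions, and a plain byte sum stands in for SHA-1; the point is only the mmap/touch/munmap pattern that produces used-once mapped file pages:

    /*
     * Hypothetical sketch of the benchmark workload: sequentially mmap
     * 32MB chunks of a large file, read every byte (a byte sum stands in
     * for SHA-1 hashing), and unmap the chunk again.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    #define CHUNK   (32UL << 20)   /* 32MB chunks, as in the description */

    int main(int argc, char **argv)
    {
            const char *path = argc > 1 ? argv[1] : "testfile";  /* assumed */
            unsigned long sum = 0;
            struct stat st;
            off_t off;
            int fd;

            fd = open(path, O_RDONLY);
            if (fd < 0 || fstat(fd, &st) < 0) {
                    perror(path);
                    return 1;
            }

            for (off = 0; off < st.st_size; off += CHUNK) {
                    size_t len = st.st_size - off < (off_t)CHUNK ?
                                    (size_t)(st.st_size - off) : CHUNK;
                    unsigned char *p;
                    size_t i;

                    /* Map the next chunk; the offset is CHUNK-aligned. */
                    p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
                    if (p == MAP_FAILED) {
                            perror("mmap");
                            return 1;
                    }
                    for (i = 0; i < len; i++)  /* fault in and read each page */
                            sum += p[i];
                    munmap(p, len);
            }
            printf("checksum: %lu\n", sum);
            close(fd);
            return 0;
    }

Each chunk's pages are mapped, referenced once while being hashed, and then never touched again, which is exactly the used-once pattern the series targets.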
The first patch submission raised concerns about the cost of the extra faults for actually activated pages on machines that have no hardware support for young page table entries.

I created an artificial worst case scenario on an ARM machine with around 300MHz and 64MB of memory to figure out the dimensions involved.  The test would mmap a file of 20MB, then

  1. touch all its pages to fault them in
  2. force one full scan cycle on the inactive file LRU
     -- old: mapping pages activated
     -- new: mapping pages inactive
  3. touch the mapping pages again
     -- old and new: fault exceptions to set the young bits
  4. force another full scan cycle on the inactive file LRU
  5. touch the mapping pages one last time
     -- new: fault exceptions to set the young bits

(A rough userspace sketch of these steps follows below.)

The test showed an overall increase of 6% in time over 100 iterations of the above (old: ~212sec, new: ~225sec).  13 secs total overhead / (100 * 5k pages), ignoring the execution time of the test itself, makes for about 25us overhead for every page that gets actually activated.

Note:

  1. The file mapping is the size of one third of main memory, _completely_ in active use across memory pressure - i.e., most pages are referenced within one LRU cycle.  This should be rare to non-existent, especially on such embedded setups.

  2. Many huge activation batches.  Those batches only occur when the working set fluctuates.  If it changes completely between every full LRU cycle, you have problematic reclaim overhead anyway.

  3. Activated pages are accessed at maximum speed: sequential loads from every single page without doing anything in between.  In reality, the extra faults will get distributed between actual operations on the data.

So even if a workload manages to get the VM into the situation of activating a third of memory in one go on such a setup, it will take 2.2 seconds instead of 2.1 without the patch.

Comparing the numbers (and my user experience over several months), I think this change is an overall improvement to the VM.

Patch 1 is only refactoring to break up that ugly compound conditional in shrink_page_list() and make it easy to document and add new checks in a readable fashion.

Patch 2 gets rid of the obsolete page_mapping_inuse().  It's not strictly related to #3, but it was in the original submission and is a net simplification, so I kept it.

Patch 3 implements used-once detection of mapped file pages.

This patch:

Moving the big conditional into its own predicate function makes the code a bit easier to read and allows for better commenting on the checks one-by-one.

This is just cleaning up, no semantics should have been changed.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
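For illustration only (again, not part of the commit), the worst-case procedure in steps 1-5 above might look roughly like this in userspace.  The file names and the 4k page size are assumptions, and "forcing a full scan cycle on the inactive file LRU" is approximated here by streaming a file larger than RAM so that reclaim has to cycle the inactive list:

    /*
     * Rough, hypothetical approximation of the worst-case test above:
     * mmap a 20MB file, then alternate between touching every page of
     * the mapping and generating enough page cache pressure that the
     * inactive file LRU gets scanned.
     */
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define MAP_SIZE  (20UL << 20)  /* 20MB mapping, as described */
    #define PAGE      4096UL        /* assumes 4k pages */

    static void touch_mapping(const volatile unsigned char *map)
    {
            unsigned long i;

            for (i = 0; i < MAP_SIZE; i += PAGE)  /* one read per page */
                    (void)map[i];
    }

    /* Approximate "force one full scan cycle on the inactive file LRU". */
    static void force_scan_cycle(const char *big_file)
    {
            static char buf[1UL << 20];
            int fd = open(big_file, O_RDONLY);

            if (fd < 0)
                    return;
            while (read(fd, buf, sizeof(buf)) > 0)
                    ;  /* stream the file to generate cache pressure */
            close(fd);
    }

    int main(void)
    {
            int fd = open("mapfile", O_RDONLY);  /* assumed 20MB test file */
            unsigned char *map;

            if (fd < 0)
                    return 1;
            map = mmap(NULL, MAP_SIZE, PROT_READ, MAP_PRIVATE, fd, 0);
            if (map == MAP_FAILED)
                    return 1;

            touch_mapping(map);             /* 1. fault all pages in */
            force_scan_cycle("bigfile");    /* 2. cycle the inactive file LRU */
            touch_mapping(map);             /* 3. fault again, set young bits */
            force_scan_cycle("bigfile");    /* 4. another full scan cycle */
            touch_mapping(map);             /* 5. touch one last time */

            munmap(map, MAP_SIZE);
            close(fd);
            return 0;
    }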
Diffstat (limited to 'mm/vmscan.c')
-rw-r--r--  mm/vmscan.c  56
1 file changed, 43 insertions(+), 13 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5cbf64dd79c1..ba4e87df3fc6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -579,6 +579,40 @@ redo:
 	put_page(page);		/* drop ref from isolate */
 }
 
+enum page_references {
+	PAGEREF_RECLAIM,
+	PAGEREF_RECLAIM_CLEAN,
+	PAGEREF_ACTIVATE,
+};
+
+static enum page_references page_check_references(struct page *page,
+						  struct scan_control *sc)
+{
+	unsigned long vm_flags;
+	int referenced;
+
+	referenced = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
+	if (!referenced)
+		return PAGEREF_RECLAIM;
+
+	/* Lumpy reclaim - ignore references */
+	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
+		return PAGEREF_RECLAIM;
+
+	/*
+	 * Mlock lost the isolation race with us.  Let try_to_unmap()
+	 * move the page to the unevictable list.
+	 */
+	if (vm_flags & VM_LOCKED)
+		return PAGEREF_RECLAIM;
+
+	if (page_mapping_inuse(page))
+		return PAGEREF_ACTIVATE;
+
+	/* Reclaim if clean, defer dirty pages to writeback */
+	return PAGEREF_RECLAIM_CLEAN;
+}
+
 /*
  * shrink_page_list() returns the number of reclaimed pages
  */
@@ -590,16 +624,15 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 	struct pagevec freed_pvec;
 	int pgactivate = 0;
 	unsigned long nr_reclaimed = 0;
-	unsigned long vm_flags;
 
 	cond_resched();
 
 	pagevec_init(&freed_pvec, 1);
 	while (!list_empty(page_list)) {
+		enum page_references references;
 		struct address_space *mapping;
 		struct page *page;
 		int may_enter_fs;
-		int referenced;
 
 		cond_resched();
 
@@ -641,17 +674,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			goto keep_locked;
 		}
 
-		referenced = page_referenced(page, 1,
-						sc->mem_cgroup, &vm_flags);
-		/*
-		 * In active use or really unfreeable? Activate it.
-		 * If page which have PG_mlocked lost isoltation race,
-		 * try_to_unmap moves it to unevictable list
-		 */
-		if (sc->order <= PAGE_ALLOC_COSTLY_ORDER &&
-					referenced && page_mapping_inuse(page)
-					&& !(vm_flags & VM_LOCKED))
+		references = page_check_references(page, sc);
+		switch (references) {
+		case PAGEREF_ACTIVATE:
 			goto activate_locked;
+		case PAGEREF_RECLAIM:
+		case PAGEREF_RECLAIM_CLEAN:
+			; /* try to reclaim the page below */
+		}
 
 		/*
 		 * Anonymous process memory has backing store?
@@ -685,7 +715,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		}
 
 		if (PageDirty(page)) {
-			if (sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced)
+			if (references == PAGEREF_RECLAIM_CLEAN)
 				goto keep_locked;
 			if (!may_enter_fs)
 				goto keep_locked;