author    Mel Gorman <mgorman@techsingularity.net>    2017-11-15 20:37:37 -0500
committer Linus Torvalds <torvalds@linux-foundation.org>    2017-11-15 21:21:06 -0500
commit  9cca35d42eb61b69e108a17215756c46173a5e6f
tree    72f24467c3e3f1d3c9b687386419c7f0300bca9a /mm/page_alloc.c
parent  aa65c29ce1b6e1990cd2c7d8004bbea7ff3aff38
mm, page_alloc: enable/disable IRQs once when freeing a list of pages
Patch series "Follow-up for speed up page cache truncation", v2.

This series is a follow-on to Jan Kara's series "Speed up page cache
truncation". We both ended up looking at the same problem but saw
different issues in the same data, and this series builds upon his work.
A variety of workloads were compared on four separate machines and each
machine showed gains, albeit at different levels. Minimally, some of the
differences are due to NUMA, where truncating data from a remote node is
slower than from a local node. The workloads checked were

o sparse truncate microbenchmark, tiny
o sparse truncate microbenchmark, large
o reaim-io disk workfile
o dbench4 (modified by mmtests to produce more stable results)
o filebench varmail configuration for small memory size
o bonnie, directory operations, working set size 2*RAM

reaim-io, dbench and filebench all showed minor gains. Truncation does
not dominate those workloads, but they were tested to ensure no other
regressions. They will not be reported further.

The sparse truncate microbenchmark was written by Jan. It creates a
number of files and then times how long it takes to truncate each one.
The "tiny" configuration creates a number of files that easily fit in
memory and times how long it takes to truncate files with page cache.
The "large" configuration uses enough files to hold data twice the size
of memory, so timings there include truncating page cache as well as
working set shadow entries in the radix tree.

Patches 1-4 are the most relevant parts of this series. Patches 5-8 are
optional, as they delete code that is essentially useless but has a
negligible performance impact.

The changelogs have more information on performance, but just for the
bonnie delete options, the main comparison is

bonnie
                            4.14.0-rc5            4.14.0-rc5            4.14.0-rc5
                                jan-v2               vanilla                mel-v2
Hmean  SeqCreate ops       76.20 (  0.00%)      75.80 ( -0.53%)      76.80 (  0.79%)
Hmean  SeqCreate read      85.00 (  0.00%)      85.00 (  0.00%)      85.00 (  0.00%)
Hmean  SeqCreate del    13752.31 (  0.00%)   12090.23 (-12.09%)   15304.84 ( 11.29%)
Hmean  RandCreate ops      76.00 (  0.00%)      75.60 ( -0.53%)      77.00 (  1.32%)
Hmean  RandCreate read     96.80 (  0.00%)      96.80 (  0.00%)      97.00 (  0.21%)
Hmean  RandCreate del   13233.75 (  0.00%)   11525.35 (-12.91%)   14446.61 (  9.16%)

Jan's series is the baseline: the vanilla kernel is 12% slower on
deletes, whereas this series on top gains another 11%. This is from a
different machine than the data in the changelogs, but detailed data was
not re-collected as there was no substantial change in v2.

This patch (of 8):

Freeing a list of pages currently enables/disables IRQs for each page
freed. This patch splits freeing a list of pages into two operations --
preparing the pages for freeing and the actual freeing. This is a
tradeoff: we take two passes over the list in exchange for avoiding
multiple enable/disable cycles of IRQs.
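In outline, the change amortises the cost of entering and leaving the
IRQ-disabled section over a whole batch by splitting the free path into a
prepare step (safe with IRQs enabled) and a commit step (done under a
single IRQ-disabled section). The code below is a minimal user-space C
analogue of that split, not kernel code: prepare_item(), commit_item(),
enter_critical() and leave_critical() are hypothetical stand-ins for the
free_pcp_prepare()-style checks and the IRQ-disabled free path.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Illustrative stand-ins for the critical-section enter/exit; in the
 * kernel these would be local_irq_save()/local_irq_restore().
 */
static void enter_critical(void) { }
static void leave_critical(void) { }

struct item {
	int value;
	bool ready;
};

/* Pass 1: checks and bookkeeping that are safe outside the critical
 * section (the role played by free_hot_cold_page_prepare()). */
static bool prepare_item(struct item *it)
{
	if (it->value < 0)
		return false;	/* skip items that fail preparation */
	it->ready = true;
	return true;
}

/* Pass 2: the work that must happen inside the critical section
 * (the role played by free_hot_cold_page_commit()). */
static void commit_item(const struct item *it)
{
	printf("freeing %d\n", it->value);
}

/* Old shape: the critical section is entered and left once per item. */
static void free_items_per_item(struct item *items, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!prepare_item(&items[i]))
			continue;
		enter_critical();
		commit_item(&items[i]);
		leave_critical();
	}
}

/* New shape: prepare everything first, then enter the critical section
 * once for the whole batch. */
static void free_items_batched(struct item *items, size_t n)
{
	for (size_t i = 0; i < n; i++)
		prepare_item(&items[i]);

	enter_critical();
	for (size_t i = 0; i < n; i++)
		if (items[i].ready)
			commit_item(&items[i]);
	leave_critical();
}

int main(void)
{
	struct item a[] = { {1, false}, {-1, false}, {2, false} };
	struct item b[] = { {3, false}, {4, false}, {-5, false} };

	free_items_per_item(a, sizeof(a) / sizeof(a[0]));
	free_items_batched(b, sizeof(b) / sizeof(b[0]));
	return 0;
}

In the actual patch below, the first pass also stashes each page's pfn in
page->private via set_page_private() so the IRQ-disabled pass does not
have to recompute it, and pages that fail preparation are simply removed
from the list.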
sparsetruncate (tiny)
                              4.14.0-rc4             4.14.0-rc4
                            janbatch-v1r1            oneirq-v1r1
Min          Time       149.00 (  0.00%)       141.00 (  5.37%)
1st-qrtle    Time       150.00 (  0.00%)       142.00 (  5.33%)
2nd-qrtle    Time       151.00 (  0.00%)       142.00 (  5.96%)
3rd-qrtle    Time       151.00 (  0.00%)       143.00 (  5.30%)
Max-90%      Time       153.00 (  0.00%)       144.00 (  5.88%)
Max-95%      Time       155.00 (  0.00%)       147.00 (  5.16%)
Max-99%      Time       201.00 (  0.00%)       195.00 (  2.99%)
Max          Time       236.00 (  0.00%)       230.00 (  2.54%)
Amean        Time       152.65 (  0.00%)       144.37 (  5.43%)
Stddev       Time         9.78 (  0.00%)        10.44 ( -6.72%)
Coeff        Time         6.41 (  0.00%)         7.23 (-12.84%)
Best99%Amean Time       152.07 (  0.00%)       143.72 (  5.50%)
Best95%Amean Time       150.75 (  0.00%)       142.37 (  5.56%)
Best90%Amean Time       150.59 (  0.00%)       142.19 (  5.58%)
Best75%Amean Time       150.36 (  0.00%)       141.92 (  5.61%)
Best50%Amean Time       150.04 (  0.00%)       141.69 (  5.56%)
Best25%Amean Time       149.85 (  0.00%)       141.38 (  5.65%)

With a tiny number of files, each truncated file has resident page cache,
and the results show that truncation time improves by roughly 5-6% with
some minor jitter.

                              4.14.0-rc4             4.14.0-rc4
                            janbatch-v1r1            oneirq-v1r1
Hmean SeqCreate ops       65.27 (  0.00%)       81.86 ( 25.43%)
Hmean SeqCreate read      39.48 (  0.00%)       47.44 ( 20.16%)
Hmean SeqCreate del    24963.95 (  0.00%)    26319.99 (  5.43%)
Hmean RandCreate ops      65.47 (  0.00%)       82.01 ( 25.26%)
Hmean RandCreate read     42.04 (  0.00%)       51.75 ( 23.09%)
Hmean RandCreate del   23377.66 (  0.00%)    23764.79 (  1.66%)

As expected, there is a small gain for the delete operation.

[mgorman@techsingularity.net: use page_private and set_page_private helpers]
  Link: http://lkml.kernel.org/r/20171018101547.mjycw7zreb66jzpa@techsingularity.net
Link: http://lkml.kernel.org/r/20171018075952.10627-2-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Jan Kara <jack@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/page_alloc.c')
-rw-r--r--    mm/page_alloc.c    58
1 file changed, 44 insertions(+), 14 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ab648e359602..6a3c4a1d513f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2611,24 +2611,26 @@ void mark_free_pages(struct zone *zone)
 }
 #endif /* CONFIG_PM */
 
-/*
- * Free a 0-order page
- * cold == true ? free a cold page : free a hot page
- */
-void free_hot_cold_page(struct page *page, bool cold)
+static bool free_hot_cold_page_prepare(struct page *page, unsigned long pfn)
 {
-	struct zone *zone = page_zone(page);
-	struct per_cpu_pages *pcp;
-	unsigned long flags;
-	unsigned long pfn = page_to_pfn(page);
 	int migratetype;
 
 	if (!free_pcp_prepare(page))
-		return;
+		return false;
 
 	migratetype = get_pfnblock_migratetype(page, pfn);
 	set_pcppage_migratetype(page, migratetype);
-	local_irq_save(flags);
+	return true;
+}
+
+static void free_hot_cold_page_commit(struct page *page, unsigned long pfn,
+				bool cold)
+{
+	struct zone *zone = page_zone(page);
+	struct per_cpu_pages *pcp;
+	int migratetype;
+
+	migratetype = get_pcppage_migratetype(page);
 	__count_vm_event(PGFREE);
 
 	/*
@@ -2641,7 +2643,7 @@ void free_hot_cold_page(struct page *page, bool cold)
 	if (migratetype >= MIGRATE_PCPTYPES) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			free_one_page(zone, page, pfn, 0, migratetype);
-			goto out;
+			return;
 		}
 		migratetype = MIGRATE_MOVABLE;
 	}
@@ -2657,8 +2659,22 @@ void free_hot_cold_page(struct page *page, bool cold)
 		free_pcppages_bulk(zone, batch, pcp);
 		pcp->count -= batch;
 	}
+}
 
-out:
+/*
+ * Free a 0-order page
+ * cold == true ? free a cold page : free a hot page
+ */
+void free_hot_cold_page(struct page *page, bool cold)
+{
+	unsigned long flags;
+	unsigned long pfn = page_to_pfn(page);
+
+	if (!free_hot_cold_page_prepare(page, pfn))
+		return;
+
+	local_irq_save(flags);
+	free_hot_cold_page_commit(page, pfn, cold);
 	local_irq_restore(flags);
 }
 
@@ -2668,11 +2684,25 @@ out:
 void free_hot_cold_page_list(struct list_head *list, bool cold)
 {
 	struct page *page, *next;
+	unsigned long flags, pfn;
+
+	/* Prepare pages for freeing */
+	list_for_each_entry_safe(page, next, list, lru) {
+		pfn = page_to_pfn(page);
+		if (!free_hot_cold_page_prepare(page, pfn))
+			list_del(&page->lru);
+		set_page_private(page, pfn);
+	}
 
+	local_irq_save(flags);
 	list_for_each_entry_safe(page, next, list, lru) {
+		unsigned long pfn = page_private(page);
+
+		set_page_private(page, 0);
 		trace_mm_page_free_batched(page, cold);
-		free_hot_cold_page(page, cold);
+		free_hot_cold_page_commit(page, pfn, cold);
 	}
+	local_irq_restore(flags);
 }
 
 /*