author: Linus Torvalds <torvalds@linux-foundation.org> 2012-12-16 17:33:25 -0500
committer: Linus Torvalds <torvalds@linux-foundation.org> 2012-12-16 18:18:08 -0500
commit: 3d59eebc5e137bd89c6351e4c70e90ba1d0dc234
tree: b4ddfd0b057454a7437a3b4e3074a3b8b4b03817 /mm/page_alloc.c
parent: 11520e5e7c1855fc3bf202bb3be35a39d9efa034
parent: 4fc3f1d66b1ef0d7b8dc11f4ff1cc510f78b37d6
Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma
Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling. In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag. In my own tests, balancenuma does
  reasonably well. It's dumb as rocks and does not regress against
  mainline. On the other hand, Ingo's tests show that balancenuma is
  incapable of converging for the workloads driven by perf, which is bad
  but is potentially explained by the lack of scheduler smarts. Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma. Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week, but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses. Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers, which is not reported by
  the tool by default and sometimes missed in reports. Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions but currently it's unclear what the source of
  this problem is. Initially I thought it was in how numacore batch
  handles PTEs but I no longer think this is the case. It's possible
  numacore is just able to trigger it due to higher rates of migration.

  These reports were quite late in the cycle so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...
Diffstat (limited to 'mm/page_alloc.c')
-rw-r--r-- mm/page_alloc.c | 10
1 files changed, 9 insertions, 1 deletions
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 83637dfba110..d037c8bc1512 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -611,6 +611,7 @@ static inline int free_pages_check(struct page *page)
 		bad_page(page);
 		return 1;
 	}
+	reset_page_last_nid(page);
 	if (page->flags & PAGE_FLAGS_CHECK_AT_PREP)
 		page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 	return 0;
@@ -3883,6 +3884,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		mminit_verify_page_links(page, zone, nid, pfn);
 		init_page_count(page);
 		reset_page_mapcount(page);
+		reset_page_last_nid(page);
 		SetPageReserved(page);
 		/*
 		 * Mark the block movable so that blocks are reserved for
@@ -4526,6 +4528,11 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 	int ret;
 
 	pgdat_resize_init(pgdat);
+#ifdef CONFIG_NUMA_BALANCING
+	spin_lock_init(&pgdat->numabalancing_migrate_lock);
+	pgdat->numabalancing_migrate_nr_pages = 0;
+	pgdat->numabalancing_migrate_next_window = jiffies;
+#endif
 	init_waitqueue_head(&pgdat->kswapd_wait);
 	init_waitqueue_head(&pgdat->pfmemalloc_wait);
 	pgdat_page_cgroup_init(pgdat);
@@ -5800,7 +5807,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 
 		ret = migrate_pages(&cc->migratepages,
 				    alloc_migrate_target,
-				    0, false, MIGRATE_SYNC);
+				    0, false, MIGRATE_SYNC,
+				    MR_CMA);
 	}
 
 	putback_movable_pages(&cc->migratepages);
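
The three fields initialised under CONFIG_NUMA_BALANCING in free_area_init_core() (a lock, a page counter, and a window deadline) suggest a fixed-window rate limiter for per-node migration. The following is a minimal user-space sketch of that idea in C, not the kernel's implementation. The struct name, the helper names, the window length, and the page budget are all assumptions made for illustration; locking is omitted, and the plain comparison ignores tick wraparound, which kernel code would normally handle with a wrap-safe helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the per-node state set up in the diff.
 * nr_pages and next_window stand in for
 * numabalancing_migrate_nr_pages and numabalancing_migrate_next_window.
 */
struct node_ratelimit {
	unsigned long nr_pages;    /* pages accounted in the current window */
	unsigned long next_window; /* tick at which the current window ends */
};

#define WINDOW_TICKS 100UL /* assumed window length, in ticks */
#define WINDOW_PAGES 256UL /* assumed page budget per window */

/* Mirror the initialisation in the diff: zero pages, window ends "now". */
static void node_ratelimit_init(struct node_ratelimit *rl, unsigned long now)
{
	assert(rl != NULL);
	rl->nr_pages = 0;
	rl->next_window = now;
}

/*
 * Return true if nr_pages may be migrated at tick `now`, charging them
 * to the current window. Return false if the budget would be exceeded.
 */
static bool node_may_migrate(struct node_ratelimit *rl, unsigned long now,
			     unsigned long nr_pages)
{
	assert(rl != NULL);

	/* Window expired: open a new one with an empty counter. */
	if (now >= rl->next_window) {
		rl->nr_pages = 0;
		rl->next_window = now + WINDOW_TICKS;
	}

	/* Refuse the request if it would overrun this window's budget. */
	if (rl->nr_pages + nr_pages > WINDOW_PAGES)
		return false;

	rl->nr_pages += nr_pages;
	return true;
}
```

Because the window deadline is initialised to the current tick, the first call to node_may_migrate() always opens a fresh window, so no separate "first use" flag is needed.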