path: root/mm/huge_memory.c
author: Mel Gorman <mgorman@suse.de> 2012-11-02 10:52:48 -0400
committer: Mel Gorman <mgorman@suse.de> 2012-12-11 09:42:48 -0500
commit: 03c5a6e16322c997bf8f264851bfa3f532ad515f
tree: df5b09acdcd6d171286afa3f77a7ff56336c8ca6 /mm/huge_memory.c
parent: 4b96a29ba891dd59734cb7be80a900fe93aa2d9f
mm: numa: Add pte updates, hinting and migration stats

It is tricky to quantify the basic cost of automatic NUMA placement in a
meaningful manner. This patch adds some vmstats that can be used as part
of a basic costing model.

u    = basic unit = sizeof(void *)
Ca   = cost of struct page access = sizeof(struct page) / u
Cpte = Cost PTE access = Ca
Cupdate = Cost PTE update = (2 * Cpte) + (2 * Wlock)
	where Cpte is incurred twice for a read and a write and Wlock
	is a constant representing the cost of taking or releasing a
	lock
Cnumahint = Cost of a minor page fault = some high constant e.g. 1000
Cpagerw = Cost to read or write a full page = Ca + PAGE_SIZE/u
Ci = Cost of page isolation = Ca + Wi
	where Wi is a constant that should reflect the approximate cost
	of the locking operation
Cpagecopy = Cpagerw + (Cpagerw * Wnuma) + Ci + (Ci * Wnuma)
	where Wnuma is the approximate NUMA factor. 1 is local. 1.2
	would imply that remote accesses are 20% more expensive

Balancing cost = Cpte * numa_pte_updates +
		Cnumahint * numa_hint_faults +
		Ci * numa_pages_migrated +
		Cpagecopy * numa_pages_migrated

Note that numa_pages_migrated is used as a measure of how many pages
were isolated even though it would miss pages that failed to migrate. A
vmstat counter could have been added for it but the isolation cost is
pretty marginal in comparison to the overall cost so it seemed overkill.

The ideal way to measure automatic placement benefit would be to count
the number of remote accesses versus local accesses and do something like

	benefit = (remote_accesses_before - remote_accesses_after) * Wnuma

but the information is not readily available. As a workload converges, the
expectation would be that the number of remote numa hints would reduce to 0.

	convergence = numa_hint_faults_local / numa_hint_faults
		where this is measured for the last N number of
		numa hints recorded. When the workload is fully
		converged the value is 1.

This can measure if the placement policy is converging and how fast it is
doing it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
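
The costing model in the commit message can be sketched as a small user-space C helper. This is only an illustration, not code from the patch: the struct numa_counters type, the balancing_cost() function, and every constant below (the 64-byte struct page, the 4096-byte page, Wi = 1.0, Wnuma = 1.2) are assumptions chosen for the example. Cupdate and Wlock are left out because the "Balancing cost" formula as written charges Cpte per PTE update.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants only; the commit message leaves these as tunables. */
#define STRUCT_PAGE_BYTES 64.0   /* assumed sizeof(struct page) */
#define PAGE_BYTES        4096.0 /* assumed PAGE_SIZE */
#define W_ISOLATE         1.0    /* assumed Wi, cost of the isolation locking */
#define W_NUMA            1.2    /* NUMA factor, the 20% example from the text */
#define C_NUMAHINT        1000.0 /* "some high constant e.g. 1000" */

/* Hypothetical container for the vmstat counters used by the model. */
struct numa_counters {
	unsigned long long pte_updates;    /* numa_pte_updates */
	unsigned long long hint_faults;    /* numa_hint_faults */
	unsigned long long pages_migrated; /* numa_pages_migrated */
};

/* Balancing cost, term by term as written in the commit message. */
static double balancing_cost(const struct numa_counters *c)
{
	assert(c != NULL);

	const double u = (double)sizeof(void *);      /* basic unit */
	const double ca = STRUCT_PAGE_BYTES / u;      /* struct page access */
	const double cpte = ca;                       /* PTE access */
	const double cpagerw = ca + PAGE_BYTES / u;   /* read or write a page */
	const double ci = ca + W_ISOLATE;             /* page isolation */
	const double cpagecopy = cpagerw + (cpagerw * W_NUMA) +
				 ci + (ci * W_NUMA);

	return cpte * (double)c->pte_updates +
	       C_NUMAHINT * (double)c->hint_faults +
	       ci * (double)c->pages_migrated +
	       cpagecopy * (double)c->pages_migrated;
}
```

With these assumed weights, hinting faults dominate quickly: two hint faults alone contribute 2000 units, which is why the message calls the isolation cost marginal by comparison.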
Diffstat (limited to 'mm/huge_memory.c')
-rw-r--r-- mm/huge_memory.c | 5
1 files changed, 5 insertions, 0 deletions
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ee8133794a56..f3a477fffd09 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1026,6 +1026,7 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	struct page *page = NULL;
 	unsigned long haddr = addr & HPAGE_PMD_MASK;
 	int target_nid;
+	int current_nid = -1;
 
 	spin_lock(&mm->page_table_lock);
 	if (unlikely(!pmd_same(pmd, *pmdp)))
@@ -1034,6 +1035,10 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	page = pmd_page(pmd);
 	get_page(page);
 	spin_unlock(&mm->page_table_lock);
+	current_nid = page_to_nid(page);
+	count_vm_numa_event(NUMA_HINT_FAULTS);
+	if (current_nid == numa_node_id())
+		count_vm_numa_event(NUMA_HINT_FAULTS_LOCAL);
 
 	target_nid = mpol_misplaced(page, vma, haddr);
 	if (target_nid == -1)