Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma

Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling. In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag. In my own tests, balancenuma does
  reasonably well. It's dumb as rocks and does not regress against
  mainline. On the other hand, Ingo's tests show that balancenuma is
  incapable of converging for the workloads driven by perf, which is bad
  but is potentially explained by the lack of scheduler smarts. Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma. Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week, but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses. Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers, which is not reported by
  the tool by default and sometimes missed in reports. Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions but currently it's unclear what the source of
  this problem is. Initially I thought it was in how numacore batch
  handles PTEs but I no longer think this is the case. It's possible
  numacore is just able to trigger it due to higher rates of migration.

  These reports were quite late in the cycle so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...

author: Linus Torvalds <torvalds@linux-foundation.org> 2012-12-16 17:33:25 -0500
committer: Linus Torvalds <torvalds@linux-foundation.org> 2012-12-16 18:18:08 -0500
commit: 3d59eebc5e137bd89c6351e4c70e90ba1d0dc234 (patch)
tree: b4ddfd0b057454a7437a3b4e3074a3b8b4b03817 /init
parent: 11520e5e7c1855fc3bf202bb3be35a39d9efa034 (diff)
parent: 4fc3f1d66b1ef0d7b8dc11f4ff1cc510f78b37d6 (diff)
1 files changed, 44 insertions, 0 deletions
diff --git a/init/Kconfig b/init/Kconfig
index 2054e048bb98..1a207efca591 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -717,6 +717,50 @@ config LOG_BUF_SHIFT
 config HAVE_UNSTABLE_SCHED_CLOCK
        bool
 
+#
+# For architectures that want to enable the support for NUMA-affine scheduler
+# balancing logic:
+#
+config ARCH_SUPPORTS_NUMA_BALANCING
+        bool
+
+# For architectures that (ab)use NUMA to represent different memory regions
+# all cpu-local but of different latencies, such as SuperH.
+#
+config ARCH_WANT_NUMA_VARIABLE_LOCALITY
+        bool
+
+#
+# For architectures that are willing to define _PAGE_NUMA as _PAGE_PROTNONE
+config ARCH_WANTS_PROT_NUMA_PROT_NONE
+        bool
+
+config ARCH_USES_NUMA_PROT_NONE
+        bool
+        default y
+        depends on ARCH_WANTS_PROT_NUMA_PROT_NONE
+        depends on NUMA_BALANCING
+
+config NUMA_BALANCING_DEFAULT_ENABLED
+        bool "Automatically enable NUMA aware memory/task placement"
+        default y
+        depends on NUMA_BALANCING
+        help
+          If set, autonumic NUMA balancing will be enabled if running on a NUMA
+          machine.
+
+config NUMA_BALANCING
+        bool "Memory placement aware NUMA scheduler"
+        depends on ARCH_SUPPORTS_NUMA_BALANCING
+        depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
+        depends on SMP && NUMA && MIGRATION
+        help
+          This option adds support for automatic NUMA aware memory/task placement.
+          The mechanism is quite primitive and is based on migrating memory when
+          it is references to the node the task is running on.
+
+          This system will be inactive on UMA systems.
+
 menuconfig CGROUPS
        boolean "Control Group support"
        depends on EVENTFD
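
Note on the symbols added in init/Kconfig: ARCH_SUPPORTS_NUMA_BALANCING and ARCH_WANTS_PROT_NUMA_PROT_NONE have no prompt, so they can only be turned on by an architecture selecting them. The fragment below is a minimal sketch of how an architecture might opt in. MYARCH is a hypothetical symbol used purely for illustration; it is not part of this commit and does not name any real architecture's Kconfig entry.

```kconfig
# Hypothetical architecture entry (MYARCH is an assumed, illustrative name).
config MYARCH
        def_bool y
        # Makes the NUMA_BALANCING prompt reachable, provided SMP, NUMA and
        # MIGRATION are enabled and ARCH_WANT_NUMA_VARIABLE_LOCALITY is not set.
        select ARCH_SUPPORTS_NUMA_BALANCING
        # Lets ARCH_USES_NUMA_PROT_NONE default to y once NUMA_BALANCING is on.
        select ARCH_WANTS_PROT_NUMA_PROT_NONE
```

Under those assumptions, enabling NUMA_BALANCING would also offer NUMA_BALANCING_DEFAULT_ENABLED, which defaults to y.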