aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/sched.c
diff options
context:
space:
mode:
authorSuresh Siddha <suresh.b.siddha@intel.com>2010-09-17 18:02:32 -0400
committerIngo Molnar <mingo@elte.hu>2010-11-10 17:13:56 -0500
commitaae6d3ddd8b90f5b2c8d79a2b914d1706d124193 (patch)
treeb993f929f4b1cc38ef01094ff4504eaf358adb31 /kernel/sched.c
parentf6614b7bb405a9b35dd28baea989a749492c46b2 (diff)
sched: Use group weight, idle cpu metrics to fix imbalances during idle
Currently we consider a sched domain to be well balanced when the imbalance is less than the domain's imablance_pct. As the number of cores and threads are increasing, current values of imbalance_pct (for example 25% for a NUMA domain) are not enough to detect imbalances like: a) On a WSM-EP system (two sockets, each having 6 cores and 12 logical threads), 24 cpu-hogging tasks get scheduled as 13 on one socket and 11 on another socket. Leading to an idle HT cpu. b) On a hypothetial 2 socket NHM-EX system (each socket having 8 cores and 16 logical threads), 16 cpu-hogging tasks can get scheduled as 9 on one socket and 7 on another socket. Leaving one core in a socket idle whereas in another socket we have a core having both its HT siblings busy. While this issue can be fixed by decreasing the domain's imbalance_pct (by making it a function of number of logical cpus in the domain), it can potentially cause more task migrations across sched groups in an overloaded case. Fix this by using imbalance_pct only during newly_idle and busy load balancing. And during idle load balancing, check if there is an imbalance in number of idle cpu's across the busiest and this sched_group or if the busiest group has more tasks than its weight that the idle cpu in this_group can pull. Reported-by: Nikhil Rao <ncrao@google.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1284760952.2676.11.camel@sbsiddha-MOBL3.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Diffstat (limited to 'kernel/sched.c')
-rw-r--r--kernel/sched.c2
1 files changed, 2 insertions, 0 deletions
diff --git a/kernel/sched.c b/kernel/sched.c
index aa14a56f9d03..36a088018fe0 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6960,6 +6960,8 @@ static void init_sched_groups_power(int cpu, struct sched_domain *sd)
6960 if (cpu != group_first_cpu(sd->groups)) 6960 if (cpu != group_first_cpu(sd->groups))
6961 return; 6961 return;
6962 6962
6963 sd->groups->group_weight = cpumask_weight(sched_group_cpus(sd->groups));
6964
6963 child = sd->child; 6965 child = sd->child;
6964 6966
6965 sd->groups->cpu_power = 0; 6967 sd->groups->cpu_power = 0;