author    Suresh Siddha <suresh.b.siddha@intel.com>    2010-09-17 18:02:32 -0400
committer Ingo Molnar <mingo@elte.hu>                  2010-11-10 17:13:56 -0500
commit    aae6d3ddd8b90f5b2c8d79a2b914d1706d124193 (patch)
tree      b993f929f4b1cc38ef01094ff4504eaf358adb31 /kernel/sched.c
parent    f6614b7bb405a9b35dd28baea989a749492c46b2 (diff)
sched: Use group weight, idle cpu metrics to fix imbalances during idle
Currently we consider a sched domain to be well balanced when the imbalance
is less than the domain's imbalance_pct. As the number of cores and threads
increases, the current values of imbalance_pct (for example 25% for a
NUMA domain) are not enough to detect imbalances like:
a) On a WSM-EP system (two sockets, each having 6 cores and 12 logical threads),
24 cpu-hogging tasks get scheduled as 13 on one socket and 11 on the other,
leaving one HT cpu idle.
b) On a hypothetical 2-socket NHM-EX system (each socket having 8 cores and
16 logical threads), 16 cpu-hogging tasks can get scheduled as 9 on one
socket and 7 on the other, leaving one core in one socket idle while a
core in the other socket has both of its HT siblings busy.
While this issue can be fixed by decreasing the domain's imbalance_pct
(by making it a function of the number of logical cpus in the domain),
doing so can potentially cause more task migrations across sched groups
in the overloaded case.
Fix this by using imbalance_pct only during newly-idle and busy
load balancing. During idle load balancing, instead check whether
the number of idle cpus differs between the busiest group and this
sched_group, or whether the busiest group has more tasks than its
weight, i.e. tasks that the idle cpu in this group could pull.
Reported-by: Nikhil Rao <ncrao@google.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1284760952.2676.11.camel@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Diffstat (limited to 'kernel/sched.c')
-rw-r--r-- kernel/sched.c | 2 ++
1 file changed, 2 insertions(+), 0 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index aa14a56f9d03..36a088018fe0 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6960,6 +6960,8 @@ static void init_sched_groups_power(int cpu, struct sched_domain *sd)
 	if (cpu != group_first_cpu(sd->groups))
 		return;
 
+	sd->groups->group_weight = cpumask_weight(sched_group_cpus(sd->groups));
+
 	child = sd->child;
 
 	sd->groups->cpu_power = 0;