author | Eric Dumazet <dada1@cosmosbay.com> | 2007-05-08 03:32:57 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@woody.linux-foundation.org> | 2007-05-08 14:15:17 -0400 |
commit | 5517d86bea237c1d7078840182d9ebc0fe4c1afc (patch) | |
tree | 67f1999895313878bfa904c66dffb7066f3c8d91 /include/linux/sched.h | |
parent | 46cb4b7c88fa5517f64b5bee42939ea3614cddcb (diff) |
Speed up divides by cpu_power in scheduler
I noticed expensive divides done in try_to_wake_up() and
find_busiest_group() on a machine with two dual-core Opterons (4 cores in
total), moderately loaded (15,000 context switches per second).
oprofile numbers:
CPU: AMD64 processors, speed 2600.05 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 50000
samples % symbol name
...
613914 1.0498 try_to_wake_up
834 0.0013 :ffffffff80227ae1: div %rcx
77513 0.1191 :ffffffff80227ae4: mov %rax,%r11
608893 1.0413 find_busiest_group
1841 0.0031 :ffffffff802260bf: div %rdi
140109 0.2394 :ffffffff802260c2: test %sil,%sil
Some of these divides can use the reciprocal divide we introduced some
time ago (currently used in slab, AFAIK).
We can assume a load will fit in a 32-bit number, because with a
SCHED_LOAD_SCALE=128 value, there is still a theoretical limit of 33554432
(2^32 / 128). When/if we reach this limit one day, CPUs will probably have
a fast hardware divide and we can zap the reciprocal divide trick.
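For readers unfamiliar with the trick, the following is a minimal user-space sketch of a reciprocal divide on 32-bit operands. It is modelled on the helpers in include/linux/reciprocal_div.h, but it is an illustration under stated assumptions rather than the kernel source: stdint types stand in for the kernel's u32/u64, and the divisor is assumed to be at least 2, because the reciprocal of 1 does not fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Precompute R = ceil(2^32 / k) once, when the divisor is set up.
 * Assumption of this sketch: k >= 2 (k == 1 would make R overflow 32 bits).
 */
static inline uint32_t reciprocal_value(uint32_t k)
{
	assert(k >= 2);
	uint64_t val = (1ULL << 32) + (k - 1);
	return (uint32_t)(val / k);
}

/*
 * Approximate a / k as (a * R) >> 32: one 64-bit multiply and one shift,
 * with no div instruction on the hot path.
 */
static inline uint32_t reciprocal_divide(uint32_t a, uint32_t r)
{
	return (uint32_t)(((uint64_t)a * r) >> 32);
}
```

The expensive division happens only in reciprocal_value(), which runs when the divisor changes. The result is an approximation that can be off for very large dividends, which is one reason the 32-bit bound on load matters.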
Ingo suggested renaming cpu_power to __cpu_power to make it clear that it
should not be modified without also updating its reciprocal value.
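One way to honour this rule is to route every write through a single helper that updates both fields together. The sketch below is hypothetical: struct group_power, group_power_set() and group_power_div() are illustrative names that do not appear in this diff, and only the two field names are taken from the patch to struct sched_group.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields this patch adds to sched_group. */
struct group_power {
	unsigned int __cpu_power;
	uint32_t reciprocal_cpu_power;
};

/*
 * Hypothetical setter: the only place that writes __cpu_power, so the
 * cached reciprocal can never go stale. Assumes power >= 2.
 */
static inline void group_power_set(struct group_power *g, unsigned int power)
{
	assert(power >= 2);
	g->__cpu_power = power;
	/* R = ceil(2^32 / power), computed with one real division. */
	g->reciprocal_cpu_power =
		(uint32_t)(((1ULL << 32) + (power - 1)) / power);
}

/* Hypothetical reader: load / __cpu_power via multiply and shift. */
static inline uint32_t group_power_div(const struct group_power *g,
				       uint32_t load)
{
	return (uint32_t)(((uint64_t)load * g->reciprocal_cpu_power) >> 32);
}
```

The leading underscores on the field then act as a reminder to use the helper instead of assigning to the field directly.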
I did not convert the divide in cpu_avg_load_per_task(), because tracking
nr_running changes may not be worth it. We could use a static table of 32
reciprocal values, but it would add a conditional branch and a table lookup.
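To make the trade-off concrete, here is a rough sketch of the table-based alternative, which this patch does not implement. All names (recip_table, recip_table_init, avg_load_per_task) are hypothetical, as is the choice to return 0 when nr_running is 0. The table is filled at init time here instead of being a compile-time constant, purely to keep the example short.

```c
#include <stdint.h>

#define RECIP_TABLE_SIZE 32

/* Hypothetical table of reciprocals; entries 0 and 1 are unused. */
static uint32_t recip_table[RECIP_TABLE_SIZE];

/* Fill the table once: recip_table[k] = ceil(2^32 / k) for k = 2..31. */
static void recip_table_init(void)
{
	for (uint32_t k = 2; k < RECIP_TABLE_SIZE; k++)
		recip_table[k] = (uint32_t)(((1ULL << 32) + (k - 1)) / k);
}

/* Hypothetical average-load helper using the table where it applies. */
static uint32_t avg_load_per_task(uint32_t load, uint32_t nr_running)
{
	if (nr_running == 0)
		return 0;		/* sketch's choice, not the kernel's */
	if (nr_running == 1)
		return load;		/* reciprocal of 1 overflows 32 bits */
	if (nr_running < RECIP_TABLE_SIZE)	/* the extra conditional branch */
		return (uint32_t)(((uint64_t)load * recip_table[nr_running]) >> 32);
	return load / nr_running;	/* fall back to a real divide */
}
```

The branch and the table load are the extra costs mentioned above; whether they beat a hardware divide would have to be measured.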
[akpm@linux-foundation.org: !SMP build fix]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'include/linux/sched.h')
-rw-r--r-- | include/linux/sched.h | 8 |
1 files changed, 7 insertions, 1 deletions
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 15ab3e039535..3d95c480f58d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -680,8 +680,14 @@ struct sched_group {
 	/*
 	 * CPU power of this group, SCHED_LOAD_SCALE being max power for a
 	 * single CPU. This is read only (except for setup, hotplug CPU).
+	 * Note : Never change cpu_power without recompute its reciprocal
 	 */
-	unsigned long cpu_power;
+	unsigned int __cpu_power;
+	/*
+	 * reciprocal value of cpu_power to avoid expensive divides
+	 * (see include/linux/reciprocal_div.h)
+	 */
+	u32 reciprocal_cpu_power;
 };
 
 struct sched_domain {