path: root/arch/sparc64/kernel
author Mark Nelson <markn@au1.ibm.com> 2008-08-22 00:39:00 -0400
committer Paul Mackerras <paulus@samba.org> 2008-09-15 14:07:42 -0400
commit 57dda6ef5bd5b9e60410477ad29e654097e2cca1 (patch)
tree 5d9ea334a65752d440e2cef79cb33156fe97d585 /arch/sparc64/kernel
parent 2a9294369bd020db89bfdf78b84c3615b39a5c84 (diff)
powerpc: New copy_4K_page()

This new copy_4K_page() function was originally tuned for the best performance on the Cell processor, but after testing on more 64bit powerpc chips it was found that with a small modification it either matched the performance offered by the current mainline version or bettered it by a small amount.

It was found that on a Cell-based QS22 blade the amount of system time measured when compiling a 2.6.26 pseries_defconfig decreased by 4%. Using the same test, a 4-way 970MP machine saw a decrease of 2% in system time. No noticeable change was seen on Power4, Power5 or Power6.

The 4096 byte page is copied in thirty-two 128 byte strides. An initial setup loop executes dcbt instructions for the whole source page and dcbz instructions for the whole destination page. To do this, the cache line size is retrieved from ppc64_caches.

A new CPU feature bit, CPU_FTR_CP_USE_DCBTZ, (introduced in the previous patch) is used to make the modification to this new copy routine - on Power4, 970 and Cell the feature bit is set so the setup loop is executed, but on all other 64bit chips the setup loop is nop'ed out.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

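
The flow described in the commit message (an optional setup loop over every cache line, then thirty-two 128 byte strides) can be sketched in portable C. This is only an illustration, not the patch itself: the real routine is powerpc assembly, and the function name `copy_4k_page_sketch`, the `use_dcbtz` flag (standing in for CPU_FTR_CP_USE_DCBTZ), and the `line_size` parameter (standing in for the value read from ppc64_caches) are hypothetical. `__builtin_prefetch` is a GCC/Clang builtin used here as a rough stand-in for dcbt, and `memset` approximates dcbz, which zeroes a cache block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES   4096u  /* page size named in the commit message */
#define STRIDE_BYTES 128u   /* copy stride named in the commit message */

/*
 * Hypothetical sketch: dst and src must each be PAGE_BYTES long and must
 * not overlap. use_dcbtz and line_size are illustrative stand-ins for the
 * CPU feature bit and the cache line size from ppc64_caches.
 */
static void copy_4k_page_sketch(void *dst, const void *src,
                                size_t line_size, int use_dcbtz)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* The setup loop walks the page one cache line at a time. */
    assert(line_size != 0 && PAGE_BYTES % line_size == 0);

    if (use_dcbtz) {
        /* Setup loop: touch the whole source, zero the whole destination. */
        for (size_t off = 0; off < PAGE_BYTES; off += line_size) {
            __builtin_prefetch(s + off, 0);  /* stand-in for dcbt */
            memset(d + off, 0, line_size);   /* stand-in for dcbz */
        }
    }

    /* Main copy: 4096 / 128 = 32 strides of 128 bytes each. */
    for (size_t i = 0; i < PAGE_BYTES / STRIDE_BYTES; i++)
        memcpy(d + i * STRIDE_BYTES, s + i * STRIDE_BYTES, STRIDE_BYTES);
}
```

With `use_dcbtz` set to 0 the setup loop is skipped entirely, which mirrors the commit's description of the loop being nop'ed out on chips without the feature bit; the copied result is the same either way.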
Diffstat (limited to 'arch/sparc64/kernel')
0 files changed, 0 insertions, 0 deletions