author: Nicolas Pitre <nico@cam.org> 2006-12-13 12:39:26 -0500
committer: Russell King <rmk+kernel@arm.linux.org.uk> 2006-12-13 13:30:20 -0500
commit: 02828845dda5ccf921ab2557c6ca17b6e7fc70e2
tree: acdcb4a0c25d8bf65688b122cdd71395dcde9ccf
parent: 386b0ce25ae16eb1d25db6a004c959e3a9003ce3
[ARM] 4016/1: prefetch macro is wrong wrt gcc's "delete-null-pointer-checks" optimization

The gcc manual says:

|`-fdelete-null-pointer-checks'
|     Use global dataflow analysis to identify and eliminate useless
|     checks for null pointers.  The compiler assumes that dereferencing
|     a null pointer would have halted the program.  If a pointer is
|     checked after it has already been dereferenced, it cannot be null.
|     Enabled at levels `-O2', `-O3', `-Os'.

Now the problem can be seen with this test case:

	#include <linux/prefetch.h>
	extern void bar(char *x);
	void foo(char *x)
	{
		prefetch(x);
		if (x)
			bar(x);
	}

Because the constraint to the inline asm used in the prefetch() macro is
a memory operand, gcc assumes that the asm code does dereference the
pointer and the delete-null-pointer-checks optimization kicks in.
Inspection of generated assembly for the above example shows that bar()
is indeed called unconditionally without any test on the value of x.

Of course in the prefetch case there is no real dereference and it cannot
be assumed that a null pointer would have been caught at that point.
This causes kernel oopses with constructs like hlist_for_each_entry()
where the list's 'next' content is prefetched before the pointer is
tested against NULL, and only when gcc feels like applying this
optimization which doesn't happen all the time with more complex code.

It appears that the way to prevent delete-null-pointer-checks
optimization to occur in this case is to make prefetch() into a static
inline function instead of a macro. At least this is what is done on
x86_64 where a similar inline asm memory operand is used (I presume they
would have seen the same problem if it didn't work) and resulting code
for the above example confirms that.

An alternative would consist of replacing the memory operand by a
register operand containing the pointer, and use the addressing mode
explicitly in the asm template. But that would be less optimal than an
offsettable memory reference.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
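
The alternative mentioned at the end of the commit message (a register operand holding the pointer, with the addressing mode written out in the asm template) might look roughly like the sketch below. This is not the approach the patch takes, and the message itself notes it would be less optimal than an offsettable memory reference. The names prefetch_reg, bar_calls, and the bodies of bar() and foo() are hypothetical and exist only for illustration. The architecture guard is a conservative assumption, and on other targets the sketch falls back to the compiler's __builtin_prefetch so it can be built anywhere.

```c
#include <stddef.h>

/*
 * Hypothetical variant of prefetch(): the pointer is passed in a register
 * ("r" constraint), so the compiler is never told that *ptr is accessed
 * and has no grounds to drop a later NULL check.
 */
static inline void prefetch_reg(const void *ptr)
{
#if defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7
	/* Addressing mode spelled out in the template: pld [rN] */
	__asm__ __volatile__("pld\t[%0]" : : "r" (ptr) : "cc");
#else
	/* Assumption: non-ARM build, use the GCC/Clang builtin instead. */
	__builtin_prefetch(ptr);
#endif
}

/* Counter standing in for the external bar() of the commit's test case. */
static int bar_calls;

static void bar(char *x)
{
	(void)x;
	bar_calls++;
}

/* Same shape as the test case in the commit message. */
static void foo(char *x)
{
	prefetch_reg(x);
	if (x)		/* this check must survive optimization */
		bar(x);
}
```

Calling foo(NULL) should leave bar_calls unchanged, while calling it with a valid pointer should increment the counter once.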
Diffstat (limited to 'include')
-rw-r--r-- include/asm-arm/processor.h 16
1 files changed, 8 insertions, 8 deletions
diff --git a/include/asm-arm/processor.h b/include/asm-arm/processor.h
index b442e8e2a809..1bbf16182d62 100644
--- a/include/asm-arm/processor.h
+++ b/include/asm-arm/processor.h
@@ -103,14 +103,14 @@ extern int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 #if __LINUX_ARM_ARCH__ >= 5
 
 #define ARCH_HAS_PREFETCH
-#define prefetch(ptr)				\
-	({						\
-		__asm__ __volatile__(			\
-		"pld\t%0"				\
-		:					\
-		: "o" (*(char *)(ptr))			\
-		: "cc");				\
-	})
+static inline void prefetch(const void *ptr)
+{
+	__asm__ __volatile__(
+		"pld\t%0"
+		:
+		: "o" (*(char *)ptr)
+		: "cc");
+}
 
 #define ARCH_HAS_PREFETCHW
 #define prefetchw(ptr)	prefetch(ptr)
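
For context, the traversal pattern the commit message describes (the next pointer is prefetched before it has been tested against NULL) can be sketched as below. It is only loosely modelled on hlist_for_each_entry() and is not the kernel macro. struct node and sum_list() are hypothetical, and the prefetch() here is a portable stand-in built on __builtin_prefetch rather than the ARM pld version from the diff.

```c
#include <stddef.h>

/* Portable stand-in for the inline prefetch() above (assumption). */
static inline void prefetch(const void *ptr)
{
	__builtin_prefetch(ptr);
}

/* Hypothetical singly linked node, for illustration only. */
struct node {
	struct node *next;
	int value;
};

static int sum_list(struct node *head)
{
	struct node *pos;
	int sum = 0;

	for (pos = head; pos; pos = pos->next) {
		/*
		 * pos->next is NULL on the last node. The prefetch must not
		 * let the compiler conclude that it is non-NULL, otherwise
		 * the "pos" test on the next iteration could be removed.
		 */
		prefetch(pos->next);
		sum += pos->value;
	}
	return sum;
}
```

With a three-node list holding 1, 2 and 3, sum_list() walks all nodes and stops at the terminating NULL; an empty list yields 0.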