diff options
author | Jack F Vogel <jfv@bluesong.net> | 2005-05-01 11:58:48 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-05-01 11:58:48 -0400 |
commit | 67701ae9767534534d3710664037dfde2cc04935 (patch) | |
tree | 6adb8d33585f8eee20794827c79e40991aeeaee5 /arch/x86_64/kernel/nmi.c | |
parent | fd51f666fa591294bd7462447512666e61c56ea0 (diff) |
[PATCH] check nmi watchdog is broken
A bug against an xSeries system showed up recently noting that the
check_nmi_watchdog() test was failing.
I have been investigating it and discovered in both i386 and x86_64 the
recent change to the routine to use the cpu_callin_map has uncovered a
problem. Prior to that change, on an SMP box, the test was trivally
passing because all cpu's were found to not yet be online, but now with the
callin_map they are discovered, it goes on to test the counter and they
have not yet begun to increment, so it announces a CPU is stuck and bails
out.
On all the systems I have access to test, the announcement of failure is
also bougs... by the time you can login and check /proc/interrupts, the
NMI count is happily incrementing on all CPUs. Its just that the test is
being done too early.
I have tried moving the call to the test around a bit, and it was always
too early. I finally hit on this proposed solution, it delays the routine
via a late_initcall(), seems like the right solution to me.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Diffstat (limited to 'arch/x86_64/kernel/nmi.c')
-rw-r--r-- | arch/x86_64/kernel/nmi.c | 9 |
1 files changed, 7 insertions, 2 deletions
diff --git a/arch/x86_64/kernel/nmi.c b/arch/x86_64/kernel/nmi.c index e00d4adec36b..61de0b34a01e 100644 --- a/arch/x86_64/kernel/nmi.c +++ b/arch/x86_64/kernel/nmi.c | |||
@@ -112,17 +112,20 @@ static __init int cpu_has_lapic(void) | |||
112 | } | 112 | } |
113 | } | 113 | } |
114 | 114 | ||
115 | int __init check_nmi_watchdog (void) | 115 | static int __init check_nmi_watchdog (void) |
116 | { | 116 | { |
117 | int counts[NR_CPUS]; | 117 | int counts[NR_CPUS]; |
118 | int cpu; | 118 | int cpu; |
119 | 119 | ||
120 | if (nmi_watchdog == NMI_NONE) | ||
121 | return 0; | ||
122 | |||
120 | if (nmi_watchdog == NMI_LOCAL_APIC && !cpu_has_lapic()) { | 123 | if (nmi_watchdog == NMI_LOCAL_APIC && !cpu_has_lapic()) { |
121 | nmi_watchdog = NMI_NONE; | 124 | nmi_watchdog = NMI_NONE; |
122 | return -1; | 125 | return -1; |
123 | } | 126 | } |
124 | 127 | ||
125 | printk(KERN_INFO "testing NMI watchdog ... "); | 128 | printk(KERN_INFO "Testing NMI watchdog ... "); |
126 | 129 | ||
127 | for (cpu = 0; cpu < NR_CPUS; cpu++) | 130 | for (cpu = 0; cpu < NR_CPUS; cpu++) |
128 | counts[cpu] = cpu_pda[cpu].__nmi_count; | 131 | counts[cpu] = cpu_pda[cpu].__nmi_count; |
@@ -148,6 +151,8 @@ int __init check_nmi_watchdog (void) | |||
148 | 151 | ||
149 | return 0; | 152 | return 0; |
150 | } | 153 | } |
154 | /* Have this called later during boot so counters are updating */ | ||
155 | late_initcall(check_nmi_watchdog); | ||
151 | 156 | ||
152 | int __init setup_nmi_watchdog(char *str) | 157 | int __init setup_nmi_watchdog(char *str) |
153 | { | 158 | { |