From 5f2b0ba4d94b3ac23cbc4b7f675d98eb677a760a Mon Sep 17 00:00:00 2001 From: Don Zickus Date: Fri, 12 Nov 2010 11:22:23 -0500 Subject: x86, nmi_watchdog: Remove the old nmi_watchdog Now that we have a new nmi_watchdog that is more generic and sits on top of the perf subsystem, we really do not need the old nmi_watchdog any more. In addition, the old nmi_watchdog doesn't really work if you are using the default clocksource, hpet. The old nmi_watchdog code relied on local apic interrupts to determine if the cpu is still alive. With hpet as the clocksource, these interrupts don't increment any more and the old nmi_watchdog triggers false postives. This piece removes the old nmi_watchdog code and stubs out any variables and functions calls. The stubs are the same ones used by the new nmi_watchdog code, so it should be well tested. Signed-off-by: Don Zickus Cc: fweisbec@gmail.com Cc: gorcunov@openvz.org LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar --- include/linux/nmi.h | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) (limited to 'include/linux/nmi.h') diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 06aab5eee134..0cb3e5c246d0 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -16,10 +16,7 @@ */ #ifdef ARCH_HAS_NMI_WATCHDOG #include -extern void touch_nmi_watchdog(void); -extern void acpi_nmi_disable(void); -extern void acpi_nmi_enable(void); -#else +#endif #ifndef CONFIG_HARDLOCKUP_DETECTOR static inline void touch_nmi_watchdog(void) { @@ -30,7 +27,6 @@ extern void touch_nmi_watchdog(void); #endif static inline void acpi_nmi_disable(void) { } static inline void acpi_nmi_enable(void) { } -#endif /* * Create trigger_all_cpu_backtrace() out of the arch-provided -- cgit v1.2.2 From 072b198a4ad48bd722ec6d203d65422a4698eae7 Mon Sep 17 00:00:00 2001 From: Don Zickus Date: Fri, 12 Nov 2010 11:22:24 -0500 Subject: x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog Now that the bulk of the old nmi_watchdog is gone, remove all the stub variables and hooks associated with it. This touches lots of files mainly because of how the io_apic nmi_watchdog was implemented. Now that the io_apic nmi_watchdog is forever gone, remove all its fingers. Most of this code was not being exercised by virtue of nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything to risky here. Signed-off-by: Don Zickus Cc: fweisbec@gmail.com Cc: gorcunov@openvz.org LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar --- include/linux/nmi.h | 2 -- 1 file changed, 2 deletions(-) (limited to 'include/linux/nmi.h') diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 0cb3e5c246d0..1c451e6ecc17 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -25,8 +25,6 @@ static inline void touch_nmi_watchdog(void) #else extern void touch_nmi_watchdog(void); #endif -static inline void acpi_nmi_disable(void) { } -static inline void acpi_nmi_enable(void) { } /* * Create trigger_all_cpu_backtrace() out of the arch-provided -- cgit v1.2.2 From 96a84c20d635fb1e98ab92f9fc517c4441f5c424 Mon Sep 17 00:00:00 2001 From: Don Zickus Date: Mon, 29 Nov 2010 17:07:16 -0500 Subject: lockup detector: Compile fixes from removing the old x86 nmi watchdog My patch that removed the old x86 nmi watchdog broke other arches. This change reverts a piece of that patch and puts the change in the correct spot. Signed-off-by: Don Zickus Reviewed-by: Cyrill Gorcunov Cc: fweisbec@gmail.com Cc: yinghai@kernel.org LKML-Reference: <1291068437-5331-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar --- include/linux/nmi.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'include/linux/nmi.h') diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 1c451e6ecc17..17ccf44e7dcb 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -16,7 +16,8 @@ */ #ifdef ARCH_HAS_NMI_WATCHDOG #include -#endif +extern void touch_nmi_watchdog(void); +#else #ifndef CONFIG_HARDLOCKUP_DETECTOR static inline void touch_nmi_watchdog(void) { @@ -25,6 +26,7 @@ static inline void touch_nmi_watchdog(void) #else extern void touch_nmi_watchdog(void); #endif +#endif /* * Create trigger_all_cpu_backtrace() out of the arch-provided -- cgit v1.2.2 From 4a7863cc2eb5f9804f1c4e9156619a801cd7f14f Mon Sep 17 00:00:00 2001 From: Don Zickus Date: Wed, 22 Dec 2010 14:00:03 -0500 Subject: x86, nmi_watchdog: Remove ARCH_HAS_NMI_WATCHDOG and rely on CONFIG_HARDLOCKUP_DETECTOR The x86 arch has shifted its use of the nmi_watchdog from a local implementation to the global one provide by kernel/watchdog.c. This shift has caused a whole bunch of compile problems under different config options. I attempt to simplify things with the patch below. In order to simplify things, I had to come to terms with the meaning of two terms ARCH_HAS_NMI_WATCHDOG and CONFIG_HARDLOCKUP_DETECTOR. Basically they mean the same thing, the former on a local level and the latter on a global level. With the old x86 nmi watchdog gone, there is no need to rely on defining the ARCH_HAS_NMI_WATCHDOG variable because it doesn't make sense any more. x86 will now use the global implementation. The changes below do a few things. First it changes the few places that relied on ARCH_HAS_NMI_WATCHDOG to use CONFIG_X86_LOCAL_APIC (the former was an alias for the latter anyway, so nothing unusual here). Those pieces of code were relying more on local apic functionality the nmi watchdog functionality, so the change should make sense. Second, I removed the x86 implementation of touch_nmi_watchdog(). It isn't need now, instead x86 will rely on kernel/watchdog.c's implementation. Third, I removed the #define ARCH_HAS_NMI_WATCHDOG itself from x86. And tweaked the include/linux/nmi.h file to tell users to look for an externally defined touch_nmi_watchdog in the case of ARCH_HAS_NMI_WATCHDOG _or_ CONFIG_HARDLOCKUP_DETECTOR. This changes removes some of the ugliness in that file. Finally, I added a Kconfig dependency for CONFIG_HARDLOCKUP_DETECTOR that said you can't have ARCH_HAS_NMI_WATCHDOG _and_ CONFIG_HARDLOCKUP_DETECTOR. You can only have one nmi_watchdog. Tested with ARCH=i386: allnoconfig, defconfig, allyesconfig, (various broken configs) ARCH=x86_64: allnoconfig, defconfig, allyesconfig, (various broken configs) Hopefully, after this patch I won't get any more compile broken emails. :-) v3: changed a couple of 'linux/nmi.h' -> 'asm/nmi.h' to pick-up correct function prototypes when CONFIG_HARDLOCKUP_DETECTOR is not set. Signed-off-by: Don Zickus Cc: Peter Zijlstra Cc: fweisbec@gmail.com LKML-Reference: <1293044403-14117-1-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar --- include/linux/nmi.h | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) (limited to 'include/linux/nmi.h') diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 17ccf44e7dcb..c536f8545f74 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -14,18 +14,14 @@ * may be used to reset the timeout - for code which intentionally * disables interrupts for a long time. This call is stateless. */ -#ifdef ARCH_HAS_NMI_WATCHDOG +#if defined(ARCH_HAS_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR) #include extern void touch_nmi_watchdog(void); #else -#ifndef CONFIG_HARDLOCKUP_DETECTOR static inline void touch_nmi_watchdog(void) { touch_softlockup_watchdog(); } -#else -extern void touch_nmi_watchdog(void); -#endif #endif /* -- cgit v1.2.2 From 586692a5a5fc5740c8a46abc0f2365495c2d7c5f Mon Sep 17 00:00:00 2001 From: Mandeep Singh Baines Date: Sun, 22 May 2011 22:10:22 -0700 Subject: watchdog: Disable watchdog when thresh is zero This restores the previous behavior of softlock_thresh. Currently, setting watchdog_thresh to zero causes the watchdog kthreads to consume a lot of CPU. In addition, the logic of proc_dowatchdog_thresh and proc_dowatchdog_enabled has been factored into proc_dowatchdog. Signed-off-by: Mandeep Singh Baines Cc: Marcin Slusarz Cc: Don Zickus Cc: Peter Zijlstra Cc: Frederic Weisbecker Link: http://lkml.kernel.org/r/1306127423-3347-3-git-send-email-msb@chromium.org Signed-off-by: Ingo Molnar LKML-Reference: <20110517071018.GE22305@elte.hu> --- include/linux/nmi.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'include/linux/nmi.h') diff --git a/include/linux/nmi.h b/include/linux/nmi.h index c536f8545f74..5317b8b2198f 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -47,9 +47,10 @@ static inline bool trigger_all_cpu_backtrace(void) int hw_nmi_is_cpu_stuck(struct pt_regs *); u64 hw_nmi_get_sample_period(void); extern int watchdog_enabled; +extern int watchdog_thresh; struct ctl_table; -extern int proc_dowatchdog_enabled(struct ctl_table *, int , - void __user *, size_t *, loff_t *); +extern int proc_dowatchdog(struct ctl_table *, int , + void __user *, size_t *, loff_t *); #endif #endif -- cgit v1.2.2 From 4eec42f392043063d0f019640b4ccc2a45570002 Mon Sep 17 00:00:00 2001 From: Mandeep Singh Baines Date: Sun, 22 May 2011 22:10:23 -0700 Subject: watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh Before the conversion of the NMI watchdog to perf event, the watchdog timeout was 5 seconds. Now it is 60 seconds. For my particular application, netbooks, 5 seconds was a better timeout. With a short timeout, we catch faults earlier and are able to send back a panic. With a 60 second timeout, the user is unlikely to wait and will instead hit the power button, causing us to lose the panic info. This change configures the NMI period to watchdog_thresh and sets the softlockup_thresh to watchdog_thresh * 2. In addition, watchdog_thresh was reduced to 10 seconds as suggested by Ingo Molnar. Signed-off-by: Mandeep Singh Baines Cc: Marcin Slusarz Cc: Don Zickus Cc: Peter Zijlstra Cc: Frederic Weisbecker Link: http://lkml.kernel.org/r/1306127423-3347-4-git-send-email-msb@chromium.org Signed-off-by: Ingo Molnar LKML-Reference: <20110517071642.GF22305@elte.hu> --- include/linux/nmi.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'include/linux/nmi.h') diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 5317b8b2198f..2d304efc89df 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -45,7 +45,7 @@ static inline bool trigger_all_cpu_backtrace(void) #ifdef CONFIG_LOCKUP_DETECTOR int hw_nmi_is_cpu_stuck(struct pt_regs *); -u64 hw_nmi_get_sample_period(void); +u64 hw_nmi_get_sample_period(int watchdog_thresh); extern int watchdog_enabled; extern int watchdog_thresh; struct ctl_table; -- cgit v1.2.2