watchdog: Update documentation

The soft and hard lockup detectors are now built on top of the hrtimer and perf subsystems. Update the documentation accordingly.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1328827342-6253-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
author: Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp> 2012-02-09 17:42:20 -0500
committer: Ingo Molnar <mingo@elte.hu> 2012-02-11 09:11:28 -0500
commit: 9919cba7ff71147803c988521cc1ceb80e7f0f6d (patch)
tree: 2e790fe9373225bb72fc74b3f14702bc04252508
parent: c98fdeaa92731308ed80386261fa2589addefa47 (diff)
2 files changed, 63 insertions, 83 deletions
diff --git a/Documentation/lockup-watchdogs.txt b/Documentation/lockup-watchdogs.txt
new file mode 100644
index 000000000000..d2a36602ca8d
--- /dev/null
+++ b/Documentation/lockup-watchdogs.txt
@@ -0,0 +1,63 @@
+===============================================================
+Softlockup detector and hardlockup detector (aka nmi_watchdog)
+===============================================================
+
+The Linux kernel can act as a watchdog to detect both soft and hard
+lockups.
+
+A 'softlockup' is defined as a bug that causes the kernel to loop in
+kernel mode for more than 20 seconds (see "Implementation" below for
+details), without giving other tasks a chance to run. The current
+stack trace is displayed upon detection and, by default, the system
+will stay locked up. Alternatively, the kernel can be configured to
+panic; a sysctl, "kernel.softlockup_panic", a kernel parameter,
+"softlockup_panic" (see "Documentation/kernel-parameters.txt" for
+details), and a compile option, "BOOTPARAM_SOFTLOCKUP_PANIC", are
+provided for this.
+
+A 'hardlockup' is defined as a bug that causes the CPU to loop in
+kernel mode for more than 10 seconds (see "Implementation" below for
+details), without letting other interrupts have a chance to run.
+Similarly to the softlockup case, the current stack trace is displayed
+upon detection and the system will stay locked up unless the default
+behavior is changed, which can be done through a compile time knob,
+"BOOTPARAM_HARDLOCKUP_PANIC", and a kernel parameter, "nmi_watchdog"
+(see "Documentation/kernel-parameters.txt" for details).
+
+The panic option can be used in combination with panic_timeout (this
+timeout is set through the confusingly named "kernel.panic" sysctl),
+to cause the system to reboot automatically after a specified amount
+of time.
+
+=== Implementation ===
+
+The soft and hard lockup detectors are built on top of the hrtimer and
+perf subsystems, respectively. A direct consequence of this is that,
+in principle, they should work on any architecture where these
+subsystems are present.
+
+A periodic hrtimer runs to generate interrupts and kick the watchdog
+task. An NMI perf event is generated every "watchdog_thresh"
+(compile-time initialized to 10 and configurable through sysctl of the
+same name) seconds to check for hardlockups. If any CPU in the system
+does not receive any hrtimer interrupt during that time the
+'hardlockup detector' (the handler for the NMI perf event) will
+generate a kernel warning or call panic, depending on the
+configuration.
+
+The watchdog task is a high priority kernel thread that updates a
+timestamp every time it is scheduled. If that timestamp is not updated
+for 2*watchdog_thresh seconds (the softlockup threshold) the
+'softlockup detector' (coded inside the hrtimer callback function)
+will dump useful debug information to the system log, after which it
+will call panic if it was instructed to do so or resume execution of
+other kernel code.
+
+The period of the hrtimer is 2*watchdog_thresh/5, which means it has
+two or three chances to generate an interrupt before the hardlockup
+detector kicks in.
+
+As explained above, a kernel knob ("watchdog_thresh") is provided that
+allows administrators to configure the period of the hrtimer and the
+perf event. The right value for a particular environment is a trade-off
+between fast response to lockups and detection overhead.
diff --git a/Documentation/nmi_watchdog.txt b/Documentation/nmi_watchdog.txt
deleted file mode 100644
index bf9f80a98282..000000000000
--- a/Documentation/nmi_watchdog.txt
+++ /dev/null
@@ -1,83 +0,0 @@
-
-[NMI watchdog is available for x86 and x86-64 architectures]
-
-Is your system locking up unpredictably? No keyboard activity, just
-a frustrating complete hard lockup? Do you want to help us debugging
-such lockups? If all yes then this document is definitely for you.
-
-On many x86/x86-64 type hardware there is a feature that enables
-us to generate 'watchdog NMI interrupts'.  (NMI: Non Maskable Interrupt
-which get executed even if the system is otherwise locked up hard).
-This can be used to debug hard kernel lockups.  By executing periodic
-NMI interrupts, the kernel can monitor whether any CPU has locked up,
-and print out debugging messages if so.
-
-In order to use the NMI watchdog, you need to have APIC support in your
-kernel. For SMP kernels, APIC support gets compiled in automatically. For
-UP, enable either CONFIG_X86_UP_APIC (Processor type and features -> Local
-APIC support on uniprocessors) or CONFIG_X86_UP_IOAPIC (Processor type and
-features -> IO-APIC support on uniprocessors) in your kernel config.
-CONFIG_X86_UP_APIC is for uniprocessor machines without an IO-APIC.
-CONFIG_X86_UP_IOAPIC is for uniprocessor with an IO-APIC. [Note: certain
-kernel debugging options, such as Kernel Stack Meter or Kernel Tracer,
-may implicitly disable the NMI watchdog.]
-
-For x86-64, the needed APIC is always compiled in.
-
-Using local APIC (nmi_watchdog=2) needs the first performance register, so
-you can't use it for other purposes (such as high precision performance
-profiling.) However, at least oprofile and the perfctr driver disable the
-local APIC NMI watchdog automatically.
-
-To actually enable the NMI watchdog, use the 'nmi_watchdog=N' boot
-parameter.  Eg. the relevant lilo.conf entry:
-
-        append="nmi_watchdog=1"
-
-For SMP machines and UP machines with an IO-APIC use nmi_watchdog=1.
-For UP machines without an IO-APIC use nmi_watchdog=2, this only works
-for some processor types.  If in doubt, boot with nmi_watchdog=1 and
-check the NMI count in /proc/interrupts; if the count is zero then
-reboot with nmi_watchdog=2 and check the NMI count.  If it is still
-zero then log a problem, you probably have a processor that needs to be
-added to the nmi code.
-
-A 'lockup' is the following scenario: if any CPU in the system does not
-execute the period local timer interrupt for more than 5 seconds, then
-the NMI handler generates an oops and kills the process. This
-'controlled crash' (and the resulting kernel messages) can be used to
-debug the lockup. Thus whenever the lockup happens, wait 5 seconds and
-the oops will show up automatically. If the kernel produces no messages
-then the system has crashed so hard (eg. hardware-wise) that either it
-cannot even accept NMI interrupts, or the crash has made the kernel
-unable to print messages.
-
-Be aware that when using local APIC, the frequency of NMI interrupts
-it generates, depends on the system load. The local APIC NMI watchdog,
-lacking a better source, uses the "cycles unhalted" event. As you may
-guess it doesn't tick when the CPU is in the halted state (which happens
-when the system is idle), but if your system locks up on anything but the
-"hlt" processor instruction, the watchdog will trigger very soon as the
-"cycles unhalted" event will happen every clock tick. If it locks up on
-"hlt", then you are out of luck -- the event will not happen at all and the
-watchdog won't trigger. This is a shortcoming of the local APIC watchdog
-- unfortunately there is no "clock ticks" event that would work all the
-time. The I/O APIC watchdog is driven externally and has no such shortcoming.
-But its NMI frequency is much higher, resulting in a more significant hit
-to the overall system performance.
-
-On x86 nmi_watchdog is disabled by default so you have to enable it with
-a boot time parameter.
-
-It's possible to disable the NMI watchdog in run-time by writing "0" to
-/proc/sys/kernel/nmi_watchdog. Writing "1" to the same file will re-enable
-the NMI watchdog. Notice that you still need to use "nmi_watchdog=" parameter
-at boot time.
-
-NOTE: In kernels prior to 2.4.2-ac18 the NMI-oopser is enabled unconditionally
-on x86 SMP boxes.
-
-[ feel free to send bug reports, suggestions and patches to
-  Ingo Molnar <mingo@redhat.com> or the Linux SMP mailing
-  list at <linux-smp@vger.kernel.org> ]
-
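
Note on the timing described in the new lockup-watchdogs.txt: the "Implementation" section derives every interval from "watchdog_thresh". The relationships can be checked with a little arithmetic. What follows is a minimal Python sketch, not kernel code. The function name watchdog_timings and the dictionary keys are hypothetical and chosen only for illustration; the formulas themselves (softlockup threshold = 2*watchdog_thresh, hrtimer period = 2*watchdog_thresh/5, one NMI perf event every watchdog_thresh seconds) are taken from the text of the patch.

```python
def watchdog_timings(watchdog_thresh: int = 10) -> dict:
    """Return the intervals (in seconds) that the documentation derives
    from watchdog_thresh. Illustrative only: names are hypothetical."""
    if watchdog_thresh <= 0:
        # This sketch only handles positive thresholds.
        raise ValueError("watchdog_thresh must be positive in this sketch")

    # Softlockup threshold: the watchdog task's timestamp must not go
    # stale for longer than 2*watchdog_thresh seconds.
    softlockup_threshold = 2 * watchdog_thresh

    # The hrtimer fires every 2*watchdog_thresh/5 seconds.
    hrtimer_period = softlockup_threshold / 5

    # The NMI perf event (hardlockup check) fires every watchdog_thresh
    # seconds.
    nmi_period = watchdog_thresh

    return {
        "softlockup_threshold": softlockup_threshold,
        "hrtimer_period": hrtimer_period,
        "nmi_period": nmi_period,
        # Average number of hrtimer interrupts per NMI period; 2.5 is
        # why the text says "two or three chances".
        "hrtimer_ticks_per_nmi_period": nmi_period / hrtimer_period,
    }


if __name__ == "__main__":
    for name, value in watchdog_timings().items():
        print(f"{name}: {value}")
```

With the default of 10, the sketch gives a 20-second softlockup threshold and a 4-second hrtimer period, matching the "20 seconds" and "10 seconds" figures quoted in the definitions of 'softlockup' and 'hardlockup'. The ratio of NMI period to hrtimer period is 2.5 regardless of the threshold chosen, so lowering watchdog_thresh speeds up detection while raising the interrupt rate proportionally, which is the trade-off the last paragraph of the new file describes.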