author     Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp>  2012-02-09 17:42:20 -0500
committer  Ingo Molnar <mingo@elte.hu>  2012-02-11 09:11:28 -0500
commit     9919cba7ff71147803c988521cc1ceb80e7f0f6d
tree       2e790fe9373225bb72fc74b3f14702bc04252508
parent     c98fdeaa92731308ed80386261fa2589addefa47
watchdog: Update documentation

The soft and hard lockup detectors are now built on top of the hrtimer
and perf subsystems. Update the documentation accordingly.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1328827342-6253-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-rw-r--r--  Documentation/lockup-watchdogs.txt  63
-rw-r--r--  Documentation/nmi_watchdog.txt      83
2 files changed, 63 insertions, 83 deletions
diff --git a/Documentation/lockup-watchdogs.txt b/Documentation/lockup-watchdogs.txt
new file mode 100644
index 000000000000..d2a36602ca8d
--- /dev/null
+++ b/Documentation/lockup-watchdogs.txt
@@ -0,0 +1,63 @@
===============================================================
Softlockup detector and hardlockup detector (aka nmi_watchdog)
===============================================================

The Linux kernel can act as a watchdog to detect both soft and hard
lockups.

A 'softlockup' is defined as a bug that causes the kernel to loop in
kernel mode for more than 20 seconds (see "Implementation" below for
details), without giving other tasks a chance to run. The current
stack trace is displayed upon detection and, by default, the system
will stay locked up. Alternatively, the kernel can be configured to
panic; a sysctl, "kernel.softlockup_panic", a kernel parameter,
"softlockup_panic" (see "Documentation/kernel-parameters.txt" for
details), and a compile option, "BOOTPARAM_SOFTLOCKUP_PANIC", are
provided for this.

A 'hardlockup' is defined as a bug that causes the CPU to loop in
kernel mode for more than 10 seconds (see "Implementation" below for
details), without letting other interrupts have a chance to run.
Similarly to the softlockup case, the current stack trace is displayed
upon detection and the system will stay locked up unless the default
behavior is changed, which can be done through a compile time knob,
"BOOTPARAM_HARDLOCKUP_PANIC", and a kernel parameter, "nmi_watchdog"
(see "Documentation/kernel-parameters.txt" for details).

The panic option can be used in combination with panic_timeout (this
timeout is set through the confusingly named "kernel.panic" sysctl),
to cause the system to reboot automatically after a specified amount
of time.
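
As an illustration of combining the two knobs, the following Python
sketch renders a sysctl.conf-style fragment that makes a soft lockup
panic and then reboots the machine once the panic timeout expires. The
helper name render_lockup_sysctls and the 30 second value are
assumptions made for this example; only the sysctl names
"kernel.softlockup_panic" and "kernel.panic" come from the text above.

```python
def render_lockup_sysctls(reboot_delay_s):
    """Return sysctl.conf-style lines: panic on softlockup, then reboot.

    reboot_delay_s is the value written to kernel.panic (panic_timeout),
    i.e. the number of seconds to wait after a panic before rebooting.
    The function name and argument are hypothetical, for illustration only.
    """
    if reboot_delay_s <= 0:
        # This sketch only covers the "reboot after a specified amount
        # of time" case described in the text.
        raise ValueError("reboot_delay_s must be a positive number of seconds")
    return "\n".join([
        "kernel.softlockup_panic = 1",
        f"kernel.panic = {int(reboot_delay_s)}",
    ])


print(render_lockup_sysctls(30))
```

The two printed lines could be placed in a sysctl configuration file;
whether that is appropriate depends on how the system is administered.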

=== Implementation ===

The soft and hard lockup detectors are built on top of the hrtimer and
perf subsystems, respectively. A direct consequence of this is that,
in principle, they should work in any architecture where these
subsystems are present.

A periodic hrtimer runs to generate interrupts and kick the watchdog
task. An NMI perf event is generated every "watchdog_thresh"
(compile-time initialized to 10 and configurable through sysctl of the
same name) seconds to check for hardlockups. If any CPU in the system
does not receive any hrtimer interrupt during that time the
'hardlockup detector' (the handler for the NMI perf event) will
generate a kernel warning or call panic, depending on the
configuration.
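
The check described above can be modelled in a few lines of Python.
This is a toy per-CPU model, not kernel code: the class and attribute
names are assumptions chosen for the example, and the only behaviour it
encodes is "no hrtimer interrupt since the previous NMI check means a
hardlockup".

```python
class HardlockupModel:
    """Toy per-CPU model of the hardlockup check (hypothetical names)."""

    def __init__(self):
        self.hrtimer_interrupts = 0  # bumped by every hrtimer interrupt
        self.saved_interrupts = 0    # count seen by the previous NMI check

    def hrtimer_tick(self):
        """Simulate one hrtimer interrupt arriving on this CPU."""
        self.hrtimer_interrupts += 1

    def nmi_check(self):
        """Simulate the NMI perf event handler.

        Returns True if no hrtimer interrupt arrived since the last check.
        """
        stuck = self.hrtimer_interrupts == self.saved_interrupts
        self.saved_interrupts = self.hrtimer_interrupts
        return stuck


cpu = HardlockupModel()
cpu.hrtimer_tick()
cpu.hrtimer_tick()
healthy_window = cpu.nmi_check()  # interrupts arrived: no lockup reported
locked_window = cpu.nmi_check()   # nothing arrived since: lockup reported
print(healthy_window, locked_window)
```

In this model the second check reports a lockup because the interrupt
count did not move between the two NMI events.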

The watchdog task is a high priority kernel thread that updates a
timestamp every time it is scheduled. If that timestamp is not updated
for 2*watchdog_thresh seconds (the softlockup threshold) the
'softlockup detector' (coded inside the hrtimer callback function)
will dump useful debug information to the system log, after which it
will call panic if it was instructed to do so or resume execution of
other kernel code.
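
A minimal Python sketch of the timestamp comparison follows. The
function names are assumptions for illustration, times are plain
seconds, and the second helper additionally assumes that the hrtimer
callbacks fire at exact multiples of the period after the last
timestamp update, which real systems will only approximate.

```python
def is_softlockup(touch_ts, now, watchdog_thresh=10):
    """Return True if the watchdog task's timestamp is older than the
    softlockup threshold (2 * watchdog_thresh seconds)."""
    softlockup_thresh = 2 * watchdog_thresh
    return now - touch_ts > softlockup_thresh


def first_softlockup_report(last_touch, watchdog_thresh=10):
    """Return the time of the first hrtimer callback that would report a
    softlockup, given that the watchdog task last ran at last_touch."""
    period = 2 * watchdog_thresh / 5  # hrtimer period from the text
    now = last_touch
    while True:
        now += period  # next hrtimer callback
        if is_softlockup(last_touch, now, watchdog_thresh):
            return now


print(first_softlockup_report(8))
```

With the default threshold of 10 and a watchdog task that last ran at
t=8, the callback at t=28 sees a timestamp exactly 20 seconds old and
stays quiet, so the first report in this model comes at t=32.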

The period of the hrtimer is 2*watchdog_thresh/5, which means it has
two or three chances to generate an interrupt before the hardlockup
detector kicks in.
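
The arithmetic behind "two or three chances" can be worked through with
a short Python sketch. The function name and dictionary keys are
assumptions made for the example; the formulas are the ones given in
the text.

```python
def watchdog_timing(watchdog_thresh=10):
    """Derive the watchdog intervals, in seconds, from watchdog_thresh."""
    hrtimer_period = 2 * watchdog_thresh / 5
    return {
        # interval between NMI perf events (hardlockup checks)
        "hardlockup_window_s": watchdog_thresh,
        # age of the timestamp that triggers the softlockup detector
        "softlockup_thresh_s": 2 * watchdog_thresh,
        # interval between hrtimer interrupts
        "hrtimer_period_s": hrtimer_period,
        # hrtimer periods that fit into one hardlockup window
        "hrtimer_periods_per_window": watchdog_thresh / hrtimer_period,
    }


print(watchdog_timing(10))
```

For the default value of 10 this gives a 4 second hrtimer period and
2.5 periods per hardlockup window, so depending on the phase either two
or three hrtimer interrupts fall inside each window. The ratio is 2.5
for any value of watchdog_thresh, since it only depends on the 2/5
factor.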

As explained above, a kernel knob is provided that allows
administrators to configure the period of the hrtimer and the perf
event. The right value for a particular environment is a trade-off
between fast response to lockups and detection overhead.
diff --git a/Documentation/nmi_watchdog.txt b/Documentation/nmi_watchdog.txt
deleted file mode 100644
index bf9f80a98282..000000000000
--- a/Documentation/nmi_watchdog.txt
+++ /dev/null
@@ -1,83 +0,0 @@

[NMI watchdog is available for x86 and x86-64 architectures]

Is your system locking up unpredictably? No keyboard activity, just
a frustrating complete hard lockup? Do you want to help us debugging
such lockups? If all yes then this document is definitely for you.

On many x86/x86-64 type hardware there is a feature that enables
us to generate 'watchdog NMI interrupts'. (NMI: Non Maskable Interrupt
which get executed even if the system is otherwise locked up hard).
This can be used to debug hard kernel lockups. By executing periodic
NMI interrupts, the kernel can monitor whether any CPU has locked up,
and print out debugging messages if so.

In order to use the NMI watchdog, you need to have APIC support in your
kernel. For SMP kernels, APIC support gets compiled in automatically. For
UP, enable either CONFIG_X86_UP_APIC (Processor type and features -> Local
APIC support on uniprocessors) or CONFIG_X86_UP_IOAPIC (Processor type and
features -> IO-APIC support on uniprocessors) in your kernel config.
CONFIG_X86_UP_APIC is for uniprocessor machines without an IO-APIC.
CONFIG_X86_UP_IOAPIC is for uniprocessor with an IO-APIC. [Note: certain
kernel debugging options, such as Kernel Stack Meter or Kernel Tracer,
may implicitly disable the NMI watchdog.]

For x86-64, the needed APIC is always compiled in.

Using local APIC (nmi_watchdog=2) needs the first performance register, so
you can't use it for other purposes (such as high precision performance
profiling.) However, at least oprofile and the perfctr driver disable the
local APIC NMI watchdog automatically.

To actually enable the NMI watchdog, use the 'nmi_watchdog=N' boot
parameter. Eg. the relevant lilo.conf entry:

   append="nmi_watchdog=1"

For SMP machines and UP machines with an IO-APIC use nmi_watchdog=1.
For UP machines without an IO-APIC use nmi_watchdog=2, this only works
for some processor types. If in doubt, boot with nmi_watchdog=1 and
check the NMI count in /proc/interrupts; if the count is zero then
reboot with nmi_watchdog=2 and check the NMI count. If it is still
zero then log a problem, you probably have a processor that needs to be
added to the nmi code.

A 'lockup' is the following scenario: if any CPU in the system does not
execute the period local timer interrupt for more than 5 seconds, then
the NMI handler generates an oops and kills the process. This
'controlled crash' (and the resulting kernel messages) can be used to
debug the lockup. Thus whenever the lockup happens, wait 5 seconds and
the oops will show up automatically. If the kernel produces no messages
then the system has crashed so hard (eg. hardware-wise) that either it
cannot even accept NMI interrupts, or the crash has made the kernel
unable to print messages.

Be aware that when using local APIC, the frequency of NMI interrupts
it generates, depends on the system load. The local APIC NMI watchdog,
lacking a better source, uses the "cycles unhalted" event. As you may
guess it doesn't tick when the CPU is in the halted state (which happens
when the system is idle), but if your system locks up on anything but the
"hlt" processor instruction, the watchdog will trigger very soon as the
"cycles unhalted" event will happen every clock tick. If it locks up on
"hlt", then you are out of luck -- the event will not happen at all and the
watchdog won't trigger. This is a shortcoming of the local APIC watchdog
-- unfortunately there is no "clock ticks" event that would work all the
time. The I/O APIC watchdog is driven externally and has no such shortcoming.
But its NMI frequency is much higher, resulting in a more significant hit
to the overall system performance.

On x86 nmi_watchdog is disabled by default so you have to enable it with
a boot time parameter.

It's possible to disable the NMI watchdog in run-time by writing "0" to
/proc/sys/kernel/nmi_watchdog. Writing "1" to the same file will re-enable
the NMI watchdog. Notice that you still need to use "nmi_watchdog=" parameter
at boot time.

NOTE: In kernels prior to 2.4.2-ac18 the NMI-oopser is enabled unconditionally
on x86 SMP boxes.

[ feel free to send bug reports, suggestions and patches to
 Ingo Molnar <mingo@redhat.com> or the Linux SMP mailing
 list at <linux-smp@vger.kernel.org> ]
