author     Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp>  2012-02-09 17:42:20 -0500
committer  Ingo Molnar <mingo@elte.hu>  2012-02-11 09:11:28 -0500
commit     9919cba7ff71147803c988521cc1ceb80e7f0f6d
tree       2e790fe9373225bb72fc74b3f14702bc04252508
parent     c98fdeaa92731308ed80386261fa2589addefa47

watchdog: Update documentation
The soft and hard lockup detectors are now built on top of the
hrtimer and perf subsystems. Update the documentation
accordingly.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1328827342-6253-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-rw-r--r-- | Documentation/lockup-watchdogs.txt | 63 |
-rw-r--r-- | Documentation/nmi_watchdog.txt | 83 |
2 files changed, 63 insertions, 83 deletions
diff --git a/Documentation/lockup-watchdogs.txt b/Documentation/lockup-watchdogs.txt
new file mode 100644
index 000000000000..d2a36602ca8d
--- /dev/null
+++ b/Documentation/lockup-watchdogs.txt
@@ -0,0 +1,63 @@
1 | =============================================================== | ||
2 | Softlockup detector and hardlockup detector (aka nmi_watchdog) | ||
3 | =============================================================== | ||
4 | |||
5 | The Linux kernel can act as a watchdog to detect both soft and hard | ||
6 | lockups. | ||
7 | |||
8 | A 'softlockup' is defined as a bug that causes the kernel to loop in | ||
9 | kernel mode for more than 20 seconds (see "Implementation" below for | ||
10 | details), without giving other tasks a chance to run. The current | ||
11 | stack trace is displayed upon detection and, by default, the system | ||
12 | will stay locked up. Alternatively, the kernel can be configured to | ||
13 | panic; a sysctl, "kernel.softlockup_panic", a kernel parameter, | ||
14 | "softlockup_panic" (see "Documentation/kernel-parameters.txt" for | ||
15 | details), and a compile option, "BOOTPARAM_SOFTLOCKUP_PANIC", are | ||
16 | provided for this. | ||
17 | |||
18 | A 'hardlockup' is defined as a bug that causes the CPU to loop in | ||
19 | kernel mode for more than 10 seconds (see "Implementation" below for | ||
20 | details), without letting other interrupts have a chance to run. | ||
21 | Similarly to the softlockup case, the current stack trace is displayed | ||
22 | upon detection and the system will stay locked up unless the default | ||
23 | behavior is changed, which can be done through a compile time knob, | ||
24 | "BOOTPARAM_HARDLOCKUP_PANIC", and a kernel parameter, "nmi_watchdog" | ||
25 | (see "Documentation/kernel-parameters.txt" for details). | ||
26 | |||
27 | The panic option can be used in combination with panic_timeout (this | ||
28 | timeout is set through the confusingly named "kernel.panic" sysctl), | ||
29 | to cause the system to reboot automatically after a specified amount | ||
30 | of time. | ||
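
As a hedged illustration of the panic settings described above, the following Python sketch renders a sysctl.conf-style fragment. The sysctl names "kernel.softlockup_panic" and "kernel.panic" come from the text; the helper name render_panic_sysctls and the 30-second timeout are assumptions made only for this example.

```python
def render_panic_sysctls(softlockup_panic: bool, panic_timeout: int) -> str:
    """Return sysctl.conf-style lines for the two knobs named in the text.

    Hypothetical helper: it only formats text and changes no system state.
    """
    lines = [
        # 1 asks the kernel to panic when a softlockup is detected.
        f"kernel.softlockup_panic = {int(softlockup_panic)}",
        # "kernel.panic" is the sysctl behind panic_timeout: the number of
        # seconds to wait after a panic before rebooting.
        f"kernel.panic = {panic_timeout}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only: panic on softlockup, reboot 30 seconds later.
    print(render_panic_sysctls(True, 30), end="")
```

The output is meant to be reviewed by an administrator before being placed in a sysctl configuration file; the sketch applies nothing by itself.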
31 | |||
32 | === Implementation === | ||
33 | |||
34 | The soft and hard lockup detectors are built on top of the hrtimer and | ||
35 | perf subsystems, respectively. A direct consequence of this is that, | ||
36 | in principle, they should work on any architecture where these | ||
37 | subsystems are present. | ||
38 | |||
39 | A periodic hrtimer runs to generate interrupts and kick the watchdog | ||
40 | task. An NMI perf event is generated every "watchdog_thresh" | ||
41 | (compile-time initialized to 10 and configurable through the sysctl of | ||
42 | the same name) seconds to check for hardlockups. If any CPU in the | ||
43 | system does not receive any hrtimer interrupt during that time, the | ||
44 | 'hardlockup detector' (the handler for the NMI perf event) will | ||
45 | generate a kernel warning or call panic, depending on the | ||
46 | configuration. | ||
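
The hardlockup check described above can be sketched as a toy model in Python. This is a sketch under stated assumptions, not the kernel's code: the text only says that a CPU which receives no hrtimer interrupt between two NMI perf events is treated as locked up. Comparing a per-CPU interrupt counter against the value saved at the previous check is one way to model that, and the names CpuState, hrtimer_tick and nmi_check are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CpuState:
    """Per-CPU bookkeeping for this toy model (names are hypothetical)."""
    hrtimer_interrupts: int = 0  # bumped by every hrtimer interrupt
    saved_interrupts: int = 0    # counter value seen by the previous NMI check


def hrtimer_tick(cpu: CpuState) -> None:
    """Model one hrtimer interrupt being handled on the CPU."""
    cpu.hrtimer_interrupts += 1


def nmi_check(cpu: CpuState) -> bool:
    """Model the NMI perf event handler; True means a hardlockup is suspected.

    If the counter did not move since the previous check, no hrtimer
    interrupt was handled during that interval.
    """
    if cpu.hrtimer_interrupts == cpu.saved_interrupts:
        return True
    cpu.saved_interrupts = cpu.hrtimer_interrupts
    return False


if __name__ == "__main__":
    cpu = CpuState()
    hrtimer_tick(cpu)
    print("after a tick:", nmi_check(cpu))
    print("with no tick:", nmi_check(cpu))
```

In the model, a healthy CPU keeps calling hrtimer_tick between checks, so nmi_check keeps returning False; a CPU stuck with interrupts blocked stops ticking and the next check reports it.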
47 | |||
48 | The watchdog task is a high priority kernel thread that updates a | ||
49 | timestamp every time it is scheduled. If that timestamp is not updated | ||
50 | for 2*watchdog_thresh seconds (the softlockup threshold) the | ||
51 | 'softlockup detector' (coded inside the hrtimer callback function) | ||
52 | will dump useful debug information to the system log, after which it | ||
53 | will call panic if it was instructed to do so or resume execution of | ||
54 | other kernel code. | ||
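
The softlockup check can be modelled in the same hedged way. In the sketch below, the class SoftlockupModel and its method names are hypothetical, timestamps are plain seconds supplied by the caller, and the strict "greater than" comparison is an assumption; the text only states that the timestamp must go without an update for 2*watchdog_thresh seconds.

```python
class SoftlockupModel:
    """Toy model of the softlockup detector (all names are hypothetical)."""

    def __init__(self, watchdog_thresh: int = 10) -> None:
        # The softlockup threshold is twice watchdog_thresh, per the text.
        self.softlockup_thresh = 2 * watchdog_thresh
        self.touch_ts = 0.0

    def watchdog_task_ran(self, now: float) -> None:
        """The high priority watchdog task was scheduled: update the timestamp."""
        self.touch_ts = now

    def hrtimer_callback(self, now: float) -> bool:
        """Return True if the timestamp is stale enough to report a softlockup."""
        return now - self.touch_ts > self.softlockup_thresh


if __name__ == "__main__":
    model = SoftlockupModel(watchdog_thresh=10)
    model.watchdog_task_ran(now=100.0)
    print("4 s later: ", model.hrtimer_callback(now=104.0))
    print("24 s later:", model.hrtimer_callback(now=124.0))
```

The point of the model is that the hrtimer callback still runs while the watchdog task is starved, which is what lets it notice that the timestamp has stopped moving.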
55 | |||
56 | The period of the hrtimer is 2*watchdog_thresh/5, which means it has | ||
57 | two or three chances to generate an interrupt before the hardlockup | ||
58 | detector kicks in. | ||
59 | |||
60 | As explained above, a kernel knob ("watchdog_thresh") is provided that | ||
61 | allows administrators to configure the period of the hrtimer and the perf | ||
62 | event. The right value for a particular environment is a trade-off | ||
63 | between fast response to lockups and detection overhead. | ||
diff --git a/Documentation/nmi_watchdog.txt b/Documentation/nmi_watchdog.txt
deleted file mode 100644
index bf9f80a98282..000000000000
--- a/Documentation/nmi_watchdog.txt
+++ /dev/null
@@ -1,83 +0,0 @@
1 | |||
2 | [NMI watchdog is available for x86 and x86-64 architectures] | ||
3 | |||
4 | Is your system locking up unpredictably? No keyboard activity, just | ||
5 | a frustrating complete hard lockup? Do you want to help us debug | ||
6 | such lockups? If so, this document is definitely for you. | ||
7 | |||
8 | On many x86/x86-64 systems there is a feature that enables | ||
9 | us to generate 'watchdog NMI interrupts'. (NMI: Non Maskable Interrupt, | ||
10 | which gets executed even if the system is otherwise locked up hard). | ||
11 | This can be used to debug hard kernel lockups. By executing periodic | ||
12 | NMI interrupts, the kernel can monitor whether any CPU has locked up, | ||
13 | and print out debugging messages if so. | ||
14 | |||
15 | In order to use the NMI watchdog, you need to have APIC support in your | ||
16 | kernel. For SMP kernels, APIC support gets compiled in automatically. For | ||
17 | UP, enable either CONFIG_X86_UP_APIC (Processor type and features -> Local | ||
18 | APIC support on uniprocessors) or CONFIG_X86_UP_IOAPIC (Processor type and | ||
19 | features -> IO-APIC support on uniprocessors) in your kernel config. | ||
20 | CONFIG_X86_UP_APIC is for uniprocessor machines without an IO-APIC. | ||
21 | CONFIG_X86_UP_IOAPIC is for uniprocessor with an IO-APIC. [Note: certain | ||
22 | kernel debugging options, such as Kernel Stack Meter or Kernel Tracer, | ||
23 | may implicitly disable the NMI watchdog.] | ||
24 | |||
25 | For x86-64, the needed APIC is always compiled in. | ||
26 | |||
27 | Using local APIC (nmi_watchdog=2) needs the first performance register, so | ||
28 | you can't use it for other purposes (such as high precision performance | ||
29 | profiling.) However, at least oprofile and the perfctr driver disable the | ||
30 | local APIC NMI watchdog automatically. | ||
31 | |||
32 | To actually enable the NMI watchdog, use the 'nmi_watchdog=N' boot | ||
33 | parameter. Eg. the relevant lilo.conf entry: | ||
34 | |||
35 | append="nmi_watchdog=1" | ||
36 | |||
37 | For SMP machines and UP machines with an IO-APIC use nmi_watchdog=1. | ||
38 | For UP machines without an IO-APIC use nmi_watchdog=2, this only works | ||
39 | for some processor types. If in doubt, boot with nmi_watchdog=1 and | ||
40 | check the NMI count in /proc/interrupts; if the count is zero then | ||
41 | reboot with nmi_watchdog=2 and check the NMI count. If it is still | ||
42 | zero then log a problem, you probably have a processor that needs to be | ||
43 | added to the nmi code. | ||
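
The NMI count check described above can be sketched in Python. The layout assumed here (a row labelled "NMI:" followed by one count per CPU and then a description) is an assumption based on typical x86 /proc/interrupts output, and the function name nmi_counts and the sample text are hypothetical.

```python
from typing import List


def nmi_counts(interrupts_text: str) -> List[int]:
    """Return the per-CPU NMI counts found in /proc/interrupts-style text.

    Returns an empty list when no "NMI:" row is present.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # the trailing description starts here
                counts.append(int(field))
            return counts
    return []


if __name__ == "__main__":
    # Made-up sample in the assumed layout; on a real system the text would
    # come from reading /proc/interrupts.
    sample = (
        "           CPU0       CPU1\n"
        "NMI:          0          0   Non-maskable interrupts\n"
    )
    counts = nmi_counts(sample)
    print("NMI counts:", counts)
    print("all zero:", bool(counts) and all(c == 0 for c in counts))
```

An all-zero result after the system has been running for a while is the situation in which the text suggests rebooting with the other nmi_watchdog value.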
44 | |||
45 | A 'lockup' is the following scenario: if any CPU in the system does not | ||
46 | execute the periodic local timer interrupt for more than 5 seconds, then | ||
47 | the NMI handler generates an oops and kills the process. This | ||
48 | 'controlled crash' (and the resulting kernel messages) can be used to | ||
49 | debug the lockup. Thus whenever the lockup happens, wait 5 seconds and | ||
50 | the oops will show up automatically. If the kernel produces no messages | ||
51 | then the system has crashed so hard (eg. hardware-wise) that either it | ||
52 | cannot even accept NMI interrupts, or the crash has made the kernel | ||
53 | unable to print messages. | ||
54 | |||
55 | Be aware that when using local APIC, the frequency of NMI interrupts | ||
56 | it generates, depends on the system load. The local APIC NMI watchdog, | ||
57 | lacking a better source, uses the "cycles unhalted" event. As you may | ||
58 | guess it doesn't tick when the CPU is in the halted state (which happens | ||
59 | when the system is idle), but if your system locks up on anything but the | ||
60 | "hlt" processor instruction, the watchdog will trigger very soon as the | ||
61 | "cycles unhalted" event will happen every clock tick. If it locks up on | ||
62 | "hlt", then you are out of luck -- the event will not happen at all and the | ||
63 | watchdog won't trigger. This is a shortcoming of the local APIC watchdog | ||
64 | -- unfortunately there is no "clock ticks" event that would work all the | ||
65 | time. The I/O APIC watchdog is driven externally and has no such shortcoming. | ||
66 | But its NMI frequency is much higher, resulting in a more significant hit | ||
67 | to the overall system performance. | ||
68 | |||
69 | On x86 nmi_watchdog is disabled by default so you have to enable it with | ||
70 | a boot time parameter. | ||
71 | |||
72 | It's possible to disable the NMI watchdog in run-time by writing "0" to | ||
73 | /proc/sys/kernel/nmi_watchdog. Writing "1" to the same file will re-enable | ||
74 | the NMI watchdog. Notice that you still need to use "nmi_watchdog=" parameter | ||
75 | at boot time. | ||
76 | |||
77 | NOTE: In kernels prior to 2.4.2-ac18 the NMI-oopser is enabled unconditionally | ||
78 | on x86 SMP boxes. | ||
79 | |||
80 | [ feel free to send bug reports, suggestions and patches to | ||
81 | Ingo Molnar <mingo@redhat.com> or the Linux SMP mailing | ||
82 | list at <linux-smp@vger.kernel.org> ] | ||
83 | |||