| author | Prarit Bhargava <prarit@redhat.com> | 2014-12-10 18:45:50 -0500 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2014-12-10 20:41:10 -0500 |
| commit | 9e3961a0979817c612b10b2da4f3045ec9faa779 | |
| tree | 08ddeb0aed7fe4a0dd0e00838b373be786c95ada /Documentation/sysctl | |
| parent | f938612dd97d481b8b5bf960c992ae577f081c17 | |
kernel: add panic_on_warn
There have been several times where I have had to rebuild a kernel to
cause a panic when hitting a WARN() in the code in order to get a crash
dump from a system. Sometimes this is easy to do, other times (such as
in the case of a remote admin) it is not trivial to send new images to
the user.
A much easier method would be a switch to change the WARN() over to a
panic. This makes debugging easier in that I can now test the actual
image the WARN() was seen on and I do not have to engage in remote
debugging.
This patch adds a panic_on_warn kernel parameter and a
/proc/sys/kernel/panic_on_warn sysctl. When either is set, panic() is
called in the warn_slowpath_common() path. The function will still print
out the location of the warning.
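As a rough illustration of the runtime switch, the sketch below reads and writes the sysctl file from user space. Only the path /proc/sys/kernel/panic_on_warn and the values 0 and 1 come from this patch; the helper names read_panic_on_warn and set_panic_on_warn are hypothetical, and writing the file is assumed to require root.

```python
from pathlib import Path

# Path named in the commit message; it only exists on kernels that carry this patch.
PANIC_ON_WARN = Path("/proc/sys/kernel/panic_on_warn")


def read_panic_on_warn(path: Path = PANIC_ON_WARN) -> bool:
    """Return True if the file holds 1, i.e. WARN() is set to call panic()."""
    return path.read_text().strip() == "1"


def set_panic_on_warn(enabled: bool, path: Path = PANIC_ON_WARN) -> None:
    """Write 1 (panic after printing the WARN() location) or 0 (only WARN())."""
    path.write_text("1\n" if enabled else "0\n")


if __name__ == "__main__":
    if PANIC_ON_WARN.exists():
        print("panic_on_warn =", int(read_panic_on_warn()))
    else:
        print("panic_on_warn is not available on this kernel")
```

The path argument is there so the helpers can be pointed at an ordinary file when experimenting, without touching the running kernel.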
An example of the panic_on_warn output follows. The first line is
printed by the WARN_ON() and reports its location; the panic() output
is displayed after it.
WARNING: CPU: 30 PID: 11698 at /home/prarit/dummy_module/dummy-module.c:25 init_dummy+0x1f/0x30 [dummy_module]()
Kernel panic - not syncing: panic_on_warn set ...
CPU: 30 PID: 11698 Comm: insmod Tainted: G W OE 3.17.0+ #57
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
0000000000000000 000000008e3f87df ffff88080f093c38 ffffffff81665190
0000000000000000 ffffffff818aea3d ffff88080f093cb8 ffffffff8165e2ec
ffffffff00000008 ffff88080f093cc8 ffff88080f093c68 000000008e3f87df
Call Trace:
[<ffffffff81665190>] dump_stack+0x46/0x58
[<ffffffff8165e2ec>] panic+0xd0/0x204
[<ffffffffa038e05f>] ? init_dummy+0x1f/0x30 [dummy_module]
[<ffffffff81076b90>] warn_slowpath_common+0xd0/0xd0
[<ffffffffa038e040>] ? dummy_greetings+0x40/0x40 [dummy_module]
[<ffffffff81076c8a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa038e05f>] init_dummy+0x1f/0x30 [dummy_module]
[<ffffffff81002144>] do_one_initcall+0xd4/0x210
[<ffffffff811b52c2>] ? __vunmap+0xc2/0x110
[<ffffffff810f8889>] load_module+0x16a9/0x1b30
[<ffffffff810f3d30>] ? store_uevent+0x70/0x70
[<ffffffff810f49b9>] ? copy_module_from_fd.isra.44+0x129/0x180
[<ffffffff810f8ec6>] SyS_finit_module+0xa6/0xd0
[<ffffffff8166cf29>] system_call_fastpath+0x12/0x17
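When triaging a captured console log, the WARNING line and the panic line shown above can be matched up mechanically. The following is a minimal sketch that assumes the exact line formats in this example output (other kernel versions may print the WARNING line differently); the function name find_panic_on_warn and the dictionary keys are hypothetical.

```python
import re

# Matches the WARNING line format shown in the example output above.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<file>\S+):(?P<line>\d+) (?P<symbol>[^\s+]+)\+"
)
PANIC_MARKER = "Kernel panic - not syncing: panic_on_warn set"


def find_panic_on_warn(log: str):
    """Return the WARN() location that preceded a panic_on_warn panic, or None."""
    last_warning = None
    for text in log.splitlines():
        match = WARNING_RE.search(text)
        if match:
            # Remember the most recent WARNING in case a panic follows it.
            last_warning = match
        elif PANIC_MARKER in text and last_warning is not None:
            return {
                "cpu": int(last_warning["cpu"]),
                "pid": int(last_warning["pid"]),
                "file": last_warning["file"],
                "line": int(last_warning["line"]),
                "symbol": last_warning["symbol"],
            }
    return None
```

A WARNING that is not followed by the panic_on_warn marker yields None, which corresponds to the default behaviour where the kernel only warns.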
Successfully tested by me.
hpa said: There is another very valid use for this: many operators would
rather a machine shut down than be potentially compromised, either
functionally or security-wise.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation/sysctl')
| -rw-r--r-- | Documentation/sysctl/kernel.txt | 40 |
1 files changed, 26 insertions, 14 deletions
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 57baff5bdb80..b5d0c8501a18 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -54,8 +54,9 @@ show up in /proc/sys/kernel:
 - overflowuid
 - panic
 - panic_on_oops
-- panic_on_unrecovered_nmi
 - panic_on_stackoverflow
+- panic_on_unrecovered_nmi
+- panic_on_warn
 - pid_max
 - powersave-nap [ PPC only ]
 - printk
@@ -527,19 +528,6 @@ the recommended setting is 60.
 
 ==============================================================
 
-panic_on_unrecovered_nmi:
-
-The default Linux behaviour on an NMI of either memory or unknown is
-to continue operation. For many environments such as scientific
-computing it is preferable that the box is taken out and the error
-dealt with than an uncorrected parity/ECC error get propagated.
-
-A small number of systems do generate NMI's for bizarre random reasons
-such as power management so the default is off. That sysctl works like
-the existing panic controls already in that directory.
-
-==============================================================
-
 panic_on_oops:
 
 Controls the kernel's behaviour when an oops or BUG is encountered.
@@ -563,6 +551,30 @@ This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.
 
 ==============================================================
 
+panic_on_unrecovered_nmi:
+
+The default Linux behaviour on an NMI of either memory or unknown is
+to continue operation. For many environments such as scientific
+computing it is preferable that the box is taken out and the error
+dealt with than an uncorrected parity/ECC error get propagated.
+
+A small number of systems do generate NMI's for bizarre random reasons
+such as power management so the default is off. That sysctl works like
+the existing panic controls already in that directory.
+
+==============================================================
+
+panic_on_warn:
+
+Calls panic() in the WARN() path when set to 1. This is useful to avoid
+a kernel rebuild when attempting to kdump at the location of a WARN().
+
+0: only WARN(), default behaviour.
+
+1: call panic() after printing out WARN() location.
+
+==============================================================
+
 perf_cpu_time_max_percent:
 
 Hints to the kernel how much CPU time it should be allowed to
