author | KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> | 2010-03-10 18:22:32 -0500 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2010-03-12 18:52:37 -0500 |
commit | daaf1e68874c078a15ae6ae827751839c4d81739 (patch) | |
tree | 22ed2e28b1c4f0b714df680ffff6407e519c5c60 | |
parent | 1080d7a30304d03b1d9fd530aacd8aece2d702a2 (diff) |
memcg: handle panic_on_oom=always case
Presently, if panic_on_oom=2, the whole system panics even if the oom
happened in some special situation (such as cpuset, mempolicy, ...). That is,
panic_on_oom=2 means panic_on_oom_always.
Currently, memcg doesn't check the panic_on_oom flag. This patch adds that check.
BTW, how is it useful?
kdump+panic_on_oom=2 is the last-resort tool for investigating what happened
in an oom-ed system. When a task is killed, the system recovers and there will
be few hints as to what happened. In a mission-critical system, oom should
never happen. So panic_on_oom=2+kdump is useful for avoiding the next OOM by
capturing precise information via a snapshot.
TODO:
- Since memcg is for isolating the system's memory usage, an oom-notifier and
freeze_at_oom (or rest_at_oom) should be implemented. Then a management
daemon can do similar jobs (as kdump) or take a snapshot per cgroup.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r-- | Documentation/cgroups/memory.txt | 5 |
-rw-r--r-- | Documentation/sysctl/vm.txt | 5 |
-rw-r--r-- | mm/oom_kill.c | 2 |
3 files changed, 10 insertions, 2 deletions
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 268ab08222dd..f8bc802d70b9 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -182,6 +182,8 @@ list.
 NOTE: Reclaim does not work for the root cgroup, since we cannot set any
 limits on the root cgroup.
 
+Note2: When panic_on_oom is set to "2", the whole system will panic.
+
 2. Locking
 
 The memory controller uses the following hierarchy
@@ -379,7 +381,8 @@ The feature can be disabled by
 NOTE1: Enabling/disabling will fail if the cgroup already has other
 cgroups created below it.
 
-NOTE2: This feature can be enabled/disabled per subtree.
+NOTE2: When panic_on_oom is set to "2", the whole system will panic in
+case of an oom event in any cgroup.
 
 7. Soft limits
 
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index fc5790d36cd9..6c7d18c53f84 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -573,11 +573,14 @@ Because other nodes' memory may be free. This means system total status
 may be not fatal yet.
 
 If this is set to 2, the kernel panics compulsorily even on the
-above-mentioned.
+above-mentioned. Even oom happens under memory cgroup, the whole
+system panics.
 
 The default value is 0.
 1 and 2 are for failover of clustering. Please select either
 according to your policy of failover.
+panic_on_oom=2+kdump gives you very strong tool to investigate
+why oom happens. You can get snapshot.
 
 =============================================================
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 35755a4156d6..71d10bf52dc8 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -473,6 +473,8 @@ void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask)
 	unsigned long points = 0;
 	struct task_struct *p;
 
+	if (sysctl_panic_on_oom == 2)
+		panic("out of memory(memcg). panic_on_oom is selected.\n");
 	read_lock(&tasklist_lock);
 retry:
 	p = select_bad_process(&points, mem);
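
The effect of this patch on the panic decision can be illustrated with a small userspace model. This is only a sketch, not kernel code: the enum `oom_scope` and the function `oom_would_panic()` are hypothetical names introduced here for illustration, and the behavior for panic_on_oom=1 is taken from the description in Documentation/sysctl/vm.txt (no panic when the OOM is constrained by cpuset/mempolicy). With this patch applied, a memcg OOM panics only when the value is 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these names do not exist in the kernel. */
enum oom_scope {
	OOM_SCOPE_GLOBAL,      /* system-wide OOM with no constraint */
	OOM_SCOPE_CONSTRAINED, /* OOM limited by cpuset or mempolicy */
	OOM_SCOPE_MEMCG,       /* OOM inside a memory cgroup */
};

/*
 * Returns true if the modeled kernel would panic rather than kill a task.
 * panic_on_oom mirrors the vm.panic_on_oom sysctl value: 0, 1 or 2.
 */
bool oom_would_panic(int panic_on_oom, enum oom_scope scope)
{
	/* The model only covers the three documented values. */
	assert(panic_on_oom >= 0 && panic_on_oom <= 2);

	/* 2 means "always": panic no matter where the OOM happened. */
	if (panic_on_oom == 2)
		return true;

	/* 1 panics only for an unconstrained, system-wide OOM. */
	if (panic_on_oom == 1)
		return scope == OOM_SCOPE_GLOBAL;

	/* 0 (the default): never panic, the OOM killer picks a task. */
	return false;
}
```

In this model, `oom_would_panic(2, OOM_SCOPE_MEMCG)` is true, which corresponds to the two lines added to mem_cgroup_out_of_memory(); before the patch, the memcg path never consulted the flag at all.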