author | KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> | 2009-09-21 20:03:13 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2009-09-22 10:17:39 -0400 |
commit | 28b83c5193e7ab951e402252278f2cc79dc4d298 (patch) | |
tree | 10080e8d3957c2a03f8419ab44c9ecb0ffcdaee0 /include/linux | |
parent | f168e1b6390e2d79cf57e48e6ae6d9b0a9e2851a (diff) |
oom: move oom_adj value from task_struct to signal_struct
Currently, the OOM logic call flow is as follows:
    __out_of_memory()
        select_bad_process()            for each task
            badness()                   calculate badness of one task
        oom_kill_process()              search child
            oom_kill_task()             kill target task and mm shared tasks with it
For example, suppose process-A has two threads, thread-A and thread-B, uses
a very large amount of memory, and its threads have the following oom_adj
and oom_score values:
thread-A: oom_adj = OOM_DISABLE, oom_score = 0
thread-B: oom_adj = 0, oom_score = very-high
Then select_bad_process() selects thread-B, but oom_kill_task() refuses to
kill the task because thread-A has OOM_DISABLE. Thus __out_of_memory()
calls select_bad_process() again, but select_bad_process() selects the same
task. This means the kernel falls into a livelock.
The fact is that select_bad_process() must select a killable task;
otherwise the OOM logic goes into a livelock.
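The retry loop described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct toy_task` and the `toy_*` functions are hypothetical stand-ins, not the kernel's code, and the loop is bounded by `max_tries` so that it terminates. OOM_DISABLE is given the value -17 used by the kernel headers of that era.

```c
#include <assert.h>
#include <stddef.h>

#define OOM_DISABLE (-17)

/* Hypothetical stand-in for a thread; not the kernel's task_struct. */
struct toy_task {
	int mm_id;		/* threads of one process share an mm */
	int oom_adj;		/* per-thread value, as before this patch */
	unsigned long score;	/* stand-in for the badness() result */
};

/* Pick the highest-scoring thread whose own oom_adj permits a kill. */
static int toy_select_bad_process(const struct toy_task *t, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (t[i].oom_adj == OOM_DISABLE)
			continue;
		if (best < 0 || t[i].score > t[best].score)
			best = (int)i;
	}
	return best;
}

/* Return 1 (refuse) if any thread sharing the victim's mm is OOM_DISABLE. */
static int toy_oom_kill_task(const struct toy_task *t, size_t n, int victim)
{
	assert(victim >= 0 && (size_t)victim < n);

	for (size_t i = 0; i < n; i++)
		if (t[i].mm_id == t[victim].mm_id &&
		    t[i].oom_adj == OOM_DISABLE)
			return 1;
	return 0;
}

/* Bounded retry loop; returns the number of refused kill attempts. */
static int toy_out_of_memory(const struct toy_task *t, size_t n, int max_tries)
{
	int failed = 0;

	while (failed < max_tries) {
		int victim = toy_select_bad_process(t, n);

		if (victim < 0 || toy_oom_kill_task(t, n, victim) == 0)
			break;
		failed++;	/* state is unchanged, so the same victim is picked again */
	}
	return failed;
}
```

With two threads sharing one mm, one of them OOM_DISABLE and the other with a high score, every attempt in this model is refused and the loop only stops because of the `max_tries` bound.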
The root cause is that oom_adj shouldn't be a per-thread value. It should
be a per-process value because the OOM killer kills a process, not a
thread. Thus this patch moves oomkilladj (now more appropriately named
oom_adj) from struct task_struct to struct signal_struct. This naturally
prevents select_bad_process() from choosing the wrong task.
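A minimal sketch of why a per-process value avoids the problem, again as a user-space model rather than kernel code. `struct toy_signal`, `struct toy_task` and `toy_select_bad_process()` are hypothetical names; the only point carried over from the patch is that all threads of a process read oom_adj through one shared structure.

```c
#include <assert.h>
#include <stddef.h>

#define OOM_DISABLE (-17)

/* Hypothetical per-process state, shared by every thread of a process. */
struct toy_signal {
	int oom_adj;
};

/* Hypothetical per-thread state; oom_adj is no longer stored here. */
struct toy_task {
	struct toy_signal *signal;
	unsigned long score;	/* stand-in for the badness() result */
};

/* Pick the highest-scoring thread whose process is not OOM_DISABLE. */
static int toy_select_bad_process(const struct toy_task *t, size_t n)
{
	int best = -1;

	assert(t != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* All threads of a process see the same value here. */
		if (t[i].signal->oom_adj == OOM_DISABLE)
			continue;
		if (best < 0 || t[i].score > t[best].score)
			best = (int)i;
	}
	return best;
}
```

Because the value lives in the shared structure, a process is either skipped as a whole or eligible as a whole, so the selection step in this model can no longer return a thread that the kill step would refuse on account of a sibling thread.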
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'include/linux')
-rw-r--r-- | include/linux/sched.h | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 899d7304d594..17e9a8e9a51d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -639,6 +639,8 @@ struct signal_struct {
 	unsigned audit_tty;
 	struct tty_audit_buf *tty_audit_buf;
 #endif
+
+	int oom_adj;	/* OOM kill score adjustment (bit shift) */
 };
 
 /* Context switch must be unlocked if interrupts are to be enabled */
@@ -1221,7 +1223,6 @@ struct task_struct {
 	 * a short time
 	 */
 	unsigned char fpu_counter;
-	s8 oomkilladj;	/* OOM kill score adjustment (bit shift). */
 #ifdef CONFIG_BLK_DEV_IO_TRACE
 	unsigned int btrace_seq;
 #endif