author | KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> | 2009-08-18 17:11:10 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2009-08-18 19:31:13 -0400 |
commit | 0753ba01e126020bf0f8150934903b48935b697d (patch) | |
tree | fbfd7e2d0abbe724a8c5e0e17fb9af522ed2e097 /Documentation/filesystems | |
parent | 89a4eb4b66e8f4d395e14a14d262dac4d6ca52f0 (diff) |
mm: revert "oom: move oom_adj value"
Commit 2ff05b2b (oom: move oom_adj value) moved the oom_adj value to
the mm_struct. It was a very good first step toward sanitizing the OOM killer.
However, Paul Menage reported that the commit causes a regression in his job
scheduler: the current OOM logic can kill an OOM_DISABLE process.
Why? His program contains code similar to the following.
...
set_oom_adj(OOM_DISABLE); /* The job scheduler is never killed by the OOM killer */
...
if (vfork() == 0) {
        set_oom_adj(0); /* The invoked child can be killed */
        execve("foo-bar-cmd");
}
...
A vfork() parent and child share the same mm_struct, so the
set_oom_adj(0) above changes oom_adj not only for the vfork() child but
also for the vfork() parent. The vfork() parent (the job scheduler)
then loses its OOM immunity and gets killed.
The fork-setting-exec idiom is used very frequently in userland programs.
We must not break this assumption.
Therefore, this patch reverts commit 2ff05b2b and the related commits.
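The fork-setting-exec idiom described above can be sketched in C roughly as follows. This is a minimal illustration, not code from Paul Menage's scheduler: write_oom_adj() and spawn_killable() are hypothetical helpers, the file path is passed in as a parameter so the sketch can be exercised against an ordinary file, and fork() is used in place of vfork() because POSIX restricts what a vfork() child may do before exec. The value range check follows the documented range of -16 to +15 plus the special value -17.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define OOM_DISABLE (-17)   /* special value: never OOM-kill this task */
#define OOM_ADJ_MIN (-16)
#define OOM_ADJ_MAX 15

/* Hypothetical helper: write an oom_adj value to the given proc-style file.
 * Returns 0 on success, -1 on an out-of-range value or an I/O error. */
static int write_oom_adj(const char *path, int value)
{
    assert(path != NULL);
    if (value != OOM_DISABLE && (value < OOM_ADJ_MIN || value > OOM_ADJ_MAX))
        return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fprintf(f, "%d\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: start argv[0] as a child that resets its own
 * oom_adj to 0 before exec. Returns the child's pid, or -1 if fork() fails. */
static pid_t spawn_killable(const char *oom_adj_path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child only: the parent's setting is expected to stay untouched. */
        write_oom_adj(oom_adj_path, 0);
        execvp(argv[0], argv);
        _exit(127);             /* reached only if exec failed */
    }
    return pid;
}
```

A job scheduler following this pattern would presumably call write_oom_adj("/proc/self/oom_adj", OOM_DISABLE) once at startup and then spawn_killable("/proc/self/oom_adj", argv) for each job. With oom_adj stored in the shared mm_struct, the child's write in a vfork() variant of this code is exactly the step that also reset the parent's value.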
Reverted commit list
---------------------
- commit 2ff05b2b4e (oom: move oom_adj value from task_struct to mm_struct)
- commit 4d8b9135c3 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE)
- commit 8123681022 (oom: only oom kill exiting tasks with attached memory)
- commit 933b787b57 (mm: copy over oom_adj value at fork time)
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- | Documentation/filesystems/proc.txt | 15 |
1 files changed, 5 insertions, 10 deletions
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index fad18f9456e4..ffead13f9443 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1167,13 +1167,11 @@ CHAPTER 3: PER-PROCESS PARAMETERS
 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
 ------------------------------------------------------
 
-This file can be used to adjust the score used to select which processes should
-be killed in an out-of-memory situation. The oom_adj value is a characteristic
-of the task's mm, so all threads that share an mm with pid will have the same
-oom_adj value. A high value will increase the likelihood of this process being
-killed by the oom-killer. Valid values are in the range -16 to +15 as
-explained below and a special value of -17, which disables oom-killing
-altogether for threads sharing pid's mm.
+This file can be used to adjust the score used to select which processes
+should be killed in an out-of-memory situation. Giving it a high score will
+increase the likelihood of this process being killed by the oom-killer. Valid
+values are in the range -16 to +15, plus the special value -17, which disables
+oom-killing altogether for this process.
 
 The process to be killed in an out-of-memory situation is selected among all others
 based on its badness score. This value equals the original memory size of the process
@@ -1187,9 +1185,6 @@ the parent's score if they do not share the same memory. Thus forking servers
 are the prime candidates to be killed. Having only one 'hungry' child will make
 parent less preferable than the child.
 
-/proc/<pid>/oom_adj cannot be changed for kthreads since they are immune from
-oom-killing already.
-
 /proc/<pid>/oom_score shows process' current badness score.
 
 The following heuristics are then applied:
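
The documentation hunk above notes that /proc/<pid>/oom_score shows a process's current badness score. A minimal C sketch of reading such a value is given here for illustration; read_proc_long() is a hypothetical helper, and it assumes the file holds a single decimal integer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one integer from a proc-style file such as
 * /proc/self/oom_score. Stores the value in *out and returns 0 on success;
 * returns -1 if the file cannot be opened or does not start with a number. */
static int read_proc_long(const char *path, long *out)
{
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int ok = fscanf(f, "%ld", out) == 1;
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling read_proc_long("/proc/self/oom_score", &score) before and after changing oom_adj is one way to observe how the adjustment affects the badness score on a given kernel.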