diff options
author | David Rientjes <rientjes@google.com> | 2009-06-16 18:32:56 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2009-06-16 22:47:43 -0400 |
commit | 2ff05b2b4eac2e63d345fc731ea151a060247f53 (patch) | |
tree | 1840bc2d3b381eca5d39869499339b0fcc6eabbf /Documentation | |
parent | c9e444103b5e7a5a3519f9913f59767f92e33baf (diff) |
oom: move oom_adj value from task_struct to mm_struct
The per-task oom_adj value is a characteristic of its mm more than the
task itself since it's not possible to oom kill any thread that shares the
mm. If a task were to be killed while attached to an mm that could not be
freed because another thread were set to OOM_DISABLE, it would have
needlessly been terminated since there is no potential for future memory
freeing.
This patch moves oomkilladj (now more appropriately named oom_adj) from
struct task_struct to struct mm_struct. This requires task_lock() on a
task to check its oom_adj value to protect against exec, but it's already
necessary to take the lock when dereferencing the mm to find the total VM
size for the badness heuristic.
This fixes a livelock if the oom killer chooses a task and another thread
sharing the same memory has an oom_adj value of OOM_DISABLE. This occurs
because oom_kill_task() repeatedly returns 1 and refuses to kill the
chosen task while select_bad_process() will repeatedly choose the same
task during the next retry.
Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
oom_kill_task() to check for threads sharing the same memory will be
removed in the next patch in this series where it will no longer be
necessary.
Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
these threads are immune from oom killing already. They simply report an
oom_adj value of OOM_DISABLE.
Cc: Nick Piggin <npiggin@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/filesystems/proc.txt | 15 |
1 files changed, 10 insertions, 5 deletions
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index cd8717a36271..ebff3c10a07f 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt | |||
@@ -1003,11 +1003,13 @@ CHAPTER 3: PER-PROCESS PARAMETERS | |||
1003 | 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score | 1003 | 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score |
1004 | ------------------------------------------------------ | 1004 | ------------------------------------------------------ |
1005 | 1005 | ||
1006 | This file can be used to adjust the score used to select which processes | 1006 | This file can be used to adjust the score used to select which processes should |
1007 | should be killed in an out-of-memory situation. Giving it a high score will | 1007 | be killed in an out-of-memory situation. The oom_adj value is a characteristic |
1008 | increase the likelihood of this process being killed by the oom-killer. Valid | 1008 | of the task's mm, so all threads that share an mm with pid will have the same |
1009 | values are in the range -16 to +15, plus the special value -17, which disables | 1009 | oom_adj value. A high value will increase the likelihood of this process being |
1010 | oom-killing altogether for this process. | 1010 | killed by the oom-killer. Valid values are in the range -16 to +15 as |
1011 | explained below and a special value of -17, which disables oom-killing | ||
1012 | altogether for threads sharing pid's mm. | ||
1011 | 1013 | ||
1012 | The process to be killed in an out-of-memory situation is selected among all others | 1014 | The process to be killed in an out-of-memory situation is selected among all others |
1013 | based on its badness score. This value equals the original memory size of the process | 1015 | based on its badness score. This value equals the original memory size of the process |
@@ -1021,6 +1023,9 @@ the parent's score if they do not share the same memory. Thus forking servers | |||
1021 | are the prime candidates to be killed. Having only one 'hungry' child will make | 1023 | are the prime candidates to be killed. Having only one 'hungry' child will make |
1022 | parent less preferable than the child. | 1024 | parent less preferable than the child. |
1023 | 1025 | ||
1026 | /proc/<pid>/oom_adj cannot be changed for kthreads since they are immune from | ||
1027 | oom-killing already. | ||
1028 | |||
1024 | /proc/<pid>/oom_score shows process' current badness score. | 1029 | /proc/<pid>/oom_score shows process' current badness score. |
1025 | 1030 | ||
1026 | The following heuristics are then applied: | 1031 | The following heuristics are then applied: |