diff options
Diffstat (limited to 'Documentation/filesystems/proc.txt')
-rw-r--r-- | Documentation/filesystems/proc.txt | 94 |
1 files changed, 57 insertions, 37 deletions
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index 8fe8895894d8..cf1295c2bb66 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt | |||
@@ -33,7 +33,8 @@ Table of Contents | |||
33 | 2 Modifying System Parameters | 33 | 2 Modifying System Parameters |
34 | 34 | ||
35 | 3 Per-Process Parameters | 35 | 3 Per-Process Parameters |
36 | 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score | 36 | 3.1 /proc/<pid>/oom_adj & /proc/<pid>/oom_score_adj - Adjust the oom-killer |
37 | score | ||
37 | 3.2 /proc/<pid>/oom_score - Display current oom-killer score | 38 | 3.2 /proc/<pid>/oom_score - Display current oom-killer score |
38 | 3.3 /proc/<pid>/io - Display the IO accounting fields | 39 | 3.3 /proc/<pid>/io - Display the IO accounting fields |
39 | 3.4 /proc/<pid>/coredump_filter - Core dump filtering settings | 40 | 3.4 /proc/<pid>/coredump_filter - Core dump filtering settings |
@@ -1234,42 +1235,61 @@ of the kernel. | |||
1234 | CHAPTER 3: PER-PROCESS PARAMETERS | 1235 | CHAPTER 3: PER-PROCESS PARAMETERS |
1235 | ------------------------------------------------------------------------------ | 1236 | ------------------------------------------------------------------------------ |
1236 | 1237 | ||
1237 | 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score | 1238 | 3.1 /proc/<pid>/oom_adj & /proc/<pid>/oom_score_adj- Adjust the oom-killer score |
1238 | ------------------------------------------------------ | 1239 | -------------------------------------------------------------------------------- |
1239 | 1240 | ||
1240 | This file can be used to adjust the score used to select which processes | 1241 | These file can be used to adjust the badness heuristic used to select which |
1241 | should be killed in an out-of-memory situation. Giving it a high score will | 1242 | process gets killed in out of memory conditions. |
1242 | increase the likelihood of this process being killed by the oom-killer. Valid | 1243 | |
1243 | values are in the range -16 to +15, plus the special value -17, which disables | 1244 | The badness heuristic assigns a value to each candidate task ranging from 0 |
1244 | oom-killing altogether for this process. | 1245 | (never kill) to 1000 (always kill) to determine which process is targeted. The |
1245 | 1246 | units are roughly a proportion along that range of allowed memory the process | |
1246 | The process to be killed in an out-of-memory situation is selected among all others | 1247 | may allocate from based on an estimation of its current memory and swap use. |
1247 | based on its badness score. This value equals the original memory size of the process | 1248 | For example, if a task is using all allowed memory, its badness score will be |
1248 | and is then updated according to its CPU time (utime + stime) and the | 1249 | 1000. If it is using half of its allowed memory, its score will be 500. |
1249 | run time (uptime - start time). The longer it runs the smaller is the score. | 1250 | |
1250 | Badness score is divided by the square root of the CPU time and then by | 1251 | There is an additional factor included in the badness score: root |
1251 | the double square root of the run time. | 1252 | processes are given 3% extra memory over other tasks. |
1252 | 1253 | ||
1253 | Swapped out tasks are killed first. Half of each child's memory size is added to | 1254 | The amount of "allowed" memory depends on the context in which the oom killer |
1254 | the parent's score if they do not share the same memory. Thus forking servers | 1255 | was called. If it is due to the memory assigned to the allocating task's cpuset |
1255 | are the prime candidates to be killed. Having only one 'hungry' child will make | 1256 | being exhausted, the allowed memory represents the set of mems assigned to that |
1256 | parent less preferable than the child. | 1257 | cpuset. If it is due to a mempolicy's node(s) being exhausted, the allowed |
1257 | 1258 | memory represents the set of mempolicy nodes. If it is due to a memory | |
1258 | /proc/<pid>/oom_score shows process' current badness score. | 1259 | limit (or swap limit) being reached, the allowed memory is that configured |
1259 | 1260 | limit. Finally, if it is due to the entire system being out of memory, the | |
1260 | The following heuristics are then applied: | 1261 | allowed memory represents all allocatable resources. |
1261 | * if the task was reniced, its score doubles | 1262 | |
1262 | * superuser or direct hardware access tasks (CAP_SYS_ADMIN, CAP_SYS_RESOURCE | 1263 | The value of /proc/<pid>/oom_score_adj is added to the badness score before it |
1263 | or CAP_SYS_RAWIO) have their score divided by 4 | 1264 | is used to determine which task to kill. Acceptable values range from -1000 |
1264 | * if oom condition happened in one cpuset and checked process does not belong | 1265 | (OOM_SCORE_ADJ_MIN) to +1000 (OOM_SCORE_ADJ_MAX). This allows userspace to |
1265 | to it, its score is divided by 8 | 1266 | polarize the preference for oom killing either by always preferring a certain |
1266 | * the resulting score is multiplied by two to the power of oom_adj, i.e. | 1267 | task or completely disabling it. The lowest possible value, -1000, is |
1267 | points <<= oom_adj when it is positive and | 1268 | equivalent to disabling oom killing entirely for that task since it will always |
1268 | points >>= -(oom_adj) otherwise | 1269 | report a badness score of 0. |
1269 | 1270 | ||
1270 | The task with the highest badness score is then selected and its children | 1271 | Consequently, it is very simple for userspace to define the amount of memory to |
1271 | are killed, process itself will be killed in an OOM situation when it does | 1272 | consider for each task. Setting a /proc/<pid>/oom_score_adj value of +500, for |
1272 | not have children or some of them disabled oom like described above. | 1273 | example, is roughly equivalent to allowing the remainder of tasks sharing the |
1274 | same system, cpuset, mempolicy, or memory controller resources to use at least | ||
1275 | 50% more memory. A value of -500, on the other hand, would be roughly | ||
1276 | equivalent to discounting 50% of the task's allowed memory from being considered | ||
1277 | as scoring against the task. | ||
1278 | |||
1279 | For backwards compatibility with previous kernels, /proc/<pid>/oom_adj may also | ||
1280 | be used to tune the badness score. Its acceptable values range from -16 | ||
1281 | (OOM_ADJUST_MIN) to +15 (OOM_ADJUST_MAX) and a special value of -17 | ||
1282 | (OOM_DISABLE) to disable oom killing entirely for that task. Its value is | ||
1283 | scaled linearly with /proc/<pid>/oom_score_adj. | ||
1284 | |||
1285 | Writing to /proc/<pid>/oom_score_adj or /proc/<pid>/oom_adj will change the | ||
1286 | other with its scaled value. | ||
1287 | |||
1288 | Caveat: when a parent task is selected, the oom killer will sacrifice any first | ||
1289 | generation children with seperate address spaces instead, if possible. This | ||
1290 | avoids servers and important system daemons from being killed and loses the | ||
1291 | minimal amount of work. | ||
1292 | |||
1273 | 1293 | ||
1274 | 3.2 /proc/<pid>/oom_score - Display current oom-killer score | 1294 | 3.2 /proc/<pid>/oom_score - Display current oom-killer score |
1275 | ------------------------------------------------------------- | 1295 | ------------------------------------------------------------- |