Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/cgroups/memcg_test.txt | 24
-rw-r--r-- Documentation/filesystems/proc.txt | 28
2 files changed, 50 insertions, 2 deletions
diff --git a/Documentation/cgroups/memcg_test.txt b/Documentation/cgroups/memcg_test.txt
index 19533f93b7a2..523a9c16c400 100644
--- a/Documentation/cgroups/memcg_test.txt
+++ b/Documentation/cgroups/memcg_test.txt
@@ -1,6 +1,6 @@ | |||
1 | Memory Resource Controller(Memcg) Implementation Memo. | 1 | Memory Resource Controller(Memcg) Implementation Memo. |
2 | Last Updated: 2008/12/15 | 2 | Last Updated: 2009/1/19 |
3 | Base Kernel Version: based on 2.6.28-rc8-mm. | 3 | Base Kernel Version: based on 2.6.29-rc2. |
4 | 4 | ||
5 | Because VM is getting complex (one of reasons is memcg...), memcg's behavior | 5 | Because VM is getting complex (one of reasons is memcg...), memcg's behavior |
6 | is complex. This is a document for memcg's internal behavior. | 6 | is complex. This is a document for memcg's internal behavior. |
@@ -340,3 +340,23 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y. | |||
340 | # mount -t cgroup none /cgroup -o cpuset,memory,cpu,devices | 340 | # mount -t cgroup none /cgroup -o cpuset,memory,cpu,devices |
341 | 341 | ||
342 | and do task move, mkdir, rmdir etc...under this. | 342 | and do task move, mkdir, rmdir etc...under this. |
343 | |||
344 | 9.7 swapoff. | ||
345 | Besides swap management being one of the complicated parts of memcg, | ||
346 | the swap-in call path at swapoff is not the same as the usual swap-in path. | ||
347 | It is worth testing explicitly. | ||
348 | |||
349 | For example, a test like the following is useful. | ||
350 | (Shell-A) | ||
351 | # mount -t cgroup none /cgroup -o memory | ||
352 | # mkdir /cgroup/test | ||
353 | # echo 40M > /cgroup/test/memory.limit_in_bytes | ||
354 | # echo 0 > /cgroup/test/tasks | ||
355 | Run a malloc(100M) program under this. You should see about 60M of swap in use. | ||
356 | (Shell-B) | ||
357 | # move all tasks in /cgroup/test to /cgroup | ||
358 | # /sbin/swapoff -a | ||
359 | # rmdir /cgroup/test | ||
360 | # kill malloc task. | ||
361 | |||
362 | Of course, tmpfs vs. swapoff should be tested, too. | ||
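
Note on the "malloc(100M) program" in the 9.7 test: the patch does not include one. The following is a minimal sketch of what such a program could do, assuming that touching one byte per page is enough to force the allocation to be backed by memory. The names touch_pages() and hog_memory() are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Write one byte into every page of the buffer so the kernel has to
 * back the whole allocation. Returns the number of pages touched.
 */
static size_t touch_pages(volatile unsigned char *buf, size_t bytes,
                          size_t page_size)
{
    size_t touched = 0;

    assert(page_size > 0);
    for (size_t off = 0; off < bytes; off += page_size) {
        buf[off] = 1;
        touched++;
    }
    return touched;
}

/*
 * Hypothetical helper: allocate `bytes`, touch every page, keep the
 * memory for `hold_seconds`, then free it. Returns the number of pages
 * touched, or 0 if the allocation failed.
 */
static size_t hog_memory(size_t bytes, unsigned int hold_seconds)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *buf;
    size_t touched;

    if (page <= 0)
        page = 4096; /* assumption: fall back to a common page size */

    buf = malloc(bytes);
    if (buf == NULL)
        return 0;

    touched = touch_pages(buf, bytes, (size_t)page);
    if (hold_seconds > 0)
        sleep(hold_seconds); /* leave time to run the Shell-B steps */
    free(buf);
    return touched;
}
```

Calling hog_memory(100UL << 20, 600) from main() in a shell whose pid was written to /cgroup/test/tasks would approximate the Shell-A step; the 600 second hold time is an arbitrary choice.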
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index bbebc3a43ac0..a87be42f8211 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -2027,6 +2027,34 @@ increase the likelihood of this process being killed by the oom-killer. Valid | |||
2027 | values are in the range -16 to +15, plus the special value -17, which disables | 2027 | values are in the range -16 to +15, plus the special value -17, which disables |
2028 | oom-killing altogether for this process. | 2028 | oom-killing altogether for this process. |
2029 | 2029 | ||
2030 | The process to be killed in an out-of-memory situation is selected among all others | ||
2031 | based on its badness score. This value starts as the original memory size of the process | ||
2032 | and is then adjusted according to its CPU time (utime + stime) and its | ||
2033 | run time (uptime - start time). The longer it runs, the smaller the score. | ||
2034 | The badness score is divided by the square root of the CPU time and then by | ||
2035 | the fourth root (the square root of the square root) of the run time. | ||
2036 | |||
2037 | Swapped out tasks are killed first. Half of each child's memory size is added to | ||
2038 | the parent's score if they do not share the same memory. Thus forking servers | ||
2039 | are the prime candidates to be killed. Having only one 'hungry' child will make | ||
2040 | the parent less preferable than the child. | ||
2041 | |||
2042 | /proc/<pid>/oom_score shows the process's current badness score. | ||
2043 | |||
2044 | The following heuristics are then applied: | ||
2045 | * if the task was reniced, its score doubles | ||
2046 | * superuser or direct hardware access tasks (CAP_SYS_ADMIN, CAP_SYS_RESOURCE | ||
2047 | or CAP_SYS_RAWIO) have their score divided by 4 | ||
2048 | * if the OOM condition happened in one cpuset and the checked task does not belong | ||
2049 | to it, its score is divided by 8 | ||
2050 | * the resulting score is multiplied by two to the power of oom_adj, i.e. | ||
2051 | points <<= oom_adj when it is positive and | ||
2052 | points >>= -(oom_adj) otherwise | ||
2053 | |||
2054 | The task with the highest badness score is then selected and its children | ||
2055 | are killed. The process itself will be killed in an OOM situation when it does | ||
2056 | not have children or when some of them have disabled oom-killing as described above. | ||
2057 | |||
2030 | 2.13 /proc/<pid>/oom_score - Display current oom-killer score | 2058 | 2.13 /proc/<pid>/oom_score - Display current oom-killer score |
2031 | ------------------------------------------------------------- | 2059 | ------------------------------------------------------------- |
2032 | 2060 | ||
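
Note on the badness score text added to proc.txt: the steps it describes can be sketched in C as below. This is an illustration of the documented arithmetic only, not the kernel's implementation. The struct, its field names, and the function names are hypothetical; the units of cpu_time and run_time are left abstract because the text does not state them; skipping a division when the root is zero and returning 0 for oom_adj == -17 are assumptions. The addition of half of each child's memory size is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs; none of these names come from the kernel. */
struct task_sketch {
    unsigned long mem_size;   /* original memory size of the process */
    unsigned long cpu_time;   /* utime + stime, abstract units */
    unsigned long run_time;   /* uptime - start time, abstract units */
    bool reniced;             /* task was reniced */
    bool privileged;          /* CAP_SYS_ADMIN, CAP_SYS_RESOURCE or CAP_SYS_RAWIO */
    bool outside_oom_cpuset;  /* task is not in the cpuset that hit OOM */
    int oom_adj;              /* -16..+15, or -17 to disable oom-killing */
};

/* Integer square root: the largest r such that r * r <= x. */
static unsigned long isqrt(unsigned long x)
{
    unsigned long r = 0;

    while (r + 1 <= x / (r + 1))
        r++;
    return r;
}

static unsigned long badness_sketch(const struct task_sketch *t)
{
    unsigned long points = t->mem_size;
    unsigned long s;

    /* Assumption: model "oom-killing disabled" as a score of 0. */
    if (t->oom_adj == -17)
        return 0;

    /* Divide by the square root of the CPU time. */
    s = isqrt(t->cpu_time);
    if (s)                    /* assumption: skip a zero divisor */
        points /= s;

    /* Divide by the square root of the square root of the run time. */
    s = isqrt(isqrt(t->run_time));
    if (s)
        points /= s;

    if (t->reniced)
        points *= 2;
    if (t->privileged)
        points /= 4;
    if (t->outside_oom_cpuset)
        points /= 8;

    /* Multiply by two to the power of oom_adj. */
    if (t->oom_adj > 0)
        points <<= t->oom_adj;
    else
        points >>= -(t->oom_adj);

    return points;
}
```

With mem_size = 1600, cpu_time = 16 and run_time = 256, the score is 1600 / 4 / 4 = 100; setting oom_adj to 3 raises it to 800 and setting oom_adj to -2 lowers it to 25.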