author: David Rientjes <rientjes@google.com> 2012-12-11 19:01:30 -0500
committer: Linus Torvalds <torvalds@linux-foundation.org> 2012-12-11 20:22:24 -0500
commit: 9ff4868e3051d9128a24dd330bed32011a11421d
tree: dff6fa6413939b1d5ce8704ee8391e543a8a8b4f /mm
parent: 348b465530ad222ce80e516524dd01009a4f9205

mm, oom: allow exiting threads to have access to memory reserves

Exiting threads, those with PF_EXITING set, can pagefault and require
memory before they can make forward progress. This happens, for instance,
when a process must fault task->robust_list, a userspace structure, before
detaching its memory.

These threads also aren't guaranteed to get access to memory reserves
unless oom killed or killed from userspace. The oom killer won't grant
memory reserves if other threads are also exiting other than current and
stalling at the same point. This prevents needlessly killing processes
when others are already exiting.

Instead of special casing all the possible situations between PF_EXITING
getting set and a thread detaching its mm where it may allocate memory,
which probably wouldn't get updated when a change is made to the exit
path, the solution is to give all exiting threads access to memory
reserves if they call the oom killer. This allows them to quickly
allocate, detach its mm, and free the memory it represents.

Summary of Luigi's bug report:

: He had an oom condition where threads were faulting on task->robust_list
: and repeatedly called the oom killer but it would defer killing a thread
: because it saw other PF_EXITING threads. This can happen anytime we need
: to allocate memory after setting PF_EXITING and before detaching our mm;
: if there are other threads in the same state then the oom killer won't do
: anything unless one of them happens to be killed from userspace.
:
: So instead of only deferring for PF_EXITING and !task->robust_list, it's
: better to just give them access to memory reserves to prevent a potential
: livelock so that any other faults that may be introduced in the future in
: the exit path don't cause the same problem (and hopefully we don't allow
: too many of those!).

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Tested-by: Luigi Semenzato <semenzato@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Diffstat (limited to 'mm')
-rw-r--r-- mm/oom_kill.c | 31
1 file changed, 9 insertions, 22 deletions

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 79e0f3e24831..7e9e9113bd05 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -310,26 +310,13 @@ enum oom_scan_t oom_scan_process_thread(struct task_struct *task,
 	if (!task->mm)
 		return OOM_SCAN_CONTINUE;
 
-	if (task->flags & PF_EXITING) {
+	if (task->flags & PF_EXITING && !force_kill) {
 		/*
-		 * If task is current and is in the process of releasing memory,
-		 * allow the "kill" to set TIF_MEMDIE, which will allow it to
-		 * access memory reserves. Otherwise, it may stall forever.
-		 *
-		 * The iteration isn't broken here, however, in case other
-		 * threads are found to have already been oom killed.
+		 * If this task is not being ptraced on exit, then wait for it
+		 * to finish before killing some other task unnecessarily.
 		 */
-		if (task == current)
-			return OOM_SCAN_SELECT;
-		else if (!force_kill) {
-			/*
-			 * If this task is not being ptraced on exit, then wait
-			 * for it to finish before killing some other task
-			 * unnecessarily.
-			 */
-			if (!(task->group_leader->ptrace & PT_TRACE_EXIT))
-				return OOM_SCAN_ABORT;
-		}
+		if (!(task->group_leader->ptrace & PT_TRACE_EXIT))
+			return OOM_SCAN_ABORT;
 	}
 	return OOM_SCAN_OK;
 }
@@ -706,11 +693,11 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
 		return;
 
 	/*
-	 * If current has a pending SIGKILL, then automatically select it. The
-	 * goal is to allow it to allocate so that it may quickly exit and free
-	 * its memory.
+	 * If current has a pending SIGKILL or is exiting, then automatically
+	 * select it. The goal is to allow it to allocate so that it may
+	 * quickly exit and free its memory.
 	 */
-	if (fatal_signal_pending(current)) {
+	if (fatal_signal_pending(current) || current->flags & PF_EXITING) {
 		set_thread_flag(TIF_MEMDIE);
 		return;
 	}
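
The behaviour change described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct model_task`, `MODEL_PF_EXITING`, `scan_before()`, `oom_before()` and `oom_after()` are hypothetical names invented for illustration, `force_kill` is assumed false, no task is ptraced, and victim selection by badness score is left out entirely. Each `oom_*` function returns whether the calling task ("current") ends up with access to memory reserves.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the real definitions. */
#define MODEL_PF_EXITING 0x1u

struct model_task {
	unsigned int flags;   /* MODEL_PF_EXITING when the task is exiting */
	bool has_mm;          /* models task->mm != NULL */
	bool sigkill_pending; /* models fatal_signal_pending(task) */
	bool memdie;          /* models TIF_MEMDIE: may use memory reserves */
};

enum model_scan {
	MODEL_SCAN_OK,
	MODEL_SCAN_CONTINUE,
	MODEL_SCAN_ABORT,
	MODEL_SCAN_SELECT,
};

/* Pre-patch per-task scan, assuming force_kill == false and no ptrace. */
enum model_scan scan_before(const struct model_task *task,
			    const struct model_task *cur)
{
	if (!task->has_mm)
		return MODEL_SCAN_CONTINUE;
	if (task->flags & MODEL_PF_EXITING) {
		if (task == cur)
			return MODEL_SCAN_SELECT;
		return MODEL_SCAN_ABORT; /* another exiting task: defer */
	}
	return MODEL_SCAN_OK;
}

/* Pre-patch model: does cur get access to memory reserves? */
bool oom_before(struct model_task *tasks, size_t n, struct model_task *cur)
{
	bool selected_cur = false;
	size_t i;

	if (cur->sigkill_pending) {
		cur->memdie = true;
		return true;
	}
	for (i = 0; i < n; i++) {
		switch (scan_before(&tasks[i], cur)) {
		case MODEL_SCAN_ABORT:
			return false; /* scan aborted, nobody gets reserves */
		case MODEL_SCAN_SELECT:
			selected_cur = true; /* iteration is not broken here */
			break;
		default:
			break;
		}
	}
	if (selected_cur)
		cur->memdie = true;
	return selected_cur; /* badness-based victim selection is omitted */
}

/* Post-patch model: an exiting caller is granted reserves up front. */
bool oom_after(struct model_task *tasks, size_t n, struct model_task *cur)
{
	(void)tasks; /* the task scan is never reached for an exiting caller */
	(void)n;

	if (cur->sigkill_pending || (cur->flags & MODEL_PF_EXITING)) {
		cur->memdie = true;
		return true;
	}
	return false; /* normal victim selection would follow; omitted */
}
```

With two exiting tasks in the list, the pre-patch model returns false for either caller, because each one sees the other and the scan aborts. That is the deferral loop from the bug report. The post-patch model grants reserves to whichever exiting task calls in, regardless of what the other tasks are doing.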