summaryrefslogtreecommitdiffstats
path: root/mm
diff options
context:
space:
mode:
authorMichal Hocko <mhocko@suse.com>2016-05-20 19:57:06 -0400
committerLinus Torvalds <torvalds@linux-foundation.org>2016-05-20 20:58:30 -0400
commit33c2d21438daea807947923377995c73ee8ed3fc (patch)
tree7f6857b3a6e5443f41b50926ec0a52a0988dd9c3 /mm
parentede37713737834d98ec72ed299a305d53e909f73 (diff)
mm, oom: protect !costly allocations some more
should_reclaim_retry will give up retries for higher order allocations if none of the eligible zones has any requested or higher order pages available even if we pass the watermak check for order-0. This is done because there is no guarantee that the reclaimable and currently free pages will form the required order. This can, however, lead to situations where the high-order request (e.g. order-2 required for the stack allocation during fork) will trigger OOM too early - e.g. after the first reclaim/compaction round. Such a system would have to be highly fragmented and there is no guarantee further reclaim/compaction attempts would help but at least make sure that the compaction was active before we go OOM and keep retrying even if should_reclaim_retry tells us to oom if - the last compaction round backed off or - we haven't completed at least MAX_COMPACT_RETRIES active compaction rounds. The first rule ensures that the very last attempt for compaction was not ignored while the second guarantees that the compaction has done some work. Multiple retries might be needed to prevent occasional pigggy backing of other contexts to steal the compacted pages before the current context manages to retry to allocate them. compaction_failed() is taken as a final word from the compaction that the retry doesn't make much sense. We have to be careful though because the first compaction round is MIGRATE_ASYNC which is rather weak as it ignores pages under writeback and gives up too easily in other situations. We therefore have to make sure that MIGRATE_SYNC_LIGHT mode has been used before we give up. With this logic in place we do not have to increase the migration mode unconditionally and rather do it only if the compaction failed for the weaker mode. A nice side effect is that the stronger migration mode is used only when really needed so this has a potential of smaller latencies in some cases. 
Please note that the compaction doesn't tell us much about how successful it was when returning compaction_made_progress so we just have to blindly trust that another retry is worthwhile and cap the number to something reasonable to guarantee a convergence. If the given number of successful retries is not sufficient for reasonable workloads we should focus on the collected compaction tracepoints data and try to address the issue in the compaction code. If this is not feasible we can increase the retries limit. [mhocko@suse.com: fix warning] Link: http://lkml.kernel.org/r/20160512061636.GA4200@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm')
-rw-r--r--mm/page_alloc.c88
1 files changed, 78 insertions, 10 deletions
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f51c302126a1..38ad6dd7cba0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3180,6 +3180,13 @@ out:
3180 return page; 3180 return page;
3181} 3181}
3182 3182
3183
3184/*
3185 * Maximum number of compaction retries wit a progress before OOM
3186 * killer is consider as the only way to move forward.
3187 */
3188#define MAX_COMPACT_RETRIES 16
3189
3183#ifdef CONFIG_COMPACTION 3190#ifdef CONFIG_COMPACTION
3184/* Try memory compaction for high-order allocations before reclaim */ 3191/* Try memory compaction for high-order allocations before reclaim */
3185static struct page * 3192static struct page *
@@ -3247,14 +3254,60 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
3247 3254
3248 return NULL; 3255 return NULL;
3249} 3256}
3257
3258static inline bool
3259should_compact_retry(unsigned int order, enum compact_result compact_result,
3260 enum migrate_mode *migrate_mode,
3261 int compaction_retries)
3262{
3263 if (!order)
3264 return false;
3265
3266 /*
3267 * compaction considers all the zone as desperately out of memory
3268 * so it doesn't really make much sense to retry except when the
3269 * failure could be caused by weak migration mode.
3270 */
3271 if (compaction_failed(compact_result)) {
3272 if (*migrate_mode == MIGRATE_ASYNC) {
3273 *migrate_mode = MIGRATE_SYNC_LIGHT;
3274 return true;
3275 }
3276 return false;
3277 }
3278
3279 /*
3280 * !costly allocations are really important and we have to make sure
3281 * the compaction wasn't deferred or didn't bail out early due to locks
3282 * contention before we go OOM. Still cap the reclaim retry loops with
3283 * progress to prevent from looping forever and potential trashing.
3284 */
3285 if (order <= PAGE_ALLOC_COSTLY_ORDER) {
3286 if (compaction_withdrawn(compact_result))
3287 return true;
3288 if (compaction_retries <= MAX_COMPACT_RETRIES)
3289 return true;
3290 }
3291
3292 return false;
3293}
3250#else 3294#else
3251static inline struct page * 3295static inline struct page *
3252__alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order, 3296__alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
3253 unsigned int alloc_flags, const struct alloc_context *ac, 3297 unsigned int alloc_flags, const struct alloc_context *ac,
3254 enum migrate_mode mode, enum compact_result *compact_result) 3298 enum migrate_mode mode, enum compact_result *compact_result)
3255{ 3299{
3300 *compact_result = COMPACT_SKIPPED;
3256 return NULL; 3301 return NULL;
3257} 3302}
3303
3304static inline bool
3305should_compact_retry(unsigned int order, enum compact_result compact_result,
3306 enum migrate_mode *migrate_mode,
3307 int compaction_retries)
3308{
3309 return false;
3310}
3258#endif /* CONFIG_COMPACTION */ 3311#endif /* CONFIG_COMPACTION */
3259 3312
3260/* Perform direct synchronous page reclaim */ 3313/* Perform direct synchronous page reclaim */
@@ -3501,6 +3554,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
3501 unsigned long did_some_progress; 3554 unsigned long did_some_progress;
3502 enum migrate_mode migration_mode = MIGRATE_ASYNC; 3555 enum migrate_mode migration_mode = MIGRATE_ASYNC;
3503 enum compact_result compact_result; 3556 enum compact_result compact_result;
3557 int compaction_retries = 0;
3504 int no_progress_loops = 0; 3558 int no_progress_loops = 0;
3505 3559
3506 /* 3560 /*
@@ -3612,13 +3666,8 @@ retry:
3612 goto nopage; 3666 goto nopage;
3613 } 3667 }
3614 3668
3615 /* 3669 if (order && compaction_made_progress(compact_result))
3616 * It can become very expensive to allocate transparent hugepages at 3670 compaction_retries++;
3617 * fault, so use asynchronous memory compaction for THP unless it is
3618 * khugepaged trying to collapse.
3619 */
3620 if (!is_thp_gfp_mask(gfp_mask) || (current->flags & PF_KTHREAD))
3621 migration_mode = MIGRATE_SYNC_LIGHT;
3622 3671
3623 /* Try direct reclaim and then allocating */ 3672 /* Try direct reclaim and then allocating */
3624 page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac, 3673 page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
@@ -3649,6 +3698,17 @@ retry:
3649 no_progress_loops)) 3698 no_progress_loops))
3650 goto retry; 3699 goto retry;
3651 3700
3701 /*
3702 * It doesn't make any sense to retry for the compaction if the order-0
3703 * reclaim is not able to make any progress because the current
3704 * implementation of the compaction depends on the sufficient amount
3705 * of free memory (see __compaction_suitable)
3706 */
3707 if (did_some_progress > 0 &&
3708 should_compact_retry(order, compact_result,
3709 &migration_mode, compaction_retries))
3710 goto retry;
3711
3652 /* Reclaim has failed us, start killing things */ 3712 /* Reclaim has failed us, start killing things */
3653 page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress); 3713 page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
3654 if (page) 3714 if (page)
@@ -3662,10 +3722,18 @@ retry:
3662 3722
3663noretry: 3723noretry:
3664 /* 3724 /*
3665 * High-order allocations do not necessarily loop after 3725 * High-order allocations do not necessarily loop after direct reclaim
3666 * direct reclaim and reclaim/compaction depends on compaction 3726 * and reclaim/compaction depends on compaction being called after
3667 * being called after reclaim so call directly if necessary 3727 * reclaim so call directly if necessary.
3728 * It can become very expensive to allocate transparent hugepages at
3729 * fault, so use asynchronous memory compaction for THP unless it is
3730 * khugepaged trying to collapse. All other requests should tolerate
3731 * at least light sync migration.
3668 */ 3732 */
3733 if (is_thp_gfp_mask(gfp_mask) && !(current->flags & PF_KTHREAD))
3734 migration_mode = MIGRATE_ASYNC;
3735 else
3736 migration_mode = MIGRATE_SYNC_LIGHT;
3669 page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, 3737 page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags,
3670 ac, migration_mode, 3738 ac, migration_mode,
3671 &compact_result); 3739 &compact_result);