author	Vlastimil Babka <vbabka@suse.cz>	2016-07-28 18:49:19 -0400
committer	Linus Torvalds <torvalds@linux-foundation.org>	2016-07-28 19:07:41 -0400
commit	a8161d1ed6098506303c65b3701dedba876df42a (patch)
tree	b567a7d7aad3d85a0daf050385e19e63a24a7cdd /mm/page_alloc.c
parent	23771235bb569c4999ff077d2c38eaee5763193a (diff)
mm, page_alloc: restructure direct compaction handling in slowpath
The retry loop in __alloc_pages_slowpath is supposed to keep trying reclaim and compaction (and OOM), until either the allocation succeeds or the function returns with failure. Success here is more probable when reclaim precedes compaction, as certain watermarks have to be met for compaction to even try, and more free pages increase the probability of compaction success. On the other hand, starting with light async compaction (if the watermarks allow it) can be more efficient, especially for smaller orders, if there's enough free memory which is just fragmented.

Thus, the current code starts with compaction before reclaim, and to make sure that the last reclaim is always followed by a final compaction, there's another direct compaction call at the end of the loop. This makes the code hard to follow and adds some duplicated handling of migration_mode decisions. It's also somewhat inefficient that even if reclaim or compaction decides not to retry, the final compaction is still attempted. Some gfp flag combinations also shortcut these retry decisions via "goto noretry;", making the code even harder to follow.

This patch attempts to restructure the code with only minimal functional changes. The call to the first compaction and the THP-specific checks are now placed above the retry loop, and the "noretry" direct compaction is removed.

The initial compaction is additionally restricted to costly orders, as we can expect smaller orders to be held back by watermarks, and only larger orders to suffer primarily from fragmentation. This better matches the checks in reclaim's shrink_zones().

There are two other smaller functional changes. One is that the upgrade from async migration to light sync migration will always occur after the initial compaction. This is how it was until the recent patch "mm, oom: protect !costly allocations some more", which introduced upgrading the mode based on the COMPACT_COMPLETE result, but kept the final compaction always upgraded, which made it even more special. It's better to return to the simpler handling for now, as migration modes will be further modified later in the series.

The second change is that once both reclaim and compaction declare it's not worth retrying the reclaim/compact loop, there is no final compaction attempt. As argued above, this is intentional: if that final compaction were to succeed, it would be due to a wrong retry decision, or simply a race with somebody else freeing memory for us.

The main outcome of this patch should be simpler code. Logically, the initial compaction without reclaim is the exceptional case in the reclaim/compaction scheme, but prior to the patch, it was the last loop iteration that was exceptional. Now the code matches the logic better. The change also enables the following patches.

Link: http://lkml.kernel.org/r/20160721073614.24395-5-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
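To make the resulting shape of __alloc_pages_slowpath easier to follow, here is a minimal, compilable userspace sketch of the new control flow. It is only an illustration under simplifying assumptions: the helpers try_compact, try_reclaim and should_retry, the trimmed enums, and the thp_fault flag are hypothetical stand-ins, not the real mm/ API, and the OOM, watermark and kswapd handling is omitted.

/*
 * Sketch of the restructured slowpath control flow (not kernel code;
 * all helpers below are simplified stand-ins for the real allocator).
 */
#include <stdbool.h>
#include <stdio.h>

enum migrate_mode { MIGRATE_ASYNC, MIGRATE_SYNC_LIGHT };
enum compact_result { COMPACT_DEFERRED, COMPACT_CONTENDED, COMPACT_PROGRESS };

#define PAGE_ALLOC_COSTLY_ORDER 3

static bool try_compact(unsigned int order, enum migrate_mode mode,
			enum compact_result *result)
{
	(void)order; (void)mode;
	*result = COMPACT_PROGRESS;	/* pretend compaction ran but freed nothing */
	return false;
}

static bool try_reclaim(unsigned int order)
{
	(void)order;
	return false;			/* pretend reclaim freed nothing */
}

static bool should_retry(int loops)
{
	return loops < 16;		/* stand-in for the real retry heuristics */
}

static bool slowpath(unsigned int order, bool can_direct_reclaim, bool thp_fault)
{
	enum migrate_mode mode = MIGRATE_SYNC_LIGHT;
	enum compact_result result;
	int loops = 0;

	/*
	 * The exceptional case comes first: one async compaction attempt
	 * for costly orders, before any reclaim.  The THP shortcuts to
	 * "nopage" now live here, outside the retry loop.
	 */
	if (can_direct_reclaim && order > PAGE_ALLOC_COSTLY_ORDER) {
		if (try_compact(order, MIGRATE_ASYNC, &result))
			return true;
		if (thp_fault) {
			if (result == COMPACT_DEFERRED ||
			    result == COMPACT_CONTENDED)
				return false;	/* the "goto nopage" shortcuts */
			mode = MIGRATE_ASYNC;	/* THP faults stay async */
		}
	}

	/* The regular scheme: reclaim first, then compaction, then retry. */
	while (should_retry(loops++)) {
		if (try_reclaim(order))
			return true;
		if (try_compact(order, mode, &result))
			return true;
	}

	/* No trailing "one last compaction": the retry decision is final. */
	return false;
}

int main(void)
{
	printf("order-9 allocation succeeded: %s\n",
	       slowpath(9, true, true) ? "yes" : "no");
	return 0;
}

Compared to the pre-patch flow, the THP checks and the migration-mode decision appear exactly once, and a failed retry decision falls straight through to nopage with no trailing compaction attempt.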
Diffstat (limited to 'mm/page_alloc.c')
-rw-r--r--	mm/page_alloc.c	109
1 file changed, 57 insertions(+), 52 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a42fa09ee91f..ae721a713bda 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3479,7 +3479,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	struct page *page = NULL;
 	unsigned int alloc_flags;
 	unsigned long did_some_progress;
-	enum migrate_mode migration_mode = MIGRATE_ASYNC;
+	enum migrate_mode migration_mode = MIGRATE_SYNC_LIGHT;
 	enum compact_result compact_result;
 	int compaction_retries = 0;
 	int no_progress_loops = 0;
@@ -3521,6 +3521,52 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
+	/*
+	 * For costly allocations, try direct compaction first, as it's likely
+	 * that we have enough base pages and don't need to reclaim. Don't try
+	 * that for allocations that are allowed to ignore watermarks, as the
+	 * ALLOC_NO_WATERMARKS attempt didn't yet happen.
+	 */
+	if (can_direct_reclaim && order > PAGE_ALLOC_COSTLY_ORDER &&
+		!gfp_pfmemalloc_allowed(gfp_mask)) {
+		page = __alloc_pages_direct_compact(gfp_mask, order,
+						alloc_flags, ac,
+						MIGRATE_ASYNC,
+						&compact_result);
+		if (page)
+			goto got_pg;
+
+		/* Checks for THP-specific high-order allocations */
+		if (is_thp_gfp_mask(gfp_mask)) {
+			/*
+			 * If compaction is deferred for high-order allocations,
+			 * it is because sync compaction recently failed. If
+			 * this is the case and the caller requested a THP
+			 * allocation, we do not want to heavily disrupt the
+			 * system, so we fail the allocation instead of entering
+			 * direct reclaim.
+			 */
+			if (compact_result == COMPACT_DEFERRED)
+				goto nopage;
+
+			/*
+			 * Compaction is contended so rather back off than cause
+			 * excessive stalls.
+			 */
+			if (compact_result == COMPACT_CONTENDED)
+				goto nopage;
+
+			/*
+			 * It can become very expensive to allocate transparent
+			 * hugepages at fault, so use asynchronous memory
+			 * compaction for THP unless it is khugepaged trying to
+			 * collapse. All other requests should tolerate at
+			 * least light sync migration.
+			 */
+			if (!(current->flags & PF_KTHREAD))
+				migration_mode = MIGRATE_ASYNC;
+		}
+	}
 
retry:
 	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
@@ -3575,55 +3621,33 @@ retry:
 	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
 		goto nopage;
 
-	/*
-	 * Try direct compaction. The first pass is asynchronous. Subsequent
-	 * attempts after direct reclaim are synchronous
-	 */
+
+	/* Try direct reclaim and then allocating */
+	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
+							&did_some_progress);
+	if (page)
+		goto got_pg;
+
+	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
 					migration_mode,
 					&compact_result);
 	if (page)
 		goto got_pg;
 
-	/* Checks for THP-specific high-order allocations */
-	if (is_thp_gfp_mask(gfp_mask)) {
-		/*
-		 * If compaction is deferred for high-order allocations, it is
-		 * because sync compaction recently failed. If this is the case
-		 * and the caller requested a THP allocation, we do not want
-		 * to heavily disrupt the system, so we fail the allocation
-		 * instead of entering direct reclaim.
-		 */
-		if (compact_result == COMPACT_DEFERRED)
-			goto nopage;
-
-		/*
-		 * Compaction is contended so rather back off than cause
-		 * excessive stalls.
-		 */
-		if(compact_result == COMPACT_CONTENDED)
-			goto nopage;
-	}
-
 	if (order && compaction_made_progress(compact_result))
 		compaction_retries++;
 
-	/* Try direct reclaim and then allocating */
-	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
-							&did_some_progress);
-	if (page)
-		goto got_pg;
-
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
-		goto noretry;
+		goto nopage;
 
 	/*
 	 * Do not retry costly high order allocations unless they are
 	 * __GFP_REPEAT
 	 */
 	if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
-		goto noretry;
+		goto nopage;
 
 	/*
 	 * Costly allocations might have made a progress but this doesn't mean
@@ -3662,25 +3686,6 @@ retry:
 		goto retry;
 	}
 
-noretry:
-	/*
-	 * High-order allocations do not necessarily loop after direct reclaim
-	 * and reclaim/compaction depends on compaction being called after
-	 * reclaim so call directly if necessary.
-	 * It can become very expensive to allocate transparent hugepages at
-	 * fault, so use asynchronous memory compaction for THP unless it is
-	 * khugepaged trying to collapse. All other requests should tolerate
-	 * at least light sync migration.
-	 */
-	if (is_thp_gfp_mask(gfp_mask) && !(current->flags & PF_KTHREAD))
-		migration_mode = MIGRATE_ASYNC;
-	else
-		migration_mode = MIGRATE_SYNC_LIGHT;
-	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags,
-						ac, migration_mode,
-						&compact_result);
-	if (page)
-		goto got_pg;
 nopage:
 	warn_alloc_failed(gfp_mask, order, NULL);
 got_pg: