author     Joonsoo Kim <iamjoonsoo.kim@lge.com>            2018-04-10 19:30:15 -0400
committer  Linus Torvalds <torvalds@linux-foundation.org>  2018-04-11 13:28:32 -0400
commit     bad8c6c0b1144694ecb0bc5629ede9b8b578b86e (patch)
tree       4f35d3265bcb009ce44b6cd9fe20c45be1f22bc6 /mm
parent     d3cda2337bbc9edd2a26b83cb00eaa8c048ff274 (diff)
mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE
Patch series "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE", v2.

0. History

This patchset is the follow-up of the discussion about "Introduce ZONE_CMA (v7)" [1]. Please refer to it if more information is needed.

1. What does this patch do?

This patch changes how the memory of the CMA area is managed in the MM subsystem. Currently, the memory of the CMA area is managed by the zone its pfn belongs to. This approach has problems, because the MM subsystem doesn't have enough logic to handle a single zone that contains memory with different characteristics. To solve this issue, this patch manages all the memory of the CMA area through the MOVABLE zone. From the MM subsystem's point of view, the characteristics of the memory in the MOVABLE zone and of the memory in the CMA area are the same, so managing the CMA area's memory through the MOVABLE zone causes no problem.

2. Motivation

There are several problems with the current approach, listed below. Although these problems are not inherent and could be fixed without this conceptual change, doing so would require adding many hooks in various code paths, which would be intrusive to core MM and error-prone. Therefore, I try to solve them with this new approach. The problems with the current implementation are:

o CMA memory utilization

This is the freepage calculation logic in MM:

- For movable allocation: freepage = total freepage
- For unmovable allocation: freepage = total freepage - CMA freepage

Freepages in the CMA area are used only after the normal freepages of the zone the CMA area belongs to are exhausted. At that moment the number of normal freepages is zero, so:

- For movable allocation: freepage = total freepage = CMA freepage
- For unmovable allocation: freepage = 0

If an unmovable allocation arrives at this moment, the request fails the watermark check and reclaim is started. After reclaim, normal freepages exist again, so the freepages in the CMA area remain unused. FYI, there is another attempt [2] to solve this problem on lkml, and, as far as I know, Qualcomm also has an out-of-tree solution for it.

o Useless reclaim

There is no logic to distinguish CMA pages in the reclaim path, so CMA pages are reclaimed even when the system only needs pages usable for kernel allocations.

o Atomic allocation failure

This is also related to the fallback allocation policy for the memory of the CMA area. Consider the situation where the number of normal freepages is *zero* because a bunch of movable allocation requests have come in. Kswapd is not woken up, because of the following freepage calculation:

- For movable allocation: freepage = total freepage = CMA freepage

If an atomic unmovable allocation request arrives at this moment, it fails because of the following logic:

- For unmovable allocation: freepage = total freepage - CMA freepage = 0

This was reported by Aneesh [3].

o Useless compaction

The usual high-order allocation request is an unmovable one and cannot be served from the memory of the CMA area. Nevertheless, during compaction the migration scanner migrates pages into the CMA area and builds high-order pages there. As mentioned above, they cannot be used for unmovable allocation requests, so the work is wasted.

3. Current approach and new approach

The current approach is that the memory of the CMA area is managed by the zone its pfn belongs to.
However, this memory must be distinguishable since it comes with a strong limitation, so it is marked as MIGRATE_CMA in the pageblock flags and handled specially. As mentioned in section 2, the MM subsystem doesn't have enough logic to deal with this special pageblock, which raises many problems.

The new approach is that the memory of the CMA area is managed by the MOVABLE zone. MM already has enough logic to deal with special zones such as HIGHMEM and MOVABLE, so managing the CMA area's memory through the MOVABLE zone naturally works well: the constraint on the memory of the CMA area, that it must always be migratable, is the same as the constraint on the MOVABLE zone.

There is one side effect on the usability of the memory of the CMA area. Use of the MOVABLE zone is only allowed for requests with GFP_HIGHMEM && GFP_MOVABLE, so the memory of the CMA area is now only usable by allocations with this gfp flag combination. Before this patchset, any GFP_MOVABLE request could use it. IMO, this is not a big issue, since most GFP_MOVABLE requests also carry GFP_HIGHMEM, for example file cache pages and anonymous pages. File cache pages for a blockdev file are an exception: requests for them carry no GFP_HIGHMEM flag. There are pros and cons to this exception. In my experience, blockdev file cache pages are one of the top reasons cma_alloc() fails temporarily, so we get a stronger guarantee of cma_alloc() success by giving up this case.

Note that there is no change from the admin's point of view, since this patchset only changes the internal implementation of the MM subsystem. The one minor difference for admins is that the memory statistics for the CMA area are now printed under the MOVABLE zone. That's all.

4. Result

The following experimental result relates to the utilization problem.

8 CPUs, 1024 MB, VIRTUAL MACHINE, make -j16

<Before>
CMA area:        0 MB     512 MB
Elapsed-time:    92.4     186.5
pswpin:          82       18647
pswpout:         160      69839

<After>
CMA area:        0 MB     512 MB
Elapsed-time:    93.1     93.4
pswpin:          84       46
pswpout:         183      92

akpm: "kernel test robot" reported a 26% improvement in vm-scalability.throughput:
http://lkml.kernel.org/r/20180330012721.GA3845@yexl-desktop

[1]: lkml.kernel.org/r/1491880640-9944-1-git-send-email-iamjoonsoo.kim@lge.com
[2]: https://lkml.org/lkml/2014/10/15/623
[3]: http://www.spinics.net/lists/linux-mm/msg100562.html

Link: http://lkml.kernel.org/r/1512114786-5085-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
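[Editor's note] The freepage accounting described in section 2 can be illustrated with a small, self-contained userspace sketch. This is not kernel code: struct zone_model and watermark_ok() are invented names that only model the two formulas quoted above (movable allocations may count CMA freepages toward the watermark, unmovable allocations may not).

#include <stdbool.h>
#include <stdio.h>

/* Toy model of a zone's freepage counters, per section 2 of the message. */
struct zone_model {
        long total_free;        /* all freepages in the zone */
        long cma_free;          /* freepages that live in the CMA area */
};

/*
 * Watermark check following the quoted formulas: unmovable allocations
 * must not count CMA freepages.
 */
static bool watermark_ok(const struct zone_model *z, long mark, bool movable)
{
        long freepage = z->total_free;

        if (!movable)
                freepage -= z->cma_free;

        return freepage > mark;
}

int main(void)
{
        /* Normal freepages exhausted: everything left is CMA memory. */
        struct zone_model z = { .total_free = 4096, .cma_free = 4096 };

        printf("movable passes watermark:   %d\n", watermark_ok(&z, 128, true));
        printf("unmovable passes watermark: %d\n", watermark_ok(&z, 128, false));
        return 0;
}

With normal freepages exhausted, this prints 1 for the movable check and 0 for the unmovable one; the failing unmovable check is what kicks off the needless reclaim described above.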
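[Editor's note] Likewise, the side effect discussed in section 3 follows from the rule that only a request carrying both the highmem and movable hints may be placed in ZONE_MOVABLE. The sketch below is a toy model of that rule using made-up MODEL_GFP_* flags and a pick_zone() helper; it is not the kernel's gfp_zone() implementation.

#include <stdio.h>

/* Hypothetical stand-ins for the two gfp hints relevant here. */
#define MODEL_GFP_HIGHMEM       0x1u
#define MODEL_GFP_MOVABLE       0x2u

enum model_zone { MODEL_ZONE_NORMAL, MODEL_ZONE_HIGHMEM, MODEL_ZONE_MOVABLE };

/*
 * Zone selection as described in section 3: only requests carrying both
 * hints may be served from ZONE_MOVABLE, and therefore (after this
 * patch) from the CMA area.
 */
static enum model_zone pick_zone(unsigned int gfp)
{
        unsigned int both = MODEL_GFP_HIGHMEM | MODEL_GFP_MOVABLE;

        if ((gfp & both) == both)
                return MODEL_ZONE_MOVABLE;
        if (gfp & MODEL_GFP_HIGHMEM)
                return MODEL_ZONE_HIGHMEM;
        return MODEL_ZONE_NORMAL;
}

int main(void)
{
        /* Anonymous / regular file cache pages: highmem + movable. */
        printf("%d\n", pick_zone(MODEL_GFP_HIGHMEM | MODEL_GFP_MOVABLE));
        /* Blockdev page cache: movable but not highmem. */
        printf("%d\n", pick_zone(MODEL_GFP_MOVABLE));
        return 0;
}

The first call selects the movable zone; the second does not, which is why blockdev page cache can no longer land in the CMA area after this change.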
Diffstat (limited to 'mm')
-rw-r--r--   mm/cma.c          83
-rw-r--r--   mm/internal.h      3
-rw-r--r--   mm/page_alloc.c   55
3 files changed, 125 insertions, 16 deletions
diff --git a/mm/cma.c b/mm/cma.c
index 5809bbe360d7..aa40e6c7b042 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -39,6 +39,7 @@
 #include <trace/events/cma.h>
 
 #include "cma.h"
+#include "internal.h"
 
 struct cma cma_areas[MAX_CMA_AREAS];
 unsigned cma_area_count;
@@ -109,23 +110,25 @@ static int __init cma_activate_area(struct cma *cma)
         if (!cma->bitmap)
                 return -ENOMEM;
 
-        WARN_ON_ONCE(!pfn_valid(pfn));
-        zone = page_zone(pfn_to_page(pfn));
-
         do {
                 unsigned j;
 
                 base_pfn = pfn;
+                if (!pfn_valid(base_pfn))
+                        goto err;
+
+                zone = page_zone(pfn_to_page(base_pfn));
                 for (j = pageblock_nr_pages; j; --j, pfn++) {
-                        WARN_ON_ONCE(!pfn_valid(pfn));
+                        if (!pfn_valid(pfn))
+                                goto err;
+
                         /*
-                         * alloc_contig_range requires the pfn range
-                         * specified to be in the same zone. Make this
-                         * simple by forcing the entire CMA resv range
-                         * to be in the same zone.
+                         * In init_cma_reserved_pageblock(), present_pages
+                         * is adjusted with assumption that all pages in
+                         * the pageblock come from a single zone.
                          */
                         if (page_zone(pfn_to_page(pfn)) != zone)
-                                goto not_in_zone;
+                                goto err;
                 }
                 init_cma_reserved_pageblock(pfn_to_page(base_pfn));
         } while (--i);
@@ -139,7 +142,7 @@ static int __init cma_activate_area(struct cma *cma)
 
         return 0;
 
-not_in_zone:
+err:
         pr_err("CMA area %s could not be activated\n", cma->name);
         kfree(cma->bitmap);
         cma->count = 0;
@@ -149,6 +152,41 @@ not_in_zone:
 static int __init cma_init_reserved_areas(void)
 {
         int i;
+        struct zone *zone;
+        pg_data_t *pgdat;
+
+        if (!cma_area_count)
+                return 0;
+
+        for_each_online_pgdat(pgdat) {
+                unsigned long start_pfn = UINT_MAX, end_pfn = 0;
+
+                zone = &pgdat->node_zones[ZONE_MOVABLE];
+
+                /*
+                 * In this case, we cannot adjust the zone range
+                 * since it is now maximum node span and we don't
+                 * know original zone range.
+                 */
+                if (populated_zone(zone))
+                        continue;
+
+                for (i = 0; i < cma_area_count; i++) {
+                        if (pfn_to_nid(cma_areas[i].base_pfn) !=
+                                pgdat->node_id)
+                                continue;
+
+                        start_pfn = min(start_pfn, cma_areas[i].base_pfn);
+                        end_pfn = max(end_pfn, cma_areas[i].base_pfn +
+                                                cma_areas[i].count);
+                }
+
+                if (!end_pfn)
+                        continue;
+
+                zone->zone_start_pfn = start_pfn;
+                zone->spanned_pages = end_pfn - start_pfn;
+        }
 
         for (i = 0; i < cma_area_count; i++) {
                 int ret = cma_activate_area(&cma_areas[i]);
@@ -157,9 +195,32 @@ static int __init cma_init_reserved_areas(void)
                         return ret;
         }
 
+        /*
+         * Reserved pages for ZONE_MOVABLE are now activated and
+         * this would change ZONE_MOVABLE's managed page counter and
+         * the other zones' present counter. We need to re-calculate
+         * various zone information that depends on this initialization.
+         */
+        build_all_zonelists(NULL);
+        for_each_populated_zone(zone) {
+                if (zone_idx(zone) == ZONE_MOVABLE) {
+                        zone_pcp_reset(zone);
+                        setup_zone_pageset(zone);
+                } else
+                        zone_pcp_update(zone);
+
+                set_zone_contiguous(zone);
+        }
+
+        /*
+         * We need to re-init per zone wmark by calling
+         * init_per_zone_wmark_min() but doesn't call here because it is
+         * registered on core_initcall and it will be called later than us.
+         */
+
         return 0;
 }
-core_initcall(cma_init_reserved_areas);
+pure_initcall(cma_init_reserved_areas);
 
 /**
  * cma_init_reserved_mem() - create custom contiguous area from reserved memory
diff --git a/mm/internal.h b/mm/internal.h
index 502d14189794..228dd6642951 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -168,6 +168,9 @@ extern void post_alloc_hook(struct page *page, unsigned int order,
                                         gfp_t gfp_flags);
 extern int user_min_free_kbytes;
 
+extern void set_zone_contiguous(struct zone *zone);
+extern void clear_zone_contiguous(struct zone *zone);
+
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
 
 /*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 34a4c12d2675..facc25ee6e2d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1747,16 +1747,38 @@ void __init page_alloc_init_late(void)
 }
 
 #ifdef CONFIG_CMA
+static void __init adjust_present_page_count(struct page *page, long count)
+{
+        struct zone *zone = page_zone(page);
+
+        /* We don't need to hold a lock since it is boot-up process */
+        zone->present_pages += count;
+}
+
 /* Free whole pageblock and set its migration type to MIGRATE_CMA. */
 void __init init_cma_reserved_pageblock(struct page *page)
 {
         unsigned i = pageblock_nr_pages;
+        unsigned long pfn = page_to_pfn(page);
         struct page *p = page;
+        int nid = page_to_nid(page);
+
+        /*
+         * ZONE_MOVABLE will steal present pages from other zones by
+         * changing page links so page_zone() is changed. Before that,
+         * we need to adjust previous zone's page count first.
+         */
+        adjust_present_page_count(page, -pageblock_nr_pages);
 
         do {
                 __ClearPageReserved(p);
                 set_page_count(p, 0);
-        } while (++p, --i);
+
+                /* Steal pages from other zones */
+                set_page_links(p, ZONE_MOVABLE, nid, pfn);
+        } while (++p, ++pfn, --i);
+
+        adjust_present_page_count(page, pageblock_nr_pages);
 
         set_pageblock_migratetype(page, MIGRATE_CMA);
 
@@ -6208,6 +6230,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
 {
         enum zone_type j;
         int nid = pgdat->node_id;
+        unsigned long node_end_pfn = 0;
 
         pgdat_resize_init(pgdat);
 #ifdef CONFIG_NUMA_BALANCING
@@ -6235,9 +6258,13 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
                 struct zone *zone = pgdat->node_zones + j;
                 unsigned long size, realsize, freesize, memmap_pages;
                 unsigned long zone_start_pfn = zone->zone_start_pfn;
+                unsigned long movable_size = 0;
 
                 size = zone->spanned_pages;
                 realsize = freesize = zone->present_pages;
+                if (zone_end_pfn(zone) > node_end_pfn)
+                        node_end_pfn = zone_end_pfn(zone);
+
 
                 /*
                  * Adjust freesize so that it accounts for how much memory
@@ -6286,12 +6313,30 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
                 zone_seqlock_init(zone);
                 zone_pcp_init(zone);
 
-                if (!size)
+                /*
+                 * The size of the CMA area is unknown now so we need to
+                 * prepare the memory for the usemap at maximum.
+                 */
+                if (IS_ENABLED(CONFIG_CMA) && j == ZONE_MOVABLE &&
+                        pgdat->node_spanned_pages) {
+                        movable_size = node_end_pfn - pgdat->node_start_pfn;
+                }
+
+                if (!size && !movable_size)
                         continue;
 
                 set_pageblock_order();
-                setup_usemap(pgdat, zone, zone_start_pfn, size);
-                init_currently_empty_zone(zone, zone_start_pfn, size);
+                if (movable_size) {
+                        zone->zone_start_pfn = pgdat->node_start_pfn;
+                        zone->spanned_pages = movable_size;
+                        setup_usemap(pgdat, zone,
+                                pgdat->node_start_pfn, movable_size);
+                        init_currently_empty_zone(zone,
+                                pgdat->node_start_pfn, movable_size);
+                } else {
+                        setup_usemap(pgdat, zone, zone_start_pfn, size);
+                        init_currently_empty_zone(zone, zone_start_pfn, size);
+                }
                 memmap_init(size, nid, j, zone_start_pfn);
         }
 }
@@ -7932,7 +7977,7 @@ void free_contig_range(unsigned long pfn, unsigned nr_pages)
 }
 #endif
 
-#ifdef CONFIG_MEMORY_HOTPLUG
+#if defined CONFIG_MEMORY_HOTPLUG || defined CONFIG_CMA
 /*
  * The zone indicated has a new number of managed_pages; batch sizes and percpu
  * page high values need to be recalulated.