author | Nishanth Aravamudan <nacc@linux.vnet.ibm.com> | 2012-03-21 19:34:07 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-03-21 20:54:58 -0400 |
commit | f5bf18fa22f8c41a13eb8762c7373eb3a93a7333 | |
tree | 3da24eb0edae3563c1937088b72a413e7026fdec /mm/sparse.c | |
parent | f0cb3c76ae1ced85f9034480b1b24cd96530ec78 |
bootmem/sparsemem: remove limit constraint in alloc_bootmem_section
While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
Overcommit) on powerpc, we tripped the following:
kernel BUG at mm/bootmem.c:483!
cpu 0x0: Vector: 700 (Program Check) at [c000000000c03940]
pc: c000000000a62bd8: .alloc_bootmem_core+0x90/0x39c
lr: c000000000a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
sp: c000000000c03bc0
msr: 8000000000021032
current = 0xc000000000b0cce0
paca = 0xc000000001d80000
pid = 0, comm = swapper
kernel BUG at mm/bootmem.c:483!
enter ? for help
[c000000000c03c80] c000000000a64bcc .sparse_early_usemaps_alloc_node+0x84/0x29c
[c000000000c03d50] c000000000a64f10 .sparse_init+0x12c/0x28c
[c000000000c03e20] c000000000a474f4 .setup_arch+0x20c/0x294
[c000000000c03ee0] c000000000a4079c .start_kernel+0xb4/0x460
[c000000000c03f90] c000000000009670 .start_here_common+0x1c/0x2c
This is
BUG_ON(limit && goal + size > limit);
and after some debugging, it seems that
goal = 0x7ffff000000
limit = 0x80000000000
and sparse_early_usemaps_alloc_node ->
sparse_early_usemaps_alloc_pgdat_section calls
return alloc_bootmem_section(usemap_size() * count, section_nr);
This is on a system with 8TB available via the AMS pool, and as a quirk
of AMS in firmware, all of that memory shows up in node 0. So, we end
up with an allocation that will fail the goal/limit constraints.
In theory, we could "fall-back" to alloc_bootmem_node() in
sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
defined, we'll BUG_ON() instead. A simple solution appears to be to
unconditionally remove the limit condition in alloc_bootmem_section,
meaning allocations are allowed to cross section boundaries (necessary
for systems of this size).
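What "crossing a section boundary" means can be sketched in userspace as follows. The 16 MiB section size is an assumption inferred from the 0x1000000 gap between the goal and limit quoted above (real kernels derive it from SECTION_SIZE_BITS, which varies by architecture), and the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 16 MiB sections (2^24 bytes); not taken from kernel headers. */
#define SKETCH_SECTION_SHIFT 24

/* Section number that contains a given physical address. */
static uint64_t sketch_section_nr(uint64_t phys_addr)
{
	return phys_addr >> SKETCH_SECTION_SHIFT;
}

/* True when [start, start + size) spans more than one section. */
static bool sketch_crosses_section(uint64_t start, uint64_t size)
{
	assert(size > 0);
	return sketch_section_nr(start) != sketch_section_nr(start + size - 1);
}
```

Under that assumption, a request starting at 0x7ffff000000 stays inside one section up to 0x1000000 bytes and spills into the next section beyond that, which is the case the removed limit used to forbid.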
Johannes Weiner pointed out that if alloc_bootmem_section() no longer
guarantees section-locality, we need check_usemap_section_nr() to print
possible cross-dependencies between node descriptors and the usemaps
allocated through it. That makes the two loops in
sparse_early_usemaps_alloc_node() identical, so re-factor the code a
bit.
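The shape of that re-factoring (try the preferred allocator, fall back once, then run a single assignment loop) can be modelled in userspace roughly as below. This is a sketch under stated assumptions, not the kernel function: the allocator callbacks, the static pool and all `sketch_` names are hypothetical, and `size` counts unsigned longs here rather than bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allocator callback: returns NULL on failure. */
typedef unsigned long *(*sketch_alloc_fn)(size_t nlongs);

/* Stand-in for an allocation that cannot be satisfied. */
static unsigned long *sketch_alloc_fail(size_t nlongs)
{
	(void)nlongs;
	return NULL;
}

/* Stand-in for a successful fallback, backed by a small static pool. */
static unsigned long *sketch_alloc_static(size_t nlongs)
{
	static unsigned long pool[64];
	return nlongs <= 64 ? pool : NULL;
}

/*
 * Try `preferred`, then `fallback`; on success hand out one slice of
 * `size` longs per present section in a single loop. Returns the number
 * of slices assigned, or -1 if both allocators fail.
 */
static int sketch_assign_usemaps(unsigned long **usemap_map,
				 const bool *present, size_t nsections,
				 size_t size, size_t count,
				 sketch_alloc_fn preferred,
				 sketch_alloc_fn fallback)
{
	unsigned long *usemap;
	int assigned = 0;

	assert(size > 0);
	usemap = preferred(size * count);
	if (!usemap) {
		usemap = fallback(size * count);
		if (!usemap)
			return -1;	/* the kernel code warns and returns here */
	}

	for (size_t pnum = 0; pnum < nsections; pnum++) {
		if (!present[pnum])
			continue;
		usemap_map[pnum] = usemap;
		usemap += size;
		assigned++;	/* the kernel runs check_usemap_section_nr() here */
	}
	return assigned;
}
```

Because both allocation paths now end in the same loop, the cross-dependency check applies regardless of which allocator supplied the memory.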
[akpm@linux-foundation.org: code simplification]
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org> [3.3.1]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/sparse.c')
-rw-r--r-- | mm/sparse.c | 30 |
1 files changed, 11 insertions, 19 deletions
diff --git a/mm/sparse.c b/mm/sparse.c
index 61d7cde23111..a8bc7d364deb 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -353,29 +353,21 @@ static void __init sparse_early_usemaps_alloc_node(unsigned long**usemap_map,
 
 	usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nodeid),
 							  usemap_count);
-	if (usemap) {
-		for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
-			if (!present_section_nr(pnum))
-				continue;
-			usemap_map[pnum] = usemap;
-			usemap += size;
+	if (!usemap) {
+		usemap = alloc_bootmem_node(NODE_DATA(nodeid), size * usemap_count);
+		if (!usemap) {
+			printk(KERN_WARNING "%s: allocation failed\n", __func__);
+			return;
 		}
-		return;
 	}
 
-	usemap = alloc_bootmem_node(NODE_DATA(nodeid), size * usemap_count);
-	if (usemap) {
-		for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
-			if (!present_section_nr(pnum))
-				continue;
-			usemap_map[pnum] = usemap;
-			usemap += size;
-			check_usemap_section_nr(nodeid, usemap_map[pnum]);
-		}
-		return;
+	for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
+		if (!present_section_nr(pnum))
+			continue;
+		usemap_map[pnum] = usemap;
+		usemap += size;
+		check_usemap_section_nr(nodeid, usemap_map[pnum]);
 	}
-
-	printk(KERN_WARNING "%s: allocation failed\n", __func__);
 }
 
 #ifndef CONFIG_SPARSEMEM_VMEMMAP