author | Andrew Barry <abarry@cray.com> | 2011-05-24 20:12:52 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2011-05-25 11:39:36 -0400 |
commit | cfa54a0fcfc1017c6f122b6f21aaba36daa07f71 (patch) | |
tree | c6bcc41b79475854254384b7b4912a2101364183 /mm/page_alloc.c | |
parent | a539f3533b78e39a22723d6d3e1e11b6c14454d9 (diff) |
mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath()
I believe I found a problem in __alloc_pages_slowpath, which allows a
process to get stuck endlessly looping, even when lots of memory is
available.
Running an I/O and memory intensive stress-test I see a 0-order page
allocation with __GFP_IO and __GFP_WAIT, running on a system with very
little free memory. Right about the same time that the stress-test gets
killed by the OOM-killer, the utility trying to allocate memory gets stuck
in __alloc_pages_slowpath even though most of the system's memory was freed
by the oom-kill of the stress-test.
The utility ends up looping from the rebalance label down through the
wait_iff_congested continuously. Because order=0,
__alloc_pages_direct_compact skips the call to get_page_from_freelist.
Because all of the reclaimable memory on the system has already been
reclaimed, __alloc_pages_direct_reclaim skips the call to
get_page_from_freelist. Since there is no __GFP_FS flag, the block with
__alloc_pages_may_oom is skipped. The loop hits the wait_iff_congested,
then jumps back to rebalance without ever trying to
get_page_from_freelist. This loop repeats infinitely.
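The loop described above can be reproduced in a small user-space model. This is only a sketch under stated assumptions, not kernel code: the toy_* names, the struct, and the pass at which the OOM kill frees memory are all hypothetical stand-ins chosen for illustration. It models an order-0 allocation without __GFP_FS, so compaction and the OOM block are represented only by comments, and it compares placing the rebalance label after the free list attempt with placing it before.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of system memory; every name in this block is illustrative. */
struct toy_state {
	int free_pages;        /* pages currently on the free list */
	int reclaimable_pages; /* pages direct reclaim could still free */
};

/* Stand-in for get_page_from_freelist(): succeeds if a page is free. */
static bool toy_get_page_from_freelist(struct toy_state *s)
{
	if (s->free_pages > 0) {
		s->free_pages--;
		return true;
	}
	return false;
}

/*
 * Stand-in for direct reclaim: as in the description above, the free
 * list is only consulted when reclaim itself made progress.
 */
static bool toy_direct_reclaim(struct toy_state *s)
{
	if (s->reclaimable_pages == 0)
		return false; /* nothing reclaimed, free list not tried */
	s->reclaimable_pages--;
	s->free_pages++;
	return toy_get_page_from_freelist(s);
}

/*
 * Model of the slow path for an order-0 request without __GFP_FS.
 * Returns the loop pass on which a page was obtained (0 means the
 * attempt before the loop succeeded), or -1 if max_passes ran out.
 */
static int toy_slowpath(bool label_before_freelist, int max_passes)
{
	struct toy_state s = { .free_pages = 0, .reclaimable_pages = 0 };

	assert(max_passes > 0);

	/* Old layout: the free list attempt sits above the label. */
	if (!label_before_freelist && toy_get_page_from_freelist(&s))
		return 0;

	for (int pass = 1; pass <= max_passes; pass++) {
		/* rebalance: (new layout retries the free list here) */
		if (label_before_freelist && toy_get_page_from_freelist(&s))
			return pass;

		/* order == 0, so the compaction path is skipped. */
		if (toy_direct_reclaim(&s))
			return pass;

		/* No __GFP_FS, so the OOM block is skipped too. */

		/* Assumption: another task's OOM kill frees memory now. */
		if (pass == 1)
			s.free_pages += 1000;

		/* wait_iff_congested(); goto rebalance; */
	}
	return -1;
}
```

In this model, toy_slowpath(false, 1000) never obtains a page even though 1000 pages become free after the first pass, while toy_slowpath(true, 1000) succeeds on the second pass because each trip through the label retries the free list.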
The test case is pretty pathological. Running a mix of I/O stress-tests
that do a lot of fork() and consume all of the system memory, I can pretty
reliably hit this on 600 nodes (32GB/node) in about 12 hours.
Signed-off-by: Andrew Barry <abarry@cray.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/page_alloc.c')
-rw-r--r-- | mm/page_alloc.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 10a8c6da385f..2a00f17c3bf4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2106,6 +2106,7 @@ restart:
 	first_zones_zonelist(zonelist, high_zoneidx, NULL,
 					&preferred_zone);
 
+rebalance:
 	/* This is the last chance, in general, before the goto nopage. */
 	page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
 			high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
@@ -2113,7 +2114,6 @@ restart:
 	if (page)
 		goto got_pg;
 
-rebalance:
 	/* Allocate without watermarks if the context allows */
 	if (alloc_flags & ALLOC_NO_WATERMARKS) {
 		page = __alloc_pages_high_priority(gfp_mask, order,