author | Baoquan He <bhe@redhat.com> | 2015-09-09 18:39:00 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2015-09-10 16:29:01 -0400 |
commit | bbb78b8f3f4ea8eca14937b693bfe244838e1d4d (patch) | |
tree | eab87e2db99c2b00a8a9e0d777c1655c7dd3e9ae /kernel | |
parent | 04e9949b2d26ae1f0acd1181876a2a8ece92112d (diff) |
kexec: align crash_notes allocation to make it be inside one physical page
People reported that crash_notes in /proc/vmcore was corrupted, which
caused kdump to fail. Debugging and logs showed the root cause: the percpu
variable crash_notes can be allocated across 2 vmalloc pages. Percpu
allocation is currently based on vmalloc by default, and vmalloc cannot
guarantee that 2 contiguous vmalloc pages are also backed by 2 contiguous
physical pages. The 1st kernel exports the starting address and size of
crash_notes through sysfs, as below:
/sys/devices/system/cpu/cpux/crash_notes
/sys/devices/system/cpu/cpux/crash_notes_size
The kdump kernel uses them to read the content of crash_notes. However, if
crash_notes is allocated across 2 vmalloc pages, the 2nd part may not be in
the next neighbouring physical page as expected. That is why
nhdr_ptr->n_namesz or nhdr_ptr->n_descsz could be very large in
update_note_header_size_elf64(), causing note header merging to fail or
producing warnings.
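As an illustration of the failure condition only, here is a minimal user-space sketch, not kernel code. It assumes a 4096-byte page, and the names `SKETCH_PAGE_SIZE` and `crosses_page` are hypothetical. It checks whether a buffer of a given size at a given virtual address touches more than one page, which is the case where the exported start address plus size may no longer describe one contiguous physical range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: returns true if the byte range
 * [addr, addr + size) touches more than one page.
 */
static bool crosses_page(uintptr_t addr, size_t size)
{
	uintptr_t first_page, last_page;

	assert(size > 0);

	first_page = addr / SKETCH_PAGE_SIZE;
	last_page = (addr + size - 1) / SKETCH_PAGE_SIZE;

	return first_page != last_page;
}
```

For example, a 0x200-byte buffer at address 0x1f00 ends at 0x20ff and so straddles the boundary at 0x2000, while the same buffer at 0x1e00 ends at 0x1fff and stays inside one page.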
This patch calls __alloc_percpu() directly and passes in an align value
obtained by rounding the size of crash_notes, sizeof(note_buf_t), up to
the nearest power of two, capped at PAGE_SIZE. This makes sure crash_notes
is allocated inside one physical page, since sizeof(note_buf_t) on all
architectures is smaller than PAGE_SIZE. The patch also adds a
BUILD_BUG_ON to break the build if the size is bigger than PAGE_SIZE,
since crash_notes would then definitely span 2 pages. That needs to be
avoided, and needs to be reported if it is unavoidable.
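The alignment argument can be checked with a small user-space sketch. This is not the kernel implementation: `SKETCH_PAGE_SIZE` is an assumed 4096-byte page, and `sketch_roundup_pow_of_two`, `sketch_align_for` and `sketch_never_crosses` are hypothetical stand-ins written for illustration. The idea is that a power-of-two alignment no larger than the page size divides the page size evenly, so an object no larger than its alignment always fits in one aligned slot, and every such slot lies inside a single page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* User-space stand-in for roundup_pow_of_two(); n must be > 0 and small. */
static size_t sketch_roundup_pow_of_two(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/* Alignment computed the way the patch does: min(pow2(size), page size). */
static size_t sketch_align_for(size_t size)
{
	size_t align = sketch_roundup_pow_of_two(size);

	return align < SKETCH_PAGE_SIZE ? align : SKETCH_PAGE_SIZE;
}

/*
 * Try every aligned start address in a 4-page window and report
 * whether the object ever touches two pages. Returns 1 if it never does.
 */
static int sketch_never_crosses(size_t size)
{
	size_t align = sketch_align_for(size);
	uintptr_t addr;

	for (addr = 0; addr < 4 * SKETCH_PAGE_SIZE; addr += align) {
		uintptr_t first = addr / SKETCH_PAGE_SIZE;
		uintptr_t last = (addr + size - 1) / SKETCH_PAGE_SIZE;

		if (first != last)
			return 0;
	}
	return 1;
}
```

Under these assumptions a 300-byte object gets a 512-byte alignment and never crosses a page boundary, while an object larger than the page size (5000 bytes, say) necessarily does, which is the case the BUILD_BUG_ON rejects at compile time.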
[akpm@linux-foundation.org: use correct comment layout]
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Lisa Mitchell <lisa.mitchell@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'kernel')
-rw-r--r-- | kernel/kexec_core.c | 23 |
1 files changed, 22 insertions, 1 deletions
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 9ffc96b65d9a..322dd5579f59 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1004,7 +1004,28 @@ void crash_save_cpu(struct pt_regs *regs, int cpu)
 static int __init crash_notes_memory_init(void)
 {
 	/* Allocate memory for saving cpu registers. */
-	crash_notes = alloc_percpu(note_buf_t);
+	size_t size, align;
+
+	/*
+	 * crash_notes could be allocated across 2 vmalloc pages when percpu
+	 * is vmalloc based . vmalloc doesn't guarantee 2 continuous vmalloc
+	 * pages are also on 2 continuous physical pages. In this case the
+	 * 2nd part of crash_notes in 2nd page could be lost since only the
+	 * starting address and size of crash_notes are exported through sysfs.
+	 * Here round up the size of crash_notes to the nearest power of two
+	 * and pass it to __alloc_percpu as align value. This can make sure
+	 * crash_notes is allocated inside one physical page.
+	 */
+	size = sizeof(note_buf_t);
+	align = min(roundup_pow_of_two(sizeof(note_buf_t)), PAGE_SIZE);
+
+	/*
+	 * Break compile if size is bigger than PAGE_SIZE since crash_notes
+	 * definitely will be in 2 pages with that.
+	 */
+	BUILD_BUG_ON(size > PAGE_SIZE);
+
+	crash_notes = __alloc_percpu(size, align);
 	if (!crash_notes) {
 		pr_warn("Kexec: Memory allocation for saving cpu register states failed\n");
 		return -ENOMEM;