author | Huang Ying <ying.huang@intel.com> | 2008-07-25 22:45:10 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2008-07-26 15:00:04 -0400 |
commit | 89081d17f7bb81d89fa1aa9b70f821c5cf4d39e9 (patch) | |
tree | 1835fa64801fee048c8074ae4d63b0a7f4b14ee3 | |
parent | 3ab83521378268044a448113c6aa9a9e245f4d2f (diff) |
kexec jump: save/restore device state
This patch implements device state save/restore before and after kexec.
Together with the features in the kexec_jump patch, it can be used for
the following:
- A simple hibernation implementation without ACPI support. You can kexec a
hibernating kernel, save the memory image of the original system, and shut
down the system. When resuming, you restore the memory image of the original
system via an ordinary kexec load, then jump back.
- Kernel/system debugging by taking system snapshots. You can take a system
snapshot, jump back, do something, and take another system snapshot.
- Cooperative multi-kernel/system. With kexec jump, you can switch between
several kernels/systems quickly, going through the boot process only the
first time. This looks like swapping a whole kernel/system out and in.
- A general method for calling a program in physical mode (paging turned
off). This can be used to invoke BIOS code under Linux.
The following user-space tools can be used with kexec jump:
- kexec-tools needs to be patched to support kexec jump. The patches
and the precompiled kexec can be downloaded from the following URLs:
source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
- makedumpfile with patches is used as the memory image saving tool; it
can exclude free pages from the original kernel memory image file. The
patches and the precompiled makedumpfile can be downloaded from the
following URLs:
source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2
patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2
binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10
- An initramfs image can be used as the root file system of the kexeced
kernel. An initramfs image built with "BuildRoot" can be downloaded
from the following URL:
initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz
All user-space tools above are included in the initramfs image.
Usage example of simple hibernation:
1. Compile and install the patched kernel with the following options selected:
CONFIG_X86_32=y
CONFIG_RELOCATABLE=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PM=y
CONFIG_HIBERNATION=y
CONFIG_KEXEC_JUMP=y
2. Build an initramfs image that contains kexec-tools and makedumpfile, or
download the pre-built initramfs image, called rootfs.gz in the
following text.
3. Prepare a partition to save the memory image of the original kernel,
called the hibernating partition in the following text.
4. Boot the kernel compiled in step 1 (kernel A).
5. In kernel A, load the kernel compiled in step 1 (kernel B) with
/sbin/kexec. The shell command line can be as follows:
/sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000 \
--mem-max=0xffffff --initrd=rootfs.gz
6. Boot kernel B with the following shell command line:
/sbin/kexec -e
7. Kernel B will boot as with a normal kexec. In kernel B, the memory
image of kernel A can be saved into the hibernating partition as
follows:
jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '=' -f 2`
echo $jump_back_entry > kexec_jump_back_entry
cp /proc/vmcore dump.elf
Then you can shut down the machine as normal.
8. Boot the kernel compiled in step 1 (kernel C). Use rootfs.gz as the
root file system.
9. In kernel C, load the memory image of kernel A as follows:
/sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf
10. Jump back to kernel A as follows:
/sbin/kexec -e
Then, kernel A is resumed.
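Step 7 pulls the kexec_jump_back_entry parameter out of /proc/cmdline with
a shell pipeline. As a minimal sketch of the same lookup in C, assuming the
parameter appears as a space-separated key=value token, a helper could look
like the following. cmdline_get_value() is a hypothetical name used for
illustration; it is not part of kexec-tools or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of kexec-tools): copy the value of
 * "key=value" from a space-separated kernel command line into out.
 * Returns 0 on success, -1 if the key is absent or out is too small.
 */
int cmdline_get_value(const char *cmdline, const char *key,
		      char *out, size_t outlen)
{
	size_t keylen = strlen(key);
	const char *p = cmdline;

	assert(outlen > 0);

	while (*p) {
		size_t toklen;

		/* Skip separators between parameters. */
		while (*p == ' ' || *p == '\n')
			p++;
		if (*p == '\0')
			break;

		/* Length of the current parameter token. */
		toklen = strcspn(p, " \n");

		/* Match "key=" exactly at the start of the token. */
		if (toklen > keylen && strncmp(p, key, keylen) == 0 &&
		    p[keylen] == '=') {
			size_t vallen = toklen - keylen - 1;

			if (vallen >= outlen)
				return -1;
			memcpy(out, p + keylen + 1, vallen);
			out[vallen] = '\0';
			return 0;
		}
		p += toklen;
	}
	return -1;
}
```

Given the made-up command line "root=/dev/ram0 kexec_jump_back_entry=0x1234abcd",
the helper copies "0x1234abcd" into the output buffer. Unlike the grep in
step 7, it matches the key only at the start of a token.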
Implementation point:
To support jumping between two kernels, the devices are put into a
quiescent state and the state of the devices and CPU is saved before
jumping to (executing) the new kernel and before jumping back to the
original kernel. After jumping back from the kexeced kernel and after
jumping to the new kernel, the state of the devices and CPU is restored
accordingly. The device/CPU state save/restore code of software suspend
is called to implement the corresponding functions.
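The ordering described above, and the way each failure unwinds only the
steps already taken, can be sketched in user-space C. This is an
illustration under simplifying assumptions, not the kernel code: step(),
record(), jump_sketch() and jump_trace() are hypothetical stand-ins, each
suspend step is reduced to a one-letter trace entry, and console handling
and pm_mutex are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Trace of the steps taken, one letter per step. */
static char trace[32];
static size_t trace_len;
/* Number of the step that should fail; 0 means no failure. */
static int fail_at;

static void record(char c)
{
	assert(trace_len + 1 < sizeof(trace));
	trace[trace_len++] = c;
	trace[trace_len] = '\0';
}

/* Stand-in for one suspend step: record it, fail if it is step fail_at. */
static int step(int n, char c)
{
	record(c);
	return n == fail_at ? -1 : 0;
}

/* Accessor so callers can inspect the order of the last run. */
const char *jump_trace(void)
{
	return trace;
}

int jump_sketch(int fail_step)
{
	int error;

	memset(trace, 0, sizeof(trace));
	trace_len = 0;
	fail_at = fail_step;

	error = step(1, 'F');	/* freeze processes */
	if (error)
		goto out;
	error = step(2, 'D');	/* suspend devices */
	if (error)
		goto thaw;
	error = step(3, 'C');	/* take non-boot CPUs offline */
	if (error)
		goto resume_devices;
	error = step(4, 'P');	/* power down devices, interrupts off */
	if (error)
		goto enable_cpus;

	record('J');		/* jump to the other kernel and come back */

	record('p');		/* power devices back up */
enable_cpus:
	record('c');		/* bring non-boot CPUs back online */
resume_devices:
	record('d');		/* resume devices */
thaw:
	record('f');		/* thaw processes */
out:
	return error;
}
```

A run with no failure, jump_sketch(0), leaves the trace "FDCPJpcdf". A
failure at step 3, jump_sketch(3), leaves "FDCdf": only the steps that
already succeeded are undone, in reverse order.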
Known issues:
- Because the number of segments supported by sys_kexec_load is limited,
a hibernation image with many segments may fail to load. This is
planned to be eliminated by adding a new flag to sys_kexec_load so
that an image can be loaded with multiple sys_kexec_load invocations.
Currently, only the i386 architecture is supported.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r-- | arch/x86/Kconfig | 5 |
-rw-r--r-- | arch/x86/kernel/machine_kexec_32.c | 12 |
-rw-r--r-- | include/linux/suspend.h | 2 |
-rw-r--r-- | kernel/kexec.c | 39 |
-rw-r--r-- | kernel/power/power.h | 2 |
5 files changed, 56 insertions, 4 deletions
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7ecb679f0130..6b2debfabddc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1282,9 +1282,10 @@ config CRASH_DUMP
 config KEXEC_JUMP
 	bool "kexec jump (EXPERIMENTAL)"
 	depends on EXPERIMENTAL
-	depends on KEXEC && PM_SLEEP && X86_32
+	depends on KEXEC && HIBERNATION && X86_32
 	help
-	  Invoke code in physical address mode via KEXEC
+	  Jump between original kernel and kexeced kernel and invoke
+	  code in physical address mode via KEXEC
 
 config PHYSICAL_START
 	hex "Physical address where the kernel is loaded" if (EMBEDDED || CRASH_DUMP)
diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c
index 2b67609d0a1c..9fe478d98406 100644
--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -125,6 +125,18 @@ void machine_kexec(struct kimage *image)
 	/* Interrupts aren't acceptable while we reboot */
 	local_irq_disable();
 
+	if (image->preserve_context) {
+#ifdef CONFIG_X86_IO_APIC
+		/* We need to put APICs in legacy mode so that we can
+		 * get timer interrupts in second kernel. kexec/kdump
+		 * paths already have calls to disable_IO_APIC() in
+		 * one form or other. kexec jump path also need
+		 * one.
+		 */
+		disable_IO_APIC();
+#endif
+	}
+
 	control_page = page_address(image->control_code_page);
 	memcpy(control_page, relocate_kernel, PAGE_SIZE/2);
 
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index e8e69159af71..c63435095970 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -278,4 +278,6 @@ static inline void register_nosave_region_late(unsigned long b, unsigned long e)
 }
 #endif
 
+extern struct mutex pm_mutex;
+
 #endif /* _LINUX_SUSPEND_H */
diff --git a/kernel/kexec.c b/kernel/kexec.c
index a0d920915b38..c8a4370e2a34 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -26,6 +26,10 @@
 #include <linux/numa.h>
 #include <linux/suspend.h>
 #include <linux/device.h>
+#include <linux/freezer.h>
+#include <linux/pm.h>
+#include <linux/cpu.h>
+#include <linux/console.h>
 
 #include <asm/page.h>
 #include <asm/uaccess.h>
@@ -1441,7 +1445,31 @@ int kernel_kexec(void)
 
 	if (kexec_image->preserve_context) {
 #ifdef CONFIG_KEXEC_JUMP
+		mutex_lock(&pm_mutex);
+		pm_prepare_console();
+		error = freeze_processes();
+		if (error) {
+			error = -EBUSY;
+			goto Restore_console;
+		}
+		suspend_console();
+		error = device_suspend(PMSG_FREEZE);
+		if (error)
+			goto Resume_console;
+		error = disable_nonboot_cpus();
+		if (error)
+			goto Resume_devices;
 		local_irq_disable();
+		/* At this point, device_suspend() has been called,
+		 * but *not* device_power_down(). We *must*
+		 * device_power_down() now. Otherwise, drivers for
+		 * some devices (e.g. interrupt controllers) become
+		 * desynchronized with the actual state of the
+		 * hardware at resume time, and evil weirdness ensues.
+		 */
+		error = device_power_down(PMSG_FREEZE);
+		if (error)
+			goto Enable_irqs;
 		save_processor_state();
 #endif
 	} else {
@@ -1459,7 +1487,18 @@ int kernel_kexec(void)
 	if (kexec_image->preserve_context) {
 #ifdef CONFIG_KEXEC_JUMP
 		restore_processor_state();
+		device_power_up(PMSG_RESTORE);
+ Enable_irqs:
 		local_irq_enable();
+		enable_nonboot_cpus();
+ Resume_devices:
+		device_resume(PMSG_RESTORE);
+ Resume_console:
+		resume_console();
+		thaw_processes();
+ Restore_console:
+		pm_restore_console();
+		mutex_unlock(&pm_mutex);
 #endif
 	}
 
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 700f44ec8406..acc0c101dbd5 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -53,8 +53,6 @@ extern int hibernation_platform_enter(void);
 
 extern int pfn_is_nosave(unsigned long);
 
-extern struct mutex pm_mutex;
-
 #define power_attr(_name) \
 static struct kobj_attribute _name##_attr = { \
 	.attr = { \