Diffstat (limited to 'Documentation/power/basic-pm-debugging.txt')
-rw-r--r-- Documentation/power/basic-pm-debugging.txt 216
1 files changed, 157 insertions, 59 deletions
diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt
index 57aef2f6e0de..1555001bc733 100644
--- a/Documentation/power/basic-pm-debugging.txt
+++ b/Documentation/power/basic-pm-debugging.txt
@@ -1,45 +1,111 @@
1Debugging suspend and resume 1Debugging hibernation and suspend
2 (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL 2 (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL
3 3
41. Testing suspend to disk (STD) 41. Testing hibernation (aka suspend to disk or STD)
5 5
6To verify that the STD works, you can try to suspend in the "reboot" mode: 6To check if hibernation works, you can try to hibernate in the "reboot" mode:
7 7
8# echo reboot > /sys/power/disk 8# echo reboot > /sys/power/disk
9# echo disk > /sys/power/state 9# echo disk > /sys/power/state
10 10
11and the system should suspend, reboot, resume and get back to the command prompt 11and the system should create a hibernation image, reboot, resume and get back to
12where you have started the transition. If that happens, the STD is most likely 12the command prompt where you have started the transition. If that happens,
13to work correctly, but you need to repeat the test at least a couple of times in 13hibernation is most likely to work correctly. Still, you need to repeat the
14a row for confidence. This is necessary, because some problems only show up on 14test at least a couple of times in a row for confidence. [This is necessary,
15a second attempt at suspending and resuming the system. You should also test 15because some problems only show up on a second attempt at suspending and
16the "platform" and "shutdown" modes of suspend: 16resuming the system.] Moreover, hibernating in the "reboot" and "shutdown"
17modes causes the PM core to skip some platform-related callbacks which on ACPI
18systems might be necessary to make hibernation work. Thus, if your machine fails
19to hibernate or resume in the "reboot" mode, you should try the "platform" mode:
17 20
18# echo platform > /sys/power/disk 21# echo platform > /sys/power/disk
19# echo disk > /sys/power/state 22# echo disk > /sys/power/state
20 23
21or 24which is the default and recommended mode of hibernation.
25
26Unfortunately, the "platform" mode of hibernation does not work on some systems
27with broken BIOSes. In such cases the "shutdown" mode of hibernation might
28work:
22 29
23# echo shutdown > /sys/power/disk 30# echo shutdown > /sys/power/disk
24# echo disk > /sys/power/state 31# echo disk > /sys/power/state
25 32
26in which cases you will have to press the power button to make the system 33(it is similar to the "reboot" mode, but it requires you to press the power
27resume. If that does not work, you will need to identify what goes wrong. 34button to make the system resume).
35
36If neither "platform" nor "shutdown" hibernation mode works, you will need to
37identify what goes wrong.
38
39a) Test modes of hibernation
40
41To find out why hibernation fails on your system, you can use a special testing
42facility available if the kernel is compiled with CONFIG_PM_DEBUG set. Then,
43there is the file /sys/power/pm_test that can be used to make the hibernation
44core run in a test mode. There are 5 test modes available:
45
46freezer
47- test the freezing of processes
48
49devices
50- test the freezing of processes and suspending of devices
28 51
29a) Test mode of STD 52platform
53- test the freezing of processes, suspending of devices and platform
54 global control methods(*)
30 55
31To verify if there are any drivers that cause problems you can run the STD 56processors
32in the test mode: 57- test the freezing of processes, suspending of devices, platform
58 global control methods(*) and the disabling of nonboot CPUs
33 59
34# echo test > /sys/power/disk 60core
61- test the freezing of processes, suspending of devices, platform global
62 control methods(*), the disabling of nonboot CPUs and suspending of
63 platform/system devices
64
65(*) the platform global control methods are only available on ACPI systems
66 and are only tested if the hibernation mode is set to "platform"
67
68To use one of them it is necessary to write the corresponding string to
69/sys/power/pm_test (eg. "devices" to test the freezing of processes and
70suspending devices) and issue the standard hibernation commands. For example,
71to use the "devices" test mode along with the "platform" mode of hibernation,
72you should do the following:
73
74# echo devices > /sys/power/pm_test
75# echo platform > /sys/power/disk
35# echo disk > /sys/power/state 76# echo disk > /sys/power/state
36 77
37in which case the system should freeze tasks, suspend devices, disable nonboot 78Then, the kernel will try to freeze processes, suspend devices, wait 5 seconds,
38CPUs (if any), wait for 5 seconds, enable nonboot CPUs, resume devices, thaw 79resume devices and thaw processes. If "platform" is written to
39tasks and return to your command prompt. If that fails, most likely there is 80/sys/power/pm_test , then after suspending devices the kernel will additionally
40a driver that fails to either suspend or resume (in the latter case the system 81invoke the global control methods (eg. ACPI global control methods) used to
41may hang or be unstable after the test, so please take that into consideration). 82prepare the platform firmware for hibernation. Next, it will wait 5 seconds and
42To find this driver, you can carry out a binary search according to the rules: 83invoke the platform (eg. ACPI) global methods used to cancel hibernation etc.
84
85Writing "none" to /sys/power/pm_test causes the kernel to switch to the normal
86hibernation/suspend operations. Also, when open for reading, /sys/power/pm_test
87contains a space-separated list of all available tests (including "none" that
88represents the normal functionality) in which the current test level is
89indicated by square brackets.
90
91Generally, as you can see, each test level is more "invasive" than the previous
92one and the "core" level tests the hardware and drivers as deeply as possible
93without creating a hibernation image. Obviously, if the "devices" test fails,
94the "platform" test will fail as well and so on. Thus, as a rule of thumb, you
95should try the test modes starting from "freezer", through "devices", "platform"
96and "processors" up to "core" (repeat the test on each level a couple of times
97to make sure that any random factors are avoided).
98
99If the "freezer" test fails, there is a task that cannot be frozen (in that case
100it usually is possible to identify the offending task by analysing the output of
101dmesg obtained after the failing test). Failure at this level usually means
102that there is a problem with the tasks freezer subsystem that should be
103reported.
104
105If the "devices" test fails, most likely there is a driver that cannot suspend
106or resume its device (in the latter case the system may hang or become unstable
107after the test, so please take that into consideration). To find this driver,
108you can carry out a binary search according to the rules:
43- if the test fails, unload a half of the drivers currently loaded and repeat 109- if the test fails, unload a half of the drivers currently loaded and repeat
44(that would probably involve rebooting the system, so always note what drivers 110(that would probably involve rebooting the system, so always note what drivers
45have been loaded before the test), 111have been loaded before the test),
@@ -47,23 +113,46 @@ have been loaded before the test),
47recently and repeat. 113recently and repeat.
48 114
49Once you have found the failing driver (there can be more than just one of 115Once you have found the failing driver (there can be more than just one of
50them), you have to unload it every time before the STD transition. In that case 116them), you have to unload it every time before hibernation. In that case please
51please make sure to report the problem with the driver. 117make sure to report the problem with the driver.
52 118
53It is also possible that a cycle can still fail after you have unloaded 119It is also possible that the "devices" test will still fail after you have
54all modules. In that case, you would want to look in your kernel configuration 120unloaded all modules. In that case, you may want to look in your kernel
55for the drivers that can be compiled as modules (testing again with them as 121configuration for the drivers that can be compiled as modules (and test again
56modules), and possibly also try boot time options such as "noapic" or "noacpi". 122with these drivers compiled as modules). You may also try to use some special
123kernel command line options such as "noapic", "noacpi" or even "acpi=off".
124
125If the "platform" test fails, there is a problem with the handling of the
126platform (eg. ACPI) firmware on your system. In that case the "platform" mode
127of hibernation is not likely to work. You can try the "shutdown" mode, but that
128is rather a poor man's workaround.
129
130If the "processors" test fails, the disabling/enabling of nonboot CPUs does not
131work (of course, this only may be an issue on SMP systems) and the problem
132should be reported. In that case you can also try to switch the nonboot CPUs
133off and on using the /sys/devices/system/cpu/cpu*/online sysfs attributes and
134see if that works.
135
136If the "core" test fails, which means that suspending of the system/platform
137devices has failed (these devices are suspended on one CPU with interrupts off),
138the problem is most probably hardware-related and serious, so it should be
139reported.
140
141A failure of any of the "platform", "processors" or "core" tests may cause your
142system to hang or become unstable, so please beware. Such a failure usually
143indicates a serious problem that very well may be related to the hardware, but
144please report it anyway.
57 145
58b) Testing minimal configuration 146b) Testing minimal configuration
59 147
60If the test mode of STD works, you can boot the system with "init=/bin/bash" 148If all of the hibernation test modes work, you can boot the system with the
61and attempt to suspend in the "reboot", "shutdown" and "platform" modes. If 149"init=/bin/bash" command line parameter and attempt to hibernate in the
62that does not work, there probably is a problem with a driver statically 150"reboot", "shutdown" and "platform" modes. If that does not work, there
63compiled into the kernel and you can try to compile more drivers as modules, 151probably is a problem with a driver statically compiled into the kernel and you
64so that they can be tested individually. Otherwise, there is a problem with a 152can try to compile more drivers as modules, so that they can be tested
65modular driver and you can find it by loading a half of the modules you normally 153individually. Otherwise, there is a problem with a modular driver and you can
66use and binary searching in accordance with the algorithm: 154find it by loading a half of the modules you normally use and binary searching
155in accordance with the algorithm:
67- if there are n modules loaded and the attempt to suspend and resume fails, 156- if there are n modules loaded and the attempt to suspend and resume fails,
68unload n/2 of the modules and try again (that would probably involve rebooting 157unload n/2 of the modules and try again (that would probably involve rebooting
69the system), 158the system),
@@ -71,19 +160,19 @@ the system),
71load n/2 modules more and try again. 160load n/2 modules more and try again.
72 161
73Again, if you find the offending module(s), it(they) must be unloaded every time 162Again, if you find the offending module(s), it(they) must be unloaded every time
74before the STD transition, and please report the problem with it(them). 163before hibernation, and please report the problem with it(them).
75 164
76c) Advanced debugging 165c) Advanced debugging
77 166
78In case the STD does not work on your system even in the minimal configuration 167In case that hibernation does not work on your system even in the minimal
79and compiling more drivers as modules is not practical or some modules cannot 168configuration and compiling more drivers as modules is not practical or some
80be unloaded, you can use one of the more advanced debugging techniques to find 169modules cannot be unloaded, you can use one of the more advanced debugging
81the problem. First, if there is a serial port in your box, you can boot the 170techniques to find the problem. First, if there is a serial port in your box,
82kernel with the 'no_console_suspend' parameter and try to log kernel 171you can boot the kernel with the 'no_console_suspend' parameter and try to log
83messages using the serial console. This may provide you with some information 172kernel messages using the serial console. This may provide you with some
84about the reasons of the suspend (resume) failure. Alternatively, it may be 173information about the reasons of the suspend (resume) failure. Alternatively,
85possible to use a FireWire port for debugging with firescope 174it may be possible to use a FireWire port for debugging with firescope
86(ftp://ftp.firstfloor.org/pub/ak/firescope/). On i386 it is also possible to 175(ftp://ftp.firstfloor.org/pub/ak/firescope/). On x86 it is also possible to
87use the PM_TRACE mechanism documented in Documentation/s2ram.txt . 176use the PM_TRACE mechanism documented in Documentation/s2ram.txt .
88 177
892. Testing suspend to RAM (STR) 1782. Testing suspend to RAM (STR)
@@ -91,16 +180,25 @@ use the PM_TRACE mechanism documented in Documentation/s2ram.txt .
91To verify that the STR works, it is generally more convenient to use the s2ram 180To verify that the STR works, it is generally more convenient to use the s2ram
92tool available from http://suspend.sf.net and documented at 181tool available from http://suspend.sf.net and documented at
93http://en.opensuse.org/s2ram . However, before doing that it is recommended to 182http://en.opensuse.org/s2ram . However, before doing that it is recommended to
94carry out the procedure described in section 1. 183carry out STR testing using the facility described in section 1.
95 184
96Assume you have resolved the problems with the STD and you have found some 185Namely, after writing "freezer", "devices", "platform", "processors", or "core"
97failing drivers. These drivers are also likely to fail during the STR or 186into /sys/power/pm_test (available if the kernel is compiled with
98during the resume, so it is better to unload them every time before the STR 187CONFIG_PM_DEBUG set) the suspend code will work in the test mode corresponding
99transition. Now, you can follow the instructions at 188to given string. The STR test modes are defined in the same way as for
100http://en.opensuse.org/s2ram to test the system, but if it does not work 189hibernation, so please refer to Section 1 for more information about them. In
101"out of the box", you may need to boot it with "init=/bin/bash" and test 190particular, the "core" test allows you to test everything except for the actual
102s2ram in the minimal configuration. In that case, you may be able to search 191invocation of the platform firmware in order to put the system into the sleep
103for failing drivers by following the procedure analogous to the one described in 192state.
1041b). If you find some failing drivers, you will have to unload them every time 193
105before the STR transition (ie. before you run s2ram), and please report the 194Among other things, the testing with the help of /sys/power/pm_test may allow
106problems with them. 195you to identify drivers that fail to suspend or resume their devices. They
196should be unloaded every time before an STR transition.
197
198Next, you can follow the instructions at http://en.opensuse.org/s2ram to test
199the system, but if it does not work "out of the box", you may need to boot it
200with "init=/bin/bash" and test s2ram in the minimal configuration. In that
201case, you may be able to search for failing drivers by following the procedure
202analogous to the one described in section 1. If you find some failing drivers,
203you will have to unload them every time before an STR transition (ie. before
204you run s2ram), and please report the problems with them.
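
The test-level walk recommended in section 1a) of the new text (start at
"freezer", go through "devices", "platform" and "processors" up to "core",
then write "none" to return to normal operation) could be scripted roughly as
follows. This is only a sketch and not part of the patch: the PM_DIR variable
and the pm_test_* function names are hypothetical helpers, and it assumes that
a failed test shows up as a failed write to the state file. Running it against
the real /sys/power requires root and a kernel built with CONFIG_PM_DEBUG.

```shell
#!/bin/sh
# Sketch only: PM_DIR and the pm_test_* names are hypothetical helpers.
# PM_DIR defaults to the real sysfs directory; point it at a scratch
# directory to dry-run the logic without touching the system.
PM_DIR="${PM_DIR:-/sys/power}"

# Print the currently selected test level, i.e. the entry enclosed in
# square brackets in the space-separated list read from pm_test.
pm_test_current() {
    tr ' ' '\n' < "$PM_DIR/pm_test" | sed -n 's/^\[\(.*\)\]$/\1/p'
}

# Run one test cycle: $1 is the test level, $2 the hibernation mode.
pm_test_run() {
    echo "$1" > "$PM_DIR/pm_test" || return 1
    echo "$2" > "$PM_DIR/disk" || return 1
    echo disk > "$PM_DIR/state"
}

# Walk the levels from the least to the most invasive one and stop at
# the first failure. The hibernation mode defaults to "platform".
pm_test_all() {
    mode="${1:-platform}"
    for level in freezer devices platform processors core; do
        if ! pm_test_run "$level" "$mode"; then
            echo "test level '$level' failed" >&2
            break
        fi
    done
    # Switch back to the normal hibernation/suspend operations.
    echo none > "$PM_DIR/pm_test"
}
```

After sourcing the file, "pm_test_all platform" runs each level once. Since
the text advises repeating every level a couple of times, the call may be
wrapped in a loop, and the output of dmesg should be checked after a failure.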