author | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
commit | 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (patch) | |
tree | 0bba044c4ce775e45a88a51686b5d9f90697ea9d /Documentation/power |
Linux-2.6.12-rc2 (tag: v2.6.12-rc2)
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
Diffstat (limited to 'Documentation/power')
-rw-r--r-- | Documentation/power/devices.txt | 319 | ||||
-rw-r--r-- | Documentation/power/interface.txt | 43 | ||||
-rw-r--r-- | Documentation/power/kernel_threads.txt | 41 | ||||
-rw-r--r-- | Documentation/power/pci.txt | 332 | ||||
-rw-r--r-- | Documentation/power/states.txt | 79 | ||||
-rw-r--r-- | Documentation/power/swsusp.txt | 235 | ||||
-rw-r--r-- | Documentation/power/tricks.txt | 27 | ||||
-rw-r--r-- | Documentation/power/video.txt | 169 | ||||
-rw-r--r-- | Documentation/power/video_extension.txt | 34 |
9 files changed, 1279 insertions, 0 deletions
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt
new file mode 100644
index 000000000000..5d4ae9a39f1d
--- /dev/null
+++ b/Documentation/power/devices.txt
@@ -0,0 +1,319 @@ | |||
1 | |||
2 | Device Power Management | ||
3 | |||
4 | |||
5 | Device power management encompasses two areas - the ability to save | ||
6 | state and transition a device to a low-power state when the system is | ||
7 | entering a low-power state; and the ability to transition a device to | ||
8 | a low-power state while the system is running (and independently of | ||
9 | any other power management activity). | ||
10 | |||
11 | |||
12 | Methods | ||
13 | |||
14 | The methods to suspend and resume devices reside in struct bus_type: | ||
15 | |||
16 | struct bus_type { | ||
17 | ... | ||
18 | int (*suspend)(struct device * dev, pm_message_t state); | ||
19 | int (*resume)(struct device * dev); | ||
20 | }; | ||
21 | |||
22 | Each bus driver is responsible for implementing these methods, translating | ||
23 | the call into a bus-specific request and forwarding the call to the | ||
24 | bus-specific drivers. For example, PCI drivers implement suspend() and | ||
25 | resume() methods in struct pci_driver. The PCI core is simply | ||
26 | responsible for translating the pointers to PCI-specific ones and | ||
27 | calling the low-level driver. | ||
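The forwarding described above can be sketched in plain C. This is a minimal userspace model, not kernel code: the "foo" bus, struct foo_device, struct foo_driver and the demo_* callbacks are hypothetical names invented for illustration, and pm_message_t is reduced to a single field.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types, not the real definitions. */
typedef struct { int event; } pm_message_t;

struct device;

struct bus_type {
    int (*suspend)(struct device *dev, pm_message_t state);
    int (*resume)(struct device *dev);
};

struct device {
    struct bus_type *bus;
    void *driver;           /* bus-specific driver, opaque to the core */
};

/* Hypothetical bus-specific types for an imaginary "foo" bus. */
struct foo_device {
    struct device dev;      /* embedded generic device */
    int power_state;
};

struct foo_driver {
    int (*suspend)(struct foo_device *fdev, pm_message_t state);
    int (*resume)(struct foo_device *fdev);
};

/* Recover the bus-specific device from the embedded generic one. */
static struct foo_device *to_foo_device(struct device *dev)
{
    return (struct foo_device *)((char *)dev - offsetof(struct foo_device, dev));
}

/* Bus-level methods: translate the pointers, then call the driver. */
int foo_bus_suspend(struct device *dev, pm_message_t state)
{
    struct foo_driver *drv = dev->driver;

    if (drv && drv->suspend)
        return drv->suspend(to_foo_device(dev), state);
    return 0;               /* no driver bound, or nothing to do */
}

int foo_bus_resume(struct device *dev)
{
    struct foo_driver *drv = dev->driver;

    if (drv && drv->resume)
        return drv->resume(to_foo_device(dev));
    return 0;
}

struct bus_type foo_bus = {
    .suspend = foo_bus_suspend,
    .resume  = foo_bus_resume,
};

/* Example low-level driver callbacks (illustrative only). */
int demo_suspend(struct foo_device *fdev, pm_message_t state)
{
    fdev->power_state = state.event;
    return 0;
}

int demo_resume(struct foo_device *fdev)
{
    fdev->power_state = 0;
    return 0;
}
```

The generic core only ever sees struct device and struct bus_type; everything bus-specific stays behind the two bus-level functions.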
28 | |||
29 | This is done to a) ease transition to the new power management methods | ||
30 | and leverage the existing PM code in various bus drivers; b) allow | ||
31 | buses to implement generic and default PM routines for devices, and c) | ||
32 | make the flow of execution obvious to the reader. | ||
33 | |||
34 | |||
35 | System Power Management | ||
36 | |||
37 | When the system enters a low-power state, the device tree is walked in | ||
38 | a depth-first fashion to transition each device into a low-power | ||
39 | state. The ordering of the device tree is guaranteed by the order in | ||
40 | which devices get registered - children are never registered before | ||
41 | their ancestors, and devices are placed at the back of the list when | ||
42 | registered. By walking the list in reverse order, we are guaranteed to | ||
43 | suspend devices in the proper order. | ||
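The ordering argument above can be illustrated with a small userspace sketch. The fixed-size array and the names device_list, device_register, suspend_all and resume_all are assumptions made for the example; the kernel uses its own list structures.

```c
#include <stddef.h>

#define MAX_DEVICES 16

struct device {
    const char *name;
    int suspended;
};

/* Registration-ordered list: parents are always added before children. */
struct device_list {
    struct device *devs[MAX_DEVICES];
    size_t count;
};

/* Append the device at the back of the list. */
int device_register(struct device_list *list, struct device *dev)
{
    if (list->count == MAX_DEVICES)
        return -1;
    list->devs[list->count++] = dev;
    return 0;
}

/*
 * Suspend in reverse registration order, so every child is handled
 * before its ancestors. 'order' must have room for list->count entries
 * and receives the device names in the order they were suspended.
 */
size_t suspend_all(struct device_list *list, const char **order)
{
    size_t n = 0;

    for (size_t i = list->count; i-- > 0; ) {
        list->devs[i]->suspended = 1;
        order[n++] = list->devs[i]->name;
    }
    return n;
}

/* Resume in registration order, so ancestors come back first. */
size_t resume_all(struct device_list *list, const char **order)
{
    size_t n = 0;

    for (size_t i = 0; i < list->count; i++) {
        list->devs[i]->suspended = 0;
        order[n++] = list->devs[i]->name;
    }
    return n;
}
```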
44 | |||
45 | Devices are suspended once with interrupts enabled. Drivers are | ||
46 | expected to stop I/O transactions, save device state, and place the | ||
47 | device into a low-power state. Drivers may sleep, allocate memory, | ||
48 | etc. at will. | ||
49 | |||
50 | Some devices are broken and will inevitably have problems powering | ||
51 | down or disabling themselves with interrupts enabled. For these | ||
52 | special cases, they may return -EAGAIN. This will put the device on a | ||
53 | list to be taken care of later. When interrupts are disabled, before | ||
54 | we enter the low-power state, their drivers are called again to put | ||
55 | their device to sleep. | ||
56 | |||
57 | On resume, the devices that returned -EAGAIN will be called to power | ||
58 | themselves back on with interrupts disabled. Once interrupts have been | ||
59 | re-enabled, the rest of the drivers will be called to resume their | ||
60 | devices. On resume, a driver is responsible for powering back on each | ||
61 | device, restoring state, and re-enabling I/O transactions for that | ||
62 | device. | ||
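The two-pass suspend described above might look roughly like the following userspace model. The needs_irqs_off flag, the fixed deferred-list size and the function names are hypothetical; the interrupt state is only passed around as an argument here, nothing is actually disabled.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_DEFERRED 16

struct device {
    int needs_irqs_off;     /* hypothetical flag modelling "broken" hardware */
    int powered;
};

/* Model of a driver's suspend method: refuses while interrupts are enabled. */
static int dev_suspend(struct device *dev, int irqs_enabled)
{
    if (dev->needs_irqs_off && irqs_enabled)
        return -EAGAIN;
    dev->powered = 0;
    return 0;
}

/*
 * Pass 1 runs with interrupts enabled; devices returning -EAGAIN are put
 * on a deferred list. Pass 2 calls them again once interrupts are
 * (notionally) disabled. Returns 0 or the first hard error.
 */
int suspend_devices(struct device **devs, size_t n, size_t *ndeferred)
{
    struct device *deferred[MAX_DEFERRED];
    size_t count = 0;
    int err;

    if (n > MAX_DEFERRED)
        return -EINVAL;

    for (size_t i = n; i-- > 0; ) {     /* reverse registration order */
        err = dev_suspend(devs[i], 1);
        if (err == -EAGAIN)
            deferred[count++] = devs[i];
        else if (err)
            return err;
    }

    for (size_t i = 0; i < count; i++) {
        err = dev_suspend(deferred[i], 0);
        if (err)
            return err;
    }

    *ndeferred = count;
    return 0;
}
```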
63 | |||
64 | System devices follow a slightly different API, which can be found in | ||
65 | |||
66 | include/linux/sysdev.h | ||
67 | drivers/base/sys.c | ||
68 | |||
69 | System devices will only be suspended with interrupts disabled, and | ||
70 | after all other devices have been suspended. On resume, they will be | ||
71 | resumed before any other devices, and also with interrupts disabled. | ||
72 | |||
73 | |||
74 | Runtime Power Management | ||
75 | |||
76 | Many devices are able to dynamically power down while the system is | ||
77 | still running. This feature is useful for devices that are not being | ||
78 | used, and can offer significant power savings on a running system. | ||
79 | |||
80 | In each device's directory, there is a 'power' directory, which | ||
81 | contains at least a 'state' file. Reading from this file displays what | ||
82 | power state the device is currently in. Writing to this file initiates | ||
83 | a transition to the specified power state, which must be a decimal in | ||
84 | the range 1-3, inclusive; or 0 for 'On'. | ||
85 | |||
86 | The PM core will call the ->suspend() method in the bus_type object | ||
87 | that the device belongs to if the specified state is not 0, or | ||
88 | ->resume() if it is. | ||
89 | |||
90 | Nothing will happen if the specified state is the same state the | ||
91 | device is currently in. | ||
92 | |||
93 | If the device is already in a low-power state, and the specified state | ||
94 | is another, but different, low-power state, the ->resume() method will | ||
95 | first be called to power the device back on, then ->suspend() will be | ||
96 | called again with the new state. | ||
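The rules for a write to the 'state' file can be summarised in a short sketch. This is a userspace model under stated assumptions: set_power_state, bus_suspend and bus_resume are illustrative names, the call counters exist only to make the behaviour observable, and -EINVAL for an out-of-range value is an assumption rather than something the text specifies.

```c
#include <errno.h>

struct device {
    int power_state;        /* 0 = on, 1-3 = low-power states */
    int suspend_calls;      /* counters, only to observe behaviour */
    int resume_calls;
};

/* Stand-ins for the bus_type ->suspend() and ->resume() methods. */
static int bus_suspend(struct device *dev, int state)
{
    (void)state;
    dev->suspend_calls++;
    return 0;
}

static int bus_resume(struct device *dev)
{
    dev->resume_calls++;
    return 0;
}

int set_power_state(struct device *dev, int state)
{
    int err;

    if (state < 0 || state > 3)
        return -EINVAL;             /* assumed error code */
    if (state == dev->power_state)
        return 0;                   /* same state: nothing happens */

    /* Leaving a low-power state always goes through ->resume() first. */
    if (dev->power_state != 0) {
        err = bus_resume(dev);
        if (err)
            return err;
        dev->power_state = 0;
    }

    /* A non-zero target then goes through ->suspend(). */
    if (state != 0) {
        err = bus_suspend(dev, state);
        if (err)
            return err;
        dev->power_state = state;   /* updated only on success */
    }
    return 0;
}
```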
97 | |||
98 | The driver is responsible for saving the working state of the device | ||
99 | and putting it into the low-power state specified. If this was | ||
100 | successful, it returns 0, and the device's power_state field is | ||
101 | updated. | ||
102 | |||
103 | The driver must take care to know whether or not it is able to | ||
104 | properly resume the device, including all steps of reinitialization | ||
105 | necessary. (This is the hardest part, and the one most protected by | ||
106 | NDA'd documents). | ||
107 | |||
108 | The driver must also take care not to suspend a device that is | ||
109 | currently in use. It is the driver's responsibility to provide its own | ||
110 | exclusion mechanisms. | ||
111 | |||
112 | The runtime power transition happens with interrupts enabled. If a | ||
113 | device cannot support being powered down with interrupts, it may | ||
114 | return -EAGAIN (as it would during a system power management | ||
115 | transition), but it will _not_ be called again, and the transition | ||
116 | will fail. | ||
117 | |||
118 | There is currently no way to know what states a device or driver | ||
119 | supports a priori. This will change in the future. | ||
120 | |||
121 | pm_message_t meaning | ||
122 | |||
123 | pm_message_t has two fields: event ("major") and flags. If a driver | ||
124 | does not recognize the event code, it aborts the request and returns an | ||
125 | error. Some drivers may need to deal with special cases based on the | ||
126 | actual type of suspend operation being done at the system level. This | ||
127 | is why there are flags. | ||
128 | |||
129 | Event codes are: | ||
130 | |||
131 | ON -- no need to do anything except special cases like broken | ||
132 | HW. | ||
133 | |||
134 | # NOTIFICATION -- pretty much same as ON? | ||
135 | |||
136 | FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from | ||
137 | scratch. That probably means stop accepting upstream requests, the | ||
138 | actual policy of what to do with them being specific to a given | ||
139 | driver. It's acceptable for a network driver to just drop packets | ||
140 | while a block driver is expected to block the queue so no request is | ||
141 | lost. (Use IDE as an example on how to do that). FREEZE requires no | ||
142 | power state change, and it's expected for drivers to be able to | ||
143 | quickly transition back to operating state. | ||
144 | |||
145 | SUSPEND -- like FREEZE, but also put hardware into low-power state. If | ||
146 | there is a need to distinguish several levels of sleep, an additional | ||
147 | flag is probably the best way to do that. | ||
148 | |||
149 | Transitions are only from a resumed state to a suspended state, never | ||
150 | between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, | ||
151 | FREEZE -> SUSPEND or SUSPEND -> FREEZE can not). | ||
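A driver's suspend method that follows these rules could be sketched as below. The enum values, the two-field struct and struct demo_dev are illustrative stand-ins rather than the kernel's definitions, and -EINVAL is an assumed choice for "returning error". The flags field is deliberately ignored.

```c
#include <errno.h>

/* Illustrative event codes; the numeric values are assumptions. */
enum pm_event { EVENT_ON, EVENT_FREEZE, EVENT_SUSPEND };

typedef struct {
    int event;
    int flags;
} pm_message_t;

/* Hypothetical per-device state for the example. */
struct demo_dev {
    int io_stopped;         /* DMA and interrupts quiesced */
    int low_power;          /* hardware placed in a low-power state */
};

int demo_suspend(struct demo_dev *dev, pm_message_t msg)
{
    switch (msg.event) {
    case EVENT_ON:
        return 0;               /* nothing to do */
    case EVENT_FREEZE:
        dev->io_stopped = 1;    /* quiesce, no power state change */
        return 0;
    case EVENT_SUSPEND:
        dev->io_stopped = 1;    /* like FREEZE ... */
        dev->low_power = 1;     /* ... plus the low-power transition */
        return 0;
    default:
        return -EINVAL;         /* unknown event: abort the request */
    }
}
```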
152 | |||
153 | All events are: | ||
154 | |||
155 | [NOTE NOTE NOTE: If you are driver author, you should not care; you | ||
156 | should only look at event, and ignore flags.] | ||
157 | |||
158 | #Prepare for suspend -- userland is still running but we are going to | ||
159 | #enter suspend state. This gives drivers chance to load firmware from | ||
160 | #disk and store it in memory, or do other activities that require | ||
161 | #operating userland, ability to kmalloc GFP_KERNEL, etc... All of these | ||
162 | #are forbidden once the suspend dance is started. event = ON, flags = | ||
163 | #PREPARE_TO_SUSPEND | ||
164 | |||
165 | Apm standby -- prepare for APM event. Quiesce devices to make life | ||
166 | easier for APM BIOS. event = FREEZE, flags = APM_STANDBY | ||
167 | |||
168 | Apm suspend -- same as APM_STANDBY, but we should probably avoid | ||
169 | spinning down disks. event = FREEZE, flags = APM_SUSPEND | ||
170 | |||
171 | System halt, reboot -- quiesce devices to make life easier for BIOS. event | ||
172 | = FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT | ||
173 | |||
174 | System shutdown -- at least disks need to be spun down, or data may be | ||
175 | lost. Quiesce devices, just to make life easier for BIOS. event = | ||
176 | FREEZE, flags = SYSTEM_SHUTDOWN | ||
177 | |||
178 | Kexec -- turn off DMAs and put hardware into some state where new | ||
179 | kernel can take over. event = FREEZE, flags = KEXEC | ||
180 | |||
181 | Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake | ||
182 | may need to be enabled on some devices. This actually has at least 3 | ||
183 | subtypes, system can reboot, enter S4 and enter S5 at the end of | ||
184 | swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, | ||
185 | SYSTEM_SHUTDOWN, SYSTEM_S4 | ||
186 | |||
187 | Suspend to ram -- put devices into low power state. event = SUSPEND, | ||
188 | flags = SUSPEND_TO_RAM | ||
189 | |||
190 | Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put | ||
191 | devices into low power mode, but you must be able to reinitialize | ||
192 | device from scratch in resume method. This has two flavors: it is done | ||
193 | once on suspending kernel, once on resuming kernel. event = FREEZE, | ||
194 | flags = DURING_SUSPEND or DURING_RESUME | ||
195 | |||
196 | Device detach requested from /sys -- deinitialize device; probably same as | ||
197 | SYSTEM_SHUTDOWN, I do not understand this one too much. probably event | ||
198 | = FREEZE, flags = DEV_DETACH. | ||
199 | |||
200 | #These are not really events sent: | ||
201 | # | ||
202 | #System fully on -- device is working normally; this is probably never | ||
203 | #passed to suspend() method... event = ON, flags = 0 | ||
204 | # | ||
205 | #Ready after resume -- userland is now running, again. Time to free any | ||
206 | #memory you ate during prepare to suspend... event = ON, flags = | ||
207 | #READY_AFTER_RESUME | ||
208 | # | ||
209 | |||
210 | Driver Detach Power Management | ||
211 | |||
212 | The kernel now supports the ability to place a device in a low-power | ||
213 | state when it is detached from its driver, which happens when its | ||
214 | module is removed. | ||
215 | |||
216 | Each device contains a 'detach_state' file in its sysfs directory | ||
217 | which can be used to control this state. Reading from this file | ||
218 | displays what the current detach state is set to. This is 0 (On) by | ||
219 | default. A user may write a positive integer value to this file in the | ||
220 | range of 1-4 inclusive. | ||
221 | |||
222 | A value of 1-3 will indicate the device should be placed in that | ||
223 | low-power state, which will cause ->suspend() to be called for that | ||
224 | device. A value of 4 indicates that the device should be shutdown, so | ||
225 | ->shutdown() will be called for that device. | ||
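The mapping from a 'detach_state' value to the callback that gets invoked can be written down as a small pure function. The enum and function names are made up for this sketch; only the value ranges come from the text.

```c
/* Which callback a given detach_state value leads to. */
enum detach_action {
    DETACH_NONE,        /* 0: leave the device on */
    DETACH_SUSPEND,     /* 1-3: call ->suspend() with that state */
    DETACH_SHUTDOWN,    /* 4: call ->shutdown() */
    DETACH_INVALID      /* anything else is rejected */
};

enum detach_action detach_action_for(int detach_state)
{
    if (detach_state == 0)
        return DETACH_NONE;
    if (detach_state >= 1 && detach_state <= 3)
        return DETACH_SUSPEND;
    if (detach_state == 4)
        return DETACH_SHUTDOWN;
    return DETACH_INVALID;
}
```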
226 | |||
227 | The driver is responsible for reinitializing the device when the | ||
228 | module is re-inserted during its ->probe() (or equivalent) method. | ||
229 | The driver core will not call any extra functions when binding the | ||
230 | device to the driver. | ||
231 | |||
232 | pm_message_t meaning | ||
233 | |||
234 | pm_message_t has two fields: event ("major") and flags. If a driver | ||
235 | does not recognize the event code, it aborts the request and returns an | ||
236 | error. Some drivers may need to deal with special cases based on the | ||
237 | actual type of suspend operation being done at the system level. This | ||
238 | is why there are flags. | ||
239 | |||
240 | Event codes are: | ||
241 | |||
242 | ON -- no need to do anything except special cases like broken | ||
243 | HW. | ||
244 | |||
245 | # NOTIFICATION -- pretty much same as ON? | ||
246 | |||
247 | FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from | ||
248 | scratch. That probably means stop accepting upstream requests, the | ||
249 | actual policy of what to do with them being specific to a given | ||
250 | driver. It's acceptable for a network driver to just drop packets | ||
251 | while a block driver is expected to block the queue so no request is | ||
252 | lost. (Use IDE as an example on how to do that). FREEZE requires no | ||
253 | power state change, and it's expected for drivers to be able to | ||
254 | quickly transition back to operating state. | ||
255 | |||
256 | SUSPEND -- like FREEZE, but also put hardware into low-power state. If | ||
257 | there is a need to distinguish several levels of sleep, an additional | ||
258 | flag is probably the best way to do that. | ||
259 | |||
260 | Transitions are only from a resumed state to a suspended state, never | ||
261 | between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, | ||
262 | FREEZE -> SUSPEND or SUSPEND -> FREEZE can not). | ||
263 | |||
264 | All events are: | ||
265 | |||
266 | [NOTE NOTE NOTE: If you are driver author, you should not care; you | ||
267 | should only look at event, and ignore flags.] | ||
268 | |||
269 | #Prepare for suspend -- userland is still running but we are going to | ||
270 | #enter suspend state. This gives drivers chance to load firmware from | ||
271 | #disk and store it in memory, or do other activities that require | ||
272 | #operating userland, ability to kmalloc GFP_KERNEL, etc... All of these | ||
273 | #are forbidden once the suspend dance is started. event = ON, flags = | ||
274 | #PREPARE_TO_SUSPEND | ||
275 | |||
276 | Apm standby -- prepare for APM event. Quiesce devices to make life | ||
277 | easier for APM BIOS. event = FREEZE, flags = APM_STANDBY | ||
278 | |||
279 | Apm suspend -- same as APM_STANDBY, but we should probably avoid | ||
280 | spinning down disks. event = FREEZE, flags = APM_SUSPEND | ||
281 | |||
282 | System halt, reboot -- quiesce devices to make life easier for BIOS. event | ||
283 | = FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT | ||
284 | |||
285 | System shutdown -- at least disks need to be spun down, or data may be | ||
286 | lost. Quiesce devices, just to make life easier for BIOS. event = | ||
287 | FREEZE, flags = SYSTEM_SHUTDOWN | ||
288 | |||
289 | Kexec -- turn off DMAs and put hardware into some state where new | ||
290 | kernel can take over. event = FREEZE, flags = KEXEC | ||
291 | |||
292 | Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake | ||
293 | may need to be enabled on some devices. This actually has at least 3 | ||
294 | subtypes, system can reboot, enter S4 and enter S5 at the end of | ||
295 | swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, | ||
296 | SYSTEM_SHUTDOWN, SYSTEM_S4 | ||
297 | |||
298 | Suspend to ram -- put devices into low power state. event = SUSPEND, | ||
299 | flags = SUSPEND_TO_RAM | ||
300 | |||
301 | Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put | ||
302 | devices into low power mode, but you must be able to reinitialize | ||
303 | device from scratch in resume method. This has two flavors: it is done | ||
304 | once on suspending kernel, once on resuming kernel. event = FREEZE, | ||
305 | flags = DURING_SUSPEND or DURING_RESUME | ||
306 | |||
307 | Device detach requested from /sys -- deinitialize device; probably same as | ||
308 | SYSTEM_SHUTDOWN, I do not understand this one too much. probably event | ||
309 | = FREEZE, flags = DEV_DETACH. | ||
310 | |||
311 | #These are not really events sent: | ||
312 | # | ||
313 | #System fully on -- device is working normally; this is probably never | ||
314 | #passed to suspend() method... event = ON, flags = 0 | ||
315 | # | ||
316 | #Ready after resume -- userland is now running, again. Time to free any | ||
317 | #memory you ate during prepare to suspend... event = ON, flags = | ||
318 | #READY_AFTER_RESUME | ||
319 | # | ||
diff --git a/Documentation/power/interface.txt b/Documentation/power/interface.txt
new file mode 100644
index 000000000000..f5ebda5f4276
--- /dev/null
+++ b/Documentation/power/interface.txt
@@ -0,0 +1,43 @@ | |||
1 | Power Management Interface | ||
2 | |||
3 | |||
4 | The power management subsystem provides a unified sysfs interface to | ||
5 | userspace, regardless of what architecture or platform one is | ||
6 | running. The interface exists in /sys/power/ directory (assuming sysfs | ||
7 | is mounted at /sys). | ||
8 | |||
9 | /sys/power/state controls system power state. Reading from this file | ||
10 | returns what states are supported, which is hard-coded to 'standby' | ||
11 | (Power-On Suspend), 'mem' (Suspend-to-RAM), and 'disk' | ||
12 | (Suspend-to-Disk). | ||
13 | |||
14 | Writing one of those strings to this file causes the system to | ||
15 | transition into that state. Please see the file | ||
16 | Documentation/power/states.txt for a description of each of those | ||
17 | states. | ||
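From userspace, requesting a sleep state amounts to writing one of the three strings to the file. The helper below is a hedged sketch: the function names are invented for the example, the path is passed in rather than hard-coded, and the error codes returned are a choice made here, not part of the interface described above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* The three state strings named in the text. */
int is_valid_sleep_state(const char *state)
{
    return strcmp(state, "standby") == 0 ||
           strcmp(state, "mem") == 0 ||
           strcmp(state, "disk") == 0;
}

/*
 * Write 'state' to the file at 'path' (normally /sys/power/state).
 * Returns 0 on success or a negative errno value.
 */
int request_sleep_state(const char *path, const char *state)
{
    FILE *f;
    int err = 0;

    if (!is_valid_sleep_state(state))
        return -EINVAL;

    f = fopen(path, "w");
    if (!f)
        return -errno;

    if (fputs(state, f) == EOF)
        err = -EIO;
    /* Buffered data is flushed on close, so check fclose() as well. */
    if (fclose(f) == EOF && !err)
        err = -EIO;
    return err;
}
```

Calling request_sleep_state("/sys/power/state", "mem") with sufficient privileges would actually put the machine to sleep, so it is worth testing against an ordinary file first.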
18 | |||
19 | |||
20 | /sys/power/disk controls the operating mode of the suspend-to-disk | ||
21 | mechanism. Suspend-to-disk can be handled in several ways. The | ||
22 | greatest distinction is who writes memory to disk - the firmware or | ||
23 | the kernel. If the firmware does it, we assume that it also handles | ||
24 | suspending the system. | ||
25 | |||
26 | If the kernel does it, then we have three options for putting the system | ||
27 | to sleep - using the platform driver (e.g. ACPI or other PM | ||
28 | registers), powering off the system or rebooting the system (for | ||
29 | testing). The system will support either 'firmware' or 'platform', and | ||
30 | that is known a priori. But, the user may choose 'shutdown' or | ||
31 | 'reboot' as alternatives. | ||
32 | |||
33 | Reading from this file will display what the mode is currently set | ||
34 | to. Writing to this file will accept one of | ||
35 | |||
36 | 'firmware' | ||
37 | 'platform' | ||
38 | 'shutdown' | ||
39 | 'reboot' | ||
40 | |||
41 | It will only change to 'firmware' or 'platform' if the system supports | ||
42 | it. | ||
43 | |||
diff --git a/Documentation/power/kernel_threads.txt b/Documentation/power/kernel_threads.txt
new file mode 100644
index 000000000000..60b548105edf
--- /dev/null
+++ b/Documentation/power/kernel_threads.txt
@@ -0,0 +1,41 @@ | |||
1 | KERNEL THREADS | ||
2 | |||
3 | |||
4 | Freezer | ||
5 | |||
6 | Upon entering a suspended state the system will freeze all | ||
7 | tasks. This is done by delivering pseudosignals. This affects | ||
8 | kernel threads, too. To successfully freeze a kernel thread | ||
9 | the thread has to check for the pseudosignal and enter the | ||
10 | refrigerator. Code to do this looks like this: | ||
11 | |||
12 | do { | ||
13 | hub_events(); | ||
14 | wait_event_interruptible(khubd_wait, !list_empty(&hub_event_list)); | ||
15 | if (current->flags & PF_FREEZE) | ||
16 | refrigerator(PF_FREEZE); | ||
17 | } while (!signal_pending(current)); | ||
18 | |||
19 | from drivers/usb/core/hub.c::hub_thread() | ||
20 | |||
21 | |||
22 | The Unfreezable | ||
23 | |||
24 | Some kernel threads, however, must not be frozen. The kernel must | ||
25 | be able to finish pending IO operations and later on be able to | ||
26 | write the memory image to disk. Kernel threads needed to do IO | ||
27 | must stay awake. Such threads must mark themselves unfreezable | ||
28 | like this: | ||
29 | |||
30 | /* | ||
31 | * This thread doesn't need any user-level access, | ||
32 | * so get rid of all our resources. | ||
33 | */ | ||
34 | daemonize("usb-storage"); | ||
35 | |||
36 | current->flags |= PF_NOFREEZE; | ||
37 | |||
38 | from drivers/usb/storage/usb.c::usb_stor_control_thread() | ||
39 | |||
40 | Such drivers are themselves responsible for staying quiet during | ||
41 | the actual snapshotting. | ||
diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt
new file mode 100644
index 000000000000..c85428e7ad92
--- /dev/null
+++ b/Documentation/power/pci.txt
@@ -0,0 +1,332 @@ | |||
1 | |||
2 | PCI Power Management | ||
3 | ~~~~~~~~~~~~~~~~~~~~ | ||
4 | |||
5 | An overview of the concepts and the related functions in the Linux kernel | ||
6 | |||
7 | Patrick Mochel <mochel@transmeta.com> | ||
8 | (and others) | ||
9 | |||
10 | --------------------------------------------------------------------------- | ||
11 | |||
12 | 1. Overview | ||
13 | 2. How The PCI Subsystem Handles Power Management | ||
14 | 3. PCI Utility Functions | ||
15 | 4. PCI Device Drivers | ||
16 | 5. Resources | ||
17 | |||
18 | 1. Overview | ||
19 | ~~~~~~~~~~~ | ||
20 | |||
21 | The PCI Power Management Specification was introduced between the PCI 2.1 and | ||
22 | PCI 2.2 Specifications. It is a standard interface for controlling various | ||
23 | power management operations. | ||
24 | |||
25 | Implementation of the PCI PM Spec is optional, as are several sub-components of | ||
26 | it. If a device supports the PCI PM Spec, the device will have an 8 byte | ||
27 | capability field in its PCI configuration space. This field is used to describe | ||
28 | and control the standard PCI power management features. | ||
29 | |||
30 | The PCI PM spec defines 4 operating states for devices (D0 - D3) and for buses | ||
31 | (B0 - B3). The higher the number, the less power the device consumes. However, | ||
32 | the higher the number, the longer the latency is for the device to return to | ||
33 | an operational state (D0). | ||
34 | |||
35 | There are actually two D3 states. When someone talks about D3, they usually | ||
36 | mean D3hot, which corresponds to an ACPI D2 state (power is reduced, the | ||
37 | device may lose some context). But they may also mean D3cold, which is an | ||
38 | ACPI D3 state (power is fully off, all state was discarded); or both. | ||
39 | |||
40 | Bus power management is not covered in this version of this document. | ||
41 | |||
42 | Note that all PCI devices support D0 and D3cold by default, regardless of | ||
43 | whether or not they implement any of the PCI PM spec. | ||
44 | |||
45 | The possible state transitions that a device can undergo are: | ||
46 | |||
47 | +---------------------------+ | ||
48 | | Current State | New State | | ||
49 | +---------------------------+ | ||
50 | | D0 | D1, D2, D3| | ||
51 | +---------------------------+ | ||
52 | | D1 | D2, D3 | | ||
53 | +---------------------------+ | ||
54 | | D2 | D3 | | ||
55 | +---------------------------+ | ||
56 | | D1, D2, D3 | D0 | | ||
57 | +---------------------------+ | ||
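The transition table above reduces to two rules: any low-power state may return to D0, and otherwise a device may only move to a deeper state. A minimal sketch, with an enum and function name chosen for this example:

```c
/* Device power states as numbered in the table. */
enum dstate { D0, D1, D2, D3 };

/* Returns 1 if the table permits the transition, 0 otherwise. */
int dstate_transition_allowed(int from, int to)
{
    if (from < D0 || from > D3 || to < D0 || to > D3)
        return 0;
    if (to == D0)
        return from != D0;  /* D1, D2, D3 -> D0 */
    return to > from;       /* only towards a deeper state */
}
```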
58 | |||
59 | Note that when the system is entering a global suspend state, all devices will | ||
60 | be placed into D3 and when resuming, all devices will be placed into D0. | ||
61 | However, when the system is running, other state transitions are possible. | ||
62 | |||
63 | 2. How The PCI Subsystem Handles Power Management | ||
64 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
65 | |||
66 | The PCI suspend/resume functionality is accessed indirectly via the Power | ||
67 | Management subsystem. At boot, the PCI driver registers a power management | ||
68 | callback with that layer. Upon entering a suspend state, the PM layer iterates | ||
69 | through all of its registered callbacks. This currently takes place only during | ||
70 | APM state transitions. | ||
71 | |||
72 | Upon going to sleep, the PCI subsystem walks its device tree twice. Both times, | ||
73 | it does a depth first walk of the device tree. The first walk saves each | ||
74 | device's state and checks for devices that will prevent the system from entering | ||
75 | a global power state. The next walk then places the devices in a low power | ||
76 | state. | ||
77 | |||
78 | The first walk allows a graceful recovery in the event of a failure, since none | ||
79 | of the devices have actually been powered down. | ||
80 | |||
81 | In both walks, in particular the second, all children of a bridge are touched | ||
82 | before the actual bridge itself. This allows the bridge to retain power while | ||
83 | its children are being accessed. | ||
84 | |||
85 | Upon resuming from sleep, just the opposite must be true: all bridges must be | ||
86 | powered on and restored before their children are powered on. This is easily | ||
87 | accomplished with a breadth-first walk of the PCI device tree. | ||
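The ordering constraints can be shown on a toy tree. This is a userspace sketch with made-up types. Note one substitution: for resume it uses a depth-first pre-order walk instead of the breadth-first walk the text mentions; both visit every bridge before its children, and the pre-order version is shorter to write.

```c
#include <stddef.h>

#define MAX_CHILDREN 4

struct node {
    const char *name;
    struct node *children[MAX_CHILDREN];
    size_t nchildren;
    int powered;
};

/*
 * Suspend: post-order walk, children before the bridge above them.
 * 'order' must have room for every node in the tree.
 */
void suspend_tree(struct node *n, const char **order, size_t *pos)
{
    for (size_t i = 0; i < n->nchildren; i++)
        suspend_tree(n->children[i], order, pos);
    n->powered = 0;
    order[(*pos)++] = n->name;
}

/* Resume: pre-order walk, each bridge before its children. */
void resume_tree(struct node *n, const char **order, size_t *pos)
{
    n->powered = 1;
    order[(*pos)++] = n->name;
    for (size_t i = 0; i < n->nchildren; i++)
        resume_tree(n->children[i], order, pos);
}
```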
88 | |||
89 | |||
90 | 3. PCI Utility Functions | ||
91 | ~~~~~~~~~~~~~~~~~~~~~~~~ | ||
92 | |||
93 | These are helper functions designed to be called by individual device drivers. | ||
94 | Assuming that a device behaves as advertised, these should be applicable in most | ||
95 | cases. However, results may vary. | ||
96 | |||
97 | Note that these functions are never implicitly called for the driver. The driver | ||
98 | is always responsible for deciding when and if to call these. | ||
99 | |||
100 | |||
101 | pci_save_state | ||
102 | -------------- | ||
103 | |||
104 | Usage: | ||
105 | pci_save_state(dev, buffer); | ||
106 | |||
107 | Description: | ||
108 | Save first 64 bytes of PCI config space. Buffer must be allocated by | ||
109 | caller. | ||
110 | |||
111 | |||
112 | pci_restore_state | ||
113 | ----------------- | ||
114 | |||
115 | Usage: | ||
116 | pci_restore_state(dev, buffer); | ||
117 | |||
118 | Description: | ||
119 | Restore previously saved config space (first 64 bytes only). | ||
120 | |||
121 | If buffer is NULL, then restore what information we know about the | ||
122 | device from bootup: BARs and interrupt line. | ||
123 | |||
124 | |||
125 | pci_set_power_state | ||
126 | ------------------- | ||
127 | |||
128 | Usage: | ||
129 | pci_set_power_state(dev, state); | ||
130 | |||
131 | Description: | ||
132 | Transition device to low power state using PCI PM Capabilities | ||
133 | registers. | ||
134 | |||
135 | Will fail under one of the following conditions: | ||
136 | - If state is less than current state, but not D0 (illegal transition) | ||
137 | - Device doesn't support PM Capabilities | ||
138 | - Device does not support requested state | ||
139 | |||
140 | |||
141 | pci_enable_wake | ||
142 | --------------- | ||
143 | |||
144 | Usage: | ||
145 | pci_enable_wake(dev, state, enable); | ||
146 | |||
147 | Description: | ||
148 | Enable device to generate PME# during low power state using PCI PM | ||
149 | Capabilities. | ||
150 | |||
151 | Checks whether the device supports generating PME# from the requested | ||
152 | state and fails if it does not, unless enable == 0 (the request is to | ||
153 | disable wake events, which is implicit if the device doesn't even support | ||
154 | them in the first place). | ||
155 | |||
156 | Note that the PMC Register in the device's PM Capabilities has a bitmask | ||
157 | of the states it supports generating PME# from. D3hot is bit 3 and | ||
158 | D3cold is bit 4. So, while a value of 4 as the state may not seem | ||
159 | semantically correct, it is. | ||
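The bitmask check can be sketched as follows. The sketch assumes the PME_Support field occupies bits 15:11 of the 16-bit PMC register, which is how the PCI PM specification lays it out as far as I know; the macro and function names are invented for this example and are not the kernel's.

```c
#include <stdint.h>

/* Assumed layout: PME_Support is the 5-bit field in PMC bits 15:11. */
#define PMC_PME_SHIFT 11
#define PMC_PME_MASK  0x1f

/*
 * 'state' indexes the field: 0-2 for D0-D2, 3 for D3hot, 4 for D3cold.
 * Returns 1 if the device can generate PME# from that state.
 */
int pme_supported(uint16_t pmc, int state)
{
    unsigned int field;

    if (state < 0 || state > 4)
        return 0;
    field = (pmc >> PMC_PME_SHIFT) & PMC_PME_MASK;
    return (field >> state) & 1;
}
```

This is why passing 4 as the state is meaningful even though the device states are only numbered D0 to D3.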
160 | |||
161 | |||
162 | 4. PCI Device Drivers | ||
163 | ~~~~~~~~~~~~~~~~~~~~~ | ||
164 | |||
165 | These functions are intended for use by individual drivers, and are defined in | ||
166 | struct pci_driver: | ||
167 | |||
168 | int (*save_state) (struct pci_dev *dev, u32 state); | ||
169 | int (*suspend) (struct pci_dev *dev, u32 state); | ||
170 | int (*resume) (struct pci_dev *dev); | ||
171 | int (*enable_wake) (struct pci_dev *dev, u32 state, int enable); | ||
172 | |||
173 | |||
174 | save_state | ||
175 | ---------- | ||
176 | |||
177 | Usage: | ||
178 | |||
179 | if (dev->driver && dev->driver->save_state) | ||
180 | dev->driver->save_state(dev,state); | ||
181 | |||
182 | The driver should use this callback to save device state. It should take into | ||
183 | account the current state of the device and the requested state in order to | ||
184 | avoid any unnecessary operations. | ||
185 | |||
186 | For example, on a video card that supports all 4 states (D0-D3), all controller | ||
187 | context is preserved when entering D1, but the screen is placed into a low power | ||
188 | state (blanked). | ||
189 | |||
190 | The driver can also interpret this function as a notification that it may be | ||
191 | entering a sleep state in the near future. If it knows that the device cannot | ||
192 | enter the requested state, either because of lack of support for it, or because | ||
193 | the device is in the middle of some critical operation, then it should fail. | ||
194 | |||
195 | This function should not be used to set any state in the device or the driver | ||
196 | because the device may not actually enter the sleep state (e.g. another driver | ||
197 | later causes a global state transition to fail). | ||
198 | |||
199 | Note that in intermediate low power states, a device's I/O and memory spaces may | ||
200 | be disabled and may not be available in subsequent transitions to lower power | ||
201 | states. | ||
202 | |||
203 | |||
204 | suspend | ||
205 | ------- | ||
206 | |||
207 | Usage: | ||
208 | |||
209 | if (dev->driver && dev->driver->suspend) | ||
210 | dev->driver->suspend(dev,state); | ||
211 | |||
212 | A driver uses this function to actually transition the device into a low power | ||
213 | state. This should include disabling I/O, IRQs, and bus-mastering, as well as | ||
214 | physically transitioning the device to a lower power state; it may also include | ||
215 | calls to pci_enable_wake(). | ||
216 | |||
217 | Bus mastering may be disabled by doing: | ||
218 | |||
219 | pci_disable_device(dev); | ||
220 | |||
221 | For devices that support the PCI PM Spec, this may be used to set the device's | ||
222 | power state to match the suspend() parameter: | ||
223 | |||
224 | pci_set_power_state(dev,state); | ||
225 | |||
226 | The driver is also responsible for disabling any other device-specific features | ||
227 | (e.g. blanking screen, turning off on-card memory, etc). | ||
228 | |||
229 | The driver should be sure to track the current state of the device, as it may | ||
230 | obviate the need for some operations. | ||
231 | |||
232 | The driver should update the current_state field in its pci_dev structure in | ||
233 | this function, except for PM-capable devices when pci_set_power_state is used. | ||
234 | |||
235 | resume | ||
236 | ------ | ||
237 | |||
238 | Usage: | ||
239 | |||
240 | if (dev->driver && dev->driver->resume) | ||
241 | dev->driver->resume(dev); | ||
242 | |||
243 | The resume callback may be called from any power state, and is always meant to | ||
244 | transition the device to the D0 state. | ||
245 | |||
246 | The driver is responsible for reenabling any features of the device that had | ||
247 | been disabled during previous suspend calls, such as IRQs and bus mastering, | ||
248 | as well as calling pci_restore_state(). | ||
249 | |||
250 | If the device is currently in D3, it may need to be reinitialized in resume(). | ||
251 | |||
252 | * Some types of devices, like bus controllers, will preserve context in D3hot | ||
253 | (using Vcc power). Their drivers will often want to avoid re-initializing | ||
254 | them after re-entering D0 (perhaps to avoid resetting downstream devices). | ||
255 | |||
256 | * Other kinds of devices in D3hot will discard device context as part of a | ||
257 | soft reset when re-entering the D0 state. | ||
258 | |||
259 | * Devices resuming from D3cold always go through a power-on reset. Some | ||
260 | device context can also be preserved using Vaux power. | ||
261 | |||
262 | * Some systems hide D3cold resume paths from drivers. For example, on PCs | ||
263 | the resume path for suspend-to-disk often runs BIOS powerup code, which | ||
264 | will sometimes re-initialize the device. | ||
265 | |||
266 | To handle resets during D3 to D0 transitions, it may be convenient to share | ||
267 | device initialization code between probe() and resume(). Device parameters | ||
268 | can also be saved before the driver suspends into D3, avoiding re-probe. | ||
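
The idea of sharing initialization code between probe() and resume() can be
sketched as below. This is a userspace model under stated assumptions, not
kernel code: struct fake_dev and all fake_*() functions are hypothetical names
invented for illustration, and the "hardware" is only a struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; none of these names exist in the kernel. */
struct fake_dev {
    int irq_line;        /* parameter discovered once, at probe time */
    bool hw_programmed;  /* stands in for device context lost in D3 */
    int init_count;      /* how many times the hardware was set up */
};

/* Shared initialization: programs the hardware from saved parameters. */
void fake_hw_init(struct fake_dev *d)
{
    assert(d != NULL);
    d->hw_programmed = true;
    d->init_count++;
}

/* probe(): discover parameters once, then run the shared init path. */
void fake_probe(struct fake_dev *d, int irq_line)
{
    d->irq_line = irq_line;
    d->hw_programmed = false;
    d->init_count = 0;
    fake_hw_init(d);
}

/* Entering D3 may discard device context; saved parameters survive. */
void fake_suspend_to_d3(struct fake_dev *d)
{
    d->hw_programmed = false;
}

/* resume(): reinitialize only if context was lost, without re-probing. */
void fake_resume(struct fake_dev *d)
{
    if (!d->hw_programmed)
        fake_hw_init(d);
}
```

In this model a second fake_resume() call does nothing, which mirrors a device
that preserved its context and does not need to be reinitialized.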
269 | |||
270 | If the device supports the PCI PM Spec, it can use this to physically transition | ||
271 | the device to D0: | ||
272 | |||
273 | pci_set_power_state(dev,0); | ||
274 | |||
275 | Note that if the entire system is transitioning out of a global sleep state, all | ||
276 | devices will be placed in the D0 state, so this is not necessary. However, in | ||
277 | the event that the device is placed in the D3 state during normal operation, | ||
278 | this call is necessary. It is impossible to determine which of the two events is | ||
279 | taking place in the driver, so it is always a good idea to make that call. | ||
280 | |||
281 | The driver should take note of the state that it is resuming from in order to | ||
282 | ensure correct (and speedy) operation. | ||
283 | |||
284 | The driver should update the current_state field in its pci_dev structure in | ||
285 | this function, except for PM-capable devices when pci_set_power_state is used. | ||
286 | |||
287 | |||
288 | enable_wake | ||
289 | ----------- | ||
290 | |||
291 | Usage: | ||
292 | |||
293 | if (dev->driver && dev->driver->enable_wake) | ||
294 | dev->driver->enable_wake(dev,state,enable); | ||
295 | |||
296 | This callback is generally only relevant for devices that support the PCI PM | ||
297 | spec and have the ability to generate a PME# (Power Management Event Signal) | ||
298 | to wake the system up. (However, it is possible that a device may support | ||
299 | some non-standard way of generating a wake event on sleep.) | ||
300 | |||
301 | Bits 15:11 of the PMC (Power Mgmt Capabilities) Register in a device's | ||
302 | PM Capabilities describe what power states the device supports generating a | ||
303 | wake event from: | ||
304 | |||
305 | +------------------+ | ||
306 | | Bit | State | | ||
307 | +------------------+ | ||
308 | | 11 | D0 | | ||
309 | | 12 | D1 | | ||
310 | | 13 | D2 | | ||
311 | | 14 | D3hot | | ||
312 | | 15 | D3cold | | ||
313 | +------------------+ | ||
314 | |||
315 | A device can use this to enable wake events: | ||
316 | |||
317 | pci_enable_wake(dev,state,enable); | ||
318 | |||
319 | Note that to enable PME# from D3cold, a value of 4 should be passed to | ||
320 | pci_enable_wake (since it uses an index into a bitmask). If a driver gets | ||
321 | a request to enable wake events from D3, two calls should be made to | ||
322 | pci_enable_wake (one each for D3hot and D3cold). | ||
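
The bit layout above can be checked with a small helper. This is a minimal
userspace sketch: pmc_pme_supported() and the PM_* constants are hypothetical
names for illustration, and the caller is assumed to have already read the
16-bit PMC value from the device's PM capability. The state index matches the
table, with 4 standing for D3cold.

```c
#include <assert.h>
#include <stdint.h>

/* Index values follow the table: 0..3 for D0..D3hot, 4 for D3cold. */
enum { PM_D0 = 0, PM_D1 = 1, PM_D2 = 2, PM_D3HOT = 3, PM_D3COLD = 4 };

/*
 * Returns 1 if the PMC value advertises PME# generation from the given
 * state, 0 otherwise. Bit 11 corresponds to D0 and bit 15 to D3cold.
 */
int pmc_pme_supported(uint16_t pmc, int state)
{
    assert(state >= PM_D0 && state <= PM_D3COLD);
    return (pmc >> (11 + state)) & 1;
}
```

A driver asked to enable wake from D3 could use such a check for both
PM_D3HOT and PM_D3COLD before making the two pci_enable_wake calls.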
323 | |||
324 | |||
325 | 5. Resources | ||
326 | ~~~~~~~~~~~~ | ||
327 | |||
328 | PCI Local Bus Specification | ||
329 | PCI Bus Power Management Interface Specification | ||
330 | |||
331 | http://pcisig.org | ||
332 | |||
diff --git a/Documentation/power/states.txt b/Documentation/power/states.txt new file mode 100644 index 000000000000..3e5e5d3ff419 --- /dev/null +++ b/Documentation/power/states.txt | |||
@@ -0,0 +1,79 @@ | |||
1 | |||
2 | System Power Management States | ||
3 | |||
4 | |||
5 | The kernel supports three power management states generically, though | ||
6 | each is dependent on platform support code to implement the low-level | ||
7 | details for each state. This file describes each state, what they are | ||
8 | commonly called, what ACPI state they map to, and what string to write | ||
9 | to /sys/power/state to enter that state. | ||
10 | |||
11 | |||
12 | State: Standby / Power-On Suspend | ||
13 | ACPI State: S1 | ||
14 | String: "standby" | ||
15 | |||
16 | This state offers minimal, though real, power savings, while providing | ||
17 | a very low-latency transition back to a working system. No operating | ||
18 | state is lost (the CPU retains power), so the system easily starts up | ||
19 | again where it left off. | ||
20 | |||
21 | We try to put devices in a low-power state equivalent to D1, which | ||
22 | also offers low power savings, but low resume latency. Not all devices | ||
23 | support D1, and those that don't are left on. | ||
24 | |||
25 | A transition from Standby to the On state should take about 1-2 | ||
26 | seconds. | ||
27 | |||
28 | |||
29 | State: Suspend-to-RAM | ||
30 | ACPI State: S3 | ||
31 | String: "mem" | ||
32 | |||
33 | This state offers significant power savings as everything in the | ||
34 | system is put into a low-power state, except for memory, which is | ||
35 | placed in self-refresh mode to retain its contents. | ||
36 | |||
37 | System and device state is saved and kept in memory. All devices are | ||
38 | suspended and put into D3. In many cases, all peripheral buses lose | ||
39 | power when entering STR, so devices must be able to handle the | ||
40 | transition back to the On state. | ||
41 | |||
42 | For at least ACPI, STR requires some minimal boot-strapping code to | ||
43 | resume the system from STR. This may be true on other platforms. | ||
44 | |||
45 | A transition from Suspend-to-RAM to the On state should take about | ||
46 | 3-5 seconds. | ||
47 | |||
48 | |||
49 | State: Suspend-to-disk | ||
50 | ACPI State: S4 | ||
51 | String: "disk" | ||
52 | |||
53 | This state offers the greatest power savings, and can be used even in | ||
54 | the absence of low-level platform support for power management. This | ||
55 | state operates similarly to Suspend-to-RAM, but includes a final step | ||
56 | of writing memory contents to disk. On resume, this is read and memory | ||
57 | is restored to its pre-suspend state. | ||
58 | |||
59 | STD can be handled by the firmware or the kernel. If it is handled by | ||
60 | the firmware, it usually requires a dedicated partition that must be | ||
61 | set up via another operating system for it to use. Despite the | ||
62 | inconvenience, this method requires minimal work by the kernel, since | ||
63 | the firmware will also handle restoring memory contents on resume. | ||
64 | |||
65 | If the kernel is responsible for persistently saving state, a mechanism | ||
66 | called 'swsusp' (Swap Suspend) is used to write memory contents to | ||
67 | free swap space. swsusp has some restrictive requirements, but should | ||
68 | work in most cases. Some, albeit outdated, documentation can be found | ||
69 | in Documentation/power/swsusp.txt. | ||
70 | |||
71 | Once memory state is written to disk, the system may either enter a | ||
72 | low-power state (like ACPI S4), or it may simply power down. Powering | ||
73 | down offers greater savings, and allows this mechanism to work on any | ||
74 | system. However, entering a real low-power state allows the user to | ||
75 | trigger wake up events (e.g. pressing a key or opening a laptop lid). | ||
76 | |||
77 | A transition from Suspend-to-Disk to the On state should take about 30 | ||
78 | seconds, though it's typically a bit more with the current | ||
79 | implementation. | ||
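
The mapping between the strings written to /sys/power/state and the ACPI
states described above can be summarized in a small lookup. This is only a
sketch: acpi_sleep_state_for() is a hypothetical helper name, not a kernel
interface.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns the ACPI S-state number for a /sys/power/state string as listed
 * in this file, or -1 if the string is not one of the three states.
 */
int acpi_sleep_state_for(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "standby") == 0)
        return 1;   /* Standby / Power-On Suspend */
    if (strcmp(name, "mem") == 0)
        return 3;   /* Suspend-to-RAM */
    if (strcmp(name, "disk") == 0)
        return 4;   /* Suspend-to-disk */
    return -1;
}
```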
diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt new file mode 100644 index 000000000000..c7c3459fde43 --- /dev/null +++ b/Documentation/power/swsusp.txt | |||
@@ -0,0 +1,235 @@ | |||
1 | From kernel/suspend.c: | ||
2 | |||
3 | * BIG FAT WARNING ********************************************************* | ||
4 | * | ||
5 | * If you have unsupported (*) devices using DMA... | ||
6 | * ...say goodbye to your data. | ||
7 | * | ||
8 | * If you touch anything on disk between suspend and resume... | ||
9 | * ...kiss your data goodbye. | ||
10 | * | ||
11 | * If your disk driver does not support suspend... (IDE does) | ||
12 | * ...you'd better find out how to get along | ||
13 | * without your data. | ||
14 | * | ||
15 | * If you change kernel command line between suspend and resume... | ||
16 | * ...prepare for nasty fsck or worse. | ||
17 | * | ||
18 | * If you change your hardware while system is suspended... | ||
19 | * ...well, it was not good idea. | ||
20 | * | ||
21 | * (*) suspend/resume support is needed to make it safe. | ||
22 | |||
23 | You need to append resume=/dev/your_swap_partition to the kernel command | ||
24 | line. Then you suspend by running: | ||
25 | |||
26 | echo shutdown > /sys/power/disk; echo disk > /sys/power/state | ||
27 | |||
28 | If you feel ACPI works pretty well on your system, you might try | ||
29 | |||
30 | echo platform > /sys/power/disk; echo disk > /sys/power/state | ||
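
The same sequence can be driven from a program instead of the shell. The
sketch below assumes the two sysfs files behave as the echo commands imply;
write_string() and suspend_to_disk() are hypothetical helper names. Calling
suspend_to_disk() as root really suspends the machine, so try write_string()
on an ordinary file first.

```c
#include <assert.h>
#include <stdio.h>

/* Writes value plus a newline to path, like echo; 0 on success, -1 on error. */
int write_string(const char *path, const char *value)
{
    FILE *f;
    int ok;

    assert(path != NULL && value != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = (fprintf(f, "%s\n", value) >= 0);
    /* fclose() flushes the buffer, so a failed write can surface here. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Equivalent of: echo <mode> > /sys/power/disk; echo disk > /sys/power/state */
int suspend_to_disk(const char *mode)
{
    if (write_string("/sys/power/disk", mode) != 0)
        return -1;
    return write_string("/sys/power/state", "disk");
}
```

Passing "shutdown" or "platform" as mode selects between the two variants
shown above.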
31 | |||
32 | |||
33 | |||
34 | Article about goals and implementation of Software Suspend for Linux | ||
35 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
36 | Author: Gábor Kuti | ||
37 | Last revised: 2003-10-20 by Pavel Machek | ||
38 | |||
39 | Idea and goals to achieve | ||
40 | |||
41 | Nowadays many laptops have a suspend button. It saves the state of the | ||
42 | machine to a filesystem or to a partition and switches to standby mode. When | ||
43 | the machine is later resumed, the saved state is loaded back into RAM and the | ||
44 | machine continues its work. This has two real benefits. First, we save the | ||
45 | time the machine takes to shut down and boot up again, along with the energy | ||
46 | those steps cost, which is significant when running from batteries. Second, | ||
47 | we don't have to interrupt our programs, so processes that run long | ||
48 | calculations don't need to be written to be interruptible. | ||
49 | |||
50 | swsusp saves the state of the machine into active swaps and then reboots or | ||
51 | powers down. You must explicitly specify the swap partition to resume from with | ||
52 | the ``resume='' kernel option. If a signature is found, it loads and restores | ||
53 | the saved state. If the option ``noresume'' is specified as a boot parameter, | ||
54 | it skips the resuming. | ||
55 | |||
56 | While the system is suspended, you should not add or remove any hardware, | ||
57 | write to the filesystems, etc. | ||
58 | |||
59 | Sleep states summary | ||
60 | ==================== | ||
61 | |||
62 | There are three different interfaces you can use; /proc/acpi should | ||
63 | work like this: | ||
64 | |||
65 | In a really perfect world: | ||
66 | echo 1 > /proc/acpi/sleep # for standby | ||
67 | echo 2 > /proc/acpi/sleep # for suspend to ram | ||
68 | echo 3 > /proc/acpi/sleep # for suspend to ram, but more power-conservative | ||
69 | echo 4 > /proc/acpi/sleep # for suspend to disk | ||
70 | echo 5 > /proc/acpi/sleep # for an unfriendly shutdown of the system | ||
71 | |||
72 | and perhaps | ||
73 | echo 4b > /proc/acpi/sleep # for suspend to disk via s4bios | ||
74 | |||
75 | Frequently Asked Questions | ||
76 | ========================== | ||
77 | |||
78 | Q: well, suspending a server is IMHO a really stupid thing, | ||
79 | but... (Diego Zuccato): | ||
80 | |||
81 | A: You bought new UPS for your server. How do you install it without | ||
82 | bringing machine down? Suspend to disk, rearrange power cables, | ||
83 | resume. | ||
84 | |||
85 | You have your server on UPS. Power died, and UPS is indicating 30 | ||
86 | seconds to failure. What do you do? Suspend to disk. | ||
87 | |||
88 | Ethernet card in your server died. You want to replace it. Your | ||
89 | server is not hotplug capable. What do you do? Suspend to disk, | ||
90 | replace ethernet card, resume. If you are fast your users will not | ||
91 | even see broken connections. | ||
92 | |||
93 | |||
94 | Q: Maybe I'm missing something, but why don't the regular I/O paths work? | ||
95 | |||
96 | A: We do use the regular I/O paths. However we cannot restore the data | ||
97 | to its original location as we load it. That would create an | ||
98 | inconsistent kernel state which would certainly result in an oops. | ||
99 | Instead, we load the image into unused memory and then atomically copy | ||
100 | it back to its original location. This implies, of course, a maximum | ||
101 | image size of half the amount of memory. | ||
102 | |||
103 | There are two solutions to this: | ||
104 | |||
105 | * require half of memory to be free during suspend. That way you can | ||
106 | read "new" data onto free spots, then cli and copy | ||
107 | |||
108 | * assume we had special "polling" ide driver that only uses memory | ||
109 | between 0-640KB. That way, I'd have to make sure that 0-640KB is free | ||
110 | during suspending, but otherwise it would work... | ||
111 | |||
112 | suspend2 shares this fundamental limitation, but does not include user | ||
113 | data and disk caches into "used memory" by saving them in | ||
114 | advance. That means that the limitation goes away in practice. | ||
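
The half-of-memory limit described in this answer amounts to a simple check.
The sketch below is illustrative only; image_fits() is a hypothetical name and
the page counts are assumed to be supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if an image of image_pages can be loaded into unused memory
 * and then copied back, i.e. it needs at most half of total_pages.
 */
int image_fits(size_t total_pages, size_t image_pages)
{
    assert(total_pages > 0);
    return image_pages <= total_pages / 2;
}
```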
115 | |||
116 | Q: Does Linux support ACPI S4? | ||
117 | |||
118 | A: Yes. That's what echo platform > /sys/power/disk does. | ||
119 | |||
120 | Q: My machine doesn't work with ACPI. How can I use swsusp then? | ||
121 | |||
122 | A: Do a reboot() syscall with the right parameters. Warning: glibc gets in | ||
123 | its way, so check with strace: | ||
124 | |||
125 | reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, 0xd000fce2) | ||
126 | |||
127 | (Thanks to Peter Osterlund:) | ||
128 | |||
129 | #include <unistd.h> | ||
130 | #include <syscall.h> | ||
131 | |||
132 | #define LINUX_REBOOT_MAGIC1 0xfee1dead | ||
133 | #define LINUX_REBOOT_MAGIC2 672274793 | ||
134 | #define LINUX_REBOOT_CMD_SW_SUSPEND 0xD000FCE2 | ||
135 | |||
136 | int main() | ||
137 | { | ||
138 | syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, | ||
139 | LINUX_REBOOT_CMD_SW_SUSPEND, 0); | ||
140 | return 0; | ||
141 | } | ||
142 | |||
143 | The /sys/ interface should also still be present. | ||
144 | |||
145 | Q: What is 'suspend2'? | ||
146 | |||
147 | A: suspend2 is 'Software Suspend 2', a forked implementation of | ||
148 | suspend-to-disk which is available as separate patches for 2.4 and 2.6 | ||
149 | kernels from swsusp.sourceforge.net. It includes support for SMP, 4GB | ||
150 | highmem and preemption. It also has an extensible architecture that | ||
151 | allows for arbitrary transformations on the image (compression, | ||
152 | encryption) and arbitrary backends for writing the image (e.g. to swap | ||
153 | or an NFS share[Work In Progress]). Questions regarding suspend2 | ||
154 | should be sent to the mailing list available through the suspend2 | ||
155 | website, and not to the Linux Kernel Mailing List. We are working | ||
156 | toward merging suspend2 into the mainline kernel. | ||
157 | |||
158 | Q: A kernel thread must voluntarily freeze itself (call 'refrigerator'). | ||
159 | I found some kernel threads that don't do it, and they don't freeze | ||
160 | so the system can't sleep. Is this a known behavior? | ||
161 | |||
162 | A: All such kernel threads need to be fixed, one by one. Select the | ||
163 | place where the thread is safe to be frozen (no kernel semaphores | ||
164 | should be held at that point and it must be safe to sleep there), and | ||
165 | add: | ||
166 | |||
167 | if (current->flags & PF_FREEZE) | ||
168 | refrigerator(PF_FREEZE); | ||
169 | |||
170 | If the thread is needed for writing the image to storage, you should | ||
171 | instead set the PF_NOFREEZE process flag when creating the thread. | ||
172 | |||
173 | |||
174 | Q: What is the difference between "platform", "shutdown" and | ||
175 | "firmware" in /sys/power/disk? | ||
176 | |||
177 | A: | ||
178 | |||
179 | shutdown: save state in linux, then tell bios to powerdown | ||
180 | |||
181 | platform: save state in linux, then tell bios to powerdown and blink | ||
182 | "suspended led" | ||
183 | |||
184 | firmware: tell bios to save state itself [needs BIOS-specific suspend | ||
185 | partition, and has very little to do with swsusp] | ||
186 | |||
187 | "platform" is actually right thing to do, but "shutdown" is most | ||
188 | reliable. | ||
189 | |||
190 | Q: I do not understand why you have such strong objections to the idea of | ||
191 | selective suspend. | ||
192 | |||
193 | A: Do selective suspend during runtime power management, that's okay. But | ||
194 | it's useless for suspend-to-disk. (And I do not see how you could use | ||
195 | it for suspend-to-ram, I hope you do not want that). | ||
196 | |||
197 | Let's see, so you suggest to | ||
198 | |||
199 | * SUSPEND all but swap device and parents | ||
200 | * Snapshot | ||
201 | * Write image to disk | ||
202 | * SUSPEND swap device and parents | ||
203 | * Powerdown | ||
204 | |||
205 | Oh no, that does not work, if the swap device or its parents use DMA, | ||
206 | you've corrupted data. You'd have to do | ||
207 | |||
208 | * SUSPEND all but swap device and parents | ||
209 | * FREEZE swap device and parents | ||
210 | * Snapshot | ||
211 | * UNFREEZE swap device and parents | ||
212 | * Write | ||
213 | * SUSPEND swap device and parents | ||
214 | |||
215 | Which means that you still need that FREEZE state, and you get more | ||
216 | complicated code. (And I have not yet introduced details like system | ||
217 | devices). | ||
218 | |||
219 | Q: There don't seem to be any generally useful behavioral | ||
220 | distinctions between SUSPEND and FREEZE. | ||
221 | |||
222 | A: Doing SUSPEND when you are asked to do FREEZE is always correct, | ||
223 | but it may be unnecessarily slow. If you want USB to stay simple, | ||
224 | slowness may not matter to you. It can always be fixed later. | ||
225 | |||
226 | For devices like disks it does matter; you do not want to spin down for | ||
227 | FREEZE. | ||
228 | |||
229 | Q: After resuming, the system is paging heavily, leading to very bad interactivity. | ||
230 | |||
231 | A: Try running | ||
232 | |||
233 | cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null | ||
234 | |||
235 | after resume. swapoff -a; swapon -a may also be useful. | ||
diff --git a/Documentation/power/tricks.txt b/Documentation/power/tricks.txt new file mode 100644 index 000000000000..c6d58d3da133 --- /dev/null +++ b/Documentation/power/tricks.txt | |||
@@ -0,0 +1,27 @@ | |||
1 | swsusp/S3 tricks | ||
2 | ~~~~~~~~~~~~~~~~ | ||
3 | Pavel Machek <pavel@suse.cz> | ||
4 | |||
5 | If you want to trick swsusp/S3 into working, you might want to try: | ||
6 | |||
7 | * go with minimal config, turn off drivers like USB, AGP you don't | ||
8 | really need | ||
9 | |||
10 | * turn off APIC and preempt | ||
11 | |||
12 | * use ext2. At least it has working fsck. [If something seems to go | ||
13 | wrong, force fsck when you have a chance] | ||
14 | |||
15 | * turn off modules | ||
16 | |||
17 | * use vga text console, shut down X. [If you really want X, you might | ||
18 | want to try vesafb later] | ||
19 | |||
20 | * try running as few processes as possible, preferably go to single | ||
21 | user mode. | ||
22 | |||
23 | * due to video issues, swsusp should be easier to get working than | ||
24 | S3. Try that first. | ||
25 | |||
26 | When you make it work, try to find out what exactly it was that broke | ||
27 | suspend, and preferably fix that. | ||
diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt new file mode 100644 index 000000000000..8686968416ca --- /dev/null +++ b/Documentation/power/video.txt | |||
@@ -0,0 +1,169 @@ | |||
1 | |||
2 | Video issues with S3 resume | ||
3 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
4 | 2003-2005, Pavel Machek | ||
5 | |||
6 | During S3 resume, hardware needs to be reinitialized. For most | ||
7 | devices, this is easy, and the kernel driver knows how to do | ||
8 | it. Unfortunately there's one exception: the video card. Video cards are | ||
9 | usually initialized by the BIOS, and the kernel does not have enough | ||
10 | information to boot the video card. (The kernel usually does not even contain | ||
11 | a video card driver -- vesafb and vgacon are widely used). | ||
12 | |||
13 | This is not a problem for swsusp, because during swsusp resume, the BIOS is | ||
14 | run normally so the video card is normally initialized. S3 has absolutely | ||
15 | no chance of working with SMP/HT. Be sure to turn it off before | ||
16 | testing (swsusp should work ok, OTOH). | ||
17 | |||
18 | There are a few types of systems where video works after S3 resume: | ||
19 | |||
20 | (1) systems where video state is preserved over S3. | ||
21 | |||
22 | (2) systems where it is possible to call the video BIOS during S3 | ||
23 | resume. Unfortunately, it is not correct to call the video BIOS at | ||
24 | that point, but it happens to work on some machines. Use | ||
25 | acpi_sleep=s3_bios. | ||
26 | |||
27 | (3) systems that initialize video card into vga text mode and where | ||
28 | the BIOS works well enough to be able to set video mode. Use | ||
29 | acpi_sleep=s3_mode on these. | ||
30 | |||
31 | (4) on some systems s3_bios kicks video into text mode, and | ||
32 | acpi_sleep=s3_bios,s3_mode is needed. | ||
33 | |||
34 | (5) radeon systems, where X can soft-boot your video card. You'll need | ||
35 | new enough X, and plain text console (no vesafb or radeonfb), see | ||
36 | http://www.doesi.gmxhome.de/linux/tm800s3/s3.html. Actually you | ||
37 | should probably use vbetool (6) instead. | ||
38 | |||
39 | (6) other radeon systems, where vbetool is enough to bring system back | ||
40 | to life. It needs text console to be working. Do vbetool vbestate | ||
41 | save > /tmp/delme; echo 3 > /proc/acpi/sleep; vbetool post; vbetool | ||
42 | vbestate restore < /tmp/delme; setfont <whatever>, and your video | ||
43 | should work. | ||
44 | |||
45 | (7) on some systems, it is possible to boot most of the kernel, and then | ||
46 | POSTing the BIOS works. Ole Rohne has a patch to do just that at | ||
47 | http://dev.gentoo.org/~marineam/patch-radeonfb-2.6.11-rc2-mm2. | ||
48 | |||
49 | Now, if you pass acpi_sleep=something, and it does not work with your | ||
50 | bios, you'll get a hard crash during resume. Be careful. Also it is | ||
51 | safest to do your experiments with plain old VGA console. The vesafb | ||
52 | and radeonfb (etc) drivers have a tendency to crash the machine during | ||
53 | resume. | ||
54 | |||
55 | You may have a system where none of the above works. At that point you | ||
56 | either invent another ugly hack that works, or write a proper driver for | ||
57 | your video card (good luck getting docs :-(). Maybe suspending from X | ||
58 | (proper X, knowing your hardware, not XF68_FBcon) might have a better | ||
59 | chance of working. | ||
60 | |||
61 | Table of known working systems: | ||
62 | |||
63 | Model hack (or "how to do it") | ||
64 | ------------------------------------------------------------------------------ | ||
65 | Acer Aspire 1406LC ole's late BIOS init (7), turn off DRI | ||
66 | Acer TM 242FX vbetool (6) | ||
67 | Acer TM C300 vga=normal (only suspend on console, not in X), vbetool (6) | ||
68 | Acer TM 4052LCi s3_bios (2) | ||
69 | Acer TM 636Lci s3_bios vga=normal (2) | ||
70 | Acer TM 650 (Radeon M7) vga=normal plus boot-radeon (5) gets text console back | ||
71 | Acer TM 660 ??? (*) | ||
72 | Acer TM 800 vga=normal, X patches, see webpage (5) or vbetool (6) | ||
73 | Acer TM 803 vga=normal, X patches, see webpage (5) or vbetool (6) | ||
74 | Acer TM 803LCi vga=normal, vbetool (6) | ||
75 | Arima W730a vbetool needed (6) | ||
76 | Asus L2400D s3_mode (3)(***) (S1 also works OK) | ||
77 | Asus L3800C (Radeon M7) s3_bios (2) (S1 also works OK) | ||
78 | Asus M6NE ??? (*) | ||
79 | Athlon64 desktop prototype s3_bios (2) | ||
80 | Compal CL-50 ??? (*) | ||
81 | Compaq Armada E500 - P3-700 none (1) (S1 also works OK) | ||
82 | Compaq Evo N620c vga=normal, s3_bios (2) | ||
83 | Dell 600m, ATI R250 Lf none (1), but needs xorg-x11-6.8.1.902-1 | ||
84 | Dell D600, ATI RV250 vga=normal and X, or try vbestate (6) | ||
85 | Dell Inspiron 4000 ??? (*) | ||
86 | Dell Inspiron 500m ??? (*) | ||
87 | Dell Inspiron 600m ??? (*) | ||
88 | Dell Inspiron 8200 ??? (*) | ||
89 | Dell Inspiron 8500 ??? (*) | ||
90 | Dell Inspiron 8600 ??? (*) | ||
91 | eMachines athlon64 machines vbetool needed (6) (someone please get me model #s) | ||
92 | HP NC6000 s3_bios, may not use radeonfb (2); or vbetool (6) | ||
93 | HP NX7000 ??? (*) | ||
94 | HP Pavilion ZD7000 vbetool post needed, need open-source nv driver for X | ||
95 | HP Omnibook XE3 athlon version none (1) | ||
96 | HP Omnibook XE3GC none (1), video is S3 Savage/IX-MV | ||
97 | IBM TP T20, model 2647-44G none (1), video is S3 Inc. 86C270-294 Savage/IX-MV, vesafb gets "interesting" but X works. | ||
98 | IBM TP A31 / Type 2652-M5G s3_mode (3) [works ok with BIOS 1.04 2002-08-23, but not at all with BIOS 1.11 2004-11-05 :-(] | ||
99 | IBM TP R32 / Type 2658-MMG none (1) | ||
100 | IBM TP R40 2722B3G ??? (*) | ||
101 | IBM TP R50p / Type 1832-22U s3_bios (2) | ||
102 | IBM TP R51 ??? (*) | ||
103 | IBM TP T30 236681A ??? (*) | ||
104 | IBM TP T40 / Type 2373-MU4 none (1) | ||
105 | IBM TP T40p none (1) | ||
106 | IBM TP R40p s3_bios (2) | ||
107 | IBM TP T41p s3_bios (2), switch to X after resume | ||
108 | IBM TP T42 ??? (*) | ||
109 | IBM ThinkPad T42p (2373-GTG) s3_bios (2) | ||
110 | IBM TP X20 ??? (*) | ||
111 | IBM TP X30 ??? (*) | ||
112 | IBM TP X31 / Type 2672-XXH none (1), use radeontool (http://fdd.com/software/radeon/) to turn off backlight. | ||
113 | IBM Thinkpad X40 Type 2371-7JG s3_bios,s3_mode (4) | ||
114 | Medion MD4220 ??? (*) | ||
115 | Samsung P35 vbetool needed (6) | ||
116 | Sharp PC-AR10 (ATI rage) none (1) | ||
117 | Sony Vaio PCG-F403 ??? (*) | ||
118 | Sony Vaio PCG-N505SN ??? (*) | ||
119 | Sony Vaio vgn-s260 X or boot-radeon can init it (5) | ||
120 | Toshiba Libretto L5 none (1) | ||
121 | Toshiba Satellite 4030CDT s3_mode (3) | ||
122 | Toshiba Satellite 4080XCDT s3_mode (3) | ||
123 | Toshiba Satellite 4090XCDT ??? (*) | ||
124 | Toshiba Satellite P10-554 s3_bios,s3_mode (4)(****) | ||
125 | Uniwill 244IIO ??? (*) | ||
126 | |||
127 | |||
128 | (*) from http://www.ubuntulinux.org/wiki/HoaryPMResults, not sure | ||
129 | which options to use. If you know, please tell me. | ||
130 | |||
131 | (***) To be tested with a newer kernel. | ||
132 | |||
133 | (****) Not with SMP kernel, UP only. | ||
134 | |||
135 | VBEtool details | ||
136 | ~~~~~~~~~~~~~~~ | ||
137 | (with thanks to Carl-Daniel Hailfinger) | ||
138 | |||
139 | First, boot into X and run the following script ONCE: | ||
140 | #!/bin/bash | ||
141 | statedir=/root/s3/state | ||
142 | mkdir -p $statedir | ||
143 | chvt 2 | ||
144 | sleep 1 | ||
145 | vbetool vbestate save >$statedir/vbe | ||
146 | |||
147 | |||
148 | To suspend and resume properly, call the following script as root: | ||
149 | #!/bin/bash | ||
150 | statedir=/root/s3/state | ||
151 | curcons=`fgconsole` | ||
152 | fuser /dev/tty$curcons 2>/dev/null|xargs ps -o comm= -p|grep -q X && chvt 2 | ||
153 | cat /dev/vcsa >$statedir/vcsa | ||
154 | sync | ||
155 | echo 3 >/proc/acpi/sleep | ||
156 | sync | ||
157 | vbetool post | ||
158 | vbetool vbestate restore <$statedir/vbe | ||
159 | cat $statedir/vcsa >/dev/vcsa | ||
160 | rckbd restart | ||
161 | chvt $[curcons%6+1] | ||
162 | chvt $curcons | ||
163 | |||
164 | |||
165 | Unless you change your graphics card or other hardware configuration, | ||
166 | the state once saved will be OK for every resume afterwards. | ||
167 | NOTE: The "rckbd restart" command may be different for your | ||
168 | distribution. Simply replace it with the command you would use to | ||
169 | set the fonts on screen. | ||
diff --git a/Documentation/power/video_extension.txt b/Documentation/power/video_extension.txt new file mode 100644 index 000000000000..8e33d7c82c49 --- /dev/null +++ b/Documentation/power/video_extension.txt | |||
@@ -0,0 +1,34 @@ | |||
1 | This driver implements the ACPI Extensions For Display Adapters | ||
2 | for integrated graphics devices on the motherboard, as specified in the | ||
3 | ACPI 2.0 Specification, Appendix B. It allows some basic control, such as | ||
4 | defining the video POST device, retrieving EDID information, or setting up | ||
5 | a video output. Note that this is a reference implementation only. | ||
6 | It may or may not work for your integrated video device. | ||
7 | |||
8 | Interfaces exposed to userland through /proc/acpi/video: | ||
9 | |||
10 | VGA/info : display the supported video bus device capabilities, such as Video ROM, CRT/LCD/TV. | ||
11 | VGA/ROM : Used to get a copy of the display devices' ROM data (up to 4k). | ||
12 | VGA/POST_info : Used to determine what options are implemented. | ||
13 | VGA/POST : Used to get/set POST device. | ||
14 | VGA/DOS : Used to get/set ownership of output switching: | ||
15 | Please refer to ACPI spec B.4.1 _DOS | ||
16 | VGA/CRT : CRT output | ||
17 | VGA/LCD : LCD output | ||
18 | VGA/TV : TV output | ||
19 | VGA/*/brightness : Used to get/set brightness of output device | ||
20 | |||
21 | Notify event through /proc/acpi/event: | ||
22 | |||
23 | #define ACPI_VIDEO_NOTIFY_SWITCH 0x80 | ||
24 | #define ACPI_VIDEO_NOTIFY_PROBE 0x81 | ||
25 | #define ACPI_VIDEO_NOTIFY_CYCLE 0x82 | ||
26 | #define ACPI_VIDEO_NOTIFY_NEXT_OUTPUT 0x83 | ||
27 | #define ACPI_VIDEO_NOTIFY_PREV_OUTPUT 0x84 | ||
28 | |||
29 | #define ACPI_VIDEO_NOTIFY_CYCLE_BRIGHTNESS 0x82 | ||
30 | #define ACPI_VIDEO_NOTIFY_INC_BRIGHTNESS 0x83 | ||
31 | #define ACPI_VIDEO_NOTIFY_DEC_BRIGHTNESS 0x84 | ||
32 | #define ACPI_VIDEO_NOTIFY_ZERO_BRIGHTNESS 0x85 | ||
33 | #define ACPI_VIDEO_NOTIFY_DISPLAY_OFF 0x86 | ||
34 | |||