Diffstat (limited to 'Documentation/power/devices.txt')
-rw-r--r-- | Documentation/power/devices.txt | 319 |
1 files changed, 319 insertions, 0 deletions
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt
new file mode 100644
index 000000000000..5d4ae9a39f1d
--- /dev/null
+++ b/Documentation/power/devices.txt
@@ -0,0 +1,319 @@ | |||
1 | |||
2 | Device Power Management | ||
3 | |||
4 | |||
5 | Device power management encompasses two areas - the ability to save | ||
6 | state and transition a device to a low-power state when the system is | ||
7 | entering a low-power state; and the ability to transition a device to | ||
8 | a low-power state while the system is running (and independently of | ||
9 | any other power management activity). | ||
10 | |||
11 | |||
12 | Methods | ||
13 | |||
14 | The methods to suspend and resume devices reside in struct bus_type: | ||
15 | |||
16 | struct bus_type { | ||
17 | ... | ||
18 | int (*suspend)(struct device * dev, pm_message_t state); | ||
19 | int (*resume)(struct device * dev); | ||
20 | }; | ||
21 | |||
22 | Each bus driver is responsible for implementing these methods, translating | ||
23 | the call into a bus-specific request and forwarding the call to the | ||
24 | bus-specific drivers. For example, PCI drivers implement suspend() and | ||
25 | resume() methods in struct pci_driver. The PCI core is simply | ||
26 | responsible for translating the pointers to PCI-specific ones and | ||
27 | calling the low-level driver. | ||
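
The forwarding described above can be sketched in user-space C. This is
only an illustration, not kernel code: the types are simplified stand-ins,
and every name starting with toy_ or sample_ is hypothetical. It shows a
bus-level suspend() recovering the bus-specific device from the generic
struct device and calling the bound driver's method.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
typedef struct { int event; } pm_message_t;

struct device {
	const char *name;
	void *driver;			/* bound bus-specific driver, or NULL */
};

struct bus_type {
	const char *name;
	int (*suspend)(struct device *dev, pm_message_t state);
	int (*resume)(struct device *dev);
};

/* Hypothetical bus-specific wrappers, loosely modelled on the PCI case. */
struct toy_pci_dev {
	struct device dev;		/* embedded generic device */
	int saved_state;
};

struct toy_pci_driver {
	int (*suspend)(struct toy_pci_dev *pdev, pm_message_t state);
	int (*resume)(struct toy_pci_dev *pdev);
};

/* Recover the bus-specific device from the embedded generic one. */
#define to_toy_pci_dev(d) \
	((struct toy_pci_dev *)((char *)(d) - offsetof(struct toy_pci_dev, dev)))

/* Bus method: translate the pointers, then call the low-level driver. */
static int toy_pci_bus_suspend(struct device *dev, pm_message_t state)
{
	struct toy_pci_driver *drv;

	assert(dev != NULL);
	drv = dev->driver;
	if (drv && drv->suspend)
		return drv->suspend(to_toy_pci_dev(dev), state);
	return 0;			/* no driver method: nothing to do */
}

static int toy_pci_bus_resume(struct device *dev)
{
	struct toy_pci_driver *drv;

	assert(dev != NULL);
	drv = dev->driver;
	if (drv && drv->resume)
		return drv->resume(to_toy_pci_dev(dev));
	return 0;
}

static struct bus_type toy_pci_bus = {
	"toy_pci", toy_pci_bus_suspend, toy_pci_bus_resume
};

/* A sample low-level driver that only records the requested state. */
static int sample_suspend(struct toy_pci_dev *pdev, pm_message_t state)
{
	pdev->saved_state = state.event;
	return 0;
}

static int sample_resume(struct toy_pci_dev *pdev)
{
	pdev->saved_state = 0;
	return 0;
}
```

The low-level driver never sees a struct device here; the bus layer does
the pointer translation on its behalf.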
28 | |||
29 | This is done to a) ease transition to the new power management methods | ||
30 | and leverage the existing PM code in various bus drivers; b) allow | ||
31 | buses to implement generic and default PM routines for devices; and c) | ||
32 | make the flow of execution obvious to the reader. | ||
33 | |||
34 | |||
35 | System Power Management | ||
36 | |||
37 | When the system enters a low-power state, the device tree is walked in | ||
38 | a depth-first fashion to transition each device into a low-power | ||
39 | state. The ordering of the device tree is guaranteed by the order in | ||
40 | which devices get registered - children are never registered before | ||
41 | their ancestors, and devices are placed at the back of the list when | ||
42 | registered. By walking the list in reverse order, we are guaranteed to | ||
43 | suspend devices in the proper order. | ||
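
The ordering argument can be illustrated with a small user-space sketch.
It assumes a fixed-size array in place of the real device list, and the
toy_ names are hypothetical. Devices are appended on registration and
suspended by walking the list backwards, so a child is always suspended
before its parent.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVICES 16

struct toy_device {
	const char *name;
	struct toy_device *parent;	/* NULL for a root device */
	int suspended;
};

/* Registration order doubles as the suspend/resume ordering. */
static struct toy_device *device_list[MAX_DEVICES];
static size_t device_count;

/* Append to the back of the list; parents are registered first. */
static int toy_register(struct toy_device *dev)
{
	assert(dev != NULL);
	if (device_count == MAX_DEVICES)
		return -1;
	device_list[device_count++] = dev;
	return 0;
}

/*
 * Suspend every device in reverse registration order and record the
 * order in which the devices were visited.
 */
static size_t toy_suspend_all(const char **order, size_t max)
{
	size_t n = 0;

	for (size_t i = device_count; i-- > 0; ) {
		struct toy_device *dev = device_list[i];

		/* The parent must still be awake when the child goes down. */
		assert(dev->parent == NULL || !dev->parent->suspended);
		dev->suspended = 1;
		if (n < max)
			order[n++] = dev->name;
	}
	return n;
}
```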
44 | |||
45 | Devices are suspended once with interrupts enabled. Drivers are | ||
46 | expected to stop I/O transactions, save device state, and place the | ||
47 | device into a low-power state. Drivers may sleep, allocate memory, | ||
48 | etc. at will. | ||
49 | |||
50 | Some devices are broken and will inevitably have problems powering | ||
51 | down or disabling themselves with interrupts enabled. For these | ||
52 | special cases, their drivers may return -EAGAIN. This will put the device on a | ||
53 | list to be taken care of later. When interrupts are disabled, before | ||
54 | we enter the low-power state, their drivers are called again to put | ||
55 | their device to sleep. | ||
56 | |||
57 | On resume, the devices that returned -EAGAIN will be called to power | ||
58 | themselves back on with interrupts disabled. Once interrupts have been | ||
59 | re-enabled, the rest of the drivers will be called to resume their | ||
60 | devices. On resume, a driver is responsible for powering back on each | ||
61 | device, restoring state, and re-enabling I/O transactions for that | ||
62 | device. | ||
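
The two-pass scheme with -EAGAIN might look roughly like the following
user-space simulation. The irqs_enabled flag, the needs_irqs_off field
and all toy_ names are assumptions made for the example; the real core
code is organized differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_device {
	const char *name;
	int needs_irqs_off;	/* hypothetical: broken HW, needs IRQs off */
	int suspended;
	int deferred;		/* set when the driver returned -EAGAIN */
};

static int irqs_enabled = 1;	/* simulated global interrupt state */

/* Driver method: refuse with -EAGAIN while interrupts are enabled. */
static int toy_driver_suspend(struct toy_device *dev)
{
	if (dev->needs_irqs_off && irqs_enabled)
		return -EAGAIN;
	dev->suspended = 1;
	return 0;
}

static int toy_suspend_all(struct toy_device **devs, size_t n)
{
	size_t i;

	assert(devs != NULL);
	/* Pass 1: interrupts enabled, reverse registration order. */
	for (i = n; i-- > 0; ) {
		int err = toy_driver_suspend(devs[i]);

		if (err == -EAGAIN)
			devs[i]->deferred = 1;	/* handle it later */
		else if (err)
			return err;
	}
	/* Pass 2: interrupts disabled, deferred devices only. */
	irqs_enabled = 0;
	for (i = n; i-- > 0; ) {
		if (devs[i]->deferred) {
			int err = toy_driver_suspend(devs[i]);

			if (err)
				return err;
		}
	}
	return 0;
}

static void toy_resume_all(struct toy_device **devs, size_t n)
{
	size_t i;

	assert(devs != NULL);
	/* Deferred devices come back first, with interrupts still off. */
	for (i = 0; i < n; i++)
		if (devs[i]->deferred)
			devs[i]->suspended = 0;
	irqs_enabled = 1;
	/* Everything else resumes once interrupts are re-enabled. */
	for (i = 0; i < n; i++) {
		if (devs[i]->deferred)
			devs[i]->deferred = 0;
		else
			devs[i]->suspended = 0;
	}
}
```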
63 | |||
64 | System devices follow a slightly different API, which can be found in | ||
65 | |||
66 | include/linux/sysdev.h | ||
67 | drivers/base/sys.c | ||
68 | |||
69 | System devices will only be suspended with interrupts disabled, and | ||
70 | after all other devices have been suspended. On resume, they will be | ||
71 | resumed before any other devices, and also with interrupts disabled. | ||
72 | |||
73 | |||
74 | Runtime Power Management | ||
75 | |||
76 | Many devices are able to dynamically power down while the system is | ||
77 | still running. This feature is useful for devices that are not being | ||
78 | used, and can offer significant power savings on a running system. | ||
79 | |||
80 | In each device's directory, there is a 'power' directory, which | ||
81 | contains at least a 'state' file. Reading from this file displays what | ||
82 | power state the device is currently in. Writing to this file initiates | ||
83 | a transition to the specified power state, which must be a decimal in | ||
84 | the range 1-3, inclusive; or 0 for 'On'. | ||
85 | |||
86 | The PM core will call the ->suspend() method in the bus_type object | ||
87 | that the device belongs to if the specified state is not 0, or | ||
88 | ->resume() if it is. | ||
89 | |||
90 | Nothing will happen if the specified state is the same state the | ||
91 | device is currently in. | ||
92 | |||
93 | If the device is already in a low-power state, and the specified state | ||
94 | is another, but different, low-power state, the ->resume() method will | ||
95 | first be called to power the device back on, then ->suspend() will be | ||
96 | called again with the new state. | ||
97 | |||
98 | The driver is responsible for saving the working state of the device | ||
99 | and putting it into the low-power state specified. If this was | ||
100 | successful, it returns 0, and the device's power_state field is | ||
101 | updated. | ||
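
The rules for a write to the 'state' file can be summarized in a short
user-space sketch. The function and type names are hypothetical, and
returning -EINVAL for an out-of-range value is an assumption; the text
above only says the value must be 0-3.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_device {
	int power_state;	/* 0 = On, 1-3 = low-power states */
	int (*suspend)(struct toy_device *dev, int state);
	int (*resume)(struct toy_device *dev);
};

/* What a write of `state` to the 'state' file is described as doing. */
static int toy_set_power_state(struct toy_device *dev, int state)
{
	int err;

	assert(dev != NULL);
	if (state < 0 || state > 3)
		return -EINVAL;		/* assumed error code */
	if (state == dev->power_state)
		return 0;		/* same state: nothing happens */

	/* Leaving a low-power state always goes through ->resume(). */
	if (dev->power_state != 0) {
		err = dev->resume(dev);
		if (err)
			return err;
		dev->power_state = 0;
	}
	/* Entering a low-power state goes through ->suspend(). */
	if (state != 0) {
		err = dev->suspend(dev, state);
		if (err)
			return err;
		dev->power_state = state;	/* updated only on success */
	}
	return 0;
}

/* Sample driver methods that only count how often they are called. */
static int calls_suspend, calls_resume;

static int sample_suspend(struct toy_device *dev, int state)
{
	(void)dev;
	(void)state;
	calls_suspend++;
	return 0;
}

static int sample_resume(struct toy_device *dev)
{
	(void)dev;
	calls_resume++;
	return 0;
}
```

Moving from state 2 to state 3 therefore costs one ->resume() and one
->suspend() call in this sketch, matching the description above.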
102 | |||
103 | The driver must take care to know whether or not it is able to | ||
104 | properly resume the device, including all steps of reinitialization | ||
105 | necessary. (This is the hardest part, and the one most protected by | ||
106 | NDA'd documents). | ||
107 | |||
108 | The driver must also take care not to suspend a device that is | ||
109 | currently in use. It is the driver's responsibility to provide its own | ||
110 | exclusion mechanisms. | ||
111 | |||
112 | The runtime power transition happens with interrupts enabled. If a | ||
113 | device cannot support being powered down with interrupts enabled, it may | ||
114 | return -EAGAIN (as it would during a system power management | ||
115 | transition), but it will _not_ be called again, and the transition | ||
116 | will fail. | ||
117 | |||
118 | There is currently no way to know what states a device or driver | ||
119 | supports a priori. This will change in the future. | ||
120 | |||
121 | pm_message_t meaning | ||
122 | |||
123 | pm_message_t has two fields: event ("major") and flags. If a driver | ||
124 | does not know an event code, it aborts the request, returning an error. Some | ||
125 | drivers may need to deal with special cases based on the actual type | ||
126 | of suspend operation being done at the system level. This is why | ||
127 | there are flags. | ||
128 | |||
129 | Event codes are: | ||
130 | |||
131 | ON -- no need to do anything except special cases like broken | ||
132 | HW. | ||
133 | |||
134 | # NOTIFICATION -- pretty much same as ON? | ||
135 | |||
136 | FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from | ||
137 | scratch. That probably means stop accepting upstream requests, the | ||
138 | actual policy of what to do with them being specific to a given | ||
139 | driver. It's acceptable for a network driver to just drop packets | ||
140 | while a block driver is expected to block the queue so no request is | ||
141 | lost. (Use IDE as an example of how to do that). FREEZE requires no | ||
142 | power state change, and it's expected for drivers to be able to | ||
143 | quickly transition back to operating state. | ||
144 | |||
145 | SUSPEND -- like FREEZE, but also put the hardware into a low-power state. If | ||
146 | there is a need to distinguish several levels of sleep, an additional flag | ||
147 | is probably the best way to do that. | ||
148 | |||
149 | Transitions are only from a resumed state to a suspended state, never | ||
150 | between two suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, | ||
151 | FREEZE -> SUSPEND or SUSPEND -> FREEZE cannot). | ||
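
A driver's handling of the event codes could be sketched as follows, in
user-space C. The TOY_EVENT_* constants, the toy_pm_message_t layout and
the -EINVAL return value are assumptions for illustration and are not
taken from the kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical event codes mirroring ON, FREEZE and SUSPEND above. */
enum toy_event { TOY_EVENT_ON, TOY_EVENT_FREEZE, TOY_EVENT_SUSPEND };

typedef struct {
	int event;		/* the "major" code a driver should look at */
	unsigned int flags;	/* system-level detail; drivers ignore it */
} toy_pm_message_t;

struct toy_device {
	int io_stopped;		/* DMA and interrupts quiesced */
	int low_power;		/* hardware placed in a low-power state */
};

/* Only transitions out of ON are allowed, never suspended -> suspended. */
static int toy_transition_allowed(int from, int to)
{
	return from == TOY_EVENT_ON &&
	       (to == TOY_EVENT_FREEZE || to == TOY_EVENT_SUSPEND);
}

static int toy_driver_suspend(struct toy_device *dev, toy_pm_message_t msg)
{
	assert(dev != NULL);
	switch (msg.event) {
	case TOY_EVENT_ON:
		return 0;		/* nothing to do */
	case TOY_EVENT_FREEZE:
		dev->io_stopped = 1;	/* quiesce, no power state change */
		return 0;
	case TOY_EVENT_SUSPEND:
		dev->io_stopped = 1;	/* like FREEZE ... */
		dev->low_power = 1;	/* ... plus a low-power state */
		return 0;
	default:
		return -EINVAL;		/* unknown event: abort the request */
	}
}
```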
152 | |||
153 | All events are: | ||
154 | |||
155 | [NOTE NOTE NOTE: If you are a driver author, you should not care; you | ||
156 | should only look at event, and ignore flags.] | ||
157 | |||
158 | #Prepare for suspend -- userland is still running but we are going to | ||
159 | #enter suspend state. This gives drivers a chance to load firmware from | ||
160 | #disk and store it in memory, or do other activities that require | ||
161 | #operating userland, ability to kmalloc GFP_KERNEL, etc... All of these | ||
162 | #are forbidden once the suspend dance is started. event = ON, flags = | ||
163 | #PREPARE_TO_SUSPEND | ||
164 | |||
165 | Apm standby -- prepare for APM event. Quiesce devices to make life | ||
166 | easier for APM BIOS. event = FREEZE, flags = APM_STANDBY | ||
167 | |||
168 | Apm suspend -- same as APM_STANDBY, but we should probably avoid | ||
169 | spinning down disks. event = FREEZE, flags = APM_SUSPEND | ||
170 | |||
171 | System halt, reboot -- quiesce devices to make life easier for BIOS. event | ||
172 | = FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT | ||
173 | |||
174 | System shutdown -- at least disks need to be spun down, or data may be | ||
175 | lost. Quiesce devices, just to make life easier for BIOS. event = | ||
176 | FREEZE, flags = SYSTEM_SHUTDOWN | ||
177 | |||
178 | Kexec -- turn off DMAs and put hardware into some state where new | ||
179 | kernel can take over. event = FREEZE, flags = KEXEC | ||
180 | |||
181 | Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake | ||
182 | may need to be enabled on some devices. This actually has at least 3 | ||
183 | subtypes: the system can reboot, enter S4, or enter S5 at the end of | ||
184 | swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, | ||
185 | SYSTEM_SHUTDOWN, SYSTEM_S4 | ||
186 | |||
187 | Suspend to ram -- put devices into low power state. event = SUSPEND, | ||
188 | flags = SUSPEND_TO_RAM | ||
189 | |||
190 | Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put | ||
191 | devices into low power mode, but you must be able to reinitialize | ||
192 | device from scratch in resume method. This has two flavors: it's done | ||
193 | once on the suspending kernel and once on the resuming kernel. event = FREEZE, | ||
194 | flags = DURING_SUSPEND or DURING_RESUME | ||
195 | |||
196 | Device detach requested from /sys -- deinitialize device; probably same as | ||
197 | SYSTEM_SHUTDOWN, I do not understand this one too much. Probably event | ||
198 | = FREEZE, flags = DEV_DETACH. | ||
199 | |||
200 | #These are not really events sent: | ||
201 | # | ||
202 | #System fully on -- device is working normally; this is probably never | ||
203 | #passed to suspend() method... event = ON, flags = 0 | ||
204 | # | ||
205 | #Ready after resume -- userland is now running, again. Time to free any | ||
206 | #memory you ate during prepare to suspend... event = ON, flags = | ||
207 | #READY_AFTER_RESUME | ||
208 | # | ||
209 | |||
210 | Driver Detach Power Management | ||
211 | |||
212 | The kernel now supports the ability to place a device in a low-power | ||
213 | state when it is detached from its driver, which happens when its | ||
214 | module is removed. | ||
215 | |||
216 | Each device contains a 'detach_state' file in its sysfs directory | ||
217 | which can be used to control this state. Reading from this file | ||
218 | displays what the current detach state is set to. This is 0 (On) by | ||
219 | default. A user may write a positive integer value to this file in the | ||
220 | range of 1-4 inclusive. | ||
221 | |||
222 | A value of 1-3 will indicate the device should be placed in that | ||
223 | low-power state, which will cause ->suspend() to be called for that | ||
224 | device. A value of 4 indicates that the device should be shut down, so | ||
225 | ->shutdown() will be called for that device. | ||
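
The meaning of the detach_state values can be sketched in user-space C.
The names are hypothetical, and both accepting 0 on write and rejecting
other values with -EINVAL are assumptions; the text above only states
that 0 is the default and that 1-4 may be written.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_device {
	int detach_state;	/* 0 = On (default), 1-3 = low power, 4 = off */
	int (*suspend)(struct toy_device *dev, int state);
	void (*shutdown)(struct toy_device *dev);
};

/* Write handler for the 'detach_state' file. */
static int toy_store_detach_state(struct toy_device *dev, int value)
{
	assert(dev != NULL);
	if (value < 0 || value > 4)
		return -EINVAL;		/* assumed error code */
	dev->detach_state = value;
	return 0;
}

/* Called when the device is detached from its driver. */
static int toy_driver_detach(struct toy_device *dev)
{
	assert(dev != NULL);
	if (dev->detach_state == 0)
		return 0;			/* leave the device on */
	if (dev->detach_state == 4) {
		dev->shutdown(dev);		/* shut the device down */
		return 0;
	}
	return dev->suspend(dev, dev->detach_state);	/* states 1-3 */
}

/* Sample driver methods that only record what was requested. */
static int suspend_calls, shutdown_calls, last_state;

static int sample_suspend(struct toy_device *dev, int state)
{
	(void)dev;
	suspend_calls++;
	last_state = state;
	return 0;
}

static void sample_shutdown(struct toy_device *dev)
{
	(void)dev;
	shutdown_calls++;
}
```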
226 | |||
227 | The driver is responsible for reinitializing the device in its | ||
228 | ->probe() (or equivalent) method when the module is re-inserted. | ||
229 | The driver core will not call any extra functions when binding the | ||
230 | device to the driver. | ||
231 | |||