Diffstat (limited to 'Documentation/power')
-rw-r--r-- | Documentation/power/00-INDEX | 2 | ||||
-rw-r--r-- | Documentation/power/devices.txt | 119 | ||||
-rw-r--r-- | Documentation/power/drivers-testing.txt | 8 | ||||
-rw-r--r-- | Documentation/power/interface.txt | 2 | ||||
-rw-r--r-- | Documentation/power/notifiers.txt | 53 | ||||
-rw-r--r-- | Documentation/power/opp.txt | 378 | ||||
-rw-r--r-- | Documentation/power/regulator/machine.txt | 4 | ||||
-rw-r--r-- | Documentation/power/runtime_pm.txt | 306 | ||||
-rw-r--r-- | Documentation/power/s2ram.txt | 7 | ||||
-rw-r--r-- | Documentation/power/states.txt | 12 | ||||
-rw-r--r-- | Documentation/power/swsusp.txt | 7 | ||||
-rw-r--r-- | Documentation/power/userland-swsusp.txt | 6 |
12 files changed, 773 insertions, 131 deletions
diff --git a/Documentation/power/00-INDEX b/Documentation/power/00-INDEX index fb742c213c9e..45e9d4a91284 100644 --- a/Documentation/power/00-INDEX +++ b/Documentation/power/00-INDEX | |||
@@ -14,6 +14,8 @@ interface.txt | |||
14 | - Power management user interface in /sys/power | 14 | - Power management user interface in /sys/power |
15 | notifiers.txt | 15 | notifiers.txt |
16 | - Registering suspend notifiers in device drivers | 16 | - Registering suspend notifiers in device drivers |
17 | opp.txt | ||
18 | - Operating Performance Point library | ||
17 | pci.txt | 19 | pci.txt |
18 | - How the PCI Subsystem Does Power Management | 20 | - How the PCI Subsystem Does Power Management |
19 | pm_qos_interface.txt | 21 | pm_qos_interface.txt |
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt index 57080cd74575..64565aac6e40 100644 --- a/Documentation/power/devices.txt +++ b/Documentation/power/devices.txt | |||
@@ -1,6 +1,6 @@ | |||
1 | Device Power Management | 1 | Device Power Management |
2 | 2 | ||
3 | Copyright (c) 2010 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. | 3 | Copyright (c) 2010-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. |
4 | Copyright (c) 2010 Alan Stern <stern@rowland.harvard.edu> | 4 | Copyright (c) 2010 Alan Stern <stern@rowland.harvard.edu> |
5 | 5 | ||
6 | 6 | ||
@@ -159,18 +159,18 @@ matter, and the kernel is responsible for keeping track of it. By contrast, | |||
159 | whether or not a wakeup-capable device should issue wakeup events is a policy | 159 | whether or not a wakeup-capable device should issue wakeup events is a policy |
160 | decision, and it is managed by user space through a sysfs attribute: the | 160 | decision, and it is managed by user space through a sysfs attribute: the |
161 | power/wakeup file. User space can write the strings "enabled" or "disabled" to | 161 | power/wakeup file. User space can write the strings "enabled" or "disabled" to |
162 | set or clear the should_wakeup flag, respectively. Reads from the file will | 162 | set or clear the "should_wakeup" flag, respectively. This file is only present |
163 | return the corresponding string if can_wakeup is true, but if can_wakeup is | 163 | for wakeup-capable devices (i.e. devices whose "can_wakeup" flags are set) |
164 | false then reads will return an empty string, to indicate that the device | 164 | and is created (or removed) by device_set_wakeup_capable(). Reads from the |
165 | doesn't support wakeup events. (But even though the file appears empty, writes | 165 | file will return the corresponding string. |
166 | will still affect the should_wakeup flag.) | ||
167 | 166 | ||
168 | The device_may_wakeup() routine returns true only if both flags are set. | 167 | The device_may_wakeup() routine returns true only if both flags are set. |
169 | Drivers should check this routine when putting devices in a low-power state | 168 | This information is used by subsystems, like the PCI bus type code, to see |
170 | during a system sleep transition, to see whether or not to enable the devices' | 169 | whether or not to enable the devices' wakeup mechanisms. If device wakeup |
171 | wakeup mechanisms. However for runtime power management, wakeup events should | 170 | mechanisms are enabled or disabled directly by drivers, they also should use |
172 | be enabled whenever the device and driver both support them, regardless of the | 171 | device_may_wakeup() to decide what to do during a system sleep transition. |
173 | should_wakeup flag. | 172 | However for runtime power management, wakeup events should be enabled whenever |
173 | the device and driver both support them, regardless of the should_wakeup flag. | ||
174 | 174 | ||
175 | 175 | ||
176 | /sys/devices/.../power/control files | 176 | /sys/devices/.../power/control files |
@@ -249,23 +249,18 @@ various phases always run after tasks have been frozen and before they are | |||
249 | unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have | 249 | unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have |
250 | been disabled (except for those marked with the IRQ_WAKEUP flag). | 250 | been disabled (except for those marked with the IRQ_WAKEUP flag). |
251 | 251 | ||
252 | Most phases use bus, type, and class callbacks (that is, methods defined in | 252 | All phases use bus, type, or class callbacks (that is, methods defined in |
253 | dev->bus->pm, dev->type->pm, and dev->class->pm). The prepare and complete | 253 | dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually |
254 | phases are exceptions; they use only bus callbacks. When multiple callbacks | 254 | exclusive, so if the device type provides a struct dev_pm_ops object pointed to |
255 | are used in a phase, they are invoked in the order: <class, type, bus> during | 255 | by its pm field (i.e. both dev->type and dev->type->pm are defined), the |
256 | power-down transitions and in the opposite order during power-up transitions. | 256 | callbacks included in that object (i.e. dev->type->pm) will be used. Otherwise, |
257 | For example, during the suspend phase the PM core invokes | 257 | if the class provides a struct dev_pm_ops object pointed to by its pm field |
258 | 258 | (i.e. both dev->class and dev->class->pm are defined), the PM core will use the | |
259 | dev->class->pm.suspend(dev); | 259 | callbacks from that object (i.e. dev->class->pm). Finally, if the pm fields of |
260 | dev->type->pm.suspend(dev); | 260 | both the device type and class objects are NULL (or those objects do not exist), |
261 | dev->bus->pm.suspend(dev); | 261 | the callbacks provided by the bus (that is, the callbacks from dev->bus->pm) |
262 | 262 | will be used (this allows device types to override callbacks provided by bus | |
263 | before moving on to the next device, whereas during the resume phase the core | 263 | types or classes if necessary). |
264 | invokes | ||
265 | |||
266 | dev->bus->pm.resume(dev); | ||
267 | dev->type->pm.resume(dev); | ||
268 | dev->class->pm.resume(dev); | ||
269 | 264 | ||
270 | These callbacks may in turn invoke device- or driver-specific methods stored in | 265 | These callbacks may in turn invoke device- or driver-specific methods stored in |
271 | dev->driver->pm, but they don't have to. | 266 | dev->driver->pm, but they don't have to. |
@@ -284,11 +279,15 @@ When the system goes into the standby or memory sleep state, the phases are: | |||
284 | time.) Unlike the other suspend-related phases, during the prepare | 279 | time.) Unlike the other suspend-related phases, during the prepare |
285 | phase the device tree is traversed top-down. | 280 | phase the device tree is traversed top-down. |
286 | 281 | ||
287 | The prepare phase uses only a bus callback. After the callback method | 282 | In addition to that, if device drivers need to allocate additional |
288 | returns, no new children may be registered below the device. The method | 283 | memory to be able to handle device suspend correctly, that should be |
289 | may also prepare the device or driver in some way for the upcoming | 284 | done in the prepare phase. |
290 | system power transition, but it should not put the device into a | 285 | |
291 | low-power state. | 286 | After the prepare callback method returns, no new children may be |
287 | registered below the device. The method may also prepare the device or | ||
288 | driver in some way for the upcoming system power transition (for | ||
289 | example, by allocating additional memory required for this purpose), but | ||
290 | it should not put the device into a low-power state. | ||
292 | 291 | ||
293 | 2. The suspend methods should quiesce the device to stop it from performing | 292 | 2. The suspend methods should quiesce the device to stop it from performing |
294 | I/O. They also may save the device registers and put it into the | 293 | I/O. They also may save the device registers and put it into the |
@@ -372,7 +371,7 @@ Drivers need to be able to handle hardware which has been reset since the | |||
372 | suspend methods were called, for example by complete reinitialization. | 371 | suspend methods were called, for example by complete reinitialization. |
373 | This may be the hardest part, and the one most protected by NDA'd documents | 372 | This may be the hardest part, and the one most protected by NDA'd documents |
374 | and chip errata. It's simplest if the hardware state hasn't changed since | 373 | and chip errata. It's simplest if the hardware state hasn't changed since |
375 | the suspend was carried out, but that can't be guaranteed (in fact, it ususally | 374 | the suspend was carried out, but that can't be guaranteed (in fact, it usually |
376 | is not the case). | 375 | is not the case). |
377 | 376 | ||
378 | Drivers must also be prepared to notice that the device has been removed | 377 | Drivers must also be prepared to notice that the device has been removed |
@@ -507,30 +506,34 @@ routines. Nevertheless, different callback pointers are used in case there is a | |||
507 | situation where it actually matters. | 506 | situation where it actually matters. |
508 | 507 | ||
509 | 508 | ||
510 | System Devices | 509 | Device Power Domains |
511 | -------------- | 510 | -------------------- |
512 | System devices (sysdevs) follow a slightly different API, which can be found in | 511 | Sometimes devices share reference clocks or other power resources. In those |
513 | 512 | cases it generally is not possible to put devices into low-power states | |
514 | include/linux/sysdev.h | 513 | individually. Instead, a set of devices sharing a power resource can be put |
515 | drivers/base/sys.c | 514 | into a low-power state together at the same time by turning off the shared |
516 | 515 | power resource. Of course, they also need to be put into the full-power state | |
517 | System devices will be suspended with interrupts disabled, and after all other | 516 | together, by turning the shared power resource on. A set of devices with this |
518 | devices have been suspended. On resume, they will be resumed before any other | 517 | property is often referred to as a power domain. |
519 | devices, and also with interrupts disabled. These things occur in special | 518 | |
520 | "sysdev_driver" phases, which affect only system devices. | 519 | Support for power domains is provided through the pwr_domain field of struct |
521 | 520 | device. This field is a pointer to an object of type struct dev_power_domain, | |
522 | Thus, after the suspend_noirq (or freeze_noirq or poweroff_noirq) phase, when | 521 | defined in include/linux/pm.h, providing a set of power management callbacks |
523 | the non-boot CPUs are all offline and IRQs are disabled on the remaining online | 522 | analogous to the subsystem-level and device driver callbacks that are executed |
524 | CPU, then a sysdev_driver.suspend phase is carried out, and the system enters a | 523 | for the given device during all power transitions, instead of the respective |
525 | sleep state (or a system image is created). During resume (or after the image | 524 | subsystem-level callbacks. Specifically, if a device's pwr_domain pointer is |
526 | has been created or loaded) a sysdev_driver.resume phase is carried out, IRQs | 525 | not NULL, the ->suspend() callback from the object pointed to by it will be |
527 | are enabled on the only online CPU, the non-boot CPUs are enabled, and the | 526 | executed instead of its subsystem's (e.g. bus type's) ->suspend() callback and |
528 | resume_noirq (or thaw_noirq or restore_noirq) phase begins. | 527 | analogously for all of the remaining callbacks. In other words, power management |
529 | 528 | domain callbacks, if defined for the given device, always take precedence over | |
530 | Code to actually enter and exit the system-wide low power state sometimes | 529 | the callbacks provided by the device's subsystem (e.g. bus type). |
531 | involves hardware details that are only known to the boot firmware, and | 530 | |
532 | may leave a CPU running software (from SRAM or flash memory) that monitors | 531 | The support for device power management domains is only relevant to platforms |
533 | the system and manages its wakeup sequence. | 532 | needing to use the same device driver power management callbacks in many |
533 | different power domain configurations and wanting to avoid incorporating the | ||
534 | support for power domains into subsystem-level callbacks, for example by | ||
535 | modifying the platform bus type. Other platforms need not implement it or take | ||
536 | it into account in any way. | ||
534 | 537 | ||
535 | 538 | ||
536 | Device Low Power (suspend) States | 539 | Device Low Power (suspend) States |
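The precedence rule from the Device Power Domains text added in this hunk can be sketched as follows. This is a userspace model, not kernel code: the structure layouts are simplified assumptions, `run_suspend()` is a hypothetical dispatcher, and the `last_handler` field exists only so the outcome can be observed. The case of a power domain that leaves ->suspend() unset is not covered by the text, so the model simply does nothing there.

```c
#include <assert.h>
#include <stddef.h>

struct device;
typedef int (*pm_callback_t)(struct device *dev);

/* Simplified models of the kernel types (assumption). */
struct dev_pm_ops { pm_callback_t suspend; };
struct dev_power_domain { struct dev_pm_ops ops; };
struct bus_type { const struct dev_pm_ops *pm; };

struct device {
	const struct dev_power_domain *pwr_domain;
	const struct bus_type *bus;
	int last_handler;	/* illustration only: 1 = bus, 2 = power domain */
};

static int bus_suspend(struct device *dev)
{
	dev->last_handler = 1;
	return 0;
}

static int domain_suspend(struct device *dev)
{
	dev->last_handler = 2;
	return 0;
}

/* Hypothetical dispatcher: a non-NULL pwr_domain replaces the bus callback. */
static int run_suspend(struct device *dev)
{
	if (dev->pwr_domain) {
		pm_callback_t cb = dev->pwr_domain->ops.suspend;

		return cb ? cb(dev) : 0;	/* unset callback: assumed no-op */
	}
	if (dev->bus && dev->bus->pm && dev->bus->pm->suspend)
		return dev->bus->pm->suspend(dev);
	return 0;
}
```

The same pattern would apply to each of the remaining callbacks; only ->suspend() is modeled to keep the sketch short.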
diff --git a/Documentation/power/drivers-testing.txt b/Documentation/power/drivers-testing.txt index 7f7a737f7f9f..638afdf4d6b8 100644 --- a/Documentation/power/drivers-testing.txt +++ b/Documentation/power/drivers-testing.txt | |||
@@ -23,10 +23,10 @@ Once you have resolved the suspend/resume-related problems with your test system | |||
23 | without the new driver, you are ready to test it: | 23 | without the new driver, you are ready to test it: |
24 | 24 | ||
25 | a) Build the driver as a module, load it and try the test modes of hibernation | 25 | a) Build the driver as a module, load it and try the test modes of hibernation |
26 | (see: Documents/power/basic-pm-debugging.txt, 1). | 26 | (see: Documentation/power/basic-pm-debugging.txt, 1). |
27 | 27 | ||
28 | b) Load the driver and attempt to hibernate in the "reboot", "shutdown" and | 28 | b) Load the driver and attempt to hibernate in the "reboot", "shutdown" and |
29 | "platform" modes (see: Documents/power/basic-pm-debugging.txt, 1). | 29 | "platform" modes (see: Documentation/power/basic-pm-debugging.txt, 1). |
30 | 30 | ||
31 | c) Compile the driver directly into the kernel and try the test modes of | 31 | c) Compile the driver directly into the kernel and try the test modes of |
32 | hibernation. | 32 | hibernation. |
@@ -34,12 +34,12 @@ c) Compile the driver directly into the kernel and try the test modes of | |||
34 | d) Attempt to hibernate with the driver compiled directly into the kernel | 34 | d) Attempt to hibernate with the driver compiled directly into the kernel |
35 | in the "reboot", "shutdown" and "platform" modes. | 35 | in the "reboot", "shutdown" and "platform" modes. |
36 | 36 | ||
37 | e) Try the test modes of suspend (see: Documents/power/basic-pm-debugging.txt, | 37 | e) Try the test modes of suspend (see: Documentation/power/basic-pm-debugging.txt, |
38 | 2). [As far as the STR tests are concerned, it should not matter whether or | 38 | 2). [As far as the STR tests are concerned, it should not matter whether or |
39 | not the driver is built as a module.] | 39 | not the driver is built as a module.] |
40 | 40 | ||
41 | f) Attempt to suspend to RAM using the s2ram tool with the driver loaded | 41 | f) Attempt to suspend to RAM using the s2ram tool with the driver loaded |
42 | (see: Documents/power/basic-pm-debugging.txt, 2). | 42 | (see: Documentation/power/basic-pm-debugging.txt, 2). |
43 | 43 | ||
44 | Each of the above tests should be repeated several times and the STD tests | 44 | Each of the above tests should be repeated several times and the STD tests |
45 | should be mixed with the STR tests. If any of them fails, the driver cannot be | 45 | should be mixed with the STR tests. If any of them fails, the driver cannot be |
diff --git a/Documentation/power/interface.txt b/Documentation/power/interface.txt index e67211fe0ee2..c537834af005 100644 --- a/Documentation/power/interface.txt +++ b/Documentation/power/interface.txt | |||
@@ -57,7 +57,7 @@ smallest image possible. In particular, if "0" is written to this file, the | |||
57 | suspend image will be as small as possible. | 57 | suspend image will be as small as possible. |
58 | 58 | ||
59 | Reading from this file will display the current image size limit, which | 59 | Reading from this file will display the current image size limit, which |
60 | is set to 500 MB by default. | 60 | is set to 2/5 of available RAM by default. |
61 | 61 | ||
62 | /sys/power/pm_trace controls the code which saves the last PM event point in | 62 | /sys/power/pm_trace controls the code which saves the last PM event point in |
63 | the RTC across reboots, so that you can debug a machine that just hangs | 63 | the RTC across reboots, so that you can debug a machine that just hangs |
diff --git a/Documentation/power/notifiers.txt b/Documentation/power/notifiers.txt index ae1b7ec07684..c2a4a346c0d9 100644 --- a/Documentation/power/notifiers.txt +++ b/Documentation/power/notifiers.txt | |||
@@ -1,46 +1,41 @@ | |||
1 | Suspend notifiers | 1 | Suspend notifiers |
2 | (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL | 2 | (C) 2007-2011 Rafael J. Wysocki <rjw@sisk.pl>, GPL |
3 | 3 | ||
4 | There are some operations that device drivers may want to carry out in their | 4 | There are some operations that subsystems or drivers may want to carry out |
5 | .suspend() routines, but shouldn't, because they can cause the hibernation or | 5 | before hibernation/suspend or after restore/resume, but they require the system |
6 | suspend to fail. For example, a driver may want to allocate a substantial amount | 6 | to be fully functional, so the drivers' and subsystems' .suspend() and .resume() |
7 | of memory (like 50 MB) in .suspend(), but that shouldn't be done after the | 7 | or even .prepare() and .complete() callbacks are not suitable for this purpose. |
8 | swsusp's memory shrinker has run. | 8 | For example, device drivers may want to upload firmware to their devices after |
9 | 9 | resume/restore, but they cannot do it by calling request_firmware() from their | |
10 | Also, there may be some operations, that subsystems want to carry out before a | 10 | .resume() or .complete() routines (user land processes are frozen at these |
11 | hibernation/suspend or after a restore/resume, requiring the system to be fully | 11 | points). The solution may be to load the firmware into memory before processes |
12 | functional, so the drivers' .suspend() and .resume() routines are not suitable | 12 | are frozen and upload it from there in the .resume() routine. |
13 | for this purpose. For example, device drivers may want to upload firmware to | 13 | A suspend/hibernation notifier may be used for this purpose. |
14 | their devices after a restore from a hibernation image, but they cannot do it by | 14 | |
15 | calling request_firmware() from their .resume() routines (user land processes | 15 | The subsystems or drivers having such needs can register suspend notifiers that |
16 | are frozen at this point). The solution may be to load the firmware into | 16 | will be called upon the following events by the PM core: |
17 | memory before processes are frozen and upload it from there in the .resume() | ||
18 | routine. Of course, a hibernation notifier may be used for this purpose. | ||
19 | |||
20 | The subsystems that have such needs can register suspend notifiers that will be | ||
21 | called upon the following events by the suspend core: | ||
22 | 17 | ||
23 | PM_HIBERNATION_PREPARE The system is going to hibernate or suspend, tasks will | 18 | PM_HIBERNATION_PREPARE The system is going to hibernate or suspend, tasks will |
24 | be frozen immediately. | 19 | be frozen immediately. |
25 | 20 | ||
26 | PM_POST_HIBERNATION The system memory state has been restored from a | 21 | PM_POST_HIBERNATION The system memory state has been restored from a |
27 | hibernation image or an error occured during the | 22 | hibernation image or an error occurred during |
28 | hibernation. Device drivers' .resume() callbacks have | 23 | hibernation. Device drivers' restore callbacks have |
29 | been executed and tasks have been thawed. | 24 | been executed and tasks have been thawed. |
30 | 25 | ||
31 | PM_RESTORE_PREPARE The system is going to restore a hibernation image. | 26 | PM_RESTORE_PREPARE The system is going to restore a hibernation image. |
32 | If all goes well the restored kernel will issue a | 27 | If all goes well, the restored kernel will issue a |
33 | PM_POST_HIBERNATION notification. | 28 | PM_POST_HIBERNATION notification. |
34 | 29 | ||
35 | PM_POST_RESTORE An error occurred during the hibernation restore. | 30 | PM_POST_RESTORE An error occurred during restore from hibernation. |
36 | Device drivers' .resume() callbacks have been executed | 31 | Device drivers' restore callbacks have been executed |
37 | and tasks have been thawed. | 32 | and tasks have been thawed. |
38 | 33 | ||
39 | PM_SUSPEND_PREPARE The system is preparing for a suspend. | 34 | PM_SUSPEND_PREPARE The system is preparing for suspend. |
40 | 35 | ||
41 | PM_POST_SUSPEND The system has just resumed or an error occured during | 36 | PM_POST_SUSPEND The system has just resumed or an error occurred during |
42 | the suspend. Device drivers' .resume() callbacks have | 37 | suspend. Device drivers' resume callbacks have been |
43 | been executed and tasks have been thawed. | 38 | executed and tasks have been thawed. |
44 | 39 | ||
45 | It is generally assumed that whatever the notifiers do for | 40 | It is generally assumed that whatever the notifiers do for |
46 | PM_HIBERNATION_PREPARE, should be undone for PM_POST_HIBERNATION. Analogously, | 41 | PM_HIBERNATION_PREPARE, should be undone for PM_POST_HIBERNATION. Analogously, |
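The pairing rule in the new notifiers.txt text (whatever is done for a PREPARE event should be undone for the matching POST event) can be sketched with the firmware example from the same document. This is a userspace model under loud assumptions: the enum values are placeholders rather than the kernel's definitions, `fw_pm_notify()` and `fw_cache` are hypothetical names, `malloc()` stands in for loading firmware into memory, and the restore events are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder event codes; the kernel defines its own values (assumption). */
enum pm_event_model {
	PM_HIBERNATION_PREPARE,
	PM_POST_HIBERNATION,
	PM_SUSPEND_PREPARE,
	PM_POST_SUSPEND
};

#define FW_SIZE 4096		/* hypothetical firmware image size */

static char *fw_cache;		/* firmware copy kept across the transition */

/* Hypothetical notifier body: returns 0 on success, -1 on failure. */
static int fw_pm_notify(enum pm_event_model event)
{
	switch (event) {
	case PM_HIBERNATION_PREPARE:
	case PM_SUSPEND_PREPARE:
		/* Tasks are not frozen yet, so memory can still be obtained. */
		if (!fw_cache)
			fw_cache = malloc(FW_SIZE);
		return fw_cache ? 0 : -1;
	case PM_POST_HIBERNATION:
	case PM_POST_SUSPEND:
		/* Undo what the PREPARE event set up. */
		free(fw_cache);
		fw_cache = NULL;
		return 0;
	}
	return 0;
}
```

In a real driver the cached image would be uploaded to the device from the resume callback, between the two notifications.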
diff --git a/Documentation/power/opp.txt b/Documentation/power/opp.txt new file mode 100644 index 000000000000..5ae70a12c1e2 --- /dev/null +++ b/Documentation/power/opp.txt | |||
@@ -0,0 +1,378 @@ | |||
1 | *=============* | ||
2 | * OPP Library * | ||
3 | *=============* | ||
4 | |||
5 | (C) 2009-2010 Nishanth Menon <nm@ti.com>, Texas Instruments Incorporated | ||
6 | |||
7 | Contents | ||
8 | -------- | ||
9 | 1. Introduction | ||
10 | 2. Initial OPP List Registration | ||
11 | 3. OPP Search Functions | ||
12 | 4. OPP Availability Control Functions | ||
13 | 5. OPP Data Retrieval Functions | ||
14 | 6. Cpufreq Table Generation | ||
15 | 7. Data Structures | ||
16 | |||
17 | 1. Introduction | ||
18 | =============== | ||
19 | Complex SoCs of today consist of multiple sub-modules working in conjunction. | ||
20 | In an operational system executing varied use cases, not all modules in the SoC | ||
21 | need to function at their highest performing frequency all the time. To | ||
22 | facilitate this, sub-modules in a SoC are grouped into domains, allowing some | ||
23 | domains to run at lower voltage and frequency while other domains are loaded | ||
24 | more. The set of discrete tuples consisting of frequency and voltage pairs that | ||
25 | the device will support per domain are called Operating Performance Points or | ||
26 | OPPs. | ||
27 | |||
28 | OPP library provides a set of helper functions to organize and query the OPP | ||
29 | information. The library is located in drivers/base/power/opp.c and the header | ||
30 | is located in include/linux/opp.h. OPP library can be enabled by enabling | ||
31 | CONFIG_PM_OPP from power management menuconfig menu. OPP library depends on | ||
32 | CONFIG_PM, as certain SoC frameworks such as Texas Instruments' OMAP allow | ||
33 | optionally booting at a certain OPP without needing cpufreq. | ||
34 | |||
35 | Typical usage of the OPP library is as follows: | ||
36 | (users) -> registers a set of default OPPs -> (library) | ||
37 | SoC framework -> modifies on required cases certain OPPs -> OPP layer | ||
38 | -> queries to search/retrieve information -> | ||
39 | |||
40 | Architectures that provide a SoC framework for OPP should select ARCH_HAS_OPP | ||
41 | to make the OPP layer available. | ||
42 | |||
43 | OPP layer expects each domain to be represented by a unique device pointer. SoC | ||
44 | framework registers a set of initial OPPs per device with the OPP layer. This | ||
45 | list is expected to be an optimally small number typically around 5 per device. | ||
46 | This initial list contains a set of OPPs that the framework expects to be safely | ||
47 | enabled by default in the system. | ||
48 | |||
49 | Note on OPP Availability: | ||
50 | ------------------------ | ||
51 | As the system proceeds to operate, SoC framework may choose to make certain | ||
52 | OPPs available or not available on each device based on various external | ||
53 | factors. Example usage: Thermal management or other exceptional situations where | ||
54 | SoC framework might choose to disable a higher frequency OPP to safely continue | ||
55 | operations until that OPP could be re-enabled if possible. | ||
56 | |||
57 | OPP library facilitates this concept in its implementation. The following | ||
58 | operational functions operate only on available opps: | ||
59 | opp_find_freq_{ceil, floor}, opp_get_voltage, opp_get_freq, opp_get_opp_count | ||
60 | and opp_init_cpufreq_table | ||
61 | |||
62 | opp_find_freq_exact is meant to be used to find the opp pointer which can then | ||
63 | be used for opp_enable/disable functions to make an opp available as required. | ||
64 | |||
65 | WARNING: Users of OPP library should refresh their availability count using | ||
66 | opp_get_opp_count if opp_enable/disable functions are invoked for a device. The | ||
67 | exact mechanism to trigger these or the notification mechanism to other | ||
68 | dependent subsystems such as cpufreq are left to the discretion of the SoC | ||
69 | specific framework which uses the OPP library. Similar care needs to be | ||
70 | taken to refresh the cpufreq table in cases of these operations. | ||
71 | |||
72 | WARNING on OPP List locking mechanism: | ||
73 | ------------------------------------------------- | ||
74 | OPP library uses RCU for exclusivity. RCU allows the query functions to operate | ||
75 | in multiple contexts, and this synchronization mechanism is well suited to the | ||
76 | read-intensive operations on data structures that the OPP library caters to. | ||
77 | |||
78 | To ensure that the data retrieved are sane, the users such as SoC framework | ||
79 | should ensure that the section of code operating on OPP queries is locked | ||
80 | using RCU read locks. The opp_find_freq_{exact,ceil,floor}, | ||
81 | opp_get_{voltage, freq, opp_count} fall into this category. | ||
82 | |||
83 | opp_{add,enable,disable} are updaters which use a mutex and implement their own | ||
84 | RCU locking mechanisms. opp_init_cpufreq_table acts as an updater and uses a | ||
85 | mutex to implement the RCU updater strategy. These functions should *NOT* be called | ||
86 | under RCU locks and other contexts that prevent blocking functions in RCU or | ||
87 | mutex operations from working. | ||
88 | |||
89 | 2. Initial OPP List Registration | ||
90 | ================================ | ||
91 | The SoC implementation calls opp_add function iteratively to add OPPs per | ||
92 | device. It is expected that the SoC framework will register the OPP entries | ||
93 | optimally; typical numbers are fewer than 5. The list generated by | ||
94 | registering the OPPs is maintained by OPP library throughout the device | ||
95 | operation. The SoC framework can subsequently control the availability of the | ||
96 | OPPs dynamically using the opp_enable / disable functions. | ||
97 | |||
98 | opp_add - Add a new OPP for a specific domain represented by the device pointer. | ||
99 | The OPP is defined using the frequency and voltage. Once added, the OPP | ||
100 | is assumed to be available and control of it's availability can be done | ||
101 | with the opp_enable/disable functions. OPP library internally stores | ||
102 | and manages this information in the opp struct. This function may be | ||
103 | used by SoC framework to define an optimal list as per the demands of | ||
104 | SoC usage environment. | ||
105 | |||
106 | WARNING: Do not use this function in interrupt context. | ||
107 | |||
108 | Example: | ||
109 | soc_pm_init() | ||
110 | { | ||
111 | /* Do things */ | ||
112 | r = opp_add(mpu_dev, 1000000, 900000); | ||
113 | if (r) { | ||
114 | pr_err("%s: unable to register mpu opp(%d)\n", __func__, r); | ||
115 | goto no_cpufreq; | ||
116 | } | ||
117 | /* Do cpufreq things */ | ||
118 | no_cpufreq: | ||
119 | /* Do remaining things */ | ||
120 | } | ||
121 | |||
122 | 3. OPP Search Functions | ||
123 | ======================= | ||
124 | High level framework such as cpufreq operates on frequencies. To map the | ||
125 | frequency back to the corresponding OPP, OPP library provides handy functions | ||
126 | to search the OPP list that OPP library internally manages. These search | ||
127 | functions return the matching pointer representing the opp if a match is | ||
128 | found, else return an error. These errors are expected to be handled by standard | ||
129 | error checks such as IS_ERR() and appropriate actions taken by the caller. | ||
130 | |||
131 | opp_find_freq_exact - Search for an OPP based on an *exact* frequency and | ||
132 | availability. This function is especially useful to enable an OPP which | ||
133 | is not available by default. | ||
134 | Example: In a case when SoC framework detects a situation where a | ||
135 | higher frequency could be made available, it can use this function to | ||
136 | find the OPP prior to calling opp_enable to actually make it available. | ||
137 | rcu_read_lock(); | ||
138 | opp = opp_find_freq_exact(dev, 1000000000, false); | ||
139 | rcu_read_unlock(); | ||
140 | /* don't operate on the pointer.. just do a sanity check.. */ | ||
141 | if (IS_ERR(opp)) { | ||
142 | pr_err("frequency not disabled!\n"); | ||
143 | /* trigger appropriate actions.. */ | ||
144 | } else { | ||
145 | opp_enable(dev,1000000000); | ||
146 | } | ||
147 | |||
148 | NOTE: This is the only search function that operates on OPPs which are | ||
149 | not available. | ||
150 | |||
151 | opp_find_freq_floor - Search for an available OPP which is *at most* the | ||
152 | provided frequency. This function is useful while searching for a lesser | ||
153 | match OR operating on OPP information in the order of decreasing | ||
154 | frequency. | ||
155 | Example: To find the highest opp for a device: | ||
156 | freq = ULONG_MAX; | ||
157 | rcu_read_lock(); | ||
158 | opp = opp_find_freq_floor(dev, &freq); | ||
159 | rcu_read_unlock(); | ||
160 | |||
161 | opp_find_freq_ceil - Search for an available OPP which is *at least* the | ||
162 | provided frequency. This function is useful while searching for a | ||
163 | higher match OR operating on OPP information in the order of increasing | ||
164 | frequency. | ||
165 | Example 1: To find the lowest opp for a device: | ||
166 | freq = 0; | ||
167 | rcu_read_lock(); | ||
168 | opp = opp_find_freq_ceil(dev, &freq); | ||
169 | rcu_read_unlock(); | ||
170 | Example 2: A simplified implementation of a SoC cpufreq_driver->target: | ||
171 | soc_cpufreq_target(..) | ||
172 | { | ||
173 | /* Do stuff like policy checks etc. */ | ||
174 | /* Find the best frequency match for the req */ | ||
175 | rcu_read_lock(); | ||
176 | opp = opp_find_freq_ceil(dev, &freq); | ||
177 | rcu_read_unlock(); | ||
178 | if (!IS_ERR(opp)) | ||
179 | soc_switch_to_freq_voltage(freq); | ||
180 | else | ||
181 | /* do something when we can't satisfy the req */ | ||
182 | /* do other stuff */ | ||
183 | } | ||
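
The floor and ceil semantics described above can be sketched in user space. This is only a rough model, not the OPP library: struct model_opp, model_table and the model_find_freq_* helpers are hypothetical names, and the model returns NULL where the kernel functions return an ERR_PTR() value.

```c
#include <stddef.h>

/* Hypothetical stand-in for one OPP entry; not the kernel's struct opp. */
struct model_opp {
	unsigned long freq;	/* Hz */
	unsigned long u_volt;	/* microvolts */
	int available;		/* nonzero if the entry is enabled */
};

/* Table kept sorted by increasing frequency (an assumption of this model). */
static struct model_opp model_table[] = {
	{  300000000UL, 1000000UL, 1 },
	{  800000000UL, 1200000UL, 1 },
	{ 1000000000UL, 1300000UL, 0 },	/* present but disabled */
};
#define MODEL_COUNT (sizeof(model_table) / sizeof(model_table[0]))

/* Lowest available entry with freq >= *freq; updates *freq on success. */
static struct model_opp *model_find_freq_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < MODEL_COUNT; i++) {
		if (model_table[i].available && model_table[i].freq >= *freq) {
			*freq = model_table[i].freq;
			return &model_table[i];
		}
	}
	return NULL;	/* nothing available at or above the request */
}

/* Highest available entry with freq <= *freq; updates *freq on success. */
static struct model_opp *model_find_freq_floor(unsigned long *freq)
{
	size_t i = MODEL_COUNT;

	while (i-- > 0) {
		if (model_table[i].available && model_table[i].freq <= *freq) {
			*freq = model_table[i].freq;
			return &model_table[i];
		}
	}
	return NULL;	/* nothing available at or below the request */
}
```

In this model, starting from 0 with the ceil search yields the lowest available entry, and starting from the largest unsigned long with the floor search yields the highest one; the disabled 1GHz entry is skipped by both.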
184 | |||
185 | 4. OPP Availability Control Functions | ||
186 | ===================================== | ||
187 | A default OPP list registered with the OPP library may not cater to all possible | ||
188 | situations. The OPP library provides a set of functions to modify the | ||
189 | availability of an OPP within the OPP list. This allows SoC frameworks to have | ||
190 | fine-grained dynamic control of which sets of OPPs are operationally available. | ||
191 | These functions are intended to *temporarily* remove an OPP in conditions such | ||
192 | as thermal considerations (e.g. don't use OPPx until the temperature drops). | ||
193 | |||
194 | WARNING: Do not use these functions in interrupt context. | ||
195 | |||
196 | opp_enable - Make an OPP available for operation. | ||
197 | Example: Let's say that the 1GHz OPP is to be made available only if the | ||
198 | SoC temperature is lower than a certain threshold. The SoC framework | ||
199 | implementation might choose to do something as follows: | ||
200 | if (cur_temp < temp_low_thresh) { | ||
201 | /* Enable 1GHz if it was disabled */ | ||
202 | rcu_read_lock(); | ||
203 | opp = opp_find_freq_exact(dev, 1000000000, false); | ||
204 | rcu_read_unlock(); | ||
205 | /* just error check */ | ||
206 | if (!IS_ERR(opp)) | ||
207 | ret = opp_enable(dev, 1000000000); | ||
208 | else | ||
209 | goto try_something_else; | ||
210 | } | ||
211 | |||
212 | opp_disable - Make an OPP unavailable for operation. | ||
213 | Example: Let's say that the 1GHz OPP is to be disabled if the temperature | ||
214 | exceeds a threshold value. The SoC framework implementation might | ||
215 | choose to do something as follows: | ||
216 | if (cur_temp > temp_high_thresh) { | ||
217 | /* Disable 1GHz if it was enabled */ | ||
218 | rcu_read_lock(); | ||
219 | opp = opp_find_freq_exact(dev, 1000000000, true); | ||
220 | rcu_read_unlock(); | ||
221 | /* just error check */ | ||
222 | if (!IS_ERR(opp)) | ||
223 | ret = opp_disable(dev, 1000000000); | ||
224 | else | ||
225 | goto try_something_else; | ||
226 | } | ||
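
The two thermal examples above combine into a simple hysteresis policy, sketched here in user space. All names (struct model_opp, model_find_freq_exact, model_set_availability, model_thermal_update) and the threshold values used with them are hypothetical; the real opp_enable/opp_disable calls take a device pointer and are not modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for an OPP entry; not the kernel's struct opp. */
struct model_opp {
	unsigned long freq;	/* Hz */
	int available;		/* nonzero if the entry is enabled */
};

static struct model_opp model_table[] = {
	{  800000000UL, 1 },
	{ 1000000000UL, 1 },
};
#define MODEL_COUNT (sizeof(model_table) / sizeof(model_table[0]))

/* Match on both the frequency and the requested availability state. */
static struct model_opp *model_find_freq_exact(unsigned long freq, int available)
{
	size_t i;

	for (i = 0; i < MODEL_COUNT; i++) {
		if (model_table[i].freq == freq &&
		    !model_table[i].available == !available)
			return &model_table[i];
	}
	return NULL;
}

/* Returns 0 on success, -1 if no entry has that frequency. */
static int model_set_availability(unsigned long freq, int available)
{
	size_t i;

	for (i = 0; i < MODEL_COUNT; i++) {
		if (model_table[i].freq == freq) {
			model_table[i].available = !!available;
			return 0;
		}
	}
	return -1;
}

/* Disable 1GHz above the high threshold, re-enable it below the low one. */
static void model_thermal_update(int cur_temp, int temp_low_thresh,
				 int temp_high_thresh)
{
	if (cur_temp > temp_high_thresh) {
		if (model_find_freq_exact(1000000000UL, 1))
			model_set_availability(1000000000UL, 0);
	} else if (cur_temp < temp_low_thresh) {
		if (model_find_freq_exact(1000000000UL, 0))
			model_set_availability(1000000000UL, 1);
	}
}
```

Between the two thresholds the model leaves the OPP in whatever state it was last put in, which is what keeps it from toggling on every small temperature change.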
227 | |||
228 | 5. OPP Data Retrieval Functions | ||
229 | =============================== | ||
230 | Since the OPP library abstracts away the OPP information, a set of functions to pull | ||
231 | information from the OPP structure is necessary. Once an OPP pointer is | ||
232 | retrieved using the search functions, the following functions can be used by the SoC | ||
233 | framework to retrieve the information represented inside the OPP layer. | ||
234 | |||
235 | opp_get_voltage - Retrieve the voltage represented by the opp pointer. | ||
236 | Example: At a cpufreq transition to a different frequency, the SoC | ||
237 | framework needs to set the voltage represented by the OPP, using | ||
238 | the regulator framework, on the Power Management chip providing the | ||
239 | voltage. | ||
240 | soc_switch_to_freq_voltage(freq) | ||
241 | { | ||
242 | /* do things */ | ||
243 | rcu_read_lock(); | ||
244 | opp = opp_find_freq_ceil(dev, &freq); | ||
245 | v = opp_get_voltage(opp); | ||
246 | rcu_read_unlock(); | ||
247 | if (v) | ||
248 | regulator_set_voltage(.., v); | ||
249 | /* do other things */ | ||
250 | } | ||
251 | |||
252 | opp_get_freq - Retrieve the freq represented by the opp pointer. | ||
253 | Example: Let's say the SoC framework uses a couple of helper functions; | ||
254 | we could pass opp pointers to them instead of passing additional | ||
255 | parameters to handle quite a bit of data. | ||
256 | soc_cpufreq_target(..) | ||
257 | { | ||
258 | /* do things.. */ | ||
259 | max_freq = ULONG_MAX; | ||
260 | rcu_read_lock(); | ||
261 | max_opp = opp_find_freq_floor(dev, &max_freq); | ||
262 | requested_opp = opp_find_freq_ceil(dev, &freq); | ||
263 | if (!IS_ERR(max_opp) && !IS_ERR(requested_opp)) | ||
264 | r = soc_test_validity(max_opp, requested_opp); | ||
265 | rcu_read_unlock(); | ||
266 | /* do other things */ | ||
267 | } | ||
268 | soc_test_validity(..) | ||
269 | { | ||
270 | if (opp_get_voltage(max_opp) < opp_get_voltage(requested_opp)) | ||
271 | return -EINVAL; | ||
272 | if (opp_get_freq(max_opp) < opp_get_freq(requested_opp)) | ||
273 | return -EINVAL; | ||
274 | /* do things.. */ | ||
275 | } | ||
276 | |||
277 | opp_get_opp_count - Retrieve the number of available opps for a device. | ||
278 | Example: Let's say a co-processor in the SoC needs to know the available | ||
279 | frequencies in a table; the main processor can notify it as follows: | ||
280 | soc_notify_coproc_available_frequencies() | ||
281 | { | ||
282 | /* Do things */ | ||
283 | rcu_read_lock(); | ||
284 | num_available = opp_get_opp_count(dev); | ||
285 | speeds = kzalloc(sizeof(u32) * num_available, GFP_ATOMIC); /* must not sleep under rcu_read_lock() */ | ||
286 | /* populate the table in increasing order */ | ||
287 | freq = 0; | ||
288 | while (!IS_ERR(opp = opp_find_freq_ceil(dev, &freq))) { | ||
289 | speeds[i] = freq; | ||
290 | freq++; | ||
291 | i++; | ||
292 | } | ||
293 | rcu_read_unlock(); | ||
294 | |||
295 | soc_notify_coproc(AVAILABLE_FREQs, speeds, num_available); | ||
296 | /* Do other things */ | ||
297 | } | ||
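
The increasing-order walk used in the example above (search with ceil, record the result, bump the frequency by one, repeat) can be sketched in user space as follows. This is a model under stated assumptions: model_table, model_get_opp_count, model_find_freq_ceil and model_fill_speeds are hypothetical names, NULL stands in for the kernel's ERR_PTR() return, and the loop is additionally bounded by the buffer capacity.

```c
#include <stddef.h>

/* Hypothetical stand-in for an OPP entry; not the kernel's struct opp. */
struct model_opp {
	unsigned long freq;	/* Hz */
	int available;		/* nonzero if the entry is enabled */
};

/* Sorted by increasing frequency (an assumption of this model). */
static const struct model_opp model_table[] = {
	{  300000000UL, 1 },
	{  800000000UL, 1 },
	{ 1000000000UL, 0 },	/* disabled, so not counted or returned */
};
#define MODEL_COUNT (sizeof(model_table) / sizeof(model_table[0]))

/* Number of available entries. */
static size_t model_get_opp_count(void)
{
	size_t i, n = 0;

	for (i = 0; i < MODEL_COUNT; i++)
		if (model_table[i].available)
			n++;
	return n;
}

/* Lowest available entry with freq >= *freq; updates *freq on success. */
static const struct model_opp *model_find_freq_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < MODEL_COUNT; i++) {
		if (model_table[i].available && model_table[i].freq >= *freq) {
			*freq = model_table[i].freq;
			return &model_table[i];
		}
	}
	return NULL;
}

/* Fill speeds[] (capacity max) in increasing order; returns entries written. */
static size_t model_fill_speeds(unsigned long *speeds, size_t max)
{
	unsigned long freq = 0;
	size_t i = 0;

	while (i < max && model_find_freq_ceil(&freq) != NULL) {
		speeds[i++] = freq;
		freq++;	/* step past the match so the next search moves on */
	}
	return i;
}
```

Bounding the loop by the capacity guards against the count and the walk disagreeing, for example if availability changes between the two steps.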
298 | |||
299 | 6. Cpufreq Table Generation | ||
300 | =========================== | ||
301 | opp_init_cpufreq_table - The cpufreq framework is typically initialized with | ||
302 | cpufreq_frequency_table_cpuinfo, which is provided with the list of | ||
303 | frequencies that are available for operation. This function provides | ||
304 | a ready-to-use conversion routine to translate the OPP layer's internal | ||
305 | information about the available frequencies into a format that can be | ||
306 | passed directly to cpufreq. | ||
307 | |||
308 | WARNING: Do not use this function in interrupt context. | ||
309 | |||
310 | Example: | ||
311 | soc_pm_init() | ||
312 | { | ||
313 | /* Do things */ | ||
314 | r = opp_init_cpufreq_table(dev, &freq_table); | ||
315 | if (!r) | ||
316 | cpufreq_frequency_table_cpuinfo(policy, freq_table); | ||
317 | /* Do other things */ | ||
318 | } | ||
319 | |||
320 | NOTE: This function is available only if CONFIG_CPU_FREQ is enabled in | ||
321 | addition to CONFIG_PM, as the power management feature is required to | ||
322 | dynamically scale voltage and frequency in a system. | ||
323 | |||
324 | 7. Data Structures | ||
325 | ================== | ||
326 | Typically an SoC contains multiple voltage domains which are variable. Each | ||
327 | domain is represented by a device pointer. The relationship to OPP can be | ||
328 | represented as follows: | ||
329 | SoC | ||
330 | |- device 1 | ||
331 | | |- opp 1 (availability, freq, voltage) | ||
332 | | |- opp 2 .. | ||
333 | ... ... | ||
334 | | `- opp n .. | ||
335 | |- device 2 | ||
336 | ... | ||
337 | `- device m | ||
338 | |||
339 | The OPP library maintains an internal list that the SoC framework populates and | ||
340 | that is accessed by the various functions described above. However, the structures | ||
341 | representing the actual OPPs and domains are internal to the OPP library itself | ||
342 | to allow for suitable abstraction reusable across systems. | ||
343 | |||
344 | struct opp - The internal data structure of the OPP library which is used to | ||
345 | represent an OPP. In addition to the freq, voltage and availability | ||
346 | information, it also contains internal bookkeeping information required | ||
347 | for the OPP library to operate. A pointer to this structure is | ||
348 | provided back to users such as the SoC framework to be used as an | ||
349 | identifier for the OPP in interactions with the OPP layer. | ||
350 | |||
351 | WARNING: The struct opp pointer should not be parsed or modified by the | ||
352 | users. The defaults for an instance are populated by opp_add, but the | ||
353 | availability of the OPP can be modified by the opp_enable/opp_disable functions. | ||
354 | |||
355 | struct device - This is used to identify a domain to the OPP layer. The | ||
356 | nature of the device and its implementation is left to the user of | ||
357 | the OPP library, such as the SoC framework. | ||
358 | |||
359 | Overall, in a simplistic view, the data structure operations are represented as | ||
360 | follows: | ||
361 | |||
362 | Initialization / modification: | ||
363 | +-----+ /- opp_enable | ||
364 | opp_add --> | opp | <------- | ||
365 | | +-----+ \- opp_disable | ||
366 | \-------> domain_info(device) | ||
367 | |||
368 | Search functions: | ||
369 | /-- opp_find_freq_ceil ---\ +-----+ | ||
370 | domain_info<---- opp_find_freq_exact -----> | opp | | ||
371 | \-- opp_find_freq_floor ---/ +-----+ | ||
372 | |||
373 | Retrieval functions: | ||
374 | +-----+ /- opp_get_voltage | ||
375 | | opp | <--- | ||
376 | +-----+ \- opp_get_freq | ||
377 | |||
378 | domain_info <- opp_get_opp_count | ||
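
The device-to-OPP relationship drawn above can be pictured with a small user-space sketch. The real structures are internal to the OPP library, so everything here is hypothetical: struct model_domain, struct model_opp, model_opp_add and model_get_opp_count are illustrative names, and the sketch assumes that a newly added entry starts out available and that the per-device list is kept sorted by frequency.

```c
#include <stddef.h>

/* Hypothetical OPP node: availability, freq, voltage, plus list linkage. */
struct model_opp {
	unsigned long freq;	/* Hz */
	unsigned long u_volt;	/* microvolts */
	int available;		/* nonzero if the entry is enabled */
	struct model_opp *next;
};

/* Hypothetical per-device domain: an opaque device handle and its OPPs. */
struct model_domain {
	const void *dev;
	struct model_opp *opps;	/* sorted by increasing frequency */
};

/* Link a caller-provided node into the domain, keeping frequency order. */
static void model_opp_add(struct model_domain *d, struct model_opp *opp,
			  unsigned long freq, unsigned long u_volt)
{
	struct model_opp **pos = &d->opps;

	opp->freq = freq;
	opp->u_volt = u_volt;
	opp->available = 1;	/* assumed default for a new entry */

	while (*pos && (*pos)->freq < freq)
		pos = &(*pos)->next;
	opp->next = *pos;
	*pos = opp;
}

/* Count only the entries that are currently available. */
static size_t model_get_opp_count(const struct model_domain *d)
{
	const struct model_opp *opp;
	size_t n = 0;

	for (opp = d->opps; opp; opp = opp->next)
		if (opp->available)
			n++;
	return n;
}
```

Keeping the list sorted at insertion time is what lets the floor and ceil searches in this kind of model stop at the first match.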
diff --git a/Documentation/power/regulator/machine.txt b/Documentation/power/regulator/machine.txt index bdec39b9bd75..b42419b52e44 100644 --- a/Documentation/power/regulator/machine.txt +++ b/Documentation/power/regulator/machine.txt | |||
@@ -53,11 +53,11 @@ static struct regulator_init_data regulator1_data = { | |||
53 | 53 | ||
54 | Regulator-1 supplies power to Regulator-2. This relationship must be registered | 54 | Regulator-1 supplies power to Regulator-2. This relationship must be registered |
55 | with the core so that Regulator-1 is also enabled when Consumer A enables its | 55 | with the core so that Regulator-1 is also enabled when Consumer A enables its |
56 | supply (Regulator-2). The supply regulator is set by the supply_regulator_dev | 56 | supply (Regulator-2). The supply regulator is set by the supply_regulator |
57 | field below:- | 57 | field below:- |
58 | 58 | ||
59 | static struct regulator_init_data regulator2_data = { | 59 | static struct regulator_init_data regulator2_data = { |
60 | .supply_regulator_dev = &platform_regulator1_device.dev, | 60 | .supply_regulator = "regulator_name", |
61 | .constraints = { | 61 | .constraints = { |
62 | .min_uV = 1800000, | 62 | .min_uV = 1800000, |
63 | .max_uV = 2000000, | 63 | .max_uV = 2000000, |
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt index 55b859b3bc72..b24875b1ced5 100644 --- a/Documentation/power/runtime_pm.txt +++ b/Documentation/power/runtime_pm.txt | |||
@@ -1,6 +1,7 @@ | |||
1 | Run-time Power Management Framework for I/O Devices | 1 | Run-time Power Management Framework for I/O Devices |
2 | 2 | ||
3 | (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. | 3 | (C) 2009-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. |
4 | (C) 2010 Alan Stern <stern@rowland.harvard.edu> | ||
4 | 5 | ||
5 | 1. Introduction | 6 | 1. Introduction |
6 | 7 | ||
@@ -43,11 +44,21 @@ struct dev_pm_ops { | |||
43 | }; | 44 | }; |
44 | 45 | ||
45 | The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks are | 46 | The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks are |
46 | executed by the PM core for either the bus type, or device type (if the bus | 47 | executed by the PM core for either the device type, or the class (if the device |
47 | type's callback is not defined), or device class (if the bus type's and device | 48 | type's struct dev_pm_ops object does not exist), or the bus type (if the |
48 | type's callbacks are not defined) of given device. The bus type, device type | 49 | device type's and class' struct dev_pm_ops objects do not exist) of the given |
49 | and device class callbacks are referred to as subsystem-level callbacks in what | 50 | device (this allows device types to override callbacks provided by bus types or |
50 | follows. | 51 | classes if necessary). The bus type, device type and class callbacks are |
52 | referred to as subsystem-level callbacks in what follows. | ||
53 | |||
54 | By default, the callbacks are always invoked in process context with interrupts | ||
55 | enabled. However, subsystems can use the pm_runtime_irq_safe() helper function | ||
56 | to tell the PM core that a device's ->runtime_suspend() and ->runtime_resume() | ||
57 | callbacks should be invoked in atomic context with interrupts disabled | ||
58 | (->runtime_idle() is still invoked the default way). This implies that these | ||
59 | callback routines must not block or sleep, but it also means that the | ||
60 | synchronous helper functions listed at the end of Section 4 can be used within | ||
61 | an interrupt handler or in an atomic context. | ||
51 | 62 | ||
52 | The subsystem-level suspend callback is _entirely_ _responsible_ for handling | 63 | The subsystem-level suspend callback is _entirely_ _responsible_ for handling |
53 | the suspend of the device as appropriate, which may, but need not include | 64 | the suspend of the device as appropriate, which may, but need not include |
@@ -157,7 +168,8 @@ rules: | |||
157 | to execute it, the other callbacks will not be executed for the same device. | 168 | to execute it, the other callbacks will not be executed for the same device. |
158 | 169 | ||
159 | * A request to execute ->runtime_resume() will cancel any pending or | 170 | * A request to execute ->runtime_resume() will cancel any pending or |
160 | scheduled requests to execute the other callbacks for the same device. | 171 | scheduled requests to execute the other callbacks for the same device, |
172 | except for scheduled autosuspends. | ||
161 | 173 | ||
162 | 3. Run-time PM Device Fields | 174 | 3. Run-time PM Device Fields |
163 | 175 | ||
@@ -165,7 +177,7 @@ The following device run-time PM fields are present in 'struct dev_pm_info', as | |||
165 | defined in include/linux/pm.h: | 177 | defined in include/linux/pm.h: |
166 | 178 | ||
167 | struct timer_list suspend_timer; | 179 | struct timer_list suspend_timer; |
168 | - timer used for scheduling (delayed) suspend request | 180 | - timer used for scheduling (delayed) suspend and autosuspend requests |
169 | 181 | ||
170 | unsigned long timer_expires; | 182 | unsigned long timer_expires; |
171 | - timer expiration time, in jiffies (if this is different from zero, the | 183 | - timer expiration time, in jiffies (if this is different from zero, the |
@@ -230,6 +242,32 @@ defined in include/linux/pm.h: | |||
230 | interface; it may only be modified with the help of the pm_runtime_allow() | 242 | interface; it may only be modified with the help of the pm_runtime_allow() |
231 | and pm_runtime_forbid() helper functions | 243 | and pm_runtime_forbid() helper functions |
232 | 244 | ||
245 | unsigned int no_callbacks; | ||
246 | - indicates that the device does not use the run-time PM callbacks (see | ||
247 | Section 8); it may be modified only by the pm_runtime_no_callbacks() | ||
248 | helper function | ||
249 | |||
250 | unsigned int irq_safe; | ||
251 | - indicates that the ->runtime_suspend() and ->runtime_resume() callbacks | ||
252 | will be invoked with the spinlock held and interrupts disabled | ||
253 | |||
254 | unsigned int use_autosuspend; | ||
255 | - indicates that the device's driver supports delayed autosuspend (see | ||
256 | Section 9); it may be modified only by the | ||
257 | pm_runtime{_dont}_use_autosuspend() helper functions | ||
258 | |||
259 | unsigned int timer_autosuspends; | ||
260 | - indicates that the PM core should attempt to carry out an autosuspend | ||
261 | when the timer expires rather than a normal suspend | ||
262 | |||
263 | int autosuspend_delay; | ||
264 | - the delay time (in milliseconds) to be used for autosuspend | ||
265 | |||
266 | unsigned long last_busy; | ||
267 | - the time (in jiffies) when the pm_runtime_mark_last_busy() helper | ||
268 | function was last called for this device; used in calculating inactivity | ||
269 | periods for autosuspend | ||
270 | |||
233 | All of the above fields are members of the 'power' member of 'struct device'. | 271 | All of the above fields are members of the 'power' member of 'struct device'. |
234 | 272 | ||
235 | 4. Run-time PM Device Helper Functions | 273 | 4. Run-time PM Device Helper Functions |
@@ -255,6 +293,12 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
255 | error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt | 293 | error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt |
256 | to suspend the device again in future | 294 | to suspend the device again in future |
257 | 295 | ||
296 | int pm_runtime_autosuspend(struct device *dev); | ||
297 | - same as pm_runtime_suspend() except that the autosuspend delay is taken | ||
298 | into account; if pm_runtime_autosuspend_expiration() says the delay has | ||
299 | not yet expired then an autosuspend is scheduled for the appropriate time | ||
300 | and 0 is returned | ||
301 | |||
258 | int pm_runtime_resume(struct device *dev); | 302 | int pm_runtime_resume(struct device *dev); |
259 | - execute the subsystem-level resume callback for the device; returns 0 on | 303 | - execute the subsystem-level resume callback for the device; returns 0 on |
260 | success, 1 if the device's run-time PM status was already 'active' or | 304 | success, 1 if the device's run-time PM status was already 'active' or |
@@ -267,6 +311,11 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
267 | device (the request is represented by a work item in pm_wq); returns 0 on | 311 | device (the request is represented by a work item in pm_wq); returns 0 on |
268 | success or error code if the request has not been queued up | 312 | success or error code if the request has not been queued up |
269 | 313 | ||
314 | int pm_request_autosuspend(struct device *dev); | ||
315 | - schedule the execution of the subsystem-level suspend callback for the | ||
316 | device when the autosuspend delay has expired; if the delay has already | ||
317 | expired then the work item is queued up immediately | ||
318 | |||
270 | int pm_schedule_suspend(struct device *dev, unsigned int delay); | 319 | int pm_schedule_suspend(struct device *dev, unsigned int delay); |
271 | - schedule the execution of the subsystem-level suspend callback for the | 320 | - schedule the execution of the subsystem-level suspend callback for the |
272 | device in future, where 'delay' is the time to wait before queuing up a | 321 | device in future, where 'delay' is the time to wait before queuing up a |
@@ -298,12 +347,24 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
298 | - decrement the device's usage counter | 347 | - decrement the device's usage counter |
299 | 348 | ||
300 | int pm_runtime_put(struct device *dev); | 349 | int pm_runtime_put(struct device *dev); |
301 | - decrement the device's usage counter, run pm_request_idle(dev) and return | 350 | - decrement the device's usage counter; if the result is 0 then run |
302 | its result | 351 | pm_request_idle(dev) and return its result |
352 | |||
353 | int pm_runtime_put_autosuspend(struct device *dev); | ||
354 | - decrement the device's usage counter; if the result is 0 then run | ||
355 | pm_request_autosuspend(dev) and return its result | ||
303 | 356 | ||
304 | int pm_runtime_put_sync(struct device *dev); | 357 | int pm_runtime_put_sync(struct device *dev); |
305 | - decrement the device's usage counter, run pm_runtime_idle(dev) and return | 358 | - decrement the device's usage counter; if the result is 0 then run |
306 | its result | 359 | pm_runtime_idle(dev) and return its result |
360 | |||
361 | int pm_runtime_put_sync_suspend(struct device *dev); | ||
362 | - decrement the device's usage counter; if the result is 0 then run | ||
363 | pm_runtime_suspend(dev) and return its result | ||
364 | |||
365 | int pm_runtime_put_sync_autosuspend(struct device *dev); | ||
366 | - decrement the device's usage counter; if the result is 0 then run | ||
367 | pm_runtime_autosuspend(dev) and return its result | ||
307 | 368 | ||
308 | void pm_runtime_enable(struct device *dev); | 369 | void pm_runtime_enable(struct device *dev); |
309 | - enable the run-time PM helper functions to run the device bus type's | 370 | - enable the run-time PM helper functions to run the device bus type's |
@@ -336,8 +397,8 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
336 | zero) | 397 | zero) |
337 | 398 | ||
338 | bool pm_runtime_suspended(struct device *dev); | 399 | bool pm_runtime_suspended(struct device *dev); |
339 | - return true if the device's runtime PM status is 'suspended', or false | 400 | - return true if the device's runtime PM status is 'suspended' and its |
340 | otherwise | 401 | 'power.disable_depth' field is equal to zero, or false otherwise |
341 | 402 | ||
342 | void pm_runtime_allow(struct device *dev); | 403 | void pm_runtime_allow(struct device *dev); |
343 | - set the power.runtime_auto flag for the device and decrease its usage | 404 | - set the power.runtime_auto flag for the device and decrease its usage |
@@ -349,19 +410,65 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
349 | counter (used by the /sys/devices/.../power/control interface to | 410 | counter (used by the /sys/devices/.../power/control interface to |
350 | effectively prevent the device from being power managed at run time) | 411 | effectively prevent the device from being power managed at run time) |
351 | 412 | ||
413 | void pm_runtime_no_callbacks(struct device *dev); | ||
414 | - set the power.no_callbacks flag for the device and remove the run-time | ||
415 | PM attributes from /sys/devices/.../power (or prevent them from being | ||
416 | added when the device is registered) | ||
417 | |||
418 | void pm_runtime_irq_safe(struct device *dev); | ||
419 | - set the power.irq_safe flag for the device, causing the runtime-PM | ||
420 | suspend and resume callbacks (but not the idle callback) to be invoked | ||
421 | with interrupts disabled | ||
422 | |||
423 | void pm_runtime_mark_last_busy(struct device *dev); | ||
424 | - set the power.last_busy field to the current time | ||
425 | |||
426 | void pm_runtime_use_autosuspend(struct device *dev); | ||
427 | - set the power.use_autosuspend flag, enabling autosuspend delays | ||
428 | |||
429 | void pm_runtime_dont_use_autosuspend(struct device *dev); | ||
430 | - clear the power.use_autosuspend flag, disabling autosuspend delays | ||
431 | |||
432 | void pm_runtime_set_autosuspend_delay(struct device *dev, int delay); | ||
433 | - set the power.autosuspend_delay value to 'delay' (expressed in | ||
434 | milliseconds); if 'delay' is negative then run-time suspends are | ||
435 | prevented | ||
436 | |||
437 | unsigned long pm_runtime_autosuspend_expiration(struct device *dev); | ||
438 | - calculate the time when the current autosuspend delay period will expire, | ||
439 | based on power.last_busy and power.autosuspend_delay; if the delay time | ||
440 | is 1000 ms or larger then the expiration time is rounded up to the | ||
441 | nearest second; returns 0 if the delay period has already expired or | ||
442 | power.use_autosuspend isn't set, otherwise returns the expiration time | ||
443 | in jiffies | ||
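
The expiration rule described for pm_runtime_autosuspend_expiration() can be illustrated with a small user-space model. It is a sketch of the documented behaviour only, not the kernel code: struct model_pm and model_autosuspend_expiration are hypothetical names, time is kept in milliseconds rather than jiffies, counter wrap-around is ignored, and the negative-delay case (which prevents suspends) is left out.

```c
/* Hypothetical subset of the autosuspend fields, in milliseconds. */
struct model_pm {
	int use_autosuspend;			/* nonzero if autosuspend is enabled */
	unsigned int autosuspend_delay_ms;	/* inactivity period */
	unsigned long last_busy_ms;		/* time of last recorded activity */
};

/*
 * Returns the expiration time in milliseconds, or 0 if autosuspend is not
 * in use or the delay period has already expired at time now_ms.
 */
static unsigned long model_autosuspend_expiration(const struct model_pm *pm,
						  unsigned long now_ms)
{
	unsigned long expires;

	if (!pm->use_autosuspend)
		return 0;

	expires = pm->last_busy_ms + pm->autosuspend_delay_ms;

	/* Delays of one second or more are rounded up to a whole second. */
	if (pm->autosuspend_delay_ms >= 1000)
		expires = (expires + 999) / 1000 * 1000;

	return expires > now_ms ? expires : 0;
}
```

A caller in this model treats a zero return as "suspend now" and any other value as the time at which to retry.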
444 | |||
352 | It is safe to execute the following helper functions from interrupt context: | 445 | It is safe to execute the following helper functions from interrupt context: |
353 | 446 | ||
354 | pm_request_idle() | 447 | pm_request_idle() |
448 | pm_request_autosuspend() | ||
355 | pm_schedule_suspend() | 449 | pm_schedule_suspend() |
356 | pm_request_resume() | 450 | pm_request_resume() |
357 | pm_runtime_get_noresume() | 451 | pm_runtime_get_noresume() |
358 | pm_runtime_get() | 452 | pm_runtime_get() |
359 | pm_runtime_put_noidle() | 453 | pm_runtime_put_noidle() |
360 | pm_runtime_put() | 454 | pm_runtime_put() |
455 | pm_runtime_put_autosuspend() | ||
456 | pm_runtime_enable() | ||
361 | pm_suspend_ignore_children() | 457 | pm_suspend_ignore_children() |
362 | pm_runtime_set_active() | 458 | pm_runtime_set_active() |
363 | pm_runtime_set_suspended() | 459 | pm_runtime_set_suspended() |
364 | pm_runtime_enable() | 460 | pm_runtime_suspended() |
461 | pm_runtime_mark_last_busy() | ||
462 | pm_runtime_autosuspend_expiration() | ||
463 | |||
464 | If pm_runtime_irq_safe() has been called for a device then the following helper | ||
465 | functions may also be used in interrupt context: | ||
466 | |||
467 | pm_runtime_suspend() | ||
468 | pm_runtime_autosuspend() | ||
469 | pm_runtime_resume() | ||
470 | pm_runtime_get_sync() | ||
471 | pm_runtime_put_sync_suspend() | ||
365 | 472 | ||
366 | 5. Run-time PM Initialization, Device Probing and Removal | 473 | 5. Run-time PM Initialization, Device Probing and Removal |
367 | 474 | ||
@@ -394,13 +501,29 @@ helper functions described in Section 4. In that case, pm_runtime_resume() | |||
394 | should be used. Of course, for this purpose the device's run-time PM has to be | 501 | should be used. Of course, for this purpose the device's run-time PM has to be |
395 | enabled earlier by calling pm_runtime_enable(). | 502 | enabled earlier by calling pm_runtime_enable(). |
396 | 503 | ||
397 | If the device bus type's or driver's ->probe() or ->remove() callback runs | 504 | If the device bus type's or driver's ->probe() callback runs |
398 | pm_runtime_suspend() or pm_runtime_idle() or their asynchronous counterparts, | 505 | pm_runtime_suspend() or pm_runtime_idle() or their asynchronous counterparts, |
399 | they will fail returning -EAGAIN, because the device's usage counter is | 506 | they will fail returning -EAGAIN, because the device's usage counter is |
400 | incremented by the core before executing ->probe() and ->remove(). Still, it | 507 | incremented by the driver core before executing ->probe(). Still, it may be |
401 | may be desirable to suspend the device as soon as ->probe() or ->remove() has | 508 | desirable to suspend the device as soon as ->probe() has finished, so the driver |
402 | finished, so the PM core uses pm_runtime_idle_sync() to invoke the | 509 | core uses pm_runtime_put_sync() to invoke the subsystem-level idle callback for |
403 | subsystem-level idle callback for the device at that time. | 510 | the device at that time. |
511 | |||
512 | Moreover, the driver core prevents runtime PM callbacks from racing with the bus | ||
513 | notifier callback in __device_release_driver(), which is necessary, because the | ||
514 | notifier is used by some subsystems to carry out operations affecting the | ||
515 | runtime PM functionality. It does so by calling pm_runtime_get_sync() before | ||
516 | driver_sysfs_remove() and the BUS_NOTIFY_UNBIND_DRIVER notifications. This | ||
517 | resumes the device if it's in the suspended state and prevents it from | ||
518 | being suspended again while those routines are being executed. | ||
519 | |||
520 | To allow bus types and drivers to put devices into the suspended state by | ||
521 | calling pm_runtime_suspend() from their ->remove() routines, the driver core | ||
522 | executes pm_runtime_put_sync() after running the BUS_NOTIFY_UNBIND_DRIVER | ||
523 | notifications in __device_release_driver(). This requires bus types and | ||
524 | drivers to make their ->remove() callbacks avoid races with runtime PM directly, | ||
525 | but also it allows of more flexibility in the handling of devices during the | ||
526 | removal of their drivers. | ||
404 | 527 | ||
405 | The user space can effectively disallow the driver of the device to power manage | 528 | The user space can effectively disallow the driver of the device to power manage |
406 | it at run time by changing the value of its /sys/devices/.../power/control | 529 | it at run time by changing the value of its /sys/devices/.../power/control |
@@ -459,11 +582,6 @@ to do this is: | |||
459 | pm_runtime_set_active(dev); | 582 | pm_runtime_set_active(dev); |
460 | pm_runtime_enable(dev); | 583 | pm_runtime_enable(dev); |
461 | 584 | ||
462 | The PM core always increments the run-time usage counter before calling the | ||
463 | ->prepare() callback and decrements it after calling the ->complete() callback. | ||
464 | Hence disabling run-time PM temporarily like this will not cause any run-time | ||
465 | suspend callbacks to be lost. | ||
466 | |||
467 | 7. Generic subsystem callbacks | 585 | 7. Generic subsystem callbacks |
468 | 586 | ||
469 | Subsystems may wish to conserve code space by using the set of generic power | 587 | Subsystems may wish to conserve code space by using the set of generic power |
@@ -524,3 +642,141 @@ poweroff and run-time suspend callback, and similarly for system resume, thaw, | |||
524 | restore, and run-time resume, can achieve this with the help of the | 642 | restore, and run-time resume, can achieve this with the help of the |
525 | UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its | 643 | UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its |
526 | last argument to NULL). | 644 | last argument to NULL). |
645 | |||
646 | 8. "No-Callback" Devices | ||
647 | |||
648 | Some "devices" are only logical sub-devices of their parent and cannot be | ||
649 | power-managed on their own. (The prototype example is a USB interface. Entire | ||
650 | USB devices can go into low-power mode or send wake-up requests, but neither is | ||
651 | possible for individual interfaces.) The drivers for these devices have no | ||
652 | need of run-time PM callbacks; if the callbacks did exist, ->runtime_suspend() | ||
653 | and ->runtime_resume() would always return 0 without doing anything else and | ||
654 | ->runtime_idle() would always call pm_runtime_suspend(). | ||
655 | |||
656 | Subsystems can tell the PM core about these devices by calling | ||
657 | pm_runtime_no_callbacks(). This should be done after the device structure is | ||
658 | initialized and before it is registered (although after device registration is | ||
659 | also okay). The routine will set the device's power.no_callbacks flag and | ||
660 | prevent the non-debugging run-time PM sysfs attributes from being created. | ||
661 | |||
662 | When power.no_callbacks is set, the PM core will not invoke the | ||
663 | ->runtime_idle(), ->runtime_suspend(), or ->runtime_resume() callbacks. | ||
664 | Instead it will assume that suspends and resumes always succeed and that idle | ||
665 | devices should be suspended. | ||
666 | |||
667 | As a consequence, the PM core will never directly inform the device's subsystem | ||
668 | or driver about run-time power changes. Instead, the driver for the device's | ||
669 | parent must take responsibility for telling the device's driver when the | ||
670 | parent's power state changes. | ||
671 | |||
672 | 9. Autosuspend, or automatically-delayed suspends | ||
673 | |||
674 | Changing a device's power state isn't free; it requires both time and energy. | ||
675 | A device should be put in a low-power state only when there's some reason to | ||
676 | think it will remain in that state for a substantial time. A common heuristic | ||
677 | says that a device which hasn't been used for a while is liable to remain | ||
678 | unused; following this advice, drivers should not allow devices to be suspended | ||
679 | at run-time until they have been inactive for some minimum period. Even when | ||
680 | the heuristic ends up being non-optimal, it will still prevent devices from | ||
681 | "bouncing" too rapidly between low-power and full-power states. | ||
682 | |||
683 | The term "autosuspend" is an historical remnant. It doesn't mean that the | ||
684 | device is automatically suspended (the subsystem or driver still has to call | ||
685 | the appropriate PM routines); rather it means that run-time suspends will | ||
686 | automatically be delayed until the desired period of inactivity has elapsed. | ||
687 | |||
688 | Inactivity is determined based on the power.last_busy field. Drivers should | ||
689 | call pm_runtime_mark_last_busy() to update this field after carrying out I/O, | ||
690 | typically just before calling pm_runtime_put_autosuspend(). The desired length | ||
691 | of the inactivity period is a matter of policy. Subsystems can set this length | ||
692 | initially by calling pm_runtime_set_autosuspend_delay(), but after device | ||
693 | registration the length should be controlled by user space, using the | ||
694 | /sys/devices/.../power/autosuspend_delay_ms attribute. | ||
695 | |||
696 | In order to use autosuspend, subsystems or drivers must call | ||
697 | pm_runtime_use_autosuspend() (preferably before registering the device), and | ||
698 | thereafter they should use the various *_autosuspend() helper functions instead | ||
699 | of the non-autosuspend counterparts: | ||
700 | |||
701 | Instead of: pm_runtime_suspend use: pm_runtime_autosuspend; | ||
702 | Instead of: pm_schedule_suspend use: pm_request_autosuspend; | ||
703 | Instead of: pm_runtime_put use: pm_runtime_put_autosuspend; | ||
704 | Instead of: pm_runtime_put_sync use: pm_runtime_put_sync_autosuspend. | ||
705 | |||
706 | Drivers may also continue to use the non-autosuspend helper functions; they | ||
707 | will behave normally, not taking the autosuspend delay into account. | ||
708 | Similarly, if the power.use_autosuspend field isn't set then the autosuspend | ||
709 | helper functions will behave just like the non-autosuspend counterparts. | ||
710 | |||
711 | The implementation is well suited for asynchronous use in interrupt contexts. | ||
712 | However such use inevitably involves races, because the PM core can't | ||
713 | synchronize ->runtime_suspend() callbacks with the arrival of I/O requests. | ||
714 | This synchronization must be handled by the driver, using its private lock. | ||
715 | Here is a schematic pseudo-code example: | ||
716 | |||
717 | foo_read_or_write(struct foo_priv *foo, void *data) | ||
718 | { | ||
719 | lock(&foo->private_lock); | ||
720 | add_request_to_io_queue(foo, data); | ||
721 | if (foo->num_pending_requests++ == 0) | ||
722 | pm_runtime_get(&foo->dev); | ||
723 | if (!foo->is_suspended) | ||
724 | foo_process_next_request(foo); | ||
725 | unlock(&foo->private_lock); | ||
726 | } | ||
727 | |||
728 | foo_io_completion(struct foo_priv *foo, void *req) | ||
729 | { | ||
730 | lock(&foo->private_lock); | ||
731 | if (--foo->num_pending_requests == 0) { | ||
732 | pm_runtime_mark_last_busy(&foo->dev); | ||
733 | pm_runtime_put_autosuspend(&foo->dev); | ||
734 | } else { | ||
735 | foo_process_next_request(foo); | ||
736 | } | ||
737 | unlock(&foo->private_lock); | ||
738 | /* Send req result back to the user ... */ | ||
739 | } | ||
740 | |||
741 | int foo_runtime_suspend(struct device *dev) | ||
742 | { | ||
743 | struct foo_priv *foo = container_of(dev, ...); | ||
744 | int ret = 0; | ||
745 | |||
746 | lock(&foo->private_lock); | ||
747 | if (foo->num_pending_requests > 0) { | ||
748 | ret = -EBUSY; | ||
749 | } else { | ||
750 | /* ... suspend the device ... */ | ||
751 | foo->is_suspended = 1; | ||
752 | } | ||
753 | unlock(&foo->private_lock); | ||
754 | return ret; | ||
755 | } | ||
756 | |||
757 | int foo_runtime_resume(struct device *dev) | ||
758 | { | ||
759 | struct foo_priv *foo = container_of(dev, ...); | ||
760 | |||
761 | lock(&foo->private_lock); | ||
762 | /* ... resume the device ... */ | ||
763 | foo->is_suspended = 0; | ||
764 | pm_runtime_mark_last_busy(&foo->dev); | ||
765 | if (foo->num_pending_requests > 0) | ||
766 | foo_process_next_request(foo); | ||
767 | unlock(&foo->private_lock); | ||
768 | return 0; | ||
769 | } | ||
770 | |||
771 | The important point is that after foo_io_completion() asks for an autosuspend, | ||
772 | the foo_runtime_suspend() callback may race with foo_read_or_write(). | ||
773 | Therefore foo_runtime_suspend() has to check whether there are any pending I/O | ||
774 | requests (while holding the private lock) before allowing the suspend to | ||
775 | proceed. | ||
776 | |||
777 | In addition, the power.autosuspend_delay field can be changed by user space at | ||
778 | any time. If a driver cares about this, it can call | ||
779 | pm_runtime_autosuspend_expiration() from within the ->runtime_suspend() | ||
780 | callback while holding its private lock. If the function returns a nonzero | ||
781 | value then the delay has not yet expired and the callback should return | ||
782 | -EAGAIN. | ||
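The expiration check described above can be modelled in user space. The sketch below is not kernel code and not the PM core's implementation: struct model_dev, model_autosuspend_expiration() and model_runtime_suspend() are hypothetical stand-ins, time is a plain millisecond counter passed in by the caller, and locking is omitted. It only illustrates the rule that a nonzero expiration value means the callback should return -EAGAIN.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the fields the text mentions. */
struct model_dev {
	bool use_autosuspend;               /* models power.use_autosuspend */
	unsigned long autosuspend_delay_ms; /* models power.autosuspend_delay */
	unsigned long last_busy_ms;         /* models power.last_busy */
	bool is_suspended;
};

/*
 * Return 0 if the inactivity period has elapsed (or autosuspend is not
 * in use); otherwise return the time at which it will elapse.
 */
unsigned long model_autosuspend_expiration(const struct model_dev *dev,
					   unsigned long now_ms)
{
	unsigned long expires;

	if (!dev->use_autosuspend)
		return 0;

	/* Unsigned subtraction stays correct if the counter wraps. */
	if (now_ms - dev->last_busy_ms >= dev->autosuspend_delay_ms)
		return 0;

	expires = dev->last_busy_ms + dev->autosuspend_delay_ms;
	if (expires == 0)
		expires = 1;	/* keep 0 reserved for "already expired" */
	return expires;
}

/* Models a ->runtime_suspend() callback that honours a changed delay. */
int model_runtime_suspend(struct model_dev *dev, unsigned long now_ms)
{
	if (model_autosuspend_expiration(dev, now_ms) != 0)
		return -EAGAIN;	/* delay has not yet expired */

	/* ... suspend the device ... */
	dev->is_suspended = true;
	return 0;
}
```

In a real driver the check would be made while holding the driver's private lock, as the text says, and the pending-request test from foo_runtime_suspend() would still be needed alongside it.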
diff --git a/Documentation/power/s2ram.txt b/Documentation/power/s2ram.txt index 514b94fc931e..1bdfa0443773 100644 --- a/Documentation/power/s2ram.txt +++ b/Documentation/power/s2ram.txt | |||
@@ -49,6 +49,13 @@ machine that doesn't boot) is: | |||
49 | device (lspci and /sys/devices/pci* is your friend), and see if you can | 49 | device (lspci and /sys/devices/pci* is your friend), and see if you can |
50 | fix it, disable it, or trace into its resume function. | 50 | fix it, disable it, or trace into its resume function. |
51 | 51 | ||
52 | If no device matches the hash (or any matches appear to be false positives), | ||
53 | the culprit may be a device from a loadable kernel module that is not loaded | ||
54 | until after the hash is checked. You can check the hash against the current | ||
55 | devices again after more modules are loaded using sysfs: | ||
56 | |||
57 | cat /sys/power/pm_trace_dev_match | ||
58 | |||
52 | For example, the above happens to be the VGA device on my EVO, which I | 59 | For example, the above happens to be the VGA device on my EVO, which I |
53 | used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out | 60 | used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out |
54 | that "radeonfb" simply cannot resume that device - it tries to set the | 61 | that "radeonfb" simply cannot resume that device - it tries to set the |
diff --git a/Documentation/power/states.txt b/Documentation/power/states.txt index 34800cc521bf..4416b28630df 100644 --- a/Documentation/power/states.txt +++ b/Documentation/power/states.txt | |||
@@ -62,12 +62,12 @@ setup via another operating system for it to use. Despite the | |||
62 | inconvenience, this method requires minimal work by the kernel, since | 62 | inconvenience, this method requires minimal work by the kernel, since |
63 | the firmware will also handle restoring memory contents on resume. | 63 | the firmware will also handle restoring memory contents on resume. |
64 | 64 | ||
65 | For suspend-to-disk, a mechanism called swsusp called 'swsusp' (Swap | 65 | For suspend-to-disk, a mechanism called 'swsusp' (Swap Suspend) is used |
66 | Suspend) is used to write memory contents to free swap space. | 66 | to write memory contents to free swap space. swsusp has some restrictive |
67 | swsusp has some restrictive requirements, but should work in most | 67 | requirements, but should work in most cases. Some, albeit outdated, |
68 | cases. Some, albeit outdated, documentation can be found in | 68 | documentation can be found in Documentation/power/swsusp.txt. |
69 | Documentation/power/swsusp.txt. Alternatively, userspace can do most | 69 | Alternatively, userspace can do most of the actual suspend to disk work, |
70 | of the actual suspend to disk work, see userland-swsusp.txt. | 70 | see userland-swsusp.txt. |
71 | 71 | ||
72 | Once memory state is written to disk, the system may either enter a | 72 | Once memory state is written to disk, the system may either enter a |
73 | low-power state (like ACPI S4), or it may simply power down. Powering | 73 | low-power state (like ACPI S4), or it may simply power down. Powering |
diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt index 9d60ab717a7b..ac190cf1963e 100644 --- a/Documentation/power/swsusp.txt +++ b/Documentation/power/swsusp.txt | |||
@@ -66,7 +66,8 @@ swsusp saves the state of the machine into active swaps and then reboots or | |||
66 | powerdowns. You must explicitly specify the swap partition to resume from with | 66 | powerdowns. You must explicitly specify the swap partition to resume from with |
67 | ``resume='' kernel option. If signature is found it loads and restores saved | 67 | ``resume='' kernel option. If signature is found it loads and restores saved |
68 | state. If the option ``noresume'' is specified as a boot parameter, it skips | 68 | state. If the option ``noresume'' is specified as a boot parameter, it skips |
69 | the resuming. | 69 | the resuming. If the option ``hibernate=nocompress'' is specified as a boot |
70 | parameter, it saves hibernation image without compression. | ||
70 | 71 | ||
71 | In the meantime while the system is suspended you should not add/remove any | 72 | In the meantime while the system is suspended you should not add/remove any |
72 | of the hardware, write to the filesystems, etc. | 73 | of the hardware, write to the filesystems, etc. |
@@ -191,7 +192,7 @@ Q: There don't seem to be any generally useful behavioral | |||
191 | distinctions between SUSPEND and FREEZE. | 192 | distinctions between SUSPEND and FREEZE. |
192 | 193 | ||
193 | A: Doing SUSPEND when you are asked to do FREEZE is always correct, | 194 | A: Doing SUSPEND when you are asked to do FREEZE is always correct, |
194 | but it may be unneccessarily slow. If you want your driver to stay simple, | 195 | but it may be unnecessarily slow. If you want your driver to stay simple, |
195 | slowness may not matter to you. It can always be fixed later. | 196 | slowness may not matter to you. It can always be fixed later. |
196 | 197 | ||
197 | For devices like disk it does matter, you do not want to spindown for | 198 | For devices like disk it does matter, you do not want to spindown for |
@@ -236,7 +237,7 @@ disk. Whole sequence goes like | |||
236 | 237 | ||
237 | running system, user asks for suspend-to-disk | 238 | running system, user asks for suspend-to-disk |
238 | 239 | ||
239 | user processes are stopped (in common case there are none, but with resume-from-initrd, noone knows) | 240 | user processes are stopped (in common case there are none, but with resume-from-initrd, no one knows) |
240 | 241 | ||
241 | read image from disk | 242 | read image from disk |
242 | 243 | ||
diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt index 81680f9f5909..1101bee4e822 100644 --- a/Documentation/power/userland-swsusp.txt +++ b/Documentation/power/userland-swsusp.txt | |||
@@ -98,7 +98,7 @@ SNAPSHOT_S2RAM - suspend to RAM; using this call causes the kernel to | |||
98 | The device's read() operation can be used to transfer the snapshot image from | 98 | The device's read() operation can be used to transfer the snapshot image from |
99 | the kernel. It has the following limitations: | 99 | the kernel. It has the following limitations: |
100 | - you cannot read() more than one virtual memory page at a time | 100 | - you cannot read() more than one virtual memory page at a time |
101 | - read()s accross page boundaries are impossible (ie. if ypu read() 1/2 of | 101 | - read()s across page boundaries are impossible (ie. if ypu read() 1/2 of |
102 | a page in the previous call, you will only be able to read() | 102 | a page in the previous call, you will only be able to read() |
103 | _at_ _most_ 1/2 of the page in the next call) | 103 | _at_ _most_ 1/2 of the page in the next call) |
104 | 104 | ||
@@ -137,7 +137,7 @@ mechanism and the userland utilities using the interface SHOULD use additional | |||
137 | means, such as checksums, to ensure the integrity of the snapshot image. | 137 | means, such as checksums, to ensure the integrity of the snapshot image. |
138 | 138 | ||
139 | The suspending and resuming utilities MUST lock themselves in memory, | 139 | The suspending and resuming utilities MUST lock themselves in memory, |
140 | preferrably using mlockall(), before calling SNAPSHOT_FREEZE. | 140 | preferably using mlockall(), before calling SNAPSHOT_FREEZE. |
141 | 141 | ||
142 | The suspending utility MUST check the value stored by SNAPSHOT_CREATE_IMAGE | 142 | The suspending utility MUST check the value stored by SNAPSHOT_CREATE_IMAGE |
143 | in the memory location pointed to by the last argument of ioctl() and proceed | 143 | in the memory location pointed to by the last argument of ioctl() and proceed |
@@ -147,7 +147,7 @@ in accordance with it: | |||
147 | (a) The suspending utility MUST NOT close the snapshot device | 147 | (a) The suspending utility MUST NOT close the snapshot device |
148 | _unless_ the whole suspend procedure is to be cancelled, in | 148 | _unless_ the whole suspend procedure is to be cancelled, in |
149 | which case, if the snapshot image has already been saved, the | 149 | which case, if the snapshot image has already been saved, the |
150 | suspending utility SHOULD destroy it, preferrably by zapping | 150 | suspending utility SHOULD destroy it, preferably by zapping |
151 | its header. If the suspend is not to be cancelled, the | 151 | its header. If the suspend is not to be cancelled, the |
152 | system MUST be powered off or rebooted after the snapshot | 152 | system MUST be powered off or rebooted after the snapshot |
153 | image has been saved. | 153 | image has been saved. |