author | Linus Torvalds <torvalds@linux-foundation.org> | 2010-10-21 17:53:17 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2010-10-21 17:53:17 -0400 |
commit | a8cbf22559ceefdcdfac00701e8e6da7518b7e8e (patch) | |
tree | 63ebd5779a37f809f7daed77dbf27aa3f1e1110c /Documentation | |
parent | e36f561a2c88394ef2708f1ab300fe8a79e9f651 (diff) | |
parent | 9c034392533f3e9f00656d5c58478cff2560ef81 (diff) |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (26 commits)
PM / Wakeup: Show wakeup sources statistics in debugfs
PM: Introduce library for device-specific OPPs (v7)
PM: Add sysfs attr for rechecking dev hash from PM trace
PM: Lock PM device list mutex in show_dev_hash()
PM / Runtime: Remove idle notification after failing suspend
PM / Hibernate: Modify signature used to mark swap
PM / Runtime: Reduce code duplication in core helper functions
PM: Allow wakeup events to abort freezing of tasks
PM: runtime: add missed pm_request_autosuspend
PM / Hibernate: Make some boot messages look less scary
PM / Runtime: Implement autosuspend support
PM / Runtime: Add no_callbacks flag
PM / Runtime: Combine runtime PM entry points
PM / Runtime: Merge synchronous and async runtime routines
PM / Runtime: Replace boolean arguments with bitflags
PM / Runtime: Move code in drivers/base/power/runtime.c
sysfs: Add sysfs_merge_group() and sysfs_unmerge_group()
PM: Fix potential issue with failing asynchronous suspend
PM / Wakeup: Introduce wakeup source objects and event statistics (v3)
PM: Fix signed/unsigned warning in dpm_show_time()
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/sysfs-devices-power | 88 | ||||
-rw-r--r-- | Documentation/ABI/testing/sysfs-power | 29 | ||||
-rw-r--r-- | Documentation/kernel-parameters.txt | 5 | ||||
-rw-r--r-- | Documentation/power/00-INDEX | 2 | ||||
-rw-r--r-- | Documentation/power/interface.txt | 2 | ||||
-rw-r--r-- | Documentation/power/opp.txt | 375 | ||||
-rw-r--r-- | Documentation/power/runtime_pm.txt | 227 | ||||
-rw-r--r-- | Documentation/power/s2ram.txt | 7 | ||||
-rw-r--r-- | Documentation/power/swsusp.txt | 3 |
9 files changed, 729 insertions, 9 deletions
diff --git a/Documentation/ABI/testing/sysfs-devices-power b/Documentation/ABI/testing/sysfs-devices-power index 6123c523bfd7..7628cd1bc36a 100644 --- a/Documentation/ABI/testing/sysfs-devices-power +++ b/Documentation/ABI/testing/sysfs-devices-power | |||
@@ -77,3 +77,91 @@ Description: | |||
77 | devices this attribute is set to "enabled" by bus type code or | 77 | devices this attribute is set to "enabled" by bus type code or |
78 | device drivers and in that case it should be safe to leave the | 78 | device drivers and in that case it should be safe to leave the |
79 | default value. | 79 | default value. |
80 | |||
81 | What: /sys/devices/.../power/wakeup_count | ||
82 | Date: September 2010 | ||
83 | Contact: Rafael J. Wysocki <rjw@sisk.pl> | ||
84 | Description: | ||
85 | The /sys/devices/.../power/wakeup_count attribute contains the number | |
86 | of signaled wakeup events associated with the device. This | ||
87 | attribute is read-only. If the device is not enabled to wake up | ||
88 | the system from sleep states, this attribute is empty. | ||
89 | |||
90 | What: /sys/devices/.../power/wakeup_active_count | ||
91 | Date: September 2010 | ||
92 | Contact: Rafael J. Wysocki <rjw@sisk.pl> | ||
93 | Description: | ||
94 | The /sys/devices/.../power/wakeup_active_count attribute contains the | |
95 | number of times the processing of wakeup events associated with | ||
96 | the device was completed (at the kernel level). This attribute | ||
97 | is read-only. If the device is not enabled to wake up the | ||
98 | system from sleep states, this attribute is empty. | ||
99 | |||
100 | What: /sys/devices/.../power/wakeup_hit_count | ||
101 | Date: September 2010 | ||
102 | Contact: Rafael J. Wysocki <rjw@sisk.pl> | ||
103 | Description: | ||
104 | The /sys/devices/.../power/wakeup_hit_count attribute contains the | |
105 | number of times the processing of a wakeup event associated with | ||
106 | the device might prevent the system from entering a sleep state. | ||
107 | This attribute is read-only. If the device is not enabled to | ||
108 | wake up the system from sleep states, this attribute is empty. | ||
109 | |||
110 | What: /sys/devices/.../power/wakeup_active | ||
111 | Date: September 2010 | ||
112 | Contact: Rafael J. Wysocki <rjw@sisk.pl> | ||
113 | Description: | ||
114 | The /sys/devices/.../power/wakeup_active attribute contains either 1 | |
115 | or 0, depending on whether a wakeup event associated with the device | |
116 | is being processed (1) or not (0). This attribute is read-only. | |
117 | If the device is not enabled to wake up the system from sleep | ||
118 | states, this attribute is empty. | ||
119 | |||
120 | What: /sys/devices/.../power/wakeup_total_time_ms | ||
121 | Date: September 2010 | ||
122 | Contact: Rafael J. Wysocki <rjw@sisk.pl> | ||
123 | Description: | ||
124 | The /sys/devices/.../power/wakeup_total_time_ms attribute contains | |
125 | the total time of processing wakeup events associated with the | ||
126 | device, in milliseconds. This attribute is read-only. If the | ||
127 | device is not enabled to wake up the system from sleep states, | ||
128 | this attribute is empty. | ||
129 | |||
130 | What: /sys/devices/.../power/wakeup_max_time_ms | ||
131 | Date: September 2010 | ||
132 | Contact: Rafael J. Wysocki <rjw@sisk.pl> | ||
133 | Description: | ||
134 | The /sys/devices/.../power/wakeup_max_time_ms attribute contains | |
135 | the maximum time of processing a single wakeup event associated | ||
136 | with the device, in milliseconds. This attribute is read-only. | ||
137 | If the device is not enabled to wake up the system from sleep | ||
138 | states, this attribute is empty. | ||
139 | |||
140 | What: /sys/devices/.../power/wakeup_last_time_ms | ||
141 | Date: September 2010 | ||
142 | Contact: Rafael J. Wysocki <rjw@sisk.pl> | ||
143 | Description: | ||
144 | The /sys/devices/.../power/wakeup_last_time_ms attribute contains | |
145 | the value of the monotonic clock corresponding to the time of | ||
146 | signaling the last wakeup event associated with the device, in | ||
147 | milliseconds. This attribute is read-only. If the device is | ||
148 | not enabled to wake up the system from sleep states, this | ||
149 | attribute is empty. | ||
150 | |||
151 | What: /sys/devices/.../power/autosuspend_delay_ms | ||
152 | Date: September 2010 | ||
153 | Contact: Alan Stern <stern@rowland.harvard.edu> | ||
154 | Description: | ||
155 | The /sys/devices/.../power/autosuspend_delay_ms attribute | ||
156 | contains the autosuspend delay value (in milliseconds). Some | ||
157 | drivers do not want their device to suspend as soon as it | ||
158 | becomes idle at run time; they want the device to remain | ||
159 | inactive for a certain minimum period of time first. That | ||
160 | period is called the autosuspend delay. Negative values will | ||
161 | prevent the device from being suspended at run time (similar | ||
162 | to writing "on" to the power/control attribute). Values >= | ||
163 | 1000 will cause the autosuspend timer expiration to be rounded | ||
164 | up to the nearest second. | ||
165 | |||
166 | Not all drivers support this attribute. If it isn't supported, | ||
167 | attempts to read or write it will yield I/O errors. | ||
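The wakeup statistics attributes above share one convention: the attribute is empty when the device is not enabled to wake up the system, and otherwise holds a number. A minimal userspace sketch of interpreting such a read follows. parse_wakeup_attr() is a hypothetical helper written for this illustration, not a kernel or libc interface, and it assumes the value is a non-negative decimal number followed by optional whitespace.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Interpret the text read from a wakeup statistics attribute such as
 * power/wakeup_count (hypothetical helper, for illustration only).
 *
 * Returns  1 and stores the value in *out if the text holds a number,
 *          0 if the text is empty (device not enabled for wakeup),
 *         -1 if the text is not a plain non-negative decimal number.
 */
static int parse_wakeup_attr(const char *text, unsigned long *out)
{
    char *end;
    unsigned long value;

    assert(text != NULL && out != NULL);

    /* An empty or whitespace-only read means "no statistics". */
    text += strspn(text, " \t\n");
    if (*text == '\0')
        return 0;

    /* Reject signs and other prefixes that strtoul() would accept. */
    if (*text < '0' || *text > '9')
        return -1;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno != 0)
        return -1;

    /* Only trailing whitespace (typically one newline) may follow. */
    end += strspn(end, " \t\n");
    if (*end != '\0')
        return -1;

    *out = value;
    return 1;
}
```

A caller would read the attribute file into a NUL-terminated buffer first and treat a return value of 0 as "wakeup not enabled" rather than as a count of zero.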
diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power index 2875f1f74a07..194ca446ac28 100644 --- a/Documentation/ABI/testing/sysfs-power +++ b/Documentation/ABI/testing/sysfs-power | |||
@@ -99,9 +99,38 @@ Description: | |||
99 | 99 | ||
100 | dmesg -s 1000000 | grep 'hash matches' | 100 | dmesg -s 1000000 | grep 'hash matches' |
101 | 101 | ||
102 | If you do not get any matches (or they appear to be false | ||
103 | positives), it is possible that the last PM event point | ||
104 | referred to a device created by a loadable kernel module. In | ||
105 | this case, cat /sys/power/pm_trace_dev_match (see below) after | |
106 | your system is started up and the kernel modules are loaded. | ||
107 | |||
102 | CAUTION: Using it will cause your machine's real-time (CMOS) | 108 | CAUTION: Using it will cause your machine's real-time (CMOS) |
103 | clock to be set to a random invalid time after a resume. | 109 | clock to be set to a random invalid time after a resume. |
104 | 110 | ||
111 | What: /sys/power/pm_trace_dev_match | |
112 | Date: October 2010 | ||
113 | Contact: James Hogan <james@albanarts.com> | ||
114 | Description: | ||
115 | The /sys/power/pm_trace_dev_match file contains the name of the | ||
116 | device associated with the last PM event point saved in the RTC | ||
117 | across reboots when pm_trace has been used. More precisely it | ||
118 | contains the list of current devices (including those | ||
119 | registered by loadable kernel modules since boot) which match | ||
120 | the device hash in the RTC at boot, with a newline after each | ||
121 | one. | ||
122 | |||
123 | The advantage of this file over the hash matches printed to the | ||
124 | kernel log (see /sys/power/pm_trace) is that it includes | |
125 | devices created after boot by loadable kernel modules. | ||
126 | |||
127 | Due to the small hash size necessary to fit in the RTC, it is | ||
128 | possible that more than one device matches the hash, in which | ||
129 | case further investigation is required to determine which | ||
130 | device is causing the problem. Note that genuine RTC clock | ||
131 | values (such as when pm_trace has not been used) can still | |
132 | match a device and output its name here. | |
133 | |||
105 | What: /sys/power/pm_async | 134 | What: /sys/power/pm_async |
106 | Date: January 2009 | 135 | Date: January 2009 |
107 | Contact: Rafael J. Wysocki <rjw@sisk.pl> | 136 | Contact: Rafael J. Wysocki <rjw@sisk.pl> |
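Because pm_trace_dev_match lists one device name per line and more than one device can match the hash, a reader of the file usually wants to know first how many candidates there are. A minimal userspace sketch, assuming the file contents have already been read into a NUL-terminated buffer; count_hash_matches() is a hypothetical helper name and the device names in the usage note are made up:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the device names in text read from pm_trace_dev_match
 * (hypothetical helper, for illustration only). Each non-empty line
 * counts as one name; a final line without a newline is also counted.
 */
static size_t count_hash_matches(const char *text)
{
    size_t count = 0;
    int in_line = 0;    /* non-zero while inside a non-empty line */

    assert(text != NULL);

    for (; *text != '\0'; text++) {
        if (*text == '\n') {
            in_line = 0;
        } else if (!in_line) {
            in_line = 1;
            count++;
        }
    }
    return count;
}
```

A result of 0 suggests no current device matches the stored hash, 1 points at a single suspect, and anything larger calls for the further investigation described above.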
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 3a0009e03d14..02f21d9220ce 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt | |||
@@ -2170,6 +2170,11 @@ and is between 256 and 4096 characters. It is defined in the file | |||
2170 | in <PAGE_SIZE> units (needed only for swap files). | 2170 | in <PAGE_SIZE> units (needed only for swap files). |
2171 | See Documentation/power/swsusp-and-swap-files.txt | 2171 | See Documentation/power/swsusp-and-swap-files.txt |
2172 | 2172 | ||
2173 | hibernate= [HIBERNATION] | ||
2174 | noresume Don't check if there's a hibernation image | ||
2175 | present during boot. | ||
2176 | nocompress Don't compress/decompress hibernation images. | ||
2177 | |||
2173 | retain_initrd [RAM] Keep initrd memory after extraction | 2178 | retain_initrd [RAM] Keep initrd memory after extraction |
2174 | 2179 | ||
2175 | rhash_entries= [KNL,NET] | 2180 | rhash_entries= [KNL,NET] |
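The hibernate= parameter added above accepts two values, noresume and nocompress. The following userspace sketch shows one way to map such a value to a setting. It is not the kernel's parser: parse_hibernate_value() and the enum names are assumptions made for this illustration, and exact string matching is a simplification.

```c
#include <assert.h>
#include <string.h>

/* Settings selectable through hibernate= (names are hypothetical). */
enum hibernate_option {
    HIBERNATE_OPT_INVALID = -1,
    HIBERNATE_OPT_NORESUME,     /* skip the hibernation image check at boot */
    HIBERNATE_OPT_NOCOMPRESS    /* do not compress/decompress images */
};

/* Map the text after "hibernate=" to an option; exact match only. */
static enum hibernate_option parse_hibernate_value(const char *value)
{
    assert(value != NULL);

    if (strcmp(value, "noresume") == 0)
        return HIBERNATE_OPT_NORESUME;
    if (strcmp(value, "nocompress") == 0)
        return HIBERNATE_OPT_NOCOMPRESS;
    return HIBERNATE_OPT_INVALID;
}
```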
diff --git a/Documentation/power/00-INDEX b/Documentation/power/00-INDEX index fb742c213c9e..45e9d4a91284 100644 --- a/Documentation/power/00-INDEX +++ b/Documentation/power/00-INDEX | |||
@@ -14,6 +14,8 @@ interface.txt | |||
14 | - Power management user interface in /sys/power | 14 | - Power management user interface in /sys/power |
15 | notifiers.txt | 15 | notifiers.txt |
16 | - Registering suspend notifiers in device drivers | 16 | - Registering suspend notifiers in device drivers |
17 | opp.txt | ||
18 | - Operating Performance Point library | ||
17 | pci.txt | 19 | pci.txt |
18 | - How the PCI Subsystem Does Power Management | 20 | - How the PCI Subsystem Does Power Management |
19 | pm_qos_interface.txt | 21 | pm_qos_interface.txt |
diff --git a/Documentation/power/interface.txt b/Documentation/power/interface.txt index e67211fe0ee2..c537834af005 100644 --- a/Documentation/power/interface.txt +++ b/Documentation/power/interface.txt | |||
@@ -57,7 +57,7 @@ smallest image possible. In particular, if "0" is written to this file, the | |||
57 | suspend image will be as small as possible. | 57 | suspend image will be as small as possible. |
58 | 58 | ||
59 | Reading from this file will display the current image size limit, which | 59 | Reading from this file will display the current image size limit, which |
60 | is set to 500 MB by default. | 60 | is set to 2/5 of available RAM by default. |
61 | 61 | ||
62 | /sys/power/pm_trace controls the code which saves the last PM event point in | 62 | /sys/power/pm_trace controls the code which saves the last PM event point in |
63 | the RTC across reboots, so that you can debug a machine that just hangs | 63 | the RTC across reboots, so that you can debug a machine that just hangs |
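The hunk above changes the documented default image size limit from a fixed 500 MB to 2/5 of available RAM. As a worked example of that arithmetic only, the sketch below computes the limit from a RAM size supplied by the caller. default_image_size() is a hypothetical name, and how the kernel measures "available RAM" is not stated here, so the input is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 2/5 of the given RAM size in bytes (hypothetical helper).
 * Dividing before multiplying avoids overflow for very large inputs,
 * at the cost of rounding down by at most 8 bytes.
 */
static uint64_t default_image_size(uint64_t ram_bytes)
{
    return ram_bytes / 5 * 2;
}
```

With this rule a machine with 5 GiB of RAM gets a 2 GiB limit, whereas the old default would have stayed at 500 MB regardless of memory size.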
diff --git a/Documentation/power/opp.txt b/Documentation/power/opp.txt new file mode 100644 index 000000000000..44d87ad3cea9 --- /dev/null +++ b/Documentation/power/opp.txt | |||
@@ -0,0 +1,375 @@ | |||
1 | *=============* | ||
2 | * OPP Library * | ||
3 | *=============* | ||
4 | |||
5 | (C) 2009-2010 Nishanth Menon <nm@ti.com>, Texas Instruments Incorporated | ||
6 | |||
7 | Contents | ||
8 | -------- | ||
9 | 1. Introduction | ||
10 | 2. Initial OPP List Registration | ||
11 | 3. OPP Search Functions | ||
12 | 4. OPP Availability Control Functions | ||
13 | 5. OPP Data Retrieval Functions | ||
14 | 6. Cpufreq Table Generation | ||
15 | 7. Data Structures | ||
16 | |||
17 | 1. Introduction | ||
18 | =============== | ||
19 | Complex SoCs of today consist of multiple sub-modules working in conjunction. | |
20 | In an operational system executing varied use cases, not all modules in the SoC | ||
21 | need to function at their highest performing frequency all the time. To | ||
22 | facilitate this, sub-modules in a SoC are grouped into domains, allowing some | ||
23 | domains to run at lower voltage and frequency while other domains are loaded | ||
24 | more. The set of discrete tuples consisting of frequency and voltage pairs that | ||
25 | the device will support per domain are called Operating Performance Points or | ||
26 | OPPs. | ||
27 | |||
28 | The OPP library provides a set of helper functions to organize and query the OPP | |
29 | information. The library is located in drivers/base/power/opp.c and the header | |
30 | is located in include/linux/opp.h. The OPP library can be enabled by selecting | |
31 | CONFIG_PM_OPP in the power management menu of menuconfig. The OPP library depends | |
32 | on CONFIG_PM rather than on cpufreq, as certain SoC frameworks, such as Texas | |
33 | Instruments' OMAP framework, can optionally boot at a certain OPP without cpufreq. | |
34 | |||
35 | Typical usage of the OPP library is as follows: | ||
36 | (users) -> registers a set of default OPPs -> (library) | ||
37 | SoC framework -> modifies on required cases certain OPPs -> OPP layer | ||
38 | -> queries to search/retrieve information -> | ||
39 | |||
40 | The OPP layer expects each domain to be represented by a unique device pointer. The | |
41 | SoC framework registers a set of initial OPPs per device with the OPP layer. This | |
42 | list is expected to be small, typically around 5 OPPs per device. | |
43 | This initial list contains a set of OPPs that the framework expects to be safely | ||
44 | enabled by default in the system. | ||
45 | |||
46 | Note on OPP Availability: | ||
47 | ------------------------ | ||
48 | As the system proceeds to operate, SoC framework may choose to make certain | ||
49 | OPPs available or not available on each device based on various external | ||
50 | factors. Example usage: Thermal management or other exceptional situations where | ||
51 | SoC framework might choose to disable a higher frequency OPP to safely continue | ||
52 | operations until that OPP could be re-enabled if possible. | ||
53 | |||
54 | The OPP library facilitates this concept in its implementation. The following | |
55 | operational functions operate only on available opps: | ||
56 | opp_find_freq_{ceil, floor}, opp_get_voltage, opp_get_freq, opp_get_opp_count | ||
57 | and opp_init_cpufreq_table | ||
58 | |||
59 | opp_find_freq_exact is meant to be used to find the opp pointer which can then | ||
60 | be used for opp_enable/disable functions to make an opp available as required. | ||
61 | |||
62 | WARNING: Users of the OPP library should refresh their availability count using | |
63 | opp_get_opp_count if the opp_enable/disable functions are invoked for a device. The | |
64 | exact mechanism to trigger these, and the notification mechanism to other | |
65 | dependent subsystems such as cpufreq, are left to the discretion of the SoC | |
66 | specific framework which uses the OPP library. Similar care needs to be taken | |
67 | to refresh the cpufreq table after these operations. | |
68 | |||
69 | WARNING on OPP List locking mechanism: | ||
70 | ------------------------------------------------- | ||
71 | The OPP library uses RCU for exclusivity. RCU allows the query functions to operate | |
72 | in multiple contexts, and this synchronization mechanism is well suited to the | |
73 | read-intensive operations on data structures that the OPP library caters to. | |
74 | |||
75 | To ensure that the data retrieved are sane, users such as the SoC framework | |
76 | should ensure that the sections of code operating on OPP queries are locked | |
77 | using RCU read locks. The opp_find_freq_{exact,ceil,floor}, | ||
78 | opp_get_{voltage, freq, opp_count} fall into this category. | ||
79 | |||
80 | opp_{add,enable,disable} are updaters which use a mutex and implement their own | |
81 | RCU locking mechanisms. opp_init_cpufreq_table acts as an updater and uses a | |
82 | mutex to implement the RCU updater strategy. These functions should *NOT* be called | |
83 | under RCU locks or in other contexts that prevent blocking functions in RCU or | |
84 | mutex operations from working. | |
85 | |||
86 | 2. Initial OPP List Registration | ||
87 | ================================ | ||
88 | The SoC implementation calls the opp_add function iteratively to add OPPs per | |
89 | device. It is expected that the SoC framework will register a small number of OPP | |
90 | entries, typically fewer than 5. The list generated by | |
91 | registering the OPPs is maintained by the OPP library throughout the device | |
92 | operation. The SoC framework can subsequently control the availability of the | |
93 | OPPs dynamically using the opp_enable / disable functions. | |
94 | |||
95 | opp_add - Add a new OPP for a specific domain represented by the device pointer. | |
96 | The OPP is defined using the frequency and voltage. Once added, the OPP | |
97 | is assumed to be available and control of its availability can be done | |
98 | with the opp_enable/disable functions. The OPP library internally stores | |
99 | and manages this information in the opp struct. This function may be | |
100 | used by the SoC framework to define an optimal list as per the demands of | |
101 | the SoC usage environment. | |
102 | |||
103 | WARNING: Do not use this function in interrupt context. | ||
104 | |||
105 | Example: | ||
106 | soc_pm_init() | ||
107 | { | ||
108 | /* Do things */ | ||
109 | r = opp_add(mpu_dev, 1000000, 900000); | ||
110 | if (r) { | |
111 | pr_err("%s: unable to register mpu opp(%d)\n", __func__, r); | |
112 | goto no_cpufreq; | ||
113 | } | ||
114 | /* Do cpufreq things */ | ||
115 | no_cpufreq: | ||
116 | /* Do remaining things */ | ||
117 | } | ||
118 | |||
119 | 3. OPP Search Functions | ||
120 | ======================= | ||
121 | High level frameworks such as cpufreq operate on frequencies. To map the | |
122 | frequency back to the corresponding OPP, the OPP library provides handy functions | |
123 | to search the OPP list that the OPP library internally manages. These search | |
124 | functions return the matching pointer representing the opp if a match is | |
125 | found, else return an error. These errors are expected to be handled by standard | |
126 | error checks such as IS_ERR() and appropriate actions taken by the caller. | |
127 | |||
128 | opp_find_freq_exact - Search for an OPP based on an *exact* frequency and | ||
129 | availability. This function is especially useful to enable an OPP which | ||
130 | is not available by default. | ||
131 | Example: In a case when the SoC framework detects a situation where a | |
132 | higher frequency could be made available, it can use this function to | |
133 | find the OPP before calling opp_enable to actually make it available. | |
134 | rcu_read_lock(); | ||
135 | opp = opp_find_freq_exact(dev, 1000000000, false); | ||
136 | rcu_read_unlock(); | ||
137 | /* don't operate on the pointer.. just do a sanity check.. */ | |
138 | if (IS_ERR(opp)) { | ||
139 | pr_err("frequency not disabled!\n"); | ||
140 | /* trigger appropriate actions.. */ | ||
141 | } else { | ||
142 | opp_enable(dev, 1000000000); | |
143 | } | ||
144 | |||
145 | NOTE: This is the only search function that operates on OPPs which are | ||
146 | not available. | ||
147 | |||
148 | opp_find_freq_floor - Search for an available OPP which is *at most* the | ||
149 | provided frequency. This function is useful while searching for a lesser | ||
150 | match OR operating on OPP information in the order of decreasing | ||
151 | frequency. | ||
152 | Example: To find the highest opp for a device: | ||
153 | freq = ULONG_MAX; | ||
154 | rcu_read_lock(); | ||
155 | opp_find_freq_floor(dev, &freq); | ||
156 | rcu_read_unlock(); | ||
157 | |||
158 | opp_find_freq_ceil - Search for an available OPP which is *at least* the | ||
159 | provided frequency. This function is useful while searching for a | ||
160 | higher match OR operating on OPP information in the order of increasing | ||
161 | frequency. | ||
162 | Example 1: To find the lowest opp for a device: | ||
163 | freq = 0; | ||
164 | rcu_read_lock(); | ||
165 | opp_find_freq_ceil(dev, &freq); | ||
166 | rcu_read_unlock(); | ||
167 | Example 2: A simplified implementation of a SoC cpufreq_driver->target: | ||
168 | soc_cpufreq_target(..) | ||
169 | { | ||
170 | /* Do stuff like policy checks etc. */ | ||
171 | /* Find the best frequency match for the req */ | ||
172 | rcu_read_lock(); | ||
173 | opp = opp_find_freq_ceil(dev, &freq); | ||
174 | rcu_read_unlock(); | ||
175 | if (!IS_ERR(opp)) | ||
176 | soc_switch_to_freq_voltage(freq); | ||
177 | else | ||
178 | /* do something when we can't satisfy the req */ | |
179 | /* do other stuff */ | ||
180 | } | ||
181 | |||
182 | 4. OPP Availability Control Functions | ||
183 | ===================================== | ||
184 | A default OPP list registered with the OPP library may not cater to all possible | |
185 | situations. The OPP library provides a set of functions to modify the | |
186 | availability of an OPP within the OPP list. This allows SoC frameworks to have | |
187 | fine grained dynamic control of which sets of OPPs are operationally available. | |
188 | These functions are intended to *temporarily* remove an OPP in conditions such | ||
189 | as thermal considerations (e.g. don't use OPPx until the temperature drops). | ||
190 | |||
191 | WARNING: Do not use these functions in interrupt context. | ||
192 | |||
193 | opp_enable - Make an OPP available for operation. | |
194 | Example: Let's say that the 1GHz OPP is to be made available only if the | |
195 | SoC temperature is lower than a certain threshold. The SoC framework | ||
196 | implementation might choose to do something as follows: | ||
197 | if (cur_temp < temp_low_thresh) { | ||
198 | /* Enable 1GHz if it was disabled */ | ||
199 | rcu_read_lock(); | ||
200 | opp = opp_find_freq_exact(dev, 1000000000, false); | ||
201 | rcu_read_unlock(); | ||
202 | /* just error check */ | ||
203 | if (!IS_ERR(opp)) | ||
204 | ret = opp_enable(dev, 1000000000); | ||
205 | else | ||
206 | goto try_something_else; | ||
207 | } | ||
208 | |||
209 | opp_disable - Make an OPP unavailable for operation. | |
210 | Example: Let's say that the 1GHz OPP is to be disabled if the temperature | |
211 | exceeds a threshold value. The SoC framework implementation might | ||
212 | choose to do something as follows: | ||
213 | if (cur_temp > temp_high_thresh) { | ||
214 | /* Disable 1GHz if it was enabled */ | ||
215 | rcu_read_lock(); | ||
216 | opp = opp_find_freq_exact(dev, 1000000000, true); | ||
217 | rcu_read_unlock(); | ||
218 | /* just error check */ | ||
219 | if (!IS_ERR(opp)) | ||
220 | ret = opp_disable(dev, 1000000000); | ||
221 | else | ||
222 | goto try_something_else; | ||
223 | } | ||
224 | |||
225 | 5. OPP Data Retrieval Functions | ||
226 | =============================== | ||
227 | Since the OPP library abstracts away the OPP information, a set of functions to pull | |
228 | information from the OPP structure is necessary. Once an OPP pointer is | |
229 | retrieved using the search functions, the following functions can be used by the SoC | |
230 | framework to retrieve the information represented inside the OPP layer. | |
231 | |||
232 | opp_get_voltage - Retrieve the voltage represented by the opp pointer. | ||
233 | Example: At a cpufreq transition to a different frequency, the SoC | |
234 | framework needs to set the voltage represented by the OPP, using | |
235 | the regulator framework, on the Power Management chip providing the | |
236 | voltage. | |
237 | soc_switch_to_freq_voltage(freq) | ||
238 | { | ||
239 | /* do things */ | ||
240 | rcu_read_lock(); | ||
241 | opp = opp_find_freq_ceil(dev, &freq); | ||
242 | v = opp_get_voltage(opp); | ||
243 | rcu_read_unlock(); | ||
244 | if (v) | ||
245 | regulator_set_voltage(.., v); | ||
246 | /* do other things */ | ||
247 | } | ||
248 | |||
249 | opp_get_freq - Retrieve the freq represented by the opp pointer. | ||
250 | Example: Let's say the SoC framework uses a couple of helper functions; | |
251 | we could pass opp pointers to them instead of passing additional | |
252 | parameters to handle quite a bit of data. | |
253 | soc_cpufreq_target(..) | ||
254 | { | ||
255 | /* do things.. */ | ||
256 | max_freq = ULONG_MAX; | ||
257 | rcu_read_lock(); | ||
258 | max_opp = opp_find_freq_floor(dev, &max_freq); | |
259 | requested_opp = opp_find_freq_ceil(dev, &freq); | |
260 | if (!IS_ERR(max_opp) && !IS_ERR(requested_opp)) | ||
261 | r = soc_test_validity(max_opp, requested_opp); | ||
262 | rcu_read_unlock(); | ||
263 | /* do other things */ | ||
264 | } | ||
265 | soc_test_validity(..) | ||
266 | { | ||
267 | if (opp_get_voltage(max_opp) < opp_get_voltage(requested_opp)) | |
268 | return -EINVAL; | |
269 | if (opp_get_freq(max_opp) < opp_get_freq(requested_opp)) | |
270 | return -EINVAL; | ||
271 | /* do things.. */ | ||
272 | } | ||
273 | |||
274 | opp_get_opp_count - Retrieve the number of available opps for a device. | |
275 | Example: Let's say a co-processor in the SoC needs to know the available | |
276 | frequencies in a table; the main processor can notify it as follows: | |
277 | soc_notify_coproc_available_frequencies() | ||
278 | { | ||
279 | /* Do things */ | ||
280 | rcu_read_lock(); | ||
281 | num_available = opp_get_opp_count(dev); | ||
282 | speeds = kzalloc(sizeof(u32) * num_available, GFP_ATOMIC); | |
283 | /* populate the table in increasing order */ | ||
284 | freq = 0; | ||
285 | while (!IS_ERR(opp = opp_find_freq_ceil(dev, &freq))) { | ||
286 | speeds[i] = freq; | ||
287 | freq++; | ||
288 | i++; | ||
289 | } | ||
290 | rcu_read_unlock(); | ||
291 | |||
292 | soc_notify_coproc(AVAILABLE_FREQs, speeds, num_available); | ||
293 | /* Do other things */ | ||
294 | } | ||
295 | |||
296 | 6. Cpufreq Table Generation | ||
297 | =========================== | ||
298 | opp_init_cpufreq_table - The cpufreq framework is typically initialized with | |
299 | cpufreq_frequency_table_cpuinfo, which is provided with the list of | |
300 | frequencies that are available for operation. This function provides | |
301 | a ready-to-use conversion routine to translate the OPP layer's internal | |
302 | information about the available frequencies into a format that can be | |
303 | passed directly to cpufreq. | |
304 | |||
305 | WARNING: Do not use this function in interrupt context. | ||
306 | |||
307 | Example: | ||
308 | soc_pm_init() | ||
309 | { | ||
310 | /* Do things */ | ||
311 | r = opp_init_cpufreq_table(dev, &freq_table); | ||
312 | if (!r) | ||
313 | cpufreq_frequency_table_cpuinfo(policy, freq_table); | ||
314 | /* Do other things */ | ||
315 | } | ||
316 | |||
317 | NOTE: This function is available only if CONFIG_CPU_FREQ is enabled in | |
318 | addition to CONFIG_PM, as the power management feature is required to | |
319 | dynamically scale voltage and frequency in a system. | |
320 | |||
321 | 7. Data Structures | ||
322 | ================== | ||
323 | Typically an SoC contains multiple voltage domains which are variable. Each | ||
324 | domain is represented by a device pointer. The relationship to OPP can be | ||
325 | represented as follows: | ||
326 | SoC | ||
327 | |- device 1 | ||
328 | | |- opp 1 (availability, freq, voltage) | ||
329 | | |- opp 2 .. | ||
330 | ... ... | ||
331 | | `- opp n .. | ||
332 | |- device 2 | ||
333 | ... | ||
334 | `- device m | ||
335 | |||
336 | The OPP library maintains an internal list that the SoC framework populates and | |
337 | that is accessed by various functions as described above. However, the structures | |
338 | representing the actual OPPs and domains are internal to the OPP library itself | |
339 | to allow for suitable abstraction reusable across systems. | |
340 | |||
341 | struct opp - The internal data structure of the OPP library which is used to | |
342 | represent an OPP. In addition to the freq, voltage and availability | |
343 | information, it also contains internal bookkeeping information required | |
344 | for the OPP library to operate on. A pointer to this structure is | |
345 | provided back to users such as the SoC framework to be used as an | |
346 | identifier for the OPP in the interactions with the OPP layer. | |
347 | |||
348 | WARNING: The struct opp pointer should not be parsed or modified by the | |
349 | users. The defaults for an instance are populated by opp_add, but the | |
350 | availability of the OPP can be modified by the opp_enable/disable functions. | |
351 | |||
352 | struct device - This is used to identify a domain to the OPP layer. The | |
353 | nature of the device and its implementation is left to the user of the | |
354 | OPP library, such as the SoC framework. | |
355 | |||
356 | Overall, in a simplistic view, the data structure operations are represented as | |
357 | follows: | |
358 | |||
359 | Initialization / modification: | ||
360 | +-----+ /- opp_enable | ||
361 | opp_add --> | opp | <------- | ||
362 | | +-----+ \- opp_disable | ||
363 | \-------> domain_info(device) | ||
364 | |||
365 | Search functions: | ||
366 | /-- opp_find_freq_ceil ---\ +-----+ | ||
367 | domain_info<---- opp_find_freq_exact -----> | opp | | ||
368 | \-- opp_find_freq_floor ---/ +-----+ | ||
369 | |||
370 | Retrieval functions: | ||
371 | +-----+ /- opp_get_voltage | ||
372 | | opp | <--- | ||
373 | +-----+ \- opp_get_freq | ||
374 | |||
375 | domain_info <- opp_get_opp_count | ||
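The search and availability semantics described in opp.txt can be tried out in userspace with a small model. This is a sketch under stated assumptions, not the code in drivers/base/power/opp.c: every model_* name is hypothetical, the table is a fixed-size sorted array with no RCU or mutex, and search failures return NULL where the real library returns an error pointer checked with IS_ERR().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_OPPS 8

struct model_opp {
    unsigned long freq;     /* Hz */
    unsigned long u_volt;   /* microvolts */
    bool available;
};

struct model_opp_table {
    struct model_opp opps[MODEL_MAX_OPPS];  /* ascending frequency */
    size_t count;
};

/* Like opp_add: insert in sorted order; new OPPs start out available. */
static int model_opp_add(struct model_opp_table *t, unsigned long freq,
                         unsigned long u_volt)
{
    size_t i;

    assert(t != NULL);
    if (t->count == MODEL_MAX_OPPS)
        return -1;
    for (i = t->count; i > 0 && t->opps[i - 1].freq > freq; i--)
        t->opps[i] = t->opps[i - 1];
    t->opps[i] = (struct model_opp){ freq, u_volt, true };
    t->count++;
    return 0;
}

/* Like opp_find_freq_exact: match frequency and availability state. */
static struct model_opp *model_find_exact(struct model_opp_table *t,
                                          unsigned long freq, bool available)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->opps[i].freq == freq && t->opps[i].available == available)
            return &t->opps[i];
    return NULL;
}

/* Like opp_find_freq_ceil: lowest available OPP at or above *freq. */
static struct model_opp *model_find_ceil(struct model_opp_table *t,
                                         unsigned long *freq)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->opps[i].available && t->opps[i].freq >= *freq) {
            *freq = t->opps[i].freq;
            return &t->opps[i];
        }
    return NULL;
}

/* Like opp_find_freq_floor: highest available OPP at or below *freq. */
static struct model_opp *model_find_floor(struct model_opp_table *t,
                                          unsigned long *freq)
{
    for (size_t i = t->count; i > 0; i--)
        if (t->opps[i - 1].available && t->opps[i - 1].freq <= *freq) {
            *freq = t->opps[i - 1].freq;
            return &t->opps[i - 1];
        }
    return NULL;
}

/* Like opp_enable/opp_disable: flip availability of one frequency. */
static int model_set_available(struct model_opp_table *t,
                               unsigned long freq, bool on)
{
    struct model_opp *opp = model_find_exact(t, freq, !on);

    if (opp == NULL)
        return -1;
    opp->available = on;
    return 0;
}

/* Like opp_get_opp_count: number of currently available OPPs. */
static size_t model_count_available(const struct model_opp_table *t)
{
    size_t n = 0;

    for (size_t i = 0; i < t->count; i++)
        n += t->opps[i].available ? 1 : 0;
    return n;
}
```

Searching with a frequency of 0 through model_find_ceil() yields the lowest available OPP, and searching with the largest unsigned long through model_find_floor() yields the highest, mirroring the examples in section 3. After model_set_available() disables an entry, only model_find_exact() with available set to false can still locate it.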
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt index 55b859b3bc72..489e9bacd165 100644 --- a/Documentation/power/runtime_pm.txt +++ b/Documentation/power/runtime_pm.txt | |||
@@ -1,6 +1,7 @@ | |||
1 | Run-time Power Management Framework for I/O Devices | 1 | Run-time Power Management Framework for I/O Devices |
2 | 2 | ||
3 | (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. | 3 | (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. |
4 | (C) 2010 Alan Stern <stern@rowland.harvard.edu> | ||
4 | 5 | ||
5 | 1. Introduction | 6 | 1. Introduction |
6 | 7 | ||
@@ -157,7 +158,8 @@ rules: | |||
157 | to execute it, the other callbacks will not be executed for the same device. | 158 | to execute it, the other callbacks will not be executed for the same device. |
158 | 159 | ||
159 | * A request to execute ->runtime_resume() will cancel any pending or | 160 | * A request to execute ->runtime_resume() will cancel any pending or |
160 | scheduled requests to execute the other callbacks for the same device. | 161 | scheduled requests to execute the other callbacks for the same device, |
162 | except for scheduled autosuspends. | ||
161 | 163 | ||
162 | 3. Run-time PM Device Fields | 164 | 3. Run-time PM Device Fields |
163 | 165 | ||
@@ -165,7 +167,7 @@ The following device run-time PM fields are present in 'struct dev_pm_info', as | |||
165 | defined in include/linux/pm.h: | 167 | defined in include/linux/pm.h: |
166 | 168 | ||
167 | struct timer_list suspend_timer; | 169 | struct timer_list suspend_timer; |
168 | - timer used for scheduling (delayed) suspend request | 170 | - timer used for scheduling (delayed) suspend and autosuspend requests |
169 | 171 | ||
170 | unsigned long timer_expires; | 172 | unsigned long timer_expires; |
171 | - timer expiration time, in jiffies (if this is different from zero, the | 173 | - timer expiration time, in jiffies (if this is different from zero, the |
@@ -230,6 +232,28 @@ defined in include/linux/pm.h: | |||
230 | interface; it may only be modified with the help of the pm_runtime_allow() | 232 | interface; it may only be modified with the help of the pm_runtime_allow() |
231 | and pm_runtime_forbid() helper functions | 233 | and pm_runtime_forbid() helper functions |
232 | 234 | ||
235 | unsigned int no_callbacks; | ||
236 | - indicates that the device does not use the run-time PM callbacks (see | ||
237 | Section 8); it may be modified only by the pm_runtime_no_callbacks() | ||
238 | helper function | ||
239 | |||
240 | unsigned int use_autosuspend; | ||
241 | - indicates that the device's driver supports delayed autosuspend (see | ||
242 | Section 9); it may be modified only by the | ||
243 | pm_runtime{_dont}_use_autosuspend() helper functions | ||
244 | |||
245 | unsigned int timer_autosuspends; | ||
246 | - indicates that the PM core should attempt to carry out an autosuspend | ||
247 | when the timer expires rather than a normal suspend | ||
248 | |||
249 | int autosuspend_delay; | ||
250 | - the delay time (in milliseconds) to be used for autosuspend | ||
251 | |||
252 | unsigned long last_busy; | ||
253 | - the time (in jiffies) when the pm_runtime_mark_last_busy() helper | ||
254 | function was last called for this device; used in calculating inactivity | ||
255 | periods for autosuspend | ||
256 | |||
233 | All of the above fields are members of the 'power' member of 'struct device'. | 257 | All of the above fields are members of the 'power' member of 'struct device'. |
234 | 258 | ||
235 | 4. Run-time PM Device Helper Functions | 259 | 4. Run-time PM Device Helper Functions |
@@ -255,6 +279,12 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
255 | error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt | 279 | error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt |
256 | to suspend the device again in future | 280 | to suspend the device again in future |
257 | 281 | ||
282 | int pm_runtime_autosuspend(struct device *dev); | ||
283 | - same as pm_runtime_suspend() except that the autosuspend delay is taken | ||
284 | into account; if pm_runtime_autosuspend_expiration() says the delay has | ||
285 | not yet expired then an autosuspend is scheduled for the appropriate time | ||
286 | and 0 is returned | ||
287 | |||
258 | int pm_runtime_resume(struct device *dev); | 288 | int pm_runtime_resume(struct device *dev); |
259 | - execute the subsystem-level resume callback for the device; returns 0 on | 289 | - execute the subsystem-level resume callback for the device; returns 0 on |
260 | success, 1 if the device's run-time PM status was already 'active' or | 290 | success, 1 if the device's run-time PM status was already 'active' or |
@@ -267,6 +297,11 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
267 | device (the request is represented by a work item in pm_wq); returns 0 on | 297 | device (the request is represented by a work item in pm_wq); returns 0 on |
268 | success or error code if the request has not been queued up | 298 | success or error code if the request has not been queued up |
269 | 299 | ||
300 | int pm_request_autosuspend(struct device *dev); | ||
301 | - schedule the execution of the subsystem-level suspend callback for the | ||
302 | device when the autosuspend delay has expired; if the delay has already | ||
303 | expired then the work item is queued up immediately | ||
304 | |||
270 | int pm_schedule_suspend(struct device *dev, unsigned int delay); | 305 | int pm_schedule_suspend(struct device *dev, unsigned int delay); |
271 | - schedule the execution of the subsystem-level suspend callback for the | 306 | - schedule the execution of the subsystem-level suspend callback for the |
272 | device in future, where 'delay' is the time to wait before queuing up a | 307 | device in future, where 'delay' is the time to wait before queuing up a |
@@ -298,12 +333,20 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
298 | - decrement the device's usage counter | 333 | - decrement the device's usage counter |
299 | 334 | ||
300 | int pm_runtime_put(struct device *dev); | 335 | int pm_runtime_put(struct device *dev); |
301 | - decrement the device's usage counter, run pm_request_idle(dev) and return | 336 | - decrement the device's usage counter; if the result is 0 then run |
302 | its result | 337 | pm_request_idle(dev) and return its result |
338 | |||
339 | int pm_runtime_put_autosuspend(struct device *dev); | ||
340 | - decrement the device's usage counter; if the result is 0 then run | ||
341 | pm_request_autosuspend(dev) and return its result | ||
303 | 342 | ||
304 | int pm_runtime_put_sync(struct device *dev); | 343 | int pm_runtime_put_sync(struct device *dev); |
305 | - decrement the device's usage counter, run pm_runtime_idle(dev) and return | 344 | - decrement the device's usage counter; if the result is 0 then run |
306 | its result | 345 | pm_runtime_idle(dev) and return its result |
346 | |||
347 | int pm_runtime_put_sync_autosuspend(struct device *dev); | ||
348 | - decrement the device's usage counter; if the result is 0 then run | ||
349 | pm_runtime_autosuspend(dev) and return its result | ||
307 | 350 | ||
308 | void pm_runtime_enable(struct device *dev); | 351 | void pm_runtime_enable(struct device *dev); |
309 | - enable the run-time PM helper functions to run the device bus type's | 352 | - enable the run-time PM helper functions to run the device bus type's |
@@ -349,19 +392,51 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: | |||
349 | counter (used by the /sys/devices/.../power/control interface to | 392 | counter (used by the /sys/devices/.../power/control interface to |
350 | effectively prevent the device from being power managed at run time) | 393 | effectively prevent the device from being power managed at run time) |
351 | 394 | ||
395 | void pm_runtime_no_callbacks(struct device *dev); | ||
396 | - set the power.no_callbacks flag for the device and remove the run-time | ||
397 | PM attributes from /sys/devices/.../power (or prevent them from being | ||
398 | added when the device is registered) | ||
399 | |||
400 | void pm_runtime_mark_last_busy(struct device *dev); | ||
401 | - set the power.last_busy field to the current time | ||
402 | |||
403 | void pm_runtime_use_autosuspend(struct device *dev); | ||
404 | - set the power.use_autosuspend flag, enabling autosuspend delays | ||
405 | |||
406 | void pm_runtime_dont_use_autosuspend(struct device *dev); | ||
407 | - clear the power.use_autosuspend flag, disabling autosuspend delays | ||
408 | |||
409 | void pm_runtime_set_autosuspend_delay(struct device *dev, int delay); | ||
410 | - set the power.autosuspend_delay value to 'delay' (expressed in | ||
411 | milliseconds); if 'delay' is negative then run-time suspends are | ||
412 | prevented | ||
413 | |||
414 | unsigned long pm_runtime_autosuspend_expiration(struct device *dev); | ||
415 | - calculate the time when the current autosuspend delay period will expire, | ||
416 | based on power.last_busy and power.autosuspend_delay; if the delay time | ||
417 | is 1000 ms or larger then the expiration time is rounded up to the | ||
418 | nearest second; returns 0 if the delay period has already expired or | ||
419 | power.use_autosuspend isn't set, otherwise returns the expiration time | ||
420 | in jiffies | ||
421 | |||
352 | It is safe to execute the following helper functions from interrupt context: | 422 | It is safe to execute the following helper functions from interrupt context: |
353 | 423 | ||
354 | pm_request_idle() | 424 | pm_request_idle() |
425 | pm_request_autosuspend() | ||
355 | pm_schedule_suspend() | 426 | pm_schedule_suspend() |
356 | pm_request_resume() | 427 | pm_request_resume() |
357 | pm_runtime_get_noresume() | 428 | pm_runtime_get_noresume() |
358 | pm_runtime_get() | 429 | pm_runtime_get() |
359 | pm_runtime_put_noidle() | 430 | pm_runtime_put_noidle() |
360 | pm_runtime_put() | 431 | pm_runtime_put() |
432 | pm_runtime_put_autosuspend() | ||
433 | pm_runtime_enable() | ||
361 | pm_suspend_ignore_children() | 434 | pm_suspend_ignore_children() |
362 | pm_runtime_set_active() | 435 | pm_runtime_set_active() |
363 | pm_runtime_set_suspended() | 436 | pm_runtime_set_suspended() |
364 | pm_runtime_enable() | 437 | pm_runtime_suspended() |
438 | pm_runtime_mark_last_busy() | ||
439 | pm_runtime_autosuspend_expiration() | ||
365 | 440 | ||
366 | 5. Run-time PM Initialization, Device Probing and Removal | 441 | 5. Run-time PM Initialization, Device Probing and Removal |
367 | 442 | ||
@@ -524,3 +599,141 @@ poweroff and run-time suspend callback, and similarly for system resume, thaw, | |||
524 | restore, and run-time resume, can achieve this with the help of the | 599 | restore, and run-time resume, can achieve this with the help of the |
525 | UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its | 600 | UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its |
526 | last argument to NULL). | 601 | last argument to NULL). |
602 | |||
603 | 8. "No-Callback" Devices | ||
604 | |||
605 | Some "devices" are only logical sub-devices of their parent and cannot be | ||
606 | power-managed on their own. (The prototype example is a USB interface. Entire | ||
607 | USB devices can go into low-power mode or send wake-up requests, but neither is | ||
608 | possible for individual interfaces.) The drivers for these devices have no | ||
609 | need of run-time PM callbacks; if the callbacks did exist, ->runtime_suspend() | ||
610 | and ->runtime_resume() would always return 0 without doing anything else and | ||
611 | ->runtime_idle() would always call pm_runtime_suspend(). | ||
612 | |||
613 | Subsystems can tell the PM core about these devices by calling | ||
614 | pm_runtime_no_callbacks(). This should be done after the device structure is | ||
615 | initialized and before it is registered (although after device registration is | ||
616 | also okay). The routine will set the device's power.no_callbacks flag and | ||
617 | prevent the non-debugging run-time PM sysfs attributes from being created. | ||
618 | |||
619 | When power.no_callbacks is set, the PM core will not invoke the | ||
620 | ->runtime_idle(), ->runtime_suspend(), or ->runtime_resume() callbacks. | ||
621 | Instead it will assume that suspends and resumes always succeed and that idle | ||
622 | devices should be suspended. | ||
623 | |||
624 | As a consequence, the PM core will never directly inform the device's subsystem | ||
625 | or driver about run-time power changes. Instead, the driver for the device's | ||
626 | parent must take responsibility for telling the device's driver when the | ||
627 | parent's power state changes. | ||
628 | |||
629 | 9. Autosuspend, or automatically-delayed suspends | ||
630 | |||
631 | Changing a device's power state isn't free; it requires both time and energy. | ||
632 | A device should be put in a low-power state only when there's some reason to | ||
633 | think it will remain in that state for a substantial time. A common heuristic | ||
634 | says that a device which hasn't been used for a while is liable to remain | ||
635 | unused; following this advice, drivers should not allow devices to be suspended | ||
636 | at run-time until they have been inactive for some minimum period. Even when | ||
637 | the heuristic ends up being non-optimal, it will still prevent devices from | ||
638 | "bouncing" too rapidly between low-power and full-power states. | ||
639 | |||
640 | The term "autosuspend" is an historical remnant. It doesn't mean that the | ||
641 | device is automatically suspended (the subsystem or driver still has to call | ||
642 | the appropriate PM routines); rather it means that run-time suspends will | ||
643 | automatically be delayed until the desired period of inactivity has elapsed. | ||
644 | |||
645 | Inactivity is determined based on the power.last_busy field. Drivers should | ||
646 | call pm_runtime_mark_last_busy() to update this field after carrying out I/O, | ||
647 | typically just before calling pm_runtime_put_autosuspend(). The desired length | ||
648 | of the inactivity period is a matter of policy. Subsystems can set this length | ||
649 | initially by calling pm_runtime_set_autosuspend_delay(), but after device | ||
650 | registration the length should be controlled by user space, using the | ||
651 | /sys/devices/.../power/autosuspend_delay_ms attribute. | ||
652 | |||
653 | In order to use autosuspend, subsystems or drivers must call | ||
654 | pm_runtime_use_autosuspend() (preferably before registering the device), and | ||
655 | thereafter they should use the various *_autosuspend() helper functions instead | ||
656 | of the non-autosuspend counterparts: | ||
657 | |||
658 | Instead of: pm_runtime_suspend use: pm_runtime_autosuspend; | ||
659 | Instead of: pm_schedule_suspend use: pm_request_autosuspend; | ||
660 | Instead of: pm_runtime_put use: pm_runtime_put_autosuspend; | ||
661 | Instead of: pm_runtime_put_sync use: pm_runtime_put_sync_autosuspend. | ||
662 | |||
663 | Drivers may also continue to use the non-autosuspend helper functions; they | ||
664 | will behave normally, not taking the autosuspend delay into account. | ||
665 | Similarly, if the power.use_autosuspend field isn't set then the autosuspend | ||
666 | helper functions will behave just like the non-autosuspend counterparts. | ||
667 | |||
668 | The implementation is well suited for asynchronous use in interrupt contexts. | ||
669 | However such use inevitably involves races, because the PM core can't | ||
670 | synchronize ->runtime_suspend() callbacks with the arrival of I/O requests. | ||
671 | This synchronization must be handled by the driver, using its private lock. | ||
672 | Here is a schematic pseudo-code example: | ||
673 | |||
674 | foo_read_or_write(struct foo_priv *foo, void *data) | ||
675 | { | ||
676 | lock(&foo->private_lock); | ||
677 | add_request_to_io_queue(foo, data); | ||
678 | if (foo->num_pending_requests++ == 0) | ||
679 | pm_runtime_get(&foo->dev); | ||
680 | if (!foo->is_suspended) | ||
681 | foo_process_next_request(foo); | ||
682 | unlock(&foo->private_lock); | ||
683 | } | ||
684 | |||
685 | foo_io_completion(struct foo_priv *foo, void *req) | ||
686 | { | ||
687 | lock(&foo->private_lock); | ||
688 | if (--foo->num_pending_requests == 0) { | ||
689 | pm_runtime_mark_last_busy(&foo->dev); | ||
690 | pm_runtime_put_autosuspend(&foo->dev); | ||
691 | } else { | ||
692 | foo_process_next_request(foo); | ||
693 | } | ||
694 | unlock(&foo->private_lock); | ||
695 | /* Send req result back to the user ... */ | ||
696 | } | ||
697 | |||
698 | int foo_runtime_suspend(struct device *dev) | ||
699 | { | ||
700 | struct foo_priv *foo = container_of(dev, ...); | ||
701 | int ret = 0; | ||
702 | |||
703 | lock(&foo->private_lock); | ||
704 | if (foo->num_pending_requests > 0) { | ||
705 | ret = -EBUSY; | ||
706 | } else { | ||
707 | /* ... suspend the device ... */ | ||
708 | foo->is_suspended = 1; | ||
709 | } | ||
710 | unlock(&foo->private_lock); | ||
711 | return ret; | ||
712 | } | ||
713 | |||
714 | int foo_runtime_resume(struct device *dev) | ||
715 | { | ||
716 | struct foo_priv *foo = container_of(dev, ...); | ||
717 | |||
718 | lock(&foo->private_lock); | ||
719 | /* ... resume the device ... */ | ||
720 | foo->is_suspended = 0; | ||
721 | pm_runtime_mark_last_busy(&foo->dev); | ||
722 | if (foo->num_pending_requests > 0) | ||
723 | foo_process_next_request(foo); | ||
724 | unlock(&foo->private_lock); | ||
725 | return 0; | ||
726 | } | ||
727 | |||
728 | The important point is that after foo_io_completion() asks for an autosuspend, | ||
729 | the foo_runtime_suspend() callback may race with foo_read_or_write(). | ||
730 | Therefore foo_runtime_suspend() has to check whether there are any pending I/O | ||
731 | requests (while holding the private lock) before allowing the suspend to | ||
732 | proceed. | ||
733 | |||
734 | In addition, the power.autosuspend_delay field can be changed by user space at | ||
735 | any time. If a driver cares about this, it can call | ||
736 | pm_runtime_autosuspend_expiration() from within the ->runtime_suspend() | ||
737 | callback while holding its private lock. If the function returns a nonzero | ||
738 | value then the delay has not yet expired and the callback should return | ||
739 | -EAGAIN. | ||
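The expiration check described in the last paragraph can be modeled in userspace as follows. This is a simplified sketch, not the kernel implementation: the `model_*` names and the `struct model_pm` layout are hypothetical, time is counted in milliseconds rather than jiffies, counter wraparound is ignored, and the negative-delay case is not modeled. It follows the description of pm_runtime_autosuspend_expiration() in Section 4: the result is 0 when autosuspend is not in use or the delay has already expired, and delays of 1000 ms or more are rounded up to a whole second.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the autosuspend fields; NOT struct dev_pm_info. */
struct model_pm {
	bool use_autosuspend;
	unsigned int autosuspend_delay_ms;
	unsigned long last_busy_ms;	/* time of the last "mark busy" */
};

/*
 * Returns 0 if autosuspend is unused or the delay has already expired,
 * otherwise the expiration time in milliseconds.
 */
unsigned long model_autosuspend_expiration(const struct model_pm *pm,
					   unsigned long now_ms)
{
	unsigned long expires;

	assert(pm != NULL);
	if (!pm->use_autosuspend)
		return 0;

	expires = pm->last_busy_ms + pm->autosuspend_delay_ms;
	if (pm->autosuspend_delay_ms >= 1000)
		expires = (expires + 999) / 1000 * 1000; /* round up to 1 s */

	return expires <= now_ms ? 0 : expires;
}

/*
 * Model of the decision a ->runtime_suspend() callback makes while
 * holding the driver's private lock (locking is omitted here).
 */
int model_runtime_suspend(const struct model_pm *pm, unsigned long now_ms,
			  int num_pending_requests)
{
	if (num_pending_requests > 0)
		return -EBUSY;		/* I/O arrived in the meantime */
	if (model_autosuspend_expiration(pm, now_ms) != 0)
		return -EAGAIN;		/* delay was extended; try later */
	return 0;			/* safe to suspend */
}
```

The ordering of the two checks is a design choice of this sketch: pending I/O is reported first because it means the device is actively needed, whereas an unexpired delay only means the suspend should be retried later.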
diff --git a/Documentation/power/s2ram.txt b/Documentation/power/s2ram.txt
index 514b94fc931e..1bdfa0443773 100644
--- a/Documentation/power/s2ram.txt
+++ b/Documentation/power/s2ram.txt
@@ -49,6 +49,13 @@ machine that doesn't boot) is: | |||
49 | device (lspci and /sys/devices/pci* is your friend), and see if you can | 49 | device (lspci and /sys/devices/pci* is your friend), and see if you can |
50 | fix it, disable it, or trace into its resume function. | 50 | fix it, disable it, or trace into its resume function. |
51 | 51 | ||
52 | If no device matches the hash (or any matches appear to be false positives), | ||
53 | the culprit may be a device from a loadable kernel module that is not loaded | ||
54 | until after the hash is checked. You can check the hash against the current | ||
55 | devices again after more modules are loaded using sysfs: | ||
56 | |||
57 | cat /sys/power/pm_trace_dev_match | ||
58 | |||
52 | For example, the above happens to be the VGA device on my EVO, which I | 59 | For example, the above happens to be the VGA device on my EVO, which I |
53 | used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out | 60 | used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out |
54 | that "radeonfb" simply cannot resume that device - it tries to set the | 61 | that "radeonfb" simply cannot resume that device - it tries to set the |
diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt
index 9d60ab717a7b..ea718891a665 100644
--- a/Documentation/power/swsusp.txt
+++ b/Documentation/power/swsusp.txt
@@ -66,7 +66,8 @@ swsusp saves the state of the machine into active swaps and then reboots or | |||
66 | powerdowns. You must explicitly specify the swap partition to resume from with | 66 | powerdowns. You must explicitly specify the swap partition to resume from with |
67 | ``resume='' kernel option. If signature is found it loads and restores saved | 67 | ``resume='' kernel option. If signature is found it loads and restores saved |
68 | state. If the option ``noresume'' is specified as a boot parameter, it skips | 68 | state. If the option ``noresume'' is specified as a boot parameter, it skips |
69 | the resuming. | 69 | the resuming. If the option ``hibernate=nocompress'' is specified as a boot |
70 | parameter, it saves the hibernation image without compression. | ||
70 | 71 | ||
71 | In the meantime while the system is suspended you should not add/remove any | 72 | In the meantime while the system is suspended you should not add/remove any |
72 | of the hardware, write to the filesystems, etc. | 73 | of the hardware, write to the filesystems, etc. |