author Javi Merino <javi.merino@arm.com> 2015-03-02 12:17:19 -0500
committer Eduardo Valentin <edubezval@gmail.com> 2015-05-05 00:27:52 -0400
commit 6b775e870c56c59c3e16531ea2307b797395f9f7
tree 40cceadc9fd3cfc6f30efe8b90db92c703ea7e00
parent c36cf07176316fbe6a4bdbc23afcb0cbf7822bf2

thermal: introduce the Power Allocator governor

The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature. Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on "power actors", entities that represent heat
sources. They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone. The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control. It decides how much power to give each cooling device based
on the performance they are requesting. The PID controller sizes the
total power budget so that the temperature does not exceed the control
temperature.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>

-rw-r--r--  Documentation/thermal/power_allocator.txt  247
-rw-r--r--  drivers/thermal/Kconfig                     15
-rw-r--r--  drivers/thermal/Makefile                     1
-rw-r--r--  drivers/thermal/power_allocator.c          520
-rw-r--r--  drivers/thermal/thermal_core.c               9
-rw-r--r--  drivers/thermal/thermal_core.h               8
-rw-r--r--  include/linux/thermal.h                     37
7 files changed, 830 insertions, 7 deletions
diff --git a/Documentation/thermal/power_allocator.txt b/Documentation/thermal/power_allocator.txt
new file mode 100644
index 000000000000..c3797b529991
--- /dev/null
+++ b/Documentation/thermal/power_allocator.txt
@@ -0,0 +1,247 @@
+Power allocator governor tunables
+=================================
+
+Trip points
+-----------
+
+The governor requires the following two passive trip points:
+
+1.  "switch on" trip point: temperature above which the governor
+    control loop starts operating. This is the first passive trip
+    point of the thermal zone.
+
+2.  "desired temperature" trip point: it should be higher than the
+    "switch on" trip point. This is the target temperature the governor
+    is controlling for. This is the last passive trip point of the
+    thermal zone.
+
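Annotation: the trip selection described above can be sketched in plain C. This is a minimal userspace sketch, not the governor's code: `enum trip_type` and `find_governor_trips()` are hypothetical names, and unlike the patch's `get_governor_trips()` it scans the whole trip list instead of stopping at the first non-passive trip.

```c
#include <stddef.h>

/* Hypothetical trip types for this sketch; not the kernel's enum. */
enum trip_type { TRIP_ACTIVE, TRIP_PASSIVE, TRIP_HOT, TRIP_CRITICAL };

/*
 * Find the first and the last passive trip in a list of trip types.
 * Returns 0 and fills both indices when at least two passive trips
 * exist, -1 otherwise.
 */
static int find_governor_trips(const enum trip_type *types, size_t n,
			       int *switch_on, int *desired)
{
	int first = -1, last = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (types[i] != TRIP_PASSIVE)
			continue;
		if (first < 0)
			first = (int)i;	/* "switch on" trip */
		else
			last = (int)i;	/* candidate "desired temperature" trip */
	}

	if (first < 0 || last < 0)
		return -1;

	*switch_on = first;
	*desired = last;
	return 0;
}
```

A zone with only one passive trip is rejected, which mirrors the requirement for two distinct passive trip points.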
+PID Controller
+--------------
+
+The power allocator governor implements a
+Proportional-Integral-Derivative controller (PID controller) with
+temperature as the control input and power as the controlled output:
+
+    P_max = k_p * e + k_i * err_integral + k_d * diff_err + sustainable_power
+
+where
+    e = desired_temperature - current_temperature
+    err_integral is the sum of previous errors
+    diff_err = e - previous_error
+
+It is similar to the one depicted below:
+
+                                      k_d
+                                       |
+current_temp                           |
+     |                                 v
+     |                +----------+   +---+
+     |         +----->| diff_err |-->| X |------+
+     |         |      +----------+   +---+      |
+     |         |                                |      tdp        actor
+     |         |                      k_i       |       |  get_requested_power()
+     |         |                       |        |       |        |     |
+     |         |                       |        |       |        |     | ...
+     v         |                       v        v       v        v     v
+   +---+       |      +-------+      +---+    +---+   +---+   +----------+
+   | S |-------+----->| sum e |----->| X |--->| S |-->| S |-->|power     |
+   +---+       |      +-------+      +---+    +---+   +---+   |allocation|
+     ^         |                                ^             +----------+
+     |         |                                |                |     |
+     |         |        +---+                   |                |     |
+     |         +------->| X |-------------------+                v     v
+     |                  +---+                               granted performance
+desired_temperature       ^
+                          |
+                          |
+                      k_po/k_pu
+
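Annotation: the `P_max` equation can be written out as one controller step. This is a sketch under simplifying assumptions: it uses plain integers rather than the governor's fixed-point arithmetic, it omits the integral cutoff and the clamping of the result, and `struct pid_sketch` and `pid_p_max()` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical controller gains and state for this sketch. */
struct pid_sketch {
	int64_t k_p;
	int64_t k_i;
	int64_t k_d;
	int64_t err_integral;	/* sum of previous errors */
	int64_t prev_err;	/* error seen in the previous step */
};

/* One controller step: returns P_max and updates the stored state. */
static int64_t pid_p_max(struct pid_sketch *pid, int64_t desired_temp,
			 int64_t current_temp, int64_t sustainable_power)
{
	int64_t e = desired_temp - current_temp;
	int64_t diff_err = e - pid->prev_err;
	int64_t p_max;

	p_max = pid->k_p * e + pid->k_i * pid->err_integral +
		pid->k_d * diff_err + sustainable_power;

	/* Remember this step's error for the next iteration. */
	pid->err_integral += e;
	pid->prev_err = e;

	return p_max;
}
```

With `k_p = 2`, `k_i = 1`, `k_d = 1`, a target of 80, a reading of 70 and `sustainable_power = 1000`, the first step gives 2 * 10 + 0 + 10 + 1000 = 1030.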
+Sustainable power
+-----------------
+
+An estimate of the sustainable dissipatable power (in mW) should be
+provided while registering the thermal zone. This estimates the
+sustained power that can be dissipated at the desired control
+temperature. This is the maximum sustained power for allocation at
+the desired maximum temperature. The actual sustained power can vary
+for a number of reasons. The closed loop controller will take care of
+variations such as environmental conditions, and some factors related
+to the speed-grade of the silicon. `sustainable_power` is therefore
+simply an estimate, and may be tuned to affect the aggressiveness of
+the thermal ramp. For reference, the sustainable power of a 4" phone
+is typically 2000mW, while on a 10" tablet it is around 4500mW (may
+vary depending on screen size).
+
+If you are using device tree, add it as a property of the
+thermal-zone. For example:
+
+	thermal-zones {
+		soc_thermal {
+			polling-delay = <1000>;
+			polling-delay-passive = <100>;
+			sustainable-power = <2500>;
+			...
+
+Instead, if the thermal zone is registered from the platform code, pass a
+`thermal_zone_params` that has a `sustainable_power`. If no
+`thermal_zone_params` is being passed yet, then something like the
+following will suffice. It is not `const` because the governor fills in
+default values for any PID constants left at zero:
+
+	static struct thermal_zone_params tz_params = {
+		.sustainable_power = 3500,
+	};
+
+and then pass `tz_params` as the 6th parameter to
+`thermal_zone_device_register()`
+
+k_po and k_pu
+-------------
+
+The implementation of the PID controller in the power allocator
+thermal governor allows the configuration of two proportional term
+constants: `k_po` and `k_pu`. `k_po` is the proportional term
+constant during temperature overshoot periods (current temperature is
+above "desired temperature" trip point). Conversely, `k_pu` is the
+proportional term constant during temperature undershoot periods
+(current temperature below "desired temperature" trip point).
+
+These controls are intended as the primary mechanism for configuring
+the permitted thermal "ramp" of the system. For instance, a lower
+`k_pu` value will provide a slower ramp, at the cost of capping
+available capacity at a low temperature. On the other hand, a high
+value of `k_pu` will result in the governor granting very high power
+whilst temperature is low, and may lead to temperature overshooting.
+
+The default value for `k_pu` is:
+
+    2 * sustainable_power / (desired_temperature - switch_on_temp)
+
+This means that at `switch_on_temp` the output of the controller's
+proportional term will be 2 * `sustainable_power`. The default value
+for `k_po` is:
+
+    sustainable_power / (desired_temperature - switch_on_temp)
+
+Focusing on the proportional and feed forward values of the PID
+controller equation we have:
+
+    P_max = k_p * e + sustainable_power
+
+The proportional term is proportional to the difference between the
+desired temperature and the current one. When the current temperature
+is the desired one, then the proportional component is zero and
+`P_max` = `sustainable_power`. That is, the system should operate in
+thermal equilibrium under constant load. `sustainable_power` is only
+an estimate, which is the reason for closed-loop control such as this.
+
+Expanding `k_pu` we get:
+    P_max = 2 * sustainable_power * (T_set - T) / (T_set - T_on) +
+        sustainable_power
+
+where
+    T_set is the desired temperature
+    T is the current temperature
+    T_on is the switch on temperature
+
+When the current temperature is the switch_on temperature, the above
+formula becomes:
+
+    P_max = 2 * sustainable_power * (T_set - T_on) / (T_set - T_on) +
+        sustainable_power = 2 * sustainable_power + sustainable_power =
+        3 * sustainable_power
+
+Therefore, the proportional term alone linearly decreases power from
+3 * `sustainable_power` to `sustainable_power` as the temperature
+rises from the switch on temperature to the desired temperature.
+
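Annotation: the linear ramp produced by the default `k_pu` can be checked numerically. The sketch below assumes temperatures in millicelsius and power in mW, considers only the proportional and feed-forward terms, and uses the hypothetical name `p_max_undershoot()`.

```c
#include <stdint.h>

/*
 * P_max while undershooting, with the default k_pu expanded:
 *   2 * sustainable_power * (T_set - T) / (T_set - T_on) + sustainable_power
 * The multiplication is done before the division to keep integer precision.
 */
static int64_t p_max_undershoot(int64_t sustainable_power, int64_t t_set,
				int64_t t_on, int64_t t)
{
	int64_t proportional = 2 * sustainable_power * (t_set - t) /
			       (t_set - t_on);

	return proportional + sustainable_power;
}
```

For `sustainable_power = 2500`, `T_on = 70000` and `T_set = 85000`, this gives 7500 mW at the switch on temperature, 5000 mW halfway (77500) and 2500 mW at the desired temperature.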
+k_i and integral_cutoff
+-----------------------
+
+`k_i` configures the PID loop's integral term constant. This term
+allows the PID controller to compensate for long term drift and for
+the quantized nature of the output control: cooling devices can't set
+the exact power that the governor requests. When the temperature
+error is below `integral_cutoff`, errors are accumulated in the
+integral term. This term is then multiplied by `k_i` and the result
+added to the output of the controller. Typically `k_i` is set low (1
+or 2) and `integral_cutoff` is 0.
+
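Annotation: the effect of `integral_cutoff` can be sketched as follows. This is a simplified illustration with plain integers: it leaves out the bound that the patch places on the integral term (it must stay below the maximum allocatable power), and `struct integral_sketch` and `integral_term()` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical integral-term state for this sketch. */
struct integral_sketch {
	int64_t k_i;
	int64_t integral_cutoff;
	int64_t err_integral;
};

/* Returns the integral term's contribution for this iteration. */
static int64_t integral_term(struct integral_sketch *s, int64_t err)
{
	/* Only errors below the cutoff are accumulated. */
	if (err < s->integral_cutoff)
		s->err_integral += err;

	return s->k_i * s->err_integral;
}
```

With `integral_cutoff = 0`, only negative errors (temperature above the target) accumulate, so a cool system does not build up a positive integral term.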
+k_d
+---
+
+`k_d` configures the PID loop's derivative term constant. It's
+recommended to leave it as the default: 0.
+
+Cooling device power API
+========================
+
+Cooling devices controlled by this governor must supply the additional
+"power" API in their `cooling_device_ops`. It consists of three ops:
+
+1. int get_requested_power(struct thermal_cooling_device *cdev,
+	struct thermal_zone_device *tz, u32 *power);
+@cdev: The `struct thermal_cooling_device` pointer
+@tz: thermal zone in which we are currently operating
+@power: pointer in which to store the calculated power
+
+`get_requested_power()` calculates the power requested by the device
+in milliwatts and stores it in @power. It should return 0 on
+success, -E* on failure. This is currently used by the power
+allocator governor to calculate how much power to give to each cooling
+device.
+
+2. int state2power(struct thermal_cooling_device *cdev, struct
+        thermal_zone_device *tz, unsigned long state, u32 *power);
+@cdev: The `struct thermal_cooling_device` pointer
+@tz: thermal zone in which we are currently operating
+@state: A cooling device state
+@power: pointer in which to store the equivalent power
+
+Convert cooling device state @state into power consumption in
+milliwatts and store it in @power. It should return 0 on success, -E*
+on failure. This is currently used by thermal core to calculate the
+maximum power that an actor can consume.
+
+3. int power2state(struct thermal_cooling_device *cdev, u32 power,
+	unsigned long *state);
+@cdev: The `struct thermal_cooling_device` pointer
+@power: power in milliwatts
+@state: pointer in which to store the resulting state
+
+Calculate a cooling device state that would make the device consume at
+most @power mW and store it in @state. It should return 0 on success,
+-E* on failure. This is currently used by the thermal core to convert
+a given power set by the power allocator governor to a state that the
+cooling device can set. It is a function because this conversion may
+depend on external factors that may change, so this function should
+return the best conversion given "current circumstances".
+
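Annotation: a table-driven device is one way to picture the state/power conversions. The sketch below is userspace C with made-up numbers; `power_table_mw`, `sketch_state2power()` and `sketch_power2state()` are hypothetical and drop the `cdev` and `tz` arguments of the real ops.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical power table: the index is the cooling state, the value
 * is the power in mW the device draws in that state. State 0 is
 * unthrottled.
 */
static const uint32_t power_table_mw[] = { 2000, 1500, 900, 400 };
#define NUM_STATES (sizeof(power_table_mw) / sizeof(power_table_mw[0]))

/* Convert a cooling state into the power drawn in that state. */
static int sketch_state2power(unsigned long state, uint32_t *power)
{
	if (state >= NUM_STATES)
		return -EINVAL;

	*power = power_table_mw[state];
	return 0;
}

/* Find the least-throttled state that draws at most @power mW. */
static int sketch_power2state(uint32_t power, unsigned long *state)
{
	unsigned long s;

	for (s = 0; s < NUM_STATES; s++) {
		if (power_table_mw[s] <= power) {
			*state = s;
			return 0;
		}
	}

	/* Budget below the lowest entry: fall back to the deepest state. */
	*state = NUM_STATES - 1;
	return 0;
}
```

A budget of 1600 mW maps to state 1 (1500 mW), since state 0 would exceed it. A real device may need to recompute the table when external conditions change, which is why the conversion is a function and not a fixed mapping.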
+Cooling device weights
+----------------------
+
+Weights are a mechanism to bias the allocation among cooling
+devices. They express the relative power efficiency of different
+cooling devices. Higher weight can be used to express higher power
+efficiency. Weighting is relative such that if each cooling device
+has a weight of one they are considered equal. This is particularly
+useful in heterogeneous systems where two cooling devices may perform
+the same kind of compute, but with different efficiency. For example,
+a system with two different types of processors.
+
+If the thermal zone is registered using
+`thermal_zone_device_register()` (i.e., platform code), then weights
+are passed as part of the thermal zone's `thermal_bind_params`.
+If the thermal zone is registered using device tree, then they are
+passed as the `contribution` property of each map in the
+`cooling-maps` node.
+
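Annotation: the bias that weights introduce can be illustrated by scaling each request by its weight before sharing a budget pro rata. This is a sketch with plain integer weights, not the governor's fixed-point handling, and `weighted_shares()` is a hypothetical name.

```c
#include <stdint.h>

/* Scale each request by its weight, then share the budget pro rata. */
static void weighted_shares(const uint32_t *req, const uint32_t *weight,
			    int n, uint32_t budget, uint32_t *granted)
{
	uint64_t total = 0;
	int i;

	for (i = 0; i < n; i++)
		total += (uint64_t)req[i] * weight[i];

	for (i = 0; i < n; i++) {
		uint64_t scaled = (uint64_t)req[i] * weight[i];

		/* No weighted demand at all: grant nothing. */
		granted[i] = total ? (uint32_t)(scaled * budget / total) : 0;
	}
}
```

Two devices that each request 1000 mW with weights 1 and 3 split a 2000 mW budget as 500 mW and 1500 mW. With equal weights they would each receive 1000 mW.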
+Limitations of the power allocator governor
+===========================================
+
+The power allocator governor's PID controller works best if there is a
+periodic tick. If you have a driver that calls
+`thermal_zone_device_update()` (or anything that ends up calling the
+governor's `throttle()` function) repeatedly, outside of that tick,
+the governor response won't be very good. Note that this is not
+particular to this governor: step-wise will also misbehave if you call
+its throttle() faster than the normal thermal framework tick (due to
+interrupts, for example), as it will overreact.
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 30aee81e9f5b..a1b43eab0a70 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -71,6 +71,14 @@ config THERMAL_DEFAULT_GOV_USER_SPACE
 	  Select this if you want to let the user space manage the
 	  platform thermals.
 
+config THERMAL_DEFAULT_GOV_POWER_ALLOCATOR
+	bool "power_allocator"
+	select THERMAL_GOV_POWER_ALLOCATOR
+	help
+	  Select this if you want to control temperature based on
+	  system and device power allocation. This governor can only
+	  operate on cooling devices that implement the power API.
+
 endchoice
 
 config THERMAL_GOV_FAIR_SHARE
@@ -99,6 +107,13 @@ config THERMAL_GOV_USER_SPACE
 	help
 	  Enable this to let the user space manage the platform thermals.
 
+config THERMAL_GOV_POWER_ALLOCATOR
+	bool "Power allocator thermal governor"
+	select THERMAL_POWER_ACTOR
+	help
+	  Enable this to manage platform thermals by dynamically
+	  allocating and limiting power to devices.
+
 config CPU_THERMAL
 	bool "generic cpu cooling support"
 	depends on CPU_FREQ
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 1fe86652cfb6..b1783cf37ed2 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -14,6 +14,7 @@ thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o
 thermal_sys-$(CONFIG_THERMAL_GOV_BANG_BANG)	+= gov_bang_bang.o
 thermal_sys-$(CONFIG_THERMAL_GOV_STEP_WISE)	+= step_wise.o
 thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)	+= user_space.o
+thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)	+= power_allocator.o
 
 # cpufreq cooling
 thermal_sys-$(CONFIG_CPU_THERMAL)	+= cpu_cooling.o
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
new file mode 100644
index 000000000000..67982d79b76c
--- /dev/null
+++ b/drivers/thermal/power_allocator.c
@@ -0,0 +1,520 @@
+/*
+ * A power allocator to manage temperature
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#define pr_fmt(fmt) "Power allocator: " fmt
+
+#include <linux/rculist.h>
+#include <linux/slab.h>
+#include <linux/thermal.h>
+
+#include "thermal_core.h"
+
+#define FRAC_BITS 10
+#define int_to_frac(x) ((x) << FRAC_BITS)
+#define frac_to_int(x) ((x) >> FRAC_BITS)
+
+/**
+ * mul_frac() - multiply two fixed-point numbers
+ * @x:	first multiplicand
+ * @y:	second multiplicand
+ *
+ * Return: the result of multiplying two fixed-point numbers.  The
+ * result is also a fixed-point number.
+ */
+static inline s64 mul_frac(s64 x, s64 y)
+{
+	return (x * y) >> FRAC_BITS;
+}
+
+/**
+ * div_frac() - divide two fixed-point numbers
+ * @x:	the dividend
+ * @y:	the divisor
+ *
+ * Return: the result of dividing two fixed-point numbers.  The
+ * result is also a fixed-point number.
+ */
+static inline s64 div_frac(s64 x, s64 y)
+{
+	return div_s64(x << FRAC_BITS, y);
+}
+
+/**
+ * struct power_allocator_params - parameters for the power allocator governor
+ * @err_integral:	accumulated error in the PID controller.
+ * @prev_err:	error in the previous iteration of the PID controller.
+ *		Used to calculate the derivative term.
+ * @trip_switch_on:	first passive trip point of the thermal zone.  The
+ *			governor switches on when this trip point is crossed.
+ * @trip_max_desired_temperature:	last passive trip point of the thermal
+ *					zone.  The temperature we are
+ *					controlling for.
+ */
+struct power_allocator_params {
+	s64 err_integral;
+	s32 prev_err;
+	int trip_switch_on;
+	int trip_max_desired_temperature;
+};
+
+/**
+ * pid_controller() - PID controller
+ * @tz:	thermal zone we are operating in
+ * @current_temp:	the current temperature in millicelsius
+ * @control_temp:	the target temperature in millicelsius
+ * @max_allocatable_power:	maximum allocatable power for this thermal zone
+ *
+ * This PID controller increases the available power budget so that the
+ * temperature of the thermal zone gets as close as possible to
+ * @control_temp and limits the power if it exceeds it.  k_po is the
+ * proportional term when we are overshooting, k_pu is the
+ * proportional term when we are undershooting.  integral_cutoff is a
+ * threshold: the error is only accumulated while it is below it.  The
+ * accumulated error is only valid if the requested power will make
+ * the system warmer.  If the system is mostly idle, there's no point
+ * in accumulating positive error.
+ *
+ * Return: The power budget for the next period.
+ */
+static u32 pid_controller(struct thermal_zone_device *tz,
+			  unsigned long current_temp,
+			  unsigned long control_temp,
+			  u32 max_allocatable_power)
+{
+	s64 p, i, d, power_range;
+	s32 err, max_power_frac;
+	struct power_allocator_params *params = tz->governor_data;
+
+	max_power_frac = int_to_frac(max_allocatable_power);
+
+	err = ((s32)control_temp - (s32)current_temp);
+	err = int_to_frac(err);
+
+	/* Calculate the proportional term */
+	p = mul_frac(err < 0 ? tz->tzp->k_po : tz->tzp->k_pu, err);
+
+	/*
+	 * Calculate the integral term
+	 *
+	 * if the error is less than cut off allow integration (but
+	 * the integral is limited to max power)
+	 */
+	i = mul_frac(tz->tzp->k_i, params->err_integral);
+
+	if (err < int_to_frac(tz->tzp->integral_cutoff)) {
+		s64 i_next = i + mul_frac(tz->tzp->k_i, err);
+
+		if (abs64(i_next) < max_power_frac) {
+			i = i_next;
+			params->err_integral += err;
+		}
+	}
+
+	/*
+	 * Calculate the derivative term
+	 *
+	 * We do err - prev_err, so with a positive k_d, a decreasing
+	 * error (i.e. driving closer to the line) results in less
+	 * power being applied, slowing down the controller
+	 */
+	d = mul_frac(tz->tzp->k_d, err - params->prev_err);
+	d = div_frac(d, tz->passive_delay);
+	params->prev_err = err;
+
+	power_range = p + i + d;
+
+	/* feed-forward the known sustainable dissipatable power */
+	power_range = tz->tzp->sustainable_power + frac_to_int(power_range);
+
+	return clamp(power_range, (s64)0, (s64)max_allocatable_power);
+}
+
+/**
+ * divvy_up_power() - divvy the allocated power between the actors
+ * @req_power:	each actor's requested power
+ * @max_power:	each actor's maximum available power
+ * @num_actors:	size of the @req_power, @max_power and @granted_power's array
+ * @total_req_power: sum of @req_power
+ * @power_range:	total allocated power
+ * @granted_power:	output array: each actor's granted power
+ * @extra_actor_power:	an appropriately sized array to be used in the
+ *			function as temporary storage of the extra power given
+ *			to the actors
+ *
+ * This function divides the total allocated power (@power_range)
+ * fairly between the actors.  It first tries to give each actor a
+ * share of the @power_range according to how much power it requested
+ * compared to the rest of the actors.  For example, if only one actor
+ * requests power, then it receives all the @power_range.  If
+ * three actors each request 1mW, each receives a third of the
+ * @power_range.
+ *
+ * If any actor received more than their maximum power, then that
+ * surplus is re-divvied among the actors based on how far they are
+ * from their respective maximums.
+ *
+ * Granted power for each actor is written to @granted_power, which
+ * should've been allocated by the calling function.
+ */
+static void divvy_up_power(u32 *req_power, u32 *max_power, int num_actors,
+			   u32 total_req_power, u32 power_range,
+			   u32 *granted_power, u32 *extra_actor_power)
+{
+	u32 extra_power, capped_extra_power;
+	int i;
+
+	/*
+	 * Prevent division by 0 if none of the actors request power.
+	 */
+	if (!total_req_power)
+		total_req_power = 1;
+
+	capped_extra_power = 0;
+	extra_power = 0;
+	for (i = 0; i < num_actors; i++) {
+		/* Widen first so the product cannot overflow 32 bits */
+		u64 req_range = (u64)req_power[i] * power_range;
+
+		granted_power[i] = div_u64(req_range, total_req_power);
+
+		if (granted_power[i] > max_power[i]) {
+			extra_power += granted_power[i] - max_power[i];
+			granted_power[i] = max_power[i];
+		}
+
+		extra_actor_power[i] = max_power[i] - granted_power[i];
+		capped_extra_power += extra_actor_power[i];
+	}
+
+	if (!extra_power)
+		return;
+
+	/*
+	 * Re-divvy the reclaimed extra among actors based on
+	 * how far they are from the max
+	 */
+	extra_power = min(extra_power, capped_extra_power);
+	if (capped_extra_power > 0)
+		for (i = 0; i < num_actors; i++)
+			granted_power[i] += (extra_actor_power[i] *
+					extra_power) / capped_extra_power;
+}
+
+static int allocate_power(struct thermal_zone_device *tz,
+			  unsigned long current_temp,
+			  unsigned long control_temp)
+{
+	struct thermal_instance *instance;
+	struct power_allocator_params *params = tz->governor_data;
+	u32 *req_power, *max_power, *granted_power, *extra_actor_power;
+	u32 total_req_power, max_allocatable_power;
+	u32 power_range;
+	int i, num_actors, total_weight, ret = 0;
+	int trip_max_desired_temperature = params->trip_max_desired_temperature;
+
+	mutex_lock(&tz->lock);
+
+	num_actors = 0;
+	total_weight = 0;
+	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+		if ((instance->trip == trip_max_desired_temperature) &&
+		    cdev_is_power_actor(instance->cdev)) {
+			num_actors++;
+			total_weight += instance->weight;
+		}
+	}
+
+	/*
+	 * We need to allocate four arrays of the same size:
+	 * req_power, max_power, granted_power and extra_actor_power.
+	 * They are going to be needed until this function returns.
+	 * Allocate them all in one go to simplify the allocation and
+	 * deallocation logic.
+	 */
+	BUILD_BUG_ON(sizeof(*req_power) != sizeof(*max_power));
+	BUILD_BUG_ON(sizeof(*req_power) != sizeof(*granted_power));
+	BUILD_BUG_ON(sizeof(*req_power) != sizeof(*extra_actor_power));
+	req_power = devm_kcalloc(&tz->device, num_actors * 4,
+				 sizeof(*req_power), GFP_KERNEL);
+	if (!req_power) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	max_power = &req_power[num_actors];
+	granted_power = &req_power[2 * num_actors];
+	extra_actor_power = &req_power[3 * num_actors];
+
+	i = 0;
+	total_req_power = 0;
+	max_allocatable_power = 0;
+
+	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+		int weight;
+		struct thermal_cooling_device *cdev = instance->cdev;
+
+		if (instance->trip != trip_max_desired_temperature)
+			continue;
+
+		if (!cdev_is_power_actor(cdev))
+			continue;
+
+		if (cdev->ops->get_requested_power(cdev, tz, &req_power[i]))
+			continue;
+
+		if (!total_weight)
+			weight = 1 << FRAC_BITS;
+		else
+			weight = instance->weight;
+
+		req_power[i] = frac_to_int(weight * req_power[i]);
+
+		if (power_actor_get_max_power(cdev, tz, &max_power[i]))
+			continue;
+
+		total_req_power += req_power[i];
+		max_allocatable_power += max_power[i];
+
+		i++;
+	}
+
+	power_range = pid_controller(tz, current_temp, control_temp,
+				     max_allocatable_power);
+
+	divvy_up_power(req_power, max_power, num_actors, total_req_power,
+		       power_range, granted_power, extra_actor_power);
+
+	i = 0;
+	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+		if (instance->trip != trip_max_desired_temperature)
+			continue;
+
+		if (!cdev_is_power_actor(instance->cdev))
+			continue;
+
+		power_actor_set_power(instance->cdev, instance,
+				      granted_power[i]);
+
+		i++;
+	}
+
+	devm_kfree(&tz->device, req_power);
+unlock:
+	mutex_unlock(&tz->lock);
+
+	return ret;
+}
+
+static int get_governor_trips(struct thermal_zone_device *tz,
+			      struct power_allocator_params *params)
+{
+	int i, ret, last_passive;
+	bool found_first_passive;
+
+	found_first_passive = false;
+	last_passive = -1;
+	ret = -EINVAL;
+
+	for (i = 0; i < tz->trips; i++) {
+		enum thermal_trip_type type;
+
+		ret = tz->ops->get_trip_type(tz, i, &type);
+		if (ret)
+			return ret;
+
+		if (!found_first_passive) {
+			if (type == THERMAL_TRIP_PASSIVE) {
+				params->trip_switch_on = i;
+				found_first_passive = true;
+			}
+		} else if (type == THERMAL_TRIP_PASSIVE) {
+			last_passive = i;
+		} else {
+			break;
+		}
+	}
+
+	if (last_passive != -1) {
+		params->trip_max_desired_temperature = last_passive;
+		ret = 0;
+	} else {
+		ret = -EINVAL;
+	}
+
+	return ret;
+}
+
+static void reset_pid_controller(struct power_allocator_params *params)
+{
+	params->err_integral = 0;
+	params->prev_err = 0;
+}
+
+static void allow_maximum_power(struct thermal_zone_device *tz)
+{
+	struct thermal_instance *instance;
+	struct power_allocator_params *params = tz->governor_data;
+
+	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+		if ((instance->trip != params->trip_max_desired_temperature) ||
+		    (!cdev_is_power_actor(instance->cdev)))
+			continue;
+
+		instance->target = 0;
+		instance->cdev->updated = false;
+		thermal_cdev_update(instance->cdev);
+	}
+}
+
+/**
+ * power_allocator_bind() - bind the power_allocator governor to a thermal zone
+ * @tz:	thermal zone to bind it to
+ *
+ * Check that the thermal zone is valid for this governor, that is, it
+ * has two thermal trips.  If so, initialize the PID controller
+ * parameters and bind it to the thermal zone.
+ *
+ * Return: 0 on success, -EINVAL if the trips were invalid or -ENOMEM
+ * if we ran out of memory.
+ */
+static int power_allocator_bind(struct thermal_zone_device *tz)
+{
+	int ret;
+	struct power_allocator_params *params;
+	unsigned long switch_on_temp, control_temp;
+	u32 temperature_threshold;
+
+	if (!tz->tzp || !tz->tzp->sustainable_power) {
+		dev_err(&tz->device,
+			"power_allocator: missing sustainable_power\n");
+		return -EINVAL;
+	}
+
+	params = devm_kzalloc(&tz->device, sizeof(*params), GFP_KERNEL);
+	if (!params)
+		return -ENOMEM;
+
+	ret = get_governor_trips(tz, params);
+	if (ret) {
+		dev_err(&tz->device,
+			"thermal zone %s has wrong trip setup for power allocator\n",
+			tz->type);
+		goto free;
+	}
+
+	ret = tz->ops->get_trip_temp(tz, params->trip_switch_on,
+				     &switch_on_temp);
+	if (ret)
+		goto free;
+
+	ret = tz->ops->get_trip_temp(tz, params->trip_max_desired_temperature,
+				     &control_temp);
+	if (ret)
+		goto free;
+
+	temperature_threshold = control_temp - switch_on_temp;
+
+	tz->tzp->k_po = tz->tzp->k_po ?:
+		int_to_frac(tz->tzp->sustainable_power) / temperature_threshold;
+	tz->tzp->k_pu = tz->tzp->k_pu ?:
+		int_to_frac(2 * tz->tzp->sustainable_power) /
+		temperature_threshold;
+	tz->tzp->k_i = tz->tzp->k_i ?: int_to_frac(10) / 1000;
+	/*
+	 * The default for k_d and integral_cutoff is 0, so we can
+	 * leave them as they are.
+	 */
+
+	reset_pid_controller(params);
+
+	tz->governor_data = params;
+
+	return 0;
+
+free:
+	devm_kfree(&tz->device, params);
+	return ret;
+}
+
+static void power_allocator_unbind(struct thermal_zone_device *tz)
+{
+	dev_dbg(&tz->device, "Unbinding from thermal zone %d\n", tz->id);
+	devm_kfree(&tz->device, tz->governor_data);
+	tz->governor_data = NULL;
+}
+
+static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
+{
+	int ret;
+	unsigned long switch_on_temp, control_temp, current_temp;
+	struct power_allocator_params *params = tz->governor_data;
+
+	/*
+	 * We get called for every trip point but we only need to do
+	 * our calculations once
+	 */
+	if (trip != params->trip_max_desired_temperature)
+		return 0;
+
+	ret = thermal_zone_get_temp(tz, &current_temp);
+	if (ret) {
+		dev_warn(&tz->device, "Failed to get temperature: %d\n", ret);
+		return ret;
+	}
+
+	ret = tz->ops->get_trip_temp(tz, params->trip_switch_on,
+				     &switch_on_temp);
+	if (ret) {
+		dev_warn(&tz->device,
+			 "Failed to get switch on temperature: %d\n", ret);
+		return ret;
+	}
+
+	if (current_temp < switch_on_temp) {
+		tz->passive = 0;
+		reset_pid_controller(params);
+		allow_maximum_power(tz);
+		return 0;
+	}
+
+	tz->passive = 1;
+
+	ret = tz->ops->get_trip_temp(tz, params->trip_max_desired_temperature,
+				     &control_temp);
+	if (ret) {
+		dev_warn(&tz->device,
+			 "Failed to get the maximum desired temperature: %d\n",
+			 ret);
+		return ret;
+	}
+
+	return allocate_power(tz, current_temp, control_temp);
+}
+
+static struct thermal_governor thermal_gov_power_allocator = {
+	.name		= "power_allocator",
+	.bind_to_tz	= power_allocator_bind,
+	.unbind_from_tz	= power_allocator_unbind,
+	.throttle	= power_allocator_throttle,
+};
+
+int thermal_gov_power_allocator_register(void)
+{
+	return thermal_register_governor(&thermal_gov_power_allocator);
+}
+
+void thermal_gov_power_allocator_unregister(void)
+{
+	thermal_unregister_governor(&thermal_gov_power_allocator);
+}
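Annotation: the two-pass split performed by `divvy_up_power()` can be tried outside the kernel. The following is a userspace sketch, not the patch's code: `divvy_sketch()` is a hypothetical name, it sums the requests itself, and it computes each actor's headroom on the fly instead of using a scratch array.

```c
#include <stdint.h>

/*
 * Pass 1: share power_range in proportion to each request, capping
 * every actor at its maximum and collecting the surplus.
 * Pass 2: hand the surplus to actors in proportion to their headroom.
 */
static void divvy_sketch(const uint32_t *req, const uint32_t *max, int n,
			 uint32_t power_range, uint32_t *granted)
{
	uint64_t total_req = 0, surplus = 0, headroom = 0;
	int i;

	for (i = 0; i < n; i++)
		total_req += req[i];
	if (!total_req)
		total_req = 1;	/* avoid dividing by zero */

	for (i = 0; i < n; i++) {
		uint64_t share = (uint64_t)req[i] * power_range / total_req;

		if (share > max[i]) {
			surplus += share - max[i];
			share = max[i];
		}
		granted[i] = (uint32_t)share;
		headroom += max[i] - granted[i];
	}

	if (!surplus || !headroom)
		return;
	if (surplus > headroom)
		surplus = headroom;

	for (i = 0; i < n; i++) {
		uint64_t room = max[i] - granted[i];

		granted[i] += (uint32_t)(room * surplus / headroom);
	}
}
```

With requests of 3000 and 1000 mW, maximums of 2000 and 4000 mW and a budget of 4000 mW, the first actor is capped at 2000 mW and the 1000 mW it could not use goes to the second actor, so both end up with 2000 mW.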
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 263628b0e862..b389bc2ec0fa 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1616,7 +1616,7 @@ static void remove_trip_attrs(struct thermal_zone_device *tz)
 struct thermal_zone_device *thermal_zone_device_register(const char *type,
 	int trips, int mask, void *devdata,
 	struct thermal_zone_device_ops *ops,
-	const struct thermal_zone_params *tzp,
+	struct thermal_zone_params *tzp,
 	int passive_delay, int polling_delay)
 {
 	struct thermal_zone_device *tz;
@@ -1968,7 +1968,11 @@ static int __init thermal_register_governors(void)
 	if (result)
 		return result;
 
-	return thermal_gov_user_space_register();
+	result = thermal_gov_user_space_register();
+	if (result)
+		return result;
+
+	return thermal_gov_power_allocator_register();
 }
 
 static void thermal_unregister_governors(void)
@@ -1977,6 +1981,7 @@ static void thermal_unregister_governors(void)
 	thermal_gov_fair_share_unregister();
 	thermal_gov_bang_bang_unregister();
 	thermal_gov_user_space_unregister();
+	thermal_gov_power_allocator_unregister();
 }
 
 static int __init thermal_init(void)
diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
index faebe881f062..8a6624488cc5 100644
--- a/drivers/thermal/thermal_core.h
+++ b/drivers/thermal/thermal_core.h
@@ -88,6 +88,14 @@ static inline int thermal_gov_user_space_register(void) { return 0; }
 static inline void thermal_gov_user_space_unregister(void) {}
 #endif /* CONFIG_THERMAL_GOV_USER_SPACE */
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
+int thermal_gov_power_allocator_register(void);
+void thermal_gov_power_allocator_unregister(void);
+#else
+static inline int thermal_gov_power_allocator_register(void) { return 0; }
+static inline void thermal_gov_power_allocator_unregister(void) {}
+#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
+
 /* device tree support */
 #ifdef CONFIG_THERMAL_OF
 int of_parse_thermal_zones(void);
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index bf3c55f405c2..6bbe11c97cea 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -59,6 +59,8 @@
 #define DEFAULT_THERMAL_GOVERNOR "fair_share"
 #elif defined(CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE)
 #define DEFAULT_THERMAL_GOVERNOR "user_space"
+#elif defined(CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR)
+#define DEFAULT_THERMAL_GOVERNOR "power_allocator"
 #endif
 
 struct thermal_zone_device;
@@ -154,8 +156,7 @@ struct thermal_attr {
  * @devdata: private pointer for device private data
  * @trips: number of trip points the thermal zone supports
  * @passive_delay: number of milliseconds to wait between polls when
- *		performing passive cooling. Currenty only used by the
- *		step-wise governor
+ *		performing passive cooling.
  * @polling_delay: number of milliseconds to wait between polls when
  *		checking whether trip points have been crossed (0 for
  *		interrupt driven systems)
@@ -165,7 +166,6 @@ struct thermal_attr {
  * @last_temperature: previous temperature read
  * @emul_temperature: emulated temperature when using CONFIG_THERMAL_EMULATION
  * @passive: 1 if you've crossed a passive trip point, 0 otherwise.
- *		Currenty only used by the step-wise governor.
  * @forced_passive: If > 0, temperature at which to switch on all ACPI
  *		processor cooling devices. Currently only used by the
  *		step-wise governor.
@@ -197,7 +197,7 @@ struct thermal_zone_device {
 	int passive;
 	unsigned int forced_passive;
 	struct thermal_zone_device_ops *ops;
-	const struct thermal_zone_params *tzp;
+	struct thermal_zone_params *tzp;
 	struct thermal_governor *governor;
 	void *governor_data;
 	struct list_head thermal_instances;
@@ -275,6 +275,33 @@ struct thermal_zone_params {
 
 	int num_tbps; /* Number of tbp entries */
 	struct thermal_bind_params *tbp;
+
+	/*
+	 * Sustainable power (heat) that this thermal zone can dissipate in
+	 * mW
+	 */
+	u32 sustainable_power;
+
+	/*
+	 * Proportional parameter of the PID controller when
+	 * overshooting (i.e., when temperature is below the target)
+	 */
+	s32 k_po;
+
+	/*
+	 * Proportional parameter of the PID controller when
+	 * undershooting
+	 */
+	s32 k_pu;
+
+	/* Integral parameter of the PID controller */
+	s32 k_i;
+
+	/* Derivative parameter of the PID controller */
+	s32 k_d;
+
+	/* threshold below which the error is no longer accumulated */
+	s32 integral_cutoff;
 };
 
 struct thermal_genl_event {
@@ -350,7 +377,7 @@ int power_actor_set_power(struct thermal_cooling_device *,
 		struct thermal_instance *, u32);
 struct thermal_zone_device *thermal_zone_device_register(const char *, int, int,
 		void *, struct thermal_zone_device_ops *,
-		const struct thermal_zone_params *, int, int);
+		struct thermal_zone_params *, int, int);
 void thermal_zone_device_unregister(struct thermal_zone_device *);
 
 int thermal_zone_bind_cooling_device(struct thermal_zone_device *, int,