author | Quentin Perret <quentin.perret@arm.com> | 2019-01-10 06:05:45 -0500 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2019-01-27 06:29:37 -0500 |
commit | 1017b48ccc11a70634a7b8ec4ba3a6acb234c17b (patch) | |
tree | 65e6b5e1b00163d05f5f3e425dbebaa5523bb26a /Documentation/power | |
parent | c0ad4aa4d8416a39ad262a2bd68b30acd951bf0e (diff) |
PM/EM: Document the Energy Model framework
Introduce a documentation file summarizing the key design points and
APIs of the newly introduced Energy Model framework.
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: corbet@lwn.net
Cc: dietmar.eggemann@arm.com
Cc: morten.rasmussen@arm.com
Cc: patrick.bellasi@arm.com
Cc: qais.yousef@arm.com
Cc: rjw@rjwysocki.net
Link: https://lkml.kernel.org/r/20190110110546.8101-2-quentin.perret@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'Documentation/power')
-rw-r--r-- | Documentation/power/energy-model.txt | 144 |
1 files changed, 144 insertions, 0 deletions
diff --git a/Documentation/power/energy-model.txt b/Documentation/power/energy-model.txt
new file mode 100644
index 000000000000..a2b0ae4c76bd
--- /dev/null
+++ b/Documentation/power/energy-model.txt
@@ -0,0 +1,144 @@
1 | ==================== | ||
2 | Energy Model of CPUs | ||
3 | ==================== | ||
4 | |||
5 | 1. Overview | ||
6 | ----------- | ||
7 | |||
8 | The Energy Model (EM) framework serves as an interface between drivers that | ||
9 | know the power consumed by CPUs at various performance levels, and the kernel | ||
10 | subsystems that use that information to make energy-aware decisions. | ||
11 | |||
12 | The source of the information about the power consumed by CPUs can vary greatly | ||
13 | from one platform to another. These power costs can be estimated using | ||
14 | devicetree data in some cases. In others, the firmware will know better. | ||
15 | Alternatively, userspace might be best positioned. And so on. To spare each | ||
16 | client subsystem from re-implementing support for every possible source of | ||
17 | information on its own, the EM framework acts as an abstraction layer which | ||
18 | standardizes the format of power cost tables in the kernel, thereby avoiding | ||
19 | redundant work. | ||
20 | |||
21 | The figure below depicts an example of drivers (Arm-specific here, but the | ||
22 | approach is applicable to any architecture) providing power costs to the EM | ||
23 | framework, and interested clients reading the data from it. | ||
24 | |||
25 |        +---------------+  +-----------------+  +---------------+ | ||
26 |        | Thermal (IPA) |  | Scheduler (EAS) |  |     Other     | | ||
27 |        +---------------+  +-----------------+  +---------------+ | ||
28 |                |                   | em_pd_energy()    | | ||
29 |                |                   | em_cpu_get()      | | ||
30 |                +---------+         |         +---------+ | ||
31 |                          |         |         | | ||
32 |                          v         v         v | ||
33 |                         +---------------------+ | ||
34 |                         |    Energy Model     | | ||
35 |                         |     Framework       | | ||
36 |                         +---------------------+ | ||
37 |                            ^       ^       ^ | ||
38 |                            |       |       | em_register_perf_domain() | ||
39 |                 +----------+       |       +---------+ | ||
40 |                 |                  |                 | | ||
41 |         +---------------+  +---------------+  +--------------+ | ||
42 |         |  cpufreq-dt   |  |   arm_scmi    |  |    Other     | | ||
43 |         +---------------+  +---------------+  +--------------+ | ||
44 |                 ^                  ^                 ^ | ||
45 |                 |                  |                 | | ||
46 |         +--------------+   +---------------+  +--------------+ | ||
47 |         | Device Tree  |   |   Firmware    |  |      ?       | | ||
48 |         +--------------+   +---------------+  +--------------+ | ||
49 | |||
50 | The EM framework manages power cost tables per 'performance domain' in the | ||
51 | system. A performance domain is a group of CPUs whose performance is scaled | ||
52 | together. Performance domains generally have a 1-to-1 mapping with CPUFreq | ||
53 | policies. All CPUs in a performance domain are required to have the same | ||
54 | micro-architecture. CPUs in different performance domains can have different | ||
55 | micro-architectures. | ||
56 | |||
57 | |||
58 | 2. Core APIs | ||
59 | ------------ | ||
60 | |||
61 | 2.1 Config options | ||
62 | |||
63 | CONFIG_ENERGY_MODEL must be enabled to use the EM framework. | ||
64 | |||
65 | |||
66 | 2.2 Registration of performance domains | ||
67 | |||
68 | Drivers are expected to register performance domains into the EM framework by | ||
69 | calling the following API: | ||
70 | |||
71 |   int em_register_perf_domain(cpumask_t *span, unsigned int nr_states, | ||
72 |                               struct em_data_callback *cb); | ||
73 | |||
74 | Drivers must specify the CPUs of the performance domains using the cpumask | ||
75 | argument, and provide a callback function returning <frequency, power> tuples | ||
76 | for each capacity state. The callback function provided by the driver is free | ||
77 | to fetch data from any relevant location (DT, firmware, ...), and by any means | ||
78 | deemed necessary. See Section 3 for an example of a driver implementing this | ||
79 | callback, and kernel/power/energy_model.c for further documentation on this | ||
80 | API. | ||
81 | |||
82 | |||
83 | 2.3 Accessing performance domains | ||
84 | |||
85 | Subsystems interested in the energy model of a CPU can retrieve it using the | ||
86 | em_cpu_get() API. The energy model tables are allocated once upon creation of | ||
87 | the performance domains, and kept in memory untouched. | ||
88 | |||
89 | The energy consumed by a performance domain can be estimated using the | ||
90 | em_pd_energy() API. The estimation is performed assuming that the schedutil | ||
91 | CPUfreq governor is in use. | ||
92 | |||
93 | More details about the above APIs can be found in include/linux/energy_model.h. | ||
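As a rough illustration of what "assuming that the schedutil CPUfreq governor is in use" can mean for such an estimate, the user-space sketch below maps the utilization of the busiest CPU to a frequency request with 25% headroom, picks the lowest state at or above that request, and scales that state's power by the summed utilization. It is not the kernel implementation: `struct cap_state`, `util_to_freq()` and `estimate_pd_energy()` are hypothetical names made up for this example, and the headroom factor and cost scaling are stated assumptions.

```c
#include <stddef.h>

/* Hypothetical user-space stand-in for one row of a power cost table. */
struct cap_state {
	unsigned long frequency;	/* KHz */
	unsigned long power;		/* mW */
};

/*
 * Assumption: the frequency request carries 25% headroom, i.e.
 * 1.25 * max_freq * util / capacity.
 */
static unsigned long util_to_freq(unsigned long util, unsigned long max_freq,
				  unsigned long capacity)
{
	return (max_freq + (max_freq >> 2)) * util / capacity;
}

/*
 * Estimate the energy of one domain, in arbitrary units.
 * 'table' must be sorted by ascending frequency and nr_states must be > 0.
 */
static unsigned long estimate_pd_energy(const struct cap_state *table,
					size_t nr_states,
					unsigned long max_util,
					unsigned long sum_util,
					unsigned long capacity)
{
	unsigned long max_freq = table[nr_states - 1].frequency;
	unsigned long freq = util_to_freq(max_util, max_freq, capacity);
	const struct cap_state *cs = &table[nr_states - 1];
	size_t i;

	/* Pick the lowest state at or above the request, else the highest. */
	for (i = 0; i < nr_states; i++) {
		if (table[i].frequency >= freq) {
			cs = &table[i];
			break;
		}
	}

	/*
	 * Assumption: utilization is relative to the highest frequency, so
	 * running at a lower one stretches busy time by max_freq / frequency.
	 */
	return cs->power * max_freq / cs->frequency * sum_util / capacity;
}
```

With a made-up table of <500000 KHz, 100 mW>, <1000000 KHz, 300 mW> and <2000000 KHz, 1000 mW>, a capacity of 1024, a maximum utilization of 256 and a summed utilization of 512, the request is 625000 KHz, the 1000000 KHz state is selected, and the estimate is 300.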
94 | |||
95 | |||
96 | 3. Example driver | ||
97 | ----------------- | ||
98 | |||
99 | This section provides a simple example of a CPUFreq driver registering a | ||
100 | performance domain in the Energy Model framework using the (fake) 'foo' | ||
101 | protocol. The driver implements an est_power() function to be provided to the | ||
102 | EM framework. | ||
103 | |||
104 | -> drivers/cpufreq/foo_cpufreq.c | ||
105 | |||
106 | static int est_power(unsigned long *mW, unsigned long *KHz, int cpu) | ||
107 | { | ||
108 | 	long freq, power; | ||
109 | |||
110 | 	/* Use the 'foo' protocol to ceil the frequency */ | ||
111 | 	freq = foo_get_freq_ceil(cpu, *KHz); | ||
112 | 	if (freq < 0) | ||
113 | 		return freq; | ||
114 | |||
115 | 	/* Estimate the power cost for the CPU at the relevant freq. */ | ||
116 | 	power = foo_estimate_power(cpu, freq); | ||
117 | 	if (power < 0) | ||
118 | 		return power; | ||
119 | |||
120 | 	/* Return the values to the EM framework */ | ||
121 | 	*mW = power; | ||
122 | 	*KHz = freq; | ||
123 | |||
124 | 	return 0; | ||
125 | } | ||
126 | |||
127 | static int foo_cpufreq_init(struct cpufreq_policy *policy) | ||
128 | { | ||
129 | 	struct em_data_callback em_cb = EM_DATA_CB(est_power); | ||
130 | 	int nr_opp, ret; | ||
131 | |||
132 | 	/* Do the actual CPUFreq init work ... */ | ||
133 | 	ret = do_foo_cpufreq_init(policy); | ||
134 | 	if (ret) | ||
135 | 		return ret; | ||
136 | |||
137 | 	/* Find the number of OPPs for this policy */ | ||
138 | 	nr_opp = foo_get_nr_opp(policy); | ||
139 | |||
140 | 	/* And register the new performance domain */ | ||
141 | 	em_register_perf_domain(policy->cpus, nr_opp, &em_cb); | ||
142 | |||
143 | 	return 0; | ||
144 | } | ||
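The est_power() callback in the example driver ceils the requested frequency to the next available operating point. One way a caller could use a callback of that shape to enumerate every state is to start at 0 and then ask again just above each frequency it gets back. The user-space sketch below illustrates that contract only; `struct foo_opp`, `foo_opps`, `fake_est_power()` and `build_table()` are hypothetical names invented for this example, the operating points are made up, and the loop is an assumption about how such a callback may be driven rather than a description of the framework's code.

```c
#include <stddef.h>

/* Hypothetical <frequency, power> tuple, in KHz and mW. */
struct foo_opp {
	unsigned long khz;
	unsigned long mw;
};

/* Made-up operating points standing in for DT or firmware data. */
static const struct foo_opp foo_opps[] = {
	{  500000,  100 },
	{ 1000000,  300 },
	{ 2000000, 1000 },
};
#define FOO_NR_OPPS (sizeof(foo_opps) / sizeof(foo_opps[0]))

/* Same shape as est_power(): ceil *KHz to an OPP and report its power. */
static int fake_est_power(unsigned long *mW, unsigned long *KHz, int cpu)
{
	size_t i;

	(void)cpu;	/* all CPUs share one table in this sketch */
	for (i = 0; i < FOO_NR_OPPS; i++) {
		if (foo_opps[i].khz >= *KHz) {
			*mW = foo_opps[i].mw;
			*KHz = foo_opps[i].khz;
			return 0;
		}
	}
	return -1;	/* no OPP at or above the requested frequency */
}

/*
 * Fill 'table' with nr_states tuples by calling 'cb' repeatedly, each time
 * asking for the first frequency above the previously returned one.
 */
static int build_table(struct foo_opp *table, size_t nr_states,
		       int (*cb)(unsigned long *, unsigned long *, int),
		       int cpu)
{
	unsigned long freq = 0, power = 0;
	size_t i;
	int ret;

	for (i = 0; i < nr_states; i++) {
		ret = cb(&power, &freq, cpu);
		if (ret)
			return ret;
		table[i].khz = freq;
		table[i].mw = power;
		freq++;		/* next request: strictly above this state */
	}
	return 0;
}
```

In this sketch, asking for more states than the data source provides makes the callback fail, and build_table() passes that error back to its caller.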