author Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> 2012-02-03 16:03:20 -0500
committer Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> 2012-03-14 12:35:42 -0400
commit 59a56802918100c1e39e68c30a2e5ae9f7d837f0
tree 18bf73e267ec02e0f8337a039ac12cec83c9e12d /drivers/xen
parent ead1d01425bbd28c4354b539caa4075bde00ed72
xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.

This driver solves three problems:
 1). Parse and upload ACPI0007 (or PROCESSOR_TYPE) information to the
     hypervisor - aka P-states (cpufreq data).
 2). Upload the Cx state information (cpuidle data).
 3). Inhibit CPU frequency scaling drivers from loading.

The reason for wanting to solve 1) and 2) is that the Xen hypervisor
is the only one that knows the CPU usage of the different guests and can
make the proper decision of when to put CPUs and packages in the proper
states. Unfortunately the hypervisor has no support for parsing ACPI DSDT
tables, hence it needs help from the initial domain to provide this
information. The reason for 3) is that we do not want the initial domain
to change P-states while the hypervisor is doing so as well - that causes
some rather funny P-state transitions.

For this to work, the driver parses the Power Management data and uploads
said information to the Xen hypervisor. It also calls
acpi_processor_notify_smm() to inhibit the other CPU frequency scaling
drivers from being loaded.

Everything revolves around the 'struct acpi_processor' structure, which
gets updated during the bootup cycle in different stages. At startup,
when the ACPI parser starts, the C-state information is processed
(processor_idle) and saved in said structure as the 'power' element.
Later on, the CPU frequency scaling driver (powernow-k8 or acpi_cpufreq)
would call the acpi_processor_* (processor_perflib) functions to parse
the P-state information and populate the 'performance' element of said
structure.

Since we do not want the CPU frequency scaling drivers to load, we have
to call the acpi_processor_* functions ourselves to parse the P-states,
and call "acpi_processor_notify_smm" to stop them from loading.

There is also one oddity in this driver, which is that under Xen the
physical online CPU count can be different from the virtual online CPU
count. This means that the macros 'for_each_[online|possible]_cpu' would
process only up to the virtual online CPU count. We, on the other hand,
want to process the full number of physical CPUs. For that, the driver
checks if the ACPI ID count is different from the APIC ID count - which
can happen if the user chooses to use the dom0_max_vcpus argument. In
such a case a backup of the PM structure is used and uploaded to the
hypervisor.

[v1-v2: Initial RFC implementations that were posted]
[v3: Changed the name to passthru suggested by Pasi Kärkkäinen <pasik@iki.fi>]
[v4: Added vCPU != pCPU support - aka dom0_max_vcpus support]
[v5: Cleaned up the driver, fix bug under Athlon XP]
[v6: Changed the driver to a CPU frequency governor]
[v7: Jan Beulich <jbeulich@suse.com> suggestion to make it a cpufreq
 scaling driver made me rework it as driver that inhibits cpufreq scaling
 driver]
[v8: Per Jan's review comments, fixed up the driver]
[v9: Allow to continue even if acpi_processor_preregister_perf.. fails]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
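
The vCPU != pCPU handling described above can be illustrated with a small user-space sketch. This is not the driver's code: the `id_bitmap` type, the `MAX_IDS` bound and the helper names are hypothetical stand-ins for the kernel bitmap helpers, and the actual hypercall is reduced to a counter. The idea is that every ACPI ID found in the firmware tables but not yet marked as uploaded gets handled once, using the backup structure.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MAX_IDS 128 /* hypothetical upper bound, only for this sketch */
#define NLONGS ((MAX_IDS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Fixed-size bitmap indexed by ACPI ID (stand-in for the kernel bitmaps). */
struct id_bitmap {
	unsigned long w[NLONGS];
};

static void id_set(struct id_bitmap *b, unsigned int id)
{
	assert(id < MAX_IDS);
	b->w[id / BITS_PER_LONG] |= 1UL << (id % BITS_PER_LONG);
}

static bool id_test(const struct id_bitmap *b, unsigned int id)
{
	assert(id < MAX_IDS);
	return (b->w[id / BITS_PER_LONG] >> (id % BITS_PER_LONG)) & 1UL;
}

/* Set the bit and report whether it was already set. */
static bool id_test_and_set(struct id_bitmap *b, unsigned int id)
{
	bool was_set = id_test(b, id);

	id_set(b, id);
	return was_set;
}

/*
 * Walk every ACPI ID present in the firmware tables and "upload" the ones
 * that were not already handled through a virtual CPU. Returns how many
 * IDs had to be handled through the backup path.
 */
static unsigned int upload_missing(const struct id_bitmap *present,
				   struct id_bitmap *done, unsigned int nbits)
{
	unsigned int id, uploaded = 0;

	for (id = 0; id < nbits; id++) {
		if (!id_test(present, id))
			continue;
		if (id_test_and_set(done, id))
			continue; /* already uploaded for an online vCPU */
		uploaded++; /* the driver would issue the hypercall here */
	}
	return uploaded;
}
```

With eight ACPI IDs present and two already uploaded through online vCPUs, the first pass handles the remaining six and a second pass has nothing left to do.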
Diffstat (limited to 'drivers/xen')
-rw-r--r-- drivers/xen/Kconfig 17
-rw-r--r-- drivers/xen/Makefile 2
-rw-r--r-- drivers/xen/xen-acpi-processor.c 562
3 files changed, 580 insertions, 1 deletions
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index a1ced521cf74..648bcd4195c5 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -178,4 +178,21 @@ config XEN_PRIVCMD
 	depends on XEN
 	default m
 
+config XEN_ACPI_PROCESSOR
+	tristate "Xen ACPI processor"
+	depends on XEN && X86 && ACPI_PROCESSOR
+	default y if (X86_ACPI_CPUFREQ = y || X86_POWERNOW_K8 = y)
+	default m if (X86_ACPI_CPUFREQ = m || X86_POWERNOW_K8 = m)
+	help
+	  This ACPI processor driver uploads Power Management information to the Xen hypervisor.
+
+	  To do that the driver parses the Power Management data and uploads said
+	  information to the Xen hypervisor. Then the Xen hypervisor can select the
+	  proper Cx and Pxx states. It also registers itself as the SMM so that
+	  other drivers (such as ACPI cpufreq scaling driver) will not load.
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called xen_acpi_processor. If you do not know what to choose,
+	  select M here. If the CPUFREQ drivers are built in, select Y here.
+
 endmenu
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index aa31337192cc..9adc5be57b13 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -20,7 +20,7 @@ obj-$(CONFIG_SWIOTLB_XEN) += swiotlb-xen.o
 obj-$(CONFIG_XEN_DOM0) += pci.o
 obj-$(CONFIG_XEN_PCIDEV_BACKEND) += xen-pciback/
 obj-$(CONFIG_XEN_PRIVCMD) += xen-privcmd.o
-
+obj-$(CONFIG_XEN_ACPI_PROCESSOR) += xen-acpi-processor.o
 xen-evtchn-y := evtchn.o
 xen-gntdev-y := gntdev.o
 xen-gntalloc-y := gntalloc.o
diff --git a/drivers/xen/xen-acpi-processor.c b/drivers/xen/xen-acpi-processor.c
new file mode 100644
index 000000000000..5c2be963aa18
--- /dev/null
+++ b/drivers/xen/xen-acpi-processor.c
@@ -0,0 +1,562 @@
+/*
+ * Copyright 2012 by Oracle Inc
+ * Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
+ *
+ * This code borrows ideas from https://lkml.org/lkml/2011/11/30/249
+ * so many thanks go to Kevin Tian <kevin.tian@intel.com>
+ * and Yu Ke <ke.yu@intel.com>.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#include <linux/cpumask.h>
+#include <linux/cpufreq.h>
+#include <linux/freezer.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <acpi/acpi_bus.h>
+#include <acpi/acpi_drivers.h>
+#include <acpi/processor.h>
+
+#include <xen/interface/platform.h>
+#include <asm/xen/hypercall.h>
+
+#define DRV_NAME "xen-acpi-processor: "
+
+static int no_hypercall;
+MODULE_PARM_DESC(off, "Inhibit the hypercall.");
+module_param_named(off, no_hypercall, int, 0400);
+
+/*
+ * Note: Do not convert the acpi_id* below to cpumask_var_t or use cpumask_bit
+ * - as those shrink to nr_cpu_bits (which is dependent on possible_cpu), which
+ * can be less than what we want to put in. Instead use the 'nr_acpi_bits'
+ * which is dynamically computed based on the MADT or x2APIC table.
+ */
+static unsigned int nr_acpi_bits;
+/* Mutex to protect the acpi_ids_done - for CPU hotplug use. */
+static DEFINE_MUTEX(acpi_ids_mutex);
+/* Which ACPI ID we have processed from 'struct acpi_processor'. */
+static unsigned long *acpi_ids_done;
+/* Which ACPI IDs exist in the SSDT/DSDT processor definitions. */
+static unsigned long __initdata *acpi_id_present;
+/* And if there is a _CST definition (or a PBLK) for the ACPI IDs */
+static unsigned long __initdata *acpi_id_cst_present;
+
+static int push_cxx_to_hypervisor(struct acpi_processor *_pr)
+{
+	struct xen_platform_op op = {
+		.cmd = XENPF_set_processor_pminfo,
+		.interface_version = XENPF_INTERFACE_VERSION,
+		.u.set_pminfo.id = _pr->acpi_id,
+		.u.set_pminfo.type = XEN_PM_CX,
+	};
+	struct xen_processor_cx *dst_cx, *dst_cx_states = NULL;
+	struct acpi_processor_cx *cx;
+	unsigned int i, ok;
+	int ret = 0;
+
+	dst_cx_states = kcalloc(_pr->power.count,
+				sizeof(struct xen_processor_cx), GFP_KERNEL);
+	if (!dst_cx_states)
+		return -ENOMEM;
+
+	for (ok = 0, i = 1; i <= _pr->power.count; i++) {
+		cx = &_pr->power.states[i];
+		if (!cx->valid)
+			continue;
+
+		dst_cx = &(dst_cx_states[ok++]);
+
+		dst_cx->reg.space_id = ACPI_ADR_SPACE_SYSTEM_IO;
+		if (cx->entry_method == ACPI_CSTATE_SYSTEMIO) {
+			dst_cx->reg.bit_width = 8;
+			dst_cx->reg.bit_offset = 0;
+			dst_cx->reg.access_size = 1;
+		} else {
+			dst_cx->reg.space_id = ACPI_ADR_SPACE_FIXED_HARDWARE;
+			if (cx->entry_method == ACPI_CSTATE_FFH) {
+				/* NATIVE_CSTATE_BEYOND_HALT */
+				dst_cx->reg.bit_offset = 2;
+				dst_cx->reg.bit_width = 1; /* VENDOR_INTEL */
+			}
+			dst_cx->reg.access_size = 0;
+		}
+		dst_cx->reg.address = cx->address;
+
+		dst_cx->type = cx->type;
+		dst_cx->latency = cx->latency;
+		dst_cx->power = cx->power;
+
+		dst_cx->dpcnt = 0;
+		set_xen_guest_handle(dst_cx->dp, NULL);
+	}
+	if (!ok) {
+		pr_debug(DRV_NAME "No _Cx for ACPI CPU %u\n", _pr->acpi_id);
+		kfree(dst_cx_states);
+		return -EINVAL;
+	}
+	op.u.set_pminfo.power.count = ok;
+	op.u.set_pminfo.power.flags.bm_control = _pr->flags.bm_control;
+	op.u.set_pminfo.power.flags.bm_check = _pr->flags.bm_check;
+	op.u.set_pminfo.power.flags.has_cst = _pr->flags.has_cst;
+	op.u.set_pminfo.power.flags.power_setup_done =
+		_pr->flags.power_setup_done;
+
+	set_xen_guest_handle(op.u.set_pminfo.power.states, dst_cx_states);
+
+	if (!no_hypercall)
+		ret = HYPERVISOR_dom0_op(&op);
+
+	if (!ret) {
+		pr_debug("ACPI CPU%u - C-states uploaded.\n", _pr->acpi_id);
+		for (i = 1; i <= _pr->power.count; i++) {
+			cx = &_pr->power.states[i];
+			if (!cx->valid)
+				continue;
+			pr_debug(" C%d: %s %d uS\n",
+				 cx->type, cx->desc, (u32)cx->latency);
+		}
+	} else
+		pr_err(DRV_NAME "(CX): Hypervisor error (%d) for ACPI CPU%u\n",
+		       ret, _pr->acpi_id);
+
+	kfree(dst_cx_states);
+
+	return ret;
+}
+static struct xen_processor_px *
+xen_copy_pss_data(struct acpi_processor *_pr,
+		  struct xen_processor_performance *dst_perf)
+{
+	struct xen_processor_px *dst_states = NULL;
+	unsigned int i;
+
+	BUILD_BUG_ON(sizeof(struct xen_processor_px) !=
+		     sizeof(struct acpi_processor_px));
+
+	dst_states = kcalloc(_pr->performance->state_count,
+			     sizeof(struct xen_processor_px), GFP_KERNEL);
+	if (!dst_states)
+		return ERR_PTR(-ENOMEM);
+
+	dst_perf->state_count = _pr->performance->state_count;
+	for (i = 0; i < _pr->performance->state_count; i++) {
+		/* Fortunately for us, they are both the same size */
+		memcpy(&(dst_states[i]), &(_pr->performance->states[i]),
+		       sizeof(struct acpi_processor_px));
+	}
+	return dst_states;
+}
+static int xen_copy_psd_data(struct acpi_processor *_pr,
+			     struct xen_processor_performance *dst)
+{
+	struct acpi_psd_package *pdomain;
+
+	BUILD_BUG_ON(sizeof(struct xen_psd_package) !=
+		     sizeof(struct acpi_psd_package));
+
+	/* This information is enumerated only if acpi_processor_preregister_performance
+	 * has been called.
+	 */
+	dst->shared_type = _pr->performance->shared_type;
+
+	pdomain = &(_pr->performance->domain_info);
+
+	/* 'acpi_processor_preregister_performance' does not parse if the
+	 * num_processors <= 1, but Xen still requires it. Do it manually here.
+	 */
+	if (pdomain->num_processors <= 1) {
+		if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
+			dst->shared_type = CPUFREQ_SHARED_TYPE_ALL;
+		else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
+			dst->shared_type = CPUFREQ_SHARED_TYPE_HW;
+		else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
+			dst->shared_type = CPUFREQ_SHARED_TYPE_ANY;
+
+	}
+	memcpy(&(dst->domain_info), pdomain, sizeof(struct acpi_psd_package));
+	return 0;
+}
+static int xen_copy_pct_data(struct acpi_pct_register *pct,
+			     struct xen_pct_register *dst_pct)
+{
+	/* It would be nice if you could just do 'memcpy(pct, dst_pct)' but
+	 * sadly the Xen structure did not have the proper padding so the
+	 * descriptor field takes two (dst_pct) bytes instead of one (pct).
+	 */
+	dst_pct->descriptor = pct->descriptor;
+	dst_pct->length = pct->length;
+	dst_pct->space_id = pct->space_id;
+	dst_pct->bit_width = pct->bit_width;
+	dst_pct->bit_offset = pct->bit_offset;
+	dst_pct->reserved = pct->reserved;
+	dst_pct->address = pct->address;
+	return 0;
+}
+static int push_pxx_to_hypervisor(struct acpi_processor *_pr)
+{
+	int ret = 0;
+	struct xen_platform_op op = {
+		.cmd = XENPF_set_processor_pminfo,
+		.interface_version = XENPF_INTERFACE_VERSION,
+		.u.set_pminfo.id = _pr->acpi_id,
+		.u.set_pminfo.type = XEN_PM_PX,
+	};
+	struct xen_processor_performance *dst_perf;
+	struct xen_processor_px *dst_states = NULL;
+
+	dst_perf = &op.u.set_pminfo.perf;
+
+	dst_perf->platform_limit = _pr->performance_platform_limit;
+	dst_perf->flags |= XEN_PX_PPC;
+	xen_copy_pct_data(&(_pr->performance->control_register),
+			  &dst_perf->control_register);
+	xen_copy_pct_data(&(_pr->performance->status_register),
+			  &dst_perf->status_register);
+	dst_perf->flags |= XEN_PX_PCT;
+	dst_states = xen_copy_pss_data(_pr, dst_perf);
+	if (!IS_ERR_OR_NULL(dst_states)) {
+		set_xen_guest_handle(dst_perf->states, dst_states);
+		dst_perf->flags |= XEN_PX_PSS;
+	}
+	if (!xen_copy_psd_data(_pr, dst_perf))
+		dst_perf->flags |= XEN_PX_PSD;
+
+	if (dst_perf->flags != (XEN_PX_PSD | XEN_PX_PSS | XEN_PX_PCT | XEN_PX_PPC)) {
+		pr_warn(DRV_NAME "ACPI CPU%u missing some P-state data (%x), skipping.\n",
+			_pr->acpi_id, dst_perf->flags);
+		ret = -ENODEV;
+		goto err_free;
+	}
+
+	if (!no_hypercall)
+		ret = HYPERVISOR_dom0_op(&op);
+
+	if (!ret) {
+		struct acpi_processor_performance *perf;
+		unsigned int i;
+
+		perf = _pr->performance;
+		pr_debug("ACPI CPU%u - P-states uploaded.\n", _pr->acpi_id);
+		for (i = 0; i < perf->state_count; i++) {
+			pr_debug(" %cP%d: %d MHz, %d mW, %d uS\n",
+				 (i == perf->state ? '*' : ' '), i,
+				 (u32) perf->states[i].core_frequency,
+				 (u32) perf->states[i].power,
+				 (u32) perf->states[i].transition_latency);
+		}
+	} else if (ret != -EINVAL)
+		/* EINVAL means the ACPI ID is incorrect - meaning the ACPI
+		 * table is referencing a non-existing CPU - which can happen
+		 * with broken ACPI tables. */
+		pr_warn(DRV_NAME "(_PXX): Hypervisor error (%d) for ACPI CPU%u\n",
+			ret, _pr->acpi_id);
+err_free:
+	if (!IS_ERR_OR_NULL(dst_states))
+		kfree(dst_states);
+
+	return ret;
+}
+static int upload_pm_data(struct acpi_processor *_pr)
+{
+	int err = 0;
+
+	mutex_lock(&acpi_ids_mutex);
+	if (__test_and_set_bit(_pr->acpi_id, acpi_ids_done)) {
+		mutex_unlock(&acpi_ids_mutex);
+		return -EBUSY;
+	}
+	if (_pr->flags.power)
+		err = push_cxx_to_hypervisor(_pr);
+
+	if (_pr->performance && _pr->performance->states)
+		err |= push_pxx_to_hypervisor(_pr);
+
+	mutex_unlock(&acpi_ids_mutex);
+	return err;
+}
+static unsigned int __init get_max_acpi_id(void)
+{
+	struct xenpf_pcpuinfo *info;
+	struct xen_platform_op op = {
+		.cmd = XENPF_get_cpuinfo,
+		.interface_version = XENPF_INTERFACE_VERSION,
+	};
+	int ret = 0;
+	unsigned int i, last_cpu, max_acpi_id = 0;
+
+	info = &op.u.pcpu_info;
+	info->xen_cpuid = 0;
+
+	ret = HYPERVISOR_dom0_op(&op);
+	if (ret)
+		return NR_CPUS;
+
+	/* The max_present is the same regardless of the xen_cpuid */
+	last_cpu = op.u.pcpu_info.max_present;
+	for (i = 0; i <= last_cpu; i++) {
+		info->xen_cpuid = i;
+		ret = HYPERVISOR_dom0_op(&op);
+		if (ret)
+			continue;
+		max_acpi_id = max(info->acpi_id, max_acpi_id);
+	}
+	max_acpi_id *= 2; /* Slack for CPU hotplug support. */
+	pr_debug(DRV_NAME "Max ACPI ID: %u\n", max_acpi_id);
+	return max_acpi_id;
+}
+/*
+ * The read_acpi_id and check_acpi_ids are there to support the Xen
+ * oddity of virtual CPUs != physical CPUs in the initial domain.
+ * The user can supply 'dom0_max_vcpus=X' on the Xen hypervisor line
+ * which will limit the number of CPUs the initial domain can see.
+ * In general that is OK, except it plays havoc with any of the
+ * for_each_[present|online]_cpu macros which are bound to the virtual
+ * CPU count.
+ */
+static acpi_status __init
+read_acpi_id(acpi_handle handle, u32 lvl, void *context, void **rv)
+{
+	u32 acpi_id;
+	acpi_status status;
+	acpi_object_type acpi_type;
+	unsigned long long tmp;
+	union acpi_object object = { 0 };
+	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
+	acpi_io_address pblk = 0;
+
+	status = acpi_get_type(handle, &acpi_type);
+	if (ACPI_FAILURE(status))
+		return AE_OK;
+
+	switch (acpi_type) {
+	case ACPI_TYPE_PROCESSOR:
+		status = acpi_evaluate_object(handle, NULL, NULL, &buffer);
+		if (ACPI_FAILURE(status))
+			return AE_OK;
+		acpi_id = object.processor.proc_id;
+		pblk = object.processor.pblk_address;
+		break;
+	case ACPI_TYPE_DEVICE:
+		status = acpi_evaluate_integer(handle, "_UID", NULL, &tmp);
+		if (ACPI_FAILURE(status))
+			return AE_OK;
+		acpi_id = tmp;
+		break;
+	default:
+		return AE_OK;
+	}
+	/* There are more ACPI Processor objects than in x2APIC or MADT.
+	 * This can happen with incorrect ACPI SSDT declarations. */
+	if (acpi_id > nr_acpi_bits) {
+		pr_debug(DRV_NAME "We only have %u, trying to set %u\n",
+			 nr_acpi_bits, acpi_id);
+		return AE_OK;
+	}
+	/* OK, there is an ACPI Processor object */
+	__set_bit(acpi_id, acpi_id_present);
+
+	pr_debug(DRV_NAME "ACPI CPU%u w/ PBLK:0x%lx\n", acpi_id,
+		 (unsigned long)pblk);
+
+	status = acpi_evaluate_object(handle, "_CST", NULL, &buffer);
+	if (ACPI_FAILURE(status)) {
+		if (!pblk)
+			return AE_OK;
+	}
+	/* .. and it has a C-state */
+	__set_bit(acpi_id, acpi_id_cst_present);
+
+	return AE_OK;
+}
+static int __init check_acpi_ids(struct acpi_processor *pr_backup)
+{
+
+	if (!pr_backup)
+		return -ENODEV;
+
+	/* All online CPUs have been processed at this stage. Now verify
+	 * whether in fact "online CPUs" == physical CPUs.
+	 */
+	acpi_id_present = kcalloc(BITS_TO_LONGS(nr_acpi_bits), sizeof(unsigned long), GFP_KERNEL);
+	if (!acpi_id_present)
+		return -ENOMEM;
+
+	acpi_id_cst_present = kcalloc(BITS_TO_LONGS(nr_acpi_bits), sizeof(unsigned long), GFP_KERNEL);
+	if (!acpi_id_cst_present) {
+		kfree(acpi_id_present);
+		return -ENOMEM;
+	}
+
+	acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT,
+			    ACPI_UINT32_MAX,
+			    read_acpi_id, NULL, NULL, NULL);
+	acpi_get_devices("ACPI0007", read_acpi_id, NULL, NULL);
+
+	if (!bitmap_equal(acpi_id_present, acpi_ids_done, nr_acpi_bits)) {
+		unsigned int i;
+		for_each_set_bit(i, acpi_id_present, nr_acpi_bits) {
+			pr_backup->acpi_id = i;
+			/* Mask out C-states if there are no _CST or PBLK */
+			pr_backup->flags.power = test_bit(i, acpi_id_cst_present);
+			(void)upload_pm_data(pr_backup);
+		}
+	}
+	kfree(acpi_id_present);
+	acpi_id_present = NULL;
+	kfree(acpi_id_cst_present);
+	acpi_id_cst_present = NULL;
+	return 0;
+}
+static int __init check_prereq(void)
+{
+	struct cpuinfo_x86 *c = &cpu_data(0);
+
+	if (!xen_initial_domain())
+		return -ENODEV;
+
+	if (!acpi_gbl_FADT.smi_command)
+		return -ENODEV;
+
+	if (c->x86_vendor == X86_VENDOR_INTEL) {
+		if (!cpu_has(c, X86_FEATURE_EST))
+			return -ENODEV;
+
+		return 0;
+	}
+	if (c->x86_vendor == X86_VENDOR_AMD) {
+		/* Copied from powernow-k8.h, can't include ../cpufreq/powernow
+		 * as we get compile warnings for the static functions.
+		 */
+#define CPUID_FREQ_VOLT_CAPABILITIES 0x80000007
+#define USE_HW_PSTATE 0x00000080
+		u32 eax, ebx, ecx, edx;
+		cpuid(CPUID_FREQ_VOLT_CAPABILITIES, &eax, &ebx, &ecx, &edx);
+		if ((edx & USE_HW_PSTATE) != USE_HW_PSTATE)
+			return -ENODEV;
+		return 0;
+	}
+	return -ENODEV;
+}
+/* acpi_perf_data is a pointer to percpu data. */
+static struct acpi_processor_performance __percpu *acpi_perf_data;
+
+static void free_acpi_perf_data(void)
+{
+	unsigned int i;
+
+	/* Freeing a NULL pointer is OK, and alloc_percpu zeroes. */
+	for_each_possible_cpu(i)
+		free_cpumask_var(per_cpu_ptr(acpi_perf_data, i)
+				 ->shared_cpu_map);
+	free_percpu(acpi_perf_data);
+}
+
+static int __init xen_acpi_processor_init(void)
+{
+	struct acpi_processor *pr_backup = NULL;
+	unsigned int i;
+	int rc = check_prereq();
+
+	if (rc)
+		return rc;
+
+	nr_acpi_bits = get_max_acpi_id() + 1;
+	acpi_ids_done = kcalloc(BITS_TO_LONGS(nr_acpi_bits), sizeof(unsigned long), GFP_KERNEL);
+	if (!acpi_ids_done)
+		return -ENOMEM;
+
+	acpi_perf_data = alloc_percpu(struct acpi_processor_performance);
+	if (!acpi_perf_data) {
+		pr_debug(DRV_NAME "Memory allocation error for acpi_perf_data.\n");
+		kfree(acpi_ids_done);
+		return -ENOMEM;
+	}
+	for_each_possible_cpu(i) {
+		if (!zalloc_cpumask_var_node(
+			&per_cpu_ptr(acpi_perf_data, i)->shared_cpu_map,
+			GFP_KERNEL, cpu_to_node(i))) {
+			rc = -ENOMEM;
+			goto err_out;
+		}
+	}
+
+	/* Do initialization in ACPI core. It is OK to fail here. */
+	(void)acpi_processor_preregister_performance(acpi_perf_data);
+
+	for_each_possible_cpu(i) {
+		struct acpi_processor_performance *perf;
+
+		perf = per_cpu_ptr(acpi_perf_data, i);
+		rc = acpi_processor_register_performance(perf, i);
+		if (WARN_ON(rc))
+			goto err_out;
+	}
+	rc = acpi_processor_notify_smm(THIS_MODULE);
+	if (WARN_ON(rc))
+		goto err_unregister;
+
+	for_each_possible_cpu(i) {
+		struct acpi_processor *_pr;
+		_pr = per_cpu(processors, i /* APIC ID */);
+		if (!_pr)
+			continue;
+
+		if (!pr_backup) {
+			pr_backup = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
+			memcpy(pr_backup, _pr, sizeof(struct acpi_processor));
+		}
+		(void)upload_pm_data(_pr);
+	}
+	rc = check_acpi_ids(pr_backup);
+	if (rc)
+		goto err_unregister;
+
+	kfree(pr_backup);
+
+	return 0;
+err_unregister:
+	for_each_possible_cpu(i) {
+		struct acpi_processor_performance *perf;
+		perf = per_cpu_ptr(acpi_perf_data, i);
+		acpi_processor_unregister_performance(perf, i);
+	}
+err_out:
+	/* Freeing a NULL pointer is OK: alloc_percpu zeroes. */
+	free_acpi_perf_data();
+	kfree(acpi_ids_done);
+	return rc;
+}
+static void __exit xen_acpi_processor_exit(void)
+{
+	int i;
+
+	kfree(acpi_ids_done);
+	for_each_possible_cpu(i) {
+		struct acpi_processor_performance *perf;
+		perf = per_cpu_ptr(acpi_perf_data, i);
+		acpi_processor_unregister_performance(perf, i);
+	}
+	free_acpi_perf_data();
+}
+
+MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>");
+MODULE_DESCRIPTION("Xen ACPI Processor P-states (and Cx) driver which uploads PM data to Xen hypervisor");
+MODULE_LICENSE("GPL");
+
+/* We want to be loaded before the CPU freq scaling drivers are loaded.
+ * They are loaded in late_initcall. */
+device_initcall(xen_acpi_processor_init);
+module_exit(xen_acpi_processor_exit);
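
The comment in xen_copy_pct_data() explains why the register description is copied field by field instead of with a single memcpy(). The user-space sketch below illustrates that reasoning only. The two structures are hypothetical stand-ins, not the real ACPI or Xen definitions: the source is assumed to be packed and the destination naturally aligned, so the members after 'descriptor' sit at different offsets. The packed attribute used here is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed source layout: no padding between members. */
struct src_pct_register {
	uint8_t descriptor;
	uint16_t length;
	uint8_t space_id;
	uint8_t bit_width;
	uint8_t bit_offset;
	uint8_t reserved;
	uint64_t address;
} __attribute__((packed));

/* Hypothetical destination layout: the compiler pads after 'descriptor'. */
struct dst_pct_register {
	uint8_t descriptor;
	uint16_t length;
	uint8_t space_id;
	uint8_t bit_width;
	uint8_t bit_offset;
	uint8_t reserved;
	uint64_t address;
};

/*
 * Copy member by member so that each value lands in the right field even
 * though the byte offsets of the two layouts differ.
 */
static void copy_pct_register(const struct src_pct_register *src,
			      struct dst_pct_register *dst)
{
	dst->descriptor = src->descriptor;
	dst->length = src->length;
	dst->space_id = src->space_id;
	dst->bit_width = src->bit_width;
	dst->bit_offset = src->bit_offset;
	dst->reserved = src->reserved;
	dst->address = src->address;
}
```

A raw memcpy() between these two layouts would shift every member after 'descriptor' by the padding, which is the situation the driver comment describes.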