author		Linus Torvalds <torvalds@linux-foundation.org>	2017-07-04 16:39:41 -0400
committer	Linus Torvalds <torvalds@linux-foundation.org>	2017-07-04 16:39:41 -0400
commit		408c9861c6979db974455b9e7a9bcadd60e0934c (patch)
tree		9bdb862da2883cd4f74297d01ec8ce3b4619dd66
parent		b39de277b02ffd8e3dccb01e9159bd45cb07b95d (diff)
parent		8f8e5c3e2796eaf150d6262115af12707c2616dd (diff)
Merge tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "The big ticket items here are the rework of suspend-to-idle in order
  to add proper support for power button wakeup from it on recent Dell
  laptops and the rework of interfaces exporting the current CPU
  frequency on x86.

  In addition to that, support for a few new pieces of hardware is
  added, the PCI/ACPI device wakeup infrastructure is simplified
  significantly and the wakeup IRQ framework is fixed to unbreak the
  IRQ bus locking infrastructure.

  Also, there are some functional improvements for intel_pstate, tools
  updates and small fixes and cleanups all over.

  Specifics:

   - Rework suspend-to-idle to allow it to take wakeup events signaled
     by the EC into account on ACPI-based platforms in order to
     properly support power button wakeup from suspend-to-idle on
     recent Dell laptops (Rafael Wysocki).

     That includes the core suspend-to-idle code rework, support for
     the Low Power S0 _DSM interface, and support for the ACPI INT0002
     Virtual GPIO device from Hans de Goede (required for USB keyboard
     wakeup from suspend-to-idle to work on some machines).

   - Stop trying to export the current CPU frequency via /proc/cpuinfo
     on x86 as that is inaccurate and confusing (Len Brown).

   - Rework the way in which the current CPU frequency is exported by
     the kernel (over the cpufreq sysfs interface) on x86 systems with
     the APERF and MPERF registers by always using values read from
     these registers, when available, to compute the current frequency
     regardless of which cpufreq driver is in use (Len Brown).

   - Rework the PCI/ACPI device wakeup infrastructure to remove the
     questionable and artificial distinction between "devices that can
     wake up the system from sleep states" and "devices that can
     generate wakeup signals in the working state" from it, which
     allows the code to be simplified quite a bit (Rafael Wysocki).

   - Fix the wakeup IRQ framework by making it use SRCU instead of
     RCU, which doesn't allow sleeping in the read-side critical
     sections, but which in turn is expected to be allowed by the IRQ
     bus locking infrastructure (Thomas Gleixner). [A sketch of the
     SRCU pattern follows the shortlog below.]

   - Modify some computations in the intel_pstate driver to avoid
     rounding errors resulting from them (Srinivas Pandruvada).

   - Reduce the overhead of the intel_pstate driver in the HWP
     (hardware-managed P-states) mode and when the "performance"
     P-state selection algorithm is in use by making it avoid
     registering scheduler callbacks in those cases (Len Brown).

   - Rework the energy_performance_preference sysfs knob in
     intel_pstate by changing the values that correspond to different
     symbolic hint names used by it (Len Brown).

   - Make it possible to use more than one cpuidle driver at the same
     time on ARM (Daniel Lezcano).

   - Make it possible to prevent the cpuidle menu governor from using
     the 0 state by disabling it via sysfs (Nicholas Piggin).

   - Add support for FFH (Fixed Functional Hardware) MWAIT in ACPI C1
     on AMD systems (Yazen Ghannam).

   - Make the CPPC cpufreq driver take the lowest nonlinear
     performance information into account (Prashanth Prakash).

   - Add support for hi3660 to the cpufreq-dt driver, fix the imx6q
     driver and clean up the sfi, exynos5440 and intel_pstate drivers
     (Colin Ian King, Krzysztof Kozlowski, Octavian Purdila, Rafael
     Wysocki, Tao Wang).

   - Fix a few minor issues in the generic power domains (genpd)
     framework and clean it up somewhat (Krzysztof Kozlowski, Mikko
     Perttunen, Viresh Kumar).

   - Fix a couple of minor issues in the operating performance points
     (OPP) framework and clean it up somewhat (Viresh Kumar).

   - Fix a CONFIG dependency in the hibernation core and clean it up
     slightly (Balbir Singh, Arvind Yadav, BaoJun Luo).

   - Add rk3228 support to the rockchip-io adaptive voltage scaling
     (AVS) driver (David Wu).

   - Fix an incorrect bit shift operation in the RAPL power capping
     driver (Adam Lessnau).

   - Add support for the EPP field in the HWP (hardware managed
     P-states) control register, HWP.EPP, to the x86_energy_perf_policy
     tool and update msr-index.h with HWP.EPP values (Len Brown).

   - Fix some minor issues in the turbostat tool (Len Brown).

   - Add support for AMD family 0x17 CPUs to the cpupower tool and fix
     a minor issue in it (Sherry Hurwitz).

   - Assorted cleanups, mostly related to the constification of some
     data structures (Arvind Yadav, Joe Perches, Kees Cook, Krzysztof
     Kozlowski)"

* tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (69 commits)
  cpufreq: Update scaling_cur_freq documentation
  cpufreq: intel_pstate: Clean up after performance governor changes
  PM: hibernate: constify attribute_group structures.
  cpuidle: menu: allow state 0 to be disabled
  intel_idle: Use more common logging style
  PM / Domains: Fix missing default_power_down_ok comment
  PM / Domains: Fix unsafe iteration over modified list of domains
  PM / Domains: Fix unsafe iteration over modified list of domain providers
  PM / Domains: Fix unsafe iteration over modified list of device links
  PM / Domains: Handle safely genpd_syscore_switch() call on non-genpd device
  PM / Domains: Call driver's noirq callbacks
  PM / core: Drop run_wake flag from struct dev_pm_info
  PCI / PM: Simplify device wakeup settings code
  PCI / PM: Drop pme_interrupt flag from struct pci_dev
  ACPI / PM: Consolidate device wakeup settings code
  ACPI / PM: Drop run_wake from struct acpi_device_wakeup_flags
  PM / QoS: constify *_attribute_group.
  PM / AVS: rockchip-io: add io selectors and supplies for rk3228
  powercap/RAPL: prevent overridding bits outside of the mask
  PM / sysfs: Constify attribute groups
  ...
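A minimal sketch of the SRCU pattern referenced in the wakeup IRQ bullet above, with illustrative names only (wakeup_srcu, walk_wakeup_sources and wait_for_readers are not from this series): SRCU read-side critical sections may sleep, which plain RCU forbids, and sleeping is exactly what the IRQ bus locking infrastructure (mutex-based bus locks) requires.

/*
 * Illustrative sketch, not code from this merge. SRCU readers take an
 * index and may block inside the critical section; writers wait for
 * readers with synchronize_srcu() instead of synchronize_rcu().
 */
#include <linux/srcu.h>

DEFINE_STATIC_SRCU(wakeup_srcu);	/* hypothetical domain */

static void walk_wakeup_sources(void)
{
	int idx;

	idx = srcu_read_lock(&wakeup_srcu);
	/* ... iterate the wakeup-source list; sleeping is allowed here ... */
	srcu_read_unlock(&wakeup_srcu, idx);
}

static void wait_for_readers(void)
{
	synchronize_srcu(&wakeup_srcu);	/* in place of synchronize_rcu() */
}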
-rw-r--r--	Documentation/admin-guide/pm/cpufreq.rst	12
-rw-r--r--	Documentation/admin-guide/pm/intel_pstate.rst	6
-rw-r--r--	Documentation/devicetree/bindings/opp/opp.txt	38
-rw-r--r--	Documentation/devicetree/bindings/power/rockchip-io-domain.txt	7
-rw-r--r--	Documentation/power/runtime_pm.txt	7
-rw-r--r--	arch/x86/include/asm/msr-index.h	18
-rw-r--r--	arch/x86/include/asm/suspend_64.h	5
-rw-r--r--	arch/x86/kernel/acpi/cstate.c	3
-rw-r--r--	arch/x86/kernel/cpu/Makefile	1
-rw-r--r--	arch/x86/kernel/cpu/aperfmperf.c	79
-rw-r--r--	arch/x86/kernel/cpu/proc.c	10
-rw-r--r--	arch/x86/power/hibernate_64.c	6
-rw-r--r--	drivers/acpi/battery.c	2
-rw-r--r--	drivers/acpi/button.c	5
-rw-r--r--	drivers/acpi/device_pm.c	102
-rw-r--r--	drivers/acpi/ec.c	2
-rw-r--r--	drivers/acpi/internal.h	4
-rw-r--r--	drivers/acpi/pci_root.c	5
-rw-r--r--	drivers/acpi/proc.c	4
-rw-r--r--	drivers/acpi/scan.c	23
-rw-r--r--	drivers/acpi/sleep.c	152
-rw-r--r--	drivers/ata/libata-zpodd.c	9
-rw-r--r--	drivers/base/power/domain.c	103
-rw-r--r--	drivers/base/power/domain_governor.c	12
-rw-r--r--	drivers/base/power/main.c	40
-rw-r--r--	drivers/base/power/opp/core.c	154
-rw-r--r--	drivers/base/power/opp/debugfs.c	7
-rw-r--r--	drivers/base/power/opp/of.c	10
-rw-r--r--	drivers/base/power/sysfs.c	12
-rw-r--r--	drivers/base/power/wakeup.c	50
-rw-r--r--	drivers/cpufreq/cppc_cpufreq.c	19
-rw-r--r--	drivers/cpufreq/cpufreq-dt-platdev.c	1
-rw-r--r--	drivers/cpufreq/cpufreq.c	12
-rw-r--r--	drivers/cpufreq/exynos5440-cpufreq.c	6
-rw-r--r--	drivers/cpufreq/imx6q-cpufreq.c	6
-rw-r--r--	drivers/cpufreq/intel_pstate.c	172
-rw-r--r--	drivers/cpufreq/sfi-cpufreq.c	2
-rw-r--r--	drivers/cpuidle/Kconfig.arm	1
-rw-r--r--	drivers/cpuidle/cpuidle-arm.c	62
-rw-r--r--	drivers/cpuidle/governors/menu.c	20
-rw-r--r--	drivers/idle/intel_idle.c	32
-rw-r--r--	drivers/pci/pci-acpi.c	90
-rw-r--r--	drivers/pci/pci-driver.c	2
-rw-r--r--	drivers/pci/pci-mid.c	10
-rw-r--r--	drivers/pci/pci.c	68
-rw-r--r--	drivers/pci/pci.h	9
-rw-r--r--	drivers/pci/pcie/pme.c	16
-rw-r--r--	drivers/platform/x86/Kconfig	19
-rw-r--r--	drivers/platform/x86/Makefile	1
-rw-r--r--	drivers/platform/x86/intel-hid.c	40
-rw-r--r--	drivers/platform/x86/intel-vbtn.c	39
-rw-r--r--	drivers/platform/x86/intel_int0002_vgpio.c	219
-rw-r--r--	drivers/pnp/pnpacpi/core.c	6
-rw-r--r--	drivers/power/avs/rockchip-io-domain.c	14
-rw-r--r--	drivers/powercap/intel_rapl.c	4
-rw-r--r--	drivers/usb/core/hcd-pci.c	7
-rw-r--r--	drivers/usb/dwc3/dwc3-pci.c	3
-rw-r--r--	drivers/usb/host/uhci-pci.c	2
-rw-r--r--	include/acpi/acpi_bus.h	29
-rw-r--r--	include/linux/cpufreq.h	2
-rw-r--r--	include/linux/pci.h	10
-rw-r--r--	include/linux/pm.h	1
-rw-r--r--	include/linux/pm_opp.h	9
-rw-r--r--	include/linux/pm_runtime.h	12
-rw-r--r--	include/linux/suspend.h	7
-rw-r--r--	kernel/power/hibernate.c	2
-rw-r--r--	kernel/power/process.c	2
-rw-r--r--	kernel/power/snapshot.c	11
-rw-r--r--	kernel/power/suspend.c	35
-rw-r--r--	tools/power/cpupower/utils/helpers/amd.c	31
-rw-r--r--	tools/power/cpupower/utils/helpers/helpers.h	2
-rw-r--r--	tools/power/cpupower/utils/helpers/misc.c	23
-rw-r--r--	tools/power/x86/turbostat/turbostat.c	94
-rw-r--r--	tools/power/x86/x86_energy_perf_policy/Makefile	27
-rw-r--r--	tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.8	241
-rw-r--r--	tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c	1504
76 files changed, 2887 insertions, 925 deletions
diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index 09aa2e949787..463cf7e73db8 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -269,16 +269,16 @@ are the following:
 ``scaling_cur_freq``
 	Current frequency of all of the CPUs belonging to this policy (in kHz).
 
-	For the majority of scaling drivers, this is the frequency of the last
-	P-state requested by the driver from the hardware using the scaling
+	In the majority of cases, this is the frequency of the last P-state
+	requested by the scaling driver from the hardware using the scaling
 	interface provided by it, which may or may not reflect the frequency
 	the CPU is actually running at (due to hardware design and other
 	limitations).
 
-	Some scaling drivers (e.g. |intel_pstate|) attempt to provide
-	information more precisely reflecting the current CPU frequency through
-	this attribute, but that still may not be the exact current CPU
-	frequency as seen by the hardware at the moment.
+	Some architectures (e.g. ``x86``) may attempt to provide information
+	more precisely reflecting the current CPU frequency through this
+	attribute, but that still may not be the exact current CPU frequency as
+	seen by the hardware at the moment.
 
 ``scaling_driver``
 	The scaling driver currently in use.
diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
index 33d703989ea8..1d6249825efc 100644
--- a/Documentation/admin-guide/pm/intel_pstate.rst
+++ b/Documentation/admin-guide/pm/intel_pstate.rst
@@ -157,10 +157,8 @@ Without HWP, this P-state selection algorithm is always the same regardless of
 the processor model and platform configuration.
 
 It selects the maximum P-state it is allowed to use, subject to limits set via
-``sysfs``, every time the P-state selection computations are carried out by the
-driver's utilization update callback for the given CPU (that does not happen
-more often than every 10 ms), but the hardware configuration will not be changed
-if the new P-state is the same as the current one.
+``sysfs``, every time the driver configuration for the given CPU is updated
+(e.g. via ``sysfs``).
 
 This is the default P-state selection algorithm if the
 :c:macro:`CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE` kernel configuration option
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index 63725498bd20..e36d261b9ba6 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -186,20 +186,20 @@ Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together.
 		compatible = "operating-points-v2";
 		opp-shared;
 
-		opp@1000000000 {
+		opp-1000000000 {
 			opp-hz = /bits/ 64 <1000000000>;
 			opp-microvolt = <975000 970000 985000>;
 			opp-microamp = <70000>;
 			clock-latency-ns = <300000>;
 			opp-suspend;
 		};
-		opp@1100000000 {
+		opp-1100000000 {
 			opp-hz = /bits/ 64 <1100000000>;
 			opp-microvolt = <1000000 980000 1010000>;
 			opp-microamp = <80000>;
 			clock-latency-ns = <310000>;
 		};
-		opp@1200000000 {
+		opp-1200000000 {
 			opp-hz = /bits/ 64 <1200000000>;
 			opp-microvolt = <1025000>;
 			clock-latency-ns = <290000>;
@@ -265,20 +265,20 @@ independently.
 		 * independently.
 		 */
 
-		opp@1000000000 {
+		opp-1000000000 {
 			opp-hz = /bits/ 64 <1000000000>;
 			opp-microvolt = <975000 970000 985000>;
 			opp-microamp = <70000>;
 			clock-latency-ns = <300000>;
 			opp-suspend;
 		};
-		opp@1100000000 {
+		opp-1100000000 {
 			opp-hz = /bits/ 64 <1100000000>;
 			opp-microvolt = <1000000 980000 1010000>;
 			opp-microamp = <80000>;
 			clock-latency-ns = <310000>;
 		};
-		opp@1200000000 {
+		opp-1200000000 {
 			opp-hz = /bits/ 64 <1200000000>;
 			opp-microvolt = <1025000>;
 			opp-microamp = <90000;
@@ -341,20 +341,20 @@ DVFS state together.
 		compatible = "operating-points-v2";
 		opp-shared;
 
-		opp@1000000000 {
+		opp-1000000000 {
 			opp-hz = /bits/ 64 <1000000000>;
 			opp-microvolt = <975000 970000 985000>;
 			opp-microamp = <70000>;
 			clock-latency-ns = <300000>;
 			opp-suspend;
 		};
-		opp@1100000000 {
+		opp-1100000000 {
 			opp-hz = /bits/ 64 <1100000000>;
 			opp-microvolt = <1000000 980000 1010000>;
 			opp-microamp = <80000>;
 			clock-latency-ns = <310000>;
 		};
-		opp@1200000000 {
+		opp-1200000000 {
 			opp-hz = /bits/ 64 <1200000000>;
 			opp-microvolt = <1025000>;
 			opp-microamp = <90000>;
@@ -367,20 +367,20 @@ DVFS state together.
 		compatible = "operating-points-v2";
 		opp-shared;
 
-		opp@1300000000 {
+		opp-1300000000 {
 			opp-hz = /bits/ 64 <1300000000>;
 			opp-microvolt = <1050000 1045000 1055000>;
 			opp-microamp = <95000>;
 			clock-latency-ns = <400000>;
 			opp-suspend;
 		};
-		opp@1400000000 {
+		opp-1400000000 {
 			opp-hz = /bits/ 64 <1400000000>;
 			opp-microvolt = <1075000>;
 			opp-microamp = <100000>;
 			clock-latency-ns = <400000>;
 		};
-		opp@1500000000 {
+		opp-1500000000 {
 			opp-hz = /bits/ 64 <1500000000>;
 			opp-microvolt = <1100000 1010000 1110000>;
 			opp-microamp = <95000>;
@@ -409,7 +409,7 @@ Example 4: Handling multiple regulators
 		compatible = "operating-points-v2";
 		opp-shared;
 
-		opp@1000000000 {
+		opp-1000000000 {
 			opp-hz = /bits/ 64 <1000000000>;
 			opp-microvolt = <970000>, /* Supply 0 */
 					<960000>, /* Supply 1 */
@@ -422,7 +422,7 @@ Example 4: Handling multiple regulators
 
 		/* OR */
 
-		opp@1000000000 {
+		opp-1000000000 {
 			opp-hz = /bits/ 64 <1000000000>;
 			opp-microvolt = <975000 970000 985000>, /* Supply 0 */
 					<965000 960000 975000>, /* Supply 1 */
@@ -435,7 +435,7 @@ Example 4: Handling multiple regulators
 
 		/* OR */
 
-		opp@1000000000 {
+		opp-1000000000 {
 			opp-hz = /bits/ 64 <1000000000>;
 			opp-microvolt = <975000 970000 985000>, /* Supply 0 */
 					<965000 960000 975000>, /* Supply 1 */
@@ -467,7 +467,7 @@ Example 5: opp-supported-hw
 		status = "okay";
 		opp-shared;
 
-		opp@600000000 {
+		opp-600000000 {
 			/*
 			 * Supports all substrate and process versions for 0xF
 			 * cuts, i.e. only first four cuts.
@@ -478,7 +478,7 @@ Example 5: opp-supported-hw
 			...
 		};
 
-		opp@800000000 {
+		opp-800000000 {
 			/*
 			 * Supports:
 			 * - cuts: only one, 6th cut (represented by 6th bit).
@@ -510,7 +510,7 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
 		compatible = "operating-points-v2";
 		opp-shared;
 
-		opp@1000000000 {
+		opp-1000000000 {
 			opp-hz = /bits/ 64 <1000000000>;
 			opp-microvolt-slow = <915000 900000 925000>;
 			opp-microvolt-fast = <975000 970000 985000>;
@@ -518,7 +518,7 @@ Example 6: opp-microvolt-<name>, opp-microamp-<name>:
 			opp-microamp-fast = <71000>;
 		};
 
-		opp@1200000000 {
+		opp-1200000000 {
 			opp-hz = /bits/ 64 <1200000000>;
 			opp-microvolt-slow = <915000 900000 925000>, /* Supply vcc0 */
 					     <925000 910000 935000>; /* Supply vcc1 */
diff --git a/Documentation/devicetree/bindings/power/rockchip-io-domain.txt b/Documentation/devicetree/bindings/power/rockchip-io-domain.txt
index d3a5a93a65cd..43c21fb04564 100644
--- a/Documentation/devicetree/bindings/power/rockchip-io-domain.txt
+++ b/Documentation/devicetree/bindings/power/rockchip-io-domain.txt
@@ -32,6 +32,7 @@ SoC is on the same page.
 Required properties:
 - compatible: should be one of:
   - "rockchip,rk3188-io-voltage-domain" for rk3188
+  - "rockchip,rk3228-io-voltage-domain" for rk3228
   - "rockchip,rk3288-io-voltage-domain" for rk3288
   - "rockchip,rk3328-io-voltage-domain" for rk3328
   - "rockchip,rk3368-io-voltage-domain" for rk3368
@@ -59,6 +60,12 @@ Possible supplies for rk3188:
 - vccio1-supply: The supply connected to VCCIO1.
                  Sometimes also labeled VCCIO1 and VCCIO2.
 
+Possible supplies for rk3228:
+- vccio1-supply: The supply connected to VCCIO1.
+- vccio2-supply: The supply connected to VCCIO2.
+- vccio3-supply: The supply connected to VCCIO3.
+- vccio4-supply: The supply connected to VCCIO4.
+
 Possible supplies for rk3288:
 - audio-supply: The supply connected to APIO4_VDD.
 - bb-supply: The supply connected to APIO5_VDD.
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index ee69d7532172..0fde3dcf077a 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -105,9 +105,9 @@ knows what to do to handle the device).
 
 In particular, if the driver requires remote wakeup capability (i.e. hardware
 mechanism allowing the device to request a change of its power state, such as
-PCI PME) for proper functioning and device_run_wake() returns 'false' for the
+PCI PME) for proper functioning and device_can_wakeup() returns 'false' for the
 device, then ->runtime_suspend() should return -EBUSY.  On the other hand, if
-device_run_wake() returns 'true' for the device and the device is put into a
+device_can_wakeup() returns 'true' for the device and the device is put into a
 low-power state during the execution of the suspend callback, it is expected
 that remote wakeup will be enabled for the device.  Generally, remote wakeup
 should be enabled for all input devices put into low-power states at run time.
@@ -253,9 +253,6 @@ defined in include/linux/pm.h:
 	being executed for that device and it is not practical to wait for the
 	suspend to complete; means "start a resume as soon as you've suspended"
 
-  unsigned int run_wake;
-    - set if the device is capable of generating runtime wake-up events
-
   enum rpm_status runtime_status;
     - the runtime PM status of the device; this field's initial value is
       RPM_SUSPENDED, which means that each device is initially regarded by the
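The runtime_pm.txt hunk above replaces device_run_wake() with device_can_wakeup(). As a hedged sketch of the rule it states (the foo_ driver name is a placeholder, not from this series), a ->runtime_suspend() callback that depends on remote wakeup would look roughly like this:

#include <linux/pm_runtime.h>
#include <linux/pm_wakeup.h>

/*
 * Illustrative only: a driver that needs remote wakeup (e.g. PCI PME)
 * refuses runtime suspend when the device cannot signal wakeup, as the
 * documentation above requires.
 */
static int foo_runtime_suspend(struct device *dev)
{
	/* Remote wakeup required but unavailable: stay up. */
	if (!device_can_wakeup(dev))
		return -EBUSY;

	/* ... put the hardware into a low-power state with wakeup armed ... */
	return 0;
}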
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 18b162322eff..d406894cd9a2 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -251,9 +251,13 @@
 #define HWP_MIN_PERF(x)			(x & 0xff)
 #define HWP_MAX_PERF(x)			((x & 0xff) << 8)
 #define HWP_DESIRED_PERF(x)		((x & 0xff) << 16)
-#define HWP_ENERGY_PERF_PREFERENCE(x)	((x & 0xff) << 24)
-#define HWP_ACTIVITY_WINDOW(x)		((x & 0xff3) << 32)
-#define HWP_PACKAGE_CONTROL(x)		((x & 0x1) << 42)
+#define HWP_ENERGY_PERF_PREFERENCE(x)	(((unsigned long long) x & 0xff) << 24)
+#define HWP_EPP_PERFORMANCE		0x00
+#define HWP_EPP_BALANCE_PERFORMANCE	0x80
+#define HWP_EPP_BALANCE_POWERSAVE	0xC0
+#define HWP_EPP_POWERSAVE		0xFF
+#define HWP_ACTIVITY_WINDOW(x)		((unsigned long long)(x & 0xff3) << 32)
+#define HWP_PACKAGE_CONTROL(x)		((unsigned long long)(x & 0x1) << 42)
 
 /* IA32_HWP_STATUS */
 #define HWP_GUARANTEED_CHANGE(x)	(x & 0x1)
@@ -476,9 +480,11 @@
 #define MSR_MISC_PWR_MGMT		0x000001aa
 
 #define MSR_IA32_ENERGY_PERF_BIAS	0x000001b0
 #define ENERGY_PERF_BIAS_PERFORMANCE		0
+#define ENERGY_PERF_BIAS_BALANCE_PERFORMANCE	4
 #define ENERGY_PERF_BIAS_NORMAL			6
+#define ENERGY_PERF_BIAS_BALANCE_POWERSAVE	8
 #define ENERGY_PERF_BIAS_POWERSAVE		15
 
 #define MSR_IA32_PACKAGE_THERM_STATUS		0x000001b1
 
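A short illustration of why the (unsigned long long) casts added above matter: the EPP, activity-window and package-control fields of IA32_HWP_REQUEST sit at bit positions 24, 32 and 42, so composing the register value in 32-bit arithmetic would shift data out or invoke undefined behavior. A minimal sketch using the macros from this hunk (the helper name and the choice of min/max values are illustrative, not from the patch):

#include <linux/types.h>

/*
 * Compose an IA32_HWP_REQUEST value. HWP_ENERGY_PERF_PREFERENCE()
 * widens its operand before shifting by 24, so the EPP byte survives;
 * the same applies to the activity-window and package-control fields
 * at bits 32 and 42.
 */
static u64 hwp_request(u8 min_perf, u8 max_perf)
{
	return HWP_MIN_PERF(min_perf) | HWP_MAX_PERF(max_perf) |
	       HWP_ENERGY_PERF_PREFERENCE(HWP_EPP_BALANCE_PERFORMANCE);
}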
diff --git a/arch/x86/include/asm/suspend_64.h b/arch/x86/include/asm/suspend_64.h
index 6136a18152af..2bd96b4df140 100644
--- a/arch/x86/include/asm/suspend_64.h
+++ b/arch/x86/include/asm/suspend_64.h
@@ -42,8 +42,7 @@ struct saved_context {
 	set_debugreg((thread)->debugreg##register, register)
 
 /* routines for saving/restoring kernel state */
-extern int acpi_save_state_mem(void);
-extern char core_restore_code;
-extern char restore_registers;
+extern char core_restore_code[];
+extern char restore_registers[];
 
 #endif /* _ASM_X86_SUSPEND_64_H */
diff --git a/arch/x86/kernel/acpi/cstate.c b/arch/x86/kernel/acpi/cstate.c
index 8233a630280f..dde437f5d14f 100644
--- a/arch/x86/kernel/acpi/cstate.c
+++ b/arch/x86/kernel/acpi/cstate.c
@@ -167,7 +167,8 @@ static int __init ffh_cstate_init(void)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (c->x86_vendor != X86_VENDOR_INTEL)
+	if (c->x86_vendor != X86_VENDOR_INTEL &&
+	    c->x86_vendor != X86_VENDOR_AMD)
 		return -1;
 
 	cpu_cstate_entry = alloc_percpu(struct cstate_entry);
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 52000010c62e..cdf82492b770 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -21,6 +21,7 @@ obj-y += common.o
 obj-y			+= rdrand.o
 obj-y			+= match.o
 obj-y			+= bugs.o
+obj-$(CONFIG_CPU_FREQ)	+= aperfmperf.o
 
 obj-$(CONFIG_PROC_FS)	+= proc.o
 obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o
diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
new file mode 100644
index 000000000000..d869c8671e36
--- /dev/null
+++ b/arch/x86/kernel/cpu/aperfmperf.c
@@ -0,0 +1,79 @@
+/*
+ * x86 APERF/MPERF KHz calculation for
+ * /sys/.../cpufreq/scaling_cur_freq
+ *
+ * Copyright (C) 2017 Intel Corp.
+ * Author: Len Brown <len.brown@intel.com>
+ *
+ * This file is licensed under GPLv2.
+ */
+
+#include <linux/jiffies.h>
+#include <linux/math64.h>
+#include <linux/percpu.h>
+#include <linux/smp.h>
+
+struct aperfmperf_sample {
+	unsigned int	khz;
+	unsigned long	jiffies;
+	u64		aperf;
+	u64		mperf;
+};
+
+static DEFINE_PER_CPU(struct aperfmperf_sample, samples);
+
+/*
+ * aperfmperf_snapshot_khz()
+ * On the current CPU, snapshot APERF, MPERF, and jiffies
+ * unless we already did it within 10ms
+ * calculate kHz, save snapshot
+ */
+static void aperfmperf_snapshot_khz(void *dummy)
+{
+	u64 aperf, aperf_delta;
+	u64 mperf, mperf_delta;
+	struct aperfmperf_sample *s = this_cpu_ptr(&samples);
+
+	/* Don't bother re-computing within 10 ms */
+	if (time_before(jiffies, s->jiffies + HZ/100))
+		return;
+
+	rdmsrl(MSR_IA32_APERF, aperf);
+	rdmsrl(MSR_IA32_MPERF, mperf);
+
+	aperf_delta = aperf - s->aperf;
+	mperf_delta = mperf - s->mperf;
+
+	/*
+	 * There is no architectural guarantee that MPERF
+	 * increments faster than we can read it.
+	 */
+	if (mperf_delta == 0)
+		return;
+
+	/*
+	 * if (cpu_khz * aperf_delta) fits into ULLONG_MAX, then
+	 *	khz = (cpu_khz * aperf_delta) / mperf_delta
+	 */
+	if (div64_u64(ULLONG_MAX, cpu_khz) > aperf_delta)
+		s->khz = div64_u64((cpu_khz * aperf_delta), mperf_delta);
+	else	/* khz = aperf_delta / (mperf_delta / cpu_khz) */
+		s->khz = div64_u64(aperf_delta,
+			div64_u64(mperf_delta, cpu_khz));
+	s->jiffies = jiffies;
+	s->aperf = aperf;
+	s->mperf = mperf;
+}
+
+unsigned int arch_freq_get_on_cpu(int cpu)
+{
+	if (!cpu_khz)
+		return 0;
+
+	if (!static_cpu_has(X86_FEATURE_APERFMPERF))
+		return 0;
+
+	smp_call_function_single(cpu, aperfmperf_snapshot_khz, NULL, 1);
+
+	return per_cpu(samples.khz, cpu);
+}
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 6df621ae62a7..218f79825b3c 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -2,7 +2,6 @@
 #include <linux/timex.h>
 #include <linux/string.h>
 #include <linux/seq_file.h>
-#include <linux/cpufreq.h>
 
 /*
  * Get CPU information for use by the procfs.
@@ -76,14 +75,9 @@ static int show_cpuinfo(struct seq_file *m, void *v)
 	if (c->microcode)
 		seq_printf(m, "microcode\t: 0x%x\n", c->microcode);
 
-	if (cpu_has(c, X86_FEATURE_TSC)) {
-		unsigned int freq = cpufreq_quick_get(cpu);
-
-		if (!freq)
-			freq = cpu_khz;
+	if (cpu_has(c, X86_FEATURE_TSC))
 		seq_printf(m, "cpu MHz\t\t: %u.%03u\n",
-			   freq / 1000, (freq % 1000));
-	}
+			   cpu_khz / 1000, (cpu_khz % 1000));
 
 	/* Cache size */
 	if (c->x86_cache_size >= 0)
diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
index e3e62c8a8e70..f2598d81cd55 100644
--- a/arch/x86/power/hibernate_64.c
+++ b/arch/x86/power/hibernate_64.c
@@ -147,7 +147,7 @@ static int relocate_restore_code(void)
 	if (!relocated_restore_code)
 		return -ENOMEM;
 
-	memcpy((void *)relocated_restore_code, &core_restore_code, PAGE_SIZE);
+	memcpy((void *)relocated_restore_code, core_restore_code, PAGE_SIZE);
 
 	/* Make the page containing the relocated code executable */
 	pgd = (pgd_t *)__va(read_cr3_pa()) +
@@ -293,8 +293,8 @@ int arch_hibernation_header_save(void *addr, unsigned int max_size)
 
 	if (max_size < sizeof(struct restore_data_record))
 		return -EOVERFLOW;
-	rdr->jump_address = (unsigned long)&restore_registers;
-	rdr->jump_address_phys = __pa_symbol(&restore_registers);
+	rdr->jump_address = (unsigned long)restore_registers;
+	rdr->jump_address_phys = __pa_symbol(restore_registers);
 	rdr->cr3 = restore_cr3;
 	rdr->magic = RESTORE_MAGIC;
 
diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
index d42eeef9d928..1cbb88d938e5 100644
--- a/drivers/acpi/battery.c
+++ b/drivers/acpi/battery.c
@@ -782,7 +782,7 @@ static int acpi_battery_update(struct acpi_battery *battery, bool resume)
 	if ((battery->state & ACPI_BATTERY_STATE_CRITICAL) ||
 	    (test_bit(ACPI_BATTERY_ALARM_PRESENT, &battery->flags) &&
 	    (battery->capacity_now <= battery->alarm)))
-		pm_wakeup_event(&battery->device->dev, 0);
+		acpi_pm_wakeup_event(&battery->device->dev);
 
 	return result;
 }
diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index e19f530f1083..91cfdf377df7 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -217,7 +217,7 @@ static int acpi_lid_notify_state(struct acpi_device *device, int state)
 	}
 
 	if (state)
-		pm_wakeup_event(&device->dev, 0);
+		acpi_pm_wakeup_event(&device->dev);
 
 	ret = blocking_notifier_call_chain(&acpi_lid_notifier, state, device);
 	if (ret == NOTIFY_DONE)
@@ -402,7 +402,7 @@ static void acpi_button_notify(struct acpi_device *device, u32 event)
 	} else {
 		int keycode;
 
-		pm_wakeup_event(&device->dev, 0);
+		acpi_pm_wakeup_event(&device->dev);
 		if (button->suspended)
 			break;
 
@@ -534,6 +534,7 @@ static int acpi_button_add(struct acpi_device *device)
 		lid_device = device;
 	}
 
+	device_init_wakeup(&device->dev, true);
 	printk(KERN_INFO PREFIX "%s [%s]\n", name, acpi_device_bid(device));
 	return 0;
 
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index 993fd31394c8..28938b5a334e 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -24,6 +24,7 @@
 #include <linux/pm_qos.h>
 #include <linux/pm_domain.h>
 #include <linux/pm_runtime.h>
+#include <linux/suspend.h>
 
 #include "internal.h"
 
@@ -385,6 +386,12 @@ EXPORT_SYMBOL(acpi_bus_power_manageable);
 #ifdef CONFIG_PM
 static DEFINE_MUTEX(acpi_pm_notifier_lock);
 
+void acpi_pm_wakeup_event(struct device *dev)
+{
+	pm_wakeup_dev_event(dev, 0, acpi_s2idle_wakeup());
+}
+EXPORT_SYMBOL_GPL(acpi_pm_wakeup_event);
+
 static void acpi_pm_notify_handler(acpi_handle handle, u32 val, void *not_used)
 {
 	struct acpi_device *adev;
@@ -399,9 +406,9 @@ static void acpi_pm_notify_handler(acpi_handle handle, u32 val, void *not_used)
 	mutex_lock(&acpi_pm_notifier_lock);
 
 	if (adev->wakeup.flags.notifier_present) {
-		__pm_wakeup_event(adev->wakeup.ws, 0);
-		if (adev->wakeup.context.work.func)
-			queue_pm_work(&adev->wakeup.context.work);
+		pm_wakeup_ws_event(adev->wakeup.ws, 0, acpi_s2idle_wakeup());
+		if (adev->wakeup.context.func)
+			adev->wakeup.context.func(&adev->wakeup.context);
 	}
 
 	mutex_unlock(&acpi_pm_notifier_lock);
@@ -413,7 +420,7 @@ static void acpi_pm_notify_handler(acpi_handle handle, u32 val, void *not_used)
  * acpi_add_pm_notifier - Register PM notify handler for given ACPI device.
  * @adev: ACPI device to add the notify handler for.
  * @dev: Device to generate a wakeup event for while handling the notification.
- * @work_func: Work function to execute when handling the notification.
+ * @func: Work function to execute when handling the notification.
  *
  * NOTE: @adev need not be a run-wake or wakeup device to be a valid source of
  * PM wakeup events. For example, wakeup events may be generated for bridges
@@ -421,11 +428,11 @@ static void acpi_pm_notify_handler(acpi_handle handle, u32 val, void *not_used)
  * bridge itself doesn't have a wakeup GPE associated with it.
  */
 acpi_status acpi_add_pm_notifier(struct acpi_device *adev, struct device *dev,
-			void (*work_func)(struct work_struct *work))
+			void (*func)(struct acpi_device_wakeup_context *context))
 {
 	acpi_status status = AE_ALREADY_EXISTS;
 
-	if (!dev && !work_func)
+	if (!dev && !func)
 		return AE_BAD_PARAMETER;
 
 	mutex_lock(&acpi_pm_notifier_lock);
@@ -435,8 +442,7 @@ acpi_status acpi_add_pm_notifier(struct acpi_device *adev, struct device *dev,
 
 	adev->wakeup.ws = wakeup_source_register(dev_name(&adev->dev));
 	adev->wakeup.context.dev = dev;
-	if (work_func)
-		INIT_WORK(&adev->wakeup.context.work, work_func);
+	adev->wakeup.context.func = func;
 
 	status = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
 					     acpi_pm_notify_handler, NULL);
@@ -469,10 +475,7 @@ acpi_status acpi_remove_pm_notifier(struct acpi_device *adev)
 	if (ACPI_FAILURE(status))
 		goto out;
 
-	if (adev->wakeup.context.work.func) {
-		cancel_work_sync(&adev->wakeup.context.work);
-		adev->wakeup.context.work.func = NULL;
-	}
+	adev->wakeup.context.func = NULL;
 	adev->wakeup.context.dev = NULL;
 	wakeup_source_unregister(adev->wakeup.ws);
 
@@ -493,6 +496,13 @@ bool acpi_bus_can_wakeup(acpi_handle handle)
 }
 EXPORT_SYMBOL(acpi_bus_can_wakeup);
 
+bool acpi_pm_device_can_wakeup(struct device *dev)
+{
+	struct acpi_device *adev = ACPI_COMPANION(dev);
+
+	return adev ? acpi_device_can_wakeup(adev) : false;
+}
+
 /**
  * acpi_dev_pm_get_state - Get preferred power state of ACPI device.
  * @dev: Device whose preferred target power state to return.
@@ -658,16 +668,15 @@ EXPORT_SYMBOL(acpi_pm_device_sleep_state);
 
 /**
  * acpi_pm_notify_work_func - ACPI devices wakeup notification work function.
- * @work: Work item to handle.
+ * @context: Device wakeup context.
  */
-static void acpi_pm_notify_work_func(struct work_struct *work)
+static void acpi_pm_notify_work_func(struct acpi_device_wakeup_context *context)
 {
-	struct device *dev;
+	struct device *dev = context->dev;
 
-	dev = container_of(work, struct acpi_device_wakeup_context, work)->dev;
 	if (dev) {
 		pm_wakeup_event(dev, 0);
-		pm_runtime_resume(dev);
+		pm_request_resume(dev);
 	}
 }
 
@@ -693,80 +702,53 @@ static int acpi_device_wakeup(struct acpi_device *adev, u32 target_state,
 		acpi_status res;
 		int error;
 
+		if (adev->wakeup.flags.enabled)
+			return 0;
+
 		error = acpi_enable_wakeup_device_power(adev, target_state);
 		if (error)
 			return error;
 
-		if (adev->wakeup.flags.enabled)
-			return 0;
-
 		res = acpi_enable_gpe(wakeup->gpe_device, wakeup->gpe_number);
-		if (ACPI_SUCCESS(res)) {
-			adev->wakeup.flags.enabled = 1;
-		} else {
+		if (ACPI_FAILURE(res)) {
 			acpi_disable_wakeup_device_power(adev);
 			return -EIO;
 		}
-	} else {
-		if (adev->wakeup.flags.enabled) {
-			acpi_disable_gpe(wakeup->gpe_device, wakeup->gpe_number);
-			adev->wakeup.flags.enabled = 0;
-		}
+		adev->wakeup.flags.enabled = 1;
+	} else if (adev->wakeup.flags.enabled) {
+		acpi_disable_gpe(wakeup->gpe_device, wakeup->gpe_number);
 		acpi_disable_wakeup_device_power(adev);
+		adev->wakeup.flags.enabled = 0;
 	}
 	return 0;
 }
 
 /**
- * acpi_pm_device_run_wake - Enable/disable remote wakeup for given device.
- * @dev: Device to enable/disable the platform to wake up.
+ * acpi_pm_set_device_wakeup - Enable/disable remote wakeup for given device.
+ * @dev: Device to enable/disable to generate wakeup events.
  * @enable: Whether to enable or disable the wakeup functionality.
  */
-int acpi_pm_device_run_wake(struct device *phys_dev, bool enable)
-{
-	struct acpi_device *adev;
-
-	if (!device_run_wake(phys_dev))
-		return -EINVAL;
-
-	adev = ACPI_COMPANION(phys_dev);
-	if (!adev) {
-		dev_dbg(phys_dev, "ACPI companion missing in %s!\n", __func__);
-		return -ENODEV;
-	}
-
-	return acpi_device_wakeup(adev, ACPI_STATE_S0, enable);
-}
-EXPORT_SYMBOL(acpi_pm_device_run_wake);
-
-#ifdef CONFIG_PM_SLEEP
-/**
- * acpi_pm_device_sleep_wake - Enable or disable device to wake up the system.
- * @dev: Device to enable/desible to wake up the system from sleep states.
- * @enable: Whether to enable or disable @dev to wake up the system.
- */
-int acpi_pm_device_sleep_wake(struct device *dev, bool enable)
+int acpi_pm_set_device_wakeup(struct device *dev, bool enable)
 {
 	struct acpi_device *adev;
 	int error;
 
-	if (!device_can_wakeup(dev))
-		return -EINVAL;
-
 	adev = ACPI_COMPANION(dev);
 	if (!adev) {
 		dev_dbg(dev, "ACPI companion missing in %s!\n", __func__);
 		return -ENODEV;
 	}
 
+	if (!acpi_device_can_wakeup(adev))
+		return -EINVAL;
+
 	error = acpi_device_wakeup(adev, acpi_target_system_state(), enable);
 	if (!error)
-		dev_info(dev, "System wakeup %s by ACPI\n",
-			 enable ? "enabled" : "disabled");
+		dev_dbg(dev, "Wakeup %s by ACPI\n", enable ? "enabled" : "disabled");
 
 	return error;
 }
-#endif /* CONFIG_PM_SLEEP */
+EXPORT_SYMBOL(acpi_pm_set_device_wakeup);
 
 /**
  * acpi_dev_pm_low_power - Put ACPI device into a low-power state.
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index c24235d8fb52..156e15c35ffa 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -1835,7 +1835,7 @@ static int acpi_ec_suspend(struct device *dev)
 	struct acpi_ec *ec =
 		acpi_driver_data(to_acpi_device(dev));
 
-	if (ec_freeze_events)
+	if (acpi_sleep_no_ec_events() && ec_freeze_events)
 		acpi_ec_disable_event(ec);
 	return 0;
 }
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 66229ffa909b..be79f7db1850 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -198,8 +198,12 @@ void acpi_ec_remove_query_handler(struct acpi_ec *ec, u8 query_bit);
 				  Suspend/Resume
    -------------------------------------------------------------------------- */
 #ifdef CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT
+extern bool acpi_s2idle_wakeup(void);
+extern bool acpi_sleep_no_ec_events(void);
 extern int acpi_sleep_init(void);
 #else
+static inline bool acpi_s2idle_wakeup(void) { return false; }
+static inline bool acpi_sleep_no_ec_events(void) { return true; }
 static inline int acpi_sleep_init(void) { return -ENXIO; }
 #endif
 
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 240544253ccd..9eec3095e6c3 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -608,8 +608,7 @@ static int acpi_pci_root_add(struct acpi_device *device,
 		pcie_no_aspm();
 
 	pci_acpi_add_bus_pm_notifier(device);
-	if (device->wakeup.flags.run_wake)
-		device_set_run_wake(root->bus->bridge, true);
+	device_set_wakeup_capable(root->bus->bridge, device->wakeup.flags.valid);
 
 	if (hotadd) {
 		pcibios_resource_survey_bus(root->bus);
@@ -649,7 +648,7 @@ static void acpi_pci_root_remove(struct acpi_device *device)
 	pci_stop_root_bus(root->bus);
 
 	pci_ioapic_remove(root);
-	device_set_run_wake(root->bus->bridge, false);
+	device_set_wakeup_capable(root->bus->bridge, false);
 	pci_acpi_remove_bus_pm_notifier(device);
 
 	pci_remove_root_bus(root->bus);
diff --git a/drivers/acpi/proc.c b/drivers/acpi/proc.c
index a34669cc823b..85ac848ac6ab 100644
--- a/drivers/acpi/proc.c
+++ b/drivers/acpi/proc.c
@@ -42,7 +42,7 @@ acpi_system_wakeup_device_seq_show(struct seq_file *seq, void *offset)
 
 		if (!dev->physical_node_count) {
 			seq_printf(seq, "%c%-8s\n",
-				   dev->wakeup.flags.run_wake ? '*' : ' ',
+				   dev->wakeup.flags.valid ? '*' : ' ',
 				   device_may_wakeup(&dev->dev) ?
 				   "enabled" : "disabled");
 		} else {
@@ -58,7 +58,7 @@ acpi_system_wakeup_device_seq_show(struct seq_file *seq, void *offset)
 					seq_printf(seq, "\t\t");
 
 				seq_printf(seq, "%c%-8s  %s:%s\n",
-					   dev->wakeup.flags.run_wake ? '*' : ' ',
+					   dev->wakeup.flags.valid ? '*' : ' ',
 					   (device_may_wakeup(&dev->dev) ||
 					   device_may_wakeup(ldev)) ?
 					   "enabled" : "disabled",
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index d53162997f32..09f65f57bebe 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -835,7 +835,7 @@ static int acpi_bus_extract_wakeup_device_power_package(acpi_handle handle,
 	return err;
 }
 
-static void acpi_wakeup_gpe_init(struct acpi_device *device)
+static bool acpi_wakeup_gpe_init(struct acpi_device *device)
 {
 	static const struct acpi_device_id button_device_ids[] = {
 		{"PNP0C0C", 0},
@@ -845,13 +845,11 @@ static void acpi_wakeup_gpe_init(struct acpi_device *device)
 	};
 	struct acpi_device_wakeup *wakeup = &device->wakeup;
 	acpi_status status;
-	acpi_event_status event_status;
 
 	wakeup->flags.notifier_present = 0;
 
 	/* Power button, Lid switch always enable wakeup */
 	if (!acpi_match_device_ids(device, button_device_ids)) {
-		wakeup->flags.run_wake = 1;
 		if (!acpi_match_device_ids(device, &button_device_ids[1])) {
 			/* Do not use Lid/sleep button for S5 wakeup */
 			if (wakeup->sleep_state == ACPI_STATE_S5)
@@ -859,17 +857,12 @@ static void acpi_wakeup_gpe_init(struct acpi_device *device)
 		}
 		acpi_mark_gpe_for_wake(wakeup->gpe_device, wakeup->gpe_number);
 		device_set_wakeup_capable(&device->dev, true);
-		return;
+		return true;
 	}
 
-	acpi_setup_gpe_for_wake(device->handle, wakeup->gpe_device,
-				wakeup->gpe_number);
-	status = acpi_get_gpe_status(wakeup->gpe_device, wakeup->gpe_number,
-				     &event_status);
-	if (ACPI_FAILURE(status))
-		return;
-
-	wakeup->flags.run_wake = !!(event_status & ACPI_EVENT_FLAG_HAS_HANDLER);
+	status = acpi_setup_gpe_for_wake(device->handle, wakeup->gpe_device,
+					 wakeup->gpe_number);
+	return ACPI_SUCCESS(status);
 }
 
 static void acpi_bus_get_wakeup_device_flags(struct acpi_device *device)
@@ -887,10 +880,10 @@ static void acpi_bus_get_wakeup_device_flags(struct acpi_device *device)
 		return;
 	}
 
-	device->wakeup.flags.valid = 1;
+	device->wakeup.flags.valid = acpi_wakeup_gpe_init(device);
 	device->wakeup.prepare_count = 0;
-	acpi_wakeup_gpe_init(device);
-	/* Call _PSW/_DSW object to disable its ability to wake the sleeping
+	/*
+	 * Call _PSW/_DSW object to disable its ability to wake the sleeping
 	 * system for the ACPI device with the _PRW object.
 	 * The _PSW object is depreciated in ACPI 3.0 and is replaced by _DSW.
 	 * So it is necessary to call _DSW object first. Only when it is not
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 097d630ab886..be17664736b2 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -650,38 +650,165 @@ static const struct platform_suspend_ops acpi_suspend_ops_old = {
 	.recover = acpi_pm_finish,
 };
 
+static bool s2idle_in_progress;
+static bool s2idle_wakeup;
+
+/*
+ * On platforms supporting the Low Power S0 Idle interface there is an ACPI
+ * device object with the PNP0D80 compatible device ID (System Power Management
+ * Controller) and a specific _DSM method under it. That method, if present,
+ * can be used to indicate to the platform that the OS is transitioning into a
+ * low-power state in which certain types of activity are not desirable or that
+ * it is leaving such a state, which allows the platform to adjust its operation
+ * mode accordingly.
+ */
+static const struct acpi_device_id lps0_device_ids[] = {
+	{"PNP0D80", },
+	{"", },
+};
+
+#define ACPI_LPS0_DSM_UUID	"c4eb40a0-6cd2-11e2-bcfd-0800200c9a66"
+
+#define ACPI_LPS0_SCREEN_OFF	3
+#define ACPI_LPS0_SCREEN_ON	4
+#define ACPI_LPS0_ENTRY		5
+#define ACPI_LPS0_EXIT		6
+
+#define ACPI_S2IDLE_FUNC_MASK	((1 << ACPI_LPS0_ENTRY) | (1 << ACPI_LPS0_EXIT))
+
+static acpi_handle lps0_device_handle;
+static guid_t lps0_dsm_guid;
+static char lps0_dsm_func_mask;
+
+static void acpi_sleep_run_lps0_dsm(unsigned int func)
+{
+	union acpi_object *out_obj;
+
+	if (!(lps0_dsm_func_mask & (1 << func)))
+		return;
+
+	out_obj = acpi_evaluate_dsm(lps0_device_handle, &lps0_dsm_guid, 1, func, NULL);
+	ACPI_FREE(out_obj);
+
+	acpi_handle_debug(lps0_device_handle, "_DSM function %u evaluation %s\n",
+			  func, out_obj ? "successful" : "failed");
+}
+
+static int lps0_device_attach(struct acpi_device *adev,
+			      const struct acpi_device_id *not_used)
+{
+	union acpi_object *out_obj;
+
+	if (lps0_device_handle)
+		return 0;
+
+	if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0))
+		return 0;
+
+	guid_parse(ACPI_LPS0_DSM_UUID, &lps0_dsm_guid);
+	/* Check if the _DSM is present and as expected. */
+	out_obj = acpi_evaluate_dsm(adev->handle, &lps0_dsm_guid, 1, 0, NULL);
+	if (out_obj && out_obj->type == ACPI_TYPE_BUFFER) {
+		char bitmask = *(char *)out_obj->buffer.pointer;
+
+		if ((bitmask & ACPI_S2IDLE_FUNC_MASK) == ACPI_S2IDLE_FUNC_MASK) {
+			lps0_dsm_func_mask = bitmask;
+			lps0_device_handle = adev->handle;
+		}
+
+		acpi_handle_debug(adev->handle, "_DSM function mask: 0x%x\n",
+				  bitmask);
+	} else {
+		acpi_handle_debug(adev->handle,
+				  "_DSM function 0 evaluation failed\n");
+	}
+	ACPI_FREE(out_obj);
+	return 0;
+}
+
+static struct acpi_scan_handler lps0_handler = {
+	.ids = lps0_device_ids,
+	.attach = lps0_device_attach,
+};
+
 static int acpi_freeze_begin(void)
 {
 	acpi_scan_lock_acquire();
+	s2idle_in_progress = true;
 	return 0;
 }
 
 static int acpi_freeze_prepare(void)
 {
-	acpi_enable_wakeup_devices(ACPI_STATE_S0);
-	acpi_enable_all_wakeup_gpes();
-	acpi_os_wait_events_complete();
+	if (lps0_device_handle) {
+		acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF);
+		acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY);
+	} else {
+		/*
+		 * The configuration of GPEs is changed here to avoid spurious
+		 * wakeups, but that should not be necessary if this is a
+		 * "low-power S0" platform and the low-power S0 _DSM is present.
+		 */
+		acpi_enable_all_wakeup_gpes();
+		acpi_os_wait_events_complete();
+	}
 	if (acpi_sci_irq_valid())
 		enable_irq_wake(acpi_sci_irq);
+
 	return 0;
 }
 
+static void acpi_freeze_wake(void)
+{
+	/*
+	 * If IRQD_WAKEUP_ARMED is not set for the SCI at this point, it means
+	 * that the SCI has triggered while suspended, so cancel the wakeup in
+	 * case it has not been a wakeup event (the GPEs will be checked later).
+	 */
+	if (acpi_sci_irq_valid() &&
+	    !irqd_is_wakeup_armed(irq_get_irq_data(acpi_sci_irq))) {
+		pm_system_cancel_wakeup();
+		s2idle_wakeup = true;
+	}
+}
+
+static void acpi_freeze_sync(void)
+{
+	/*
+	 * Process all pending events in case there are any wakeup ones.
+	 *
+	 * The EC driver uses the system workqueue, so that one needs to be
+	 * flushed too.
+	 */
+	acpi_os_wait_events_complete();
+	flush_scheduled_work();
+	s2idle_wakeup = false;
+}
+
 static void acpi_freeze_restore(void)
 {
-	acpi_disable_wakeup_devices(ACPI_STATE_S0);
 	if (acpi_sci_irq_valid())
 		disable_irq_wake(acpi_sci_irq);
-	acpi_enable_all_runtime_gpes();
+
+	if (lps0_device_handle) {
+		acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT);
+		acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON);
+	} else {
+		acpi_enable_all_runtime_gpes();
+	}
 }
 
 static void acpi_freeze_end(void)
 {
+	s2idle_in_progress = false;
 	acpi_scan_lock_release();
 }
 
 static const struct platform_freeze_ops acpi_freeze_ops = {
 	.begin = acpi_freeze_begin,
 	.prepare = acpi_freeze_prepare,
+	.wake = acpi_freeze_wake,
+	.sync = acpi_freeze_sync,
 	.restore = acpi_freeze_restore,
 	.end = acpi_freeze_end,
 };
@@ -696,13 +823,28 @@ static void acpi_sleep_suspend_setup(void)
696 823
697 suspend_set_ops(old_suspend_ordering ? 824 suspend_set_ops(old_suspend_ordering ?
698 &acpi_suspend_ops_old : &acpi_suspend_ops); 825 &acpi_suspend_ops_old : &acpi_suspend_ops);
826
827 acpi_scan_add_handler(&lps0_handler);
699 freeze_set_ops(&acpi_freeze_ops); 828 freeze_set_ops(&acpi_freeze_ops);
700} 829}
701 830
702#else /* !CONFIG_SUSPEND */ 831#else /* !CONFIG_SUSPEND */
832#define s2idle_in_progress (false)
833#define s2idle_wakeup (false)
834#define lps0_device_handle (NULL)
703static inline void acpi_sleep_suspend_setup(void) {} 835static inline void acpi_sleep_suspend_setup(void) {}
704#endif /* !CONFIG_SUSPEND */ 836#endif /* !CONFIG_SUSPEND */
705 837
838bool acpi_s2idle_wakeup(void)
839{
840 return s2idle_wakeup;
841}
842
843bool acpi_sleep_no_ec_events(void)
844{
845 return !s2idle_in_progress || !lps0_device_handle;
846}
847
706#ifdef CONFIG_PM_SLEEP 848#ifdef CONFIG_PM_SLEEP
707static u32 saved_bm_rld; 849static u32 saved_bm_rld;
708 850
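The suspend-to-idle rework above gates every Low Power S0 _DSM call on the function mask that firmware reports through _DSM function 0, and brackets the idle period with SCREEN_OFF/ENTRY on the way down and EXIT/SCREEN_ON on the way up. A minimal userspace model of that gating, assuming the usual ACPI_LPS0_* function indices (3 through 6) and a made-up mask value:

#include <stdio.h>

/* Function indices assumed to match the kernel's ACPI_LPS0_* values;
 * the mask is a made-up example of what _DSM function 0 might return. */
#define LPS0_SCREEN_OFF	3
#define LPS0_SCREEN_ON	4
#define LPS0_ENTRY	5
#define LPS0_EXIT	6

static unsigned char dsm_func_mask = 0x78;	/* bits 3..6 set */

static void run_lps0_dsm(unsigned int func)
{
	if (!(dsm_func_mask & (1U << func)))
		return;		/* firmware does not implement this one */
	printf("_DSM function %u evaluated\n", func);
}

int main(void)
{
	run_lps0_dsm(LPS0_SCREEN_OFF);	/* prepare, as in acpi_freeze_prepare() */
	run_lps0_dsm(LPS0_ENTRY);
	/* ... platform idles until a wakeup event ... */
	run_lps0_dsm(LPS0_EXIT);	/* restore, as in acpi_freeze_restore() */
	run_lps0_dsm(LPS0_SCREEN_ON);
	return 0;
}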
diff --git a/drivers/ata/libata-zpodd.c b/drivers/ata/libata-zpodd.c
index f3a65a3140d3..8a01d09ac4db 100644
--- a/drivers/ata/libata-zpodd.c
+++ b/drivers/ata/libata-zpodd.c
@@ -174,8 +174,7 @@ void zpodd_enable_run_wake(struct ata_device *dev)
174 sdev_disable_disk_events(dev->sdev); 174 sdev_disable_disk_events(dev->sdev);
175 175
176 zpodd->powered_off = true; 176 zpodd->powered_off = true;
177 device_set_run_wake(&dev->tdev, true); 177 acpi_pm_set_device_wakeup(&dev->tdev, true);
178 acpi_pm_device_run_wake(&dev->tdev, true);
179} 178}
180 179
181/* Disable runtime wake capability if it is enabled */ 180/* Disable runtime wake capability if it is enabled */
@@ -183,10 +182,8 @@ void zpodd_disable_run_wake(struct ata_device *dev)
183{ 182{
184 struct zpodd *zpodd = dev->zpodd; 183 struct zpodd *zpodd = dev->zpodd;
185 184
186 if (zpodd->powered_off) { 185 if (zpodd->powered_off)
187 acpi_pm_device_run_wake(&dev->tdev, false); 186 acpi_pm_set_device_wakeup(&dev->tdev, false);
188 device_set_run_wake(&dev->tdev, false);
189 }
190} 187}
191 188
192/* 189/*
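The zpodd hunks are one instance of the PCI/ACPI wakeup simplification: the former pair of calls, one marking the device run-wake capable and one enabling the ACPI side, collapses into a single acpi_pm_set_device_wakeup(). A hedged kernel-context sketch of the new call-site shape; the helper around it is illustrative, only the API name comes from the diff:

#include <linux/acpi.h>
#include <linux/device.h>

/* Kernel-context sketch, not standalone. */
static void example_enable_wake(struct device *dev)
{
	/* before: device_set_run_wake(dev, true);
	 *         acpi_pm_device_run_wake(dev, true);
	 */
	acpi_pm_set_device_wakeup(dev, true);	/* one call covers both */
}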
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index da49a8383dc3..b8e4b966c74d 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -126,7 +126,7 @@ static const struct genpd_lock_ops genpd_spin_ops = {
126#define genpd_is_always_on(genpd) (genpd->flags & GENPD_FLAG_ALWAYS_ON) 126#define genpd_is_always_on(genpd) (genpd->flags & GENPD_FLAG_ALWAYS_ON)
127 127
128static inline bool irq_safe_dev_in_no_sleep_domain(struct device *dev, 128static inline bool irq_safe_dev_in_no_sleep_domain(struct device *dev,
129 struct generic_pm_domain *genpd) 129 const struct generic_pm_domain *genpd)
130{ 130{
131 bool ret; 131 bool ret;
132 132
@@ -181,12 +181,14 @@ static struct generic_pm_domain *dev_to_genpd(struct device *dev)
181 return pd_to_genpd(dev->pm_domain); 181 return pd_to_genpd(dev->pm_domain);
182} 182}
183 183
184static int genpd_stop_dev(struct generic_pm_domain *genpd, struct device *dev) 184static int genpd_stop_dev(const struct generic_pm_domain *genpd,
185 struct device *dev)
185{ 186{
186 return GENPD_DEV_CALLBACK(genpd, int, stop, dev); 187 return GENPD_DEV_CALLBACK(genpd, int, stop, dev);
187} 188}
188 189
189static int genpd_start_dev(struct generic_pm_domain *genpd, struct device *dev) 190static int genpd_start_dev(const struct generic_pm_domain *genpd,
191 struct device *dev)
190{ 192{
191 return GENPD_DEV_CALLBACK(genpd, int, start, dev); 193 return GENPD_DEV_CALLBACK(genpd, int, start, dev);
192} 194}
@@ -443,7 +445,7 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
443 445
444 pdd = dev->power.subsys_data ? 446 pdd = dev->power.subsys_data ?
445 dev->power.subsys_data->domain_data : NULL; 447 dev->power.subsys_data->domain_data : NULL;
446 if (pdd && pdd->dev) { 448 if (pdd) {
447 to_gpd_data(pdd)->td.constraint_changed = true; 449 to_gpd_data(pdd)->td.constraint_changed = true;
448 genpd = dev_to_genpd(dev); 450 genpd = dev_to_genpd(dev);
449 } else { 451 } else {
@@ -738,7 +740,7 @@ static bool pm_genpd_present(const struct generic_pm_domain *genpd)
738 740
739#ifdef CONFIG_PM_SLEEP 741#ifdef CONFIG_PM_SLEEP
740 742
741static bool genpd_dev_active_wakeup(struct generic_pm_domain *genpd, 743static bool genpd_dev_active_wakeup(const struct generic_pm_domain *genpd,
742 struct device *dev) 744 struct device *dev)
743{ 745{
744 return GENPD_DEV_CALLBACK(genpd, bool, active_wakeup, dev); 746 return GENPD_DEV_CALLBACK(genpd, bool, active_wakeup, dev);
@@ -840,7 +842,8 @@ static void genpd_sync_power_on(struct generic_pm_domain *genpd, bool use_lock,
840 * signal remote wakeup from the system's working state as needed by runtime PM. 842 * signal remote wakeup from the system's working state as needed by runtime PM.
841 * Return 'true' in either of the above cases. 843 * Return 'true' in either of the above cases.
842 */ 844 */
843static bool resume_needed(struct device *dev, struct generic_pm_domain *genpd) 845static bool resume_needed(struct device *dev,
846 const struct generic_pm_domain *genpd)
844{ 847{
845 bool active_wakeup; 848 bool active_wakeup;
846 849
@@ -899,19 +902,19 @@ static int pm_genpd_prepare(struct device *dev)
899} 902}
900 903
901/** 904/**
902 * pm_genpd_suspend_noirq - Completion of suspend of device in an I/O PM domain. 905 * genpd_finish_suspend - Completion of suspend or hibernation of device in an
906 * I/O pm domain.
903 * @dev: Device to suspend. 907 * @dev: Device to suspend.
908 * @poweroff: Specifies if this is a poweroff_noirq or suspend_noirq callback.
904 * 909 *
905 * Stop the device and remove power from the domain if all devices in it have 910 * Stop the device and remove power from the domain if all devices in it have
906 * been stopped. 911 * been stopped.
907 */ 912 */
908static int pm_genpd_suspend_noirq(struct device *dev) 913static int genpd_finish_suspend(struct device *dev, bool poweroff)
909{ 914{
910 struct generic_pm_domain *genpd; 915 struct generic_pm_domain *genpd;
911 int ret; 916 int ret;
912 917
913 dev_dbg(dev, "%s()\n", __func__);
914
915 genpd = dev_to_genpd(dev); 918 genpd = dev_to_genpd(dev);
916 if (IS_ERR(genpd)) 919 if (IS_ERR(genpd))
917 return -EINVAL; 920 return -EINVAL;
@@ -919,6 +922,13 @@ static int pm_genpd_suspend_noirq(struct device *dev)
919 if (dev->power.wakeup_path && genpd_dev_active_wakeup(genpd, dev)) 922 if (dev->power.wakeup_path && genpd_dev_active_wakeup(genpd, dev))
920 return 0; 923 return 0;
921 924
925 if (poweroff)
926 ret = pm_generic_poweroff_noirq(dev);
927 else
928 ret = pm_generic_suspend_noirq(dev);
929 if (ret)
930 return ret;
931
922 if (genpd->dev_ops.stop && genpd->dev_ops.start) { 932 if (genpd->dev_ops.stop && genpd->dev_ops.start) {
923 ret = pm_runtime_force_suspend(dev); 933 ret = pm_runtime_force_suspend(dev);
924 if (ret) 934 if (ret)
@@ -934,6 +944,20 @@ static int pm_genpd_suspend_noirq(struct device *dev)
934} 944}
935 945
936/** 946/**
947 * pm_genpd_suspend_noirq - Completion of suspend of device in an I/O PM domain.
948 * @dev: Device to suspend.
949 *
950 * Stop the device and remove power from the domain if all devices in it have
951 * been stopped.
952 */
953static int pm_genpd_suspend_noirq(struct device *dev)
954{
955 dev_dbg(dev, "%s()\n", __func__);
956
957 return genpd_finish_suspend(dev, false);
958}
959
960/**
937 * pm_genpd_resume_noirq - Start of resume of device in an I/O PM domain. 961 * pm_genpd_resume_noirq - Start of resume of device in an I/O PM domain.
938 * @dev: Device to resume. 962 * @dev: Device to resume.
939 * 963 *
@@ -961,6 +985,10 @@ static int pm_genpd_resume_noirq(struct device *dev)
961 if (genpd->dev_ops.stop && genpd->dev_ops.start) 985 if (genpd->dev_ops.stop && genpd->dev_ops.start)
962 ret = pm_runtime_force_resume(dev); 986 ret = pm_runtime_force_resume(dev);
963 987
988 ret = pm_generic_resume_noirq(dev);
989 if (ret)
990 return ret;
991
964 return ret; 992 return ret;
965} 993}
966 994
@@ -975,7 +1003,7 @@ static int pm_genpd_resume_noirq(struct device *dev)
975 */ 1003 */
976static int pm_genpd_freeze_noirq(struct device *dev) 1004static int pm_genpd_freeze_noirq(struct device *dev)
977{ 1005{
978 struct generic_pm_domain *genpd; 1006 const struct generic_pm_domain *genpd;
979 int ret = 0; 1007 int ret = 0;
980 1008
981 dev_dbg(dev, "%s()\n", __func__); 1009 dev_dbg(dev, "%s()\n", __func__);
@@ -984,6 +1012,10 @@ static int pm_genpd_freeze_noirq(struct device *dev)
984 if (IS_ERR(genpd)) 1012 if (IS_ERR(genpd))
985 return -EINVAL; 1013 return -EINVAL;
986 1014
1015 ret = pm_generic_freeze_noirq(dev);
1016 if (ret)
1017 return ret;
1018
987 if (genpd->dev_ops.stop && genpd->dev_ops.start) 1019 if (genpd->dev_ops.stop && genpd->dev_ops.start)
988 ret = pm_runtime_force_suspend(dev); 1020 ret = pm_runtime_force_suspend(dev);
989 1021
@@ -999,7 +1031,7 @@ static int pm_genpd_freeze_noirq(struct device *dev)
999 */ 1031 */
1000static int pm_genpd_thaw_noirq(struct device *dev) 1032static int pm_genpd_thaw_noirq(struct device *dev)
1001{ 1033{
1002 struct generic_pm_domain *genpd; 1034 const struct generic_pm_domain *genpd;
1003 int ret = 0; 1035 int ret = 0;
1004 1036
1005 dev_dbg(dev, "%s()\n", __func__); 1037 dev_dbg(dev, "%s()\n", __func__);
@@ -1008,10 +1040,28 @@ static int pm_genpd_thaw_noirq(struct device *dev)
1008 if (IS_ERR(genpd)) 1040 if (IS_ERR(genpd))
1009 return -EINVAL; 1041 return -EINVAL;
1010 1042
1011 if (genpd->dev_ops.stop && genpd->dev_ops.start) 1043 if (genpd->dev_ops.stop && genpd->dev_ops.start) {
1012 ret = pm_runtime_force_resume(dev); 1044 ret = pm_runtime_force_resume(dev);
1045 if (ret)
1046 return ret;
1047 }
1013 1048
1014 return ret; 1049 return pm_generic_thaw_noirq(dev);
1050}
1051
1052/**
1053 * pm_genpd_poweroff_noirq - Completion of hibernation of device in an
1054 * I/O PM domain.
1055 * @dev: Device to poweroff.
1056 *
1057 * Stop the device and remove power from the domain if all devices in it have
1058 * been stopped.
1059 */
1060static int pm_genpd_poweroff_noirq(struct device *dev)
1061{
1062 dev_dbg(dev, "%s()\n", __func__);
1063
1064 return genpd_finish_suspend(dev, true);
1015} 1065}
1016 1066
1017/** 1067/**
@@ -1048,10 +1098,13 @@ static int pm_genpd_restore_noirq(struct device *dev)
1048 genpd_sync_power_on(genpd, true, 0); 1098 genpd_sync_power_on(genpd, true, 0);
1049 genpd_unlock(genpd); 1099 genpd_unlock(genpd);
1050 1100
1051 if (genpd->dev_ops.stop && genpd->dev_ops.start) 1101 if (genpd->dev_ops.stop && genpd->dev_ops.start) {
1052 ret = pm_runtime_force_resume(dev); 1102 ret = pm_runtime_force_resume(dev);
1103 if (ret)
1104 return ret;
1105 }
1053 1106
1054 return ret; 1107 return pm_generic_restore_noirq(dev);
1055} 1108}
1056 1109
1057/** 1110/**
@@ -1095,8 +1148,8 @@ static void genpd_syscore_switch(struct device *dev, bool suspend)
1095{ 1148{
1096 struct generic_pm_domain *genpd; 1149 struct generic_pm_domain *genpd;
1097 1150
1098 genpd = dev_to_genpd(dev); 1151 genpd = genpd_lookup_dev(dev);
1099 if (!pm_genpd_present(genpd)) 1152 if (!genpd)
1100 return; 1153 return;
1101 1154
1102 if (suspend) { 1155 if (suspend) {
@@ -1393,7 +1446,7 @@ EXPORT_SYMBOL_GPL(pm_genpd_add_subdomain);
1393int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd, 1446int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
1394 struct generic_pm_domain *subdomain) 1447 struct generic_pm_domain *subdomain)
1395{ 1448{
1396 struct gpd_link *link; 1449 struct gpd_link *l, *link;
1397 int ret = -EINVAL; 1450 int ret = -EINVAL;
1398 1451
1399 if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(subdomain)) 1452 if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(subdomain))
@@ -1409,7 +1462,7 @@ int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
1409 goto out; 1462 goto out;
1410 } 1463 }
1411 1464
1412 list_for_each_entry(link, &genpd->master_links, master_node) { 1465 list_for_each_entry_safe(link, l, &genpd->master_links, master_node) {
1413 if (link->slave != subdomain) 1466 if (link->slave != subdomain)
1414 continue; 1467 continue;
1415 1468
@@ -1493,7 +1546,7 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
1493 genpd->domain.ops.resume_noirq = pm_genpd_resume_noirq; 1546 genpd->domain.ops.resume_noirq = pm_genpd_resume_noirq;
1494 genpd->domain.ops.freeze_noirq = pm_genpd_freeze_noirq; 1547 genpd->domain.ops.freeze_noirq = pm_genpd_freeze_noirq;
1495 genpd->domain.ops.thaw_noirq = pm_genpd_thaw_noirq; 1548 genpd->domain.ops.thaw_noirq = pm_genpd_thaw_noirq;
1496 genpd->domain.ops.poweroff_noirq = pm_genpd_suspend_noirq; 1549 genpd->domain.ops.poweroff_noirq = pm_genpd_poweroff_noirq;
1497 genpd->domain.ops.restore_noirq = pm_genpd_restore_noirq; 1550 genpd->domain.ops.restore_noirq = pm_genpd_restore_noirq;
1498 genpd->domain.ops.complete = pm_genpd_complete; 1551 genpd->domain.ops.complete = pm_genpd_complete;
1499 1552
@@ -1780,12 +1833,12 @@ EXPORT_SYMBOL_GPL(of_genpd_add_provider_onecell);
1780 */ 1833 */
1781void of_genpd_del_provider(struct device_node *np) 1834void of_genpd_del_provider(struct device_node *np)
1782{ 1835{
1783 struct of_genpd_provider *cp; 1836 struct of_genpd_provider *cp, *tmp;
1784 struct generic_pm_domain *gpd; 1837 struct generic_pm_domain *gpd;
1785 1838
1786 mutex_lock(&gpd_list_lock); 1839 mutex_lock(&gpd_list_lock);
1787 mutex_lock(&of_genpd_mutex); 1840 mutex_lock(&of_genpd_mutex);
1788 list_for_each_entry(cp, &of_genpd_providers, link) { 1841 list_for_each_entry_safe(cp, tmp, &of_genpd_providers, link) {
1789 if (cp->node == np) { 1842 if (cp->node == np) {
1790 /* 1843 /*
1791 * For each PM domain associated with the 1844 * For each PM domain associated with the
@@ -1925,14 +1978,14 @@ EXPORT_SYMBOL_GPL(of_genpd_add_subdomain);
1925 */ 1978 */
1926struct generic_pm_domain *of_genpd_remove_last(struct device_node *np) 1979struct generic_pm_domain *of_genpd_remove_last(struct device_node *np)
1927{ 1980{
1928 struct generic_pm_domain *gpd, *genpd = ERR_PTR(-ENOENT); 1981 struct generic_pm_domain *gpd, *tmp, *genpd = ERR_PTR(-ENOENT);
1929 int ret; 1982 int ret;
1930 1983
1931 if (IS_ERR_OR_NULL(np)) 1984 if (IS_ERR_OR_NULL(np))
1932 return ERR_PTR(-EINVAL); 1985 return ERR_PTR(-EINVAL);
1933 1986
1934 mutex_lock(&gpd_list_lock); 1987 mutex_lock(&gpd_list_lock);
1935 list_for_each_entry(gpd, &gpd_list, gpd_list_node) { 1988 list_for_each_entry_safe(gpd, tmp, &gpd_list, gpd_list_node) {
1936 if (gpd->provider == &np->fwnode) { 1989 if (gpd->provider == &np->fwnode) {
1937 ret = genpd_remove(gpd); 1990 ret = genpd_remove(gpd);
1938 genpd = ret ? ERR_PTR(ret) : gpd; 1991 genpd = ret ? ERR_PTR(ret) : gpd;
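The genpd changes above fold the suspend_noirq and poweroff_noirq tails into one genpd_finish_suspend(dev, poweroff) helper, and switch list walks that may drop the current entry to the list_for_each_entry_safe() iterators. A small runnable model of the shared-tail refactor, with illustrative names rather than kernel APIs:

#include <stdio.h>

static int generic_suspend_noirq(void)  { puts("suspend_noirq cb");  return 0; }
static int generic_poweroff_noirq(void) { puts("poweroff_noirq cb"); return 0; }

static int finish_suspend(int poweroff)
{
	int ret = poweroff ? generic_poweroff_noirq()
			   : generic_suspend_noirq();
	if (ret)
		return ret;

	puts("stop device; power off domain if it is now idle");
	return 0;
}

static int suspend_noirq(void)  { return finish_suspend(0); }
static int poweroff_noirq(void) { return finish_suspend(1); }

int main(void)
{
	suspend_noirq();   /* system suspend path */
	poweroff_noirq();  /* hibernation path, same tail */
	return 0;
}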
diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
index 2e0fce711135..281f949c5ffe 100644
--- a/drivers/base/power/domain_governor.c
+++ b/drivers/base/power/domain_governor.c
@@ -92,12 +92,6 @@ static bool default_suspend_ok(struct device *dev)
92 return td->cached_suspend_ok; 92 return td->cached_suspend_ok;
93} 93}
94 94
95/**
96 * default_power_down_ok - Default generic PM domain power off governor routine.
97 * @pd: PM domain to check.
98 *
99 * This routine must be executed under the PM domain's lock.
100 */
101static bool __default_power_down_ok(struct dev_pm_domain *pd, 95static bool __default_power_down_ok(struct dev_pm_domain *pd,
102 unsigned int state) 96 unsigned int state)
103{ 97{
@@ -187,6 +181,12 @@ static bool __default_power_down_ok(struct dev_pm_domain *pd,
187 return true; 181 return true;
188} 182}
189 183
184/**
185 * default_power_down_ok - Default generic PM domain power off governor routine.
186 * @pd: PM domain to check.
187 *
188 * This routine must be executed under the PM domain's lock.
189 */
190static bool default_power_down_ok(struct dev_pm_domain *pd) 190static bool default_power_down_ok(struct dev_pm_domain *pd)
191{ 191{
192 struct generic_pm_domain *genpd = pd_to_genpd(pd); 192 struct generic_pm_domain *genpd = pd_to_genpd(pd);
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 9faee1c893e5..c99f8730de82 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -62,7 +62,7 @@ static pm_message_t pm_transition;
62 62
63static int async_error; 63static int async_error;
64 64
65static char *pm_verb(int event) 65static const char *pm_verb(int event)
66{ 66{
67 switch (event) { 67 switch (event) {
68 case PM_EVENT_SUSPEND: 68 case PM_EVENT_SUSPEND:
@@ -208,7 +208,8 @@ static ktime_t initcall_debug_start(struct device *dev)
208} 208}
209 209
210static void initcall_debug_report(struct device *dev, ktime_t calltime, 210static void initcall_debug_report(struct device *dev, ktime_t calltime,
211 int error, pm_message_t state, char *info) 211 int error, pm_message_t state,
212 const char *info)
212{ 213{
213 ktime_t rettime; 214 ktime_t rettime;
214 s64 nsecs; 215 s64 nsecs;
@@ -403,21 +404,23 @@ static pm_callback_t pm_noirq_op(const struct dev_pm_ops *ops, pm_message_t stat
403 return NULL; 404 return NULL;
404} 405}
405 406
406static void pm_dev_dbg(struct device *dev, pm_message_t state, char *info) 407static void pm_dev_dbg(struct device *dev, pm_message_t state, const char *info)
407{ 408{
408 dev_dbg(dev, "%s%s%s\n", info, pm_verb(state.event), 409 dev_dbg(dev, "%s%s%s\n", info, pm_verb(state.event),
409 ((state.event & PM_EVENT_SLEEP) && device_may_wakeup(dev)) ? 410 ((state.event & PM_EVENT_SLEEP) && device_may_wakeup(dev)) ?
410 ", may wakeup" : ""); 411 ", may wakeup" : "");
411} 412}
412 413
413static void pm_dev_err(struct device *dev, pm_message_t state, char *info, 414static void pm_dev_err(struct device *dev, pm_message_t state, const char *info,
414 int error) 415 int error)
415{ 416{
416 printk(KERN_ERR "PM: Device %s failed to %s%s: error %d\n", 417 printk(KERN_ERR "PM: Device %s failed to %s%s: error %d\n",
417 dev_name(dev), pm_verb(state.event), info, error); 418 dev_name(dev), pm_verb(state.event), info, error);
418} 419}
419 420
420static void dpm_show_time(ktime_t starttime, pm_message_t state, char *info) 421#ifdef CONFIG_PM_DEBUG
422static void dpm_show_time(ktime_t starttime, pm_message_t state,
423 const char *info)
421{ 424{
422 ktime_t calltime; 425 ktime_t calltime;
423 u64 usecs64; 426 u64 usecs64;
@@ -433,9 +436,13 @@ static void dpm_show_time(ktime_t starttime, pm_message_t state, char *info)
433 info ?: "", info ? " " : "", pm_verb(state.event), 436 info ?: "", info ? " " : "", pm_verb(state.event),
434 usecs / USEC_PER_MSEC, usecs % USEC_PER_MSEC); 437 usecs / USEC_PER_MSEC, usecs % USEC_PER_MSEC);
435} 438}
439#else
440static inline void dpm_show_time(ktime_t starttime, pm_message_t state,
441 const char *info) {}
442#endif /* CONFIG_PM_DEBUG */
436 443
437static int dpm_run_callback(pm_callback_t cb, struct device *dev, 444static int dpm_run_callback(pm_callback_t cb, struct device *dev,
438 pm_message_t state, char *info) 445 pm_message_t state, const char *info)
439{ 446{
440 ktime_t calltime; 447 ktime_t calltime;
441 int error; 448 int error;
@@ -535,7 +542,7 @@ static void dpm_watchdog_clear(struct dpm_watchdog *wd)
535static int device_resume_noirq(struct device *dev, pm_message_t state, bool async) 542static int device_resume_noirq(struct device *dev, pm_message_t state, bool async)
536{ 543{
537 pm_callback_t callback = NULL; 544 pm_callback_t callback = NULL;
538 char *info = NULL; 545 const char *info = NULL;
539 int error = 0; 546 int error = 0;
540 547
541 TRACE_DEVICE(dev); 548 TRACE_DEVICE(dev);
@@ -665,7 +672,7 @@ void dpm_resume_noirq(pm_message_t state)
665static int device_resume_early(struct device *dev, pm_message_t state, bool async) 672static int device_resume_early(struct device *dev, pm_message_t state, bool async)
666{ 673{
667 pm_callback_t callback = NULL; 674 pm_callback_t callback = NULL;
668 char *info = NULL; 675 const char *info = NULL;
669 int error = 0; 676 int error = 0;
670 677
671 TRACE_DEVICE(dev); 678 TRACE_DEVICE(dev);
@@ -793,7 +800,7 @@ EXPORT_SYMBOL_GPL(dpm_resume_start);
793static int device_resume(struct device *dev, pm_message_t state, bool async) 800static int device_resume(struct device *dev, pm_message_t state, bool async)
794{ 801{
795 pm_callback_t callback = NULL; 802 pm_callback_t callback = NULL;
796 char *info = NULL; 803 const char *info = NULL;
797 int error = 0; 804 int error = 0;
798 DECLARE_DPM_WATCHDOG_ON_STACK(wd); 805 DECLARE_DPM_WATCHDOG_ON_STACK(wd);
799 806
@@ -955,7 +962,7 @@ void dpm_resume(pm_message_t state)
955static void device_complete(struct device *dev, pm_message_t state) 962static void device_complete(struct device *dev, pm_message_t state)
956{ 963{
957 void (*callback)(struct device *) = NULL; 964 void (*callback)(struct device *) = NULL;
958 char *info = NULL; 965 const char *info = NULL;
959 966
960 if (dev->power.syscore) 967 if (dev->power.syscore)
961 return; 968 return;
@@ -1080,7 +1087,7 @@ static pm_message_t resume_event(pm_message_t sleep_state)
1080static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool async) 1087static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool async)
1081{ 1088{
1082 pm_callback_t callback = NULL; 1089 pm_callback_t callback = NULL;
1083 char *info = NULL; 1090 const char *info = NULL;
1084 int error = 0; 1091 int error = 0;
1085 1092
1086 TRACE_DEVICE(dev); 1093 TRACE_DEVICE(dev);
@@ -1091,11 +1098,6 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a
1091 if (async_error) 1098 if (async_error)
1092 goto Complete; 1099 goto Complete;
1093 1100
1094 if (pm_wakeup_pending()) {
1095 async_error = -EBUSY;
1096 goto Complete;
1097 }
1098
1099 if (dev->power.syscore || dev->power.direct_complete) 1101 if (dev->power.syscore || dev->power.direct_complete)
1100 goto Complete; 1102 goto Complete;
1101 1103
@@ -1225,7 +1227,7 @@ int dpm_suspend_noirq(pm_message_t state)
1225static int __device_suspend_late(struct device *dev, pm_message_t state, bool async) 1227static int __device_suspend_late(struct device *dev, pm_message_t state, bool async)
1226{ 1228{
1227 pm_callback_t callback = NULL; 1229 pm_callback_t callback = NULL;
1228 char *info = NULL; 1230 const char *info = NULL;
1229 int error = 0; 1231 int error = 0;
1230 1232
1231 TRACE_DEVICE(dev); 1233 TRACE_DEVICE(dev);
@@ -1384,7 +1386,7 @@ EXPORT_SYMBOL_GPL(dpm_suspend_end);
1384 */ 1386 */
1385static int legacy_suspend(struct device *dev, pm_message_t state, 1387static int legacy_suspend(struct device *dev, pm_message_t state,
1386 int (*cb)(struct device *dev, pm_message_t state), 1388 int (*cb)(struct device *dev, pm_message_t state),
1387 char *info) 1389 const char *info)
1388{ 1390{
1389 int error; 1391 int error;
1390 ktime_t calltime; 1392 ktime_t calltime;
@@ -1426,7 +1428,7 @@ static void dpm_clear_suppliers_direct_complete(struct device *dev)
1426static int __device_suspend(struct device *dev, pm_message_t state, bool async) 1428static int __device_suspend(struct device *dev, pm_message_t state, bool async)
1427{ 1429{
1428 pm_callback_t callback = NULL; 1430 pm_callback_t callback = NULL;
1429 char *info = NULL; 1431 const char *info = NULL;
1430 int error = 0; 1432 int error = 0;
1431 DECLARE_DPM_WATCHDOG_ON_STACK(wd); 1433 DECLARE_DPM_WATCHDOG_ON_STACK(wd);
1432 1434
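Two themes run through the main.c hunks: the info strings become const char * so string literals are passed const-correctly, and dpm_show_time() is now compiled out behind CONFIG_PM_DEBUG via an empty static inline stub, so call sites need no #ifdefs. A runnable model of that stubbing pattern, with DEBUG_TIMING standing in for the Kconfig option:

#include <stdio.h>

#define DEBUG_TIMING 1	/* stands in for CONFIG_PM_DEBUG */

#if DEBUG_TIMING
static void show_time(const char *info, unsigned long usecs)
{
	printf("PM: %s complete after %lu.%03lu msecs\n", info,
	       usecs / 1000, usecs % 1000);
}
#else
/* empty inline: call sites compile unchanged, the code vanishes */
static inline void show_time(const char *info, unsigned long usecs) {}
#endif

int main(void)
{
	show_time("noirq suspend", 10250);	/* one unconditional call site */
	return 0;
}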
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index dae61720b314..a8cc14fd8ae4 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -180,7 +180,7 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
180{ 180{
181 struct opp_table *opp_table; 181 struct opp_table *opp_table;
182 struct dev_pm_opp *opp; 182 struct dev_pm_opp *opp;
183 struct regulator *reg, **regulators; 183 struct regulator *reg;
184 unsigned long latency_ns = 0; 184 unsigned long latency_ns = 0;
185 int ret, i, count; 185 int ret, i, count;
186 struct { 186 struct {
@@ -198,15 +198,9 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
198 if (!count) 198 if (!count)
199 goto put_opp_table; 199 goto put_opp_table;
200 200
201 regulators = kmalloc_array(count, sizeof(*regulators), GFP_KERNEL);
202 if (!regulators)
203 goto put_opp_table;
204
205 uV = kmalloc_array(count, sizeof(*uV), GFP_KERNEL); 201 uV = kmalloc_array(count, sizeof(*uV), GFP_KERNEL);
206 if (!uV) 202 if (!uV)
207 goto free_regulators; 203 goto put_opp_table;
208
209 memcpy(regulators, opp_table->regulators, count * sizeof(*regulators));
210 204
211 mutex_lock(&opp_table->lock); 205 mutex_lock(&opp_table->lock);
212 206
@@ -232,15 +226,13 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
232 * isn't freed, while we are executing this routine. 226 * isn't freed, while we are executing this routine.
233 */ 227 */
234 for (i = 0; i < count; i++) { 228 for (i = 0; i < count; i++) {
235 reg = regulators[i]; 229 reg = opp_table->regulators[i];
236 ret = regulator_set_voltage_time(reg, uV[i].min, uV[i].max); 230 ret = regulator_set_voltage_time(reg, uV[i].min, uV[i].max);
237 if (ret > 0) 231 if (ret > 0)
238 latency_ns += ret * 1000; 232 latency_ns += ret * 1000;
239 } 233 }
240 234
241 kfree(uV); 235 kfree(uV);
242free_regulators:
243 kfree(regulators);
244put_opp_table: 236put_opp_table:
245 dev_pm_opp_put_opp_table(opp_table); 237 dev_pm_opp_put_opp_table(opp_table);
246 238
@@ -543,17 +535,18 @@ _generic_set_opp_clk_only(struct device *dev, struct clk *clk,
543 return ret; 535 return ret;
544} 536}
545 537
546static int _generic_set_opp(struct dev_pm_set_opp_data *data) 538static int _generic_set_opp_regulator(const struct opp_table *opp_table,
539 struct device *dev,
540 unsigned long old_freq,
541 unsigned long freq,
542 struct dev_pm_opp_supply *old_supply,
543 struct dev_pm_opp_supply *new_supply)
547{ 544{
548 struct dev_pm_opp_supply *old_supply = data->old_opp.supplies; 545 struct regulator *reg = opp_table->regulators[0];
549 struct dev_pm_opp_supply *new_supply = data->new_opp.supplies;
550 unsigned long old_freq = data->old_opp.rate, freq = data->new_opp.rate;
551 struct regulator *reg = data->regulators[0];
552 struct device *dev = data->dev;
553 int ret; 546 int ret;
554 547
555 /* This function only supports single regulator per device */ 548 /* This function only supports single regulator per device */
556 if (WARN_ON(data->regulator_count > 1)) { 549 if (WARN_ON(opp_table->regulator_count > 1)) {
557 dev_err(dev, "multiple regulators are not supported\n"); 550 dev_err(dev, "multiple regulators are not supported\n");
558 return -EINVAL; 551 return -EINVAL;
559 } 552 }
@@ -566,7 +559,7 @@ static int _generic_set_opp(struct dev_pm_set_opp_data *data)
566 } 559 }
567 560
568 /* Change frequency */ 561 /* Change frequency */
569 ret = _generic_set_opp_clk_only(dev, data->clk, old_freq, freq); 562 ret = _generic_set_opp_clk_only(dev, opp_table->clk, old_freq, freq);
570 if (ret) 563 if (ret)
571 goto restore_voltage; 564 goto restore_voltage;
572 565
@@ -580,12 +573,12 @@ static int _generic_set_opp(struct dev_pm_set_opp_data *data)
580 return 0; 573 return 0;
581 574
582restore_freq: 575restore_freq:
583 if (_generic_set_opp_clk_only(dev, data->clk, freq, old_freq)) 576 if (_generic_set_opp_clk_only(dev, opp_table->clk, freq, old_freq))
584 dev_err(dev, "%s: failed to restore old-freq (%lu Hz)\n", 577 dev_err(dev, "%s: failed to restore old-freq (%lu Hz)\n",
585 __func__, old_freq); 578 __func__, old_freq);
586restore_voltage: 579restore_voltage:
587 /* This shouldn't harm even if the voltages weren't updated earlier */ 580 /* This shouldn't harm even if the voltages weren't updated earlier */
588 if (old_supply->u_volt) 581 if (old_supply)
589 _set_opp_voltage(dev, reg, old_supply); 582 _set_opp_voltage(dev, reg, old_supply);
590 583
591 return ret; 584 return ret;
@@ -603,10 +596,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
603{ 596{
604 struct opp_table *opp_table; 597 struct opp_table *opp_table;
605 unsigned long freq, old_freq; 598 unsigned long freq, old_freq;
606 int (*set_opp)(struct dev_pm_set_opp_data *data);
607 struct dev_pm_opp *old_opp, *opp; 599 struct dev_pm_opp *old_opp, *opp;
608 struct regulator **regulators;
609 struct dev_pm_set_opp_data *data;
610 struct clk *clk; 600 struct clk *clk;
611 int ret, size; 601 int ret, size;
612 602
@@ -661,38 +651,35 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
661 dev_dbg(dev, "%s: switching OPP: %lu Hz --> %lu Hz\n", __func__, 651 dev_dbg(dev, "%s: switching OPP: %lu Hz --> %lu Hz\n", __func__,
662 old_freq, freq); 652 old_freq, freq);
663 653
664 regulators = opp_table->regulators;
665
666 /* Only frequency scaling */ 654 /* Only frequency scaling */
667 if (!regulators) { 655 if (!opp_table->regulators) {
668 ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq); 656 ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq);
669 goto put_opps; 657 } else if (!opp_table->set_opp) {
670 } 658 ret = _generic_set_opp_regulator(opp_table, dev, old_freq, freq,
659 IS_ERR(old_opp) ? NULL : old_opp->supplies,
660 opp->supplies);
661 } else {
662 struct dev_pm_set_opp_data *data;
671 663
672 if (opp_table->set_opp) 664 data = opp_table->set_opp_data;
673 set_opp = opp_table->set_opp; 665 data->regulators = opp_table->regulators;
674 else 666 data->regulator_count = opp_table->regulator_count;
675 set_opp = _generic_set_opp; 667 data->clk = clk;
676 668 data->dev = dev;
677 data = opp_table->set_opp_data;
678 data->regulators = regulators;
679 data->regulator_count = opp_table->regulator_count;
680 data->clk = clk;
681 data->dev = dev;
682
683 data->old_opp.rate = old_freq;
684 size = sizeof(*opp->supplies) * opp_table->regulator_count;
685 if (IS_ERR(old_opp))
686 memset(data->old_opp.supplies, 0, size);
687 else
688 memcpy(data->old_opp.supplies, old_opp->supplies, size);
689 669
690 data->new_opp.rate = freq; 670 data->old_opp.rate = old_freq;
691 memcpy(data->new_opp.supplies, opp->supplies, size); 671 size = sizeof(*opp->supplies) * opp_table->regulator_count;
672 if (IS_ERR(old_opp))
673 memset(data->old_opp.supplies, 0, size);
674 else
675 memcpy(data->old_opp.supplies, old_opp->supplies, size);
692 676
693 ret = set_opp(data); 677 data->new_opp.rate = freq;
678 memcpy(data->new_opp.supplies, opp->supplies, size);
679
680 ret = opp_table->set_opp(data);
681 }
694 682
695put_opps:
696 dev_pm_opp_put(opp); 683 dev_pm_opp_put(opp);
697put_old_opp: 684put_old_opp:
698 if (!IS_ERR(old_opp)) 685 if (!IS_ERR(old_opp))
@@ -1376,6 +1363,73 @@ void dev_pm_opp_put_regulators(struct opp_table *opp_table)
1376EXPORT_SYMBOL_GPL(dev_pm_opp_put_regulators); 1363EXPORT_SYMBOL_GPL(dev_pm_opp_put_regulators);
1377 1364
1378/** 1365/**
1366 * dev_pm_opp_set_clkname() - Set clk name for the device
1367 * @dev: Device for which clk name is being set.
1368 * @name: Clk name.
1369 *
1370 * In order to support OPP switching, OPP layer needs to get pointer to the
1371 * clock for the device. Simple cases work fine without using this routine (i.e.
1372 * by passing connection-id as NULL), but for a device with multiple clocks
1373 * available, the OPP core needs to know the exact name of the clk to use.
1374 *
1375 * This must be called before any OPPs are initialized for the device.
1376 */
1377struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char *name)
1378{
1379 struct opp_table *opp_table;
1380 int ret;
1381
1382 opp_table = dev_pm_opp_get_opp_table(dev);
1383 if (!opp_table)
1384 return ERR_PTR(-ENOMEM);
1385
1386 /* This should be called before OPPs are initialized */
1387 if (WARN_ON(!list_empty(&opp_table->opp_list))) {
1388 ret = -EBUSY;
1389 goto err;
1390 }
1391
1392 /* Already have default clk set, free it */
1393 if (!IS_ERR(opp_table->clk))
1394 clk_put(opp_table->clk);
1395
1396 /* Find clk for the device */
1397 opp_table->clk = clk_get(dev, name);
1398 if (IS_ERR(opp_table->clk)) {
1399 ret = PTR_ERR(opp_table->clk);
1400 if (ret != -EPROBE_DEFER) {
1401 dev_err(dev, "%s: Couldn't find clock: %d\n", __func__,
1402 ret);
1403 }
1404 goto err;
1405 }
1406
1407 return opp_table;
1408
1409err:
1410 dev_pm_opp_put_opp_table(opp_table);
1411
1412 return ERR_PTR(ret);
1413}
1414EXPORT_SYMBOL_GPL(dev_pm_opp_set_clkname);
1415
1416/**
1417 * dev_pm_opp_put_clkname() - Releases resources blocked for clk.
1418 * @opp_table: OPP table returned from dev_pm_opp_set_clkname().
1419 */
1420void dev_pm_opp_put_clkname(struct opp_table *opp_table)
1421{
1422 /* Make sure there are no concurrent readers while updating opp_table */
1423 WARN_ON(!list_empty(&opp_table->opp_list));
1424
1425 clk_put(opp_table->clk);
1426 opp_table->clk = ERR_PTR(-EINVAL);
1427
1428 dev_pm_opp_put_opp_table(opp_table);
1429}
1430EXPORT_SYMBOL_GPL(dev_pm_opp_put_clkname);
1431
1432/**
1379 * dev_pm_opp_register_set_opp_helper() - Register custom set OPP helper 1433 * dev_pm_opp_register_set_opp_helper() - Register custom set OPP helper
1380 * @dev: Device for which the helper is getting registered. 1434 * @dev: Device for which the helper is getting registered.
1381 * @set_opp: Custom set OPP helper. 1435 * @set_opp: Custom set OPP helper.
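Based only on the kerneldoc above, a hedged usage sketch for the new clk helpers: pick the named clock before any OPPs are added, and release it on teardown. The "cpu" connection id, the opp_clk_table variable and the probe/remove shape are illustrative; the dev_pm_opp_* calls come from the hunks above:

#include <linux/err.h>
#include <linux/pm_opp.h>

/* Kernel-context sketch, not standalone. */
static struct opp_table *opp_clk_table;

static int example_probe(struct device *dev)
{
	opp_clk_table = dev_pm_opp_set_clkname(dev, "cpu");
	if (IS_ERR(opp_clk_table))
		return PTR_ERR(opp_clk_table);

	/* ... register OPPs, then dev_pm_opp_set_rate() as needed ... */
	return 0;
}

static void example_remove(void)
{
	dev_pm_opp_put_clkname(opp_clk_table);
}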
diff --git a/drivers/base/power/opp/debugfs.c b/drivers/base/power/opp/debugfs.c
index 95f433db4ac7..81cf120fcf43 100644
--- a/drivers/base/power/opp/debugfs.c
+++ b/drivers/base/power/opp/debugfs.c
@@ -40,11 +40,10 @@ static bool opp_debug_create_supplies(struct dev_pm_opp *opp,
40 struct dentry *pdentry) 40 struct dentry *pdentry)
41{ 41{
42 struct dentry *d; 42 struct dentry *d;
43 int i = 0; 43 int i;
44 char *name; 44 char *name;
45 45
46 /* Always create at least supply-0 directory */ 46 for (i = 0; i < opp_table->regulator_count; i++) {
47 do {
48 name = kasprintf(GFP_KERNEL, "supply-%d", i); 47 name = kasprintf(GFP_KERNEL, "supply-%d", i);
49 48
50 /* Create per-opp directory */ 49 /* Create per-opp directory */
@@ -70,7 +69,7 @@ static bool opp_debug_create_supplies(struct dev_pm_opp *opp,
70 if (!debugfs_create_ulong("u_amp", S_IRUGO, d, 69 if (!debugfs_create_ulong("u_amp", S_IRUGO, d,
71 &opp->supplies[i].u_amp)) 70 &opp->supplies[i].u_amp))
72 return false; 71 return false;
73 } while (++i < opp_table->regulator_count); 72 }
74 73
75 return true; 74 return true;
76} 75}
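The debugfs change matters when regulator_count is 0: the old do/while still ran once and created a stray supply-0 directory, while the for loop creates exactly one directory per regulator. A runnable illustration of the two loop shapes:

#include <stdio.h>

int main(void)
{
	int regulator_count = 0;	/* the interesting case */
	int i;

	i = 0;
	do {	/* old shape: body runs at least once */
		printf("old: create supply-%d\n", i);
	} while (++i < regulator_count);

	for (i = 0; i < regulator_count; i++)	/* new shape: may run zero times */
		printf("new: create supply-%d\n", i);

	return 0;
}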
diff --git a/drivers/base/power/opp/of.c b/drivers/base/power/opp/of.c
index 779428676f63..57eec1ca0569 100644
--- a/drivers/base/power/opp/of.c
+++ b/drivers/base/power/opp/of.c
@@ -131,8 +131,14 @@ static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
131 prop = of_find_property(opp->np, name, NULL); 131 prop = of_find_property(opp->np, name, NULL);
132 132
133 /* Missing property isn't a problem, but an invalid entry is */ 133 /* Missing property isn't a problem, but an invalid entry is */
134 if (!prop) 134 if (!prop) {
135 return 0; 135 if (!opp_table->regulator_count)
136 return 0;
137
138 dev_err(dev, "%s: opp-microvolt missing although OPP managing regulators\n",
139 __func__);
140 return -EINVAL;
141 }
136 } 142 }
137 143
138 vcount = of_property_count_u32_elems(opp->np, name); 144 vcount = of_property_count_u32_elems(opp->np, name);
diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
index 33b4b902741a..185a52581cfa 100644
--- a/drivers/base/power/sysfs.c
+++ b/drivers/base/power/sysfs.c
@@ -607,7 +607,7 @@ static struct attribute *power_attrs[] = {
607#endif /* CONFIG_PM_ADVANCED_DEBUG */ 607#endif /* CONFIG_PM_ADVANCED_DEBUG */
608 NULL, 608 NULL,
609}; 609};
610static struct attribute_group pm_attr_group = { 610static const struct attribute_group pm_attr_group = {
611 .name = power_group_name, 611 .name = power_group_name,
612 .attrs = power_attrs, 612 .attrs = power_attrs,
613}; 613};
@@ -629,7 +629,7 @@ static struct attribute *wakeup_attrs[] = {
629#endif 629#endif
630 NULL, 630 NULL,
631}; 631};
632static struct attribute_group pm_wakeup_attr_group = { 632static const struct attribute_group pm_wakeup_attr_group = {
633 .name = power_group_name, 633 .name = power_group_name,
634 .attrs = wakeup_attrs, 634 .attrs = wakeup_attrs,
635}; 635};
@@ -644,7 +644,7 @@ static struct attribute *runtime_attrs[] = {
644 &dev_attr_autosuspend_delay_ms.attr, 644 &dev_attr_autosuspend_delay_ms.attr,
645 NULL, 645 NULL,
646}; 646};
647static struct attribute_group pm_runtime_attr_group = { 647static const struct attribute_group pm_runtime_attr_group = {
648 .name = power_group_name, 648 .name = power_group_name,
649 .attrs = runtime_attrs, 649 .attrs = runtime_attrs,
650}; 650};
@@ -653,7 +653,7 @@ static struct attribute *pm_qos_resume_latency_attrs[] = {
653 &dev_attr_pm_qos_resume_latency_us.attr, 653 &dev_attr_pm_qos_resume_latency_us.attr,
654 NULL, 654 NULL,
655}; 655};
656static struct attribute_group pm_qos_resume_latency_attr_group = { 656static const struct attribute_group pm_qos_resume_latency_attr_group = {
657 .name = power_group_name, 657 .name = power_group_name,
658 .attrs = pm_qos_resume_latency_attrs, 658 .attrs = pm_qos_resume_latency_attrs,
659}; 659};
@@ -662,7 +662,7 @@ static struct attribute *pm_qos_latency_tolerance_attrs[] = {
662 &dev_attr_pm_qos_latency_tolerance_us.attr, 662 &dev_attr_pm_qos_latency_tolerance_us.attr,
663 NULL, 663 NULL,
664}; 664};
665static struct attribute_group pm_qos_latency_tolerance_attr_group = { 665static const struct attribute_group pm_qos_latency_tolerance_attr_group = {
666 .name = power_group_name, 666 .name = power_group_name,
667 .attrs = pm_qos_latency_tolerance_attrs, 667 .attrs = pm_qos_latency_tolerance_attrs,
668}; 668};
@@ -672,7 +672,7 @@ static struct attribute *pm_qos_flags_attrs[] = {
672 &dev_attr_pm_qos_remote_wakeup.attr, 672 &dev_attr_pm_qos_remote_wakeup.attr,
673 NULL, 673 NULL,
674}; 674};
675static struct attribute_group pm_qos_flags_attr_group = { 675static const struct attribute_group pm_qos_flags_attr_group = {
676 .name = power_group_name, 676 .name = power_group_name,
677 .attrs = pm_qos_flags_attrs, 677 .attrs = pm_qos_flags_attrs,
678}; 678};
diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index c313b600d356..144e6d8fafc8 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -28,8 +28,8 @@ bool events_check_enabled __read_mostly;
28/* First wakeup IRQ seen by the kernel in the last cycle. */ 28/* First wakeup IRQ seen by the kernel in the last cycle. */
29unsigned int pm_wakeup_irq __read_mostly; 29unsigned int pm_wakeup_irq __read_mostly;
30 30
31/* If set and the system is suspending, terminate the suspend. */ 31/* If greater than 0 and the system is suspending, terminate the suspend. */
32static bool pm_abort_suspend __read_mostly; 32static atomic_t pm_abort_suspend __read_mostly;
33 33
34/* 34/*
35 * Combined counters of registered wakeup events and wakeup events in progress. 35 * Combined counters of registered wakeup events and wakeup events in progress.
@@ -60,6 +60,8 @@ static LIST_HEAD(wakeup_sources);
60 60
61static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue); 61static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
62 62
63DEFINE_STATIC_SRCU(wakeup_srcu);
64
63static struct wakeup_source deleted_ws = { 65static struct wakeup_source deleted_ws = {
64 .name = "deleted", 66 .name = "deleted",
65 .lock = __SPIN_LOCK_UNLOCKED(deleted_ws.lock), 67 .lock = __SPIN_LOCK_UNLOCKED(deleted_ws.lock),
@@ -198,7 +200,7 @@ void wakeup_source_remove(struct wakeup_source *ws)
198 spin_lock_irqsave(&events_lock, flags); 200 spin_lock_irqsave(&events_lock, flags);
199 list_del_rcu(&ws->entry); 201 list_del_rcu(&ws->entry);
200 spin_unlock_irqrestore(&events_lock, flags); 202 spin_unlock_irqrestore(&events_lock, flags);
201 synchronize_rcu(); 203 synchronize_srcu(&wakeup_srcu);
202} 204}
203EXPORT_SYMBOL_GPL(wakeup_source_remove); 205EXPORT_SYMBOL_GPL(wakeup_source_remove);
204 206
@@ -332,12 +334,12 @@ void device_wakeup_detach_irq(struct device *dev)
332void device_wakeup_arm_wake_irqs(void) 334void device_wakeup_arm_wake_irqs(void)
333{ 335{
334 struct wakeup_source *ws; 336 struct wakeup_source *ws;
337 int srcuidx;
335 338
336 rcu_read_lock(); 339 srcuidx = srcu_read_lock(&wakeup_srcu);
337 list_for_each_entry_rcu(ws, &wakeup_sources, entry) 340 list_for_each_entry_rcu(ws, &wakeup_sources, entry)
338 dev_pm_arm_wake_irq(ws->wakeirq); 341 dev_pm_arm_wake_irq(ws->wakeirq);
339 342 srcu_read_unlock(&wakeup_srcu, srcuidx);
340 rcu_read_unlock();
341} 343}
342 344
343/** 345/**
@@ -348,12 +350,12 @@ void device_wakeup_arm_wake_irqs(void)
348void device_wakeup_disarm_wake_irqs(void) 350void device_wakeup_disarm_wake_irqs(void)
349{ 351{
350 struct wakeup_source *ws; 352 struct wakeup_source *ws;
353 int srcuidx;
351 354
352 rcu_read_lock(); 355 srcuidx = srcu_read_lock(&wakeup_srcu);
353 list_for_each_entry_rcu(ws, &wakeup_sources, entry) 356 list_for_each_entry_rcu(ws, &wakeup_sources, entry)
354 dev_pm_disarm_wake_irq(ws->wakeirq); 357 dev_pm_disarm_wake_irq(ws->wakeirq);
355 358 srcu_read_unlock(&wakeup_srcu, srcuidx);
356 rcu_read_unlock();
357} 359}
358 360
359/** 361/**
@@ -804,10 +806,10 @@ EXPORT_SYMBOL_GPL(pm_wakeup_dev_event);
804void pm_print_active_wakeup_sources(void) 806void pm_print_active_wakeup_sources(void)
805{ 807{
806 struct wakeup_source *ws; 808 struct wakeup_source *ws;
807 int active = 0; 809 int srcuidx, active = 0;
808 struct wakeup_source *last_activity_ws = NULL; 810 struct wakeup_source *last_activity_ws = NULL;
809 811
810 rcu_read_lock(); 812 srcuidx = srcu_read_lock(&wakeup_srcu);
811 list_for_each_entry_rcu(ws, &wakeup_sources, entry) { 813 list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
812 if (ws->active) { 814 if (ws->active) {
813 pr_debug("active wakeup source: %s\n", ws->name); 815 pr_debug("active wakeup source: %s\n", ws->name);
@@ -823,7 +825,7 @@ void pm_print_active_wakeup_sources(void)
823 if (!active && last_activity_ws) 825 if (!active && last_activity_ws)
824 pr_debug("last active wakeup source: %s\n", 826 pr_debug("last active wakeup source: %s\n",
825 last_activity_ws->name); 827 last_activity_ws->name);
826 rcu_read_unlock(); 828 srcu_read_unlock(&wakeup_srcu, srcuidx);
827} 829}
828EXPORT_SYMBOL_GPL(pm_print_active_wakeup_sources); 830EXPORT_SYMBOL_GPL(pm_print_active_wakeup_sources);
829 831
@@ -855,20 +857,26 @@ bool pm_wakeup_pending(void)
855 pm_print_active_wakeup_sources(); 857 pm_print_active_wakeup_sources();
856 } 858 }
857 859
858 return ret || pm_abort_suspend; 860 return ret || atomic_read(&pm_abort_suspend) > 0;
859} 861}
860 862
861void pm_system_wakeup(void) 863void pm_system_wakeup(void)
862{ 864{
863 pm_abort_suspend = true; 865 atomic_inc(&pm_abort_suspend);
864 freeze_wake(); 866 freeze_wake();
865} 867}
866EXPORT_SYMBOL_GPL(pm_system_wakeup); 868EXPORT_SYMBOL_GPL(pm_system_wakeup);
867 869
868void pm_wakeup_clear(void) 870void pm_system_cancel_wakeup(void)
871{
872 atomic_dec(&pm_abort_suspend);
873}
874
875void pm_wakeup_clear(bool reset)
869{ 876{
870 pm_abort_suspend = false;
871 pm_wakeup_irq = 0; 877 pm_wakeup_irq = 0;
878 if (reset)
879 atomic_set(&pm_abort_suspend, 0);
872} 880}
873 881
874void pm_system_irq_wakeup(unsigned int irq_number) 882void pm_system_irq_wakeup(unsigned int irq_number)
@@ -950,8 +958,9 @@ void pm_wakep_autosleep_enabled(bool set)
950{ 958{
951 struct wakeup_source *ws; 959 struct wakeup_source *ws;
952 ktime_t now = ktime_get(); 960 ktime_t now = ktime_get();
961 int srcuidx;
953 962
954 rcu_read_lock(); 963 srcuidx = srcu_read_lock(&wakeup_srcu);
955 list_for_each_entry_rcu(ws, &wakeup_sources, entry) { 964 list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
956 spin_lock_irq(&ws->lock); 965 spin_lock_irq(&ws->lock);
957 if (ws->autosleep_enabled != set) { 966 if (ws->autosleep_enabled != set) {
@@ -965,7 +974,7 @@ void pm_wakep_autosleep_enabled(bool set)
965 } 974 }
966 spin_unlock_irq(&ws->lock); 975 spin_unlock_irq(&ws->lock);
967 } 976 }
968 rcu_read_unlock(); 977 srcu_read_unlock(&wakeup_srcu, srcuidx);
969} 978}
970#endif /* CONFIG_PM_AUTOSLEEP */ 979#endif /* CONFIG_PM_AUTOSLEEP */
971 980
@@ -1026,15 +1035,16 @@ static int print_wakeup_source_stats(struct seq_file *m,
1026static int wakeup_sources_stats_show(struct seq_file *m, void *unused) 1035static int wakeup_sources_stats_show(struct seq_file *m, void *unused)
1027{ 1036{
1028 struct wakeup_source *ws; 1037 struct wakeup_source *ws;
1038 int srcuidx;
1029 1039
1030 seq_puts(m, "name\t\tactive_count\tevent_count\twakeup_count\t" 1040 seq_puts(m, "name\t\tactive_count\tevent_count\twakeup_count\t"
1031 "expire_count\tactive_since\ttotal_time\tmax_time\t" 1041 "expire_count\tactive_since\ttotal_time\tmax_time\t"
1032 "last_change\tprevent_suspend_time\n"); 1042 "last_change\tprevent_suspend_time\n");
1033 1043
1034 rcu_read_lock(); 1044 srcuidx = srcu_read_lock(&wakeup_srcu);
1035 list_for_each_entry_rcu(ws, &wakeup_sources, entry) 1045 list_for_each_entry_rcu(ws, &wakeup_sources, entry)
1036 print_wakeup_source_stats(m, ws); 1046 print_wakeup_source_stats(m, ws);
1037 rcu_read_unlock(); 1047 srcu_read_unlock(&wakeup_srcu, srcuidx);
1038 1048
1039 print_wakeup_source_stats(m, &deleted_ws); 1049 print_wakeup_source_stats(m, &deleted_ws);
1040 1050
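Every read-side walk in wakeup.c follows the same conversion: srcu_read_lock() returns an index that must be handed back to srcu_read_unlock(), and, unlike plain RCU, the read section may sleep, which is what the wake-IRQ arming paths need. A condensed kernel-context sketch using only APIs that appear in the hunks above; example_srcu and the list are illustrative:

#include <linux/srcu.h>
#include <linux/rculist.h>

DEFINE_STATIC_SRCU(example_srcu);
static LIST_HEAD(sources);

struct source {
	struct list_head entry;
};

static void walk_sources(void)
{
	struct source *s;
	int srcuidx;

	srcuidx = srcu_read_lock(&example_srcu);
	list_for_each_entry_rcu(s, &sources, entry) {
		/* per-entry work; sleeping is allowed under SRCU */
	}
	srcu_read_unlock(&example_srcu, srcuidx);
}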
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index e82bb3c30b92..10be285c9055 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -144,10 +144,23 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
144 144
145 cppc_dmi_max_khz = cppc_get_dmi_max_khz(); 145 cppc_dmi_max_khz = cppc_get_dmi_max_khz();
146 146
147 policy->min = cpu->perf_caps.lowest_perf * cppc_dmi_max_khz / cpu->perf_caps.highest_perf; 147 /*
148 * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
149 * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
150 */
151 policy->min = cpu->perf_caps.lowest_nonlinear_perf * cppc_dmi_max_khz /
152 cpu->perf_caps.highest_perf;
148 policy->max = cppc_dmi_max_khz; 153 policy->max = cppc_dmi_max_khz;
149 policy->cpuinfo.min_freq = policy->min; 154
150 policy->cpuinfo.max_freq = policy->max; 155 /*
156 * Set cpuinfo.min_freq to Lowest to make the full range of performance
157 * available if userspace wants to use any perf between lowest & lowest
158 * nonlinear perf
159 */
160 policy->cpuinfo.min_freq = cpu->perf_caps.lowest_perf * cppc_dmi_max_khz /
161 cpu->perf_caps.highest_perf;
162 policy->cpuinfo.max_freq = cppc_dmi_max_khz;
163
151 policy->cpuinfo.transition_latency = cppc_get_transition_latency(cpu_num); 164 policy->cpuinfo.transition_latency = cppc_get_transition_latency(cpu_num);
152 policy->shared_type = cpu->shared_type; 165 policy->shared_type = cpu->shared_type;
153 166
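The scaling in the cppc hunk is a plain proportion against highest_perf. Runnable arithmetic with made-up capability values, showing how policy->min (derived from lowest nonlinear perf) and cpuinfo.min_freq (derived from lowest perf) now come apart:

#include <stdio.h>

int main(void)
{
	unsigned int highest_perf = 300;		/* made-up CPPC caps */
	unsigned int lowest_perf = 30;
	unsigned int lowest_nonlinear_perf = 120;
	unsigned int dmi_max_khz = 3000000;		/* 3 GHz from DMI */

	/* policy->min: stay in the efficient range by default */
	printf("policy->min      = %u kHz\n",
	       lowest_nonlinear_perf * dmi_max_khz / highest_perf);
	/* cpuinfo.min_freq: full range stays reachable from userspace */
	printf("cpuinfo.min_freq = %u kHz\n",
	       lowest_perf * dmi_max_khz / highest_perf);
	return 0;
}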
diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c
index 921b4a6c3d16..1c262923fe58 100644
--- a/drivers/cpufreq/cpufreq-dt-platdev.c
+++ b/drivers/cpufreq/cpufreq-dt-platdev.c
@@ -31,6 +31,7 @@ static const struct of_device_id machines[] __initconst = {
31 { .compatible = "arm,integrator-ap", }, 31 { .compatible = "arm,integrator-ap", },
32 { .compatible = "arm,integrator-cp", }, 32 { .compatible = "arm,integrator-cp", },
33 33
34 { .compatible = "hisilicon,hi3660", },
34 { .compatible = "hisilicon,hi6220", }, 35 { .compatible = "hisilicon,hi6220", },
35 36
36 { .compatible = "fsl,imx27", }, 37 { .compatible = "fsl,imx27", },
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 29c5b0cbad96..9bf97a366029 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -632,11 +632,21 @@ show_one(cpuinfo_transition_latency, cpuinfo.transition_latency);
632show_one(scaling_min_freq, min); 632show_one(scaling_min_freq, min);
633show_one(scaling_max_freq, max); 633show_one(scaling_max_freq, max);
634 634
635__weak unsigned int arch_freq_get_on_cpu(int cpu)
636{
637 return 0;
638}
639
635static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf) 640static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
636{ 641{
637 ssize_t ret; 642 ssize_t ret;
643 unsigned int freq;
638 644
639 if (cpufreq_driver && cpufreq_driver->setpolicy && cpufreq_driver->get) 645 freq = arch_freq_get_on_cpu(policy->cpu);
646 if (freq)
647 ret = sprintf(buf, "%u\n", freq);
648 else if (cpufreq_driver && cpufreq_driver->setpolicy &&
649 cpufreq_driver->get)
640 ret = sprintf(buf, "%u\n", cpufreq_driver->get(policy->cpu)); 650 ret = sprintf(buf, "%u\n", cpufreq_driver->get(policy->cpu));
641 else 651 else
642 ret = sprintf(buf, "%u\n", policy->cur); 652 ret = sprintf(buf, "%u\n", policy->cur);
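arch_freq_get_on_cpu() is declared __weak with a 0 return, so an architecture (the x86 APERF/MPERF code in this series) can supply a strong definition that wins at link time while everyone else falls through to the driver. A self-contained demo of the weak-symbol pattern, using GCC/clang attribute syntax:

#include <stdio.h>

__attribute__((weak)) unsigned int arch_freq_get_on_cpu(int cpu)
{
	(void)cpu;
	return 0;	/* no arch estimate available */
}

int main(void)
{
	unsigned int freq = arch_freq_get_on_cpu(0);

	if (freq)
		printf("%u kHz (arch-provided)\n", freq);
	else
		printf("no arch estimate; fall back to driver/policy value\n");
	return 0;
}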
diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
index 9180d34cc9fc..b6b369c22272 100644
--- a/drivers/cpufreq/exynos5440-cpufreq.c
+++ b/drivers/cpufreq/exynos5440-cpufreq.c
@@ -173,12 +173,12 @@ static void exynos_enable_dvfs(unsigned int cur_frequency)
173 /* Enable PSTATE Change Event */ 173 /* Enable PSTATE Change Event */
174 tmp = __raw_readl(dvfs_info->base + XMU_PMUEVTEN); 174 tmp = __raw_readl(dvfs_info->base + XMU_PMUEVTEN);
175 tmp |= (1 << PSTATE_CHANGED_EVTEN_SHIFT); 175 tmp |= (1 << PSTATE_CHANGED_EVTEN_SHIFT);
176 __raw_writel(tmp, dvfs_info->base + XMU_PMUEVTEN); 176 __raw_writel(tmp, dvfs_info->base + XMU_PMUEVTEN);
177 177
178 /* Enable PSTATE Change IRQ */ 178 /* Enable PSTATE Change IRQ */
179 tmp = __raw_readl(dvfs_info->base + XMU_PMUIRQEN); 179 tmp = __raw_readl(dvfs_info->base + XMU_PMUIRQEN);
180 tmp |= (1 << PSTATE_CHANGED_IRQEN_SHIFT); 180 tmp |= (1 << PSTATE_CHANGED_IRQEN_SHIFT);
181 __raw_writel(tmp, dvfs_info->base + XMU_PMUIRQEN); 181 __raw_writel(tmp, dvfs_info->base + XMU_PMUIRQEN);
182 182
183 /* Set initial performance index */ 183 /* Set initial performance index */
184 cpufreq_for_each_entry(pos, freq_table) 184 cpufreq_for_each_entry(pos, freq_table)
@@ -330,7 +330,7 @@ static int exynos_cpufreq_probe(struct platform_device *pdev)
330 struct resource res; 330 struct resource res;
331 unsigned int cur_frequency; 331 unsigned int cur_frequency;
332 332
333 np = pdev->dev.of_node; 333 np = pdev->dev.of_node;
334 if (!np) 334 if (!np)
335 return -ENODEV; 335 return -ENODEV;
336 336
diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index 9c13f097fd8c..b6edd3ccaa55 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -101,7 +101,8 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
101 * - Reprogram pll1_sys_clk and reparent pll1_sw_clk back to it 101 * - Reprogram pll1_sys_clk and reparent pll1_sw_clk back to it
102 * - Disable pll2_pfd2_396m_clk 102 * - Disable pll2_pfd2_396m_clk
103 */ 103 */
104 if (of_machine_is_compatible("fsl,imx6ul")) { 104 if (of_machine_is_compatible("fsl,imx6ul") ||
105 of_machine_is_compatible("fsl,imx6ull")) {
105 /* 106 /*
106 * When changing pll1_sw_clk's parent to pll1_sys_clk, 107 * When changing pll1_sw_clk's parent to pll1_sys_clk,
107 * CPU may run at higher than 528MHz, this will lead to 108 * CPU may run at higher than 528MHz, this will lead to
@@ -215,7 +216,8 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
215 goto put_clk; 216 goto put_clk;
216 } 217 }
217 218
218 if (of_machine_is_compatible("fsl,imx6ul")) { 219 if (of_machine_is_compatible("fsl,imx6ul") ||
220 of_machine_is_compatible("fsl,imx6ull")) {
219 pll2_bus_clk = clk_get(cpu_dev, "pll2_bus"); 221 pll2_bus_clk = clk_get(cpu_dev, "pll2_bus");
220 secondary_sel_clk = clk_get(cpu_dev, "secondary_sel"); 222 secondary_sel_clk = clk_get(cpu_dev, "secondary_sel");
221 if (IS_ERR(pll2_bus_clk) || IS_ERR(secondary_sel_clk)) { 223 if (IS_ERR(pll2_bus_clk) || IS_ERR(secondary_sel_clk)) {
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index eb1158532de3..48a98f11a84e 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -231,10 +231,8 @@ struct global_params {
231 * @prev_cummulative_iowait: IO Wait time difference from last and 231 * @prev_cummulative_iowait: IO Wait time difference from last and
232 * current sample 232 * current sample
233 * @sample: Storage for storing last Sample data 233 * @sample: Storage for storing last Sample data
234 * @min_perf: Minimum capacity limit as a fraction of the maximum 234 * @min_perf_ratio: Minimum capacity in terms of PERF or HWP ratios
235 * turbo P-state capacity. 235 * @max_perf_ratio: Maximum capacity in terms of PERF or HWP ratios
236 * @max_perf: Maximum capacity limit as a fraction of the maximum
237 * turbo P-state capacity.
238 * @acpi_perf_data: Stores ACPI perf information read from _PSS 236 * @acpi_perf_data: Stores ACPI perf information read from _PSS
239 * @valid_pss_table: Set to true for valid ACPI _PSS entries found 237 * @valid_pss_table: Set to true for valid ACPI _PSS entries found
240 * @epp_powersave: Last saved HWP energy performance preference 238 * @epp_powersave: Last saved HWP energy performance preference
@@ -266,8 +264,8 @@ struct cpudata {
266 u64 prev_tsc; 264 u64 prev_tsc;
267 u64 prev_cummulative_iowait; 265 u64 prev_cummulative_iowait;
268 struct sample sample; 266 struct sample sample;
269 int32_t min_perf; 267 int32_t min_perf_ratio;
270 int32_t max_perf; 268 int32_t max_perf_ratio;
271#ifdef CONFIG_ACPI 269#ifdef CONFIG_ACPI
272 struct acpi_processor_performance acpi_perf_data; 270 struct acpi_processor_performance acpi_perf_data;
273 bool valid_pss_table; 271 bool valid_pss_table;
@@ -653,6 +651,12 @@ static const char * const energy_perf_strings[] = {
653 "power", 651 "power",
654 NULL 652 NULL
655}; 653};
654static const unsigned int epp_values[] = {
655 HWP_EPP_PERFORMANCE,
656 HWP_EPP_BALANCE_PERFORMANCE,
657 HWP_EPP_BALANCE_POWERSAVE,
658 HWP_EPP_POWERSAVE
659};
656 660
657static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data) 661static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data)
658{ 662{
@@ -664,17 +668,14 @@ static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data)
664 return epp; 668 return epp;
665 669
666 if (static_cpu_has(X86_FEATURE_HWP_EPP)) { 670 if (static_cpu_has(X86_FEATURE_HWP_EPP)) {
667 /* 671 if (epp == HWP_EPP_PERFORMANCE)
668 * Range: 672 return 1;
669 * 0x00-0x3F : Performance 673 if (epp <= HWP_EPP_BALANCE_PERFORMANCE)
670 * 0x40-0x7F : Balance performance 674 return 2;
671 * 0x80-0xBF : Balance power 675 if (epp <= HWP_EPP_BALANCE_POWERSAVE)
672 * 0xC0-0xFF : Power 676 return 3;
673 * The EPP is a 8 bit value, but our ranges restrict the 677 else
674 * value which can be set. Here only using top two bits 678 return 4;
675 * effectively.
676 */
677 index = (epp >> 6) + 1;
678 } else if (static_cpu_has(X86_FEATURE_EPB)) { 679 } else if (static_cpu_has(X86_FEATURE_EPB)) {
679 /* 680 /*
680 * Range: 681 * Range:
@@ -712,15 +713,8 @@ static int intel_pstate_set_energy_pref_index(struct cpudata *cpu_data,
712 713
713 value &= ~GENMASK_ULL(31, 24); 714 value &= ~GENMASK_ULL(31, 24);
714 715
715 /*
716 * If epp is not default, convert from index into
717 * energy_perf_strings to epp value, by shifting 6
718 * bits left to use only top two bits in epp.
719 * The resultant epp need to shifted by 24 bits to
720 * epp position in MSR_HWP_REQUEST.
721 */
722 if (epp == -EINVAL) 716 if (epp == -EINVAL)
723 epp = (pref_index - 1) << 6; 717 epp = epp_values[pref_index - 1];
724 718
725 value |= (u64)epp << 24; 719 value |= (u64)epp << 24;
726 ret = wrmsrl_on_cpu(cpu_data->cpu, MSR_HWP_REQUEST, value); 720 ret = wrmsrl_on_cpu(cpu_data->cpu, MSR_HWP_REQUEST, value);
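The EPP rework replaces the "top two bits" heuristic with explicit threshold comparisons in one direction and the epp_values[] table in the other. A runnable round-trip check, assuming the kernel's HWP_EPP_* constants are 0x00, 0x80, 0xC0 and 0xFF:

#include <stdio.h>

#define EPP_PERFORMANCE		0x00	/* assumed HWP_EPP_* values */
#define EPP_BALANCE_PERFORMANCE	0x80
#define EPP_BALANCE_POWERSAVE	0xC0
#define EPP_POWERSAVE		0xFF

static const unsigned int epp_values[] = {
	EPP_PERFORMANCE, EPP_BALANCE_PERFORMANCE,
	EPP_BALANCE_POWERSAVE, EPP_POWERSAVE,
};

static int epp_to_index(unsigned int epp)
{
	if (epp == EPP_PERFORMANCE)
		return 1;
	if (epp <= EPP_BALANCE_PERFORMANCE)
		return 2;
	if (epp <= EPP_BALANCE_POWERSAVE)
		return 3;
	return 4;
}

int main(void)
{
	int i;

	for (i = 1; i <= 4; i++)	/* table and comparisons agree */
		printf("index %d -> epp 0x%02x -> index %d\n",
		       i, epp_values[i - 1], epp_to_index(epp_values[i - 1]));
	return 0;
}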
@@ -794,25 +788,32 @@ static struct freq_attr *hwp_cpufreq_attrs[] = {
794 NULL, 788 NULL,
795}; 789};
796 790
797static void intel_pstate_hwp_set(unsigned int cpu) 791static void intel_pstate_get_hwp_max(unsigned int cpu, int *phy_max,
792 int *current_max)
798{ 793{
799 struct cpudata *cpu_data = all_cpu_data[cpu]; 794 u64 cap;
800 int min, hw_min, max, hw_max;
801 u64 value, cap;
802 s16 epp;
803 795
804 rdmsrl_on_cpu(cpu, MSR_HWP_CAPABILITIES, &cap); 796 rdmsrl_on_cpu(cpu, MSR_HWP_CAPABILITIES, &cap);
805 hw_min = HWP_LOWEST_PERF(cap);
806 if (global.no_turbo) 797 if (global.no_turbo)
807 hw_max = HWP_GUARANTEED_PERF(cap); 798 *current_max = HWP_GUARANTEED_PERF(cap);
808 else 799 else
809 hw_max = HWP_HIGHEST_PERF(cap); 800 *current_max = HWP_HIGHEST_PERF(cap);
801
802 *phy_max = HWP_HIGHEST_PERF(cap);
803}
804
805static void intel_pstate_hwp_set(unsigned int cpu)
806{
807 struct cpudata *cpu_data = all_cpu_data[cpu];
808 int max, min;
809 u64 value;
810 s16 epp;
811
812 max = cpu_data->max_perf_ratio;
813 min = cpu_data->min_perf_ratio;
810 814
811 max = fp_ext_toint(hw_max * cpu_data->max_perf);
812 if (cpu_data->policy == CPUFREQ_POLICY_PERFORMANCE) 815 if (cpu_data->policy == CPUFREQ_POLICY_PERFORMANCE)
813 min = max; 816 min = max;
814 else
815 min = fp_ext_toint(hw_max * cpu_data->min_perf);
816 817
817 rdmsrl_on_cpu(cpu, MSR_HWP_REQUEST, &value); 818 rdmsrl_on_cpu(cpu, MSR_HWP_REQUEST, &value);
818 819
@@ -1528,8 +1529,7 @@ static void intel_pstate_max_within_limits(struct cpudata *cpu)
1528 1529
1529 update_turbo_state(); 1530 update_turbo_state();
1530 pstate = intel_pstate_get_base_pstate(cpu); 1531 pstate = intel_pstate_get_base_pstate(cpu);
1531 pstate = max(cpu->pstate.min_pstate, 1532 pstate = max(cpu->pstate.min_pstate, cpu->max_perf_ratio);
1532 fp_ext_toint(pstate * cpu->max_perf));
1533 intel_pstate_set_pstate(cpu, pstate); 1533 intel_pstate_set_pstate(cpu, pstate);
1534} 1534}
1535 1535
@@ -1616,9 +1616,6 @@ static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu)
1616 int32_t busy_frac, boost; 1616 int32_t busy_frac, boost;
1617 int target, avg_pstate; 1617 int target, avg_pstate;
1618 1618
1619 if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE)
1620 return cpu->pstate.turbo_pstate;
1621
1622 busy_frac = div_fp(sample->mperf, sample->tsc); 1619 busy_frac = div_fp(sample->mperf, sample->tsc);
1623 1620
1624 boost = cpu->iowait_boost; 1621 boost = cpu->iowait_boost;
@@ -1655,9 +1652,6 @@ static inline int32_t get_target_pstate_use_performance(struct cpudata *cpu)
1655 int32_t perf_scaled, max_pstate, current_pstate, sample_ratio; 1652 int32_t perf_scaled, max_pstate, current_pstate, sample_ratio;
1656 u64 duration_ns; 1653 u64 duration_ns;
1657 1654
1658 if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE)
1659 return cpu->pstate.turbo_pstate;
1660
1661 /* 1655 /*
1662 * perf_scaled is the ratio of the average P-state during the last 1656 * perf_scaled is the ratio of the average P-state during the last
1663 * sampling period to the P-state requested last time (in percent). 1657 * sampling period to the P-state requested last time (in percent).
@@ -1695,9 +1689,8 @@ static int intel_pstate_prepare_request(struct cpudata *cpu, int pstate)
1695 int max_pstate = intel_pstate_get_base_pstate(cpu); 1689 int max_pstate = intel_pstate_get_base_pstate(cpu);
1696 int min_pstate; 1690 int min_pstate;
1697 1691
1698 min_pstate = max(cpu->pstate.min_pstate, 1692 min_pstate = max(cpu->pstate.min_pstate, cpu->min_perf_ratio);
1699 fp_ext_toint(max_pstate * cpu->min_perf)); 1693 max_pstate = max(min_pstate, cpu->max_perf_ratio);
1700 max_pstate = max(min_pstate, fp_ext_toint(max_pstate * cpu->max_perf));
1701 return clamp_t(int, pstate, min_pstate, max_pstate); 1694 return clamp_t(int, pstate, min_pstate, max_pstate);
1702} 1695}
1703 1696
@@ -1733,16 +1726,6 @@ static void intel_pstate_adjust_pstate(struct cpudata *cpu, int target_pstate)
1733 fp_toint(cpu->iowait_boost * 100)); 1726 fp_toint(cpu->iowait_boost * 100));
1734} 1727}
1735 1728
1736static void intel_pstate_update_util_hwp(struct update_util_data *data,
1737 u64 time, unsigned int flags)
1738{
1739 struct cpudata *cpu = container_of(data, struct cpudata, update_util);
1740 u64 delta_ns = time - cpu->sample.time;
1741
1742 if ((s64)delta_ns >= INTEL_PSTATE_HWP_SAMPLING_INTERVAL)
1743 intel_pstate_sample(cpu, time);
1744}
1745
1746static void intel_pstate_update_util_pid(struct update_util_data *data, 1729static void intel_pstate_update_util_pid(struct update_util_data *data,
1747 u64 time, unsigned int flags) 1730 u64 time, unsigned int flags)
1748{ 1731{
@@ -1934,6 +1917,9 @@ static void intel_pstate_set_update_util_hook(unsigned int cpu_num)
1934{ 1917{
1935 struct cpudata *cpu = all_cpu_data[cpu_num]; 1918 struct cpudata *cpu = all_cpu_data[cpu_num];
1936 1919
1920 if (hwp_active)
1921 return;
1922
1937 if (cpu->update_util_set) 1923 if (cpu->update_util_set)
1938 return; 1924 return;
1939 1925
@@ -1967,52 +1953,61 @@ static void intel_pstate_update_perf_limits(struct cpufreq_policy *policy,
1967{ 1953{
1968 int max_freq = intel_pstate_get_max_freq(cpu); 1954 int max_freq = intel_pstate_get_max_freq(cpu);
1969 int32_t max_policy_perf, min_policy_perf; 1955 int32_t max_policy_perf, min_policy_perf;
1956 int max_state, turbo_max;
1970 1957
1971 max_policy_perf = div_ext_fp(policy->max, max_freq); 1958 /*
1972 max_policy_perf = clamp_t(int32_t, max_policy_perf, 0, int_ext_tofp(1)); 1959 * HWP needs some special consideration, because on BDX the
1960 * HWP_REQUEST uses abstract value to represent performance
1961 * rather than pure ratios.
1962 */
1963 if (hwp_active) {
1964 intel_pstate_get_hwp_max(cpu->cpu, &turbo_max, &max_state);
1965 } else {
1966 max_state = intel_pstate_get_base_pstate(cpu);
1967 turbo_max = cpu->pstate.turbo_pstate;
1968 }
1969
1970 max_policy_perf = max_state * policy->max / max_freq;
1973 if (policy->max == policy->min) { 1971 if (policy->max == policy->min) {
1974 min_policy_perf = max_policy_perf; 1972 min_policy_perf = max_policy_perf;
1975 } else { 1973 } else {
1976 min_policy_perf = div_ext_fp(policy->min, max_freq); 1974 min_policy_perf = max_state * policy->min / max_freq;
1977 min_policy_perf = clamp_t(int32_t, min_policy_perf, 1975 min_policy_perf = clamp_t(int32_t, min_policy_perf,
1978 0, max_policy_perf); 1976 0, max_policy_perf);
1979 } 1977 }
1980 1978
1979 pr_debug("cpu:%d max_state %d min_policy_perf:%d max_policy_perf:%d\n",
1980 policy->cpu, max_state,
1981 min_policy_perf, max_policy_perf);
1982
1981 /* Normalize user input to [min_perf, max_perf] */ 1983 /* Normalize user input to [min_perf, max_perf] */
1982 if (per_cpu_limits) { 1984 if (per_cpu_limits) {
1983 cpu->min_perf = min_policy_perf; 1985 cpu->min_perf_ratio = min_policy_perf;
1984 cpu->max_perf = max_policy_perf; 1986 cpu->max_perf_ratio = max_policy_perf;
1985 } else { 1987 } else {
1986 int32_t global_min, global_max; 1988 int32_t global_min, global_max;
1987 1989
1988 /* Global limits are in percent of the maximum turbo P-state. */ 1990 /* Global limits are in percent of the maximum turbo P-state. */
1989 global_max = percent_ext_fp(global.max_perf_pct); 1991 global_max = DIV_ROUND_UP(turbo_max * global.max_perf_pct, 100);
1990 global_min = percent_ext_fp(global.min_perf_pct); 1992 global_min = DIV_ROUND_UP(turbo_max * global.min_perf_pct, 100);
1991 if (max_freq != cpu->pstate.turbo_freq) {
1992 int32_t turbo_factor;
1993
1994 turbo_factor = div_ext_fp(cpu->pstate.turbo_pstate,
1995 cpu->pstate.max_pstate);
1996 global_min = mul_ext_fp(global_min, turbo_factor);
1997 global_max = mul_ext_fp(global_max, turbo_factor);
1998 }
1999 global_min = clamp_t(int32_t, global_min, 0, global_max); 1993 global_min = clamp_t(int32_t, global_min, 0, global_max);
2000 1994
2001 cpu->min_perf = max(min_policy_perf, global_min); 1995 pr_debug("cpu:%d global_min:%d global_max:%d\n", policy->cpu,
2002 cpu->min_perf = min(cpu->min_perf, max_policy_perf); 1996 global_min, global_max);
2003 cpu->max_perf = min(max_policy_perf, global_max);
2004 cpu->max_perf = max(min_policy_perf, cpu->max_perf);
2005 1997
2006 /* Make sure min_perf <= max_perf */ 1998 cpu->min_perf_ratio = max(min_policy_perf, global_min);
2007 cpu->min_perf = min(cpu->min_perf, cpu->max_perf); 1999 cpu->min_perf_ratio = min(cpu->min_perf_ratio, max_policy_perf);
2008 } 2000 cpu->max_perf_ratio = min(max_policy_perf, global_max);
2001 cpu->max_perf_ratio = max(min_policy_perf, cpu->max_perf_ratio);
2009 2002
2010 cpu->max_perf = round_up(cpu->max_perf, EXT_FRAC_BITS); 2003 /* Make sure min_perf <= max_perf */
2011 cpu->min_perf = round_up(cpu->min_perf, EXT_FRAC_BITS); 2004 cpu->min_perf_ratio = min(cpu->min_perf_ratio,
2005 cpu->max_perf_ratio);
2012 2006
2013 pr_debug("cpu:%d max_perf_pct:%d min_perf_pct:%d\n", policy->cpu, 2007 }
2014 fp_ext_toint(cpu->max_perf * 100), 2008 pr_debug("cpu:%d max_perf_ratio:%d min_perf_ratio:%d\n", policy->cpu,
2015 fp_ext_toint(cpu->min_perf * 100)); 2009 cpu->max_perf_ratio,
2010 cpu->min_perf_ratio);
2016} 2011}
2017 2012
2018static int intel_pstate_set_policy(struct cpufreq_policy *policy) 2013static int intel_pstate_set_policy(struct cpufreq_policy *policy)
@@ -2039,10 +2034,10 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
2039 */ 2034 */
2040 intel_pstate_clear_update_util_hook(policy->cpu); 2035 intel_pstate_clear_update_util_hook(policy->cpu);
2041 intel_pstate_max_within_limits(cpu); 2036 intel_pstate_max_within_limits(cpu);
2037 } else {
2038 intel_pstate_set_update_util_hook(policy->cpu);
2042 } 2039 }
2043 2040
2044 intel_pstate_set_update_util_hook(policy->cpu);
2045
2046 if (hwp_active) 2041 if (hwp_active)
2047 intel_pstate_hwp_set(policy->cpu); 2042 intel_pstate_hwp_set(policy->cpu);
2048 2043
@@ -2115,8 +2110,8 @@ static int __intel_pstate_cpu_init(struct cpufreq_policy *policy)
2115 2110
2116 cpu = all_cpu_data[policy->cpu]; 2111 cpu = all_cpu_data[policy->cpu];
2117 2112
2118 cpu->max_perf = int_ext_tofp(1); 2113 cpu->max_perf_ratio = 0xFF;
2119 cpu->min_perf = 0; 2114 cpu->min_perf_ratio = 0;
2120 2115
2121 policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling; 2116 policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling;
2122 policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling; 2117 policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling;
@@ -2558,7 +2553,6 @@ static int __init intel_pstate_init(void)
2558 } else { 2553 } else {
2559 hwp_active++; 2554 hwp_active++;
2560 intel_pstate.attr = hwp_cpufreq_attrs; 2555 intel_pstate.attr = hwp_cpufreq_attrs;
2561 pstate_funcs.update_util = intel_pstate_update_util_hwp;
2562 goto hwp_cpu_matched; 2556 goto hwp_cpu_matched;
2563 } 2557 }
2564 } else { 2558 } else {
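With the intel_pstate changes above, user-visible limits are carried as plain integer P-state ratios instead of extended fixed-point fractions. A minimal standalone sketch of the new global-limit arithmetic (plain C with a hypothetical main(); only DIV_ROUND_UP() and the percent semantics mirror the driver):

#include <stdio.h>

/* Round-up integer division, matching the kernel's DIV_ROUND_UP(). */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

static int clamp_int(int val, int lo, int hi)
{
	return val < lo ? lo : (val > hi ? hi : val);
}

int main(void)
{
	int turbo_max = 36;	/* highest turbo P-state ratio */
	int max_perf_pct = 75;	/* global sysfs limits */
	int min_perf_pct = 25;

	/* Global limits are in percent of the maximum turbo P-state. */
	int global_max = DIV_ROUND_UP(turbo_max * max_perf_pct, 100);
	int global_min = DIV_ROUND_UP(turbo_max * min_perf_pct, 100);

	global_min = clamp_int(global_min, 0, global_max);
	printf("min_perf_ratio %d, max_perf_ratio %d\n",
	       global_min, global_max);	/* prints 9 and 27 */
	return 0;
}

Those integer ratios are what now land in cpu->min_perf_ratio and cpu->max_perf_ratio, so no conversion back from fixed point is needed when programming HWP_REQUEST.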
diff --git a/drivers/cpufreq/sfi-cpufreq.c b/drivers/cpufreq/sfi-cpufreq.c
index 992ce6f9abec..3779742f86e3 100644
--- a/drivers/cpufreq/sfi-cpufreq.c
+++ b/drivers/cpufreq/sfi-cpufreq.c
@@ -24,7 +24,7 @@
 
 #include <asm/msr.h>
 
-struct cpufreq_frequency_table *freq_table;
+static struct cpufreq_frequency_table *freq_table;
 static struct sfi_freq_table_entry *sfi_cpufreq_array;
 static int num_freq_table_entries;
 
diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm
index 21340e0be73e..f52144808455 100644
--- a/drivers/cpuidle/Kconfig.arm
+++ b/drivers/cpuidle/Kconfig.arm
@@ -4,6 +4,7 @@
 config ARM_CPUIDLE
 	bool "Generic ARM/ARM64 CPU idle Driver"
 	select DT_IDLE_STATES
+	select CPU_IDLE_MULTIPLE_DRIVERS
 	help
 	  Select this to enable generic cpuidle driver for ARM.
 	  It provides a generic idle driver whose idle states are configured
diff --git a/drivers/cpuidle/cpuidle-arm.c b/drivers/cpuidle/cpuidle-arm.c
index f440d385ed34..7080c384ad5d 100644
--- a/drivers/cpuidle/cpuidle-arm.c
+++ b/drivers/cpuidle/cpuidle-arm.c
@@ -18,6 +18,7 @@
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/slab.h>
+#include <linux/topology.h>
 
 #include <asm/cpuidle.h>
 
@@ -44,7 +45,7 @@ static int arm_enter_idle_state(struct cpuidle_device *dev,
 	return CPU_PM_CPU_IDLE_ENTER(arm_cpuidle_suspend, idx);
 }
 
-static struct cpuidle_driver arm_idle_driver = {
+static struct cpuidle_driver arm_idle_driver __initdata = {
 	.name = "arm_idle",
 	.owner = THIS_MODULE,
 	/*
@@ -80,30 +81,42 @@ static const struct of_device_id arm_idle_state_match[] __initconst = {
 static int __init arm_idle_init(void)
 {
 	int cpu, ret;
-	struct cpuidle_driver *drv = &arm_idle_driver;
+	struct cpuidle_driver *drv;
 	struct cpuidle_device *dev;
 
-	/*
-	 * Initialize idle states data, starting at index 1.
-	 * This driver is DT only, if no DT idle states are detected (ret == 0)
-	 * let the driver initialization fail accordingly since there is no
-	 * reason to initialize the idle driver if only wfi is supported.
-	 */
-	ret = dt_init_idle_driver(drv, arm_idle_state_match, 1);
-	if (ret <= 0)
-		return ret ? : -ENODEV;
-
-	ret = cpuidle_register_driver(drv);
-	if (ret) {
-		pr_err("Failed to register cpuidle driver\n");
-		return ret;
-	}
-
-	/*
-	 * Call arch CPU operations in order to initialize
-	 * idle states suspend back-end specific data
-	 */
 	for_each_possible_cpu(cpu) {
+
+		drv = kmemdup(&arm_idle_driver, sizeof(*drv), GFP_KERNEL);
+		if (!drv) {
+			ret = -ENOMEM;
+			goto out_fail;
+		}
+
+		drv->cpumask = (struct cpumask *)cpumask_of(cpu);
+
+		/*
+		 * Initialize idle states data, starting at index 1. This
+		 * driver is DT only, if no DT idle states are detected (ret
+		 * == 0) let the driver initialization fail accordingly since
+		 * there is no reason to initialize the idle driver if only
+		 * wfi is supported.
+		 */
+		ret = dt_init_idle_driver(drv, arm_idle_state_match, 1);
+		if (ret <= 0) {
+			ret = ret ? : -ENODEV;
+			goto out_fail;
+		}
+
+		ret = cpuidle_register_driver(drv);
+		if (ret) {
+			pr_err("Failed to register cpuidle driver\n");
+			goto out_fail;
+		}
+
+		/*
+		 * Call arch CPU operations in order to initialize
+		 * idle states suspend back-end specific data
+		 */
 		ret = arm_cpuidle_init(cpu);
 
 		/*
@@ -141,10 +154,11 @@ out_fail:
 		dev = per_cpu(cpuidle_devices, cpu);
 		cpuidle_unregister_device(dev);
 		kfree(dev);
+		drv = cpuidle_get_driver();
+		cpuidle_unregister_driver(drv);
+		kfree(drv);
 	}
 
-	cpuidle_unregister_driver(drv);
-
 	return ret;
 }
 device_initcall(arm_idle_init);
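The arm_idle_init() rework registers one driver instance per CPU instead of a single shared one. A rough userspace model of the duplication pattern (stand-in types and harness; kmemdup() approximated by malloc() plus memcpy()):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct cpuidle_driver {
	const char *name;
	int cpu;	/* stands in for drv->cpumask = cpumask_of(cpu) */
};

static const struct cpuidle_driver arm_idle_driver = { .name = "arm_idle" };

int main(void)
{
	struct cpuidle_driver *drv[4];
	int cpu;

	/* Duplicate the template once per CPU, as the patch does. */
	for (cpu = 0; cpu < 4; cpu++) {
		drv[cpu] = malloc(sizeof(*drv[cpu]));
		if (!drv[cpu])
			return 1;
		memcpy(drv[cpu], &arm_idle_driver, sizeof(*drv[cpu]));
		drv[cpu]->cpu = cpu;
		printf("%s instance bound to CPU %d\n", drv[cpu]->name, cpu);
	}
	for (cpu = 0; cpu < 4; cpu++)
		free(drv[cpu]);
	return 0;
}

Because every instance carries its own cpumask, different CPUs can end up with different idle drivers, which is what selecting CPU_IDLE_MULTIPLE_DRIVERS allows.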
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index b2330fd69e34..61b64c2b2cb8 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -286,6 +286,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 	struct device *device = get_cpu_device(dev->cpu);
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
 	int i;
+	int first_idx;
+	int idx;
 	unsigned int interactivity_req;
 	unsigned int expected_interval;
 	unsigned long nr_iowaiters, cpu_load;
@@ -335,11 +337,11 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 		if (data->next_timer_us > polling_threshold &&
 		    latency_req > s->exit_latency && !s->disabled &&
 		    !dev->states_usage[CPUIDLE_DRIVER_STATE_START].disable)
-			data->last_state_idx = CPUIDLE_DRIVER_STATE_START;
+			first_idx = CPUIDLE_DRIVER_STATE_START;
 		else
-			data->last_state_idx = CPUIDLE_DRIVER_STATE_START - 1;
+			first_idx = CPUIDLE_DRIVER_STATE_START - 1;
 	} else {
-		data->last_state_idx = CPUIDLE_DRIVER_STATE_START;
+		first_idx = 0;
 	}
 
 	/*
@@ -359,20 +361,28 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 	 * Find the idle state with the lowest power while satisfying
 	 * our constraints.
 	 */
-	for (i = data->last_state_idx + 1; i < drv->state_count; i++) {
+	idx = -1;
+	for (i = first_idx; i < drv->state_count; i++) {
 		struct cpuidle_state *s = &drv->states[i];
 		struct cpuidle_state_usage *su = &dev->states_usage[i];
 
 		if (s->disabled || su->disable)
 			continue;
+		if (idx == -1)
+			idx = i; /* first enabled state */
 		if (s->target_residency > data->predicted_us)
 			break;
 		if (s->exit_latency > latency_req)
 			break;
 
-		data->last_state_idx = i;
+		idx = i;
 	}
 
+	if (idx == -1)
+		idx = 0; /* No states enabled. Must use 0. */
+
+	data->last_state_idx = idx;
+
 	return data->last_state_idx;
 }
 
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 216d7ec88c0c..c2ae819a871c 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -51,6 +51,8 @@
 /* un-comment DEBUG to enable pr_debug() statements */
 #define DEBUG
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/kernel.h>
 #include <linux/cpuidle.h>
 #include <linux/tick.h>
@@ -65,7 +67,6 @@
 #include <asm/msr.h>
 
 #define INTEL_IDLE_VERSION "0.4.1"
-#define PREFIX "intel_idle: "
 
 static struct cpuidle_driver intel_idle_driver = {
 	.name = "intel_idle",
@@ -1111,7 +1112,7 @@ static int __init intel_idle_probe(void)
 	const struct x86_cpu_id *id;
 
 	if (max_cstate == 0) {
-		pr_debug(PREFIX "disabled\n");
+		pr_debug("disabled\n");
 		return -EPERM;
 	}
 
@@ -1119,8 +1120,8 @@ static int __init intel_idle_probe(void)
 	if (!id) {
 		if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
 		    boot_cpu_data.x86 == 6)
-			pr_debug(PREFIX "does not run on family %d model %d\n",
-				boot_cpu_data.x86, boot_cpu_data.x86_model);
+			pr_debug("does not run on family %d model %d\n",
+				 boot_cpu_data.x86, boot_cpu_data.x86_model);
 		return -ENODEV;
 	}
 
@@ -1134,13 +1135,13 @@ static int __init intel_idle_probe(void)
 	    !mwait_substates)
 		return -ENODEV;
 
-	pr_debug(PREFIX "MWAIT substates: 0x%x\n", mwait_substates);
+	pr_debug("MWAIT substates: 0x%x\n", mwait_substates);
 
 	icpu = (const struct idle_cpu *)id->driver_data;
 	cpuidle_state_table = icpu->state_table;
 
-	pr_debug(PREFIX "v" INTEL_IDLE_VERSION
-		" model 0x%X\n", boot_cpu_data.x86_model);
+	pr_debug("v" INTEL_IDLE_VERSION " model 0x%X\n",
+		 boot_cpu_data.x86_model);
 
 	return 0;
 }
@@ -1340,8 +1341,7 @@ static void __init intel_idle_cpuidle_driver_init(void)
 			break;
 
 		if (cstate + 1 > max_cstate) {
-			printk(PREFIX "max_cstate %d reached\n",
-				max_cstate);
+			pr_info("max_cstate %d reached\n", max_cstate);
 			break;
 		}
 
@@ -1358,8 +1358,8 @@ static void __init intel_idle_cpuidle_driver_init(void)
 
 		/* if state marked as disabled, skip it */
 		if (cpuidle_state_table[cstate].disabled != 0) {
-			pr_debug(PREFIX "state %s is disabled",
-				cpuidle_state_table[cstate].name);
+			pr_debug("state %s is disabled\n",
+				 cpuidle_state_table[cstate].name);
 			continue;
 		}
 
@@ -1395,7 +1395,7 @@ static int intel_idle_cpu_init(unsigned int cpu)
 	dev->cpu = cpu;
 
 	if (cpuidle_register_device(dev)) {
-		pr_debug(PREFIX "cpuidle_register_device %d failed!\n", cpu);
+		pr_debug("cpuidle_register_device %d failed!\n", cpu);
 		return -EIO;
 	}
 
@@ -1447,8 +1447,8 @@ static int __init intel_idle_init(void)
 	retval = cpuidle_register_driver(&intel_idle_driver);
 	if (retval) {
 		struct cpuidle_driver *drv = cpuidle_get_driver();
-		printk(KERN_DEBUG PREFIX "intel_idle yielding to %s",
-			drv ? drv->name : "none");
+		printk(KERN_DEBUG pr_fmt("intel_idle yielding to %s\n"),
+		       drv ? drv->name : "none");
 		goto init_driver_fail;
 	}
 
@@ -1460,8 +1460,8 @@ static int __init intel_idle_init(void)
 	if (retval < 0)
 		goto hp_setup_fail;
 
-	pr_debug(PREFIX "lapic_timer_reliable_states 0x%x\n",
-		lapic_timer_reliable_states);
+	pr_debug("lapic_timer_reliable_states 0x%x\n",
+		 lapic_timer_reliable_states);
 
 	return 0;
 
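The intel_idle cleanup replaces the hand-rolled PREFIX macro with the kernel's pr_fmt() convention: defining pr_fmt before the printk helpers are pulled in makes every pr_debug()/pr_info() in the file carry the module-name prefix automatically. A userspace approximation (printf standing in for printk):

#include <stdio.h>

#define KBUILD_MODNAME	"intel_idle"
#define pr_fmt(fmt)	KBUILD_MODNAME ": " fmt
/* The kernel's pr_debug() applies pr_fmt() the same way. */
#define pr_debug(fmt, ...)	printf(pr_fmt(fmt), ##__VA_ARGS__)

int main(void)
{
	pr_debug("MWAIT substates: 0x%x\n", 0x1120);
	/* prints: intel_idle: MWAIT substates: 0x1120 */
	return 0;
}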
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 47070cff508c..e70c1c7ba1bf 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -394,29 +394,26 @@ bool pciehp_is_native(struct pci_dev *pdev)
 
 /**
  * pci_acpi_wake_bus - Root bus wakeup notification fork function.
- * @work: Work item to handle.
+ * @context: Device wakeup context.
  */
-static void pci_acpi_wake_bus(struct work_struct *work)
+static void pci_acpi_wake_bus(struct acpi_device_wakeup_context *context)
 {
 	struct acpi_device *adev;
 	struct acpi_pci_root *root;
 
-	adev = container_of(work, struct acpi_device, wakeup.context.work);
+	adev = container_of(context, struct acpi_device, wakeup.context);
 	root = acpi_driver_data(adev);
 	pci_pme_wakeup_bus(root->bus);
 }
 
 /**
  * pci_acpi_wake_dev - PCI device wakeup notification work function.
- * @handle: ACPI handle of a device the notification is for.
- * @work: Work item to handle.
+ * @context: Device wakeup context.
  */
-static void pci_acpi_wake_dev(struct work_struct *work)
+static void pci_acpi_wake_dev(struct acpi_device_wakeup_context *context)
 {
-	struct acpi_device_wakeup_context *context;
 	struct pci_dev *pci_dev;
 
-	context = container_of(work, struct acpi_device_wakeup_context, work);
 	pci_dev = to_pci_dev(context->dev);
 
 	if (pci_dev->pme_poll)
@@ -424,7 +421,7 @@ static void pci_acpi_wake_dev(struct work_struct *work)
 
 	if (pci_dev->current_state == PCI_D3cold) {
 		pci_wakeup_event(pci_dev);
-		pm_runtime_resume(&pci_dev->dev);
+		pm_request_resume(&pci_dev->dev);
 		return;
 	}
 
@@ -433,7 +430,7 @@ static void pci_acpi_wake_dev(struct work_struct *work)
 		pci_check_pme_status(pci_dev);
 
 	pci_wakeup_event(pci_dev);
-	pm_runtime_resume(&pci_dev->dev);
+	pm_request_resume(&pci_dev->dev);
 
 	pci_pme_wakeup_bus(pci_dev->subordinate);
 }
@@ -572,67 +569,29 @@ static pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
 	return state_conv[state];
 }
 
-static bool acpi_pci_can_wakeup(struct pci_dev *dev)
-{
-	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
-	return adev ? acpi_device_can_wakeup(adev) : false;
-}
-
-static void acpi_pci_propagate_wakeup_enable(struct pci_bus *bus, bool enable)
-{
-	while (bus->parent) {
-		if (!acpi_pm_device_sleep_wake(&bus->self->dev, enable))
-			return;
-		bus = bus->parent;
-	}
-
-	/* We have reached the root bus. */
-	if (bus->bridge)
-		acpi_pm_device_sleep_wake(bus->bridge, enable);
-}
-
-static int acpi_pci_sleep_wake(struct pci_dev *dev, bool enable)
-{
-	if (acpi_pci_can_wakeup(dev))
-		return acpi_pm_device_sleep_wake(&dev->dev, enable);
-
-	acpi_pci_propagate_wakeup_enable(dev->bus, enable);
-	return 0;
-}
-
-static void acpi_pci_propagate_run_wake(struct pci_bus *bus, bool enable)
+static int acpi_pci_propagate_wakeup(struct pci_bus *bus, bool enable)
 {
 	while (bus->parent) {
-		struct pci_dev *bridge = bus->self;
+		if (acpi_pm_device_can_wakeup(&bus->self->dev))
+			return acpi_pm_set_device_wakeup(&bus->self->dev, enable);
 
-		if (bridge->pme_interrupt)
-			return;
-		if (!acpi_pm_device_run_wake(&bridge->dev, enable))
-			return;
 		bus = bus->parent;
 	}
 
 	/* We have reached the root bus. */
-	if (bus->bridge)
-		acpi_pm_device_run_wake(bus->bridge, enable);
+	if (bus->bridge) {
+		if (acpi_pm_device_can_wakeup(bus->bridge))
+			return acpi_pm_set_device_wakeup(bus->bridge, enable);
+	}
+	return 0;
 }
 
-static int acpi_pci_run_wake(struct pci_dev *dev, bool enable)
+static int acpi_pci_wakeup(struct pci_dev *dev, bool enable)
 {
-	/*
-	 * Per PCI Express Base Specification Revision 2.0 section
-	 * 5.3.3.2 Link Wakeup, platform support is needed for D3cold
-	 * waking up to power on the main link even if there is PME
-	 * support for D3cold
-	 */
-	if (dev->pme_interrupt && !dev->runtime_d3cold)
-		return 0;
-
-	if (!acpi_pm_device_run_wake(&dev->dev, enable))
-		return 0;
+	if (acpi_pm_device_can_wakeup(&dev->dev))
+		return acpi_pm_set_device_wakeup(&dev->dev, enable);
 
-	acpi_pci_propagate_run_wake(dev->bus, enable);
-	return 0;
+	return acpi_pci_propagate_wakeup(dev->bus, enable);
 }
 
 static bool acpi_pci_need_resume(struct pci_dev *dev)
@@ -656,8 +615,7 @@ static const struct pci_platform_pm_ops acpi_pci_platform_pm = {
 	.set_state = acpi_pci_set_power_state,
 	.get_state = acpi_pci_get_power_state,
 	.choose_state = acpi_pci_choose_state,
-	.sleep_wake = acpi_pci_sleep_wake,
-	.run_wake = acpi_pci_run_wake,
+	.set_wakeup = acpi_pci_wakeup,
 	.need_resume = acpi_pci_need_resume,
 };
 
@@ -780,9 +738,7 @@ static void pci_acpi_setup(struct device *dev)
 		return;
 
 	device_set_wakeup_capable(dev, true);
-	acpi_pci_sleep_wake(pci_dev, false);
-	if (adev->wakeup.flags.run_wake)
-		device_set_run_wake(dev, true);
+	acpi_pci_wakeup(pci_dev, false);
 }
 
 static void pci_acpi_cleanup(struct device *dev)
@@ -793,10 +749,8 @@ static void pci_acpi_cleanup(struct device *dev)
 		return;
 
 	pci_acpi_remove_pm_notifier(adev);
-	if (adev->wakeup.flags.valid) {
+	if (adev->wakeup.flags.valid)
 		device_set_wakeup_capable(dev, false);
-		device_set_run_wake(dev, false);
-	}
 }
 
 static bool pci_acpi_bus_match(struct device *dev)
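The two old propagation walks (sleep and runtime) collapse into one acpi_pci_propagate_wakeup() that climbs the bus hierarchy until it finds a bridge ACPI can wake up. A rough standalone model of the walk (stand-in types, hypothetical topology):

#include <stdio.h>
#include <stdbool.h>

struct bus {
	struct bus *parent;
	const char *name;	/* name of the bridge device on this bus */
	bool bridge_can_wakeup;
};

static int propagate_wakeup(struct bus *bus, bool enable)
{
	while (bus->parent) {
		if (bus->bridge_can_wakeup) {
			printf("set wakeup=%d on %s\n", enable, bus->name);
			return 0;
		}
		bus = bus->parent;
	}
	printf("reached root bus, nothing to configure\n");
	return 0;
}

int main(void)
{
	struct bus root   = { NULL,    "root",   false };
	struct bus bridge = { &root,   "bridge", true  };
	struct bus leaf   = { &bridge, "leaf",   false };

	return propagate_wakeup(&leaf, true);	/* configures "bridge" */
}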
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 00e10bf7f6a2..df4aead394f2 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1219,7 +1219,7 @@ static int pci_pm_runtime_resume(struct device *dev)
 
 	pci_restore_standard_config(pci_dev);
 	pci_fixup_device(pci_fixup_resume_early, pci_dev);
-	__pci_enable_wake(pci_dev, PCI_D0, true, false);
+	pci_enable_wake(pci_dev, PCI_D0, false);
 	pci_fixup_device(pci_fixup_resume, pci_dev);
 
 	rc = pm->runtime_resume(dev);
diff --git a/drivers/pci/pci-mid.c b/drivers/pci/pci-mid.c
index 1c4af7227bca..a4ac940c7696 100644
--- a/drivers/pci/pci-mid.c
+++ b/drivers/pci/pci-mid.c
@@ -39,12 +39,7 @@ static pci_power_t mid_pci_choose_state(struct pci_dev *pdev)
 	return PCI_D3hot;
 }
 
-static int mid_pci_sleep_wake(struct pci_dev *dev, bool enable)
-{
-	return 0;
-}
-
-static int mid_pci_run_wake(struct pci_dev *dev, bool enable)
+static int mid_pci_wakeup(struct pci_dev *dev, bool enable)
 {
 	return 0;
 }
@@ -59,8 +54,7 @@ static const struct pci_platform_pm_ops mid_pci_platform_pm = {
 	.set_state	= mid_pci_set_power_state,
 	.get_state	= mid_pci_get_power_state,
 	.choose_state	= mid_pci_choose_state,
-	.sleep_wake	= mid_pci_sleep_wake,
-	.run_wake	= mid_pci_run_wake,
+	.set_wakeup	= mid_pci_wakeup,
 	.need_resume	= mid_pci_need_resume,
 };
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 563901cd9c06..0b5302a9fdae 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -574,8 +574,7 @@ static const struct pci_platform_pm_ops *pci_platform_pm;
 int pci_set_platform_pm(const struct pci_platform_pm_ops *ops)
 {
 	if (!ops->is_manageable || !ops->set_state || !ops->get_state ||
-	    !ops->choose_state || !ops->sleep_wake || !ops->run_wake ||
-	    !ops->need_resume)
+	    !ops->choose_state || !ops->set_wakeup || !ops->need_resume)
 		return -EINVAL;
 	pci_platform_pm = ops;
 	return 0;
@@ -603,16 +602,10 @@ static inline pci_power_t platform_pci_choose_state(struct pci_dev *dev)
 			pci_platform_pm->choose_state(dev) : PCI_POWER_ERROR;
 }
 
-static inline int platform_pci_sleep_wake(struct pci_dev *dev, bool enable)
+static inline int platform_pci_set_wakeup(struct pci_dev *dev, bool enable)
 {
 	return pci_platform_pm ?
-			pci_platform_pm->sleep_wake(dev, enable) : -ENODEV;
-}
-
-static inline int platform_pci_run_wake(struct pci_dev *dev, bool enable)
-{
-	return pci_platform_pm ?
-			pci_platform_pm->run_wake(dev, enable) : -ENODEV;
+			pci_platform_pm->set_wakeup(dev, enable) : -ENODEV;
 }
 
 static inline bool platform_pci_need_resume(struct pci_dev *dev)
@@ -1805,6 +1798,23 @@ static void __pci_pme_active(struct pci_dev *dev, bool enable)
 	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, pmcsr);
 }
 
+static void pci_pme_restore(struct pci_dev *dev)
+{
+	u16 pmcsr;
+
+	if (!dev->pme_support)
+		return;
+
+	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
+	if (dev->wakeup_prepared) {
+		pmcsr |= PCI_PM_CTRL_PME_ENABLE;
+	} else {
+		pmcsr &= ~PCI_PM_CTRL_PME_ENABLE;
+		pmcsr |= PCI_PM_CTRL_PME_STATUS;
+	}
+	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, pmcsr);
+}
+
 /**
  * pci_pme_active - enable or disable PCI device's PME# function
  * @dev: PCI device to handle.
@@ -1872,10 +1882,9 @@ void pci_pme_active(struct pci_dev *dev, bool enable)
 EXPORT_SYMBOL(pci_pme_active);
 
 /**
- * __pci_enable_wake - enable PCI device as wakeup event source
+ * pci_enable_wake - enable PCI device as wakeup event source
  * @dev: PCI device affected
  * @state: PCI state from which device will issue wakeup events
- * @runtime: True if the events are to be generated at run time
 * @enable: True to enable event generation; false to disable
 *
 * This enables the device as a wakeup event source, or disables it.
@@ -1891,17 +1900,18 @@ EXPORT_SYMBOL(pci_pme_active);
 * Error code depending on the platform is returned if both the platform and
 * the native mechanism fail to enable the generation of wake-up events
 */
-int __pci_enable_wake(struct pci_dev *dev, pci_power_t state,
-		      bool runtime, bool enable)
+int pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable)
 {
 	int ret = 0;
 
-	if (enable && !runtime && !device_may_wakeup(&dev->dev))
-		return -EINVAL;
-
-	/* Don't do the same thing twice in a row for one device. */
-	if (!!enable == !!dev->wakeup_prepared)
+	/*
+	 * Don't do the same thing twice in a row for one device, but restore
+	 * PME Enable in case it has been updated by config space restoration.
+	 */
+	if (!!enable == !!dev->wakeup_prepared) {
+		pci_pme_restore(dev);
 		return 0;
+	}
 
 	/*
 	 * According to "PCI System Architecture" 4th ed. by Tom Shanley & Don
@@ -1916,24 +1926,20 @@ int __pci_enable_wake(struct pci_dev *dev, pci_power_t state,
 			pci_pme_active(dev, true);
 		else
 			ret = 1;
-		error = runtime ? platform_pci_run_wake(dev, true) :
-					platform_pci_sleep_wake(dev, true);
+		error = platform_pci_set_wakeup(dev, true);
 		if (ret)
 			ret = error;
 		if (!ret)
 			dev->wakeup_prepared = true;
 	} else {
-		if (runtime)
-			platform_pci_run_wake(dev, false);
-		else
-			platform_pci_sleep_wake(dev, false);
+		platform_pci_set_wakeup(dev, false);
 		pci_pme_active(dev, false);
 		dev->wakeup_prepared = false;
 	}
 
 	return ret;
 }
-EXPORT_SYMBOL(__pci_enable_wake);
+EXPORT_SYMBOL(pci_enable_wake);
 
 /**
 * pci_wake_from_d3 - enable/disable device to wake up from D3_hot or D3_cold
@@ -2075,12 +2081,12 @@ int pci_finish_runtime_suspend(struct pci_dev *dev)
 
 	dev->runtime_d3cold = target_state == PCI_D3cold;
 
-	__pci_enable_wake(dev, target_state, true, pci_dev_run_wake(dev));
+	pci_enable_wake(dev, target_state, pci_dev_run_wake(dev));
 
 	error = pci_set_power_state(dev, target_state);
 
 	if (error) {
-		__pci_enable_wake(dev, target_state, true, false);
+		pci_enable_wake(dev, target_state, false);
 		dev->runtime_d3cold = false;
 	}
 
@@ -2099,7 +2105,7 @@ bool pci_dev_run_wake(struct pci_dev *dev)
 {
 	struct pci_bus *bus = dev->bus;
 
-	if (device_run_wake(&dev->dev))
+	if (device_can_wakeup(&dev->dev))
 		return true;
 
 	if (!dev->pme_support)
@@ -2112,7 +2118,7 @@ bool pci_dev_run_wake(struct pci_dev *dev)
 	while (bus->parent) {
 		struct pci_dev *bridge = bus->self;
 
-		if (device_run_wake(&bridge->dev))
+		if (device_can_wakeup(&bridge->dev))
 			return true;
 
 		bus = bus->parent;
@@ -2120,7 +2126,7 @@ bool pci_dev_run_wake(struct pci_dev *dev)
 
 	/* We have reached the root bus. */
 	if (bus->bridge)
-		return device_run_wake(bus->bridge);
+		return device_can_wakeup(bus->bridge);
 
 	return false;
 }
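The new pci_pme_restore() exists because restoring config space can clobber the PME Enable bit behind the kernel's back; pci_enable_wake() now re-syncs it with the recorded wakeup_prepared state instead of returning early. A standalone model of the bit handling (register access reduced to a variable; the mask values match pci_regs.h):

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define PCI_PM_CTRL_PME_ENABLE	0x0100
#define PCI_PM_CTRL_PME_STATUS	0x8000

static uint16_t pmcsr;	/* stands in for the PMCSR config register */

static void pme_restore(bool wakeup_prepared)
{
	if (wakeup_prepared) {
		pmcsr |= PCI_PM_CTRL_PME_ENABLE;
	} else {
		pmcsr &= ~PCI_PM_CTRL_PME_ENABLE;
		pmcsr |= PCI_PM_CTRL_PME_STATUS;	/* write 1 to clear PME# */
	}
}

int main(void)
{
	pme_restore(true);
	printf("PMCSR, wakeup prepared: 0x%04x\n", pmcsr);	/* 0x0100 */
	pme_restore(false);
	printf("PMCSR, disarmed:        0x%04x\n", pmcsr);	/* 0x8000 */
	return 0;
}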
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index f8113e5b9812..240b2c0fed4b 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -47,11 +47,7 @@ int pci_probe_reset_function(struct pci_dev *dev);
 *                platform; to be used during system-wide transitions from a
 *                sleeping state to the working state and vice versa
 *
- * @sleep_wake: enables/disables the system wake up capability of given device
- *
- * @run_wake: enables/disables the platform to generate run-time wake-up events
- *		for given device (the device's wake-up capability has to be
- *		enabled by @sleep_wake for this feature to work)
+ * @set_wakeup: enables/disables wakeup capability for the device
 *
 * @need_resume: returns 'true' if the given device (which is currently
 *		suspended) needs to be resumed to be configured for system
@@ -65,8 +61,7 @@ struct pci_platform_pm_ops {
 	int (*set_state)(struct pci_dev *dev, pci_power_t state);
 	pci_power_t (*get_state)(struct pci_dev *dev);
 	pci_power_t (*choose_state)(struct pci_dev *dev);
-	int (*sleep_wake)(struct pci_dev *dev, bool enable);
-	int (*run_wake)(struct pci_dev *dev, bool enable);
+	int (*set_wakeup)(struct pci_dev *dev, bool enable);
 	bool (*need_resume)(struct pci_dev *dev);
 };
 
diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c
index 2dd1c68e6de8..80e58d25006d 100644
--- a/drivers/pci/pcie/pme.c
+++ b/drivers/pci/pcie/pme.c
@@ -294,31 +294,29 @@ static irqreturn_t pcie_pme_irq(int irq, void *context)
 }
 
 /**
- * pcie_pme_set_native - Set the PME interrupt flag for given device.
+ * pcie_pme_can_wakeup - Set the wakeup capability flag.
 * @dev: PCI device to handle.
 * @ign: Ignored.
 */
-static int pcie_pme_set_native(struct pci_dev *dev, void *ign)
+static int pcie_pme_can_wakeup(struct pci_dev *dev, void *ign)
 {
-	device_set_run_wake(&dev->dev, true);
-	dev->pme_interrupt = true;
+	device_set_wakeup_capable(&dev->dev, true);
 	return 0;
 }
 
 /**
- * pcie_pme_mark_devices - Set the PME interrupt flag for devices below a port.
+ * pcie_pme_mark_devices - Set the wakeup flag for devices below a port.
 * @port: PCIe root port or event collector to handle.
 *
 * For each device below given root port, including the port itself (or for each
 * root complex integrated endpoint if @port is a root complex event collector)
- * set the flag indicating that it can signal run-time wake-up events via PCIe
- * PME interrupts.
+ * set the flag indicating that it can signal run-time wake-up events.
 */
 static void pcie_pme_mark_devices(struct pci_dev *port)
 {
-	pcie_pme_set_native(port, NULL);
+	pcie_pme_can_wakeup(port, NULL);
 	if (port->subordinate)
-		pci_walk_bus(port->subordinate, pcie_pme_set_native, NULL);
+		pci_walk_bus(port->subordinate, pcie_pme_can_wakeup, NULL);
 }
 
 /**
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 8489020ecf44..a3ccc3c795a5 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -794,6 +794,25 @@ config INTEL_CHT_INT33FE
 	  This driver instantiates i2c-clients for these, so that standard
 	  i2c drivers for these chips can bind to the them.
 
+config INTEL_INT0002_VGPIO
+	tristate "Intel ACPI INT0002 Virtual GPIO driver"
+	depends on GPIOLIB && ACPI
+	select GPIOLIB_IRQCHIP
+	---help---
+	  Some peripherals on Bay Trail and Cherry Trail platforms signal a
+	  Power Management Event (PME) to the Power Management Controller (PMC)
+	  to wakeup the system. When this happens software needs to explicitly
+	  clear the PME bus 0 status bit in the GPE0a_STS register to avoid an
+	  IRQ storm on IRQ 9.
+
+	  This is modelled in ACPI through the INT0002 ACPI device, which is
+	  called a "Virtual GPIO controller" in ACPI because it defines the
+	  event handler to call when the PME triggers through _AEI and _L02
+	  methods as would be done for a real GPIO interrupt in ACPI.
+
+	  To compile this driver as a module, choose M here: the module will
+	  be called intel_int0002_vgpio.
+
 config INTEL_HID_EVENT
 	tristate "INTEL HID Event"
 	depends on ACPI
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 182a3ed6605a..ab22ce77fb66 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_TOSHIBA_BT_RFKILL)	+= toshiba_bluetooth.o
 obj-$(CONFIG_TOSHIBA_HAPS)	+= toshiba_haps.o
 obj-$(CONFIG_TOSHIBA_WMI)	+= toshiba-wmi.o
 obj-$(CONFIG_INTEL_CHT_INT33FE)	+= intel_cht_int33fe.o
+obj-$(CONFIG_INTEL_INT0002_VGPIO)	+= intel_int0002_vgpio.o
 obj-$(CONFIG_INTEL_HID_EVENT)	+= intel-hid.o
 obj-$(CONFIG_INTEL_VBTN)	+= intel-vbtn.o
 obj-$(CONFIG_INTEL_SCU_IPC)	+= intel_scu_ipc.o
diff --git a/drivers/platform/x86/intel-hid.c b/drivers/platform/x86/intel-hid.c
index 63ba2cbd04c2..8519e0f97bdd 100644
--- a/drivers/platform/x86/intel-hid.c
+++ b/drivers/platform/x86/intel-hid.c
@@ -23,6 +23,7 @@
 #include <linux/platform_device.h>
 #include <linux/input/sparse-keymap.h>
 #include <linux/acpi.h>
+#include <linux/suspend.h>
 #include <acpi/acpi_bus.h>
 
 MODULE_LICENSE("GPL");
@@ -75,6 +76,7 @@ static const struct key_entry intel_array_keymap[] = {
 struct intel_hid_priv {
 	struct input_dev *input_dev;
 	struct input_dev *array;
+	bool wakeup_mode;
 };
 
 static int intel_hid_set_enable(struct device *device, bool enable)
@@ -116,23 +118,37 @@ static void intel_button_array_enable(struct device *device, bool enable)
 		dev_warn(device, "failed to set button capability\n");
 }
 
-static int intel_hid_pl_suspend_handler(struct device *device)
+static int intel_hid_pm_prepare(struct device *device)
 {
-	intel_hid_set_enable(device, false);
-	intel_button_array_enable(device, false);
+	struct intel_hid_priv *priv = dev_get_drvdata(device);
+
+	priv->wakeup_mode = true;
+	return 0;
+}
 
+static int intel_hid_pl_suspend_handler(struct device *device)
+{
+	if (pm_suspend_via_firmware()) {
+		intel_hid_set_enable(device, false);
+		intel_button_array_enable(device, false);
+	}
 	return 0;
 }
 
 static int intel_hid_pl_resume_handler(struct device *device)
 {
-	intel_hid_set_enable(device, true);
-	intel_button_array_enable(device, true);
+	struct intel_hid_priv *priv = dev_get_drvdata(device);
 
+	priv->wakeup_mode = false;
+	if (pm_resume_via_firmware()) {
+		intel_hid_set_enable(device, true);
+		intel_button_array_enable(device, true);
+	}
 	return 0;
 }
 
 static const struct dev_pm_ops intel_hid_pl_pm_ops = {
+	.prepare = intel_hid_pm_prepare,
 	.freeze  = intel_hid_pl_suspend_handler,
 	.thaw  = intel_hid_pl_resume_handler,
 	.restore  = intel_hid_pl_resume_handler,
@@ -186,6 +202,19 @@ static void notify_handler(acpi_handle handle, u32 event, void *context)
 	unsigned long long ev_index;
 	acpi_status status;
 
+	if (priv->wakeup_mode) {
+		/* Wake up on 5-button array events only. */
+		if (event == 0xc0 || !priv->array)
+			return;
+
+		if (sparse_keymap_entry_from_scancode(priv->array, event))
+			pm_wakeup_hard_event(&device->dev);
+		else
+			dev_info(&device->dev, "unknown event 0x%x\n", event);
+
+		return;
+	}
+
 	/* 0xC0 is for HID events, other values are for 5 button array */
 	if (event != 0xc0) {
 		if (!priv->array ||
@@ -270,6 +299,7 @@ static int intel_hid_probe(struct platform_device *device)
 			       "failed to enable HID power button\n");
 	}
 
+	device_init_wakeup(&device->dev, true);
 	return 0;
 
 err_remove_notify:
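In wakeup mode the intel-hid notify handler stops reporting input and instead flags a hard wakeup event for 5-button-array scancodes only. A standalone model of the dispatch (names mirror the driver; the harness is hypothetical):

#include <stdio.h>
#include <stdbool.h>

static bool wakeup_mode;

static void notify(unsigned int event, bool have_array)
{
	if (wakeup_mode) {
		/* Wake up on 5-button array events only. */
		if (event == 0xc0 || !have_array)
			return;
		printf("event 0x%x -> pm_wakeup_hard_event()\n", event);
		return;
	}
	printf("event 0x%x -> normal input reporting\n", event);
}

int main(void)
{
	wakeup_mode = true;
	notify(0xc0, true);	/* HID event: ignored while suspending */
	notify(0xce, true);	/* array event: treated as a wakeup */
	wakeup_mode = false;
	notify(0xc0, true);	/* after resume: reported normally */
	return 0;
}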
diff --git a/drivers/platform/x86/intel-vbtn.c b/drivers/platform/x86/intel-vbtn.c
index c2035e121ac2..61f106377661 100644
--- a/drivers/platform/x86/intel-vbtn.c
+++ b/drivers/platform/x86/intel-vbtn.c
@@ -23,6 +23,7 @@
 #include <linux/platform_device.h>
 #include <linux/input/sparse-keymap.h>
 #include <linux/acpi.h>
+#include <linux/suspend.h>
 #include <acpi/acpi_bus.h>
 
 MODULE_LICENSE("GPL");
@@ -46,6 +47,7 @@ static const struct key_entry intel_vbtn_keymap[] = {
 
 struct intel_vbtn_priv {
 	struct input_dev *input_dev;
+	bool wakeup_mode;
 };
 
 static int intel_vbtn_input_setup(struct platform_device *device)
@@ -73,9 +75,15 @@ static void notify_handler(acpi_handle handle, u32 event, void *context)
 	struct platform_device *device = context;
 	struct intel_vbtn_priv *priv = dev_get_drvdata(&device->dev);
 
-	if (!sparse_keymap_report_event(priv->input_dev, event, 1, true))
-		dev_info(&device->dev, "unknown event index 0x%x\n",
-			 event);
+	if (priv->wakeup_mode) {
+		if (sparse_keymap_entry_from_scancode(priv->input_dev, event)) {
+			pm_wakeup_hard_event(&device->dev);
+			return;
+		}
+	} else if (sparse_keymap_report_event(priv->input_dev, event, 1, true)) {
+		return;
+	}
+	dev_info(&device->dev, "unknown event index 0x%x\n", event);
 }
 
 static int intel_vbtn_probe(struct platform_device *device)
@@ -109,6 +117,7 @@ static int intel_vbtn_probe(struct platform_device *device)
 	if (ACPI_FAILURE(status))
 		return -EBUSY;
 
+	device_init_wakeup(&device->dev, true);
 	return 0;
 }
 
@@ -125,10 +134,34 @@ static int intel_vbtn_remove(struct platform_device *device)
 	return 0;
 }
 
+static int intel_vbtn_pm_prepare(struct device *dev)
+{
+	struct intel_vbtn_priv *priv = dev_get_drvdata(dev);
+
+	priv->wakeup_mode = true;
+	return 0;
+}
+
+static int intel_vbtn_pm_resume(struct device *dev)
+{
+	struct intel_vbtn_priv *priv = dev_get_drvdata(dev);
+
+	priv->wakeup_mode = false;
+	return 0;
+}
+
+static const struct dev_pm_ops intel_vbtn_pm_ops = {
+	.prepare = intel_vbtn_pm_prepare,
+	.resume = intel_vbtn_pm_resume,
+	.restore = intel_vbtn_pm_resume,
+	.thaw = intel_vbtn_pm_resume,
+};
+
 static struct platform_driver intel_vbtn_pl_driver = {
 	.driver = {
 		.name = "intel-vbtn",
 		.acpi_match_table = intel_vbtn_ids,
+		.pm = &intel_vbtn_pm_ops,
 	},
 	.probe = intel_vbtn_probe,
 	.remove = intel_vbtn_remove,
diff --git a/drivers/platform/x86/intel_int0002_vgpio.c b/drivers/platform/x86/intel_int0002_vgpio.c
new file mode 100644
index 000000000000..92dc230ef5b2
--- /dev/null
+++ b/drivers/platform/x86/intel_int0002_vgpio.c
@@ -0,0 +1,219 @@
+/*
+ * Intel INT0002 "Virtual GPIO" driver
+ *
+ * Copyright (C) 2017 Hans de Goede <hdegoede@redhat.com>
+ *
+ * Loosely based on android x86 kernel code which is:
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * Author: Dyut Kumar Sil <dyut.k.sil@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Some peripherals on Bay Trail and Cherry Trail platforms signal a Power
+ * Management Event (PME) to the Power Management Controller (PMC) to wakeup
+ * the system. When this happens software needs to clear the PME bus 0 status
+ * bit in the GPE0a_STS register to avoid an IRQ storm on IRQ 9.
+ *
+ * This is modelled in ACPI through the INT0002 ACPI device, which is
+ * called a "Virtual GPIO controller" in ACPI because it defines the event
+ * handler to call when the PME triggers through _AEI and _L02 / _E02
+ * methods as would be done for a real GPIO interrupt in ACPI. Note this
+ * is a hack to define an AML event handler for the PME while using existing
+ * ACPI mechanisms, this is not a real GPIO at all.
+ *
+ * This driver will bind to the INT0002 device, and register as a GPIO
+ * controller, letting gpiolib-acpi.c call the _L02 handler as it would
+ * for a real GPIO controller.
+ */
+
+#include <linux/acpi.h>
+#include <linux/bitmap.h>
+#include <linux/gpio/driver.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/suspend.h>
+
+#include <asm/cpu_device_id.h>
+#include <asm/intel-family.h>
+
+#define DRV_NAME			"INT0002 Virtual GPIO"
+
+/* For some reason the virtual GPIO pin tied to the GPE is numbered pin 2 */
+#define GPE0A_PME_B0_VIRT_GPIO_PIN	2
+
+#define GPE0A_PME_B0_STS_BIT		BIT(13)
+#define GPE0A_PME_B0_EN_BIT		BIT(13)
+#define GPE0A_STS_PORT			0x420
+#define GPE0A_EN_PORT			0x428
+
+#define ICPU(model)	{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, }
+
+static const struct x86_cpu_id int0002_cpu_ids[] = {
+/*
+ * Limit ourselves to Cherry Trail for now, until testing shows we
+ * need to handle the INT0002 device on Baytrail too.
+ *	ICPU(INTEL_FAM6_ATOM_SILVERMONT1),	 * Valleyview, Bay Trail *
+ */
+	ICPU(INTEL_FAM6_ATOM_AIRMONT),		/* Braswell, Cherry Trail */
+	{}
+};
+
+/*
+ * As this is not a real GPIO at all, but just a hack to model an event in
+ * ACPI the get / set functions are dummy functions.
+ */
+
+static int int0002_gpio_get(struct gpio_chip *chip, unsigned int offset)
+{
+	return 0;
+}
+
+static void int0002_gpio_set(struct gpio_chip *chip, unsigned int offset,
+			     int value)
+{
+}
+
+static int int0002_gpio_direction_output(struct gpio_chip *chip,
+					 unsigned int offset, int value)
+{
+	return 0;
+}
+
+static void int0002_irq_ack(struct irq_data *data)
+{
+	outl(GPE0A_PME_B0_STS_BIT, GPE0A_STS_PORT);
+}
+
+static void int0002_irq_unmask(struct irq_data *data)
+{
+	u32 gpe_en_reg;
+
+	gpe_en_reg = inl(GPE0A_EN_PORT);
+	gpe_en_reg |= GPE0A_PME_B0_EN_BIT;
+	outl(gpe_en_reg, GPE0A_EN_PORT);
+}
+
+static void int0002_irq_mask(struct irq_data *data)
+{
+	u32 gpe_en_reg;
+
+	gpe_en_reg = inl(GPE0A_EN_PORT);
+	gpe_en_reg &= ~GPE0A_PME_B0_EN_BIT;
+	outl(gpe_en_reg, GPE0A_EN_PORT);
+}
+
+static irqreturn_t int0002_irq(int irq, void *data)
+{
+	struct gpio_chip *chip = data;
+	u32 gpe_sts_reg;
+
+	gpe_sts_reg = inl(GPE0A_STS_PORT);
+	if (!(gpe_sts_reg & GPE0A_PME_B0_STS_BIT))
+		return IRQ_NONE;
+
+	generic_handle_irq(irq_find_mapping(chip->irqdomain,
+					    GPE0A_PME_B0_VIRT_GPIO_PIN));
+
+	pm_system_wakeup();
+
+	return IRQ_HANDLED;
+}
+
+static struct irq_chip int0002_irqchip = {
+	.name			= DRV_NAME,
+	.irq_ack		= int0002_irq_ack,
+	.irq_mask		= int0002_irq_mask,
+	.irq_unmask		= int0002_irq_unmask,
+};
+
+static int int0002_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	const struct x86_cpu_id *cpu_id;
+	struct gpio_chip *chip;
+	int irq, ret;
+
+	/* Menlow has a different INT0002 device? <sigh> */
+	cpu_id = x86_match_cpu(int0002_cpu_ids);
+	if (!cpu_id)
+		return -ENODEV;
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_err(dev, "Error getting IRQ: %d\n", irq);
+		return irq;
+	}
+
+	chip = devm_kzalloc(dev, sizeof(*chip), GFP_KERNEL);
+	if (!chip)
+		return -ENOMEM;
+
+	chip->label = DRV_NAME;
+	chip->parent = dev;
+	chip->owner = THIS_MODULE;
+	chip->get = int0002_gpio_get;
+	chip->set = int0002_gpio_set;
+	chip->direction_input = int0002_gpio_get;
+	chip->direction_output = int0002_gpio_direction_output;
+	chip->base = -1;
+	chip->ngpio = GPE0A_PME_B0_VIRT_GPIO_PIN + 1;
+	chip->irq_need_valid_mask = true;
+
+	ret = devm_gpiochip_add_data(&pdev->dev, chip, NULL);
+	if (ret) {
+		dev_err(dev, "Error adding gpio chip: %d\n", ret);
+		return ret;
+	}
+
+	bitmap_clear(chip->irq_valid_mask, 0, GPE0A_PME_B0_VIRT_GPIO_PIN);
+
+	/*
+	 * We manually request the irq here instead of passing a flow-handler
+	 * to gpiochip_set_chained_irqchip, because the irq is shared.
+	 */
+	ret = devm_request_irq(dev, irq, int0002_irq,
+			       IRQF_SHARED | IRQF_NO_THREAD, "INT0002", chip);
+	if (ret) {
+		dev_err(dev, "Error requesting IRQ %d: %d\n", irq, ret);
+		return ret;
+	}
+
+	ret = gpiochip_irqchip_add(chip, &int0002_irqchip, 0, handle_edge_irq,
+				   IRQ_TYPE_NONE);
+	if (ret) {
+		dev_err(dev, "Error adding irqchip: %d\n", ret);
+		return ret;
+	}
+
+	gpiochip_set_chained_irqchip(chip, &int0002_irqchip, irq, NULL);
+
+	return 0;
+}
+
+static const struct acpi_device_id int0002_acpi_ids[] = {
+	{ "INT0002", 0 },
+	{ },
+};
+MODULE_DEVICE_TABLE(acpi, int0002_acpi_ids);
+
+static struct platform_driver int0002_driver = {
+	.driver = {
+		.name			= DRV_NAME,
+		.acpi_match_table	= int0002_acpi_ids,
+	},
+	.probe	= int0002_probe,
+};
+
+module_platform_driver(int0002_driver);
+
+MODULE_AUTHOR("Hans de Goede <hdegoede@redhat.com>");
+MODULE_DESCRIPTION("Intel INT0002 Virtual GPIO driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/pnp/pnpacpi/core.c b/drivers/pnp/pnpacpi/core.c
index 9113876487ed..3a4c1aa0201e 100644
--- a/drivers/pnp/pnpacpi/core.c
+++ b/drivers/pnp/pnpacpi/core.c
@@ -149,8 +149,8 @@ static int pnpacpi_suspend(struct pnp_dev *dev, pm_message_t state)
 	}
 
 	if (device_can_wakeup(&dev->dev)) {
-		error = acpi_pm_device_sleep_wake(&dev->dev,
+		error = acpi_pm_set_device_wakeup(&dev->dev,
				device_may_wakeup(&dev->dev));
 		if (error)
 			return error;
 	}
@@ -185,7 +185,7 @@ static int pnpacpi_resume(struct pnp_dev *dev)
 	}
 
 	if (device_may_wakeup(&dev->dev))
-		acpi_pm_device_sleep_wake(&dev->dev, false);
+		acpi_pm_set_device_wakeup(&dev->dev, false);
 
 	if (acpi_device_power_manageable(acpi_dev))
 		error = acpi_device_set_power(acpi_dev, ACPI_STATE_D0);
diff --git a/drivers/power/avs/rockchip-io-domain.c b/drivers/power/avs/rockchip-io-domain.c
index 85812521b6ba..031a34372191 100644
--- a/drivers/power/avs/rockchip-io-domain.c
+++ b/drivers/power/avs/rockchip-io-domain.c
@@ -253,6 +253,16 @@ static const struct rockchip_iodomain_soc_data soc_data_rk3188 = {
 	},
 };
 
+static const struct rockchip_iodomain_soc_data soc_data_rk3228 = {
+	.grf_offset = 0x418,
+	.supply_names = {
+		"vccio1",
+		"vccio2",
+		"vccio3",
+		"vccio4",
+	},
+};
+
 static const struct rockchip_iodomain_soc_data soc_data_rk3288 = {
 	.grf_offset = 0x380,
 	.supply_names = {
@@ -345,6 +355,10 @@ static const struct of_device_id rockchip_iodomain_match[] = {
 		.data = (void *)&soc_data_rk3188
 	},
 	{
+		.compatible = "rockchip,rk3228-io-voltage-domain",
+		.data = (void *)&soc_data_rk3228
+	},
+	{
 		.compatible = "rockchip,rk3288-io-voltage-domain",
 		.data = (void *)&soc_data_rk3288
 	},
diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 9ddad0815ba9..d1694f1def72 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -874,7 +874,9 @@ static int rapl_write_data_raw(struct rapl_domain *rd,
 
 	cpu = rd->rp->lead_cpu;
 	bits = rapl_unit_xlate(rd, rp->unit, value, 1);
-	bits |= bits << rp->shift;
+	bits <<= rp->shift;
+	bits &= rp->mask;
+
 	memset(&ma, 0, sizeof(ma));
 
 	ma.msr_no = rd->msrs[rp->id];
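The two-line fix above is subtle: the old code OR-ed the unshifted value back into the shifted one, so for any non-zero rp->shift stray bits landed below the target RAPL field, and the result was never clamped to the field's width. A minimal sketch of the corrected read-modify-write pattern (standalone C; the helper name and parameters are illustrative, not the driver's):

#include <stdint.h>

/* Insert 'value' into an MSR image at the field described by shift/mask. */
static uint64_t rapl_field_insert(uint64_t msr, uint64_t value,
				  unsigned int shift, uint64_t mask)
{
	/* new behavior: shift first, then clamp to the field */
	uint64_t bits = (value << shift) & mask;

	/* buggy variant was: bits = value | (value << shift);
	 * which spills 'value' into bits 0..shift-1 of the MSR image */
	return (msr & ~mask) | bits;
}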
diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
index 7859d738df41..ea829ad798c0 100644
--- a/drivers/usb/core/hcd-pci.c
+++ b/drivers/usb/core/hcd-pci.c
@@ -584,12 +584,7 @@ static int hcd_pci_suspend_noirq(struct device *dev)
 
 static int hcd_pci_resume_noirq(struct device *dev)
 {
-	struct pci_dev *pci_dev = to_pci_dev(dev);
-
-	powermac_set_asic(pci_dev, 1);
-
-	/* Go back to D0 and disable remote wakeup */
-	pci_back_from_sleep(pci_dev);
+	powermac_set_asic(to_pci_dev(dev), 1);
 	return 0;
 }
 
diff --git a/drivers/usb/dwc3/dwc3-pci.c b/drivers/usb/dwc3/dwc3-pci.c
index fe851544d7fb..7e995df7a797 100644
--- a/drivers/usb/dwc3/dwc3-pci.c
+++ b/drivers/usb/dwc3/dwc3-pci.c
@@ -230,7 +230,6 @@ static int dwc3_pci_probe(struct pci_dev *pci,
 	}
 
 	device_init_wakeup(dev, true);
-	device_set_run_wake(dev, true);
 	pci_set_drvdata(pci, dwc);
 	pm_runtime_put(dev);
 
@@ -310,7 +309,7 @@ static int dwc3_pci_runtime_suspend(struct device *dev)
 {
 	struct dwc3_pci *dwc = dev_get_drvdata(dev);
 
-	if (device_run_wake(dev))
+	if (device_can_wakeup(dev))
 		return dwc3_pci_dsm(dwc, PCI_INTEL_BXT_STATE_D3);
 
 	return -EBUSY;
diff --git a/drivers/usb/host/uhci-pci.c b/drivers/usb/host/uhci-pci.c
index 02260cfdedb1..49effdc0d857 100644
--- a/drivers/usb/host/uhci-pci.c
+++ b/drivers/usb/host/uhci-pci.c
@@ -131,7 +131,7 @@ static int uhci_pci_init(struct usb_hcd *hcd)
 
 	/* Intel controllers use non-PME wakeup signalling */
 	if (to_pci_dev(uhci_dev(uhci))->vendor == PCI_VENDOR_ID_INTEL)
-		device_set_run_wake(uhci_dev(uhci), 1);
+		device_set_wakeup_capable(uhci_dev(uhci), true);
 
 	/* Set up pointers to PCI-specific functions */
 	uhci->reset_hc = uhci_pci_reset_hc;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index c1b163cb68b1..68bc6be447fd 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -315,13 +315,12 @@ struct acpi_device_perf {
 /* Wakeup Management */
 struct acpi_device_wakeup_flags {
 	u8 valid:1;		/* Can successfully enable wakeup? */
-	u8 run_wake:1;		/* Run-Wake GPE devices */
 	u8 notifier_present:1;	/* Wake-up notify handler has been installed */
 	u8 enabled:1;		/* Enabled for wakeup */
 };
 
 struct acpi_device_wakeup_context {
-	struct work_struct work;
+	void (*func)(struct acpi_device_wakeup_context *context);
 	struct device *dev;
 };
 
@@ -600,15 +599,20 @@ static inline bool acpi_device_always_present(struct acpi_device *adev)
 #endif
 
 #ifdef CONFIG_PM
+void acpi_pm_wakeup_event(struct device *dev);
 acpi_status acpi_add_pm_notifier(struct acpi_device *adev, struct device *dev,
-			void (*work_func)(struct work_struct *work));
+			void (*func)(struct acpi_device_wakeup_context *context));
 acpi_status acpi_remove_pm_notifier(struct acpi_device *adev);
+bool acpi_pm_device_can_wakeup(struct device *dev);
 int acpi_pm_device_sleep_state(struct device *, int *, int);
-int acpi_pm_device_run_wake(struct device *, bool);
+int acpi_pm_set_device_wakeup(struct device *dev, bool enable);
 #else
+static inline void acpi_pm_wakeup_event(struct device *dev)
+{
+}
 static inline acpi_status acpi_add_pm_notifier(struct acpi_device *adev,
 					       struct device *dev,
-					void (*work_func)(struct work_struct *work))
+					void (*func)(struct acpi_device_wakeup_context *context))
 {
 	return AE_SUPPORT;
 }
@@ -616,6 +620,10 @@ static inline acpi_status acpi_remove_pm_notifier(struct acpi_device *adev)
 {
 	return AE_SUPPORT;
 }
+static inline bool acpi_pm_device_can_wakeup(struct device *dev)
+{
+	return false;
+}
 static inline int acpi_pm_device_sleep_state(struct device *d, int *p, int m)
 {
 	if (p)
@@ -624,16 +632,7 @@ static inline int acpi_pm_device_sleep_state(struct device *d, int *p, int m)
 	return (m >= ACPI_STATE_D0 && m <= ACPI_STATE_D3_COLD) ?
 		m : ACPI_STATE_D0;
 }
-static inline int acpi_pm_device_run_wake(struct device *dev, bool enable)
-{
-	return -ENODEV;
-}
-#endif
-
-#ifdef CONFIG_PM_SLEEP
-int acpi_pm_device_sleep_wake(struct device *, bool);
-#else
-static inline int acpi_pm_device_sleep_wake(struct device *dev, bool enable)
+static inline int acpi_pm_set_device_wakeup(struct device *dev, bool enable)
 {
 	return -ENODEV;
 }
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index a5ce0bbeadb5..905117bd5012 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -883,6 +883,8 @@ static inline bool policy_has_boost_freq(struct cpufreq_policy *policy)
 }
 #endif
 
+extern unsigned int arch_freq_get_on_cpu(int cpu);
+
 /* the following are really really optional */
 extern struct freq_attr cpufreq_freq_attr_scaling_available_freqs;
 extern struct freq_attr cpufreq_freq_attr_scaling_boost_freqs;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 58f1ab06c4e8..1ef093866581 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -307,7 +307,6 @@ struct pci_dev {
 	u8		pm_cap;		/* PM capability offset */
 	unsigned int	pme_support:5;	/* Bitmask of states from which PME#
 					   can be generated */
-	unsigned int	pme_interrupt:1;
 	unsigned int	pme_poll:1;	/* Poll device's PME status bit */
 	unsigned int	d1_support:1;	/* Low power state D1 is supported */
 	unsigned int	d2_support:1;	/* Low power state D2 is supported */
@@ -1099,8 +1098,7 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state);
 pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state);
 bool pci_pme_capable(struct pci_dev *dev, pci_power_t state);
 void pci_pme_active(struct pci_dev *dev, bool enable);
-int __pci_enable_wake(struct pci_dev *dev, pci_power_t state,
-		      bool runtime, bool enable);
+int pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable);
 int pci_wake_from_d3(struct pci_dev *dev, bool enable);
 int pci_prepare_to_sleep(struct pci_dev *dev);
 int pci_back_from_sleep(struct pci_dev *dev);
@@ -1110,12 +1108,6 @@ void pci_pme_wakeup_bus(struct pci_bus *bus);
 void pci_d3cold_enable(struct pci_dev *dev);
 void pci_d3cold_disable(struct pci_dev *dev);
 
-static inline int pci_enable_wake(struct pci_dev *dev, pci_power_t state,
-				  bool enable)
-{
-	return __pci_enable_wake(dev, state, false, enable);
-}
-
 /* PCI Virtual Channel */
 int pci_save_vc_state(struct pci_dev *dev);
 void pci_restore_vc_state(struct pci_dev *dev);
diff --git a/include/linux/pm.h b/include/linux/pm.h
index a0894bc52bb4..b8b4df09fd8f 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -584,7 +584,6 @@ struct dev_pm_info {
 	unsigned int		idle_notification:1;
 	unsigned int		request_pending:1;
 	unsigned int		deferred_resume:1;
-	unsigned int		run_wake:1;
 	unsigned int		runtime_auto:1;
 	bool			ignore_children:1;
 	unsigned int		no_callbacks:1;
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index a6685b3dde26..51ec727b4824 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -121,6 +121,8 @@ struct opp_table *dev_pm_opp_set_prop_name(struct device *dev, const char *name)
 void dev_pm_opp_put_prop_name(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * const names[], unsigned int count);
 void dev_pm_opp_put_regulators(struct opp_table *opp_table);
+struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char * name);
+void dev_pm_opp_put_clkname(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev, int (*set_opp)(struct dev_pm_set_opp_data *data));
 void dev_pm_opp_register_put_opp_helper(struct opp_table *opp_table);
 int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq);
@@ -257,6 +259,13 @@ static inline struct opp_table *dev_pm_opp_set_regulators(struct device *dev, co
 
 static inline void dev_pm_opp_put_regulators(struct opp_table *opp_table) {}
 
+static inline struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char * name)
+{
+	return ERR_PTR(-ENOTSUPP);
+}
+
+static inline void dev_pm_opp_put_clkname(struct opp_table *opp_table) {}
+
 static inline int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 {
 	return -ENOTSUPP;
diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index ca4823e675e2..2efb08a60e63 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -76,16 +76,6 @@ static inline void pm_runtime_put_noidle(struct device *dev)
 	atomic_add_unless(&dev->power.usage_count, -1, 0);
 }
 
-static inline bool device_run_wake(struct device *dev)
-{
-	return dev->power.run_wake;
-}
-
-static inline void device_set_run_wake(struct device *dev, bool enable)
-{
-	dev->power.run_wake = enable;
-}
-
 static inline bool pm_runtime_suspended(struct device *dev)
 {
 	return dev->power.runtime_status == RPM_SUSPENDED
@@ -163,8 +153,6 @@ static inline void pm_runtime_forbid(struct device *dev) {}
 static inline void pm_suspend_ignore_children(struct device *dev, bool enable) {}
 static inline void pm_runtime_get_noresume(struct device *dev) {}
 static inline void pm_runtime_put_noidle(struct device *dev) {}
-static inline bool device_run_wake(struct device *dev) { return false; }
-static inline void device_set_run_wake(struct device *dev, bool enable) {}
 static inline bool pm_runtime_suspended(struct device *dev) { return false; }
 static inline bool pm_runtime_active(struct device *dev) { return true; }
 static inline bool pm_runtime_status_suspended(struct device *dev) { return false; }
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index d9718378a8be..0b1cf32edfd7 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -189,6 +189,8 @@ struct platform_suspend_ops {
 struct platform_freeze_ops {
 	int (*begin)(void);
 	int (*prepare)(void);
+	void (*wake)(void);
+	void (*sync)(void);
 	void (*restore)(void);
 	void (*end)(void);
 };
@@ -428,7 +430,8 @@ extern unsigned int pm_wakeup_irq;
 
 extern bool pm_wakeup_pending(void);
 extern void pm_system_wakeup(void);
-extern void pm_wakeup_clear(void);
+extern void pm_system_cancel_wakeup(void);
+extern void pm_wakeup_clear(bool reset);
 extern void pm_system_irq_wakeup(unsigned int irq_number);
 extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
@@ -478,7 +481,7 @@ static inline int unregister_pm_notifier(struct notifier_block *nb)
 
 static inline bool pm_wakeup_pending(void) { return false; }
 static inline void pm_system_wakeup(void) {}
-static inline void pm_wakeup_clear(void) {}
+static inline void pm_wakeup_clear(bool reset) {}
 static inline void pm_system_irq_wakeup(unsigned int irq_number) {}
 
 static inline void lock_system_sleep(void) {}
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index a8b978c35a6a..e1914c7b85b1 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -1108,7 +1108,7 @@ static struct attribute * g[] = {
 };
 
 
-static struct attribute_group attr_group = {
+static const struct attribute_group attr_group = {
 	.attrs = g,
 };
 
diff --git a/kernel/power/process.c b/kernel/power/process.c
index c7209f060eeb..78672d324a6e 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -132,7 +132,7 @@ int freeze_processes(void)
 	if (!pm_freezing)
 		atomic_inc(&system_freezing_cnt);
 
-	pm_wakeup_clear();
+	pm_wakeup_clear(true);
 	pr_info("Freezing user space processes ... ");
 	pm_freezing = true;
 	error = try_to_freeze_tasks(true);
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index fa46606f3356..b7708e319941 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -36,13 +36,13 @@
 #include <asm/pgtable.h>
 #include <asm/tlbflush.h>
 #include <asm/io.h>
-#ifdef CONFIG_STRICT_KERNEL_RWX
+#ifdef CONFIG_ARCH_HAS_SET_MEMORY
 #include <asm/set_memory.h>
 #endif
 
 #include "power.h"
 
-#ifdef CONFIG_STRICT_KERNEL_RWX
+#if defined(CONFIG_STRICT_KERNEL_RWX) && defined(CONFIG_ARCH_HAS_SET_MEMORY)
 static bool hibernate_restore_protection;
 static bool hibernate_restore_protection_active;
 
@@ -77,7 +77,7 @@ static inline void hibernate_restore_protection_begin(void) {}
 static inline void hibernate_restore_protection_end(void) {}
 static inline void hibernate_restore_protect_page(void *page_address) {}
 static inline void hibernate_restore_unprotect_page(void *page_address) {}
-#endif /* CONFIG_STRICT_KERNEL_RWX */
+#endif /* CONFIG_STRICT_KERNEL_RWX && CONFIG_ARCH_HAS_SET_MEMORY */
 
 static int swsusp_page_is_free(struct page *);
 static void swsusp_set_page_forbidden(struct page *);
@@ -1929,8 +1929,7 @@ static inline unsigned int alloc_highmem_pages(struct memory_bitmap *bm,
  * also be located in the high memory, because of the way in which
  * copy_data_pages() works.
  */
-static int swsusp_alloc(struct memory_bitmap *orig_bm,
-			struct memory_bitmap *copy_bm,
+static int swsusp_alloc(struct memory_bitmap *copy_bm,
 			unsigned int nr_pages, unsigned int nr_highmem)
 {
 	if (nr_highmem > 0) {
@@ -1976,7 +1975,7 @@ asmlinkage __visible int swsusp_save(void)
 		return -ENOMEM;
 	}
 
-	if (swsusp_alloc(&orig_bm, &copy_bm, nr_pages, nr_highmem)) {
+	if (swsusp_alloc(&copy_bm, nr_pages, nr_highmem)) {
 		printk(KERN_ERR "PM: Memory allocation failed\n");
 		return -ENOMEM;
 	}
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 15e6baef5c73..3ecf275d7e44 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -72,6 +72,8 @@ static void freeze_begin(void)
 
 static void freeze_enter(void)
 {
+	trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_FREEZE, true);
+
 	spin_lock_irq(&suspend_freeze_lock);
 	if (pm_wakeup_pending())
 		goto out;
@@ -84,11 +86,9 @@ static void freeze_enter(void)
 
 	/* Push all the CPUs into the idle loop. */
 	wake_up_all_idle_cpus();
-	pr_debug("PM: suspend-to-idle\n");
 	/* Make the current CPU wait so it can enter the idle loop too. */
 	wait_event(suspend_freeze_wait_head,
 		   suspend_freeze_state == FREEZE_STATE_WAKE);
-	pr_debug("PM: resume from suspend-to-idle\n");
 
 	cpuidle_pause();
 	put_online_cpus();
@@ -98,6 +98,31 @@ static void freeze_enter(void)
 out:
 	suspend_freeze_state = FREEZE_STATE_NONE;
 	spin_unlock_irq(&suspend_freeze_lock);
+
+	trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_FREEZE, false);
+}
+
+static void s2idle_loop(void)
+{
+	pr_debug("PM: suspend-to-idle\n");
+
+	do {
+		freeze_enter();
+
+		if (freeze_ops && freeze_ops->wake)
+			freeze_ops->wake();
+
+		dpm_resume_noirq(PMSG_RESUME);
+		if (freeze_ops && freeze_ops->sync)
+			freeze_ops->sync();
+
+		if (pm_wakeup_pending())
+			break;
+
+		pm_wakeup_clear(false);
+	} while (!dpm_suspend_noirq(PMSG_SUSPEND));
+
+	pr_debug("PM: resume from suspend-to-idle\n");
 }
 
 void freeze_wake(void)
@@ -371,10 +396,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 	 * all the devices are suspended.
 	 */
 	if (state == PM_SUSPEND_FREEZE) {
-		trace_suspend_resume(TPS("machine_suspend"), state, true);
-		freeze_enter();
-		trace_suspend_resume(TPS("machine_suspend"), state, false);
-		goto Platform_wake;
+		s2idle_loop();
+		goto Platform_early_resume;
 	}
 
 	error = disable_nonboot_cpus();
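The new s2idle_loop() above is the heart of the suspend-to-idle rework: instead of treating the first interrupt as a final wakeup, the kernel resumes only the "noirq" level of devices, lets the platform inspect its wakeup sources, and goes back to sleep if the event turns out to be spurious. An annotated restatement of that contract (kernel context, not a standalone program; freeze_ops assumed non-NULL for brevity):

	do {
		/* Idle all CPUs until some IRQ wakes one of them up. */
		freeze_enter();

		/* Let the platform (e.g. ACPI EC) look at its wakeup sources. */
		freeze_ops->wake();

		/* Resume only noirq devices -- just enough to query wakeup state. */
		dpm_resume_noirq(PMSG_RESUME);
		freeze_ops->sync();

		/* A genuine wakeup event ends the loop ... */
		if (pm_wakeup_pending())
			break;

		/* ... a spurious one is discarded and we re-enter suspend. */
		pm_wakeup_clear(false);
	} while (!dpm_suspend_noirq(PMSG_SUSPEND));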
diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c
index 6437ef39aeea..5fd5c5b8c7b8 100644
--- a/tools/power/cpupower/utils/helpers/amd.c
+++ b/tools/power/cpupower/utils/helpers/amd.c
@@ -26,6 +26,15 @@ union msr_pstate {
 		unsigned res3:21;
 		unsigned en:1;
 	} bits;
+	struct {
+		unsigned fid:8;
+		unsigned did:6;
+		unsigned vid:8;
+		unsigned iddval:8;
+		unsigned idddiv:2;
+		unsigned res1:31;
+		unsigned en:1;
+	} fam17h_bits;
 	unsigned long long val;
 };
 
@@ -35,6 +44,8 @@ static int get_did(int family, union msr_pstate pstate)
 
 	if (family == 0x12)
 		t = pstate.val & 0xf;
+	else if (family == 0x17)
+		t = pstate.fam17h_bits.did;
 	else
 		t = pstate.bits.did;
 
@@ -44,16 +55,20 @@ static int get_cof(int family, union msr_pstate pstate)
 static int get_cof(int family, union msr_pstate pstate)
 {
 	int t;
-	int fid, did;
+	int fid, did, cof;
 
 	did = get_did(family, pstate);
-
-	t = 0x10;
-	fid = pstate.bits.fid;
-	if (family == 0x11)
-		t = 0x8;
-
-	return (100 * (fid + t)) >> did;
+	if (family == 0x17) {
+		fid = pstate.fam17h_bits.fid;
+		cof = 200 * fid / did;
+	} else {
+		t = 0x10;
+		fid = pstate.bits.fid;
+		if (family == 0x11)
+			t = 0x8;
+		cof = (100 * (fid + t)) >> did;
+	}
+	return cof;
 }
 
 /* Needs:
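For Family 0x17 the diff computes the core operating frequency directly from the P-state MSR's FID and DID fields as 200 * fid / did, rather than the older (100 * (fid + offset)) >> did formula. A small worked example of the new formula (the FID/DID values are illustrative, not taken from any particular part):

#include <stdio.h>

int main(void)
{
	/* Hypothetical Family 17h P-state: FID = 0x78 (120), DID = 0x8 */
	unsigned int fid = 0x78, did = 0x8;

	/* CoreCOF = (FID / DID) * 200 MHz */
	printf("%u MHz\n", 200 * fid / did);	/* prints: 3000 MHz */
	return 0;
}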
diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h
index afb66f80554e..799a18be60aa 100644
--- a/tools/power/cpupower/utils/helpers/helpers.h
+++ b/tools/power/cpupower/utils/helpers/helpers.h
@@ -70,6 +70,8 @@ enum cpupower_cpu_vendor {X86_VENDOR_UNKNOWN = 0, X86_VENDOR_INTEL,
 #define CPUPOWER_CAP_IS_SNB		0x00000020
 #define CPUPOWER_CAP_INTEL_IDA		0x00000040
 
+#define CPUPOWER_AMD_CPBDIS		0x02000000
+
 #define MAX_HW_PSTATES 10
 
 struct cpupower_cpu_info {
diff --git a/tools/power/cpupower/utils/helpers/misc.c b/tools/power/cpupower/utils/helpers/misc.c
index 1609243f5c64..601d719d4e60 100644
--- a/tools/power/cpupower/utils/helpers/misc.c
+++ b/tools/power/cpupower/utils/helpers/misc.c
@@ -2,11 +2,14 @@
 
 #include "helpers/helpers.h"
 
+#define MSR_AMD_HWCR	0xc0010015
+
 int cpufreq_has_boost_support(unsigned int cpu, int *support, int *active,
 			int *states)
 {
 	struct cpupower_cpu_info cpu_info;
 	int ret;
+	unsigned long long val;
 
 	*support = *active = *states = 0;
 
@@ -16,10 +19,22 @@ int cpufreq_has_boost_support(unsigned int cpu, int *support, int *active,
 
 	if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_CBP) {
 		*support = 1;
-		amd_pci_get_num_boost_states(active, states);
-		if (ret <= 0)
-			return ret;
-		*support = 1;
+
+		/* AMD Family 0x17 does not utilize PCI D18F4 like prior
+		 * families and has no fixed discrete boost states but
+		 * has Hardware determined variable increments instead.
+		 */
+
+		if (cpu_info.family == 0x17) {
+			if (!read_msr(cpu, MSR_AMD_HWCR, &val)) {
+				if (!(val & CPUPOWER_AMD_CPBDIS))
+					*active = 1;
+			}
+		} else {
+			ret = amd_pci_get_num_boost_states(active, states);
+			if (ret)
+				return ret;
+		}
 	} else if (cpupower_cpu_info.caps & CPUPOWER_CAP_INTEL_IDA)
 		*support = *active = 1;
 	return 0;
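On Family 0x17 the boost-active check above no longer counts discrete boost states; it simply tests the core-performance-boost-disable bit (CPUPOWER_AMD_CPBDIS, 0x02000000, i.e. bit 25) in MSR 0xc0010015 (HWCR). A user-space sketch of the same test via the standard /dev/cpu/N/msr interface (error handling trimmed; assumes the msr module is loaded and root privileges):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_AMD_HWCR		0xc0010015
#define CPUPOWER_AMD_CPBDIS	0x02000000ULL	/* bit 25: boost disabled when set */

int main(void)
{
	uint64_t val;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	/* the msr device seeks by MSR number, so pread at MSR_AMD_HWCR */
	if (fd < 0 || pread(fd, &val, sizeof(val), MSR_AMD_HWCR) != sizeof(val))
		return 1;
	printf("boost %s\n", (val & CPUPOWER_AMD_CPBDIS) ? "disabled" : "active");
	return 0;
}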
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index b11294730771..0dafba2c1e7d 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -57,7 +57,6 @@ unsigned int list_header_only;
 unsigned int dump_only;
 unsigned int do_snb_cstates;
 unsigned int do_knl_cstates;
-unsigned int do_skl_residency;
 unsigned int do_slm_cstates;
 unsigned int use_c1_residency_msr;
 unsigned int has_aperf;
@@ -93,6 +92,7 @@ unsigned int do_ring_perf_limit_reasons;
 unsigned int crystal_hz;
 unsigned long long tsc_hz;
 int base_cpu;
+int do_migrate;
 double discover_bclk(unsigned int family, unsigned int model);
 unsigned int has_hwp;	/* IA32_PM_ENABLE, IA32_HWP_CAPABILITIES */
 			/* IA32_HWP_REQUEST, IA32_HWP_STATUS */
@@ -151,6 +151,8 @@ size_t cpu_present_setsize, cpu_affinity_setsize, cpu_subset_size;
 #define MAX_ADDED_COUNTERS 16
 
 struct thread_data {
+	struct timeval tv_begin;
+	struct timeval tv_end;
 	unsigned long long tsc;
 	unsigned long long aperf;
 	unsigned long long mperf;
@@ -301,6 +303,9 @@ int for_all_cpus(int (func)(struct thread_data *, struct core_data *, struct pkg
 
 int cpu_migrate(int cpu)
 {
+	if (!do_migrate)
+		return 0;
+
 	CPU_ZERO_S(cpu_affinity_setsize, cpu_affinity_set);
 	CPU_SET_S(cpu, cpu_affinity_setsize, cpu_affinity_set);
 	if (sched_setaffinity(0, cpu_affinity_setsize, cpu_affinity_set) == -1)
@@ -384,8 +389,14 @@ struct msr_counter bic[] = {
 	{ 0x0, "CPU" },
 	{ 0x0, "Mod%c6" },
 	{ 0x0, "sysfs" },
+	{ 0x0, "Totl%C0" },
+	{ 0x0, "Any%C0" },
+	{ 0x0, "GFX%C0" },
+	{ 0x0, "CPUGFX%" },
 };
 
+
+
 #define MAX_BIC (sizeof(bic) / sizeof(struct msr_counter))
 #define BIC_Package	(1ULL << 0)
 #define BIC_Avg_MHz	(1ULL << 1)
@@ -426,6 +437,10 @@ struct msr_counter bic[] = {
 #define BIC_CPU		(1ULL << 36)
 #define BIC_Mod_c6	(1ULL << 37)
 #define BIC_sysfs	(1ULL << 38)
+#define BIC_Totl_c0	(1ULL << 39)
+#define BIC_Any_c0	(1ULL << 40)
+#define BIC_GFX_c0	(1ULL << 41)
+#define BIC_CPUGFX	(1ULL << 42)
 
 unsigned long long bic_enabled = 0xFFFFFFFFFFFFFFFFULL;
 unsigned long long bic_present = BIC_sysfs;
@@ -521,6 +536,8 @@ void print_header(char *delim)
 	struct msr_counter *mp;
 	int printed = 0;
 
+	if (debug)
+		outp += sprintf(outp, "usec %s", delim);
 	if (DO_BIC(BIC_Package))
 		outp += sprintf(outp, "%sPackage", (printed++ ? delim : ""));
 	if (DO_BIC(BIC_Core))
@@ -599,12 +616,14 @@ void print_header(char *delim)
 	if (DO_BIC(BIC_GFXMHz))
 		outp += sprintf(outp, "%sGFXMHz", (printed++ ? delim : ""));
 
-	if (do_skl_residency) {
+	if (DO_BIC(BIC_Totl_c0))
 		outp += sprintf(outp, "%sTotl%%C0", (printed++ ? delim : ""));
+	if (DO_BIC(BIC_Any_c0))
 		outp += sprintf(outp, "%sAny%%C0", (printed++ ? delim : ""));
+	if (DO_BIC(BIC_GFX_c0))
 		outp += sprintf(outp, "%sGFX%%C0", (printed++ ? delim : ""));
+	if (DO_BIC(BIC_CPUGFX))
 		outp += sprintf(outp, "%sCPUGFX%%", (printed++ ? delim : ""));
-	}
 
 	if (DO_BIC(BIC_Pkgpc2))
 		outp += sprintf(outp, "%sPkg%%pc2", (printed++ ? delim : ""));
@@ -771,6 +790,14 @@ int format_counters(struct thread_data *t, struct core_data *c,
 	    (cpu_subset && !CPU_ISSET_S(t->cpu_id, cpu_subset_size, cpu_subset)))
 		return 0;
 
+	if (debug) {
+		/* on each row, print how many usec each timestamp took to gather */
+		struct timeval tv;
+
+		timersub(&t->tv_end, &t->tv_begin, &tv);
+		outp += sprintf(outp, "%5ld\t", tv.tv_sec * 1000000 + tv.tv_usec);
+	}
+
 	interval_float = tv_delta.tv_sec + tv_delta.tv_usec/1000000.0;
 
 	tsc = t->tsc * tsc_tweak;
@@ -912,12 +939,14 @@ int format_counters(struct thread_data *t, struct core_data *c,
 		outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), p->gfx_mhz);
 
 	/* Totl%C0, Any%C0 GFX%C0 CPUGFX% */
-	if (do_skl_residency) {
+	if (DO_BIC(BIC_Totl_c0))
 		outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_wtd_core_c0/tsc);
+	if (DO_BIC(BIC_Any_c0))
 		outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_any_core_c0/tsc);
+	if (DO_BIC(BIC_GFX_c0))
 		outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_any_gfxe_c0/tsc);
+	if (DO_BIC(BIC_CPUGFX))
 		outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_both_core_gfxe_c0/tsc);
-	}
 
 	if (DO_BIC(BIC_Pkgpc2))
 		outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc2/tsc);
@@ -1038,12 +1067,16 @@ delta_package(struct pkg_data *new, struct pkg_data *old)
 	int i;
 	struct msr_counter *mp;
 
-	if (do_skl_residency) {
+
+	if (DO_BIC(BIC_Totl_c0))
 		old->pkg_wtd_core_c0 = new->pkg_wtd_core_c0 - old->pkg_wtd_core_c0;
+	if (DO_BIC(BIC_Any_c0))
 		old->pkg_any_core_c0 = new->pkg_any_core_c0 - old->pkg_any_core_c0;
+	if (DO_BIC(BIC_GFX_c0))
 		old->pkg_any_gfxe_c0 = new->pkg_any_gfxe_c0 - old->pkg_any_gfxe_c0;
+	if (DO_BIC(BIC_CPUGFX))
 		old->pkg_both_core_gfxe_c0 = new->pkg_both_core_gfxe_c0 - old->pkg_both_core_gfxe_c0;
-	}
+
 	old->pc2 = new->pc2 - old->pc2;
 	if (DO_BIC(BIC_Pkgpc3))
 		old->pc3 = new->pc3 - old->pc3;
@@ -1292,12 +1325,14 @@ int sum_counters(struct thread_data *t, struct core_data *c,
 	if (!(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
 		return 0;
 
-	if (do_skl_residency) {
+	if (DO_BIC(BIC_Totl_c0))
 		average.packages.pkg_wtd_core_c0 += p->pkg_wtd_core_c0;
+	if (DO_BIC(BIC_Any_c0))
 		average.packages.pkg_any_core_c0 += p->pkg_any_core_c0;
+	if (DO_BIC(BIC_GFX_c0))
 		average.packages.pkg_any_gfxe_c0 += p->pkg_any_gfxe_c0;
+	if (DO_BIC(BIC_CPUGFX))
 		average.packages.pkg_both_core_gfxe_c0 += p->pkg_both_core_gfxe_c0;
-	}
 
 	average.packages.pc2 += p->pc2;
 	if (DO_BIC(BIC_Pkgpc3))
@@ -1357,12 +1392,14 @@ void compute_average(struct thread_data *t, struct core_data *c,
 	average.cores.c7 /= topo.num_cores;
 	average.cores.mc6_us /= topo.num_cores;
 
-	if (do_skl_residency) {
+	if (DO_BIC(BIC_Totl_c0))
 		average.packages.pkg_wtd_core_c0 /= topo.num_packages;
+	if (DO_BIC(BIC_Any_c0))
 		average.packages.pkg_any_core_c0 /= topo.num_packages;
+	if (DO_BIC(BIC_GFX_c0))
 		average.packages.pkg_any_gfxe_c0 /= topo.num_packages;
+	if (DO_BIC(BIC_CPUGFX))
 		average.packages.pkg_both_core_gfxe_c0 /= topo.num_packages;
-	}
 
 	average.packages.pc2 /= topo.num_packages;
 	if (DO_BIC(BIC_Pkgpc3))
@@ -1482,6 +1519,9 @@ int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 	struct msr_counter *mp;
 	int i;
 
+
+	gettimeofday(&t->tv_begin, (struct timezone *)NULL);
+
 	if (cpu_migrate(cpu)) {
 		fprintf(outf, "Could not migrate to CPU %d\n", cpu);
 		return -1;
@@ -1565,7 +1605,7 @@ retry:
 
 	/* collect core counters only for 1st thread in core */
 	if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE))
-		return 0;
+		goto done;
 
 	if (DO_BIC(BIC_CPU_c3) && !do_slm_cstates && !do_knl_cstates) {
 		if (get_msr(cpu, MSR_CORE_C3_RESIDENCY, &c->c3))
@@ -1601,15 +1641,21 @@ retry:
 
 	/* collect package counters only for 1st core in package */
 	if (!(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
-		return 0;
+		goto done;
 
-	if (do_skl_residency) {
+	if (DO_BIC(BIC_Totl_c0)) {
 		if (get_msr(cpu, MSR_PKG_WEIGHTED_CORE_C0_RES, &p->pkg_wtd_core_c0))
 			return -10;
+	}
+	if (DO_BIC(BIC_Any_c0)) {
 		if (get_msr(cpu, MSR_PKG_ANY_CORE_C0_RES, &p->pkg_any_core_c0))
 			return -11;
+	}
+	if (DO_BIC(BIC_GFX_c0)) {
 		if (get_msr(cpu, MSR_PKG_ANY_GFXE_C0_RES, &p->pkg_any_gfxe_c0))
 			return -12;
+	}
+	if (DO_BIC(BIC_CPUGFX)) {
 		if (get_msr(cpu, MSR_PKG_BOTH_CORE_GFXE_C0_RES, &p->pkg_both_core_gfxe_c0))
 			return -13;
 	}
@@ -1688,6 +1734,8 @@ retry:
 		if (get_mp(cpu, mp, &p->counter[i]))
 			return -10;
 	}
+done:
+	gettimeofday(&t->tv_end, (struct timezone *)NULL);
 
 	return 0;
 }
@@ -3895,6 +3943,9 @@ void decode_misc_enable_msr(void)
 {
 	unsigned long long msr;
 
+	if (!genuine_intel)
+		return;
+
 	if (!get_msr(base_cpu, MSR_IA32_MISC_ENABLE, &msr))
 		fprintf(outf, "cpu%d: MSR_IA32_MISC_ENABLE: 0x%08llx (%sTCC %sEIST %sMWAIT %sPREFETCH %sTURBO)\n",
 			base_cpu, msr,
@@ -4198,7 +4249,12 @@ void process_cpuid()
 		BIC_PRESENT(BIC_Pkgpc10);
 	}
 	do_irtl_hsw = has_hsw_msrs(family, model);
-	do_skl_residency = has_skl_msrs(family, model);
+	if (has_skl_msrs(family, model)) {
+		BIC_PRESENT(BIC_Totl_c0);
+		BIC_PRESENT(BIC_Any_c0);
+		BIC_PRESENT(BIC_GFX_c0);
+		BIC_PRESENT(BIC_CPUGFX);
+	}
 	do_slm_cstates = is_slm(family, model);
 	do_knl_cstates = is_knl(family, model);
 
@@ -4578,7 +4634,7 @@ int get_and_dump_counters(void)
 }
 
 void print_version() {
-	fprintf(outf, "turbostat version 17.04.12"
+	fprintf(outf, "turbostat version 17.06.23"
 		" - Len Brown <lenb@kernel.org>\n");
 }
 
@@ -4951,6 +5007,7 @@ void cmdline(int argc, char **argv)
 		{"hide",	required_argument,	0, 'H'},	// meh, -h taken by --help
 		{"Joules",	no_argument,		0, 'J'},
 		{"list",	no_argument,		0, 'l'},
+		{"migrate",	no_argument,		0, 'm'},
 		{"out",		required_argument,	0, 'o'},
 		{"quiet",	no_argument,		0, 'q'},
 		{"show",	required_argument,	0, 's'},
@@ -4962,7 +5019,7 @@ void cmdline(int argc, char **argv)
 
 	progname = argv[0];
 
-	while ((opt = getopt_long_only(argc, argv, "+C:c:Ddhi:JM:m:o:qST:v",
+	while ((opt = getopt_long_only(argc, argv, "+C:c:Ddhi:Jmo:qST:v",
 				long_options, &option_index)) != -1) {
 		switch (opt) {
 		case 'a':
@@ -5005,6 +5062,9 @@ void cmdline(int argc, char **argv)
 			list_header_only++;
 			quiet++;
 			break;
+		case 'm':
+			do_migrate = 1;
+			break;
 		case 'o':
 			outf = fopen_or_die(optarg, "w");
 			break;
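Throughout the turbostat changes above, each former do_skl_residency test is replaced by a per-column DO_BIC() check, so the four Skylake package-C0 residency columns can be shown or hidden independently via the existing --show/--hide options. The macro bodies are not part of these hunks; paraphrased from turbostat.c as an assumption, they amount to two bitmask operations:

/* Sketch only: a column is printed when the platform marked it present
 * AND the user left it enabled (names mirror the diff). */
#define DO_BIC(COUNTER_NAME)	 (bic_enabled & bic_present & COUNTER_NAME)
#define BIC_PRESENT(COUNTER_BIT) (bic_present |= COUNTER_BIT)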
diff --git a/tools/power/x86/x86_energy_perf_policy/Makefile b/tools/power/x86/x86_energy_perf_policy/Makefile
index 971c9ffdcb50..a711eec0c895 100644
--- a/tools/power/x86/x86_energy_perf_policy/Makefile
+++ b/tools/power/x86/x86_energy_perf_policy/Makefile
@@ -1,10 +1,27 @@
-DESTDIR ?=
+CC		= $(CROSS_COMPILE)gcc
+BUILD_OUTPUT	:= $(CURDIR)
+PREFIX		:= /usr
+DESTDIR		:=
+
+ifeq ("$(origin O)", "command line")
+	BUILD_OUTPUT := $(O)
+endif
 
 x86_energy_perf_policy : x86_energy_perf_policy.c
+CFLAGS +=	-Wall
+CFLAGS +=	-DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
+
+%: %.c
+	@mkdir -p $(BUILD_OUTPUT)
+	$(CC) $(CFLAGS) $< -o $(BUILD_OUTPUT)/$@
 
+.PHONY : clean
 clean :
-	rm -f x86_energy_perf_policy
+	@rm -f $(BUILD_OUTPUT)/x86_energy_perf_policy
+
+install : x86_energy_perf_policy
+	install -d  $(DESTDIR)$(PREFIX)/bin
+	install $(BUILD_OUTPUT)/x86_energy_perf_policy $(DESTDIR)$(PREFIX)/bin/x86_energy_perf_policy
+	install -d  $(DESTDIR)$(PREFIX)/share/man/man8
+	install x86_energy_perf_policy.8 $(DESTDIR)$(PREFIX)/share/man/man8
 
-install :
-	install x86_energy_perf_policy ${DESTDIR}/usr/bin/
-	install x86_energy_perf_policy.8 ${DESTDIR}/usr/share/man/man8/
diff --git a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.8 b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.8
index 8eaaad648cdb..17db1c3af4d0 100644
--- a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.8
+++ b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.8
@@ -1,104 +1,213 @@
1.\" This page Copyright (C) 2010 Len Brown <len.brown@intel.com> 1.\" This page Copyright (C) 2010 - 2015 Len Brown <len.brown@intel.com>
2.\" Distributed under the GPL, Copyleft 1994. 2.\" Distributed under the GPL, Copyleft 1994.
3.TH X86_ENERGY_PERF_POLICY 8 3.TH X86_ENERGY_PERF_POLICY 8
4.SH NAME 4.SH NAME
5x86_energy_perf_policy \- read or write MSR_IA32_ENERGY_PERF_BIAS 5x86_energy_perf_policy \- Manage Energy vs. Performance Policy via x86 Model Specific Registers
6.SH SYNOPSIS 6.SH SYNOPSIS
7.ft B
8.B x86_energy_perf_policy 7.B x86_energy_perf_policy
9.RB [ "\-c cpu" ] 8.RB "[ options ] [ scope ] [field \ value]"
10.RB [ "\-v" ]
11.RB "\-r"
12.br 9.br
13.B x86_energy_perf_policy 10.RB "scope: \-\-cpu\ cpu-list | \-\-pkg\ pkg-list"
14.RB [ "\-c cpu" ]
15.RB [ "\-v" ]
16.RB 'performance'
17.br 11.br
18.B x86_energy_perf_policy 12.RB "cpu-list, pkg-list: # | #,# | #-# | all"
19.RB [ "\-c cpu" ]
20.RB [ "\-v" ]
21.RB 'normal'
22.br 13.br
23.B x86_energy_perf_policy 14.RB "field: \-\-all | \-\-epb | \-\-hwp-epp | \-\-hwp-min | \-\-hwp-max | \-\-hwp-desired"
24.RB [ "\-c cpu" ]
25.RB [ "\-v" ]
26.RB 'powersave'
27.br 15.br
28.B x86_energy_perf_policy 16.RB "other: (\-\-force | \-\-hwp-enable | \-\-turbo-enable) value)"
29.RB [ "\-c cpu" ]
30.RB [ "\-v" ]
31.RB n
32.br 17.br
18.RB "value: # | default | performance | balance-performance | balance-power | power"
33.SH DESCRIPTION 19.SH DESCRIPTION
34\fBx86_energy_perf_policy\fP 20\fBx86_energy_perf_policy\fP
35allows software to convey 21displays and updates energy-performance policy settings specific to
36its policy for the relative importance of performance 22Intel Architecture Processors. Settings are accessed via Model Specific Register (MSR)
37versus energy savings to the processor. 23updates, no matter if the Linux cpufreq sub-system is enabled or not.
38 24
39The processor uses this information in model-specific ways 25Policy in MSR_IA32_ENERGY_PERF_BIAS (EPB)
40when it must select trade-offs between performance and 26may affect a wide range of hardware decisions,
41energy efficiency. 27such as how aggressively the hardware enters and exits CPU idle states (C-states)
28and Processor Performance States (P-states).
29This policy hint does not replace explicit OS C-state and P-state selection.
30Rather, it tells the hardware how aggressively to implement those selections.
31Further, it allows the OS to influence energy/performance trade-offs where there
32is no software interface, such as in the opportunistic "turbo-mode" P-state range.
33Note that MSR_IA32_ENERGY_PERF_BIAS is defined per CPU,
34but some implementations
35share a single MSR among all CPUs in each processor package.
36On those systems, a write to EPB on one processor will
37be visible, and will have an effect, on all CPUs
38in the same processor package.
42 39
43This policy hint does not supersede Processor Performance states 40Hardware P-States (HWP) are effectively an expansion of hardware
44(P-states) or CPU Idle power states (C-states), but allows 41P-state control from the opportunistic turbo-mode P-state range
45software to have influence where it would otherwise be unable 42to include the entire range of available P-states.
46to express a preference. 43On Broadwell Xeon, the initial HWP implementation, EBP influenced HWP.
44That influence was removed in subsequent generations,
45where it was moved to the
46Energy_Performance_Preference (EPP) field in
47a pair of dedicated MSRs -- MSR_IA32_HWP_REQUEST and MSR_IA32_HWP_REQUEST_PKG.
47 48
48For example, this setting may tell the hardware how 49EPP is the most commonly managed knob in HWP mode,
49aggressively or conservatively to control frequency 50but MSR_IA32_HWP_REQUEST also allows the user to specify
50in the "turbo range" above the explicitly OS-controlled 51minimum-frequency for Quality-of-Service,
51P-state frequency range. It may also tell the hardware 52and maximum-frequency for power-capping.
52how aggressively is should enter the OS requested C-states. 53MSR_IA32_HWP_REQUEST is defined per-CPU.
53 54
54Support for this feature is indicated by CPUID.06H.ECX.bit3 55MSR_IA32_HWP_REQUEST_PKG has the same capability as MSR_IA32_HWP_REQUEST,
55per the Intel Architectures Software Developer's Manual. 56but it can simultaneously set the default policy for all CPUs within a package.
57A bit in per-CPU MSR_IA32_HWP_REQUEST indicates whether it is
58over-ruled-by or exempt-from MSR_IA32_HWP_REQUEST_PKG.
56 59
57.SS Options 60MSR_HWP_CAPABILITIES shows the default values for the fields
58\fB-c\fP limits operation to a single CPU. 61in MSR_IA32_HWP_REQUEST. It is displayed when no values
59The default is to operate on all CPUs. 62are being written.
-Note that MSR_IA32_ENERGY_PERF_BIAS is defined per
-logical processor, but that the initial implementations
-of the MSR were shared among all processors in each package.
-.PP
-\fB-v\fP increases verbosity. By default
-x86_energy_perf_policy is silent.
-.PP
-\fB-r\fP is for "read-only" mode - the unchanged state
-is read and displayed.
+
+.SS SCOPE OPTIONS
 .PP
-.I performance
-Set a policy where performance is paramount.
-The processor will be unwilling to sacrifice any performance
-for the sake of energy saving. This is the hardware default.
+\fB-c, --cpu\fP Operate on the MSR_IA32_HWP_REQUEST for each CPU in a CPU-list.
+The CPU-list may be comma-separated CPU numbers, with a dash for ranges,
+or the string "all", e.g. '--cpu 1,4,6-8' or '--cpu all'.
+When --cpu is used, \fB--hwp-use-pkg\fP is available, which specifies whether the per-cpu
+MSR_IA32_HWP_REQUEST should be overruled by MSR_IA32_HWP_REQUEST_PKG (1),
+or exempt from MSR_IA32_HWP_REQUEST_PKG (0).
+
+\fB-p, --pkg\fP Operate on the MSR_IA32_HWP_REQUEST_PKG for each package in the package-list.
+The list is a string of individual package numbers separated
+by commas, and/or ranges of package numbers separated by a dash,
+or the string "all".
+For example, '--pkg 1,3' or '--pkg all'.
+
+.SS VALUE OPTIONS
 .PP
-.I normal
+.I normal | default
 Set a policy with a normal balance between performance and energy efficiency.
 The processor will tolerate minor performance compromise
 for potentially significant energy savings.
-This reasonable default for most desktops and servers.
+This is a reasonable default for most desktops and servers.
+"default" is a synonym for "normal".
 .PP
-.I powersave
+.I performance
+Set a policy for maximum performance,
+accepting no performance sacrifice for the benefit of energy efficiency.
+.PP
+.I balance-performance
+Set a policy with a high priority on performance,
+but allowing some performance loss to benefit energy efficiency.
+.PP
+.I balance-power
+Set a policy where performance and power are balanced.
+This is the default.
+.PP
+.I power
 Set a policy where the processor can accept
-a measurable performance hit to maximize energy efficiency.
+a measurable performance impact to maximize energy efficiency.
+
 .PP
-.I n
-Set MSR_IA32_ENERGY_PERF_BIAS to the specified number.
-The range of valid numbers is 0-15, where 0 is maximum
-performance and 15 is maximum energy efficiency.
+The following table shows the mapping from the value strings above to actual MSR values.
+This mapping is defined in the Linux kernel header, msr-index.h.
 
+.nf
+VALUE STRING            EPB     EPP
+performance               0       0
+balance-performance       4     128
+normal, default           6     128
+balance-power             8     192
+power                    15     255
+.fi
+.PP
+For MSR_IA32_HWP_REQUEST performance fields
+(--hwp-min, --hwp-max, --hwp-desired), the value option
+is in units of 100 MHz, e.g. 12 signifies 1200 MHz.
+
+.SS FIELD OPTIONS
+\fB-a, --all value-string\fP Sets all EPB, EPP, and HWP limit fields to the value associated with
+the value-string. In addition, enables turbo-mode and HWP-mode, if they were previously disabled.
+Thus "--all normal" will set a system without cpufreq into a well-known configuration.
+.PP
+\fB-B, --epb\fP set EPB per-core or per-package.
+See value strings in the table above.
+.PP
+\fB-d, --debug\fP increases verbosity. By default
+x86_energy_perf_policy is silent for updates,
+and verbose for read-only mode.
+.PP
+\fB-P, --hwp-epp\fP set HWP.EPP per-core or per-package.
+See value strings in the table above.
+.PP
+\fB-m, --hwp-min\fP request HWP to not go below the specified core/bus ratio.
+The "default" is the value found in IA32_HWP_CAPABILITIES.min.
+.PP
+\fB-M, --hwp-max\fP request HWP to not exceed the specified core/bus ratio.
+The "default" is the value found in IA32_HWP_CAPABILITIES.max.
+.PP
+\fB-D, --hwp-desired\fP request HWP 'desired' frequency.
+The "normal" setting is 0, which
+corresponds to 'full autonomous' HWP control.
+Non-zero performance values request a specific performance
+level on this processor, specified in multiples of 100 MHz.
+.PP
+\fB-w, --hwp-window\fP specify the integer number of microseconds
+in the sliding window that HWP uses to maintain average frequency.
+This parameter is meaningful only when the "desired" field above is non-zero.
+Default is 0, allowing the hardware to choose.
+.SH OTHER OPTIONS
+.PP
+\fB-f, --force\fP writes the specified values without bounds checking.
+.PP
+\fB-U, --hwp-use-pkg\fP (0 | 1), when used in conjunction with --cpu,
+indicates whether the per-CPU MSR_IA32_HWP_REQUEST should be overruled (1)
+or exempt (0) from per-Package MSR_IA32_HWP_REQUEST_PKG settings.
+The default is exempt.
+.PP
+\fB-H, --hwp-enable\fP enable HardWare-P-state (HWP) mode. Once enabled, a system RESET is required to disable HWP mode.
+.PP
+\fB-t, --turbo-enable\fP enable (1) or disable (0) turbo mode.
+.PP
+\fB-v, --version\fP print version and exit.
+.PP
+If no request to change policy is made,
+the default behavior is to read
+and display the current system state,
+including the default capabilities.
+.SH WARNING
+.PP
+This utility writes directly to Model Specific Registers.
+There is no locking or coordination should this utility
+be used to modify HWP limit fields at the same time that
+intel_pstate's sysfs attributes access the same MSRs.
+.PP
+Note that --hwp-desired and --hwp-window are considered experimental.
+Future versions of Linux reserve the right to access these
+fields internally -- potentially conflicting with user-space access.
+.SH EXAMPLE
+.nf
+# sudo x86_energy_perf_policy
+cpu0: EPB 6
+cpu0: HWP_REQ: min 6 max 35 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
+cpu0: HWP_CAP: low 1 eff 8 guar 27 high 35
+cpu1: EPB 6
+cpu1: HWP_REQ: min 6 max 35 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
+cpu1: HWP_CAP: low 1 eff 8 guar 27 high 35
+cpu2: EPB 6
+cpu2: HWP_REQ: min 6 max 35 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
+cpu2: HWP_CAP: low 1 eff 8 guar 27 high 35
+cpu3: EPB 6
+cpu3: HWP_REQ: min 6 max 35 des 0 epp 128 window 0x0 (0*10^0us) use_pkg 0
+cpu3: HWP_CAP: low 1 eff 8 guar 27 high 35
+.fi
 .SH NOTES
-.B "x86_energy_perf_policy "
+.B "x86_energy_perf_policy"
 runs only as root.
 .SH FILES
 .ta
 .nf
 /dev/cpu/*/msr
 .fi
-
 .SH "SEE ALSO"
+.nf
 msr(4)
+Intel(R) 64 and IA-32 Architectures Software Developer's Manual
+.fi
 .PP
 .SH AUTHORS
 .nf
-Written by Len Brown <len.brown@intel.com>
+Len Brown
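
The EPB column in the table above occupies bits 3:0 of MSR_IA32_ENERGY_PERF_BIAS (0x1b0). As a minimal sketch, not part of this patch, the table can be cross-checked from user space against the msr driver (assumes the msr module is loaded and root privileges):

/* Hypothetical helper: print cpu0's EPB, 0=performance ... 15=power. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	unsigned long long msr;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	if (fd < 0)
		return 1;
	/* MSR_IA32_ENERGY_PERF_BIAS is register offset 0x1b0 */
	if (pread(fd, &msr, sizeof(msr), 0x1b0) != sizeof(msr))
		return 2;
	printf("cpu0: EPB %llu\n", msr & 0xf);
	close(fd);
	return 0;
}
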
diff --git a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
index 40b3e5482f8a..65bbe627a425 100644
--- a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
+++ b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
@@ -3,322 +3,1424 @@
3 * policy preference bias on recent X86 processors. 3 * policy preference bias on recent X86 processors.
4 */ 4 */
5/* 5/*
6 * Copyright (c) 2010, Intel Corporation. 6 * Copyright (c) 2010 - 2017 Intel Corporation.
7 * Len Brown <len.brown@intel.com> 7 * Len Brown <len.brown@intel.com>
8 * 8 *
9 * This program is free software; you can redistribute it and/or modify it 9 * This program is released under GPL v2
10 * under the terms and conditions of the GNU General Public License,
11 * version 2, as published by the Free Software Foundation.
12 *
13 * This program is distributed in the hope it will be useful, but WITHOUT
14 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
15 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
16 * more details.
17 *
18 * You should have received a copy of the GNU General Public License along with
19 * this program; if not, write to the Free Software Foundation, Inc.,
20 * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
21 */ 10 */
22 11
12#define _GNU_SOURCE
13#include MSRHEADER
23#include <stdio.h> 14#include <stdio.h>
24#include <unistd.h> 15#include <unistd.h>
25#include <sys/types.h> 16#include <sys/types.h>
17#include <sched.h>
26#include <sys/stat.h> 18#include <sys/stat.h>
27#include <sys/resource.h> 19#include <sys/resource.h>
20#include <getopt.h>
21#include <err.h>
28#include <fcntl.h> 22#include <fcntl.h>
29#include <signal.h> 23#include <signal.h>
30#include <sys/time.h> 24#include <sys/time.h>
25#include <limits.h>
31#include <stdlib.h> 26#include <stdlib.h>
32#include <string.h> 27#include <string.h>
28#include <cpuid.h>
29#include <errno.h>
30
31#define OPTARG_NORMAL (INT_MAX - 1)
32#define OPTARG_POWER (INT_MAX - 2)
33#define OPTARG_BALANCE_POWER (INT_MAX - 3)
34#define OPTARG_BALANCE_PERFORMANCE (INT_MAX - 4)
35#define OPTARG_PERFORMANCE (INT_MAX - 5)
36
37struct msr_hwp_cap {
38 unsigned char highest;
39 unsigned char guaranteed;
40 unsigned char efficient;
41 unsigned char lowest;
42};
33 43
34unsigned int verbose; /* set with -v */ 44struct msr_hwp_request {
35unsigned int read_only; /* set with -r */ 45 unsigned char hwp_min;
46 unsigned char hwp_max;
47 unsigned char hwp_desired;
48 unsigned char hwp_epp;
49 unsigned int hwp_window;
50 unsigned char hwp_use_pkg;
51} req_update;
52
53unsigned int debug;
54unsigned int verbose;
55unsigned int force;
36char *progname; 56char *progname;
37unsigned long long new_bias; 57int base_cpu;
38int cpu = -1; 58unsigned char update_epb;
59unsigned long long new_epb;
60unsigned char turbo_is_enabled;
61unsigned char update_turbo;
62unsigned char turbo_update_value;
63unsigned char update_hwp_epp;
64unsigned char update_hwp_min;
65unsigned char update_hwp_max;
66unsigned char update_hwp_desired;
67unsigned char update_hwp_window;
68unsigned char update_hwp_use_pkg;
69unsigned char update_hwp_enable;
70#define hwp_update_enabled() (update_hwp_enable | update_hwp_epp | update_hwp_max | update_hwp_min | update_hwp_desired | update_hwp_window | update_hwp_use_pkg)
71int max_cpu_num;
72int max_pkg_num;
73#define MAX_PACKAGES 64
74unsigned int first_cpu_in_pkg[MAX_PACKAGES];
75unsigned long long pkg_present_set;
76unsigned long long pkg_selected_set;
77cpu_set_t *cpu_present_set;
78cpu_set_t *cpu_selected_set;
79int genuine_intel;
80
81size_t cpu_setsize;
82
83char *proc_stat = "/proc/stat";
84
85unsigned int has_epb; /* MSR_IA32_ENERGY_PERF_BIAS */
86unsigned int has_hwp; /* IA32_PM_ENABLE, IA32_HWP_CAPABILITIES */
87 /* IA32_HWP_REQUEST, IA32_HWP_STATUS */
88unsigned int has_hwp_notify; /* IA32_HWP_INTERRUPT */
89unsigned int has_hwp_activity_window; /* IA32_HWP_REQUEST[bits 41:32] */
90unsigned int has_hwp_epp; /* IA32_HWP_REQUEST[bits 31:24] */
91unsigned int has_hwp_request_pkg; /* IA32_HWP_REQUEST_PKG */
92
93unsigned int bdx_highest_ratio;
39 94
40/* 95/*
41 * Usage: 96 * maintain compatibility with original implementation, but don't document it:
42 *
43 * -c cpu: limit action to a single CPU (default is all CPUs)
44 * -v: verbose output (can invoke more than once)
45 * -r: read-only, don't change any settings
46 *
47 * performance
48 * Performance is paramount.
49 * Unwilling to sacrifice any performance
50 * for the sake of energy saving. (hardware default)
51 *
52 * normal
53 * Can tolerate minor performance compromise
54 * for potentially significant energy savings.
55 * (reasonable default for most desktops and servers)
56 *
57 * powersave
58 * Can tolerate significant performance hit
59 * to maximize energy savings.
60 *
61 * n
62 * a numerical value to write to the underlying MSR.
63 */ 97 */
64void usage(void) 98void usage(void)
65{ 99{
66 printf("%s: [-c cpu] [-v] " 100 fprintf(stderr, "%s [options] [scope][field value]\n", progname);
67 "(-r | 'performance' | 'normal' | 'powersave' | n)\n", 101 fprintf(stderr, "scope: --cpu cpu-list [--hwp-use-pkg #] | --pkg pkg-list\n");
68 progname); 102 fprintf(stderr, "field: --all | --epb | --hwp-epp | --hwp-min | --hwp-max | --hwp-desired\n");
103 fprintf(stderr, "other: --hwp-enable | --turbo-enable (0 | 1) | --help | --force\n");
104 fprintf(stderr,
105 "value: ( # | \"normal\" | \"performance\" | \"balance-performance\" | \"balance-power\"| \"power\")\n");
106 fprintf(stderr, "--hwp-window usec\n");
107
108 fprintf(stderr, "Specify only Energy Performance BIAS (legacy usage):\n");
109 fprintf(stderr, "%s: [-c cpu] [-v] (-r | policy-value )\n", progname);
110
69 exit(1); 111 exit(1);
70} 112}
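
For orientation, two invocations consistent with this usage text (hypothetical values):

	# x86_energy_perf_policy --cpu 2,3 --hwp-max 30
	# x86_energy_perf_policy --all balance-power
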
71 113
72#define MSR_IA32_ENERGY_PERF_BIAS 0x000001b0 114/*
115 * If bdx_highest_ratio is set,
116 * then we must translate between MSR format and simple ratio
117 * used on the cmdline.
118 */
119int ratio_2_msr_perf(int ratio)
120{
121 int msr_perf;
122
123 if (!bdx_highest_ratio)
124 return ratio;
125
126 msr_perf = ratio * 255 / bdx_highest_ratio;
127
128 if (debug)
129 fprintf(stderr, "%d = ratio_to_msr_perf(%d)\n", msr_perf, ratio);
130
131 return msr_perf;
132}
133int msr_perf_2_ratio(int msr_perf)
134{
135 int ratio;
136 double d;
137
138 if (!bdx_highest_ratio)
139 return msr_perf;
140
141 d = (double)msr_perf * (double) bdx_highest_ratio / 255.0;
142 d = d + 0.5; /* round */
143 ratio = (int)d;
144
145 if (debug)
146 fprintf(stderr, "%d = msr_perf_ratio(%d) {%f}\n", ratio, msr_perf, d);
147
148 return ratio;
149}
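
These two helpers are pass-throughs on most parts; only when bdx_highest_ratio is set (Broadwell-X, model 0x4F, probed in early_cpuid() below) do they rescale between the 0-255 HWP MSR format and the bus-clock ratio used on the command line. A worked round trip with a made-up highest ratio of 24 (2400 MHz):

/* Example only: with bdx_highest_ratio == 24,
 * ratio_2_msr_perf(12) == 12 * 255 / 24 == 127, and
 * msr_perf_2_ratio(127) == (int)(127 * 24 / 255.0 + 0.5) == 12. */
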
150int parse_cmdline_epb(int i)
151{
152 if (!has_epb)
153 errx(1, "EPB not enabled on this platform");
154
155 update_epb = 1;
156
157 switch (i) {
158 case OPTARG_POWER:
159 return ENERGY_PERF_BIAS_POWERSAVE;
160 case OPTARG_BALANCE_POWER:
161 return ENERGY_PERF_BIAS_BALANCE_POWERSAVE;
162 case OPTARG_NORMAL:
163 return ENERGY_PERF_BIAS_NORMAL;
164 case OPTARG_BALANCE_PERFORMANCE:
165 return ENERGY_PERF_BIAS_BALANCE_PERFORMANCE;
166 case OPTARG_PERFORMANCE:
167 return ENERGY_PERF_BIAS_PERFORMANCE;
168 }
169 if (i < 0 || i > ENERGY_PERF_BIAS_POWERSAVE)
170 errx(1, "--epb must be from 0 to 15");
171 return i;
172}
173
174#define HWP_CAP_LOWEST 0
175#define HWP_CAP_HIGHEST 255
176
177/*
178 * "performance" changes hwp_min to cap.highest
179 * All others leave it at cap.lowest
180 */
181int parse_cmdline_hwp_min(int i)
182{
183 update_hwp_min = 1;
184
185 switch (i) {
186 case OPTARG_POWER:
187 case OPTARG_BALANCE_POWER:
188 case OPTARG_NORMAL:
189 case OPTARG_BALANCE_PERFORMANCE:
190 return HWP_CAP_LOWEST;
191 case OPTARG_PERFORMANCE:
192 return HWP_CAP_HIGHEST;
193 }
194 return i;
195}
196/*
197 * "power" changes hwp_max to cap.lowest
198 * All others leave it at cap.highest
199 */
200int parse_cmdline_hwp_max(int i)
201{
202 update_hwp_max = 1;
203
204 switch (i) {
205 case OPTARG_POWER:
206 return HWP_CAP_LOWEST;
207 case OPTARG_NORMAL:
208 case OPTARG_BALANCE_POWER:
209 case OPTARG_BALANCE_PERFORMANCE:
210 case OPTARG_PERFORMANCE:
211 return HWP_CAP_HIGHEST;
212 }
213 return i;
214}
215/*
216 * for --hwp-des, all strings leave it in autonomous mode
217 * If you want to change it, you need to explicitly pick a value
218 */
219int parse_cmdline_hwp_desired(int i)
220{
221 update_hwp_desired = 1;
222
223 switch (i) {
224 case OPTARG_POWER:
225 case OPTARG_BALANCE_POWER:
226 case OPTARG_BALANCE_PERFORMANCE:
227 case OPTARG_NORMAL:
228 case OPTARG_PERFORMANCE:
229 return 0; /* autonomous */
230 }
231 return i;
232}
233
234int parse_cmdline_hwp_window(int i)
235{
236 unsigned int exponent;
237
238 update_hwp_window = 1;
239
240 switch (i) {
241 case OPTARG_POWER:
242 case OPTARG_BALANCE_POWER:
243 case OPTARG_NORMAL:
244 case OPTARG_BALANCE_PERFORMANCE:
245 case OPTARG_PERFORMANCE:
246 return 0;
247 }
248 if (i < 0 || i > 1270000000) {
249 fprintf(stderr, "--hwp-window: 0 for auto; 1 - 1270000000 usec for window duration\n");
250 usage();
251 }
252 for (exponent = 0; ; ++exponent) {
253 if (debug)
254 printf("%d 10^%d\n", i, exponent);
255
256 if (i <= 127)
257 break;
258
259 i = i / 10;
260 }
261 if (debug)
262 fprintf(stderr, "%d*10^%d: 0x%x\n", i, exponent, (exponent << 7) | i);
263
264 return (exponent << 7) | i;
265}
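
The activity window is therefore stored as a 7-bit mantissa and a 3-bit power-of-ten exponent. A short sketch of the inverse mapping (hwp_window_to_us is a hypothetical name, not part of this patch):

/* Decode an encoded HWP activity window back to microseconds.
 * parse_cmdline_hwp_window(5000) returns (2 << 7) | 50 == 0x132,
 * which decodes below to 50 * 10^2 == 5000. */
unsigned int hwp_window_to_us(unsigned int encoded)
{
	unsigned int us = encoded & 0x7F;		/* 7-bit mantissa */
	unsigned int exponent = (encoded >> 7) & 0x7;	/* power of 10 */

	while (exponent--)
		us *= 10;
	return us;
}
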
266int parse_cmdline_hwp_epp(int i)
267{
268 update_hwp_epp = 1;
269
270 switch (i) {
271 case OPTARG_POWER:
272 return HWP_EPP_POWERSAVE;
273 case OPTARG_BALANCE_POWER:
274 return HWP_EPP_BALANCE_POWERSAVE;
275 case OPTARG_NORMAL:
276 case OPTARG_BALANCE_PERFORMANCE:
277 return HWP_EPP_BALANCE_PERFORMANCE;
278 case OPTARG_PERFORMANCE:
279 return HWP_EPP_PERFORMANCE;
280 }
281 if (i < 0 || i > 0xff) {
282 fprintf(stderr, "--hwp-epp must be from 0 to 0xff\n");
283 usage();
284 }
285 return i;
286}
287int parse_cmdline_turbo(int i)
288{
289 update_turbo = 1;
290
291 switch (i) {
292 case OPTARG_POWER:
293 return 0;
294 case OPTARG_NORMAL:
295 case OPTARG_BALANCE_POWER:
296 case OPTARG_BALANCE_PERFORMANCE:
297 case OPTARG_PERFORMANCE:
298 return 1;
299 }
300 if (i < 0 || i > 1) {
301 fprintf(stderr, "--turbo-enable: 1 to enable, 0 to disable\n");
302 usage();
303 }
304 return i;
305}
306
307int parse_optarg_string(char *s)
308{
309 int i;
310 char *endptr;
311
312 if (!strncmp(s, "default", 7))
313 return OPTARG_NORMAL;
314
315 if (!strncmp(s, "normal", 6))
316 return OPTARG_NORMAL;
317
318 if (!strncmp(s, "power", 9))
319 return OPTARG_POWER;
320
321 if (!strncmp(s, "balance-power", 17))
322 return OPTARG_BALANCE_POWER;
323
324 if (!strncmp(s, "balance-performance", 19))
325 return OPTARG_BALANCE_PERFORMANCE;
326
327 if (!strncmp(s, "performance", 11))
328 return OPTARG_PERFORMANCE;
329
330 i = strtol(s, &endptr, 0);
331 if (s == endptr) {
332 fprintf(stderr, "no digits in \"%s\"\n", s);
333 usage();
334 }
335 if (i == LONG_MIN || i == LONG_MAX)
336 errx(-1, "%s", s);
337
338 if (i > 0xFF)
339 errx(-1, "%d (0x%x) must be < 256", i, i);
340
341 if (i < 0)
342 errx(-1, "%d (0x%x) must be >= 0", i, i);
343 return i;
344}
345
346void parse_cmdline_all(char *s)
347{
348 force++;
349 update_hwp_enable = 1;
350 req_update.hwp_min = parse_cmdline_hwp_min(parse_optarg_string(s));
351 req_update.hwp_max = parse_cmdline_hwp_max(parse_optarg_string(s));
352 req_update.hwp_epp = parse_cmdline_hwp_epp(parse_optarg_string(s));
353 if (has_epb)
354 new_epb = parse_cmdline_epb(parse_optarg_string(s));
355 turbo_update_value = parse_cmdline_turbo(parse_optarg_string(s));
356 req_update.hwp_desired = parse_cmdline_hwp_desired(parse_optarg_string(s));
357 req_update.hwp_window = parse_cmdline_hwp_window(parse_optarg_string(s));
358}
359
360void validate_cpu_selected_set(void)
361{
362 int cpu;
363
364 if (CPU_COUNT_S(cpu_setsize, cpu_selected_set) == 0)
365 errx(0, "no CPUs requested");
366
367 for (cpu = 0; cpu <= max_cpu_num; ++cpu) {
368 if (CPU_ISSET_S(cpu, cpu_setsize, cpu_selected_set))
369 if (!CPU_ISSET_S(cpu, cpu_setsize, cpu_present_set))
370 errx(1, "Requested cpu% is not present", cpu);
371 }
372}
373
374void parse_cmdline_cpu(char *s)
375{
376 char *startp, *endp;
377 int cpu = 0;
378
379 if (pkg_selected_set) {
380 usage();
381 errx(1, "--cpu | --pkg");
382 }
383 cpu_selected_set = CPU_ALLOC((max_cpu_num + 1));
384 if (cpu_selected_set == NULL)
385 err(1, "cpu_selected_set");
386 CPU_ZERO_S(cpu_setsize, cpu_selected_set);
387
388 for (startp = s; startp && *startp;) {
389
390 if (*startp == ',') {
391 startp++;
392 continue;
393 }
394
395 if (*startp == '-') {
396 int end_cpu;
73 397
74#define BIAS_PERFORMANCE 0 398 startp++;
75#define BIAS_BALANCE 6 399 end_cpu = strtol(startp, &endp, 10);
76#define BIAS_POWERSAVE 15 400 if (startp == endp)
401 continue;
402
403 while (cpu <= end_cpu) {
404 if (cpu > max_cpu_num)
405 errx(1, "Requested cpu%d exceeds max cpu%d", cpu, max_cpu_num);
406 CPU_SET_S(cpu, cpu_setsize, cpu_selected_set);
407 cpu++;
408 }
409 startp = endp;
410 continue;
411 }
412
413 if (strncmp(startp, "all", 3) == 0) {
414 for (cpu = 0; cpu <= max_cpu_num; cpu += 1) {
415 if (CPU_ISSET_S(cpu, cpu_setsize, cpu_present_set))
416 CPU_SET_S(cpu, cpu_setsize, cpu_selected_set);
417 }
418 startp += 3;
419 if (*startp == 0)
420 break;
421 }
422 /* "--cpu even" is not documented */
423 if (strncmp(startp, "even", 4) == 0) {
424 for (cpu = 0; cpu <= max_cpu_num; cpu += 2) {
425 if (CPU_ISSET_S(cpu, cpu_setsize, cpu_present_set))
426 CPU_SET_S(cpu, cpu_setsize, cpu_selected_set);
427 }
428 startp += 4;
429 if (*startp == 0)
430 break;
431 }
432
433 /* "--cpu odd" is not documented */
434 if (strncmp(startp, "odd", 3) == 0) {
435 for (cpu = 1; cpu <= max_cpu_num; cpu += 2) {
436 if (CPU_ISSET_S(cpu, cpu_setsize, cpu_present_set))
437 CPU_SET_S(cpu, cpu_setsize, cpu_selected_set);
438 }
439 startp += 3;
440 if (*startp == 0)
441 break;
442 }
443
444 cpu = strtol(startp, &endp, 10);
445 if (startp == endp)
446 errx(1, "--cpu cpu-set: confused by '%s'", startp);
447 if (cpu > max_cpu_num)
448 errx(1, "Requested cpu%d exceeds max cpu%d", cpu, max_cpu_num);
449 CPU_SET_S(cpu, cpu_setsize, cpu_selected_set);
450 startp = endp;
451 }
452
453 validate_cpu_selected_set();
454
455}
456
457void parse_cmdline_pkg(char *s)
458{
459 char *startp, *endp;
460 int pkg = 0;
461
462 if (cpu_selected_set) {
463 usage();
464 errx(1, "--pkg | --cpu");
465 }
466 pkg_selected_set = 0;
467
468 for (startp = s; startp && *startp;) {
469
470 if (*startp == ',') {
471 startp++;
472 continue;
473 }
474
475 if (*startp == '-') {
476 int end_pkg;
477
478 startp++;
479 end_pkg = strtol(startp, &endp, 10);
480 if (startp == endp)
481 continue;
482
483 while (pkg <= end_pkg) {
484 if (pkg > max_pkg_num)
485 errx(1, "Requested pkg%d exceeds max pkg%d", pkg, max_pkg_num);
 486 pkg_selected_set |= 1ULL << pkg;
487 pkg++;
488 }
489 startp = endp;
490 continue;
491 }
492
493 if (strncmp(startp, "all", 3) == 0) {
494 pkg_selected_set = pkg_present_set;
495 return;
496 }
497
498 pkg = strtol(startp, &endp, 10);
499 if (pkg > max_pkg_num)
500 errx(1, "Requested pkg%d Exceeds max pkg%d", pkg, max_pkg_num);
501 pkg_selected_set |= 1 << pkg;
502 startp = endp;
503 }
504}
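
For example, '--pkg 0,2' leaves pkg_selected_set == 0x5, one bit per selected package; this unsigned long long mask is what imposes the MAX_PACKAGES limit of 64.
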
505
506void for_packages(unsigned long long pkg_set, int (func)(int))
507{
508 int pkg_num;
509
510 for (pkg_num = 0; pkg_num <= max_pkg_num; ++pkg_num) {
 511 if (pkg_set & (1ULL << pkg_num))
512 func(pkg_num);
513 }
514}
515
516void print_version(void)
517{
518 printf("x86_energy_perf_policy 17.05.11 (C) Len Brown <len.brown@intel.com>\n");
519}
77 520
78void cmdline(int argc, char **argv) 521void cmdline(int argc, char **argv)
79{ 522{
80 int opt; 523 int opt;
524 int option_index = 0;
525
526 static struct option long_options[] = {
527 {"all", required_argument, 0, 'a'},
528 {"cpu", required_argument, 0, 'c'},
529 {"pkg", required_argument, 0, 'p'},
530 {"debug", no_argument, 0, 'd'},
531 {"hwp-desired", required_argument, 0, 'D'},
532 {"epb", required_argument, 0, 'B'},
533 {"force", no_argument, 0, 'f'},
534 {"hwp-enable", no_argument, 0, 'e'},
535 {"help", no_argument, 0, 'h'},
536 {"hwp-epp", required_argument, 0, 'P'},
537 {"hwp-min", required_argument, 0, 'm'},
538 {"hwp-max", required_argument, 0, 'M'},
539 {"read", no_argument, 0, 'r'},
540 {"turbo-enable", required_argument, 0, 't'},
541 {"hwp-use-pkg", required_argument, 0, 'u'},
542 {"version", no_argument, 0, 'v'},
543 {"hwp-window", required_argument, 0, 'w'},
544 {0, 0, 0, 0 }
545 };
81 546
82 progname = argv[0]; 547 progname = argv[0];
83 548
 84 while ((opt = getopt(argc, argv, "+rvc:")) != -1) { 549 while ((opt = getopt_long_only(argc, argv, "+a:B:c:dD:efhm:M:p:P:rt:u:vw:",
550 long_options, &option_index)) != -1) {
85 switch (opt) { 551 switch (opt) {
552 case 'a':
553 parse_cmdline_all(optarg);
554 break;
555 case 'B':
556 new_epb = parse_cmdline_epb(parse_optarg_string(optarg));
557 break;
86 case 'c': 558 case 'c':
87 cpu = atoi(optarg); 559 parse_cmdline_cpu(optarg);
560 break;
561 case 'e':
562 update_hwp_enable = 1;
563 break;
564 case 'h':
565 usage();
566 break;
567 case 'd':
568 debug++;
569 verbose++;
570 break;
571 case 'f':
572 force++;
573 break;
574 case 'D':
575 req_update.hwp_desired = parse_cmdline_hwp_desired(parse_optarg_string(optarg));
576 break;
577 case 'm':
578 req_update.hwp_min = parse_cmdline_hwp_min(parse_optarg_string(optarg));
579 break;
580 case 'M':
581 req_update.hwp_max = parse_cmdline_hwp_max(parse_optarg_string(optarg));
582 break;
583 case 'p':
584 parse_cmdline_pkg(optarg);
585 break;
586 case 'P':
587 req_update.hwp_epp = parse_cmdline_hwp_epp(parse_optarg_string(optarg));
88 break; 588 break;
89 case 'r': 589 case 'r':
90 read_only = 1; 590 /* v1 used -r to specify read-only mode, now the default */
591 break;
592 case 't':
593 turbo_update_value = parse_cmdline_turbo(parse_optarg_string(optarg));
594 break;
595 case 'u':
596 update_hwp_use_pkg++;
597 if (atoi(optarg) == 0)
598 req_update.hwp_use_pkg = 0;
599 else
600 req_update.hwp_use_pkg = 1;
91 break; 601 break;
92 case 'v': 602 case 'v':
93 verbose++; 603 print_version();
604 exit(0);
605 break;
606 case 'w':
607 req_update.hwp_window = parse_cmdline_hwp_window(parse_optarg_string(optarg));
94 break; 608 break;
95 default: 609 default:
96 usage(); 610 usage();
97 } 611 }
98 } 612 }
99 /* if -r, then should be no additional optind */
100 if (read_only && (argc > optind))
101 usage();
102
103 /* 613 /*
104 * if no -r , then must be one additional optind 614 * v1 allowed "performance"|"normal"|"power" with no policy specifier
615 * to update BIAS. Continue to support that, even though no longer documented.
105 */ 616 */
106 if (!read_only) { 617 if (argc == optind + 1)
618 new_epb = parse_cmdline_epb(parse_optarg_string(argv[optind]));
107 619
108 if (argc != optind + 1) { 620 if (argc > optind + 1) {
109 printf("must supply -r or policy param\n"); 621 fprintf(stderr, "stray parameter '%s'\n", argv[optind + 1]);
110 usage(); 622 usage();
111 } 623 }
624}
112 625
113 if (!strcmp("performance", argv[optind])) { 626
114 new_bias = BIAS_PERFORMANCE; 627int get_msr(int cpu, int offset, unsigned long long *msr)
115 } else if (!strcmp("normal", argv[optind])) { 628{
116 new_bias = BIAS_BALANCE; 629 int retval;
117 } else if (!strcmp("powersave", argv[optind])) { 630 char pathname[32];
118 new_bias = BIAS_POWERSAVE; 631 int fd;
119 } else { 632
120 char *endptr; 633 sprintf(pathname, "/dev/cpu/%d/msr", cpu);
121 634 fd = open(pathname, O_RDONLY);
122 new_bias = strtoull(argv[optind], &endptr, 0); 635 if (fd < 0)
123 if (endptr == argv[optind] || 636 err(-1, "%s open failed, try chown or chmod +r /dev/cpu/*/msr, or run as root", pathname);
124 new_bias > BIAS_POWERSAVE) { 637
125 fprintf(stderr, "invalid value: %s\n", 638 retval = pread(fd, msr, sizeof(*msr), offset);
126 argv[optind]); 639 if (retval != sizeof(*msr))
127 usage(); 640 err(-1, "%s offset 0x%llx read failed", pathname, (unsigned long long)offset);
128 } 641
129 } 642 if (debug > 1)
643 fprintf(stderr, "get_msr(cpu%d, 0x%X, 0x%llX)\n", cpu, offset, *msr);
644
645 close(fd);
646 return 0;
647}
648
649int put_msr(int cpu, int offset, unsigned long long new_msr)
650{
651 char pathname[32];
652 int retval;
653 int fd;
654
655 sprintf(pathname, "/dev/cpu/%d/msr", cpu);
656 fd = open(pathname, O_RDWR);
657 if (fd < 0)
658 err(-1, "%s open failed, try chown or chmod +r /dev/cpu/*/msr, or run as root", pathname);
659
660 retval = pwrite(fd, &new_msr, sizeof(new_msr), offset);
661 if (retval != sizeof(new_msr))
662 err(-2, "pwrite(cpu%d, offset 0x%x, 0x%llx) = %d", cpu, offset, new_msr, retval);
663
664 close(fd);
665
666 if (debug > 1)
667 fprintf(stderr, "put_msr(cpu%d, 0x%X, 0x%llX)\n", cpu, offset, new_msr);
668
669 return 0;
670}
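
A small sketch of how these helpers compose (0x771 is MSR_HWP_CAPABILITIES in msr-index.h; the function name here is illustrative only):

/* Dump the raw HWP capability MSR for one CPU. */
void dump_hwp_cap_raw(int cpu)
{
	unsigned long long msr;

	get_msr(cpu, 0x771, &msr);	/* MSR_HWP_CAPABILITIES */
	printf("cpu%d: IA32_HWP_CAPABILITIES 0x%016llx (highest perf %llu)\n",
	       cpu, msr, msr & 0xff);
}
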
671
672void print_hwp_cap(int cpu, struct msr_hwp_cap *cap, char *str)
673{
674 if (cpu != -1)
675 printf("cpu%d: ", cpu);
676
677 printf("HWP_CAP: low %d eff %d guar %d high %d\n",
678 cap->lowest, cap->efficient, cap->guaranteed, cap->highest);
679}
680void read_hwp_cap(int cpu, struct msr_hwp_cap *cap, unsigned int msr_offset)
681{
682 unsigned long long msr;
683
684 get_msr(cpu, msr_offset, &msr);
685
686 cap->highest = msr_perf_2_ratio(HWP_HIGHEST_PERF(msr));
687 cap->guaranteed = msr_perf_2_ratio(HWP_GUARANTEED_PERF(msr));
688 cap->efficient = msr_perf_2_ratio(HWP_MOSTEFFICIENT_PERF(msr));
689 cap->lowest = msr_perf_2_ratio(HWP_LOWEST_PERF(msr));
690}
691
692void print_hwp_request(int cpu, struct msr_hwp_request *h, char *str)
693{
694 if (cpu != -1)
695 printf("cpu%d: ", cpu);
696
697 if (str)
698 printf("%s", str);
699
700 printf("HWP_REQ: min %d max %d des %d epp %d window 0x%x (%d*10^%dus) use_pkg %d\n",
701 h->hwp_min, h->hwp_max, h->hwp_desired, h->hwp_epp,
702 h->hwp_window, h->hwp_window & 0x7F, (h->hwp_window >> 7) & 0x7, h->hwp_use_pkg);
703}
704void print_hwp_request_pkg(int pkg, struct msr_hwp_request *h, char *str)
705{
706 printf("pkg%d: ", pkg);
707
708 if (str)
709 printf("%s", str);
710
711 printf("HWP_REQ_PKG: min %d max %d des %d epp %d window 0x%x (%d*10^%dus)\n",
712 h->hwp_min, h->hwp_max, h->hwp_desired, h->hwp_epp,
713 h->hwp_window, h->hwp_window & 0x7F, (h->hwp_window >> 7) & 0x7);
714}
715void read_hwp_request(int cpu, struct msr_hwp_request *hwp_req, unsigned int msr_offset)
716{
717 unsigned long long msr;
718
719 get_msr(cpu, msr_offset, &msr);
720
721 hwp_req->hwp_min = msr_perf_2_ratio((((msr) >> 0) & 0xff));
722 hwp_req->hwp_max = msr_perf_2_ratio((((msr) >> 8) & 0xff));
723 hwp_req->hwp_desired = msr_perf_2_ratio((((msr) >> 16) & 0xff));
724 hwp_req->hwp_epp = (((msr) >> 24) & 0xff);
725 hwp_req->hwp_window = (((msr) >> 32) & 0x3ff);
726 hwp_req->hwp_use_pkg = (((msr) >> 42) & 0x1);
727}
728
729void write_hwp_request(int cpu, struct msr_hwp_request *hwp_req, unsigned int msr_offset)
730{
731 unsigned long long msr = 0;
732
733 if (debug > 1)
734 printf("cpu%d: requesting min %d max %d des %d epp %d window 0x%0x use_pkg %d\n",
735 cpu, hwp_req->hwp_min, hwp_req->hwp_max,
736 hwp_req->hwp_desired, hwp_req->hwp_epp,
737 hwp_req->hwp_window, hwp_req->hwp_use_pkg);
738
739 msr |= HWP_MIN_PERF(ratio_2_msr_perf(hwp_req->hwp_min));
740 msr |= HWP_MAX_PERF(ratio_2_msr_perf(hwp_req->hwp_max));
741 msr |= HWP_DESIRED_PERF(ratio_2_msr_perf(hwp_req->hwp_desired));
742 msr |= HWP_ENERGY_PERF_PREFERENCE(hwp_req->hwp_epp);
743 msr |= HWP_ACTIVITY_WINDOW(hwp_req->hwp_window);
744 msr |= HWP_PACKAGE_CONTROL(hwp_req->hwp_use_pkg);
745
746 put_msr(cpu, msr_offset, msr);
747}
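
As a worked example of this layout: the values in the man page's EXAMPLE section (min 6, max 35, desired 0, epp 128, window 0, use_pkg 0) pack into a raw IA32_HWP_REQUEST of 0x0000000080002306, i.e. epp 0x80 in bits 31:24, max 0x23 in bits 15:8, and min 0x06 in bits 7:0.
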
748
749int print_cpu_msrs(int cpu)
750{
751 unsigned long long msr;
752 struct msr_hwp_request req;
753 struct msr_hwp_cap cap;
754
755 if (has_epb) {
756 get_msr(cpu, MSR_IA32_ENERGY_PERF_BIAS, &msr);
757
758 printf("cpu%d: EPB %u\n", cpu, (unsigned int) msr);
130 } 759 }
760
761 if (!has_hwp)
762 return 0;
763
764 read_hwp_request(cpu, &req, MSR_HWP_REQUEST);
765 print_hwp_request(cpu, &req, "");
766
767 read_hwp_cap(cpu, &cap, MSR_HWP_CAPABILITIES);
768 print_hwp_cap(cpu, &cap, "");
769
770 return 0;
771}
772
773int print_pkg_msrs(int pkg)
774{
775 struct msr_hwp_request req;
776 unsigned long long msr;
777
778 if (!has_hwp)
779 return 0;
780
781 read_hwp_request(first_cpu_in_pkg[pkg], &req, MSR_HWP_REQUEST_PKG);
782 print_hwp_request_pkg(pkg, &req, "");
783
784 if (has_hwp_notify) {
785 get_msr(first_cpu_in_pkg[pkg], MSR_HWP_INTERRUPT, &msr);
786 fprintf(stderr,
787 "pkg%d: MSR_HWP_INTERRUPT: 0x%08llx (Excursion_Min-%sabled, Guaranteed_Perf_Change-%sabled)\n",
788 pkg, msr,
789 ((msr) & 0x2) ? "EN" : "Dis",
790 ((msr) & 0x1) ? "EN" : "Dis");
791 }
792 get_msr(first_cpu_in_pkg[pkg], MSR_HWP_STATUS, &msr);
793 fprintf(stderr,
794 "pkg%d: MSR_HWP_STATUS: 0x%08llx (%sExcursion_Min, %sGuaranteed_Perf_Change)\n",
795 pkg, msr,
796 ((msr) & 0x4) ? "" : "No-",
797 ((msr) & 0x1) ? "" : "No-");
798
799 return 0;
131} 800}
132 801
133/* 802/*
134 * validate_cpuid() 803 * Assumption: All HWP systems have 100 MHz bus clock
135 * returns on success, quietly exits on failure (make verbose with -v)
136 */ 804 */
137void validate_cpuid(void) 805int ratio_2_sysfs_khz(int ratio)
138{ 806{
139 unsigned int eax, ebx, ecx, edx, max_level; 807 int bclk_khz = 100 * 1000; /* 100,000 KHz = 100 MHz */
140 unsigned int fms, family, model, stepping;
141 808
142 eax = ebx = ecx = edx = 0; 809 return ratio * bclk_khz;
810}
811/*
 812 * If HWP is enabled and cpufreq sysfs attributes are present,
813 * then update sysfs, so that it will not become
814 * stale when we write to MSRs.
815 * (intel_pstate's max_perf_pct and min_perf_pct will follow cpufreq,
816 * so we don't have to touch that.)
817 */
818void update_cpufreq_scaling_freq(int is_max, int cpu, unsigned int ratio)
819{
820 char pathname[64];
821 FILE *fp;
822 int retval;
823 int khz;
143 824
144 asm("cpuid" : "=a" (max_level), "=b" (ebx), "=c" (ecx), 825 sprintf(pathname, "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_%s_freq",
145 "=d" (edx) : "a" (0)); 826 cpu, is_max ? "max" : "min");
146 827
147 if (ebx != 0x756e6547 || edx != 0x49656e69 || ecx != 0x6c65746e) { 828 fp = fopen(pathname, "w");
148 if (verbose) 829 if (!fp) {
149 fprintf(stderr, "%.4s%.4s%.4s != GenuineIntel", 830 if (debug)
150 (char *)&ebx, (char *)&edx, (char *)&ecx); 831 perror(pathname);
151 exit(1); 832 return;
152 } 833 }
153 834
154 asm("cpuid" : "=a" (fms), "=c" (ecx), "=d" (edx) : "a" (1) : "ebx"); 835 khz = ratio_2_sysfs_khz(ratio);
155 family = (fms >> 8) & 0xf; 836 retval = fprintf(fp, "%d", khz);
156 model = (fms >> 4) & 0xf; 837 if (retval < 0)
157 stepping = fms & 0xf; 838 if (debug)
158 if (family == 6 || family == 0xf) 839 perror("fprintf");
159 model += ((fms >> 16) & 0xf) << 4; 840 if (debug)
841 printf("echo %d > %s\n", khz, pathname);
160 842
161 if (verbose > 1) 843 fclose(fp);
162 printf("CPUID %d levels family:model:stepping " 844}
163 "0x%x:%x:%x (%d:%d:%d)\n", max_level,
164 family, model, stepping, family, model, stepping);
165 845
166 if (!(edx & (1 << 5))) { 846/*
167 if (verbose) 847 * We update all sysfs before updating any MSRs because of
168 printf("CPUID: no MSR\n"); 848 * bugs in cpufreq/intel_pstate where the sysfs writes
169 exit(1); 849 * for a CPU may change the min/max values on other CPUS.
850 */
851
852int update_sysfs(int cpu)
853{
854 if (!has_hwp)
855 return 0;
856
857 if (!hwp_update_enabled())
858 return 0;
859
860 if (access("/sys/devices/system/cpu/cpu0/cpufreq", F_OK))
861 return 0;
862
863 if (update_hwp_min)
864 update_cpufreq_scaling_freq(0, cpu, req_update.hwp_min);
865
866 if (update_hwp_max)
867 update_cpufreq_scaling_freq(1, cpu, req_update.hwp_max);
868
869 return 0;
870}
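
For example, with the assumed 100 MHz bus clock, '--hwp-max 12' writes 1200000 (kHz) to scaling_max_freq, matching the 1200 MHz that the man page promises for a value of 12.
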
871
872int verify_hwp_req_self_consistency(int cpu, struct msr_hwp_request *req)
873{
874 /* fail if min > max requested */
875 if (req->hwp_min > req->hwp_max) {
876 errx(1, "cpu%d: requested hwp-min %d > hwp_max %d",
877 cpu, req->hwp_min, req->hwp_max);
170 } 878 }
171 879
 172 /* 880 /* fail if desired > max requested */
173 * Support for MSR_IA32_ENERGY_PERF_BIAS 881 if (req->hwp_desired && (req->hwp_desired > req->hwp_max)) {
174 * is indicated by CPUID.06H.ECX.bit3 882 errx(1, "cpu%d: requested hwp-desired %d > hwp_max %d",
175 */ 883 cpu, req->hwp_desired, req->hwp_max);
176 asm("cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) : "a" (6));
177 if (verbose)
178 printf("CPUID.06H.ECX: 0x%x\n", ecx);
179 if (!(ecx & (1 << 3))) {
180 if (verbose)
181 printf("CPUID: No MSR_IA32_ENERGY_PERF_BIAS\n");
182 exit(1);
183 } 884 }
 184 return; /* success */ 885 /* fail if desired < min requested */
886 if (req->hwp_desired && (req->hwp_desired < req->hwp_min)) {
887 errx(1, "cpu%d: requested hwp-desired %d < requested hwp_min %d",
888 cpu, req->hwp_desired, req->hwp_min);
889 }
890
891 return 0;
185} 892}
186 893
187unsigned long long get_msr(int cpu, int offset) 894int check_hwp_request_v_hwp_capabilities(int cpu, struct msr_hwp_request *req, struct msr_hwp_cap *cap)
188{ 895{
189 unsigned long long msr; 896 if (update_hwp_max) {
190 char msr_path[32]; 897 if (req->hwp_max > cap->highest)
191 int retval; 898 errx(1, "cpu%d: requested max %d > capabilities highest %d, use --force?",
192 int fd; 899 cpu, req->hwp_max, cap->highest);
900 if (req->hwp_max < cap->lowest)
901 errx(1, "cpu%d: requested max %d < capabilities lowest %d, use --force?",
902 cpu, req->hwp_max, cap->lowest);
903 }
193 904
194 sprintf(msr_path, "/dev/cpu/%d/msr", cpu); 905 if (update_hwp_min) {
195 fd = open(msr_path, O_RDONLY); 906 if (req->hwp_min > cap->highest)
196 if (fd < 0) { 907 errx(1, "cpu%d: requested min %d > capabilities highest %d, use --force?",
197 printf("Try \"# modprobe msr\"\n"); 908 cpu, req->hwp_min, cap->highest);
198 perror(msr_path); 909 if (req->hwp_min < cap->lowest)
199 exit(1); 910 errx(1, "cpu%d: requested min %d < capabilities lowest %d, use --force?",
911 cpu, req->hwp_min, cap->lowest);
200 } 912 }
201 913
202 retval = pread(fd, &msr, sizeof msr, offset); 914 if (update_hwp_min && update_hwp_max && (req->hwp_min > req->hwp_max))
915 errx(1, "cpu%d: requested min %d > requested max %d",
916 cpu, req->hwp_min, req->hwp_max);
203 917
204 if (retval != sizeof msr) { 918 if (update_hwp_desired && req->hwp_desired) {
205 printf("pread cpu%d 0x%x = %d\n", cpu, offset, retval); 919 if (req->hwp_desired > req->hwp_max)
206 exit(-2); 920 errx(1, "cpu%d: requested desired %d > requested max %d, use --force?",
921 cpu, req->hwp_desired, req->hwp_max);
922 if (req->hwp_desired < req->hwp_min)
923 errx(1, "cpu%d: requested desired %d < requested min %d, use --force?",
924 cpu, req->hwp_desired, req->hwp_min);
925 if (req->hwp_desired < cap->lowest)
926 errx(1, "cpu%d: requested desired %d < capabilities lowest %d, use --force?",
927 cpu, req->hwp_desired, cap->lowest);
928 if (req->hwp_desired > cap->highest)
929 errx(1, "cpu%d: requested desired %d > capabilities highest %d, use --force?",
930 cpu, req->hwp_desired, cap->highest);
207 } 931 }
208 close(fd); 932
209 return msr; 933 return 0;
210} 934}
211 935
212unsigned long long put_msr(int cpu, unsigned long long new_msr, int offset) 936int update_hwp_request(int cpu)
213{ 937{
214 unsigned long long old_msr; 938 struct msr_hwp_request req;
215 char msr_path[32]; 939 struct msr_hwp_cap cap;
216 int retval; 940
217 int fd; 941 int msr_offset = MSR_HWP_REQUEST;
942
943 read_hwp_request(cpu, &req, msr_offset);
944 if (debug)
945 print_hwp_request(cpu, &req, "old: ");
946
947 if (update_hwp_min)
948 req.hwp_min = req_update.hwp_min;
949
950 if (update_hwp_max)
951 req.hwp_max = req_update.hwp_max;
952
953 if (update_hwp_desired)
954 req.hwp_desired = req_update.hwp_desired;
955
956 if (update_hwp_window)
957 req.hwp_window = req_update.hwp_window;
958
959 if (update_hwp_epp)
960 req.hwp_epp = req_update.hwp_epp;
961
962 req.hwp_use_pkg = req_update.hwp_use_pkg;
963
964 read_hwp_cap(cpu, &cap, MSR_HWP_CAPABILITIES);
965 if (debug)
966 print_hwp_cap(cpu, &cap, "");
967
968 if (!force)
969 check_hwp_request_v_hwp_capabilities(cpu, &req, &cap);
970
971 verify_hwp_req_self_consistency(cpu, &req);
218 972
219 sprintf(msr_path, "/dev/cpu/%d/msr", cpu); 973 write_hwp_request(cpu, &req, msr_offset);
220 fd = open(msr_path, O_RDWR); 974
221 if (fd < 0) { 975 if (debug) {
222 perror(msr_path); 976 read_hwp_request(cpu, &req, msr_offset);
223 exit(1); 977 print_hwp_request(cpu, &req, "new: ");
224 } 978 }
979 return 0;
980}
981int update_hwp_request_pkg(int pkg)
982{
983 struct msr_hwp_request req;
984 struct msr_hwp_cap cap;
985 int cpu = first_cpu_in_pkg[pkg];
986
987 int msr_offset = MSR_HWP_REQUEST_PKG;
988
989 read_hwp_request(cpu, &req, msr_offset);
990 if (debug)
991 print_hwp_request_pkg(pkg, &req, "old: ");
992
993 if (update_hwp_min)
994 req.hwp_min = req_update.hwp_min;
995
996 if (update_hwp_max)
997 req.hwp_max = req_update.hwp_max;
998
999 if (update_hwp_desired)
1000 req.hwp_desired = req_update.hwp_desired;
1001
1002 if (update_hwp_window)
1003 req.hwp_window = req_update.hwp_window;
1004
1005 if (update_hwp_epp)
1006 req.hwp_epp = req_update.hwp_epp;
1007
1008 read_hwp_cap(cpu, &cap, MSR_HWP_CAPABILITIES);
1009 if (debug)
1010 print_hwp_cap(cpu, &cap, "");
1011
1012 if (!force)
1013 check_hwp_request_v_hwp_capabilities(cpu, &req, &cap);
1014
1015 verify_hwp_req_self_consistency(cpu, &req);
1016
1017 write_hwp_request(cpu, &req, msr_offset);
225 1018
226 retval = pread(fd, &old_msr, sizeof old_msr, offset); 1019 if (debug) {
227 if (retval != sizeof old_msr) { 1020 read_hwp_request(cpu, &req, msr_offset);
228 perror("pwrite"); 1021 print_hwp_request_pkg(pkg, &req, "new: ");
229 printf("pread cpu%d 0x%x = %d\n", cpu, offset, retval);
230 exit(-2);
231 } 1022 }
1023 return 0;
1024}
1025
1026int enable_hwp_on_cpu(int cpu)
1027{
1028 unsigned long long msr;
1029
1030 get_msr(cpu, MSR_PM_ENABLE, &msr);
1031 put_msr(cpu, MSR_PM_ENABLE, 1);
1032
1033 if (verbose)
1034 printf("cpu%d: MSR_PM_ENABLE old: %d new: %d\n", cpu, (unsigned int) msr, 1);
1035
1036 return 0;
1037}
1038
1039int update_cpu_msrs(int cpu)
1040{
1041 unsigned long long msr;
1042
232 1043
233 retval = pwrite(fd, &new_msr, sizeof new_msr, offset); 1044 if (update_epb) {
234 if (retval != sizeof new_msr) { 1045 get_msr(cpu, MSR_IA32_ENERGY_PERF_BIAS, &msr);
235 perror("pwrite"); 1046 put_msr(cpu, MSR_IA32_ENERGY_PERF_BIAS, new_epb);
236 printf("pwrite cpu%d 0x%x = %d\n", cpu, offset, retval); 1047
237 exit(-2); 1048 if (verbose)
1049 printf("cpu%d: ENERGY_PERF_BIAS old: %d new: %d\n",
1050 cpu, (unsigned int) msr, (unsigned int) new_epb);
238 } 1051 }
239 1052
240 close(fd); 1053 if (update_turbo) {
1054 int turbo_is_present_and_disabled;
1055
1056 get_msr(cpu, MSR_IA32_MISC_ENABLE, &msr);
1057
1058 turbo_is_present_and_disabled = ((msr & MSR_IA32_MISC_ENABLE_TURBO_DISABLE) != 0);
1059
1060 if (turbo_update_value == 1) {
1061 if (turbo_is_present_and_disabled) {
1062 msr &= ~MSR_IA32_MISC_ENABLE_TURBO_DISABLE;
1063 put_msr(cpu, MSR_IA32_MISC_ENABLE, msr);
1064 if (verbose)
1065 printf("cpu%d: turbo ENABLE\n", cpu);
1066 }
1067 } else {
1068 /*
1069 * if "turbo_is_enabled" were known to be describe this cpu
1070 * then we could use it here to skip redundant disable requests.
1071 * but cpu may be in a different package, so we always write.
1072 */
1073 msr |= MSR_IA32_MISC_ENABLE_TURBO_DISABLE;
1074 put_msr(cpu, MSR_IA32_MISC_ENABLE, msr);
1075 if (verbose)
1076 printf("cpu%d: turbo DISABLE\n", cpu);
1077 }
1078 }
1079
1080 if (!has_hwp)
1081 return 0;
1082
1083 if (!hwp_update_enabled())
1084 return 0;
1085
1086 update_hwp_request(cpu);
1087 return 0;
1088}
1089
1090/*
1091 * Open a file, and exit on failure
1092 */
1093FILE *fopen_or_die(const char *path, const char *mode)
1094{
1095 FILE *filep = fopen(path, "r");
241 1096
242 return old_msr; 1097 if (!filep)
1098 err(1, "%s: open failed", path);
1099 return filep;
243} 1100}
244 1101
245void print_msr(int cpu) 1102unsigned int get_pkg_num(int cpu)
246{ 1103{
247 printf("cpu%d: 0x%016llx\n", 1104 FILE *fp;
248 cpu, get_msr(cpu, MSR_IA32_ENERGY_PERF_BIAS)); 1105 char pathname[128];
1106 unsigned int pkg;
1107 int retval;
1108
1109 sprintf(pathname, "/sys/devices/system/cpu/cpu%d/topology/physical_package_id", cpu);
1110
1111 fp = fopen_or_die(pathname, "r");
 1112 retval = fscanf(fp, "%u\n", &pkg);
1113 if (retval != 1)
1114 errx(1, "%s: failed to parse", pathname);
1115 return pkg;
249} 1116}
250 1117
251void update_msr(int cpu) 1118int set_max_cpu_pkg_num(int cpu)
252{ 1119{
253 unsigned long long previous_msr; 1120 unsigned int pkg;
254 1121
255 previous_msr = put_msr(cpu, new_bias, MSR_IA32_ENERGY_PERF_BIAS); 1122 if (max_cpu_num < cpu)
1123 max_cpu_num = cpu;
256 1124
257 if (verbose) 1125 pkg = get_pkg_num(cpu);
258 printf("cpu%d msr0x%x 0x%016llx -> 0x%016llx\n", 1126
259 cpu, MSR_IA32_ENERGY_PERF_BIAS, previous_msr, new_bias); 1127 if (pkg >= MAX_PACKAGES)
1128 errx(1, "cpu%d: %d >= MAX_PACKAGES (%d)", cpu, pkg, MAX_PACKAGES);
1129
1130 if (pkg > max_pkg_num)
1131 max_pkg_num = pkg;
260 1132
261 return; 1133 if ((pkg_present_set & (1ULL << pkg)) == 0) {
1134 pkg_present_set |= (1ULL << pkg);
1135 first_cpu_in_pkg[pkg] = cpu;
1136 }
1137
1138 return 0;
1139}
1140int mark_cpu_present(int cpu)
1141{
1142 CPU_SET_S(cpu, cpu_setsize, cpu_present_set);
1143 return 0;
262} 1144}
263 1145
264char *proc_stat = "/proc/stat";
265/* 1146/*
266 * run func() on every cpu in /dev/cpu 1147 * run func(cpu) on every cpu in /proc/stat
1148 * return max_cpu number
267 */ 1149 */
268void for_every_cpu(void (func)(int)) 1150int for_all_proc_cpus(int (func)(int))
269{ 1151{
270 FILE *fp; 1152 FILE *fp;
1153 int cpu_num;
271 int retval; 1154 int retval;
272 1155
273 fp = fopen(proc_stat, "r"); 1156 fp = fopen_or_die(proc_stat, "r");
274 if (fp == NULL) {
275 perror(proc_stat);
276 exit(1);
277 }
278 1157
279 retval = fscanf(fp, "cpu %*d %*d %*d %*d %*d %*d %*d %*d %*d %*d\n"); 1158 retval = fscanf(fp, "cpu %*d %*d %*d %*d %*d %*d %*d %*d %*d %*d\n");
280 if (retval != 0) { 1159 if (retval != 0)
281 perror("/proc/stat format"); 1160 err(1, "%s: failed to parse format", proc_stat);
282 exit(1);
283 }
284 1161
285 while (1) { 1162 while (1) {
286 int cpu; 1163 retval = fscanf(fp, "cpu%u %*d %*d %*d %*d %*d %*d %*d %*d %*d %*d\n", &cpu_num);
287
288 retval = fscanf(fp,
289 "cpu%u %*d %*d %*d %*d %*d %*d %*d %*d %*d %*d\n",
290 &cpu);
291 if (retval != 1) 1164 if (retval != 1)
292 break; 1165 break;
293 1166
294 func(cpu); 1167 retval = func(cpu_num);
1168 if (retval) {
1169 fclose(fp);
1170 return retval;
1171 }
295 } 1172 }
296 fclose(fp); 1173 fclose(fp);
1174 return 0;
1175}
1176
1177void for_all_cpus_in_set(size_t set_size, cpu_set_t *cpu_set, int (func)(int))
1178{
1179 int cpu_num;
1180
1181 for (cpu_num = 0; cpu_num <= max_cpu_num; ++cpu_num)
1182 if (CPU_ISSET_S(cpu_num, set_size, cpu_set))
1183 func(cpu_num);
1184}
1185
1186void init_data_structures(void)
1187{
1188 for_all_proc_cpus(set_max_cpu_pkg_num);
1189
1190 cpu_setsize = CPU_ALLOC_SIZE((max_cpu_num + 1));
1191
1192 cpu_present_set = CPU_ALLOC((max_cpu_num + 1));
1193 if (cpu_present_set == NULL)
1194 err(3, "CPU_ALLOC");
1195 CPU_ZERO_S(cpu_setsize, cpu_present_set);
1196 for_all_proc_cpus(mark_cpu_present);
1197}
1198
 1199/* clear has_hwp if it is not enabled (or being enabled) */
1200
1201void verify_hwp_is_enabled(void)
1202{
1203 unsigned long long msr;
1204
1205 if (!has_hwp) /* set in early_cpuid() */
1206 return;
1207
 1208 /* MSR_PM_ENABLE bit 0 == 1 if HWP is enabled and its MSRs are visible */
1209 get_msr(base_cpu, MSR_PM_ENABLE, &msr);
1210 if ((msr & 1) == 0) {
1211 fprintf(stderr, "HWP can be enabled using '--hwp-enable'\n");
1212 has_hwp = 0;
1213 return;
1214 }
1215}
1216
1217int req_update_bounds_check(void)
1218{
1219 if (!hwp_update_enabled())
1220 return 0;
1221
1222 /* fail if min > max requested */
1223 if ((update_hwp_max && update_hwp_min) &&
1224 (req_update.hwp_min > req_update.hwp_max)) {
1225 printf("hwp-min %d > hwp_max %d\n", req_update.hwp_min, req_update.hwp_max);
1226 return -EINVAL;
1227 }
1228
 1229 /* fail if desired > max requested */
1230 if (req_update.hwp_desired && update_hwp_max &&
1231 (req_update.hwp_desired > req_update.hwp_max)) {
1232 printf("hwp-desired cannot be greater than hwp_max\n");
1233 return -EINVAL;
1234 }
 1235 /* fail if desired < min requested */
1236 if (req_update.hwp_desired && update_hwp_min &&
1237 (req_update.hwp_desired < req_update.hwp_min)) {
1238 printf("hwp-desired cannot be less than hwp_min\n");
1239 return -EINVAL;
1240 }
1241
1242 return 0;
1243}
1244
1245void set_base_cpu(void)
1246{
1247 base_cpu = sched_getcpu();
1248 if (base_cpu < 0)
1249 err(-ENODEV, "No valid cpus found");
1250}
1251
1252
1253void probe_dev_msr(void)
1254{
1255 struct stat sb;
1256 char pathname[32];
1257
1258 sprintf(pathname, "/dev/cpu/%d/msr", base_cpu);
1259 if (stat(pathname, &sb))
1260 if (system("/sbin/modprobe msr > /dev/null 2>&1"))
1261 err(-5, "no /dev/cpu/0/msr, Try \"# modprobe msr\" ");
1262}
1263/*
1264 * early_cpuid()
1265 * initialize turbo_is_enabled, has_hwp, has_epb
1266 * before cmdline is parsed
1267 */
1268void early_cpuid(void)
1269{
1270 unsigned int eax, ebx, ecx, edx, max_level;
1271 unsigned int fms, family, model;
1272
1273 __get_cpuid(0, &max_level, &ebx, &ecx, &edx);
1274
1275 if (max_level < 6)
1276 errx(1, "Processor not supported\n");
1277
1278 __get_cpuid(1, &fms, &ebx, &ecx, &edx);
1279 family = (fms >> 8) & 0xf;
1280 model = (fms >> 4) & 0xf;
1281 if (family == 6 || family == 0xf)
1282 model += ((fms >> 16) & 0xf) << 4;
1283
1284 if (model == 0x4F) {
1285 unsigned long long msr;
1286
1287 get_msr(base_cpu, MSR_TURBO_RATIO_LIMIT, &msr);
1288
1289 bdx_highest_ratio = msr & 0xFF;
1290 }
1291
1292 __get_cpuid(0x6, &eax, &ebx, &ecx, &edx);
1293 turbo_is_enabled = (eax >> 1) & 1;
1294 has_hwp = (eax >> 7) & 1;
1295 has_epb = (ecx >> 3) & 1;
1296}
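
A standalone sketch of the same probe, not part of the patch, using the cpuid.h intrinsic and the CPUID.06H bit positions relied on above:

/* Print the three feature bits that early_cpuid() caches. */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	__get_cpuid(0x6, &eax, &ebx, &ecx, &edx);
	printf("turbo %u hwp %u epb %u\n",
	       (eax >> 1) & 1, (eax >> 7) & 1, (ecx >> 3) & 1);
	return 0;
}
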
1297
1298/*
1299 * parse_cpuid()
1300 * set
1301 * has_hwp, has_hwp_notify, has_hwp_activity_window, has_hwp_epp, has_hwp_request_pkg, has_epb
1302 */
1303void parse_cpuid(void)
1304{
1305 unsigned int eax, ebx, ecx, edx, max_level;
1306 unsigned int fms, family, model, stepping;
1307
1308 eax = ebx = ecx = edx = 0;
1309
1310 __get_cpuid(0, &max_level, &ebx, &ecx, &edx);
1311
1312 if (ebx == 0x756e6547 && edx == 0x49656e69 && ecx == 0x6c65746e)
1313 genuine_intel = 1;
1314
1315 if (debug)
1316 fprintf(stderr, "CPUID(0): %.4s%.4s%.4s ",
1317 (char *)&ebx, (char *)&edx, (char *)&ecx);
1318
1319 __get_cpuid(1, &fms, &ebx, &ecx, &edx);
1320 family = (fms >> 8) & 0xf;
1321 model = (fms >> 4) & 0xf;
1322 stepping = fms & 0xf;
1323 if (family == 6 || family == 0xf)
1324 model += ((fms >> 16) & 0xf) << 4;
1325
1326 if (debug) {
1327 fprintf(stderr, "%d CPUID levels; family:model:stepping 0x%x:%x:%x (%d:%d:%d)\n",
1328 max_level, family, model, stepping, family, model, stepping);
1329 fprintf(stderr, "CPUID(1): %s %s %s %s %s %s %s %s\n",
1330 ecx & (1 << 0) ? "SSE3" : "-",
1331 ecx & (1 << 3) ? "MONITOR" : "-",
1332 ecx & (1 << 7) ? "EIST" : "-",
1333 ecx & (1 << 8) ? "TM2" : "-",
1334 edx & (1 << 4) ? "TSC" : "-",
1335 edx & (1 << 5) ? "MSR" : "-",
1336 edx & (1 << 22) ? "ACPI-TM" : "-",
1337 edx & (1 << 29) ? "TM" : "-");
1338 }
1339
1340 if (!(edx & (1 << 5)))
1341 errx(1, "CPUID: no MSR");
1342
1343
1344 __get_cpuid(0x6, &eax, &ebx, &ecx, &edx);
1345 /* turbo_is_enabled already set */
1346 /* has_hwp already set */
1347 has_hwp_notify = eax & (1 << 8);
1348 has_hwp_activity_window = eax & (1 << 9);
1349 has_hwp_epp = eax & (1 << 10);
1350 has_hwp_request_pkg = eax & (1 << 11);
1351
1352 if (!has_hwp_request_pkg && update_hwp_use_pkg)
1353 errx(1, "--hwp-use-pkg is not available on this hardware");
1354
1355 /* has_epb already set */
1356
1357 if (debug)
1358 fprintf(stderr,
1359 "CPUID(6): %sTURBO, %sHWP, %sHWPnotify, %sHWPwindow, %sHWPepp, %sHWPpkg, %sEPB\n",
1360 turbo_is_enabled ? "" : "No-",
1361 has_hwp ? "" : "No-",
1362 has_hwp_notify ? "" : "No-",
1363 has_hwp_activity_window ? "" : "No-",
1364 has_hwp_epp ? "" : "No-",
1365 has_hwp_request_pkg ? "" : "No-",
1366 has_epb ? "" : "No-");
1367
1368 return; /* success */
297} 1369}
298 1370
299int main(int argc, char **argv) 1371int main(int argc, char **argv)
300{ 1372{
1373 set_base_cpu();
1374 probe_dev_msr();
1375 init_data_structures();
1376
1377 early_cpuid(); /* initial cpuid parse before cmdline */
1378
301 cmdline(argc, argv); 1379 cmdline(argc, argv);
302 1380
303 if (verbose > 1) 1381 if (debug)
304 printf("x86_energy_perf_policy Nov 24, 2010" 1382 print_version();
305 " - Len Brown <lenb@kernel.org>\n"); 1383
306 if (verbose > 1 && !read_only) 1384 parse_cpuid();
307 printf("new_bias %lld\n", new_bias); 1385
308 1386 /* If CPU-set and PKG-set are not initialized, default to all CPUs */
309 validate_cpuid(); 1387 if ((cpu_selected_set == 0) && (pkg_selected_set == 0))
310 1388 cpu_selected_set = cpu_present_set;
311 if (cpu != -1) { 1389
312 if (read_only) 1390 /*
313 print_msr(cpu); 1391 * If HWP is being enabled, do it now, so that subsequent operations
314 else 1392 * that access HWP registers can work.
315 update_msr(cpu); 1393 */
316 } else { 1394 if (update_hwp_enable)
317 if (read_only) 1395 for_all_cpus_in_set(cpu_setsize, cpu_selected_set, enable_hwp_on_cpu);
318 for_every_cpu(print_msr); 1396
319 else 1397 /* If HWP present, but disabled, warn and ignore from here forward */
320 for_every_cpu(update_msr); 1398 verify_hwp_is_enabled();
1399
1400 if (req_update_bounds_check())
1401 return -EINVAL;
1402
1403 /* display information only, no updates to settings */
1404 if (!update_epb && !update_turbo && !hwp_update_enabled()) {
1405 if (cpu_selected_set)
1406 for_all_cpus_in_set(cpu_setsize, cpu_selected_set, print_cpu_msrs);
1407
1408 if (has_hwp_request_pkg) {
1409 if (pkg_selected_set == 0)
1410 pkg_selected_set = pkg_present_set;
1411
1412 for_packages(pkg_selected_set, print_pkg_msrs);
1413 }
1414
1415 return 0;
321 } 1416 }
322 1417
1418 /* update CPU set */
1419 if (cpu_selected_set) {
1420 for_all_cpus_in_set(cpu_setsize, cpu_selected_set, update_sysfs);
1421 for_all_cpus_in_set(cpu_setsize, cpu_selected_set, update_cpu_msrs);
1422 } else if (pkg_selected_set)
1423 for_packages(pkg_selected_set, update_hwp_request_pkg);
1424
323 return 0; 1425 return 0;
324} 1426}