author		Linus Torvalds <torvalds@linux-foundation.org>	2014-04-01 15:48:54 -0400
committer	Linus Torvalds <torvalds@linux-foundation.org>	2014-04-01 15:48:54 -0400
commit		4dedde7c7a18f55180574f934dbc1be84ca0400b (patch)
tree		d7cc511e8ba8ffceadf3f45b9a63395c4e4183c5 /drivers/cpuidle
parent		683b6c6f82a60fabf47012581c2cfbf1b037ab95 (diff)
parent		0ecfe310f4517d7505599be738158087c165be7c (diff)
Merge tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
 "The majority of this material spent some time in linux-next, some of
  it even several weeks. There are a few relatively fresh commits in
  it, but they are mostly fixes and simple cleanups.

  ACPI took the lead this time, both in terms of the number of commits
  and the number of modified lines of code; cpufreq follows and there
  are a few changes in the PM core and in cpuidle too.

  A new feature that already got some LWN.net attention is the device
  PM QoS extension allowing latency tolerance requirements to be
  propagated from leaf devices to their ancestors with hardware
  interfaces for specifying latency tolerance. That should help
  systems with hardware-driven power management to avoid going too far
  with it in cases when there are latency tolerance constraints.

  There also are some significant changes in the ACPI core related to
  the way in which hotplug notifications are handled. They affect PCI
  hotplug (ACPIPHP) and the ACPI dock station code too. The bottom
  line is that all of those notifications now go through the root
  notify handler and are propagated to the interested subsystems by
  means of callbacks instead of having to install a notify handler for
  each device object that we can potentially get hotplug notifications
  for.

  In addition to that, ACPICA will now advertise "Windows 2013"
  compatibility for _OSI, because some systems out there don't work
  correctly if that is not done (some of them don't even boot).

  On the system suspend side of things, all of the device suspend and
  resume callbacks, except for ->prepare() and ->complete(), are now
  going to be executed asynchronously, as that turns out to speed up
  system suspend and resume on some platforms quite significantly, and
  we have a few more optimizations in that area.

  Apart from that, there are some new device IDs and fixes and cleanups
  all over. In particular, the system suspend and resume handling by
  cpufreq should be improved and the cpuidle menu governor should be a
  bit more robust now.

  Specifics:

   - Device PM QoS support for latency tolerance constraints on systems
     with hardware interfaces allowing such constraints to be specified.
     That is necessary to prevent hardware-driven power management from
     becoming overly aggressive on some systems and to prevent power
     management features leading to excessive latencies from being used
     in some cases.

   - Consolidation of the handling of ACPI hotplug notifications for
     device objects. This causes all device hotplug notifications to go
     through the root notify handler (that was executed for all of them
     anyway before), which propagates them to individual subsystems, if
     necessary, by executing callbacks provided by those subsystems
     (those callbacks are associated with struct acpi_device objects
     during device enumeration). As a result, the code in question
     becomes both smaller in size and more straightforward, and none of
     those changes should affect users.

   - ACPICA update, including fixes related to the handling of _PRT in
     cases when it is broken and the addition of "Windows 2013" to the
     list of supported "features" for _OSI (which is necessary to
     support systems that work incorrectly or don't even boot without
     it). Changes from Bob Moore and Lv Zheng.

   - Consolidation of ACPI _OST handling from Jiang Liu.

   - ACPI battery and AC fixes allowing unusual system configurations
     to be handled by that code from Alexander Mezin.

   - New device IDs for the ACPI LPSS driver from Chiau Ee Chew.

   - ACPI fan and thermal optimizations related to system suspend and
     resume from Aaron Lu.

   - Cleanups related to ACPI video from Jean Delvare.

   - Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan
     Tianyu, Paul Bolle, Tomasz Nowicki.

   - Intel RAPL (Running Average Power Limits) driver cleanups from
     Jacob Pan.

   - intel_pstate fixes and cleanups from Dirk Brandewie.

   - cpufreq fixes related to system suspend/resume handling from
     Viresh Kumar.

   - cpufreq core fixes and cleanups from Viresh Kumar, Stratos
     Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches.

   - cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob
     Herring.

   - cpuidle fixes related to the menu governor from Tuukka Tikkanen.

   - cpuidle fix related to coupled CPUs handling from Paul Burton.

   - Asynchronous execution of all device suspend and resume callbacks,
     except for ->prepare and ->complete, during system suspend and
     resume from Chuansheng Liu.

   - Delayed resuming of runtime-suspended devices during system
     suspend for the PCI bus type and ACPI PM domain.

   - New set of PM helper routines to allow device runtime PM callbacks
     to be used during system suspend and resume more easily from Ulf
     Hansson.

   - Assorted fixes and cleanups in the PM core from Geert Uytterhoeven,
     Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella.

   - devfreq fix from Saravana Kannan"

* tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
  PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs
  PM / sleep: Correct whitespace errors in <linux/pm.h>
  intel_pstate: Set core to min P state during core offline
  cpufreq: Add stop CPU callback to cpufreq_driver interface
  cpufreq: Remove unnecessary braces
  cpufreq: Fix checkpatch errors and warnings
  cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs
  MAINTAINERS: Reorder maintainer addresses for PM and ACPI
  PM / Runtime: Update runtime_idle() documentation for return value meaning
  video / output: Drop display output class support
  fujitsu-laptop: Drop unneeded include
  acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL
  ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL
  ACPI / video: fix ACPI_VIDEO dependencies
  cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE}
  cpufreq: Do not allow ->setpolicy drivers to provide ->target
  cpufreq: arm_big_little: set 'physical_cluster' for each CPU
  cpufreq: arm_big_little: make vexpress driver depend on bL core driver
  ACPI / button: Add ACPI Button event via netlink routine
  ACPI: Remove duplicate definitions of PREFIX
  ...
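As an illustration of the new latency tolerance feature: a driver can register a per-device constraint through device PM QoS and let the core propagate it up the device hierarchy. A minimal sketch, assuming the DEV_PM_QOS_LATENCY_TOLERANCE request type introduced by this series; the 50 us value and the function name are hypothetical:

#include <linux/device.h>
#include <linux/pm_qos.h>

static struct dev_pm_qos_request lat_tol_req;

/* Hypothetical driver snippet: request that wakeup latency for this
 * device stay within 50 us.  The PM core propagates the constraint to
 * an ancestor exposing a hardware latency tolerance interface. */
static int example_set_latency_tolerance(struct device *dev)
{
        return dev_pm_qos_add_request(dev, &lat_tol_req,
                                      DEV_PM_QOS_LATENCY_TOLERANCE, 50);
}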
Diffstat (limited to 'drivers/cpuidle')
-rw-r--r--	drivers/cpuidle/cpuidle.c	3
-rw-r--r--	drivers/cpuidle/driver.c	2
-rw-r--r--	drivers/cpuidle/governors/menu.c	75
3 files changed, 47 insertions(+), 33 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 09d05ab262be..cb20fd915be8 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -85,7 +85,8 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 
         time_end = ktime_get();
 
-        local_irq_enable();
+        if (!cpuidle_state_is_coupled(dev, drv, entered_state))
+                local_irq_enable();
 
         diff = ktime_to_us(ktime_sub(time_end, time_start));
         if (diff > INT_MAX)
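For context, the helper added above is a cheap flag test; it keeps interrupts disabled for coupled states, where the coupled-CPU synchronization must finish before IRQs may be re-enabled. Roughly, as implemented in drivers/cpuidle/coupled.c of this kernel (shown here for reference, not part of this patch):

bool cpuidle_state_is_coupled(struct cpuidle_device *dev,
                              struct cpuidle_driver *drv, int state)
{
        return state && (drv->states[state].flags & CPUIDLE_FLAG_COUPLED);
}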
diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
index 06dbe7c86199..136d6a283e0a 100644
--- a/drivers/cpuidle/driver.c
+++ b/drivers/cpuidle/driver.c
@@ -209,7 +209,7 @@ static void poll_idle_init(struct cpuidle_driver *drv)
         state->exit_latency = 0;
         state->target_residency = 0;
         state->power_usage = -1;
-        state->flags = 0;
+        state->flags = CPUIDLE_FLAG_TIME_VALID;
         state->enter = poll_idle;
         state->disabled = false;
 }
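The poll loop measures its own residency, so flagging it with CPUIDLE_FLAG_TIME_VALID lets the menu governor trust the reported time instead of falling back to the timer estimate (see the menu_update() change below). In this kernel the flag is simply bit 0 of the state flags in include/linux/cpuidle.h:

#define CPUIDLE_FLAG_TIME_VALID (0x01) /* is residency time measurable? */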
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index cf7f2f0e4ef5..71b523293354 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -122,9 +122,8 @@ struct menu_device {
         int             last_state_idx;
         int             needs_update;
 
-        unsigned int    expected_us;
+        unsigned int    next_timer_us;
         unsigned int    predicted_us;
-        unsigned int    exit_us;
         unsigned int    bucket;
         unsigned int    correction_factor[BUCKETS];
         unsigned int    intervals[INTERVALS];
@@ -257,7 +256,7 @@ again:
         stddev = int_sqrt(stddev);
         if (((avg > stddev * 6) && (divisor * 4 >= INTERVALS * 3))
                                                         || stddev <= 20) {
-                if (data->expected_us > avg)
+                if (data->next_timer_us > avg)
                         data->predicted_us = avg;
                 return;
         }
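In other words, the detector trusts the average of the recent intervals only when the sample is consistent: either the standard deviation is at most 20 us, or the average exceeds six times the standard deviation while at least three quarters of the INTERVALS (8) recorded samples survived outlier rejection. A hypothetical sample that passes:

/* Hypothetical recent idle intervals, in microseconds */
unsigned int intervals[INTERVALS] = { 190, 210, 205, 195, 200, 198, 204, 201 };
/* avg ~= 200, stddev ~= 6, all 8 samples kept: avg > stddev * 6 holds,
 * so predicted_us is set to avg, provided next_timer_us > avg. */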
@@ -289,7 +288,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
         struct menu_device *data = &__get_cpu_var(menu_devices);
         int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
         int i;
-        int multiplier;
+        unsigned int interactivity_req;
         struct timespec t;
 
         if (data->needs_update) {
@@ -298,7 +297,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
         }
 
         data->last_state_idx = 0;
-        data->exit_us = 0;
 
         /* Special case when user has set very strict latency requirement */
         if (unlikely(latency_req == 0))
@@ -306,13 +304,11 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 
         /* determine the expected residency time, round up */
         t = ktime_to_timespec(tick_nohz_get_sleep_length());
-        data->expected_us =
+        data->next_timer_us =
                 t.tv_sec * USEC_PER_SEC + t.tv_nsec / NSEC_PER_USEC;
 
 
-        data->bucket = which_bucket(data->expected_us);
-
-        multiplier = performance_multiplier();
+        data->bucket = which_bucket(data->next_timer_us);
 
         /*
          * if the correction factor is 0 (eg first time init or cpu hotplug
@@ -326,17 +322,26 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
          * operands are 32 bits.
          * Make sure to round up for half microseconds.
          */
-        data->predicted_us = div_round64((uint64_t)data->expected_us *
+        data->predicted_us = div_round64((uint64_t)data->next_timer_us *
                                          data->correction_factor[data->bucket],
                                          RESOLUTION * DECAY);
 
         get_typical_interval(data);
 
         /*
+         * Performance multiplier defines a minimum predicted idle
+         * duration / latency ratio. Adjust the latency limit if
+         * necessary.
+         */
+        interactivity_req = data->predicted_us / performance_multiplier();
+        if (latency_req > interactivity_req)
+                latency_req = interactivity_req;
+
+        /*
          * We want to default to C1 (hlt), not to busy polling
          * unless the timer is happening really really soon.
          */
-        if (data->expected_us > 5 &&
+        if (data->next_timer_us > 5 &&
             !drv->states[CPUIDLE_DRIVER_STATE_START].disabled &&
                 dev->states_usage[CPUIDLE_DRIVER_STATE_START].disable == 0)
                 data->last_state_idx = CPUIDLE_DRIVER_STATE_START;
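Note that this folds the old per-state filter into the latency limit once: the removed check s->exit_latency * multiplier > data->predicted_us is algebraically the same as s->exit_latency > data->predicted_us / multiplier, which is exactly the new interactivity_req cap on latency_req. A worked example with hypothetical numbers:

/* Hypothetical values */
unsigned int predicted_us = 500;   /* governor's idle duration prediction */
int latency_req = 200;             /* PM QoS latency limit, in us */
unsigned int interactivity_req;

interactivity_req = predicted_us / 10;   /* performance_multiplier() == 10 */
if (latency_req > interactivity_req)
        latency_req = interactivity_req; /* now 50: states with exit_latency
                                            greater than 50 us are skipped */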
@@ -355,11 +360,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
                         continue;
                 if (s->exit_latency > latency_req)
                         continue;
-                if (s->exit_latency * multiplier > data->predicted_us)
-                        continue;
 
                 data->last_state_idx = i;
-                data->exit_us = s->exit_latency;
         }
 
         return data->last_state_idx;
@@ -390,36 +392,47 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
         struct menu_device *data = &__get_cpu_var(menu_devices);
         int last_idx = data->last_state_idx;
-        unsigned int last_idle_us = cpuidle_get_last_residency(dev);
         struct cpuidle_state *target = &drv->states[last_idx];
         unsigned int measured_us;
         unsigned int new_factor;
 
         /*
-         * Ugh, this idle state doesn't support residency measurements, so we
-         * are basically lost in the dark.  As a compromise, assume we slept
-         * for the whole expected time.
+         * Try to figure out how much time passed between entry to low
+         * power state and occurrence of the wakeup event.
+         *
+         * If the entered idle state didn't support residency measurements,
+         * we are basically lost in the dark how much time passed.
+         * As a compromise, assume we slept for the whole expected time.
+         *
+         * Any measured amount of time will include the exit latency.
+         * Since we are interested in when the wakeup began, not when it
+         * was completed, we must subtract the exit latency. However, if
+         * the measured amount of time is less than the exit latency,
+         * assume the state was never reached and the exit latency is 0.
          */
-        if (unlikely(!(target->flags & CPUIDLE_FLAG_TIME_VALID)))
-                last_idle_us = data->expected_us;
+        if (unlikely(!(target->flags & CPUIDLE_FLAG_TIME_VALID))) {
+                /* Use timer value as is */
+                measured_us = data->next_timer_us;
 
-
-        measured_us = last_idle_us;
-
-        /*
-         * We correct for the exit latency; we are assuming here that the
-         * exit latency happens after the event that we're interested in.
-         */
-        if (measured_us > data->exit_us)
-                measured_us -= data->exit_us;
-
+        } else {
+                /* Use measured value */
+                measured_us = cpuidle_get_last_residency(dev);
+
+                /* Deduct exit latency */
+                if (measured_us > target->exit_latency)
+                        measured_us -= target->exit_latency;
+
+                /* Make sure our coefficients do not exceed unity */
+                if (measured_us > data->next_timer_us)
+                        measured_us = data->next_timer_us;
+        }
 
         /* Update our correction ratio */
         new_factor = data->correction_factor[data->bucket];
         new_factor -= new_factor / DECAY;
 
-        if (data->expected_us > 0 && measured_us < MAX_INTERESTING)
-                new_factor += RESOLUTION * measured_us / data->expected_us;
+        if (data->next_timer_us > 0 && measured_us < MAX_INTERESTING)
+                new_factor += RESOLUTION * measured_us / data->next_timer_us;
         else
                 /*
                  * we were idle so long that we count it as a perfect
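To see the decaying average at work, take menu.c's constants (RESOLUTION 1024, DECAY 8; a factor of RESOLUTION * DECAY == 8192 means past sleeps consumed the whole timer interval) and hypothetical numbers:

/* Hypothetical values; constants from menu.c */
unsigned int new_factor = 8192;    /* prior factor: slept full intervals */
unsigned int measured_us = 300;    /* this wakeup came halfway through... */
unsigned int next_timer_us = 600;  /* ...the 600 us to the next timer */

new_factor -= new_factor / 8;                      /* decay history: 7168 */
new_factor += 1024 * measured_us / next_timer_us;  /* new sample: +512 = 7680 */
/* The next prediction becomes next_timer_us * 7680 / 8192, ~94% of the timer. */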
@@ -439,7 +452,7 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
         data->correction_factor[data->bucket] = new_factor;
 
         /* update the repeating-pattern data */
-        data->intervals[data->interval_ptr++] = last_idle_us;
+        data->intervals[data->interval_ptr++] = measured_us;
         if (data->interval_ptr >= INTERVALS)
                 data->interval_ptr = 0;
 }