aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorRafael J. Wysocki <rjw@sisk.pl>2011-07-06 04:51:58 -0400
committerRafael J. Wysocki <rjw@sisk.pl>2011-07-06 04:51:58 -0400
commit1e2ef05bb8cf851a694d38e9170c89e7ff052741 (patch)
treeff1dfccc622374f9039d342cc48d16d6636740ae
parenteea3fc0357eb89d0b2d1af37bdfb83eb4076a542 (diff)
PM: Limit race conditions between runtime PM and system sleep (v2)
One of the roles of the PM core is to prevent different PM callbacks executed for the same device object from racing with each other. Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to succeed during system suspend) runtime PM callbacks may be executed concurrently with system suspend/resume callbacks for the same device. The main reason for commit e8665002477f0278f84f898145b1f141ba26ee26 was that some subsystems and device drivers wanted to use runtime PM helpers, pm_runtime_suspend() and pm_runtime_put_sync() in particular, for carrying out the suspend of devices in their .suspend() callbacks. However, as it's been determined recently, there are multiple reasons not to do so, inlcuding: * The caller really doesn't control the runtime PM usage counters, because user space can access them through sysfs and effectively block runtime PM. That means using pm_runtime_suspend() or pm_runtime_get_sync() to suspend devices during system suspend may or may not work. * If a driver calls pm_runtime_suspend() from its .suspend() callback, it causes the subsystem's .runtime_suspend() callback to be executed, which leads to the call sequence: subsys->suspend(dev) driver->suspend(dev) pm_runtime_suspend(dev) subsys->runtime_suspend(dev) recursive from the subsystem's point of view. For some subsystems that may actually work (e.g. the platform bus type), but for some it will fail in a rather spectacular fashion (e.g. PCI). In each case it means a layering violation. * Both the subsystem and the driver can provide .suspend_noirq() callbacks for system suspend that can do whatever the .runtime_suspend() callbacks do just fine, so it really isn't necessary to call pm_runtime_suspend() during system suspend. * The runtime PM's handling of wakeup devices is usually different from the system suspend's one, so .runtime_suspend() may simply be inappropriate for system suspend. * System suspend is supposed to work even if CONFIG_PM_RUNTIME is unset. * The runtime PM workqueue is frozen before system suspend, so if whatever the driver is going to do during system suspend depends on it, that simply won't work. Still, there is a good reason to allow pm_runtime_resume() to succeed during system suspend and resume (for instance, some subsystems and device drivers may legitimately use it to ensure that their devices are in full-power states before suspending them). Moreover, there is no reason to prevent runtime PM callbacks from being executed in parallel with the system suspend/resume .prepare() and .complete() callbacks and the code removed by commit e8665002477f0278f84f898145b1f141ba26ee26 went too far in this respect. On the other hand, runtime PM callbacks, including .runtime_resume(), must not be executed during system suspend's "late" stage of suspending devices and during system resume's "early" device resume stage. Taking all of the above into consideration, make the PM core acquire a runtime PM reference to every device and resume it if there's a runtime PM resume request pending right before executing the subsystem-level .suspend() callback for it. Make the PM core drop references to all devices right after executing the subsystem-level .resume() callbacks for them. Additionally, make the PM core disable the runtime PM framework for all devices during system suspend, after executing the subsystem-level .suspend() callbacks for them, and enable the runtime PM framework for all devices during system resume, right before executing the subsystem-level .resume() callbacks for them. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Kevin Hilman <khilman@ti.com>
-rw-r--r--Documentation/power/runtime_pm.txt21
-rw-r--r--drivers/base/power/main.c35
2 files changed, 44 insertions, 12 deletions
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index 0ec3d610fc9a..d50dd1ab590d 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -583,6 +583,13 @@ this is:
583 pm_runtime_set_active(dev); 583 pm_runtime_set_active(dev);
584 pm_runtime_enable(dev); 584 pm_runtime_enable(dev);
585 585
586The PM core always increments the run-time usage counter before calling the
587->suspend() callback and decrements it after calling the ->resume() callback.
588Hence disabling run-time PM temporarily like this will not cause any runtime
589suspend attempts to be permanently lost. If the usage count goes to zero
590following the return of the ->resume() callback, the ->runtime_idle() callback
591will be invoked as usual.
592
586On some systems, however, system sleep is not entered through a global firmware 593On some systems, however, system sleep is not entered through a global firmware
587or hardware operation. Instead, all hardware components are put into low-power 594or hardware operation. Instead, all hardware components are put into low-power
588states directly by the kernel in a coordinated way. Then, the system sleep 595states directly by the kernel in a coordinated way. Then, the system sleep
@@ -595,6 +602,20 @@ place (in particular, if the system is not waking up from hibernation), it may
595be more efficient to leave the devices that had been suspended before the system 602be more efficient to leave the devices that had been suspended before the system
596suspend began in the suspended state. 603suspend began in the suspended state.
597 604
605The PM core does its best to reduce the probability of race conditions between
606the runtime PM and system suspend/resume (and hibernation) callbacks by carrying
607out the following operations:
608
609 * During system suspend it calls pm_runtime_get_noresume() and
610 pm_runtime_barrier() for every device right before executing the
611 subsystem-level .suspend() callback for it. In addition to that it calls
612 pm_runtime_disable() for every device right after executing the
613 subsystem-level .suspend() callback for it.
614
615 * During system resume it calls pm_runtime_enable() and pm_runtime_put_sync()
616 for every device right before and right after executing the subsystem-level
617 .resume() callback for it, respectively.
618
5987. Generic subsystem callbacks 6197. Generic subsystem callbacks
599 620
600Subsystems may wish to conserve code space by using the set of generic power 621Subsystems may wish to conserve code space by using the set of generic power
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 85b591a5429a..a85459126bc6 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -505,6 +505,7 @@ static int legacy_resume(struct device *dev, int (*cb)(struct device *dev))
505static int device_resume(struct device *dev, pm_message_t state, bool async) 505static int device_resume(struct device *dev, pm_message_t state, bool async)
506{ 506{
507 int error = 0; 507 int error = 0;
508 bool put = false;
508 509
509 TRACE_DEVICE(dev); 510 TRACE_DEVICE(dev);
510 TRACE_RESUME(0); 511 TRACE_RESUME(0);
@@ -521,6 +522,9 @@ static int device_resume(struct device *dev, pm_message_t state, bool async)
521 if (!dev->power.is_suspended) 522 if (!dev->power.is_suspended)
522 goto Unlock; 523 goto Unlock;
523 524
525 pm_runtime_enable(dev);
526 put = true;
527
524 if (dev->pm_domain) { 528 if (dev->pm_domain) {
525 pm_dev_dbg(dev, state, "power domain "); 529 pm_dev_dbg(dev, state, "power domain ");
526 error = pm_op(dev, &dev->pm_domain->ops, state); 530 error = pm_op(dev, &dev->pm_domain->ops, state);
@@ -563,6 +567,10 @@ static int device_resume(struct device *dev, pm_message_t state, bool async)
563 complete_all(&dev->power.completion); 567 complete_all(&dev->power.completion);
564 568
565 TRACE_RESUME(error); 569 TRACE_RESUME(error);
570
571 if (put)
572 pm_runtime_put_sync(dev);
573
566 return error; 574 return error;
567} 575}
568 576
@@ -843,16 +851,22 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
843 int error = 0; 851 int error = 0;
844 852
845 dpm_wait_for_children(dev, async); 853 dpm_wait_for_children(dev, async);
846 device_lock(dev);
847 854
848 if (async_error) 855 if (async_error)
849 goto Unlock; 856 return 0;
857
858 pm_runtime_get_noresume(dev);
859 if (pm_runtime_barrier(dev) && device_may_wakeup(dev))
860 pm_wakeup_event(dev, 0);
850 861
851 if (pm_wakeup_pending()) { 862 if (pm_wakeup_pending()) {
863 pm_runtime_put_sync(dev);
852 async_error = -EBUSY; 864 async_error = -EBUSY;
853 goto Unlock; 865 return 0;
854 } 866 }
855 867
868 device_lock(dev);
869
856 if (dev->pm_domain) { 870 if (dev->pm_domain) {
857 pm_dev_dbg(dev, state, "power domain "); 871 pm_dev_dbg(dev, state, "power domain ");
858 error = pm_op(dev, &dev->pm_domain->ops, state); 872 error = pm_op(dev, &dev->pm_domain->ops, state);
@@ -890,12 +904,15 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
890 End: 904 End:
891 dev->power.is_suspended = !error; 905 dev->power.is_suspended = !error;
892 906
893 Unlock:
894 device_unlock(dev); 907 device_unlock(dev);
895 complete_all(&dev->power.completion); 908 complete_all(&dev->power.completion);
896 909
897 if (error) 910 if (error) {
911 pm_runtime_put_sync(dev);
898 async_error = error; 912 async_error = error;
913 } else if (dev->power.is_suspended) {
914 __pm_runtime_disable(dev, false);
915 }
899 916
900 return error; 917 return error;
901} 918}
@@ -1035,13 +1052,7 @@ int dpm_prepare(pm_message_t state)
1035 get_device(dev); 1052 get_device(dev);
1036 mutex_unlock(&dpm_list_mtx); 1053 mutex_unlock(&dpm_list_mtx);
1037 1054
1038 pm_runtime_get_noresume(dev); 1055 error = device_prepare(dev, state);
1039 if (pm_runtime_barrier(dev) && device_may_wakeup(dev))
1040 pm_wakeup_event(dev, 0);
1041
1042 pm_runtime_put_sync(dev);
1043 error = pm_wakeup_pending() ?
1044 -EBUSY : device_prepare(dev, state);
1045 1056
1046 mutex_lock(&dpm_list_mtx); 1057 mutex_lock(&dpm_list_mtx);
1047 if (error) { 1058 if (error) {