From c125e96f044427f38d106fab7bc5e4a5e6a18262 Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki"
Date: Mon, 5 Jul 2010 22:43:53 +0200
Subject: PM: Make it possible to avoid races between wakeup and system sleep

One of the arguments during the suspend blockers discussion was that
the mainline kernel didn't contain any mechanisms making it possible
to avoid races between wakeup and system suspend.

Generally, there are two problems in that area.  First, if a wakeup
event occurs exactly when /sys/power/state is being written to, it
may be delivered to user space right before the freezer kicks in, so
the user space consumer of the event may not be able to process it
before the system is suspended.  Second, if a wakeup event occurs
after user space has been frozen, it is not generally guaranteed that
the ongoing transition of the system into a sleep state will be
aborted.

To address these issues introduce a new global sysfs attribute,
/sys/power/wakeup_count, associated with a running counter of wakeup
events and three helper functions, pm_stay_awake(), pm_relax(), and
pm_wakeup_event(), that may be used by kernel subsystems to control
the behavior of this attribute and to request the PM core to abort
system transitions into a sleep state already in progress.

The /sys/power/wakeup_count file may be read from or written to by
user space.  Reads will always succeed (unless interrupted by a
signal) and return the current value of the wakeup events counter.
Writes, however, will only succeed if the written number is equal to
the current value of the wakeup events counter.  If a write is
successful, it will cause the kernel to save the current value of the
wakeup events counter and to abort the subsequent system transition
into a sleep state if any wakeup events are reported after the write
has returned.

[The assumption is that before writing to /sys/power/state user space
will first read from /sys/power/wakeup_count.  Next, user space
consumers of wakeup events will have a chance to acknowledge or veto
the upcoming system transition to a sleep state.  Finally, if the
transition is allowed to proceed, /sys/power/wakeup_count will be
written to and if that succeeds, /sys/power/state will be written to
as well.  Still, if any wakeup events are reported to the PM core by
kernel subsystems after that point, the transition will be aborted.]

Additionally, put a wakeup events counter into struct dev_pm_info and
make these per-device wakeup event counters available via sysfs,
so that it's possible to check the activity of various wakeup event
sources within the kernel.

To illustrate how subsystems can use pm_wakeup_event(), make the
low-level PCI runtime PM wakeup-handling code use it.

Signed-off-by: Rafael J. Wysocki
Acked-by: Jesse Barnes
Acked-by: Greg Kroah-Hartman
Acked-by: markgross
Reviewed-by: Alan Stern
---
 include/linux/pm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

(limited to 'include/linux/pm.h')

diff --git a/include/linux/pm.h b/include/linux/pm.h
index 8e258c727971..b417fc46f3fc 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -457,6 +457,7 @@ struct dev_pm_info {
 #ifdef CONFIG_PM_SLEEP
 	struct list_head entry;
 	struct completion completion;
+	unsigned long wakeup_count;
 #endif
 #ifdef CONFIG_PM_RUNTIME
 	struct timer_list suspend_timer;
@@ -552,6 +553,11 @@ extern void __suspend_report_result(const char *function, void *fn, int ret);
 	} while (0)
 
 extern void device_pm_wait_for_dev(struct device *sub, struct device *dev);
+
+/* drivers/base/power/wakeup.c */
+extern void pm_wakeup_event(struct device *dev, unsigned int msec);
+extern void pm_stay_awake(struct device *dev);
+extern void pm_relax(void);
 #else /* !CONFIG_PM_SLEEP */
 
 #define device_pm_lock() do {} while (0)
@@ -565,6 +571,10 @@ static inline int dpm_suspend_start(pm_message_t state)
 #define suspend_report_result(fn, ret) do {} while (0)
 
 static inline void device_pm_wait_for_dev(struct device *a, struct device *b) {}
+
+static inline void pm_wakeup_event(struct device *dev, unsigned int msec) {}
+static inline void pm_stay_awake(struct device *dev) {}
+static inline void pm_relax(void) {}
 #endif /* !CONFIG_PM_SLEEP */
 
 /* How to reorder dpm_list after device_move() */
-- 
cgit v1.2.2

From 8d4b9d1bfef117862a2889dec4dac227068544c9 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven
Date: Mon, 19 Jul 2010 02:01:06 +0200
Subject: PM / Runtime: Add runtime PM statistics (v3)

In order for PowerTOP to be able to report how well the new runtime PM
is working for the various drivers, the kernel needs to export some
basic statistics in sysfs.

This patch adds two sysfs files in the runtime PM domain that expose
the total time a device has been active, and the time a device has
been suspended.

With this PowerTOP can compute the activity percentage

Active %age = 100 * (delta active) / (delta active + delta suspended)

and present the information to the user.

I've written the PowerTOP code (slated for version 1.12) already, and
the output looks like this:

Runtime Device Power Management statistics
Active  Device name
 10.0%  06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8101E/RTL8102E PCI Express Fast Ethernet controller

[version 2: fix stat update bugs noticed by Alan Stern]
[version 3: rebase to -next and move the sysfs declaration]

Signed-off-by: Arjan van de Ven
Signed-off-by: Rafael J. Wysocki
---
 include/linux/pm.h | 6 ++++++
 1 file changed, 6 insertions(+)

(limited to 'include/linux/pm.h')

diff --git a/include/linux/pm.h b/include/linux/pm.h
index b417fc46f3fc..52e8c55ff314 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -477,9 +477,15 @@ struct dev_pm_info {
 	enum rpm_request request;
 	enum rpm_status runtime_status;
 	int runtime_error;
+	unsigned long active_jiffies;
+	unsigned long suspended_jiffies;
+	unsigned long accounting_timestamp;
 #endif
 };
 
+extern void update_pm_runtime_accounting(struct device *dev);
+
+
 /*
  * The PM_EVENT_ messages are also used by drivers implementing the legacy
  * suspend framework, based on the ->suspend() and ->resume() callbacks common
-- 
cgit v1.2.2
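
Note on the first patch (wakeup_count semantics): the read/write rules
described in its changelog can be modelled in a few lines of plain C.
This is only a user-space sketch of the stated behaviour, not the code
in drivers/base/power/wakeup.c. The struct and every function name
below (wakeup_model, model_read_count, model_write_count,
model_report_event, model_must_abort) are hypothetical and exist only
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the changelog describes. */
struct wakeup_model {
	unsigned long event_count;	/* running counter of wakeup events */
	unsigned long saved_count;	/* value saved by a successful write */
	bool check_enabled;		/* set once a write has succeeded */
};

/* Models a read of /sys/power/wakeup_count: returns the current counter. */
static unsigned long model_read_count(const struct wakeup_model *m)
{
	assert(m != NULL);
	return m->event_count;
}

/*
 * Models a write to /sys/power/wakeup_count: it succeeds only if the
 * written value equals the current counter, and then saves that value.
 */
static bool model_write_count(struct wakeup_model *m, unsigned long value)
{
	assert(m != NULL);
	if (value != m->event_count)
		return false;
	m->saved_count = value;
	m->check_enabled = true;
	return true;
}

/* Models a kernel subsystem reporting one wakeup event. */
static void model_report_event(struct wakeup_model *m)
{
	assert(m != NULL);
	m->event_count++;
}

/*
 * Models the abort decision: true if an event was reported after the
 * last successful write.
 */
static bool model_must_abort(const struct wakeup_model *m)
{
	assert(m != NULL);
	return m->check_enabled && m->event_count != m->saved_count;
}
```

In this model a write of a stale value (one read before a later event
was reported) is refused, so user space has to read the counter again
and let its event consumers run before retrying. An event reported
after a successful write makes model_must_abort() return true, which
corresponds to the aborted transition in the changelog.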
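
Note on the second patch (activity percentage): the formula quoted in
its changelog can be computed from two snapshots of the active and
suspended counters. The sketch below is an assumption about how a
consumer might do that, not PowerTOP's actual code. The rpm_sample
struct and the rpm_active_percent() helper are hypothetical names, and
the unit of the counters does not matter as long as both use the same
one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the two exported counters (same unit). */
struct rpm_sample {
	unsigned long active;		/* total time spent active */
	unsigned long suspended;	/* total time spent suspended */
};

/*
 * Active %age = 100 * (delta active) / (delta active + delta suspended)
 *
 * Returns -1.0 when no time elapsed between the two snapshots, so the
 * caller can tell "no data" apart from "0% active".
 */
static double rpm_active_percent(const struct rpm_sample *prev,
				 const struct rpm_sample *cur)
{
	unsigned long d_active, d_suspended, total;

	assert(prev != NULL && cur != NULL);

	/* Unsigned subtraction stays correct across counter wraparound. */
	d_active = cur->active - prev->active;
	d_suspended = cur->suspended - prev->suspended;
	total = d_active + d_suspended;

	if (total == 0)
		return -1.0;

	return 100.0 * (double)d_active / (double)total;
}
```

For example, snapshots of {active 100, suspended 900} and
{active 200, suspended 1800} give deltas of 100 and 900, so the helper
returns 10.0, matching the "10.0%" figure in the sample output above.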