author Paul E. McKenney <paulmck@linux.vnet.ibm.com> 2013-06-21 19:37:22 -0400
committer Paul E. McKenney <paulmck@linux.vnet.ibm.com> 2013-08-31 17:43:50 -0400
commit 0edd1b1784cbdad55aca2c1293be018f53c0ab1d
tree 61e17002ce447f0042a65429cfa33c6462f872a1 /kernel/time
parent 217af2a2ffbfc1498d1cf3a89fa478b5632df8f7
nohz_full: Add full-system-idle state machine
This commit adds the state machine that takes the per-CPU idle data
as input and produces a full-system-idle indication as output. This
state machine is driven out of RCU's quiescent-state-forcing
mechanism, which invokes rcu_sysidle_check_cpu() to collect per-CPU
idle state and then rcu_sysidle_report() to drive the state machine.

The full-system-idle state is sampled using rcu_sys_is_idle(), which
also drives the state machine if RCU is idle (and does so by forcing
RCU to become non-idle). This function returns true if all but the
timekeeping CPU (tick_do_timer_cpu) are idle and have been idle long
enough to avoid memory contention on the full_sysidle_state state
variable. The rcu_sysidle_force_exit() may be called externally
to reset the state machine back into non-idle state.

For large systems the state machine is driven out of RCU's
force-quiescent-state logic, which provides good scalability at the
price of millisecond-scale latencies on the transition to
full-system-idle state. This is not so good for battery-powered
systems, which are usually small enough that they don't need to care
about scalability, but which do care deeply about energy efficiency.
Small systems therefore drive the state machine directly out of the
idle-entry code. The number of CPUs in a "small" system is defined by
a new NO_HZ_FULL_SYSIDLE_SMALL Kconfig parameter, which defaults to 8.
Note that this is a build-time definition.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
[ paulmck: Use true and false for boolean constants per Lai Jiangshan. ]
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
[ paulmck: Simplify logic and provide better comments for memory
  barriers, based on review comments and questions by Lai Jiangshan. ]
Diffstat (limited to 'kernel/time')
-rw-r--r-- kernel/time/Kconfig 27
1 files changed, 27 insertions, 0 deletions
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index c7d2fd67799e..3381f098070f 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -157,6 +157,33 @@ config NO_HZ_FULL_SYSIDLE
 
 	  Say N if you are unsure.
 
+config NO_HZ_FULL_SYSIDLE_SMALL
+	int "Number of CPUs above which large-system approach is used"
+	depends on NO_HZ_FULL_SYSIDLE
+	range 1 NR_CPUS
+	default 8
+	help
+	  The full-system idle detection mechanism takes a lazy approach
+	  on large systems, as is required to attain decent scalability.
+	  However, on smaller systems, scalability is not anywhere near as
+	  large a concern as is energy efficiency.  The sysidle subsystem
+	  therefore uses a fast but non-scalable algorithm for small
+	  systems and a lazier but scalable algorithm for large systems.
+	  This Kconfig parameter defines the number of CPUs in the largest
+	  system that will be considered to be "small".
+
+	  The default value will be fine in most cases.  Battery-powered
+	  systems that (1) enable NO_HZ_FULL_SYSIDLE, (2) have larger
+	  numbers of CPUs, and (3) are suffering from battery-lifetime
+	  problems due to long sysidle latencies might wish to experiment
+	  with larger values for this Kconfig parameter.  On the other
+	  hand, they might be even better served by disabling NO_HZ_FULL
+	  entirely, given that NO_HZ_FULL is intended for HPC and
+	  real-time workloads that at present do not tend to be run on
+	  battery-powered systems.
+
+	  Take the default if you are unsure.
+
 config NO_HZ
 	bool "Old Idle dynticks config"
 	depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
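
Reader's note: the help text above says the parameter is the CPU count of the largest system still treated as "small". The following is a minimal user-space sketch of that comparison, not the kernel's code. The helper names (sysidle_is_small, sysidle_pick_approach) and the enum are hypothetical, and the macro is only a stand-in for the build-time Kconfig value with the patch's default of 8.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the build-time Kconfig value; 8 is the patch's default. */
#ifndef CONFIG_NO_HZ_FULL_SYSIDLE_SMALL
#define CONFIG_NO_HZ_FULL_SYSIDLE_SMALL 8
#endif

/* Hypothetical labels for the two approaches the help text describes. */
enum sysidle_approach {
	SYSIDLE_FAST,	/* fast but non-scalable, for small systems */
	SYSIDLE_LAZY,	/* lazier but scalable, for large systems */
};

/*
 * Hypothetical helper: a system is "small" when its CPU count does not
 * exceed the threshold, so a system with exactly 8 CPUs is small under
 * the default.
 */
static bool sysidle_is_small(int nr_cpus)
{
	assert(nr_cpus >= 1);	/* mirrors the Kconfig "range 1 NR_CPUS" */
	return nr_cpus <= CONFIG_NO_HZ_FULL_SYSIDLE_SMALL;
}

/* Hypothetical helper: choose the approach from the CPU count. */
static enum sysidle_approach sysidle_pick_approach(int nr_cpus)
{
	return sysidle_is_small(nr_cpus) ? SYSIDLE_FAST : SYSIDLE_LAZY;
}
```

Under these assumptions, an 8-CPU system selects SYSIDLE_FAST and a 9-CPU system selects SYSIDLE_LAZY. Because the threshold is a build-time constant, changing it means rebuilding with a different value.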