| author | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2012-04-19 15:20:14 -0400 |
|---|---|---|
| committer | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2012-04-24 23:54:52 -0400 |
| commit | 8932a63d5edb02f714d50c26583152fe0a97a69c | |
| tree | ec71159908f1a78eb21e736d284ffe7ed7584b6c /init | |
| parent | d8169d4c369e8aa2fda10df705a4957331b5a4db | |
rcu: Reduce cache-miss initialization latencies for large systems

Commit #0209f649 (rcu: limit rcu_node leaf-level fanout) set an upper
limit of 16 on the leaf-level fanout for the rcu_node tree. This was
needed to reduce lock contention that was induced by the synchronization
of scheduling-clock interrupts, which was in turn needed to improve
energy efficiency for moderate-sized lightly loaded servers.

However, reducing the leaf-level fanout means that there are more
leaf-level rcu_node structures in the tree, which in turn means that
RCU's grace-period initialization incurs more cache misses. This is
not a problem on moderate-sized servers with only a few tens of CPUs,
but becomes a major source of real-time latency spikes on systems with
many hundreds of CPUs. In addition, the workloads running on these large
systems tend to be CPU-bound, which eliminates the energy-efficiency
advantages of synchronizing scheduling-clock interrupts. Therefore,
these systems need maximal values for the rcu_node leaf-level fanout.

This commit addresses this problem by introducing a new kernel
configuration parameter named RCU_FANOUT_LEAF that directly controls the
leaf-level fanout. This parameter defaults to 16 to handle the common
case of moderate-sized, lightly loaded servers, but may be set higher on
larger systems.

Reported-by: Mike Galbraith <efault@gmx.de>
Reported-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
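
The trade-off described in the commit message can be made concrete with a little arithmetic. The following is a minimal sketch, not kernel code: the helper names `rcu_leaf_count()` and `rcu_node_count()` are hypothetical, and the sketch assumes a balanced tree in which each leaf covers up to `fanout_leaf` CPUs and each interior node covers up to `fanout` children. It only illustrates why a smaller leaf-level fanout yields more rcu_node structures, each of which may cost a cache miss during grace-period initialization.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): number of leaf-level
 * rcu_node structures needed to cover ncpus CPUs when each leaf
 * handles at most fanout_leaf CPUs.
 */
static unsigned long rcu_leaf_count(unsigned long ncpus,
				    unsigned long fanout_leaf)
{
	/* The Kconfig entry restricts the leaf fanout to values >= 2. */
	assert(ncpus >= 1 && fanout_leaf >= 2);
	return (ncpus + fanout_leaf - 1) / fanout_leaf;	/* round up */
}

/*
 * Hypothetical helper (not a kernel function): total number of
 * rcu_node structures in a balanced tree, built bottom-up from the
 * leaves until a single root remains. Assumes every interior node
 * has at most fanout children.
 */
static unsigned long rcu_node_count(unsigned long ncpus,
				    unsigned long fanout_leaf,
				    unsigned long fanout)
{
	unsigned long level = rcu_leaf_count(ncpus, fanout_leaf);
	unsigned long total = level;

	assert(fanout >= 2);
	while (level > 1) {
		level = (level + fanout - 1) / fanout;	/* parents of this level */
		total += level;
	}
	return total;
}
```

Under these assumptions, a 4096-CPU system with an interior fanout of 64 needs 256 leaves (261 nodes in total) at a leaf fanout of 16, but only 64 leaves (65 nodes in total) at a leaf fanout of 64, so grace-period initialization touches roughly a quarter as many structures.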
Diffstat (limited to 'init')
| -rw-r--r-- | init/Kconfig | 27 |
1 files changed, 27 insertions, 0 deletions
diff --git a/init/Kconfig b/init/Kconfig
index 85c6870ed476..6d18ef8071b5 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -458,6 +458,33 @@ config RCU_FANOUT
 	  Select a specific number if testing RCU itself.
 	  Take the default if unsure.
 
+config RCU_FANOUT_LEAF
+	int "Tree-based hierarchical RCU leaf-level fanout value"
+	range 2 RCU_FANOUT if 64BIT
+	range 2 RCU_FANOUT if !64BIT
+	depends on TREE_RCU || TREE_PREEMPT_RCU
+	default 16
+	help
+	  This option controls the leaf-level fanout of hierarchical
+	  implementations of RCU, and allows trading off cache misses
+	  against lock contention.  Systems that synchronize their
+	  scheduling-clock interrupts for energy-efficiency reasons will
+	  want the default because the smaller leaf-level fanout keeps
+	  lock contention levels acceptably low.  Very large systems
+	  (hundreds or thousands of CPUs) will instead want to set this
+	  value to the maximum value possible in order to reduce the
+	  number of cache misses incurred during RCU's grace-period
+	  initialization.  These systems tend to run CPU-bound, and thus
+	  are not helped by synchronized interrupts, and thus tend to
+	  skew them, which reduces lock contention enough that large
+	  leaf-level fanouts work well.
+
+	  Select a specific number if testing RCU itself.
+
+	  Select the maximum permissible value for large systems.
+
+	  Take the default if unsure.
+
 config RCU_FANOUT_EXACT
 	bool "Disable tree-based hierarchical RCU auto-balancing"
 	depends on TREE_RCU || TREE_PREEMPT_RCU
