author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-10-01 13:16:42 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-10-01 13:16:42 -0400 |
commit | 620e77533f29796df7aff861e79bd72e08554ebb (patch) | |
tree | 844afce2333549bc5b8d7dc87a4875b9216a0023 /Documentation | |
parent | 6977b4c7736e8809b7959c66875a16c0bbcf2152 (diff) | |
parent | fa34da708cbe1e2d9a2ee7fc68ea8fccbf095d12 (diff) |
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:
0. 'idle RCU':
Adds RCU APIs that allow non-idle tasks to enter RCU idle mode and
provides x86 code to make use of them, allowing RCU to treat
user-mode execution as an extended quiescent state when the new
RCU_USER_QS kernel configuration parameter is specified. (Work is
in progress to port this to a few other architectures, but is not
part of this series.)
1. A fix for a latent bug that has been in RCU ever since the addition
of CPU stall warnings. This bug results in false-positive stall
warnings, but thus far only on embedded systems with severely
cut-down userspace configurations.
2. Further reductions in latency spikes for huge systems, along with
additional boot-time adaptation to the actual hardware.
This is a large change, as it moves RCU grace-period initialization
and cleanup, along with quiescent-state forcing, from softirq to a
kthread. However, it appears to be in quite good shape (famous
last words).
3. Updates to documentation and rcutorture, the latter category
including keeping statistics on CPU-hotplug latencies and fixing
some initialization-time races.
4. CPU-hotplug fixes and improvements.
5. Idle-loop fixes that were omitted on an earlier submission.
6. Miscellaneous fixes and improvements
In certain RCU configurations new kernel threads will show up (rcu_bh,
rcu_sched), showing RCU processing overhead.
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (90 commits)
rcu: Apply micro-optimization and int/bool fixes to RCU's idle handling
rcu: Userspace RCU extended QS selftest
x86: Exit RCU extended QS on notify resume
x86: Use the new schedule_user API on userspace preemption
rcu: Exit RCU extended QS on user preemption
rcu: Exit RCU extended QS on kernel preemption after irq/exception
x86: Exception hooks for userspace RCU extended QS
x86: Unspaghettize do_general_protection()
x86: Syscall hooks for userspace RCU extended QS
rcu: Switch task's syscall hooks on context switch
rcu: Ignore userspace extended quiescent state by default
rcu: Allow rcu_user_enter()/exit() to nest
rcu: Settle config for userspace extended quiescent state
rcu: Make RCU_FAST_NO_HZ handle adaptive ticks
rcu: New rcu_user_enter_after_irq() and rcu_user_exit_after_irq() APIs
rcu: New rcu_user_enter() and rcu_user_exit() APIs
ia64: Add missing RCU idle APIs on idle loop
xtensa: Add missing RCU idle APIs on idle loop
score: Add missing RCU idle APIs on idle loop
parisc: Add missing RCU idle APIs on idle loop
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/RCU/checklist.txt | 6 |
-rw-r--r-- | Documentation/RCU/stallwarn.txt | 16 |
-rw-r--r-- | Documentation/RCU/trace.txt | 43 |
-rw-r--r-- | Documentation/RCU/whatisRCU.txt | 9 |
-rw-r--r-- | Documentation/kernel-parameters.txt | 11 |
5 files changed, 48 insertions, 37 deletions
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index fc103d7a0474..cdb20d41a44a 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt | |||
@@ -310,6 +310,12 @@ over a rather long period of time, but improvements are always welcome! | |||
310 | code under the influence of preempt_disable(), you instead | 310 | code under the influence of preempt_disable(), you instead |
311 | need to use synchronize_irq() or synchronize_sched(). | 311 | need to use synchronize_irq() or synchronize_sched(). |
312 | 312 | ||
313 | This same limitation also applies to synchronize_rcu_bh() | ||
314 | and synchronize_srcu(), as well as to the asynchronous and | ||
315 | expedited forms of the three primitives, namely call_rcu(), | ||
316 | call_rcu_bh(), call_srcu(), synchronize_rcu_expedited(), | ||
317 | synchronize_rcu_bh_expedited(), and synchronize_srcu_expedited(). | ||
318 | |||
313 | 12. Any lock acquired by an RCU callback must be acquired elsewhere | 319 | 12. Any lock acquired by an RCU callback must be acquired elsewhere |
314 | with softirq disabled, e.g., via spin_lock_irqsave(), | 320 | with softirq disabled, e.g., via spin_lock_irqsave(), |
315 | spin_lock_bh(), etc. Failing to disable irq on a given | 321 | spin_lock_bh(), etc. Failing to disable irq on a given |
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 523364e4e1f1..1927151b386b 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt | |||
@@ -99,7 +99,7 @@ In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is | |||
99 | printed: | 99 | printed: |
100 | 100 | ||
101 | INFO: rcu_preempt detected stall on CPU | 101 | INFO: rcu_preempt detected stall on CPU |
102 | 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer=-1 | 102 | 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer not pending |
103 | (t=65000 jiffies) | 103 | (t=65000 jiffies) |
104 | 104 | ||
105 | The "(64628 ticks this GP)" indicates that this CPU has taken more | 105 | The "(64628 ticks this GP)" indicates that this CPU has taken more |
@@ -116,13 +116,13 @@ number between the two "/"s is the value of the nesting, which will | |||
116 | be a small positive number if in the idle loop and a very large positive | 116 | be a small positive number if in the idle loop and a very large positive |
117 | number (as shown above) otherwise. | 117 | number (as shown above) otherwise. |
118 | 118 | ||
119 | For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the | 119 | For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the CPU is |
120 | CPU is not in the process of trying to force itself into dyntick-idle | 120 | not in the process of trying to force itself into dyntick-idle state, the |
121 | state, the "." indicates that the CPU has not given up forcing RCU | 121 | "." indicates that the CPU has not given up forcing RCU into dyntick-idle |
122 | into dyntick-idle mode (it would be "H" otherwise), and the "timer=-1" | 122 | mode (it would be "H" otherwise), and the "timer not pending" indicates |
123 | indicates that the CPU has not recented forced RCU into dyntick-idle | 123 | that the CPU has not recently forced RCU into dyntick-idle mode (it |
124 | mode (it would otherwise indicate the number of microseconds remaining | 124 | would otherwise indicate the number of microseconds remaining in this |
125 | in this forced state). | 125 | forced state). |
126 | 126 | ||
127 | 127 | ||
128 | Multiple Warnings From One Stall | 128 | Multiple Warnings From One Stall |
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index f6f15ce39903..672d19083252 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt | |||
@@ -333,23 +333,23 @@ o Each element of the form "1/1 0:127 ^0" represents one struct | |||
333 | The output of "cat rcu/rcu_pending" looks as follows: | 333 | The output of "cat rcu/rcu_pending" looks as follows: |
334 | 334 | ||
335 | rcu_sched: | 335 | rcu_sched: |
336 | 0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 | 336 | 0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nn=146741 |
337 | 1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 | 337 | 1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nn=155792 |
338 | 2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 | 338 | 2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nn=136629 |
339 | 3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723 | 339 | 3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nn=137723 |
340 | 4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110 | 340 | 4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nn=123110 |
341 | 5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456 | 341 | 5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nn=137456 |
342 | 6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834 | 342 | 6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nn=120834 |
343 | 7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888 | 343 | 7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nn=144888 |
344 | rcu_bh: | 344 | rcu_bh: |
345 | 0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314 | 345 | 0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nn=145314 |
346 | 1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180 | 346 | 1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nn=143180 |
347 | 2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936 | 347 | 2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nn=117936 |
348 | 3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863 | 348 | 3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nn=134863 |
349 | 4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671 | 349 | 4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nn=110671 |
350 | 5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235 | 350 | 5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nn=133235 |
351 | 6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921 | 351 | 6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nn=110921 |
352 | 7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542 | 352 | 7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nn=118542 |
353 | 353 | ||
354 | As always, this is once again split into "rcu_sched" and "rcu_bh" | 354 | As always, this is once again split into "rcu_sched" and "rcu_bh" |
355 | portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional | 355 | portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional |
@@ -377,17 +377,6 @@ o "gpc" is the number of times that an old grace period had | |||
377 | o "gps" is the number of times that a new grace period had started, | 377 | o "gps" is the number of times that a new grace period had started, |
378 | but this CPU was not yet aware of it. | 378 | but this CPU was not yet aware of it. |
379 | 379 | ||
380 | o "nf" is the number of times that this CPU suspected that the | ||
381 | current grace period had run for too long, and thus needed to | ||
382 | be forced. | ||
383 | |||
384 | Please note that "forcing" consists of sending resched IPIs | ||
385 | to holdout CPUs. If that CPU really still is in an old RCU | ||
386 | read-side critical section, then we really do have to wait for it. | ||
387 | The assumption behing "forcing" is that the CPU is not still in | ||
388 | an old RCU read-side critical section, but has not yet responded | ||
389 | for some other reason. | ||
390 | |||
391 | o "nn" is the number of times that this CPU needed nothing. Alert | 380 | o "nn" is the number of times that this CPU needed nothing. Alert |
392 | readers will note that the rcu "nn" number for a given CPU very | 381 | readers will note that the rcu "nn" number for a given CPU very |
393 | closely matches the rcu_bh "np" number for that same CPU. This | 382 | closely matches the rcu_bh "np" number for that same CPU. This |
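The trace.txt hunks above drop the "nf" field from each per-CPU line of "cat rcu/rcu_pending", so any script that parses that output needs the shorter layout. The following is a minimal userspace sketch, not kernel code: the struct and function names are hypothetical, and the field order is assumed to be exactly the one shown in the updated sample output.

```c
#include <stdio.h>

/* Hypothetical container for one per-CPU line of rcu/rcu_pending. */
struct rcu_pending_line {
	int cpu;
	unsigned long np, qsp, rpq, cbr, cng, gpc, gps, nn;
};

/*
 * Parse one per-CPU line in the layout shown after this change
 * (no "nf=" field). Returns 0 on success and -1 if the line does
 * not match, which includes lines that still carry "nf=".
 */
static int parse_rcu_pending_line(const char *line,
				  struct rcu_pending_line *out)
{
	int n = sscanf(line,
		       "%d np=%lu qsp=%lu rpq=%lu cbr=%lu cng=%lu gpc=%lu gps=%lu nn=%lu",
		       &out->cpu, &out->np, &out->qsp, &out->rpq, &out->cbr,
		       &out->cng, &out->gpc, &out->gps, &out->nn);

	/* All nine conversions must succeed for the line to count as valid. */
	return n == 9 ? 0 : -1;
}
```

A line in the old layout fails at the literal "nn=" because "nf=" appears there instead, so the function reports a mismatch rather than silently assigning the wrong counter.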
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 69ee188515e7..bf0f6de2aa00 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt | |||
@@ -873,7 +873,7 @@ d. Do you need to treat NMI handlers, hardirq handlers, | |||
873 | and code segments with preemption disabled (whether | 873 | and code segments with preemption disabled (whether |
874 | via preempt_disable(), local_irq_save(), local_bh_disable(), | 874 | via preempt_disable(), local_irq_save(), local_bh_disable(), |
875 | or some other mechanism) as if they were explicit RCU readers? | 875 | or some other mechanism) as if they were explicit RCU readers? |
876 | If so, you need RCU-sched. | 876 | If so, RCU-sched is the only choice that will work for you. |
877 | 877 | ||
878 | e. Do you need RCU grace periods to complete even in the face | 878 | e. Do you need RCU grace periods to complete even in the face |
879 | of softirq monopolization of one or more of the CPUs? For | 879 | of softirq monopolization of one or more of the CPUs? For |
@@ -884,7 +884,12 @@ f. Is your workload too update-intensive for normal use of | |||
884 | RCU, but inappropriate for other synchronization mechanisms? | 884 | RCU, but inappropriate for other synchronization mechanisms? |
885 | If so, consider SLAB_DESTROY_BY_RCU. But please be careful! | 885 | If so, consider SLAB_DESTROY_BY_RCU. But please be careful! |
886 | 886 | ||
887 | g. Otherwise, use RCU. | 887 | g. Do you need read-side critical sections that are respected |
888 | even though they are in the middle of the idle loop, during | ||
889 | user-mode execution, or on an offlined CPU? If so, SRCU is the | ||
890 | only choice that will work for you. | ||
891 | |||
892 | h. Otherwise, use RCU. | ||
888 | 893 | ||
889 | Of course, this all assumes that you have determined that RCU is in fact | 894 | Of course, this all assumes that you have determined that RCU is in fact |
890 | the right tool for your job. | 895 | the right tool for your job. |
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index ad7e2e5088c1..55ada0471f93 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt | |||
@@ -2385,6 +2385,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted. | |||
2385 | rcutree.rcu_cpu_stall_timeout= [KNL,BOOT] | 2385 | rcutree.rcu_cpu_stall_timeout= [KNL,BOOT] |
2386 | Set timeout for RCU CPU stall warning messages. | 2386 | Set timeout for RCU CPU stall warning messages. |
2387 | 2387 | ||
2388 | rcutree.jiffies_till_first_fqs= [KNL,BOOT] | ||
2389 | Set delay from grace-period initialization to | ||
2390 | first attempt to force quiescent states. | ||
2391 | Units are jiffies, minimum value is zero, | ||
2392 | and maximum value is HZ. | ||
2393 | |||
2394 | rcutree.jiffies_till_next_fqs= [KNL,BOOT] | ||
2395 | Set delay between subsequent attempts to force | ||
2396 | quiescent states. Units are jiffies, minimum | ||
2397 | value is one, and maximum value is HZ. | ||
2398 | |||
2388 | rcutorture.fqs_duration= [KNL,BOOT] | 2399 | rcutorture.fqs_duration= [KNL,BOOT] |
2389 | Set duration of force_quiescent_state bursts. | 2400 | Set duration of force_quiescent_state bursts. |
2390 | 2401 | ||
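The two new rcutree parameters documented above have bounded ranges: jiffies_till_first_fqs runs from zero to HZ, and jiffies_till_next_fqs from one to HZ. The sketch below illustrates those bounds in plain userspace C. It assumes that out-of-range requests are clamped to the nearest bound rather than rejected, which the documentation text does not state; the HZ value and all function names are placeholders chosen for illustration.

```c
/* Assumed tick rate for illustration; the real HZ is fixed at kernel build time. */
#define HZ 1000UL

/* Hypothetical helper: clamp v into the inclusive range [lo, hi]. */
static unsigned long clamp_ul(unsigned long v, unsigned long lo,
			      unsigned long hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Delay before the first forcing attempt: documented range is 0..HZ jiffies. */
static unsigned long effective_first_fqs(unsigned long requested)
{
	return clamp_ul(requested, 0UL, HZ);
}

/* Delay between later forcing attempts: documented range is 1..HZ jiffies. */
static unsigned long effective_next_fqs(unsigned long requested)
{
	return clamp_ul(requested, 1UL, HZ);
}
```

Under these assumptions, a boot line such as "rcutree.jiffies_till_next_fqs=0" would behave as if the value were one, and any request above HZ would be treated as HZ, that is, a delay of one second.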