Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/RCU/stallwarn.txt | 33
-rw-r--r-- Documentation/kernel-parameters.txt | 68
-rw-r--r-- Documentation/locking/locktorture.txt | 142
-rw-r--r-- Documentation/memory-barriers.txt | 128
4 files changed, 299 insertions, 72 deletions
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 68fe3ad27015..ef5a2fd4ff70 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -56,8 +56,20 @@ RCU_STALL_RAT_DELAY
56 two jiffies. (This is a cpp macro, not a kernel configuration 56 two jiffies. (This is a cpp macro, not a kernel configuration
57 parameter.) 57 parameter.)
58 58
59When a CPU detects that it is stalling, it will print a message similar 59rcupdate.rcu_task_stall_timeout
60to the following: 60
61 This boot/sysfs parameter controls the RCU-tasks stall warning
62 interval. A value of zero or less suppresses RCU-tasks stall
63 warnings. A positive value sets the stall-warning interval
64 in jiffies. An RCU-tasks stall warning starts with the line:
65
66 INFO: rcu_tasks detected stalls on tasks:
67
68 And continues with the output of sched_show_task() for each
69 task stalling the current RCU-tasks grace period.
70
71For non-RCU-tasks flavors of RCU, when a CPU detects that it is stalling,
72it will print a message similar to the following:
61 73
62INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies) 74INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies)
63 75
@@ -174,8 +186,12 @@ o A CPU looping with preemption disabled. This condition can
174o A CPU looping with bottom halves disabled. This condition can 186o A CPU looping with bottom halves disabled. This condition can
175 result in RCU-sched and RCU-bh stalls. 187 result in RCU-sched and RCU-bh stalls.
176 188
177o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel 189o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
178 without invoking schedule(). 190 kernel without invoking schedule(). Note that cond_resched()
191 does not necessarily prevent RCU CPU stall warnings. Therefore,
192 if the looping in the kernel is really expected and desirable
193 behavior, you might need to replace some of the cond_resched()
194 calls with calls to cond_resched_rcu_qs().
179 195
180o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might 196o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
181 happen to preempt a low-priority task in the middle of an RCU 197 happen to preempt a low-priority task in the middle of an RCU
@@ -208,11 +224,10 @@ o A hardware failure. This is quite unlikely, but has occurred
208 This resulted in a series of RCU CPU stall warnings, eventually 224 This resulted in a series of RCU CPU stall warnings, eventually
209 leading to the realization that the CPU had failed. 225 leading to the realization that the CPU had failed.
210 226
211The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning. 227The RCU, RCU-sched, RCU-bh, and RCU-tasks implementations have CPU stall
212SRCU does not have its own CPU stall warnings, but its calls to 228warnings. Note that SRCU does -not- have CPU stall warnings. Please note
213synchronize_sched() will result in RCU-sched detecting RCU-sched-related 229that RCU only detects CPU stalls when there is a grace period in progress.
214CPU stalls. Please note that RCU only detects CPU stalls when there is 230No grace period, no CPU stall warnings.
215a grace period in progress. No grace period, no CPU stall warnings.
216 231
217To diagnose the cause of the stall, inspect the stack traces. 232To diagnose the cause of the stall, inspect the stack traces.
218The offending function will usually be near the top of the stack. 233The offending function will usually be near the top of the stack.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 10d51c2f10d7..aa0eedc84d00 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1704,6 +1704,49 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1704 lockd.nlm_udpport=M [NFS] Assign UDP port. 1704 lockd.nlm_udpport=M [NFS] Assign UDP port.
1705 Format: <integer> 1705 Format: <integer>
1706 1706
1707 locktorture.nreaders_stress= [KNL]
1708 Set the number of locking read-acquisition kthreads.
1709 Defaults to being automatically set based on the
1710 number of online CPUs.
1711
1712 locktorture.nwriters_stress= [KNL]
1713 Set the number of locking write-acquisition kthreads.
1714
1715 locktorture.onoff_holdoff= [KNL]
1716 Set time (s) after boot for CPU-hotplug testing.
1717
1718 locktorture.onoff_interval= [KNL]
1719 Set time (s) between CPU-hotplug operations, or
1720 zero to disable CPU-hotplug testing.
1721
1722 locktorture.shuffle_interval= [KNL]
1723 Set task-shuffle interval (jiffies). Shuffling
1724 tasks allows some CPUs to go into dyntick-idle
1725 mode during the locktorture test.
1726
1727 locktorture.shutdown_secs= [KNL]
1728 Set time (s) after boot for system shutdown. This
1729 is useful for hands-off automated testing.
1730
1731 locktorture.stat_interval= [KNL]
1732 Time (s) between statistics printk()s.
1733
1734 locktorture.stutter= [KNL]
1735 Time (s) to stutter testing, for example,
1736 specifying five seconds causes the test to run for
1737 five seconds, wait for five seconds, and so on.
1738 This tests the locking primitive's ability to
1739 transition abruptly to and from idle.
1740
1741 locktorture.torture_runnable= [BOOT]
1742 Start locktorture running at boot time.
1743
1744 locktorture.torture_type= [KNL]
1745 Specify the locking implementation to test.
1746
1747 locktorture.verbose= [KNL]
1748 Enable additional printk() statements.
1749
1707 logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver 1750 logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver
1708 Format: <irq> 1751 Format: <irq>
1709 1752
@@ -2881,6 +2924,24 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2881 Lazy RCU callbacks are those which RCU can 2924 Lazy RCU callbacks are those which RCU can
2882 prove do nothing more than free memory. 2925 prove do nothing more than free memory.
2883 2926
2927 rcutorture.cbflood_inter_holdoff= [KNL]
2928 Set holdoff time (jiffies) between successive
2929 callback-flood tests.
2930
2931 rcutorture.cbflood_intra_holdoff= [KNL]
2932 Set holdoff time (jiffies) between successive
2933 bursts of callbacks within a given callback-flood
2934 test.
2935
2936 rcutorture.cbflood_n_burst= [KNL]
2937 Set the number of bursts making up a given
2938 callback-flood test. Set this to zero to
2939 disable callback-flood testing.
2940
2941 rcutorture.cbflood_n_per_burst= [KNL]
2942 Set the number of callbacks to be registered
2943 in a given burst of a callback-flood test.
2944
2884 rcutorture.fqs_duration= [KNL] 2945 rcutorture.fqs_duration= [KNL]
2885 Set duration of force_quiescent_state bursts. 2946 Set duration of force_quiescent_state bursts.
2886 2947
@@ -2920,7 +2981,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2920 Set time (s) between CPU-hotplug operations, or 2981 Set time (s) between CPU-hotplug operations, or
2921 zero to disable CPU-hotplug testing. 2982 zero to disable CPU-hotplug testing.
2922 2983
2923 rcutorture.rcutorture_runnable= [BOOT] 2984 rcutorture.torture_runnable= [BOOT]
2924 Start rcutorture running at boot time. 2985 Start rcutorture running at boot time.
2925 2986
2926 rcutorture.shuffle_interval= [KNL] 2987 rcutorture.shuffle_interval= [KNL]
@@ -2982,6 +3043,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2982 rcupdate.rcu_cpu_stall_timeout= [KNL] 3043 rcupdate.rcu_cpu_stall_timeout= [KNL]
2983 Set timeout for RCU CPU stall warning messages. 3044 Set timeout for RCU CPU stall warning messages.
2984 3045
3046 rcupdate.rcu_task_stall_timeout= [KNL]
3047 Set timeout in jiffies for RCU task stall warning
3048 messages. Disable with a value less than or equal
3049 to zero.
3050
2985 rdinit= [KNL] 3051 rdinit= [KNL]
2986 Format: <full_path> 3052 Format: <full_path>
2987 Run specified binary instead of /init from the ramdisk, 3053 Run specified binary instead of /init from the ramdisk,
diff --git a/Documentation/locking/locktorture.txt b/Documentation/locking/locktorture.txt
new file mode 100644
index 000000000000..be715015e0f7
--- /dev/null
+++ b/Documentation/locking/locktorture.txt
@@ -0,0 +1,142 @@
1Kernel Lock Torture Test Operation
2
3CONFIG_LOCK_TORTURE_TEST
4
5The CONFIG_LOCK_TORTURE_TEST config option provides a kernel module
6that runs torture tests on core kernel locking primitives. The kernel
7module, 'locktorture', may be built after the fact on the running
8kernel to be tested, if desired. The tests periodically output status
9messages via printk(), which can be examined via dmesg (perhaps
10grepping for "torture"). The test is started when the module is loaded,
11and stops when the module is unloaded. This program is based on how RCU
12is tortured, via rcutorture.
13
14This torture test consists of creating a number of kernel threads which
15acquire the lock and hold it for a specific amount of time, thus simulating
16different critical region behaviors. The amount of contention on the lock
17can be increased by enlarging this critical region hold time and/or
18creating more kthreads.
19
20
21MODULE PARAMETERS
22
23This module has the following parameters:
24
25
26 ** Locktorture-specific **
27
28nwriters_stress Number of kernel threads that will stress exclusive lock
29 ownership (writers). The default value is twice the number
30 of online CPUs.
31
32nreaders_stress Number of kernel threads that will stress shared lock
33 ownership (readers). The default is the same as the number of
34 writers. If the user did not specify nwriters_stress, then
35 both readers and writers will be the number of online CPUs.
36
37torture_type Type of lock to torture. By default, only spinlocks will
38 be tortured. This module can torture the following locks,
39 with string values as follows:
40
41 o "lock_busted": Simulates a buggy lock implementation.
42
43 o "spin_lock": spin_lock() and spin_unlock() pairs.
44
45 o "spin_lock_irq": spin_lock_irq() and spin_unlock_irq()
46 pairs.
47
48 o "mutex_lock": mutex_lock() and mutex_unlock() pairs.
49
50 o "rwsem_lock": read/write down() and up() semaphore pairs.
51
52torture_runnable Start locktorture at boot time in the case where the
53 module is built into the kernel, otherwise wait for
54 torture_runnable to be set via sysfs before starting.
55 By default it will begin once the module is loaded.
56
57
58 ** Torture-framework (RCU + locking) **
59
60shutdown_secs The number of seconds to run the test before terminating
61 the test and powering off the system. The default is
62 zero, which disables test termination and system shutdown.
63 This capability is useful for automated testing.
64
65onoff_interval The number of seconds between each attempt to execute a
66 randomly selected CPU-hotplug operation. Defaults
67 to zero, which disables CPU hotplugging. In
68 CONFIG_HOTPLUG_CPU=n kernels, locktorture will silently
69 refuse to do any CPU-hotplug operations regardless of
70 what value is specified for onoff_interval.
71
72onoff_holdoff The number of seconds to wait until starting CPU-hotplug
73 operations. This would normally only be used when
74 locktorture was built into the kernel and started
75 automatically at boot time, in which case it is useful
76 in order to avoid confusing boot-time code with CPUs
77 coming and going. This parameter is only useful if
78 CONFIG_HOTPLUG_CPU is enabled.
79
80stat_interval Number of seconds between statistics-related printk()s.
81 By default, locktorture will report stats every 60 seconds.
82 Setting the interval to zero causes the statistics to
83 be printed -only- when the module is unloaded, rather than
84 periodically.
85
86stutter The length of time to run the test before pausing for this
87 same period of time. Defaults to "stutter=5", so as
88 to run and pause for (roughly) five-second intervals.
89 Specifying "stutter=0" causes the test to run continuously
90 without pausing, which is the old default behavior.
91
92shuffle_interval The number of seconds to keep the test threads affinitied
93 to a particular subset of the CPUs, defaults to 3 seconds.
94 Used in conjunction with test_no_idle_hz.
95
96verbose Enable verbose debugging printing, via printk(). Enabled
97 by default. This extra information is mostly related to
98 high-level errors and reports from the main 'torture'
99 framework.
100
101
102STATISTICS
103
104Statistics are printed in the following format:
105
106spin_lock-torture: Writes: Total: 93746064 Max/Min: 0/0 Fail: 0
107 (A) (B) (C) (D) (E)
108
109(A): Lock type that is being tortured -- torture_type parameter.
110
111(B): Number of writer lock acquisitions. If dealing with a read/write primitive
112 a second "Reads" statistics line is printed.
113
114(C): Number of times the lock was acquired.
115
116(D): Min and max number of times threads failed to acquire the lock.
117
118(E): true/false values if there were errors acquiring the lock. This should
119 -only- be positive if there is a bug in the locking primitive's
120 implementation. Otherwise a lock should never fail (i.e., spin_lock()).
121 Of course, the same applies for (C), above. A dummy example of this is
122 the "lock_busted" type.
123
124USAGE
125
126The following script may be used to torture locks:
127
128 #!/bin/sh
129
130 modprobe locktorture
131 sleep 3600
132 rmmod locktorture
133 dmesg | grep torture:
134
135The output can be manually inspected for the error flag of "!!!".
136One could of course create a more elaborate script that automatically
137checked for such errors. The "rmmod" command forces a "SUCCESS",
138"FAILURE", or "RCU_HOTPLUG" indication to be printk()ed. The first
139two are self-explanatory, while the last indicates that while there
140were no locking failures, CPU-hotplug problems were detected.
141
142Also see: Documentation/RCU/torture.txt
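
The locktorture.txt text above suggests that a more elaborate script could
check the output automatically. As a minimal sketch of that idea, the
following userspace C helper parses one statistics line in the format shown
under STATISTICS. The struct lt_stats type and the lt_parse() function are
hypothetical names introduced here for illustration and are not part of the
kernel or of locktorture. The sketch assumes that any dmesg timestamp prefix
has already been stripped and that the real output matches the documented
example exactly apart from whitespace.

```c
#include <stdio.h>

/* Hypothetical holder for one locktorture statistics line. */
struct lt_stats {
	char type[32];      /* (A) lock type, e.g. "spin_lock" */
	char op[16];        /* (B) "Writes" or "Reads" */
	long long total;    /* (C) number of acquisitions */
	long long max, min; /* (D) the Max/Min field */
	int fail;           /* (E) failure flag, nonzero means a bug */
};

/*
 * Parse a line such as:
 *   spin_lock-torture: Writes: Total: 93746064 Max/Min: 0/0 Fail: 0
 * Returns 0 on success, -1 if the line does not match that format.
 */
int lt_parse(const char *line, struct lt_stats *s)
{
	int n = sscanf(line,
		       "%31[^-]-torture: %15[^:]: Total: %lld Max/Min: %lld/%lld Fail: %d",
		       s->type, s->op, &s->total, &s->max, &s->min, &s->fail);

	return n == 6 ? 0 : -1;
}
```

A caller would feed each line of "dmesg | grep torture:" to lt_parse() and
treat any successfully parsed line with a nonzero fail field as an error.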
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index a4de88fb55f0..22a969cdd476 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -574,30 +574,14 @@ However, stores are not speculated. This means that ordering -is- provided
574in the following example: 574in the following example:
575 575
576 q = ACCESS_ONCE(a); 576 q = ACCESS_ONCE(a);
577 if (ACCESS_ONCE(q)) {
578 ACCESS_ONCE(b) = p;
579 }
580
581Please note that ACCESS_ONCE() is not optional! Without the ACCESS_ONCE(),
582the compiler is within its rights to transform this example:
583
584 q = a;
585 if (q) { 577 if (q) {
586 b = p; /* BUG: Compiler can reorder!!! */ 578 ACCESS_ONCE(b) = p;
587 do_something();
588 } else {
589 b = p; /* BUG: Compiler can reorder!!! */
590 do_something_else();
591 } 579 }
592 580
593into this, which of course defeats the ordering: 581Please note that ACCESS_ONCE() is not optional! Without the
594 582ACCESS_ONCE(), the compiler might combine the load from 'a' with other loads from
595 b = p; 583'a', and the store to 'b' with other stores to 'b', with possible highly
596 q = a; 584counterintuitive effects on ordering.
597 if (q)
598 do_something();
599 else
600 do_something_else();
601 585
602Worse yet, if the compiler is able to prove (say) that the value of 586Worse yet, if the compiler is able to prove (say) that the value of
603variable 'a' is always non-zero, it would be well within its rights 587variable 'a' is always non-zero, it would be well within its rights
@@ -605,11 +589,12 @@ to optimize the original example by eliminating the "if" statement
605as follows: 589as follows:
606 590
607 q = a; 591 q = a;
608 b = p; /* BUG: Compiler can reorder!!! */ 592 b = p; /* BUG: Compiler and CPU can both reorder!!! */
609 do_something(); 593
594So don't leave out the ACCESS_ONCE().
610 595
611The solution is again ACCESS_ONCE() and barrier(), which preserves the 596It is tempting to try to enforce ordering on identical stores on both
612ordering between the load from variable 'a' and the store to variable 'b': 597branches of the "if" statement as follows:
613 598
614 q = ACCESS_ONCE(a); 599 q = ACCESS_ONCE(a);
615 if (q) { 600 if (q) {
@@ -622,18 +607,11 @@ ordering between the load from variable 'a' and the store to variable 'b':
622 do_something_else(); 607 do_something_else();
623 } 608 }
624 609
625The initial ACCESS_ONCE() is required to prevent the compiler from 610Unfortunately, current compilers will transform this as follows at high
626proving the value of 'a', and the pair of barrier() invocations are 611optimization levels:
627required to prevent the compiler from pulling the two identical stores
628to 'b' out from the legs of the "if" statement.
629
630It is important to note that control dependencies absolutely require a
631a conditional. For example, the following "optimized" version of
632the above example breaks ordering, which is why the barrier() invocations
633are absolutely required if you have identical stores in both legs of
634the "if" statement:
635 612
636 q = ACCESS_ONCE(a); 613 q = ACCESS_ONCE(a);
614 barrier();
637 ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */ 615 ACCESS_ONCE(b) = p; /* BUG: No ordering vs. load from a!!! */
638 if (q) { 616 if (q) {
639 /* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */ 617 /* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
@@ -643,21 +621,36 @@ the "if" statement:
643 do_something_else(); 621 do_something_else();
644 } 622 }
645 623
646It is of course legal for the prior load to be part of the conditional, 624Now there is no conditional between the load from 'a' and the store to
647for example, as follows: 625'b', which means that the CPU is within its rights to reorder them.
626The conditional is absolutely required, and must be present in the
627assembly code even after all compiler optimizations have been applied.
628Therefore, if you need ordering in this example, you need explicit
629memory barriers, for example, smp_store_release():
648 630
649 if (ACCESS_ONCE(a) > 0) { 631 q = ACCESS_ONCE(a);
650 barrier(); 632 if (q) {
651 ACCESS_ONCE(b) = q / 2; 633 smp_store_release(&b, p);
652 do_something(); 634 do_something();
653 } else { 635 } else {
654 barrier(); 636 smp_store_release(&b, p);
655 ACCESS_ONCE(b) = q / 3; 637 do_something_else();
638 }
639
640In contrast, without explicit memory barriers, two-legged-if control
641ordering is guaranteed only when the stores differ, for example:
642
643 q = ACCESS_ONCE(a);
644 if (q) {
645 ACCESS_ONCE(b) = p;
646 do_something();
647 } else {
648 ACCESS_ONCE(b) = r;
656 do_something_else(); 649 do_something_else();
657 } 650 }
658 651
659This will again ensure that the load from variable 'a' is ordered before the 652The initial ACCESS_ONCE() is still required to prevent the compiler from
660stores to variable 'b'. 653proving the value of 'a'.
661 654
662In addition, you need to be careful what you do with the local variable 'q', 655In addition, you need to be careful what you do with the local variable 'q',
663otherwise the compiler might be able to guess the value and again remove 656otherwise the compiler might be able to guess the value and again remove
@@ -665,12 +658,10 @@ the needed conditional. For example:
665 658
666 q = ACCESS_ONCE(a); 659 q = ACCESS_ONCE(a);
667 if (q % MAX) { 660 if (q % MAX) {
668 barrier();
669 ACCESS_ONCE(b) = p; 661 ACCESS_ONCE(b) = p;
670 do_something(); 662 do_something();
671 } else { 663 } else {
672 barrier(); 664 ACCESS_ONCE(b) = r;
673 ACCESS_ONCE(b) = p;
674 do_something_else(); 665 do_something_else();
675 } 666 }
676 667
@@ -682,9 +673,12 @@ transform the above code into the following:
682 ACCESS_ONCE(b) = p; 673 ACCESS_ONCE(b) = p;
683 do_something_else(); 674 do_something_else();
684 675
685This transformation loses the ordering between the load from variable 'a' 676Given this transformation, the CPU is not required to respect the ordering
686and the store to variable 'b'. If you are relying on this ordering, you 677between the load from variable 'a' and the store to variable 'b'. It is
687should do something like the following: 678tempting to add a barrier(), but this does not help. The conditional
679is gone, and the barrier won't bring it back. Therefore, if you are
680relying on this ordering, you should make sure that MAX is greater than
681one, perhaps as follows:
688 682
689 q = ACCESS_ONCE(a); 683 q = ACCESS_ONCE(a);
690 BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */ 684 BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
@@ -692,35 +686,45 @@ should do something like the following:
692 ACCESS_ONCE(b) = p; 686 ACCESS_ONCE(b) = p;
693 do_something(); 687 do_something();
694 } else { 688 } else {
695 ACCESS_ONCE(b) = p; 689 ACCESS_ONCE(b) = r;
696 do_something_else(); 690 do_something_else();
697 } 691 }
698 692
693Please note once again that the stores to 'b' differ. If they were
694identical, as noted earlier, the compiler could pull this store outside
695of the 'if' statement.
696
699Finally, control dependencies do -not- provide transitivity. This is 697Finally, control dependencies do -not- provide transitivity. This is
700demonstrated by two related examples: 698demonstrated by two related examples, with the initial values of
699x and y both being zero:
701 700
702 CPU 0 CPU 1 701 CPU 0 CPU 1
703 ===================== ===================== 702 ===================== =====================
704 r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y); 703 r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y);
705 if (r1 >= 0) if (r2 >= 0) 704 if (r1 > 0) if (r2 > 0)
706 ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1; 705 ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1;
707 706
708 assert(!(r1 == 1 && r2 == 1)); 707 assert(!(r1 == 1 && r2 == 1));
709 708
710The above two-CPU example will never trigger the assert(). However, 709The above two-CPU example will never trigger the assert(). However,
711if control dependencies guaranteed transitivity (which they do not), 710if control dependencies guaranteed transitivity (which they do not),
712then adding the following two CPUs would guarantee a related assertion: 711then adding the following CPU would guarantee a related assertion:
713 712
714 CPU 2 CPU 3 713 CPU 2
715 ===================== ===================== 714 =====================
716 ACCESS_ONCE(x) = 2; ACCESS_ONCE(y) = 2; 715 ACCESS_ONCE(x) = 2;
716
717 assert(!(r1 == 2 && r2 == 1 && x == 2)); /* FAILS!!! */
717 718
718 assert(!(r1 == 2 && r2 == 2 && x == 1 && y == 1)); /* FAILS!!! */ 719But because control dependencies do -not- provide transitivity, the above
720assertion can fail after the combined three-CPU example completes. If you
721need the three-CPU example to provide ordering, you will need smp_mb()
722between the loads and stores in the CPU 0 and CPU 1 code fragments,
723that is, just before or just after the "if" statements.
719 724
720But because control dependencies do -not- provide transitivity, the 725These two examples are the LB and WWC litmus tests from this paper:
721above assertion can fail after the combined four-CPU example completes. 726http://www.cl.cam.ac.uk/users/pes20/ppc-supplemental/test6.pdf and this
722If you need the four-CPU example to provide ordering, you will need 727site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.
723smp_mb() between the loads and stores in the CPU 0 and CPU 1 code fragments.
724 728
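
The memory-barriers.txt text above says that ordering in the multi-CPU
examples requires smp_mb() between the load and the store in the CPU 0 and
CPU 1 fragments. The following userspace sketch shows only where such a
barrier sits relative to the "if" statement. It is an analogy, not kernel
code: it assumes C11 <stdatomic.h>, uses relaxed atomic accesses in place of
ACCESS_ONCE(), and uses atomic_thread_fence(memory_order_seq_cst) as a
stand-in for smp_mb(). The C11 and kernel memory models differ, so this
should not be read as a statement about either model's guarantees. The
function names cpu0() and cpu1() are hypothetical.

```c
#include <stdatomic.h>

/* Shared variables, both initially zero as in the example above. */
static atomic_int x, y;

/* CPU 0 fragment: load x, full fence, then conditional store to y. */
int cpu0(void)
{
	int r1 = atomic_load_explicit(&x, memory_order_relaxed);

	/* Stand-in for smp_mb(), placed just before the "if". */
	atomic_thread_fence(memory_order_seq_cst);
	if (r1 > 0)
		atomic_store_explicit(&y, 1, memory_order_relaxed);
	return r1;
}

/* CPU 1 fragment: the mirror image, load y then conditionally store x. */
int cpu1(void)
{
	int r2 = atomic_load_explicit(&y, memory_order_relaxed);

	/* Stand-in for smp_mb(), placed just before the "if". */
	atomic_thread_fence(memory_order_seq_cst);
	if (r2 > 0)
		atomic_store_explicit(&x, 1, memory_order_relaxed);
	return r2;
}
```

In a real test each fragment would run on its own thread. The fence could
equally be placed just after the "if" test and before the store, which
matches the "just before or just after" wording in the text.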
725In summary: 729In summary:
726 730