-rw-r--r--  Documentation/locking/crossrelease.txt  874
-rw-r--r--  include/linux/compiler.h                  47
-rw-r--r--  include/linux/completion.h                45
-rw-r--r--  include/linux/lockdep.h                  125
-rw-r--r--  include/linux/rwlock_types.h               3
-rw-r--r--  include/linux/sched.h                     11
-rw-r--r--  include/linux/spinlock.h                   5
-rw-r--r--  include/linux/spinlock_types.h             3
-rw-r--r--  kernel/locking/lockdep.c                 652
-rw-r--r--  kernel/locking/spinlock.c                 13
-rw-r--r--  lib/Kconfig.debug                         33
-rwxr-xr-x  scripts/checkpatch.pl                     22
-rw-r--r--  tools/include/linux/compiler.h            21
-rw-r--r--  tools/include/linux/lockdep.h              1
-rw-r--r--  tools/perf/util/mmap.h                     2
15 files changed, 60 insertions, 1797 deletions
diff --git a/Documentation/locking/crossrelease.txt b/Documentation/locking/crossrelease.txt
deleted file mode 100644
index bdf1423d5f99..000000000000
--- a/Documentation/locking/crossrelease.txt
+++ /dev/null
@@ -1,874 +0,0 @@
1Crossrelease
2============
3
4Started by Byungchul Park <byungchul.park@lge.com>
5
6Contents:
7
8 (*) Background
9
10 - What causes deadlock
11 - How lockdep works
12
13 (*) Limitation
14
15 - Limit lockdep
16 - Pros from the limitation
17 - Cons from the limitation
18 - Relax the limitation
19
20 (*) Crossrelease
21
22 - Introduce crossrelease
23 - Introduce commit
24
25 (*) Implementation
26
27 - Data structures
28 - How crossrelease works
29
30 (*) Optimizations
31
32 - Avoid duplication
33 - Lockless for hot paths
34
35 (*) APPENDIX A: What lockdep does to work aggressively
36
37 (*) APPENDIX B: How to avoid adding false dependencies
38
39
40==========
41Background
42==========
43
44What causes deadlock
45--------------------
46
47A deadlock occurs when a context is waiting for an event to happen,
48which is impossible because another (or the) context who can trigger the
49event is also waiting for another (or the) event to happen, which is
50also impossible due to the same reason.
51
52For example:
53
54 A context going to trigger event C is waiting for event A to happen.
55 A context going to trigger event A is waiting for event B to happen.
56 A context going to trigger event B is waiting for event C to happen.
57
58A deadlock occurs when these three wait operations run at the same time,
59because event C cannot be triggered if event A does not happen, which in
60turn cannot be triggered if event B does not happen, which in turn
61cannot be triggered if event C does not happen. After all, no event can
62be triggered since any of them never meets its condition to wake up.
63
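As a concrete illustration, the three waits above can be written with kernel
completions standing in for the events. This is a minimal, hypothetical sketch
(the completion and thread names are made up); if all three threads reach their
wait at the same time, none of them can ever be woken up:

  #include <linux/completion.h>
  #include <linux/kthread.h>

  static DECLARE_COMPLETION(event_a);
  static DECLARE_COMPLETION(event_b);
  static DECLARE_COMPLETION(event_c);

  static int would_trigger_c(void *unused)
  {
          wait_for_completion(&event_a);  /* needs A before it can ... */
          complete(&event_c);             /* ... trigger C */
          return 0;
  }

  static int would_trigger_a(void *unused)
  {
          wait_for_completion(&event_b);  /* needs B before it can trigger A */
          complete(&event_a);
          return 0;
  }

  static int would_trigger_b(void *unused)
  {
          wait_for_completion(&event_c);  /* needs C: the cycle is closed */
          complete(&event_b);
          return 0;
  }

Each of these would be started with kthread_run(); once all three are blocked
in wait_for_completion(), the system has deadlocked.
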
64A dependency might exist between two waiters, and a deadlock might happen
65due to an incorrect relationship between dependencies. Thus, we must first
66define what a dependency is. A dependency exists between two waiters if:
67
68 1. There are two waiters waiting for each event at a given time.
69 2. The only way to wake up each waiter is to trigger its event.
70 3. Whether one can be woken up depends on whether the other can.
71
72Each wait in the example creates its dependency like:
73
74 Event C depends on event A.
75 Event A depends on event B.
76 Event B depends on event C.
77
78 NOTE: Precisely speaking, a dependency is one between whether a
79 waiter for an event can be woken up and whether another waiter for
80 another event can be woken up. However from now on, we will describe
81 a dependency as if it's one between an event and another event for
82 simplicity.
83
84And they form circular dependencies like:
85
86 -> C -> A -> B -
87 / \
88 \ /
89 ----------------
90
91 where 'A -> B' means that event A depends on event B.
92
93Such circular dependencies lead to a deadlock since no waiter can meet
94its condition to wake up as described.
95
96CONCLUSION
97
98Circular dependencies cause a deadlock.
99
100
101How lockdep works
102-----------------
103
104Lockdep tries to detect a deadlock by checking dependencies created by
105lock operations, acquire and release. Waiting for a lock corresponds to
106waiting for an event, and releasing a lock corresponds to triggering an
107event in the previous section.
108
109In short, lockdep does:
110
111 1. Detect a new dependency.
112 2. Add the dependency into a global graph.
113 3. Check if that makes dependencies circular.
114 4. Report a deadlock or its possibility if so.
115
116For example, consider a graph built by lockdep that looks like:
117
118 A -> B -
119 \
120 -> E
121 /
122 C -> D -
123
124 where A, B,..., E are different lock classes.
125
126Lockdep will add a dependency into the graph on detection of a new
127dependency. For example, it will add a dependency 'E -> C' when a new
128dependency between lock E and lock C is detected. Then the graph will be:
129
130 A -> B -
131 \
132 -> E -
133 / \
134 -> C -> D - \
135 / /
136 \ /
137 ------------------
138
139 where A, B,..., E are different lock classes.
140
141This graph contains a subgraph which demonstrates circular dependencies:
142
143 -> E -
144 / \
145 -> C -> D - \
146 / /
147 \ /
148 ------------------
149
150 where C, D and E are different lock classes.
151
152This is the condition under which a deadlock might occur. Lockdep
153reports it on detection right after the new dependency is added. This is
154how lockdep works.
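
For instance, the classic ABBA pattern below is the kind of code such a report
comes from. This is a minimal, hypothetical sketch (the lock names are made up);
lockdep records 'A -> B' from the first function and 'B -> A' from the second,
finds the cycle, and warns about the possible deadlock even if the two paths
never actually race:

  #include <linux/mutex.h>

  static DEFINE_MUTEX(lock_a);
  static DEFINE_MUTEX(lock_b);

  static void path_one(void)
  {
          mutex_lock(&lock_a);
          mutex_lock(&lock_b);            /* adds dependency 'A -> B' */
          mutex_unlock(&lock_b);
          mutex_unlock(&lock_a);
  }

  static void path_two(void)
  {
          mutex_lock(&lock_b);
          mutex_lock(&lock_a);            /* 'B -> A' closes the cycle */
          mutex_unlock(&lock_a);
          mutex_unlock(&lock_b);
  }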
155
156CONCLUSION
157
158Lockdep detects a deadlock or its possibility by checking if circular
159dependencies were created after adding each new dependency.
160
161
162==========
163Limitation
164==========
165
166Limit lockdep
167-------------
168
169By limiting lockdep to typical locks, e.g. spin locks and mutexes,
170which are released within the acquire context, the implementation
171becomes simple but its detection capacity becomes limited. Let's check
172the pros and cons in the next sections.
173
174
175Pros from the limitation
176------------------------
177
178Given the limitation, when acquiring a lock, the locks already in
179held_locks cannot be released while the context is blocked waiting for
180the new lock, which means all waiters for the locks in held_locks are
181stuck. This is exactly the case that creates dependencies between each
182lock in held_locks and the lock being acquired.
183
184For example:
185
186 CONTEXT X
187 ---------
188 acquire A
189 acquire B /* Add a dependency 'A -> B' */
190 release B
191 release A
192
193 where A and B are different lock classes.
194
195When acquiring lock A, the held_locks of CONTEXT X is empty, thus no
196dependency is added. But when acquiring lock B, lockdep detects and adds
197a new dependency 'A -> B' between lock A in held_locks and lock B.
198Dependencies can simply be added whenever a lock is acquired.
199
200And the data required by lockdep lives in a local structure, held_locks,
201embedded in task_struct. By forcing all accesses to happen within the
202owning context, lockdep can avoid races on this local data without
203using explicit locks.
204
205Lastly, lockdep only needs to keep the locks currently being held in
206order to build the dependency graph. However, with the limitation relaxed,
207it would also need to keep even locks already released, because the
208decision whether they created dependencies might be long-deferred.
209
210To sum up, we can expect several advantages from the limitation:
211
212 1. Lockdep can easily identify a dependency when acquiring a lock.
213 2. Races are avoided while accessing the local data in held_locks.
214 3. Lockdep only needs to keep locks currently being held.
215
216CONCLUSION
217
218Given the limitation, the implementation becomes simple and efficient.
219
220
221Cons from the limitation
222------------------------
223
224Given the limitation, lockdep is applicable only to typical locks. For
225example, page locks for page access or completions for synchronization
226cannot work with lockdep.
227
228Can we detect deadlocks below, under the limitation?
229
230Example 1:
231
232 CONTEXT X          CONTEXT Y          CONTEXT Z
233 ---------          ---------          ----------
234                    mutex_lock A
235 lock_page B
236                    lock_page B
237                                       mutex_lock A /* DEADLOCK */
238                                       unlock_page B held by X
239                    unlock_page B
240                    mutex_unlock A
241                                       mutex_unlock A
242
243 where A and B are different lock classes.
244
245No, we cannot.
246
247Example 2:
248
249 CONTEXT X          CONTEXT Y
250 ---------          ---------
251                    mutex_lock A
252 mutex_lock A
253                    wait_for_complete B /* DEADLOCK */
254 complete B
255                    mutex_unlock A
256 mutex_unlock A
257
258 where A is a lock class and B is a completion variable.
259
260No, we cannot.
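
Example 2 in (hypothetical) C looks like the sketch below. Plain lockdep only
ever sees the mutex; the completion is invisible to it, so no dependency
involving B is recorded and nothing is reported:

  #include <linux/mutex.h>
  #include <linux/completion.h>

  static DEFINE_MUTEX(a);
  static DECLARE_COMPLETION(b);

  static int context_y(void *unused)
  {
          mutex_lock(&a);
          wait_for_completion(&b);        /* DEADLOCK: X is blocked on 'a' */
          mutex_unlock(&a);
          return 0;
  }

  static int context_x(void *unused)
  {
          mutex_lock(&a);                 /* blocks forever: Y holds 'a' */
          complete(&b);                   /* never reached */
          mutex_unlock(&a);
          return 0;
  }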
261
262CONCLUSION
263
264Given the limitation, lockdep cannot detect a deadlock or its
265possibility caused by page locks or completions.
266
267
268Relax the limitation
269--------------------
270
271Under the limitation, the things that can create dependencies are limited
272to typical locks. However, synchronization primitives like page locks and
273completions, which are allowed to be released in any context, also
274create dependencies and can cause a deadlock. So lockdep should track
275these locks to do a better job. We have to relax the limitation for
276these locks to work with lockdep.
277
278Detecting dependencies is very important for lockdep to work because
279adding a dependency means adding an opportunity to check whether it
280causes a deadlock. The more dependencies lockdep adds, the more
281thoroughly it works. Thus lockdep has to do its best to detect and add
282as many true dependencies into the graph as possible.
283
284For example, considering only typical locks, lockdep builds a graph like:
285
286 A -> B -
287 \
288 -> E
289 /
290 C -> D -
291
292 where A, B,..., E are different lock classes.
293
294On the other hand, under the relaxation, additional dependencies might
295be created and added. Assuming additional 'FX -> C' and 'E -> GX' are
296added thanks to the relaxation, the graph will be:
297
298 A -> B -
299 \
300 -> E -> GX
301 /
302 FX -> C -> D -
303
304 where A, B,..., E, FX and GX are different lock classes, and a suffix
305 'X' is added on non-typical locks.
306
307The latter graph gives us more chances to check circular dependencies
308than the former. However, it might suffer performance degradation, since
309relaxing the limitation, which is what allows the design and implementation
310of lockdep to be efficient, inevitably introduces some overhead. So lockdep
311should provide two options: strong detection and efficient detection.
312
313Choosing efficient detection:
314
315 Lockdep works with only locks restricted to be released within the
316 acquire context. However, lockdep works efficiently.
317
318Choosing strong detection:
319
320 Lockdep works with all synchronization primitives. However, lockdep
321 suffers performance degradation.
322
323CONCLUSION
324
325Relaxing the limitation, lockdep can add additional dependencies giving
326additional opportunities to check circular dependencies.
327
328
329============
330Crossrelease
331============
332
333Introduce crossrelease
334----------------------
335
336In order to allow lockdep to handle additional dependencies created by
337locks that might be released in any context, namely 'crosslocks', we have
338to be able to identify the dependencies created by crosslocks. The
339proposed 'crossrelease' feature provides a way to do that.
340
341The crossrelease feature has to:
342
343 1. Identify dependencies created by crosslocks.
344 2. Add the dependencies into a dependency graph.
345
346That's all. Once a meaningful dependency is added into the graph, lockdep
347works with the graph as it always did. The most important thing the
348crossrelease feature has to do is to correctly identify and add true
349dependencies into the global graph.
350
351A dependency, e.g. 'A -> B', can be identified only in A's release
352context, because the decision required to identify the dependency can be
353made only in the release context; that decision is whether A can be
354released so that a waiter for A can be woken up. It cannot be made
355anywhere other than A's release context.
356
357This is not a problem for typical locks, because each acquire context is
358the same as its release context, thus lockdep can decide in the acquire
359context whether a lock can be released. However, for crosslocks, lockdep
360cannot make the decision in the acquire context but has to wait until
361the release context is identified.
362
363Therefore, a deadlock caused by crosslocks cannot be detected at the
364moment it happens, because the dependencies cannot be identified until
365the crosslocks are released. However, deadlock possibilities can be
366detected, and that is very valuable. See the 'APPENDIX A' section for why.
367
368CONCLUSION
369
370Using crossrelease feature, lockdep can work with what might be released
371in any context, namely crosslock.
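
For completions, this is exactly what the code removed by this patch did: the
completion embedded a crosslock map, the waiter "acquired" it, and complete()
committed it. Abridged from the include/linux/completion.h hunk further below:

  struct completion {
          unsigned int            done;
          wait_queue_head_t       wait;
  #ifdef CONFIG_LOCKDEP_COMPLETIONS
          struct lockdep_map_cross map;   /* the crosslock itself */
  #endif
  };

  static inline void complete_acquire(struct completion *x)
  {
          /* called on the waiter's side: acquire the crosslock */
          lock_acquire_exclusive((struct lockdep_map *)&x->map, 0, 0, NULL, _RET_IP_);
  }

  static inline void complete_release_commit(struct completion *x)
  {
          /* called from complete(): commit the crosslock's dependencies */
          lock_commit_crosslock((struct lockdep_map *)&x->map);
  }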
372
373
374Introduce commit
375----------------
376
377Since crossrelease defers the work of adding the true dependencies of
378crosslocks until they are actually released, it has to queue all
379acquisitions which might create dependencies with the crosslocks.
380Then it identifies the dependencies, using the queued data, in batches
381at a proper time. We call this 'commit'.
382
383There are four types of dependencies:
384
3851. TT type: 'typical lock A -> typical lock B'
386
387 Just when acquiring B, lockdep can see it's in A's release
388 context. So the dependency between A and B can be identified
389 immediately. Commit is unnecessary.
390
3912. TC type: 'typical lock A -> crosslock BX'
392
393 Just when acquiring BX, lockdep can see it's in A's release
394 context. So the dependency between A and BX can be identified
395 immediately. Commit is unnecessary, too.
396
3973. CT type: 'crosslock AX -> typical lock B'
398
399 When acquiring B, lockdep cannot identify the dependency because
400 there's no way to know if it's in AX's release context. It has
401 to wait until the decision can be made. Commit is necessary.
402
4034. CC type: 'crosslock AX -> crosslock BX'
404
405 When acquiring BX, lockdep cannot identify the dependency because
406 there's no way to know if it's in AX's release context. It has
407 to wait until the decision can be made. Commit is necessary.
408 But, handling the CC type is not implemented yet. It is future work.
409
410Lockdep can work without commit for typical locks, but the commit step is
411necessary once crosslocks are involved. With commit introduced, lockdep
412performs three steps. What lockdep does in each step is (see also the sketch after this list):
413
4141. Acquisition: For typical locks, lockdep does what it originally did
415 and queues the lock so that CT type dependencies can be checked using
416 it at the commit step. For crosslocks, it saves data which will be
417 used at the commit step and increases a reference count for it.
418
4192. Commit: No action is required for typical locks. For crosslocks,
420 lockdep adds CT type dependencies using the data saved at the
421 acquisition step.
422
4233. Release: No changes are required for typical locks. When a crosslock
424 is released, it decreases a reference count for it.
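
In terms of completions, the removed kernel/locking/lockdep.c code summarized
this flow in a comment essentially equivalent to the sketch below. It is a
simplified illustration (assume the waiter is already waiting when the other
task runs); cross_gen_id and hist_locks refer to the machinery this patch removes:

  #include <linux/mutex.h>
  #include <linux/completion.h>

  static DEFINE_MUTEX(m);
  static DECLARE_COMPLETION(c);

  static int waiter(void *unused)
  {
          /* 1. Acquisition: the crosslock C is acquired here; crossrelease
           *    bumps cross_gen_id and takes a reference on C. */
          wait_for_completion(&c);
          return 0;
  }

  static int completer(void *unused)
  {
          mutex_lock(&m);                 /* queued into this task's hist_locks */
          mutex_unlock(&m);

          /* 2. Commit: inside complete(), every lock queued since C was
           *    acquired (here: m) gets a 'C -> m' dependency added.
           * 3. Release: the reference on the crosslock C is dropped. */
          complete(&c);
          return 0;
  }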
425
426CONCLUSION
427
428Crossrelease introduces commit step to handle dependencies of crosslocks
429in batches at a proper time.
430
431
432==============
433Implementation
434==============
435
436Data structures
437---------------
438
439Crossrelease introduces two main data structures.
440
4411. hist_lock
442
443 This is an array embedded in task_struct, keeping the lock history so
444 that dependencies can be added using it at the commit step. Since
445 it's local data, it can be accessed locklessly in the owner context.
446 The array is filled at the acquisition step and consumed at the
447 commit step. And it's managed in a circular manner.
448
4492. cross_lock
450
451 One exists per lockdep_map. It keeps the data of a crosslock and is
452 used at the commit step, as sketched below.
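
Abridged from the include/linux/lockdep.h hunk removed later in this patch, the
two structures look roughly like this (the stack-trace fields are omitted here):

  struct hist_lock {
          unsigned int     hist_id;       /* detects ring-buffer overwrite */
          struct held_lock hlock;         /* snapshot used at the commit step */
  };

  struct cross_lock {
          int              nr_acquire;    /* in-flight acquisitions of the crosslock */
          struct held_lock hlock;         /* snapshot used at the commit step */
  };

  struct lockdep_map_cross {
          struct lockdep_map map;
          struct cross_lock  xlock;
  };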
453
454
455How crossrelease works
456----------------------
457
458The key to how crossrelease works is to defer the necessary work to an
459appropriate point in time and perform it all at once at the commit step.
460Let's take a look with examples, step by step, starting from how lockdep
461works without crossrelease for typical locks.
462
463 acquire A /* Push A onto held_locks */
464 acquire B /* Push B onto held_locks and add 'A -> B' */
465 acquire C /* Push C onto held_locks and add 'B -> C' */
466 release C /* Pop C from held_locks */
467 release B /* Pop B from held_locks */
468 release A /* Pop A from held_locks */
469
470 where A, B and C are different lock classes.
471
472 NOTE: This document assumes that readers already understand how
473 lockdep works without crossrelease and thus omits details. But there's
474 one thing to note. Lockdep pretends to pop a lock from held_locks
475 when releasing it. But it's subtly different from the original pop
476 operation because lockdep allows entries other than the top to be popped.
477
478In this case, lockdep adds 'the top of held_locks -> the lock to acquire'
479dependency every time acquiring a lock.
480
481After adding 'A -> B', a dependency graph will be:
482
483 A -> B
484
485 where A and B are different lock classes.
486
487And after adding 'B -> C', the graph will be:
488
489 A -> B -> C
490
491 where A, B and C are different lock classes.
492
493Let's perform the commit step even for typical locks to add dependencies.
494Of course, the commit step is not necessary for them; however, it still
495works because this is a more general way.
496
497 acquire A
498 /*
499 * Queue A into hist_locks
500 *
501 * In hist_locks: A
502 * In graph: Empty
503 */
504
505 acquire B
506 /*
507 * Queue B into hist_locks
508 *
509 * In hist_locks: A, B
510 * In graph: Empty
511 */
512
513 acquire C
514 /*
515 * Queue C into hist_locks
516 *
517 * In hist_locks: A, B, C
518 * In graph: Empty
519 */
520
521 commit C
522 /*
523 * Add 'C -> ?'
524 * Answer the following to decide '?'
525 * What has been queued since acquire C: Nothing
526 *
527 * In hist_locks: A, B, C
528 * In graph: Empty
529 */
530
531 release C
532
533 commit B
534 /*
535 * Add 'B -> ?'
536 * Answer the following to decide '?'
537 * What has been queued since acquire B: C
538 *
539 * In hist_locks: A, B, C
540 * In graph: 'B -> C'
541 */
542
543 release B
544
545 commit A
546 /*
547 * Add 'A -> ?'
548 * Answer the following to decide '?'
549 * What has been queued since acquire A: B, C
550 *
551 * In hist_locks: A, B, C
552 * In graph: 'B -> C', 'A -> B', 'A -> C'
553 */
554
555 release A
556
557 where A, B and C are different lock classes.
558
559In this case, dependencies are added at the commit step as described.
560
561After commits for A, B and C, the graph will be:
562
563 A -> B -> C
564
565 where A, B and C are different lock classes.
566
567 NOTE: A dependency 'A -> C' is optimized out.
568
569We can see that the former graph, built without the commit step, is the
570same as the latter graph built using commit steps. Of course, the former
571way finishes building the graph earlier, which means we can detect a
572deadlock or its possibility sooner. So the former way would be preferred
573when possible. But we cannot avoid using the latter way for crosslocks.
574
575Let's look at how commit steps work for crosslocks. In this case, the
576commit step is only actually performed on the crosslock BX. And it is
577assumed that the BX release context is different from the BX acquire context.
578
579 BX RELEASE CONTEXT                    BX ACQUIRE CONTEXT
580 ------------------                    ------------------
581                                       acquire A
582                                       /*
583                                        * Push A onto held_locks
584                                        * Queue A into hist_locks
585                                        *
586                                        * In held_locks: A
587                                        * In hist_locks: A
588                                        * In graph: Empty
589                                        */
590
591                                       acquire BX
592                                       /*
593                                        * Add 'the top of held_locks -> BX'
594                                        *
595                                        * In held_locks: A
596                                        * In hist_locks: A
597                                        * In graph: 'A -> BX'
598                                        */
599
600 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
601 It must be guaranteed that the following operations are globally seen
602 as happening after the acquisition of BX, e.g. by using a memory barrier.
603 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
604
605 acquire C
606 /*
607  * Push C onto held_locks
608  * Queue C into hist_locks
609  *
610  * In held_locks: C
611  * In hist_locks: C
612  * In graph: 'A -> BX'
613  */
614
615 release C
616 /*
617  * Pop C from held_locks
618  *
619  * In held_locks: Empty
620  * In hist_locks: C
621  * In graph: 'A -> BX'
622  */
623                                       acquire D
624                                       /*
625                                        * Push D onto held_locks
626                                        * Queue D into hist_locks
627                                        * Add 'the top of held_locks -> D'
628                                        *
629                                        * In held_locks: A, D
630                                        * In hist_locks: A, D
631                                        * In graph: 'A -> BX', 'A -> D'
632                                        */
633 acquire E
634 /*
635  * Push E onto held_locks
636  * Queue E into hist_locks
637  *
638  * In held_locks: E
639  * In hist_locks: C, E
640  * In graph: 'A -> BX', 'A -> D'
641  */
642
643 release E
644 /*
645  * Pop E from held_locks
646  *
647  * In held_locks: Empty
648  * In hist_locks: C, E
649  * In graph: 'A -> BX', 'A -> D'
650  */
651                                       release D
652                                       /*
653                                        * Pop D from held_locks
654                                        *
655                                        * In held_locks: A
656                                        * In hist_locks: A, D
657                                        * In graph: 'A -> BX', 'A -> D'
658                                        */
659 commit BX
660 /*
661  * Add 'BX -> ?'
662  * What has been queued since acquire BX: C, E
663  *
664  * In held_locks: Empty
665  * In hist_locks: C, E
666  * In graph: 'A -> BX', 'A -> D',
667  *           'BX -> C', 'BX -> E'
668  */
669
670 release BX
671 /*
672  * In held_locks: Empty
673  * In hist_locks: C, E
674  * In graph: 'A -> BX', 'A -> D',
675  *           'BX -> C', 'BX -> E'
676  */
677                                       release A
678                                       /*
679                                        * Pop A from held_locks
680                                        *
681                                        * In held_locks: Empty
682                                        * In hist_locks: A, D
683                                        * In graph: 'A -> BX', 'A -> D',
684                                        *           'BX -> C', 'BX -> E'
685                                        */
686
687 where A, BX, C,..., E are different lock classes, and a suffix 'X' is
688 added on crosslocks.
689
690Crossrelease considers all acquisitions after acquiring BX as candidates
691which might create dependencies with BX. The true dependencies will be
692determined when the release context of BX is identified. Meanwhile, all
693typical locks are queued so that they can be used at the commit step.
694And then the two dependencies 'BX -> C' and 'BX -> E' are added at the
695commit step, when the release context is identified.
696
697The final graph will be, with crossrelease:
698
699 -> C
700 /
701 -> BX -
702 / \
703 A - -> E
704 \
705 -> D
706
707 where A, BX, C,..., E are different lock classes, and a suffix 'X' is
708 added on crosslocks.
709
710However, the final graph will be, without crossrelease:
711
712 A -> D
713
714 where A and D are different lock classes.
715
716The former graph has three more dependencies, 'A -> BX', 'BX -> C' and
717'BX -> E' giving additional opportunities to check if they cause
718deadlocks. This way lockdep can detect a deadlock or its possibility
719caused by crosslocks.
720
721CONCLUSION
722
723We checked how crossrelease works with several examples.
724
725
726=============
727Optimizations
728=============
729
730Avoid duplication
731-----------------
732
733The crossrelease feature uses a cache similar to the one lockdep already
734uses for dependency chains, but this time for caching CT type dependencies.
735Once a dependency is cached, the same one will never be added again.
736
737
738Lockless for hot paths
739----------------------
740
741To keep all locks for later use at the commit step, crossrelease adopts
742a local array embedded in task_struct, which makes access to the data
743lockless by forcing it to happen only within the owner context. This is
744similar to how lockdep handles held_locks. A lockless implementation is
745important since typical locks are very frequently acquired and released.
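
A minimal sketch of the idea, using the per-task fields this patch removes
from task_struct (xhlocks, xhlock_idx, MAX_XHLOCKS_NR): since only the owning
task ever touches its own history, no lock or atomic operation is needed:

  static void queue_hist_lock(struct held_lock *hlock)
  {
          unsigned int idx = current->xhlock_idx++ % MAX_XHLOCKS_NR;

          current->xhlocks[idx].hlock = *hlock;   /* owner-only access */
  }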
746
747
748==================================================
749APPENDIX A: What lockdep does to work aggressively
750==================================================
751
752A deadlock actually occurs only when all the wait operations creating
753circular dependencies run at the same time. Even when they don't, a
754potential deadlock exists if the problematic dependencies exist. Thus it's
755meaningful to detect not only an actual deadlock but also its potential
756possibility, and the latter is arguably more valuable. When a deadlock
757actually occurs, we can identify what happens in the system by some means
758or other even without lockdep. However, there's no way to detect a mere
759possibility without lockdep, short of auditing all the code by hand.
760Lockdep does both, and crossrelease focuses only on the latter.
761
762Whether or not a deadlock actually occurs depends on several factors.
763For example, what order contexts are switched in is a factor. Assuming
764circular dependencies exist, a deadlock would occur when contexts are
765switched so that all wait operations creating the dependencies run
766simultaneously. Thus to detect a deadlock possibility even in the case
767that it has not occurred yet, lockdep should consider all possible
768combinations of dependencies, trying to:
769
7701. Use a global dependency graph.
771
772 Lockdep combines all dependencies into one global graph and uses them,
773 regardless of which context generated them or in what order contexts
774 are switched. Since only the aggregated dependencies are considered,
775 they readily become circular whenever a problem exists.
776
7772. Check dependencies between classes instead of instances.
778
779 What actually causes a deadlock are lock instances. However, lockdep
780 checks dependencies between classes instead of instances. This way
781 lockdep can detect a deadlock which has not happened yet but might happen
782 in the future with other instances of the same classes (sketched below).
783
7843. Assume all acquisitions lead to waiting.
785
786 Although a lock might be acquired without waiting, even though waiting
787 is what actually creates dependencies, lockdep assumes all acquisitions
788 lead to waiting, since that might become true at some time or another.
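
For point 2, consider the hypothetical sketch below. The two threads use
disjoint lock instances, so this exact code can never deadlock, yet all a[]
locks share one class and all b[] locks share another (one class per
mutex_init() call site), so lockdep still records the class-level 'A -> B'
and 'B -> A' dependencies and reports the possible inversion:

  #include <linux/mutex.h>

  struct a_obj { struct mutex lock; };    /* lock class "A" */
  struct b_obj { struct mutex lock; };    /* lock class "B" */

  static struct a_obj a[2];
  static struct b_obj b[2];

  static void init_objs(void)
  {
          int i;

          for (i = 0; i < 2; i++) {
                  mutex_init(&a[i].lock); /* one call site: one class for all a[] */
                  mutex_init(&b[i].lock); /* one call site: one class for all b[] */
          }
  }

  static void thread_one(void)
  {
          mutex_lock(&a[0].lock);
          mutex_lock(&b[0].lock);         /* class dependency 'A -> B' */
          mutex_unlock(&b[0].lock);
          mutex_unlock(&a[0].lock);
  }

  static void thread_two(void)
  {
          mutex_lock(&b[1].lock);
          mutex_lock(&a[1].lock);         /* class dependency 'B -> A': reported */
          mutex_unlock(&a[1].lock);
          mutex_unlock(&b[1].lock);
  }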
789
790CONCLUSION
791
792Lockdep detects not only an actual deadlock but also its possibility,
793and the latter is more valuable.
794
795
796==================================================
797APPENDIX B: How to avoid adding false dependencies
798==================================================
799
800Recall what a dependency is. A dependency exists if:
801
802 1. There are two waiters waiting for each event at a given time.
803 2. The only way to wake up each waiter is to trigger its event.
804 3. Whether one can be woken up depends on whether the other can.
805
806For example:
807
808 acquire A
809 acquire B /* A dependency 'A -> B' exists */
810 release B
811 release A
812
813 where A and B are different lock classes.
814
815A dependency 'A -> B' exists since:
816
817 1. A waiter for A and a waiter for B might exist when acquiring B.
818 2. The only way to wake up each is to release what it waits for.
819 3. Whether the waiter for A can be woken up depends on whether the
820 other can. IOW, TASK X cannot release A if it fails to acquire B.
821
822For another example:
823
824 TASK X                TASK Y
825 ------                ------
826                       acquire AX
827 acquire B /* A dependency 'AX -> B' exists */
828 release B
829 release AX held by Y
830
831 where AX and B are different lock classes, and a suffix 'X' is added
832 on crosslocks.
833
834Even in this case involving crosslocks, the same rule can be applied. A
835dependency 'AX -> B' exists since:
836
837 1. A waiter for AX and a waiter for B might exist when acquiring B.
838 2. The only way to wake up each is to release what it waits for.
839 3. Whether the waiter for AX can be woken up depends on whether the
840 other can. IOW, TASK X cannot release AX if it fails to acquire B.
841
842Let's take a look at a more complicated example:
843
844 TASK X                TASK Y
845 ------                ------
846 acquire B
847 release B
848 fork Y
849                       acquire AX
850 acquire C /* A dependency 'AX -> C' exists */
851 release C
852 release AX held by Y
853
854 where AX, B and C are different lock classes, and a suffix 'X' is
855 added on crosslocks.
856
857Does a dependency 'AX -> B' exist? Nope.
858
859Two waiters are essential to create a dependency. However, waiters for
860AX and B to create 'AX -> B' cannot exist at the same time in this
861example. Thus the dependency 'AX -> B' cannot be created.
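
The same example, sketched in hypothetical C with AX as a completion (assume,
as in the example, that TASK Y is already waiting when TASK X takes 'c'): B is
completely released before any waiter for AX exists, so the two waiters needed
for an 'AX -> B' dependency can never overlap, and crossrelease's commit only
sees C, which was acquired after AX:

  #include <linux/mutex.h>
  #include <linux/completion.h>
  #include <linux/kthread.h>

  static DEFINE_MUTEX(b);
  static DEFINE_MUTEX(c);
  static DECLARE_COMPLETION(ax);

  static int task_y(void *unused)
  {
          wait_for_completion(&ax);       /* Y acquires the crosslock AX */
          return 0;
  }

  static void task_x(void)
  {
          mutex_lock(&b);
          mutex_unlock(&b);               /* B is history before AX is acquired */

          kthread_run(task_y, NULL, "task_y");    /* "fork Y" */

          mutex_lock(&c);                 /* queued: a true 'AX -> C' candidate */
          mutex_unlock(&c);
          complete(&ax);                  /* release context of AX: commit adds 'AX -> C' only */
  }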
862
863It would be ideal if the full set of true dependencies could be
864considered. But we can be sure of nothing except what actually happened.
865By relying on what actually happens at runtime, we can at least add only
866true dependencies, though they might be a subset of all the true ones.
867It's similar to how lockdep works for typical locks: there might be more
868true dependencies than what lockdep has detected at runtime. Lockdep has
869no choice but to rely on what actually happens. Crossrelease also relies on it.
870
871CONCLUSION
872
873Relying on what actually happens, lockdep can avoid adding false
874dependencies.
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 188ed9f65517..52e611ab9a6c 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -220,21 +220,21 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
220/* 220/*
221 * Prevent the compiler from merging or refetching reads or writes. The 221 * Prevent the compiler from merging or refetching reads or writes. The
222 * compiler is also forbidden from reordering successive instances of 222 * compiler is also forbidden from reordering successive instances of
223 * READ_ONCE, WRITE_ONCE and ACCESS_ONCE (see below), but only when the 223 * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
224 * compiler is aware of some particular ordering. One way to make the 224 * particular ordering. One way to make the compiler aware of ordering is to
225 * compiler aware of ordering is to put the two invocations of READ_ONCE, 225 * put the two invocations of READ_ONCE or WRITE_ONCE in different C
226 * WRITE_ONCE or ACCESS_ONCE() in different C statements. 226 * statements.
227 * 227 *
228 * In contrast to ACCESS_ONCE these two macros will also work on aggregate 228 * These two macros will also work on aggregate data types like structs or
229 * data types like structs or unions. If the size of the accessed data 229 * unions. If the size of the accessed data type exceeds the word size of
230 * type exceeds the word size of the machine (e.g., 32 bits or 64 bits) 230 * the machine (e.g., 32 bits or 64 bits) READ_ONCE() and WRITE_ONCE() will
231 * READ_ONCE() and WRITE_ONCE() will fall back to memcpy(). There's at 231 * fall back to memcpy(). There's at least two memcpy()s: one for the
232 * least two memcpy()s: one for the __builtin_memcpy() and then one for 232 * __builtin_memcpy() and then one for the macro doing the copy of variable
233 * the macro doing the copy of variable - '__u' allocated on the stack. 233 * - '__u' allocated on the stack.
234 * 234 *
235 * Their two major use cases are: (1) Mediating communication between 235 * Their two major use cases are: (1) Mediating communication between
236 * process-level code and irq/NMI handlers, all running on the same CPU, 236 * process-level code and irq/NMI handlers, all running on the same CPU,
237 * and (2) Ensuring that the compiler does not fold, spindle, or otherwise 237 * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
238 * mutilate accesses that either do not require ordering or that interact 238 * mutilate accesses that either do not require ordering or that interact
239 * with an explicit memory barrier or atomic instruction that provides the 239 * with an explicit memory barrier or atomic instruction that provides the
240 * required ordering. 240 * required ordering.
@@ -327,29 +327,4 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
327 compiletime_assert(__native_word(t), \ 327 compiletime_assert(__native_word(t), \
328 "Need native word sized stores/loads for atomicity.") 328 "Need native word sized stores/loads for atomicity.")
329 329
330/*
331 * Prevent the compiler from merging or refetching accesses. The compiler
332 * is also forbidden from reordering successive instances of ACCESS_ONCE(),
333 * but only when the compiler is aware of some particular ordering. One way
334 * to make the compiler aware of ordering is to put the two invocations of
335 * ACCESS_ONCE() in different C statements.
336 *
337 * ACCESS_ONCE will only work on scalar types. For union types, ACCESS_ONCE
338 * on a union member will work as long as the size of the member matches the
339 * size of the union and the size is smaller than word size.
340 *
341 * The major use cases of ACCESS_ONCE used to be (1) Mediating communication
342 * between process-level code and irq/NMI handlers, all running on the same CPU,
343 * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
344 * mutilate accesses that either do not require ordering or that interact
345 * with an explicit memory barrier or atomic instruction that provides the
346 * required ordering.
347 *
348 * If possible use READ_ONCE()/WRITE_ONCE() instead.
349 */
350#define __ACCESS_ONCE(x) ({ \
351 __maybe_unused typeof(x) __var = (__force typeof(x)) 0; \
352 (volatile typeof(x) *)&(x); })
353#define ACCESS_ONCE(x) (*__ACCESS_ONCE(x))
354
355#endif /* __LINUX_COMPILER_H */ 330#endif /* __LINUX_COMPILER_H */
diff --git a/include/linux/completion.h b/include/linux/completion.h
index 0662a417febe..94a59ba7d422 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -10,9 +10,6 @@
10 */ 10 */
11 11
12#include <linux/wait.h> 12#include <linux/wait.h>
13#ifdef CONFIG_LOCKDEP_COMPLETIONS
14#include <linux/lockdep.h>
15#endif
16 13
17/* 14/*
18 * struct completion - structure used to maintain state for a "completion" 15 * struct completion - structure used to maintain state for a "completion"
@@ -29,58 +26,16 @@
29struct completion { 26struct completion {
30 unsigned int done; 27 unsigned int done;
31 wait_queue_head_t wait; 28 wait_queue_head_t wait;
32#ifdef CONFIG_LOCKDEP_COMPLETIONS
33 struct lockdep_map_cross map;
34#endif
35}; 29};
36 30
37#ifdef CONFIG_LOCKDEP_COMPLETIONS
38static inline void complete_acquire(struct completion *x)
39{
40 lock_acquire_exclusive((struct lockdep_map *)&x->map, 0, 0, NULL, _RET_IP_);
41}
42
43static inline void complete_release(struct completion *x)
44{
45 lock_release((struct lockdep_map *)&x->map, 0, _RET_IP_);
46}
47
48static inline void complete_release_commit(struct completion *x)
49{
50 lock_commit_crosslock((struct lockdep_map *)&x->map);
51}
52
53#define init_completion_map(x, m) \
54do { \
55 lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \
56 (m)->name, (m)->key, 0); \
57 __init_completion(x); \
58} while (0)
59
60#define init_completion(x) \
61do { \
62 static struct lock_class_key __key; \
63 lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \
64 "(completion)" #x, \
65 &__key, 0); \
66 __init_completion(x); \
67} while (0)
68#else
69#define init_completion_map(x, m) __init_completion(x) 31#define init_completion_map(x, m) __init_completion(x)
70#define init_completion(x) __init_completion(x) 32#define init_completion(x) __init_completion(x)
71static inline void complete_acquire(struct completion *x) {} 33static inline void complete_acquire(struct completion *x) {}
72static inline void complete_release(struct completion *x) {} 34static inline void complete_release(struct completion *x) {}
73static inline void complete_release_commit(struct completion *x) {} 35static inline void complete_release_commit(struct completion *x) {}
74#endif
75 36
76#ifdef CONFIG_LOCKDEP_COMPLETIONS
77#define COMPLETION_INITIALIZER(work) \
78 { 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
79 STATIC_CROSS_LOCKDEP_MAP_INIT("(completion)" #work, &(work)) }
80#else
81#define COMPLETION_INITIALIZER(work) \ 37#define COMPLETION_INITIALIZER(work) \
82 { 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) } 38 { 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
83#endif
84 39
85#define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \ 40#define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
86 (*({ init_completion_map(&(work), &(map)); &(work); })) 41 (*({ init_completion_map(&(work), &(map)); &(work); }))
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index a842551fe044..2e75dc34bff5 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -158,12 +158,6 @@ struct lockdep_map {
158 int cpu; 158 int cpu;
159 unsigned long ip; 159 unsigned long ip;
160#endif 160#endif
161#ifdef CONFIG_LOCKDEP_CROSSRELEASE
162 /*
163 * Whether it's a crosslock.
164 */
165 int cross;
166#endif
167}; 161};
168 162
169static inline void lockdep_copy_map(struct lockdep_map *to, 163static inline void lockdep_copy_map(struct lockdep_map *to,
@@ -267,96 +261,9 @@ struct held_lock {
267 unsigned int hardirqs_off:1; 261 unsigned int hardirqs_off:1;
268 unsigned int references:12; /* 32 bits */ 262 unsigned int references:12; /* 32 bits */
269 unsigned int pin_count; 263 unsigned int pin_count;
270#ifdef CONFIG_LOCKDEP_CROSSRELEASE
271 /*
272 * Generation id.
273 *
274 * A value of cross_gen_id will be stored when holding this,
275 * which is globally increased whenever each crosslock is held.
276 */
277 unsigned int gen_id;
278#endif
279};
280
281#ifdef CONFIG_LOCKDEP_CROSSRELEASE
282#define MAX_XHLOCK_TRACE_ENTRIES 5
283
284/*
285 * This is for keeping locks waiting for commit so that true dependencies
286 * can be added at commit step.
287 */
288struct hist_lock {
289 /*
290 * Id for each entry in the ring buffer. This is used to
291 * decide whether the ring buffer was overwritten or not.
292 *
293 * For example,
294 *
295 * |<----------- hist_lock ring buffer size ------->|
296 * pppppppppppppppppppppiiiiiiiiiiiiiiiiiiiiiiiiiiiii
297 * wrapped > iiiiiiiiiiiiiiiiiiiiiiiiiii.......................
298 *
299 * where 'p' represents an acquisition in process
300 * context, 'i' represents an acquisition in irq
301 * context.
302 *
303 * In this example, the ring buffer was overwritten by
304 * acquisitions in irq context, that should be detected on
305 * rollback or commit.
306 */
307 unsigned int hist_id;
308
309 /*
310 * Seperate stack_trace data. This will be used at commit step.
311 */
312 struct stack_trace trace;
313 unsigned long trace_entries[MAX_XHLOCK_TRACE_ENTRIES];
314
315 /*
316 * Seperate hlock instance. This will be used at commit step.
317 *
318 * TODO: Use a smaller data structure containing only necessary
319 * data. However, we should make lockdep code able to handle the
320 * smaller one first.
321 */
322 struct held_lock hlock;
323}; 264};
324 265
325/* 266/*
326 * To initialize a lock as crosslock, lockdep_init_map_crosslock() should
327 * be called instead of lockdep_init_map().
328 */
329struct cross_lock {
330 /*
331 * When more than one acquisition of crosslocks are overlapped,
332 * we have to perform commit for them based on cross_gen_id of
333 * the first acquisition, which allows us to add more true
334 * dependencies.
335 *
336 * Moreover, when no acquisition of a crosslock is in progress,
337 * we should not perform commit because the lock might not exist
338 * any more, which might cause incorrect memory access. So we
339 * have to track the number of acquisitions of a crosslock.
340 */
341 int nr_acquire;
342
343 /*
344 * Seperate hlock instance. This will be used at commit step.
345 *
346 * TODO: Use a smaller data structure containing only necessary
347 * data. However, we should make lockdep code able to handle the
348 * smaller one first.
349 */
350 struct held_lock hlock;
351};
352
353struct lockdep_map_cross {
354 struct lockdep_map map;
355 struct cross_lock xlock;
356};
357#endif
358
359/*
360 * Initialization, self-test and debugging-output methods: 267 * Initialization, self-test and debugging-output methods:
361 */ 268 */
362extern void lockdep_info(void); 269extern void lockdep_info(void);
@@ -560,37 +467,6 @@ enum xhlock_context_t {
560 XHLOCK_CTX_NR, 467 XHLOCK_CTX_NR,
561}; 468};
562 469
563#ifdef CONFIG_LOCKDEP_CROSSRELEASE
564extern void lockdep_init_map_crosslock(struct lockdep_map *lock,
565 const char *name,
566 struct lock_class_key *key,
567 int subclass);
568extern void lock_commit_crosslock(struct lockdep_map *lock);
569
570/*
571 * What we essencially have to initialize is 'nr_acquire'. Other members
572 * will be initialized in add_xlock().
573 */
574#define STATIC_CROSS_LOCK_INIT() \
575 { .nr_acquire = 0,}
576
577#define STATIC_CROSS_LOCKDEP_MAP_INIT(_name, _key) \
578 { .map.name = (_name), .map.key = (void *)(_key), \
579 .map.cross = 1, .xlock = STATIC_CROSS_LOCK_INIT(), }
580
581/*
582 * To initialize a lockdep_map statically use this macro.
583 * Note that _name must not be NULL.
584 */
585#define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
586 { .name = (_name), .key = (void *)(_key), .cross = 0, }
587
588extern void crossrelease_hist_start(enum xhlock_context_t c);
589extern void crossrelease_hist_end(enum xhlock_context_t c);
590extern void lockdep_invariant_state(bool force);
591extern void lockdep_init_task(struct task_struct *task);
592extern void lockdep_free_task(struct task_struct *task);
593#else /* !CROSSRELEASE */
594#define lockdep_init_map_crosslock(m, n, k, s) do {} while (0) 470#define lockdep_init_map_crosslock(m, n, k, s) do {} while (0)
595/* 471/*
596 * To initialize a lockdep_map statically use this macro. 472 * To initialize a lockdep_map statically use this macro.
@@ -604,7 +480,6 @@ static inline void crossrelease_hist_end(enum xhlock_context_t c) {}
604static inline void lockdep_invariant_state(bool force) {} 480static inline void lockdep_invariant_state(bool force) {}
605static inline void lockdep_init_task(struct task_struct *task) {} 481static inline void lockdep_init_task(struct task_struct *task) {}
606static inline void lockdep_free_task(struct task_struct *task) {} 482static inline void lockdep_free_task(struct task_struct *task) {}
607#endif /* CROSSRELEASE */
608 483
609#ifdef CONFIG_LOCK_STAT 484#ifdef CONFIG_LOCK_STAT
610 485
diff --git a/include/linux/rwlock_types.h b/include/linux/rwlock_types.h
index cc0072e93e36..857a72ceb794 100644
--- a/include/linux/rwlock_types.h
+++ b/include/linux/rwlock_types.h
@@ -10,9 +10,6 @@
10 */ 10 */
11typedef struct { 11typedef struct {
12 arch_rwlock_t raw_lock; 12 arch_rwlock_t raw_lock;
13#ifdef CONFIG_GENERIC_LOCKBREAK
14 unsigned int break_lock;
15#endif
16#ifdef CONFIG_DEBUG_SPINLOCK 13#ifdef CONFIG_DEBUG_SPINLOCK
17 unsigned int magic, owner_cpu; 14 unsigned int magic, owner_cpu;
18 void *owner; 15 void *owner;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5124ba709830..d2588263a989 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -849,17 +849,6 @@ struct task_struct {
849 struct held_lock held_locks[MAX_LOCK_DEPTH]; 849 struct held_lock held_locks[MAX_LOCK_DEPTH];
850#endif 850#endif
851 851
852#ifdef CONFIG_LOCKDEP_CROSSRELEASE
853#define MAX_XHLOCKS_NR 64UL
854 struct hist_lock *xhlocks; /* Crossrelease history locks */
855 unsigned int xhlock_idx;
856 /* For restoring at history boundaries */
857 unsigned int xhlock_idx_hist[XHLOCK_CTX_NR];
858 unsigned int hist_id;
859 /* For overwrite check at each context exit */
860 unsigned int hist_id_save[XHLOCK_CTX_NR];
861#endif
862
863#ifdef CONFIG_UBSAN 852#ifdef CONFIG_UBSAN
864 unsigned int in_ubsan; 853 unsigned int in_ubsan;
865#endif 854#endif
diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index a39186194cd6..3bf273538840 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -107,16 +107,11 @@ do { \
107 107
108#define raw_spin_is_locked(lock) arch_spin_is_locked(&(lock)->raw_lock) 108#define raw_spin_is_locked(lock) arch_spin_is_locked(&(lock)->raw_lock)
109 109
110#ifdef CONFIG_GENERIC_LOCKBREAK
111#define raw_spin_is_contended(lock) ((lock)->break_lock)
112#else
113
114#ifdef arch_spin_is_contended 110#ifdef arch_spin_is_contended
115#define raw_spin_is_contended(lock) arch_spin_is_contended(&(lock)->raw_lock) 111#define raw_spin_is_contended(lock) arch_spin_is_contended(&(lock)->raw_lock)
116#else 112#else
117#define raw_spin_is_contended(lock) (((void)(lock), 0)) 113#define raw_spin_is_contended(lock) (((void)(lock), 0))
118#endif /*arch_spin_is_contended*/ 114#endif /*arch_spin_is_contended*/
119#endif
120 115
121/* 116/*
122 * This barrier must provide two things: 117 * This barrier must provide two things:
diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h
index 73548eb13a5d..24b4e6f2c1a2 100644
--- a/include/linux/spinlock_types.h
+++ b/include/linux/spinlock_types.h
@@ -19,9 +19,6 @@
19 19
20typedef struct raw_spinlock { 20typedef struct raw_spinlock {
21 arch_spinlock_t raw_lock; 21 arch_spinlock_t raw_lock;
22#ifdef CONFIG_GENERIC_LOCKBREAK
23 unsigned int break_lock;
24#endif
25#ifdef CONFIG_DEBUG_SPINLOCK 22#ifdef CONFIG_DEBUG_SPINLOCK
26 unsigned int magic, owner_cpu; 23 unsigned int magic, owner_cpu;
27 void *owner; 24 void *owner;
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 670d8d7d8087..5fa1324a4f29 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -57,10 +57,6 @@
57#define CREATE_TRACE_POINTS 57#define CREATE_TRACE_POINTS
58#include <trace/events/lock.h> 58#include <trace/events/lock.h>
59 59
60#ifdef CONFIG_LOCKDEP_CROSSRELEASE
61#include <linux/slab.h>
62#endif
63
64#ifdef CONFIG_PROVE_LOCKING 60#ifdef CONFIG_PROVE_LOCKING
65int prove_locking = 1; 61int prove_locking = 1;
66module_param(prove_locking, int, 0644); 62module_param(prove_locking, int, 0644);
@@ -75,19 +71,6 @@ module_param(lock_stat, int, 0644);
75#define lock_stat 0 71#define lock_stat 0
76#endif 72#endif
77 73
78#ifdef CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK
79static int crossrelease_fullstack = 1;
80#else
81static int crossrelease_fullstack;
82#endif
83static int __init allow_crossrelease_fullstack(char *str)
84{
85 crossrelease_fullstack = 1;
86 return 0;
87}
88
89early_param("crossrelease_fullstack", allow_crossrelease_fullstack);
90
91/* 74/*
92 * lockdep_lock: protects the lockdep graph, the hashes and the 75 * lockdep_lock: protects the lockdep graph, the hashes and the
93 * class/list/hash allocators. 76 * class/list/hash allocators.
@@ -740,18 +723,6 @@ look_up_lock_class(struct lockdep_map *lock, unsigned int subclass)
740 return is_static || static_obj(lock->key) ? NULL : ERR_PTR(-EINVAL); 723 return is_static || static_obj(lock->key) ? NULL : ERR_PTR(-EINVAL);
741} 724}
742 725
743#ifdef CONFIG_LOCKDEP_CROSSRELEASE
744static void cross_init(struct lockdep_map *lock, int cross);
745static int cross_lock(struct lockdep_map *lock);
746static int lock_acquire_crosslock(struct held_lock *hlock);
747static int lock_release_crosslock(struct lockdep_map *lock);
748#else
749static inline void cross_init(struct lockdep_map *lock, int cross) {}
750static inline int cross_lock(struct lockdep_map *lock) { return 0; }
751static inline int lock_acquire_crosslock(struct held_lock *hlock) { return 2; }
752static inline int lock_release_crosslock(struct lockdep_map *lock) { return 2; }
753#endif
754
755/* 726/*
756 * Register a lock's class in the hash-table, if the class is not present 727 * Register a lock's class in the hash-table, if the class is not present
757 * yet. Otherwise we look it up. We cache the result in the lock object 728 * yet. Otherwise we look it up. We cache the result in the lock object
@@ -1151,41 +1122,22 @@ print_circular_lock_scenario(struct held_lock *src,
1151 printk(KERN_CONT "\n\n"); 1122 printk(KERN_CONT "\n\n");
1152 } 1123 }
1153 1124
1154 if (cross_lock(tgt->instance)) { 1125 printk(" Possible unsafe locking scenario:\n\n");
1155 printk(" Possible unsafe locking scenario by crosslock:\n\n"); 1126 printk(" CPU0 CPU1\n");
1156 printk(" CPU0 CPU1\n"); 1127 printk(" ---- ----\n");
1157 printk(" ---- ----\n"); 1128 printk(" lock(");
1158 printk(" lock("); 1129 __print_lock_name(target);
1159 __print_lock_name(parent); 1130 printk(KERN_CONT ");\n");
1160 printk(KERN_CONT ");\n"); 1131 printk(" lock(");
1161 printk(" lock("); 1132 __print_lock_name(parent);
1162 __print_lock_name(target); 1133 printk(KERN_CONT ");\n");
1163 printk(KERN_CONT ");\n"); 1134 printk(" lock(");
1164 printk(" lock("); 1135 __print_lock_name(target);
1165 __print_lock_name(source); 1136 printk(KERN_CONT ");\n");
1166 printk(KERN_CONT ");\n"); 1137 printk(" lock(");
1167 printk(" unlock("); 1138 __print_lock_name(source);
1168 __print_lock_name(target); 1139 printk(KERN_CONT ");\n");
1169 printk(KERN_CONT ");\n"); 1140 printk("\n *** DEADLOCK ***\n\n");
1170 printk("\n *** DEADLOCK ***\n\n");
1171 } else {
1172 printk(" Possible unsafe locking scenario:\n\n");
1173 printk(" CPU0 CPU1\n");
1174 printk(" ---- ----\n");
1175 printk(" lock(");
1176 __print_lock_name(target);
1177 printk(KERN_CONT ");\n");
1178 printk(" lock(");
1179 __print_lock_name(parent);
1180 printk(KERN_CONT ");\n");
1181 printk(" lock(");
1182 __print_lock_name(target);
1183 printk(KERN_CONT ");\n");
1184 printk(" lock(");
1185 __print_lock_name(source);
1186 printk(KERN_CONT ");\n");
1187 printk("\n *** DEADLOCK ***\n\n");
1188 }
1189} 1141}
1190 1142
1191/* 1143/*
@@ -1211,10 +1163,7 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
1211 curr->comm, task_pid_nr(curr)); 1163 curr->comm, task_pid_nr(curr));
1212 print_lock(check_src); 1164 print_lock(check_src);
1213 1165
1214 if (cross_lock(check_tgt->instance)) 1166 pr_warn("\nbut task is already holding lock:\n");
1215 pr_warn("\nbut now in release context of a crosslock acquired at the following:\n");
1216 else
1217 pr_warn("\nbut task is already holding lock:\n");
1218 1167
1219 print_lock(check_tgt); 1168 print_lock(check_tgt);
1220 pr_warn("\nwhich lock already depends on the new lock.\n\n"); 1169 pr_warn("\nwhich lock already depends on the new lock.\n\n");
@@ -1244,9 +1193,7 @@ static noinline int print_circular_bug(struct lock_list *this,
1244 if (!debug_locks_off_graph_unlock() || debug_locks_silent) 1193 if (!debug_locks_off_graph_unlock() || debug_locks_silent)
1245 return 0; 1194 return 0;
1246 1195
1247 if (cross_lock(check_tgt->instance)) 1196 if (!save_trace(&this->trace))
1248 this->trace = *trace;
1249 else if (!save_trace(&this->trace))
1250 return 0; 1197 return 0;
1251 1198
1252 depth = get_lock_depth(target); 1199 depth = get_lock_depth(target);
@@ -1850,9 +1797,6 @@ check_deadlock(struct task_struct *curr, struct held_lock *next,
1850 if (nest) 1797 if (nest)
1851 return 2; 1798 return 2;
1852 1799
1853 if (cross_lock(prev->instance))
1854 continue;
1855
1856 return print_deadlock_bug(curr, prev, next); 1800 return print_deadlock_bug(curr, prev, next);
1857 } 1801 }
1858 return 1; 1802 return 1;
@@ -2018,31 +1962,26 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
2018 for (;;) { 1962 for (;;) {
2019 int distance = curr->lockdep_depth - depth + 1; 1963 int distance = curr->lockdep_depth - depth + 1;
2020 hlock = curr->held_locks + depth - 1; 1964 hlock = curr->held_locks + depth - 1;
1965
2021 /* 1966 /*
2022 * Only non-crosslock entries get new dependencies added. 1967 * Only non-recursive-read entries get new dependencies
2023 * Crosslock entries will be added by commit later: 1968 * added:
2024 */ 1969 */
2025 if (!cross_lock(hlock->instance)) { 1970 if (hlock->read != 2 && hlock->check) {
1971 int ret = check_prev_add(curr, hlock, next, distance, &trace, save_trace);
1972 if (!ret)
1973 return 0;
1974
2026 /* 1975 /*
2027 * Only non-recursive-read entries get new dependencies 1976 * Stop after the first non-trylock entry,
2028 * added: 1977 * as non-trylock entries have added their
1978 * own direct dependencies already, so this
1979 * lock is connected to them indirectly:
2029 */ 1980 */
2030 if (hlock->read != 2 && hlock->check) { 1981 if (!hlock->trylock)
2031 int ret = check_prev_add(curr, hlock, next, 1982 break;
2032 distance, &trace, save_trace);
2033 if (!ret)
2034 return 0;
2035
2036 /*
2037 * Stop after the first non-trylock entry,
2038 * as non-trylock entries have added their
2039 * own direct dependencies already, so this
2040 * lock is connected to them indirectly:
2041 */
2042 if (!hlock->trylock)
2043 break;
2044 }
2045 } 1983 }
1984
2046 depth--; 1985 depth--;
2047 /* 1986 /*
2048 * End of lock-stack? 1987 * End of lock-stack?
@@ -3292,21 +3231,10 @@ static void __lockdep_init_map(struct lockdep_map *lock, const char *name,
3292void lockdep_init_map(struct lockdep_map *lock, const char *name, 3231void lockdep_init_map(struct lockdep_map *lock, const char *name,
3293 struct lock_class_key *key, int subclass) 3232 struct lock_class_key *key, int subclass)
3294{ 3233{
3295 cross_init(lock, 0);
3296 __lockdep_init_map(lock, name, key, subclass); 3234 __lockdep_init_map(lock, name, key, subclass);
3297} 3235}
3298EXPORT_SYMBOL_GPL(lockdep_init_map); 3236EXPORT_SYMBOL_GPL(lockdep_init_map);
3299 3237
3300#ifdef CONFIG_LOCKDEP_CROSSRELEASE
3301void lockdep_init_map_crosslock(struct lockdep_map *lock, const char *name,
3302 struct lock_class_key *key, int subclass)
3303{
3304 cross_init(lock, 1);
3305 __lockdep_init_map(lock, name, key, subclass);
3306}
3307EXPORT_SYMBOL_GPL(lockdep_init_map_crosslock);
3308#endif
3309
3310struct lock_class_key __lockdep_no_validate__; 3238struct lock_class_key __lockdep_no_validate__;
3311EXPORT_SYMBOL_GPL(__lockdep_no_validate__); 3239EXPORT_SYMBOL_GPL(__lockdep_no_validate__);
3312 3240
@@ -3362,7 +3290,6 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
3362 int chain_head = 0; 3290 int chain_head = 0;
3363 int class_idx; 3291 int class_idx;
3364 u64 chain_key; 3292 u64 chain_key;
3365 int ret;
3366 3293
3367 if (unlikely(!debug_locks)) 3294 if (unlikely(!debug_locks))
3368 return 0; 3295 return 0;
@@ -3411,8 +3338,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
3411 3338
3412 class_idx = class - lock_classes + 1; 3339 class_idx = class - lock_classes + 1;
3413 3340
3414 /* TODO: nest_lock is not implemented for crosslock yet. */ 3341 if (depth) {
3415 if (depth && !cross_lock(lock)) {
3416 hlock = curr->held_locks + depth - 1; 3342 hlock = curr->held_locks + depth - 1;
3417 if (hlock->class_idx == class_idx && nest_lock) { 3343 if (hlock->class_idx == class_idx && nest_lock) {
3418 if (hlock->references) { 3344 if (hlock->references) {
@@ -3500,14 +3426,6 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
3500 if (!validate_chain(curr, lock, hlock, chain_head, chain_key)) 3426 if (!validate_chain(curr, lock, hlock, chain_head, chain_key))
3501 return 0; 3427 return 0;
3502 3428
3503 ret = lock_acquire_crosslock(hlock);
3504 /*
3505 * 2 means normal acquire operations are needed. Otherwise, it's
3506 * ok just to return with '0:fail, 1:success'.
3507 */
3508 if (ret != 2)
3509 return ret;
3510
3511 curr->curr_chain_key = chain_key; 3429 curr->curr_chain_key = chain_key;
3512 curr->lockdep_depth++; 3430 curr->lockdep_depth++;
3513 check_chain_key(curr); 3431 check_chain_key(curr);
@@ -3745,19 +3663,11 @@ __lock_release(struct lockdep_map *lock, int nested, unsigned long ip)
3745 struct task_struct *curr = current; 3663 struct task_struct *curr = current;
3746 struct held_lock *hlock; 3664 struct held_lock *hlock;
3747 unsigned int depth; 3665 unsigned int depth;
3748 int ret, i; 3666 int i;
3749 3667
3750 if (unlikely(!debug_locks)) 3668 if (unlikely(!debug_locks))
3751 return 0; 3669 return 0;
3752 3670
3753 ret = lock_release_crosslock(lock);
3754 /*
3755 * 2 means normal release operations are needed. Otherwise, it's
3756 * ok just to return with '0:fail, 1:success'.
3757 */
3758 if (ret != 2)
3759 return ret;
3760
3761 depth = curr->lockdep_depth; 3671 depth = curr->lockdep_depth;
3762 /* 3672 /*
3763 * So we're all set to release this lock.. wait what lock? We don't 3673 * So we're all set to release this lock.. wait what lock? We don't
@@ -4675,495 +4585,3 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
4675 dump_stack(); 4585 dump_stack();
4676} 4586}
4677EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious); 4587EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
4678
4679#ifdef CONFIG_LOCKDEP_CROSSRELEASE
4680
4681/*
4682 * Crossrelease works by recording a lock history for each thread and
4683 * connecting those historic locks that were taken after the
4684 * wait_for_completion() in the complete() context.
4685 *
4686 * Task-A Task-B
4687 *
4688 * mutex_lock(&A);
4689 * mutex_unlock(&A);
4690 *
4691 * wait_for_completion(&C);
4692 * lock_acquire_crosslock();
4693 * atomic_inc_return(&cross_gen_id);
4694 * |
4695 * | mutex_lock(&B);
4696 * | mutex_unlock(&B);
4697 * |
4698 * | complete(&C);
4699 * `-- lock_commit_crosslock();
4700 *
4701 * Which will then add a dependency between B and C.
4702 */
4703
4704#define xhlock(i) (current->xhlocks[(i) % MAX_XHLOCKS_NR])
4705
4706/*
4707 * Whenever a crosslock is held, cross_gen_id will be increased.
4708 */
4709static atomic_t cross_gen_id; /* Can be wrapped */
4710
4711/*
4712 * Make an entry of the ring buffer invalid.
4713 */
4714static inline void invalidate_xhlock(struct hist_lock *xhlock)
4715{
4716 /*
4717 * Normally, xhlock->hlock.instance must be !NULL.
4718 */
4719 xhlock->hlock.instance = NULL;
4720}
4721
4722/*
4723 * Lock history stacks; we have 2 nested lock history stacks:
4724 *
4725 * HARD(IRQ)
4726 * SOFT(IRQ)
4727 *
4728 * The thing is that once we complete a HARD/SOFT IRQ the future task locks
4729 * should not depend on any of the locks observed while running the IRQ. So
4730 * what we do is rewind the history buffer and erase all our knowledge of that
4731 * temporal event.
4732 */
4733
4734void crossrelease_hist_start(enum xhlock_context_t c)
4735{
4736 struct task_struct *cur = current;
4737
4738 if (!cur->xhlocks)
4739 return;
4740
4741 cur->xhlock_idx_hist[c] = cur->xhlock_idx;
4742 cur->hist_id_save[c] = cur->hist_id;
4743}
4744
4745void crossrelease_hist_end(enum xhlock_context_t c)
4746{
4747 struct task_struct *cur = current;
4748
4749 if (cur->xhlocks) {
4750 unsigned int idx = cur->xhlock_idx_hist[c];
4751 struct hist_lock *h = &xhlock(idx);
4752
4753 cur->xhlock_idx = idx;
4754
4755 /* Check if the ring was overwritten. */
4756 if (h->hist_id != cur->hist_id_save[c])
4757 invalidate_xhlock(h);
4758 }
4759}
4760
4761/*
4762 * lockdep_invariant_state() is used to annotate independence inside a task, to
4763 * make one task look like multiple independent 'tasks'.
4764 *
4765 * Take for instance workqueues; each work is independent of the last. The
4766 * completion of a future work does not depend on the completion of a past work
4767 * (in general). Therefore we must not carry that (lock) dependency across
4768 * works.
4769 *
4770 * This is true for many things; pretty much all kthreads fall into this
4771 * pattern, where they have an invariant state and future completions do not
4772 * depend on past completions. It's just that since they all have the 'same'
4773 * form -- the kthread does the same over and over -- it doesn't typically
4774 * matter.
4775 *
4776 * The same is true for system calls: once a system call is completed (we've
4777 * returned to userspace) the next system call does not depend on the lock
4778 * history of the previous system call.
4779 *
4780 * The key property for independence, this invariant state, is that it must be
4781 * a point where we hold no locks and have no history. Because if we were to
4782 * hold locks, the restore at _end() would not necessarily recover its history
4783 * entry. Similarly, independence by definition means it does not depend on
4784 * prior state.
4785 */
4786void lockdep_invariant_state(bool force)
4787{
4788 /*
4789 * We call this at an invariant point, no current state, no history.
4790 * Verify the former, enforce the latter.
4791 */
4792 WARN_ON_ONCE(!force && current->lockdep_depth);
4793 if (current->xhlocks)
4794 invalidate_xhlock(&xhlock(current->xhlock_idx));
4795}
4796
4797static int cross_lock(struct lockdep_map *lock)
4798{
4799 return lock ? lock->cross : 0;
4800}
4801
4802/*
4803 * This is needed to decide the relationship between wrapable variables.
4804 */
4805static inline int before(unsigned int a, unsigned int b)
4806{
4807 return (int)(a - b) < 0;
4808}
4809
4810static inline struct lock_class *xhlock_class(struct hist_lock *xhlock)
4811{
4812 return hlock_class(&xhlock->hlock);
4813}
4814
4815static inline struct lock_class *xlock_class(struct cross_lock *xlock)
4816{
4817 return hlock_class(&xlock->hlock);
4818}
4819
4820/*
4821 * Should we check a dependency with previous one?
4822 */
4823static inline int depend_before(struct held_lock *hlock)
4824{
4825 return hlock->read != 2 && hlock->check && !hlock->trylock;
4826}
4827
4828/*
4829 * Should we check a dependency with next one?
4830 */
4831static inline int depend_after(struct held_lock *hlock)
4832{
4833 return hlock->read != 2 && hlock->check;
4834}
4835
4836/*
4837 * Check if the xhlock is valid, which would be false if,
4838 *
4839 * 1. Has not been used after initialization yet.
4840 * 2. Got invalidated.
4841 *
4842 * Remember that hist_lock is implemented as a ring buffer.
4843 */
4844static inline int xhlock_valid(struct hist_lock *xhlock)
4845{
4846 /*
4847 * xhlock->hlock.instance must be !NULL.
4848 */
4849 return !!xhlock->hlock.instance;
4850}
4851
4852/*
4853 * Record a hist_lock entry.
4854 *
4855 * The only requirement is that irqs are disabled.
4856 */
4857static void add_xhlock(struct held_lock *hlock)
4858{
4859 unsigned int idx = ++current->xhlock_idx;
4860 struct hist_lock *xhlock = &xhlock(idx);
4861
4862#ifdef CONFIG_DEBUG_LOCKDEP
4863 /*
4864 * This can be done locklessly because it is all task-local
4865 * state; we must, however, ensure IRQs are disabled.
4866 */
4867 WARN_ON_ONCE(!irqs_disabled());
4868#endif
4869
4870 /* Initialize hist_lock's members */
4871 xhlock->hlock = *hlock;
4872 xhlock->hist_id = ++current->hist_id;
4873
4874 xhlock->trace.nr_entries = 0;
4875 xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
4876 xhlock->trace.entries = xhlock->trace_entries;
4877
4878 if (crossrelease_fullstack) {
4879 xhlock->trace.skip = 3;
4880 save_stack_trace(&xhlock->trace);
4881 } else {
4882 xhlock->trace.nr_entries = 1;
4883 xhlock->trace.entries[0] = hlock->acquire_ip;
4884 }
4885}
4886
4887static inline int same_context_xhlock(struct hist_lock *xhlock)
4888{
4889 return xhlock->hlock.irq_context == task_irq_context(current);
4890}
4891
4892/*
4893 * This should be lockless as far as possible because this would be
4894 * called very frequently.
4895 */
4896static void check_add_xhlock(struct held_lock *hlock)
4897{
4898 /*
4899 * Record a hist_lock only if acquisitions ahead
4900 * could depend on the held_lock. For example, if the held_lock
4901 * is a trylock, then acquisitions ahead never depend on it.
4902 * In that case, we don't need to record it. Just return.
4903 */
4904 if (!current->xhlocks || !depend_before(hlock))
4905 return;
4906
4907 add_xhlock(hlock);
4908}
4909
4910/*
4911 * For crosslock.
4912 */
4913static int add_xlock(struct held_lock *hlock)
4914{
4915 struct cross_lock *xlock;
4916 unsigned int gen_id;
4917
4918 if (!graph_lock())
4919 return 0;
4920
4921 xlock = &((struct lockdep_map_cross *)hlock->instance)->xlock;
4922
4923 /*
4924 * When acquisitions for a crosslock are overlapped, we use
4925 * nr_acquire to perform commit for them, based on cross_gen_id
4926 * of the first acquisition, which allows additional
4927 * dependencies to be added.
4928 *
4929 * Moreover, when no acquisition of a crosslock is in progress,
4930 * we should not perform commit because the lock might not exist
4931 * any more, which might cause incorrect memory access. So we
4932 * have to track the number of acquisitions of a crosslock.
4933 *
4934 * depend_after() is necessary to initialize only the first
4935 * valid xlock so that the xlock can be used on its commit.
4936 */
4937 if (xlock->nr_acquire++ && depend_after(&xlock->hlock))
4938 goto unlock;
4939
4940 gen_id = (unsigned int)atomic_inc_return(&cross_gen_id);
4941 xlock->hlock = *hlock;
4942 xlock->hlock.gen_id = gen_id;
4943unlock:
4944 graph_unlock();
4945 return 1;
4946}
4947
4948/*
4949 * Called for both normal and crosslock acquires. Normal locks will be
4950 * pushed on the hist_lock queue. Cross locks will record state and
4951 * stop regular lock_acquire() to avoid being placed on the held_lock
4952 * stack.
4953 *
4954 * Return: 0 - failure;
4955 * 1 - crosslock, done;
4956 * 2 - normal lock, continue to held_lock[] ops.
4957 */
4958static int lock_acquire_crosslock(struct held_lock *hlock)
4959{
4960 /*
4961 * CONTEXT 1 CONTEXT 2
4962 * --------- ---------
4963 * lock A (cross)
4964 * X = atomic_inc_return(&cross_gen_id)
4965 * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4966 * Y = atomic_read_acquire(&cross_gen_id)
4967 * lock B
4968 *
4969 * atomic_read_acquire() is for ordering between A and B,
4970 * IOW, A happens before B, when CONTEXT 2 sees Y >= X.
4971 *
4972 * Pairs with atomic_inc_return() in add_xlock().
4973 */
4974 hlock->gen_id = (unsigned int)atomic_read_acquire(&cross_gen_id);
4975
4976 if (cross_lock(hlock->instance))
4977 return add_xlock(hlock);
4978
4979 check_add_xhlock(hlock);
4980 return 2;
4981}
4982
4983static int copy_trace(struct stack_trace *trace)
4984{
4985 unsigned long *buf = stack_trace + nr_stack_trace_entries;
4986 unsigned int max_nr = MAX_STACK_TRACE_ENTRIES - nr_stack_trace_entries;
4987 unsigned int nr = min(max_nr, trace->nr_entries);
4988
4989 trace->nr_entries = nr;
4990 memcpy(buf, trace->entries, nr * sizeof(trace->entries[0]));
4991 trace->entries = buf;
4992 nr_stack_trace_entries += nr;
4993
4994 if (nr_stack_trace_entries >= MAX_STACK_TRACE_ENTRIES-1) {
4995 if (!debug_locks_off_graph_unlock())
4996 return 0;
4997
4998 print_lockdep_off("BUG: MAX_STACK_TRACE_ENTRIES too low!");
4999 dump_stack();
5000
5001 return 0;
5002 }
5003
5004 return 1;
5005}
5006
5007static int commit_xhlock(struct cross_lock *xlock, struct hist_lock *xhlock)
5008{
5009 unsigned int xid, pid;
5010 u64 chain_key;
5011
5012 xid = xlock_class(xlock) - lock_classes;
5013 chain_key = iterate_chain_key((u64)0, xid);
5014 pid = xhlock_class(xhlock) - lock_classes;
5015 chain_key = iterate_chain_key(chain_key, pid);
5016
5017 if (lookup_chain_cache(chain_key))
5018 return 1;
5019
5020 if (!add_chain_cache_classes(xid, pid, xhlock->hlock.irq_context,
5021 chain_key))
5022 return 0;
5023
5024 if (!check_prev_add(current, &xlock->hlock, &xhlock->hlock, 1,
5025 &xhlock->trace, copy_trace))
5026 return 0;
5027
5028 return 1;
5029}
5030
5031static void commit_xhlocks(struct cross_lock *xlock)
5032{
5033 unsigned int cur = current->xhlock_idx;
5034 unsigned int prev_hist_id = xhlock(cur).hist_id;
5035 unsigned int i;
5036
5037 if (!graph_lock())
5038 return;
5039
5040 if (xlock->nr_acquire) {
5041 for (i = 0; i < MAX_XHLOCKS_NR; i++) {
5042 struct hist_lock *xhlock = &xhlock(cur - i);
5043
5044 if (!xhlock_valid(xhlock))
5045 break;
5046
5047 if (before(xhlock->hlock.gen_id, xlock->hlock.gen_id))
5048 break;
5049
5050 if (!same_context_xhlock(xhlock))
5051 break;
5052
5053 /*
5054 * Filter out the cases where the ring buffer was
5055 * overwritten and the current entry has a bigger
5056 * hist_id than the previous one, which is impossible
5057 * otherwise:
5058 */
5059 if (unlikely(before(prev_hist_id, xhlock->hist_id)))
5060 break;
5061
5062 prev_hist_id = xhlock->hist_id;
5063
5064 /*
5065 * commit_xhlock() returns 0 with graph_lock already
5066 * released on failure.
5067 */
5068 if (!commit_xhlock(xlock, xhlock))
5069 return;
5070 }
5071 }
5072
5073 graph_unlock();
5074}
5075
5076void lock_commit_crosslock(struct lockdep_map *lock)
5077{
5078 struct cross_lock *xlock;
5079 unsigned long flags;
5080
5081 if (unlikely(!debug_locks || current->lockdep_recursion))
5082 return;
5083
5084 if (!current->xhlocks)
5085 return;
5086
5087 /*
5088 * Commit hist_locks against the cross_lock only if the
5089 * cross_lock could depend on acquisitions made after it.
5090 *
5091 * For example, if the cross_lock does not have the 'check' flag
5092 * then we don't need to check dependencies and commit for that.
5093 * Just skip it. In that case, of course, the cross_lock does
5094 * not depend on acquisitions ahead, either.
5095 *
5096 * WARNING: Don't perform this check in add_xlock() in advance. When an
5097 * acquisition context is different from the commit context,
5098 * an invalid (skipped) cross_lock might be accessed.
5099 */
5100 if (!depend_after(&((struct lockdep_map_cross *)lock)->xlock.hlock))
5101 return;
5102
5103 raw_local_irq_save(flags);
5104 check_flags(flags);
5105 current->lockdep_recursion = 1;
5106 xlock = &((struct lockdep_map_cross *)lock)->xlock;
5107 commit_xhlocks(xlock);
5108 current->lockdep_recursion = 0;
5109 raw_local_irq_restore(flags);
5110}
5111EXPORT_SYMBOL_GPL(lock_commit_crosslock);
5112
5113/*
5114 * Return: 0 - failure;
5115 * 1 - crosslock, done;
5116 * 2 - normal lock, continue to held_lock[] ops.
5117 */
5118static int lock_release_crosslock(struct lockdep_map *lock)
5119{
5120 if (cross_lock(lock)) {
5121 if (!graph_lock())
5122 return 0;
5123 ((struct lockdep_map_cross *)lock)->xlock.nr_acquire--;
5124 graph_unlock();
5125 return 1;
5126 }
5127 return 2;
5128}
5129
5130static void cross_init(struct lockdep_map *lock, int cross)
5131{
5132 if (cross)
5133 ((struct lockdep_map_cross *)lock)->xlock.nr_acquire = 0;
5134
5135 lock->cross = cross;
5136
5137 /*
5138 * Crossrelease assumes that the ring buffer size of xhlocks
5139 * is a power of 2. So enforce that at build time.
5140 */
5141 BUILD_BUG_ON(MAX_XHLOCKS_NR & (MAX_XHLOCKS_NR - 1));
5142}
5143
5144void lockdep_init_task(struct task_struct *task)
5145{
5146 int i;
5147
5148 task->xhlock_idx = UINT_MAX;
5149 task->hist_id = 0;
5150
5151 for (i = 0; i < XHLOCK_CTX_NR; i++) {
5152 task->xhlock_idx_hist[i] = UINT_MAX;
5153 task->hist_id_save[i] = 0;
5154 }
5155
5156 task->xhlocks = kzalloc(sizeof(struct hist_lock) * MAX_XHLOCKS_NR,
5157 GFP_KERNEL);
5158}
5159
5160void lockdep_free_task(struct task_struct *task)
5161{
5162 if (task->xhlocks) {
5163 void *tmp = task->xhlocks;
5164 /* Disable crossrelease for current */
5165 task->xhlocks = NULL;
5166 kfree(tmp);
5167 }
5168}
5169#endif
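
The block removed above is the machinery that let lockdep connect completions to ordinary locks. As a minimal, illustrative sketch of the deadlock pattern it targeted (the function and variable names below are made up and do not appear in the patch):

	#include <linux/mutex.h>
	#include <linux/completion.h>

	static DEFINE_MUTEX(m);
	static DECLARE_COMPLETION(done);

	/* Context 1: waits for the completion while holding 'm'. */
	static void waiter(void)
	{
		mutex_lock(&m);
		wait_for_completion(&done);	/* never satisfied */
		mutex_unlock(&m);
	}

	/* Context 2: needs 'm' before it can signal 'done'. */
	static void completer(void)
	{
		mutex_lock(&m);			/* blocks behind context 1 */
		complete(&done);
		mutex_unlock(&m);
	}

With the LOCKDEP_COMPLETIONS support removed further below, lockdep could treat 'done' as a lock-like object and report the circular 'm' -> 'done' -> 'm' dependency even when the two contexts never actually deadlocked in the observed run.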
diff --git a/kernel/locking/spinlock.c b/kernel/locking/spinlock.c
index 1fd1a7543cdd..936f3d14dd6b 100644
--- a/kernel/locking/spinlock.c
+++ b/kernel/locking/spinlock.c
@@ -66,12 +66,8 @@ void __lockfunc __raw_##op##_lock(locktype##_t *lock) \
66 break; \ 66 break; \
67 preempt_enable(); \ 67 preempt_enable(); \
68 \ 68 \
69 if (!(lock)->break_lock) \ 69 arch_##op##_relax(&lock->raw_lock); \
70 (lock)->break_lock = 1; \
71 while ((lock)->break_lock) \
72 arch_##op##_relax(&lock->raw_lock); \
73 } \ 70 } \
74 (lock)->break_lock = 0; \
75} \ 71} \
76 \ 72 \
77unsigned long __lockfunc __raw_##op##_lock_irqsave(locktype##_t *lock) \ 73unsigned long __lockfunc __raw_##op##_lock_irqsave(locktype##_t *lock) \
@@ -86,12 +82,9 @@ unsigned long __lockfunc __raw_##op##_lock_irqsave(locktype##_t *lock) \
86 local_irq_restore(flags); \ 82 local_irq_restore(flags); \
87 preempt_enable(); \ 83 preempt_enable(); \
88 \ 84 \
89 if (!(lock)->break_lock) \ 85 arch_##op##_relax(&lock->raw_lock); \
90 (lock)->break_lock = 1; \
91 while ((lock)->break_lock) \
92 arch_##op##_relax(&lock->raw_lock); \
93 } \ 86 } \
94 (lock)->break_lock = 0; \ 87 \
95 return flags; \ 88 return flags; \
96} \ 89} \
97 \ 90 \
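
For reference, an approximate expansion of the simplified loop above for one generated variant (op=read, locktype=rwlock); treat it as a sketch, since the real function is emitted by the BUILD_LOCK_OPS() macro in this file:

	void __lockfunc __raw_read_lock(rwlock_t *lock)
	{
		for (;;) {
			preempt_disable();
			if (likely(do_raw_read_trylock(lock)))
				break;
			preempt_enable();

			/* With break_lock gone, simply relax once per
			 * failed attempt and retry. */
			arch_read_relax(&lock->raw_lock);
		}
	}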
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 947d3e2ed5c2..9d5b78aad4c5 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1099,8 +1099,6 @@ config PROVE_LOCKING
1099 select DEBUG_MUTEXES 1099 select DEBUG_MUTEXES
1100 select DEBUG_RT_MUTEXES if RT_MUTEXES 1100 select DEBUG_RT_MUTEXES if RT_MUTEXES
1101 select DEBUG_LOCK_ALLOC 1101 select DEBUG_LOCK_ALLOC
1102 select LOCKDEP_CROSSRELEASE
1103 select LOCKDEP_COMPLETIONS
1104 select TRACE_IRQFLAGS 1102 select TRACE_IRQFLAGS
1105 default n 1103 default n
1106 help 1104 help
@@ -1170,37 +1168,6 @@ config LOCK_STAT
1170 CONFIG_LOCK_STAT defines "contended" and "acquired" lock events. 1168 CONFIG_LOCK_STAT defines "contended" and "acquired" lock events.
1171 (CONFIG_LOCKDEP defines "acquire" and "release" events.) 1169 (CONFIG_LOCKDEP defines "acquire" and "release" events.)
1172 1170
1173config LOCKDEP_CROSSRELEASE
1174 bool
1175 help
1176 This makes lockdep work for crosslocks, which are locks allowed to
1177 be released in a different context from the acquisition context.
1178 Normally a lock must be released in the context that acquired it.
1179 However, relaxing this constraint lets synchronization primitives
1180 such as page locks or completions use the lock correctness
1181 detector, lockdep.
1182
1183config LOCKDEP_COMPLETIONS
1184 bool
1185 help
1186 A deadlock caused by wait_for_completion() and complete() can be
1187 detected by lockdep using crossrelease feature.
1188
1189config BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK
1190 bool "Enable the boot parameter, crossrelease_fullstack"
1191 depends on LOCKDEP_CROSSRELEASE
1192 default n
1193 help
1194 The lockdep "cross-release" feature needs to record stack traces
1195 (of calling functions) for all acquisitions, for eventual later
1196 use during analysis. By default only a single caller is recorded,
1197 because the unwind operation can be very expensive with deeper
1198 stack chains.
1199
1200 However, a boot parameter, crossrelease_fullstack, was
1201 introduced since sometimes deeper traces are required for full
1202 analysis. This option turns on the boot parameter.
1203
1204config DEBUG_LOCKDEP 1171config DEBUG_LOCKDEP
1205 bool "Lock dependency engine debugging" 1172 bool "Lock dependency engine debugging"
1206 depends on DEBUG_KERNEL && LOCKDEP 1173 depends on DEBUG_KERNEL && LOCKDEP
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 040aa79e1d9d..31031f10fe56 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -6233,28 +6233,6 @@ sub process {
6233 } 6233 }
6234 } 6234 }
6235 6235
6236# whine about ACCESS_ONCE
6237 if ($^V && $^V ge 5.10.0 &&
6238 $line =~ /\bACCESS_ONCE\s*$balanced_parens\s*(=(?!=))?\s*($FuncArg)?/) {
6239 my $par = $1;
6240 my $eq = $2;
6241 my $fun = $3;
6242 $par =~ s/^\(\s*(.*)\s*\)$/$1/;
6243 if (defined($eq)) {
6244 if (WARN("PREFER_WRITE_ONCE",
6245 "Prefer WRITE_ONCE(<FOO>, <BAR>) over ACCESS_ONCE(<FOO>) = <BAR>\n" . $herecurr) &&
6246 $fix) {
6247 $fixed[$fixlinenr] =~ s/\bACCESS_ONCE\s*\(\s*\Q$par\E\s*\)\s*$eq\s*\Q$fun\E/WRITE_ONCE($par, $fun)/;
6248 }
6249 } else {
6250 if (WARN("PREFER_READ_ONCE",
6251 "Prefer READ_ONCE(<FOO>) over ACCESS_ONCE(<FOO>)\n" . $herecurr) &&
6252 $fix) {
6253 $fixed[$fixlinenr] =~ s/\bACCESS_ONCE\s*\(\s*\Q$par\E\s*\)/READ_ONCE($par)/;
6254 }
6255 }
6256 }
6257
6258# check for mutex_trylock_recursive usage 6236# check for mutex_trylock_recursive usage
6259 if ($line =~ /mutex_trylock_recursive/) { 6237 if ($line =~ /mutex_trylock_recursive/) {
6260 ERROR("LOCKING", 6238 ERROR("LOCKING",
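
The deleted checkpatch rule covered the mechanical part of this conversion; the mmap.h hunk at the end of this diff is one such in-tree conversion. A minimal before/after sketch, using a hypothetical 'shared' variable and helpers:

	#include <linux/compiler.h>

	static int shared;

	static void publish(int val)
	{
		/* old style:  ACCESS_ONCE(shared) = val;   */
		WRITE_ONCE(shared, val);
	}

	static int snapshot(void)
	{
		/* old style:  return ACCESS_ONCE(shared);  */
		return READ_ONCE(shared);
	}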
diff --git a/tools/include/linux/compiler.h b/tools/include/linux/compiler.h
index 07fd03c74a77..04e32f965ad7 100644
--- a/tools/include/linux/compiler.h
+++ b/tools/include/linux/compiler.h
@@ -84,8 +84,6 @@
84 84
85#define uninitialized_var(x) x = *(&(x)) 85#define uninitialized_var(x) x = *(&(x))
86 86
87#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
88
89#include <linux/types.h> 87#include <linux/types.h>
90 88
91/* 89/*
@@ -135,20 +133,19 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
135/* 133/*
136 * Prevent the compiler from merging or refetching reads or writes. The 134 * Prevent the compiler from merging or refetching reads or writes. The
137 * compiler is also forbidden from reordering successive instances of 135 * compiler is also forbidden from reordering successive instances of
138 * READ_ONCE, WRITE_ONCE and ACCESS_ONCE (see below), but only when the 136 * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
139 * compiler is aware of some particular ordering. One way to make the 137 * particular ordering. One way to make the compiler aware of ordering is to
140 * compiler aware of ordering is to put the two invocations of READ_ONCE, 138 * put the two invocations of READ_ONCE or WRITE_ONCE in different C
141 * WRITE_ONCE or ACCESS_ONCE() in different C statements. 139 * statements.
142 * 140 *
143 * In contrast to ACCESS_ONCE these two macros will also work on aggregate 141 * These two macros will also work on aggregate data types like structs or
144 * data types like structs or unions. If the size of the accessed data 142 * unions. If the size of the accessed data type exceeds the word size of
145 * type exceeds the word size of the machine (e.g., 32 bits or 64 bits) 143 * the machine (e.g., 32 bits or 64 bits) READ_ONCE() and WRITE_ONCE() will
146 * READ_ONCE() and WRITE_ONCE() will fall back to memcpy and print a 144 * fall back to memcpy and print a compile-time warning.
147 * compile-time warning.
148 * 145 *
149 * Their two major use cases are: (1) Mediating communication between 146 * Their two major use cases are: (1) Mediating communication between
150 * process-level code and irq/NMI handlers, all running on the same CPU, 147 * process-level code and irq/NMI handlers, all running on the same CPU,
151 * and (2) Ensuring that the compiler does not fold, spindle, or otherwise 148 * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
152 * mutilate accesses that either do not require ordering or that interact 149 * mutilate accesses that either do not require ordering or that interact
153 * with an explicit memory barrier or atomic instruction that provides the 150 * with an explicit memory barrier or atomic instruction that provides the
154 * required ordering. 151 * required ordering.
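
One detail from the comment above worth a sketch: READ_ONCE() and WRITE_ONCE() also work on small aggregates. The struct below is made up for illustration:

	#include <linux/compiler.h>

	struct pair {
		short a, b;
	};

	static struct pair shared_pair;

	/* The 4-byte struct fits in a machine word, so this is a single
	 * marked load; a struct wider than the word size would fall back to
	 * memcpy and print the compile-time warning mentioned above. */
	static struct pair sample_pair(void)
	{
		return READ_ONCE(shared_pair);
	}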
diff --git a/tools/include/linux/lockdep.h b/tools/include/linux/lockdep.h
index 940c1b075659..6b0c36a58fcb 100644
--- a/tools/include/linux/lockdep.h
+++ b/tools/include/linux/lockdep.h
@@ -48,6 +48,7 @@ static inline int debug_locks_off(void)
48#define printk(...) dprintf(STDOUT_FILENO, __VA_ARGS__) 48#define printk(...) dprintf(STDOUT_FILENO, __VA_ARGS__)
49#define pr_err(format, ...) fprintf (stderr, format, ## __VA_ARGS__) 49#define pr_err(format, ...) fprintf (stderr, format, ## __VA_ARGS__)
50#define pr_warn pr_err 50#define pr_warn pr_err
51#define pr_cont pr_err
51 52
52#define list_del_rcu list_del 53#define list_del_rcu list_del
53 54
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index efd78b827b05..3a5cb5a6e94a 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -70,7 +70,7 @@ void perf_mmap__read_catchup(struct perf_mmap *md);
70static inline u64 perf_mmap__read_head(struct perf_mmap *mm) 70static inline u64 perf_mmap__read_head(struct perf_mmap *mm)
71{ 71{
72 struct perf_event_mmap_page *pc = mm->base; 72 struct perf_event_mmap_page *pc = mm->base;
73 u64 head = ACCESS_ONCE(pc->data_head); 73 u64 head = READ_ONCE(pc->data_head);
74 rmb(); 74 rmb();
75 return head; 75 return head;
76} 76}