author	Peter Zijlstra <peterz@infradead.org>	2016-06-08 04:19:51 -0400
committer	Ingo Molnar <mingo@kernel.org>	2016-06-08 08:29:08 -0400
commit	2c610022711675ee908b903d242f0b90e1db661f (patch)
tree	3fb20eee63da3b9a72f37abce6b14a7396ef9123
parent	0422e83d84ae24b933e4b0d4c1e0f0b4ae8a0a3b (diff)
locking/qspinlock: Fix spin_unlock_wait() some more
While this prior commit:

  54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")

... fixes spin_is_locked() and spin_unlock_wait() for the usage in
ipc/sem and netfilter, it does not in fact work right for the usage in
task_work and futex.

So while the 2 locks crossed problem:

	spin_lock(A)			spin_lock(B)
	if (!spin_is_locked(B))		spin_unlock_wait(A)
	  foo()				foo();

... works with the smp_mb() injected by both spin_is_locked() and
spin_unlock_wait(), this is not sufficient for:

	flag = 1;
	smp_mb();			spin_lock()
	spin_unlock_wait()		if (!flag)
					  // add to lockless list
	// iterate lockless list

... because in this scenario, the store from spin_lock() can be delayed
past the load of flag, uncrossing the variables and losing the
guarantee.

This patch reworks spin_is_locked() and spin_unlock_wait() to work in
both cases by exploiting the observation that while the lock byte store
can be delayed, the contender must have registered itself visibly in
other state contained in the word.

It also allows for architectures to override both functions, as PPC and
ARM64 have an additional issue for which we currently have no generic
solution.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Giovanni Gherdovich <ggherdovich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <waiman.long@hpe.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org # v4.2 and later
Fixes: 54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
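As a concrete illustration of the second pattern, a minimal C sketch modelled loosely on the task_work/futex style usage the patch refers to; the names here (work_lock, work_cancelled, cancel_side(), queue_side()) are hypothetical and not taken from those call sites:

/*
 * Illustrative sketch only: the names are made up, the pattern follows
 * the second litmus test in the commit message above.
 */
#include <linux/spinlock.h>
#include <linux/compiler.h>

static DEFINE_SPINLOCK(work_lock);
static int work_cancelled;

static void cancel_side(void)		/* plays the "flag = 1" CPU */
{
	WRITE_ONCE(work_cancelled, 1);
	smp_mb();			/* order flag store before lock-state load */
	spin_unlock_wait(&work_lock);	/* wait out any concurrent lock holder */
	/* iterate the lockless list; any later holder must see work_cancelled */
}

static void queue_side(void)		/* plays the "spin_lock()" CPU */
{
	spin_lock(&work_lock);
	if (!READ_ONCE(work_cancelled)) {
		/* add to the lockless list under the lock */
	}
	spin_unlock(&work_lock);
}

The bug being fixed is that spin_lock()'s locked-byte store could be delayed past its load of work_cancelled, so queue_side() misses the flag while cancel_side()'s spin_unlock_wait() simultaneously misses the lock, and both proceed.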
-rw-r--r--	include/asm-generic/qspinlock.h	53
-rw-r--r--	kernel/locking/qspinlock.c	60
2 files changed, 77 insertions, 36 deletions
diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index 6bd05700d8c9..05f05f17a7c2 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -22,37 +22,33 @@
 #include <asm-generic/qspinlock_types.h>
 
 /**
+ * queued_spin_unlock_wait - wait until the _current_ lock holder releases the lock
+ * @lock : Pointer to queued spinlock structure
+ *
+ * There is a very slight possibility of live-lock if the lockers keep coming
+ * and the waiter is just unfortunate enough to not see any unlock state.
+ */
+#ifndef queued_spin_unlock_wait
+extern void queued_spin_unlock_wait(struct qspinlock *lock);
+#endif
+
+/**
  * queued_spin_is_locked - is the spinlock locked?
  * @lock: Pointer to queued spinlock structure
  * Return: 1 if it is locked, 0 otherwise
  */
+#ifndef queued_spin_is_locked
 static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
 {
 	/*
-	 * queued_spin_lock_slowpath() can ACQUIRE the lock before
-	 * issuing the unordered store that sets _Q_LOCKED_VAL.
-	 *
-	 * See both smp_cond_acquire() sites for more detail.
-	 *
-	 * This however means that in code like:
-	 *
-	 *   spin_lock(A)		spin_lock(B)
-	 *   spin_unlock_wait(B)	spin_is_locked(A)
-	 *   do_something()		do_something()
-	 *
-	 * Both CPUs can end up running do_something() because the store
-	 * setting _Q_LOCKED_VAL will pass through the loads in
-	 * spin_unlock_wait() and/or spin_is_locked().
+	 * See queued_spin_unlock_wait().
 	 *
-	 * Avoid this by issuing a full memory barrier between the spin_lock()
-	 * and the loads in spin_unlock_wait() and spin_is_locked().
-	 *
-	 * Note that regular mutual exclusion doesn't care about this
-	 * delayed store.
+	 * Any !0 state indicates it is locked, even if _Q_LOCKED_VAL
+	 * isn't immediately observable.
 	 */
-	smp_mb();
-	return atomic_read(&lock->val) & _Q_LOCKED_MASK;
+	return atomic_read(&lock->val);
 }
+#endif
 
 /**
  * queued_spin_value_unlocked - is the spinlock structure unlocked?
@@ -122,21 +118,6 @@ static __always_inline void queued_spin_unlock(struct qspinlock *lock)
 }
 #endif
 
-/**
- * queued_spin_unlock_wait - wait until current lock holder releases the lock
- * @lock : Pointer to queued spinlock structure
- *
- * There is a very slight possibility of live-lock if the lockers keep coming
- * and the waiter is just unfortunate enough to not see any unlock state.
- */
-static inline void queued_spin_unlock_wait(struct qspinlock *lock)
-{
-	/* See queued_spin_is_locked() */
-	smp_mb();
-	while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
-		cpu_relax();
-}
-
 #ifndef virt_spin_lock
 static __always_inline bool virt_spin_lock(struct qspinlock *lock)
 {
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index ce2f75e32ae1..5fc8c311b8fe 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -267,6 +267,66 @@ static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
 #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
 #endif
 
+/*
+ * queued_spin_lock_slowpath() can (load-)ACQUIRE the lock before
+ * issuing an _unordered_ store to set _Q_LOCKED_VAL.
+ *
+ * This means that the store can be delayed, but no later than the
+ * store-release from the unlock. This means that simply observing
+ * _Q_LOCKED_VAL is not sufficient to determine if the lock is acquired.
+ *
+ * There are two paths that can issue the unordered store:
+ *
+ *  (1) clear_pending_set_locked():	*,1,0 -> *,0,1
+ *
+ *  (2) set_locked():			t,0,0 -> t,0,1 ; t != 0
+ *      atomic_cmpxchg_relaxed():	t,0,0 -> 0,0,1
+ *
+ * However, in both cases we have other !0 state we've set before to queue
+ * ourselves:
+ *
+ * For (1) we have the atomic_cmpxchg_acquire() that set _Q_PENDING_VAL, our
+ * load is constrained by that ACQUIRE to not pass before that, and thus must
+ * observe the store.
+ *
+ * For (2) we have a more interesting scenario. We enqueue ourselves using
+ * xchg_tail(), which ends up being a RELEASE. This in itself is not
+ * sufficient, however that is followed by an smp_cond_acquire() on the same
+ * word, giving a RELEASE->ACQUIRE ordering. This again constrains our load and
+ * guarantees we must observe that store.
+ *
+ * Therefore both cases have other !0 state that is observable before the
+ * unordered locked byte store comes through. This means we can use that to
+ * wait for the lock store, and then wait for an unlock.
+ */
+#ifndef queued_spin_unlock_wait
+void queued_spin_unlock_wait(struct qspinlock *lock)
+{
+	u32 val;
+
+	for (;;) {
+		val = atomic_read(&lock->val);
+
+		if (!val) /* not locked, we're done */
+			goto done;
+
+		if (val & _Q_LOCKED_MASK) /* locked, go wait for unlock */
+			break;
+
+		/* not locked, but pending, wait until we observe the lock */
+		cpu_relax();
+	}
+
+	/* any unlock is good */
+	while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
+		cpu_relax();
+
+done:
+	smp_rmb(); /* CTRL + RMB -> ACQUIRE */
+}
+EXPORT_SYMBOL(queued_spin_unlock_wait);
+#endif
+
 #endif /* _GEN_PV_LOCK_SLOWPATH */
 
 /**
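The #ifndef queued_spin_unlock_wait / #ifndef queued_spin_is_locked guards added above are what the commit message means by letting architectures override both functions. A hedged sketch of how an architecture header could hook them, for a hypothetical arch; the actual PPC/ARM64 code is not part of this patch and will differ:

/* Hypothetical arch/<arch>/include/asm/qspinlock.h fragment, illustration only. */
#include <linux/compiler.h>
#include <asm-generic/qspinlock_types.h>

/* Defining the macro name makes the generic #ifndef blocks drop out. */
#define queued_spin_unlock_wait queued_spin_unlock_wait
extern void queued_spin_unlock_wait(struct qspinlock *lock);

#define queued_spin_is_locked queued_spin_is_locked
static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
{
	smp_mb();	/* arch-specific extra ordering would go here */
	return atomic_read(&lock->val);
}

#include <asm-generic/qspinlock.h>

The architecture would then supply its own queued_spin_unlock_wait() in an arch source file with whatever additional ordering it needs; the generic definition and its EXPORT_SYMBOL above are only compiled when the macro is left undefined.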