author		Waiman Long <longman@redhat.com>	2019-01-29 16:53:45 -0500
committer	Ingo Molnar <mingo@kernel.org>	2019-02-04 03:03:29 -0500
commit		d682b596d99345ef0000e7017db714ba7f29e017 (patch)
tree		88a0781605660a26b539452219f37df8786e7577
parent		07879c6a3740fbbf3c8891a0ab484c20a12794d8 (diff)
locking/qspinlock: Handle > 4 slowpath nesting levels
Four queue nodes per CPU are allocated to enable up to 4 nesting levels using the per-CPU nodes. Nested NMIs are possible on some architectures, but it is still very unlikely that we will ever hit more than 4 nested levels with contention in the slowpath.

When that rare condition does happen, however, the system is likely to hang or crash shortly afterwards, so this exception case needs to be handled. This is done by spinning directly on the lock using repeated trylock. This alternative code path should only be used when there are nested NMIs. Assuming that the locks used by those NMI handlers will not be heavily contended, a simple TAS locking should work out.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: James Morse <james.morse@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: SRINIVAS <srinivas.eeda@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Link: https://lkml.kernel.org/r/1548798828-16156-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 kernel/locking/qspinlock.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+), 0 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 8a8c3c208c5e..0875053c4050 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -412,6 +412,21 @@ pv_queue:
 	idx = node->count++;
 	tail = encode_tail(smp_processor_id(), idx);
 
+	/*
+	 * 4 nodes are allocated based on the assumption that there will
+	 * not be nested NMIs taking spinlocks. That may not be true in
+	 * some architectures even though the chance of needing more than
+	 * 4 nodes will still be extremely unlikely. When that happens,
+	 * we fall back to spinning on the lock directly without using
+	 * any MCS node. This is not the most elegant solution, but is
+	 * simple enough.
+	 */
+	if (unlikely(idx >= MAX_NODES)) {
+		while (!queued_spin_trylock(lock))
+			cpu_relax();
+		goto release;
+	}
+
 	node = grab_mcs_node(node, idx);
 
 	/*
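
To make the fallback concrete outside of kernel context, here is a minimal user-space sketch of the same idea, built on C11 atomics. The names used here (tas_lock, tas_trylock, lock_slowpath, and the local MAX_NODES) are illustrative stand-ins, not the kernel's qspinlock API; the sketch only shows how exhausting a small per-CPU node pool can degrade to spinning on a plain test-and-set trylock.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_NODES 4	/* mirrors the four per-CPU MCS nodes */

struct tas_lock {
	atomic_flag locked;
};

static bool tas_trylock(struct tas_lock *l)
{
	/* A plain test-and-set: succeeds only if the flag was clear. */
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

static void tas_unlock(struct tas_lock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

static void cpu_relax(void)
{
	/* Stand-in for an architecture pause hint (e.g. the x86 PAUSE insn). */
}

/*
 * Acquire-path sketch: if the nesting index fits in the node pool we
 * would normally queue on an MCS node; if the pool is exhausted we
 * simply spin on repeated trylock, mirroring the unlikely() branch in
 * the patch above.
 */
static void lock_slowpath(struct tas_lock *l, int idx)
{
	if (idx >= MAX_NODES) {
		while (!tas_trylock(l))
			cpu_relax();
		return;			/* acquired without any MCS node */
	}

	/* ...MCS queueing elided; fall back to the same spin for brevity... */
	while (!tas_trylock(l))
		cpu_relax();
}

int main(void)
{
	struct tas_lock l = { ATOMIC_FLAG_INIT };

	lock_slowpath(&l, MAX_NODES + 1);	/* forces the trylock fallback */
	puts("lock acquired via trylock fallback");
	tas_unlock(&l);
	return 0;
}

As in the patch, correctness is preserved by the unqueued TAS-style spin at the cost of queueing fairness, on the assumption stated in the changelog that locks taken from nested NMI handlers will not be heavily contended.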