author		Linus Torvalds <torvalds@linux-foundation.org>	2016-01-11 17:18:38 -0500
committer	Linus Torvalds <torvalds@linux-foundation.org>	2016-01-11 17:18:38 -0500
commit		24af98c4cf5f5e69266e270c7f3fb34b82ff6656 (patch)
tree		70d71381c841c92b2d28397bf0c5d6a7d9bbbaac /kernel/sched/core.c
parent		9061cbe62adeccf8c986883bcd40f4aeee59ea75 (diff)
parent		337f13046ff03717a9e99675284a817527440a49 (diff)
Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"So we have a laundry list of locking subsystem changes:
- continuing barrier API and code improvements
- futex enhancements
- atomics API improvements
- pvqspinlock enhancements: in particular lock stealing and adaptive
spinning
- qspinlock micro-enhancements"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op
futex: Cleanup the goto confusion in requeue_pi()
futex: Remove pointless put_pi_state calls in requeue()
futex: Document pi_state refcounting in requeue code
futex: Rename free_pi_state() to put_pi_state()
futex: Drop refcount if requeue_pi() acquired the rtmutex
locking/barriers, arch: Remove ambiguous statement in the smp_store_mb() documentation
lcoking/barriers, arch: Use smp barriers in smp_store_release()
locking/cmpxchg, arch: Remove tas() definitions
locking/pvqspinlock: Queue node adaptive spinning
locking/pvqspinlock: Allow limited lock stealing
locking/pvqspinlock: Collect slowpath lock statistics
sched/core, locking: Document Program-Order guarantees
locking, sched: Introduce smp_cond_acquire() and use it
locking/pvqspinlock, x86: Optimize the PV unlock code path
locking/qspinlock: Avoid redundant read of next pointer
locking/qspinlock: Prefetch the next node cacheline
locking/qspinlock: Use _acquire/_release() versions of cmpxchg() & xchg()
atomics: Add test for atomic operations with _relaxed variants
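
The smp_cond_acquire() primitive referenced in the shortlog above ("locking, sched: Introduce smp_cond_acquire() and use it") and used by the scheduler hunk below spins on a condition and then upgrades the resulting control dependency to ACQUIRE ordering with a read barrier. A minimal sketch of such a helper, assumed to match the form that commit adds to the generic headers:

	#define smp_cond_acquire(cond)	do {				\
		while (!(cond))						\
			cpu_relax();					\
		smp_rmb();	/* ctrl dep + rmb := ACQUIRE */		\
	} while (0)

The read barrier after the spin is what lets this pair with the smp_store_release() in finish_lock_switch(), as the comment retained in the hunk below notes.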
Diffstat (limited to 'kernel/sched/core.c')
-rw-r--r--	kernel/sched/core.c	99
1 file changed, 92 insertions, 7 deletions
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1ef0d7aeab47..34cb9f7fc2d2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1905,6 +1905,97 @@ static void ttwu_queue(struct task_struct *p, int cpu)
 	raw_spin_unlock(&rq->lock);
 }
 
+/*
+ * Notes on Program-Order guarantees on SMP systems.
+ *
+ * MIGRATION
+ *
+ * The basic program-order guarantee on SMP systems is that when a task [t]
+ * migrates, all its activity on its old cpu [c0] happens-before any subsequent
+ * execution on its new cpu [c1].
+ *
+ * For migration (of runnable tasks) this is provided by the following means:
+ *
+ *  A) UNLOCK of the rq(c0)->lock scheduling out task t
+ *  B) migration for t is required to synchronize *both* rq(c0)->lock and
+ *     rq(c1)->lock (if not at the same time, then in that order).
+ *  C) LOCK of the rq(c1)->lock scheduling in task
+ *
+ * Transitivity guarantees that B happens after A and C after B.
+ * Note: we only require RCpc transitivity.
+ * Note: the cpu doing B need not be c0 or c1
+ *
+ * Example:
+ *
+ *   CPU0            CPU1            CPU2
+ *
+ *   LOCK rq(0)->lock
+ *   sched-out X
+ *   sched-in Y
+ *   UNLOCK rq(0)->lock
+ *
+ *                                   LOCK rq(0)->lock // orders against CPU0
+ *                                   dequeue X
+ *                                   UNLOCK rq(0)->lock
+ *
+ *                                   LOCK rq(1)->lock
+ *                                   enqueue X
+ *                                   UNLOCK rq(1)->lock
+ *
+ *                   LOCK rq(1)->lock // orders against CPU2
+ *                   sched-out Z
+ *                   sched-in X
+ *                   UNLOCK rq(1)->lock
+ *
+ *
+ * BLOCKING -- aka. SLEEP + WAKEUP
+ *
+ * For blocking we (obviously) need to provide the same guarantee as for
+ * migration. However the means are completely different as there is no lock
+ * chain to provide order. Instead we do:
+ *
+ *   1) smp_store_release(X->on_cpu, 0)
+ *   2) smp_cond_acquire(!X->on_cpu)
+ *
+ * Example:
+ *
+ *   CPU0 (schedule)  CPU1 (try_to_wake_up) CPU2 (schedule)
+ *
+ *   LOCK rq(0)->lock LOCK X->pi_lock
+ *   dequeue X
+ *   sched-out X
+ *   smp_store_release(X->on_cpu, 0);
+ *
+ *                    smp_cond_acquire(!X->on_cpu);
+ *                    X->state = WAKING
+ *                    set_task_cpu(X,2)
+ *
+ *                    LOCK rq(2)->lock
+ *                    enqueue X
+ *                    X->state = RUNNING
+ *                    UNLOCK rq(2)->lock
+ *
+ *                                          LOCK rq(2)->lock // orders against CPU1
+ *                                          sched-out Z
+ *                                          sched-in X
+ *                                          UNLOCK rq(2)->lock
+ *
+ *                    UNLOCK X->pi_lock
+ *   UNLOCK rq(0)->lock
+ *
+ *
+ * However; for wakeups there is a second guarantee we must provide, namely we
+ * must observe the state that lead to our wakeup. That is, not only must our
+ * task observe its own prior state, it must also observe the stores prior to
+ * its wakeup.
+ *
+ * This means that any means of doing remote wakeups must order the CPU doing
+ * the wakeup against the CPU the task is going to end up running on. This,
+ * however, is already required for the regular Program-Order guarantee above,
+ * since the waking CPU is the one issueing the ACQUIRE (smp_cond_acquire).
+ *
+ */
+
 /**
  * try_to_wake_up - wake up a thread
  * @p: the thread to be awakened
@@ -1968,19 +2059,13 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	/*
 	 * If the owning (remote) cpu is still in the middle of schedule() with
 	 * this task as prev, wait until its done referencing the task.
-	 */
-	while (p->on_cpu)
-		cpu_relax();
-	/*
-	 * Combined with the control dependency above, we have an effective
-	 * smp_load_acquire() without the need for full barriers.
 	 *
 	 * Pairs with the smp_store_release() in finish_lock_switch().
 	 *
 	 * This ensures that tasks getting woken will be fully ordered against
 	 * their previous state and preserve Program Order.
	 */
-	smp_rmb();
+	smp_cond_acquire(!p->on_cpu);
 
 	p->sched_contributes_to_load = !!task_contributes_to_load(p);
 	p->state = TASK_WAKING;
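
The BLOCKING ordering documented in the new comment block can be illustrated with plain C11 atomics: the side scheduling the task out publishes its prior stores with a store-release on an on_cpu flag, and the waker spins with acquire semantics until the flag drops, after which it is guaranteed to see those stores. The sketch below is only a userspace model of that pairing; the names on_cpu, task_state, cpu0_schedule and cpu1_wakeup are illustrative, not kernel code.

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_int on_cpu = 1;	/* models p->on_cpu */
	static int task_state;		/* models the task's prior state/stores */

	/* Models the CPU0 (schedule) column: scheduling the task out. */
	static void *cpu0_schedule(void *arg)
	{
		(void)arg;
		task_state = 42;				/* activity on the old cpu */
		atomic_store_explicit(&on_cpu, 0,
				      memory_order_release);	/* smp_store_release(X->on_cpu, 0) */
		return NULL;
	}

	/* Models the CPU1 (try_to_wake_up) column: the waker. */
	static void *cpu1_wakeup(void *arg)
	{
		(void)arg;
		/* smp_cond_acquire(!X->on_cpu): spin until the flag drops, with ACQUIRE. */
		while (atomic_load_explicit(&on_cpu, memory_order_acquire))
			;					/* cpu_relax() stand-in */
		/* Release/acquire guarantees the earlier store is visible here. */
		printf("observed task_state=%d\n", task_state);	/* always prints 42 */
		return NULL;
	}

	int main(void)
	{
		pthread_t t0, t1;

		pthread_create(&t1, NULL, cpu1_wakeup, NULL);
		pthread_create(&t0, NULL, cpu0_schedule, NULL);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);
		return 0;
	}

Compile with -pthread. The acquire load never observes on_cpu == 0 without also observing the store to task_state, which mirrors the guarantee try_to_wake_up() relies on when it reads the task's previous state after smp_cond_acquire(!p->on_cpu).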