path: root/drivers/gpu/drm/i915/i915_gem.c
author    Chris Wilson <chris@chris-wilson.co.uk>    2016-07-01 12:23:15 -0400
committer Chris Wilson <chris@chris-wilson.co.uk>    2016-07-01 15:58:43 -0400
commit    688e6c7258164de86d626e8e983ca8d28015c263 (patch)
tree      ea2040fd06199f2335f7431ca2506cd4e7757fe5 /drivers/gpu/drm/i915/i915_gem.c
parent    1f15b76f1ec973d1eb5d21b6d98b21aebb9025f1 (diff)
drm/i915: Slaughter the thundering i915_wait_request herd
One particularly stressful scenario consists of many independent tasks all competing for GPU time and waiting upon the results (e.g. realtime transcoding of many, many streams). One bottleneck in particular is that each client waits on its own results, but every client is woken up after every batchbuffer - hence the thunder of hooves as then every client must do its heavyweight dance to read a coherent seqno to see if it is the lucky one.

Ideally, we only want one client to wake up after the interrupt and check its request for completion. Since the requests must retire in order, we can select the first client on the oldest request to be woken. Once that client has completed its wait, we can then wake up the next client and so on. However, all clients then incur latency as every process in the chain may be delayed for scheduling - this may also then cause some priority inversion. To reduce the latency, when a client is added or removed from the list, we scan the tree for completed seqno and wake up all the completed waiters in parallel.

Using igt/benchmarks/gem_latency, we can demonstrate this effect. The benchmark measures the number of GPU cycles between completion of a batch and the client waking up from a call to wait-ioctl. With many concurrent waiters, each on a different request, we observe that the wakeup latency before the patch scales nearly linearly with the number of waiters (before external factors kick in and make the scaling much worse). After applying the patch, we can see that only the single waiter for the request is being woken up, providing a constant wakeup latency for every operation. However, the situation is not quite as rosy for many waiters on the same request, though to the best of my knowledge this is much less likely in practice. Here, we can observe that the concurrent waiters incur extra latency from being woken up by the solitary bottom-half rather than directly by the interrupt. This appears to be scheduler-induced (having discounted adverse effects from having an rbtree walk/erase in the wakeup path): each additional wake_up_process() costs approximately 1us on big core. Another effect of performing the secondary wakeups from the first bottom-half is the delay this imposes on high-priority threads, rather than immediately returning to userspace and leaving the interrupt handler to wake the others.

To offset the delay incurred with additional waiters on a request, we could use a hybrid scheme that did a quick read in the interrupt handler and dequeued all the completed waiters (incurring the overhead in the interrupt handler, not the best plan either as we then incur GPU submission latency), but we would still have to wake up the bottom-half every time to do the heavyweight slow read. Or we could only kick the waiters on the seqno with the same priority as the current task (i.e. in the realtime-waiter scenario, only it is woken up immediately by the interrupt and simply queues the next waiter before returning to userspace, minimising its delay at the expense of the chain, and also reducing contention on its scheduler runqueue). This is effective at avoiding long pauses in the interrupt handler and at avoiding the extra latency in realtime/high-priority waiters.

v2: Convert from a kworker per engine into a dedicated kthread for the bottom-half.
v3: Rename request members and tweak comments.
v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
v5: Fix race in locklessly checking waiter status and kicking the task on adding a new waiter.
v6: Fix deciding when to force the timer to hide missing interrupts.
v7: Move the bottom-half from the kthread to the first client process.
v8: Reword a few comments.
v9: Break the busy loop when the interrupt is unmasked or has fired.
v10: Comments, unnecessary churn, better debugging from Tvrtko.
v11: Wake all completed waiters on removing the current bottom-half to reduce the latency of waking up a herd of clients all waiting on the same request.
v12: Rearrange missed-interrupt fault injection so that it works with igt/drv_missed_irq_hang.
v13: Rename intel_breadcrumb and friends to intel_wait in preparation for signal handling.
v14: RCU commentary, assert_spin_locked.
v15: Hide BUG_ON behind the compiler; report on gem_latency findings.
v16: Sort seqno-groups by priority so that the first waiter has the highest task priority (and so avoid priority inversion).
v17: Add waiters to post-mortem GPU hang state.
v18: Return early for a completed wait after acquiring the spinlock. Avoids adding ourselves to the tree if the wait is already complete, and skips the awkward question of why we don't do completion wakeups for waits earlier than or equal to ourselves.
v19: Prepare for init_breadcrumbs to fail. Later patches may want to allocate during init, so be prepared to propagate back the error code.

Testcase: igt/gem_concurrent_blit
Testcase: igt/benchmarks/gem_latency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> #v18
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-6-git-send-email-chris@chris-wilson.co.uk
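To make the wakeup chain described above concrete, here is a minimal userspace sketch using plain POSIX threads. It is not the driver code: every name in it (struct waiter, add_waiter, irq_handler, wait_for_seqno) is illustrative, standing in for the helpers the patch actually adds (intel_wait_init(), intel_engine_add_wait(), intel_engine_remove_wait(), intel_kick_waiters()). The idea it demonstrates is only the core one: waiters queue in seqno order, the interrupt wakes only the oldest waiter, and that waiter hands the wakeup on to the next one when its request has completed.

/*
 * Minimal pthreads sketch of the single-wakeup chain -- not the kernel
 * code, all names illustrative.  The "interrupt" signals only the oldest
 * waiter; each waiter passes the wakeup down the chain when it retires.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

struct waiter {
        unsigned int seqno;             /* request this thread waits for */
        pthread_cond_t cond;            /* signalled when it is our turn */
        struct waiter *next;            /* singly-linked list, sorted by seqno */
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct waiter *first_waiter;     /* oldest waiter, i.e. the bottom-half */
static unsigned int hw_seqno;           /* last seqno the "GPU" completed */

/* Insert a waiter so the list stays sorted by seqno. */
static void add_waiter(struct waiter *w)
{
        struct waiter **p = &first_waiter;

        while (*p && (*p)->seqno <= w->seqno)
                p = &(*p)->next;
        w->next = *p;
        *p = w;
}

static void remove_waiter(struct waiter *w)
{
        struct waiter **p = &first_waiter;

        while (*p != w)
                p = &(*p)->next;
        *p = w->next;
}

/* The (simulated) interrupt handler: wake one waiter, not the whole herd. */
static void irq_handler(unsigned int completed)
{
        pthread_mutex_lock(&lock);
        hw_seqno = completed;
        if (first_waiter)
                pthread_cond_signal(&first_waiter->cond);
        pthread_mutex_unlock(&lock);
}

static void wait_for_seqno(unsigned int seqno)
{
        struct waiter self = { .seqno = seqno };

        pthread_cond_init(&self.cond, NULL);
        pthread_mutex_lock(&lock);
        add_waiter(&self);
        /*
         * Sleep until the wakeup chain reaches us.  The interrupt only
         * signals the oldest waiter, and since requests retire in order
         * the chain is guaranteed to get here once our seqno has passed.
         */
        while (hw_seqno < seqno)
                pthread_cond_wait(&self.cond, &lock);
        remove_waiter(&self);
        if (first_waiter)       /* hand the wakeup down the chain */
                pthread_cond_signal(&first_waiter->cond);
        pthread_mutex_unlock(&lock);
        pthread_cond_destroy(&self.cond);
}

static void *client(void *arg)
{
        unsigned int seqno = (unsigned int)(unsigned long)arg;

        wait_for_seqno(seqno);
        printf("client for seqno %u woken\n", seqno);
        return NULL;
}

int main(void)
{
        pthread_t threads[4];
        unsigned long i;

        for (i = 0; i < 4; i++)
                pthread_create(&threads[i], NULL, client, (void *)(i + 1));

        for (i = 1; i <= 4; i++) {      /* the "GPU" retires one request at a time */
                usleep(10000);
                irq_handler(i);
        }

        for (i = 0; i < 4; i++)
                pthread_join(threads[i], NULL);
        return 0;
}

The driver itself keeps its waiters in a per-engine rbtree under a spinlock, with the first waiter acting as the bottom-half that performs the coherent seqno read on behalf of the others; the diff below shows only the i915_gem.c side of that change.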
Diffstat (limited to 'drivers/gpu/drm/i915/i915_gem.c')
-rw-r--r--  drivers/gpu/drm/i915/i915_gem.c  143
1 file changed, 53 insertions, 90 deletions
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b5278d117ea0..c9814572e346 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1343,17 +1343,6 @@ i915_gem_check_wedge(unsigned reset_counter, bool interruptible)
         return 0;
 }
 
-static void fake_irq(unsigned long data)
-{
-        wake_up_process((struct task_struct *)data);
-}
-
-static bool missed_irq(struct drm_i915_private *dev_priv,
-                       struct intel_engine_cs *engine)
-{
-        return test_bit(engine->id, &dev_priv->gpu_error.missed_irq_rings);
-}
-
 static unsigned long local_clock_us(unsigned *cpu)
 {
         unsigned long t;
@@ -1386,7 +1375,7 @@ static bool busywait_stop(unsigned long timeout, unsigned cpu)
         return this_cpu != cpu;
 }
 
-static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
+static bool __i915_spin_request(struct drm_i915_gem_request *req, int state)
 {
         unsigned long timeout;
         unsigned cpu;
@@ -1401,17 +1390,14 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
          * takes to sleep on a request, on the order of a microsecond.
          */
 
-        if (req->engine->irq_refcount)
-                return -EBUSY;
-
         /* Only spin if we know the GPU is processing this request */
         if (!i915_gem_request_started(req, true))
-                return -EAGAIN;
+                return false;
 
         timeout = local_clock_us(&cpu) + 5;
-        while (!need_resched()) {
+        do {
                 if (i915_gem_request_completed(req, true))
-                        return 0;
+                        return true;
 
                 if (signal_pending_state(state, current))
                         break;
@@ -1420,12 +1406,9 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
                         break;
 
                 cpu_relax_lowlatency();
-        }
-
-        if (i915_gem_request_completed(req, false))
-                return 0;
+        } while (!need_resched());
 
-        return -EAGAIN;
+        return false;
 }
 
 /**
@@ -1450,18 +1433,14 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
                         s64 *timeout,
                         struct intel_rps_client *rps)
 {
-        struct intel_engine_cs *engine = i915_gem_request_get_engine(req);
-        struct drm_i915_private *dev_priv = req->i915;
-        const bool irq_test_in_progress =
-                ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_engine_flag(engine);
         int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
         DEFINE_WAIT(reset);
-        DEFINE_WAIT(wait);
-        unsigned long timeout_expire;
+        struct intel_wait wait;
+        unsigned long timeout_remain;
         s64 before = 0; /* Only to silence a compiler warning. */
-        int ret;
+        int ret = 0;
 
-        WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
+        might_sleep();
 
         if (list_empty(&req->list))
                 return 0;
@@ -1469,7 +1448,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
         if (i915_gem_request_completed(req, true))
                 return 0;
 
-        timeout_expire = 0;
+        timeout_remain = MAX_SCHEDULE_TIMEOUT;
         if (timeout) {
                 if (WARN_ON(*timeout < 0))
                         return -EINVAL;
@@ -1477,7 +1456,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
                 if (*timeout == 0)
                         return -ETIME;
 
-                timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
+                timeout_remain = nsecs_to_jiffies_timeout(*timeout);
 
                 /*
                  * Record current time in case interrupted by signal, or wedged.
@@ -1485,55 +1464,32 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
                 before = ktime_get_raw_ns();
         }
 
-        if (INTEL_INFO(dev_priv)->gen >= 6)
-                gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
-
         trace_i915_gem_request_wait_begin(req);
 
-        /* Optimistic spin for the next jiffie before touching IRQs */
-        ret = __i915_spin_request(req, state);
-        if (ret == 0)
-                goto out;
+        if (INTEL_INFO(req->i915)->gen >= 6)
+                gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
 
-        if (!irq_test_in_progress && WARN_ON(!engine->irq_get(engine))) {
-                ret = -ENODEV;
-                goto out;
-        }
+        /* Optimistic spin for the next ~jiffie before touching IRQs */
+        if (__i915_spin_request(req, state))
+                goto complete;
 
-        add_wait_queue(&dev_priv->gpu_error.wait_queue, &reset);
-        for (;;) {
-                struct timer_list timer;
+        set_current_state(state);
+        add_wait_queue(&req->i915->gpu_error.wait_queue, &reset);
 
-                prepare_to_wait(&engine->irq_queue, &wait, state);
-
-                /* We need to check whether any gpu reset happened in between
-                 * the request being submitted and now. If a reset has occurred,
-                 * the seqno will have been advance past ours and our request
-                 * is complete. If we are in the process of handling a reset,
-                 * the request is effectively complete as the rendering will
-                 * be discarded, but we need to return in order to drop the
-                 * struct_mutex.
+        intel_wait_init(&wait, req->seqno);
+        if (intel_engine_add_wait(req->engine, &wait))
+                /* In order to check that we haven't missed the interrupt
+                 * as we enabled it, we need to kick ourselves to do a
+                 * coherent check on the seqno before we sleep.
                  */
-                if (i915_reset_in_progress(&dev_priv->gpu_error)) {
-                        ret = 0;
-                        break;
-                }
-
-                if (i915_gem_request_completed(req, false)) {
-                        ret = 0;
-                        break;
-                }
+                goto wakeup;
 
+        for (;;) {
                 if (signal_pending_state(state, current)) {
                         ret = -ERESTARTSYS;
                         break;
                 }
 
-                if (timeout && time_after_eq(jiffies, timeout_expire)) {
-                        ret = -ETIME;
-                        break;
-                }
-
                 /* Ensure that even if the GPU hangs, we get woken up.
                  *
                  * However, note that if no one is waiting, we never notice
@@ -1541,32 +1497,33 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
                  * held by the GPU and so trigger a hangcheck. In the most
                  * pathological case, this will be upon memory starvation!
                  */
-                i915_queue_hangcheck(dev_priv);
-
-                timer.function = NULL;
-                if (timeout || missed_irq(dev_priv, engine)) {
-                        unsigned long expire;
+                i915_queue_hangcheck(req->i915);
 
-                        setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
-                        expire = missed_irq(dev_priv, engine) ? jiffies + 1 : timeout_expire;
-                        mod_timer(&timer, expire);
+                timeout_remain = io_schedule_timeout(timeout_remain);
+                if (timeout_remain == 0) {
+                        ret = -ETIME;
+                        break;
                 }
 
-                io_schedule();
-
-                if (timer.function) {
-                        del_singleshot_timer_sync(&timer);
-                        destroy_timer_on_stack(&timer);
-                }
-        }
-        remove_wait_queue(&dev_priv->gpu_error.wait_queue, &reset);
+                if (intel_wait_complete(&wait))
+                        break;
 
-        if (!irq_test_in_progress)
-                engine->irq_put(engine);
+                set_current_state(state);
 
-        finish_wait(&engine->irq_queue, &wait);
+wakeup:
+                /* Carefully check if the request is complete, giving time
+                 * for the seqno to be visible following the interrupt.
+                 * We also have to check in case we are kicked by the GPU
+                 * reset in order to drop the struct_mutex.
+                 */
+                if (__i915_request_irq_complete(req))
+                        break;
+        }
+        remove_wait_queue(&req->i915->gpu_error.wait_queue, &reset);
 
-out:
+        intel_engine_remove_wait(req->engine, &wait);
+        __set_current_state(TASK_RUNNING);
+complete:
         trace_i915_gem_request_wait_end(req);
 
         if (timeout) {
@@ -2796,6 +2753,12 @@ i915_gem_init_seqno(struct drm_i915_private *dev_priv, u32 seqno)
         }
         i915_gem_retire_requests(dev_priv);
 
+        /* If the seqno wraps around, we need to clear the breadcrumb rbtree */
+        if (!i915_seqno_passed(seqno, dev_priv->next_seqno)) {
+                while (intel_kick_waiters(dev_priv))
+                        yield();
+        }
+
         /* Finally reset hw state */
         for_each_engine(engine, dev_priv)
                 intel_ring_init_seqno(engine, seqno);