litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux	Linus Torvalds	2012-10-04
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull drm merge (part 1) from Dave Airlie: "So first of all my tree and uapi stuff has a conflict mess, its my fault as the nouveau stuff didn't hit -next as were trying to rebase regressions out of it before we merged. Highlights: - SH mobile modesetting driver and associated helpers - some DRM core documentation - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write combined pte writing, ilk rc6 support, - nouveau: major driver rework into a hw core driver, makes features like SLI a lot saner to implement, - psb: add eDP/DP support for Cedarview - radeon: 2 layer page tables, async VM pte updates, better PLL selection for > 2 screens, better ACPI interactions The rest is general grab bag of fixes. So why part 1? well I have the exynos pull req which came in a bit late but was waiting for me to do something they shouldn't have and it looks fairly safe, and David Howells has some more header cleanups he'd like me to pull, that seem like a good idea, but I'd like to get this merge out of the way so -next dosen't get blocked." Tons of conflicts mostly due to silly include line changes, but mostly mindless. A few other small semantic conflicts too, noted from Dave's pre-merged branch. * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits) drm/nv98/crypt: fix fuc build with latest envyas drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering drm/nv41/vm: fix and enable use of "real" pciegart drm/nv44/vm: fix and enable use of "real" pciegart drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie drm/nouveau: store supported dma mask in vmmgr drm/nvc0/ibus: initial implementation of subdev drm/nouveau/therm: add support for fan-control modes drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules drm/nouveau/therm: calculate the pwm divisor on nv50+ drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster drm/nouveau/therm: move thermal-related functions to the therm subdev drm/nouveau/bios: parse the pwm divisor from the perf table drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices drm/nouveau/therm: rework thermal table parsing drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table drm/nouveau: fix pm initialization order drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it drm/nouveau: log channel debug/error messages from client object rather than drm client drm/nouveau: have drm debugging macros build on top of core macros ...
\| *	drm/i915: Replace the array of pages with a scatterlist	Chris Wilson	2012-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: add workarounds to gen7_render_ring_flush	Paulo Zanoni	2012-09-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From Bspec, Vol 2a, Section 1.9.3.4 "PIPE_CONTROL", intro section detailing the various workarounds: "[DevIVB {W/A}, DevHSW {W/A}]: Pipe_control with CS-stall bit set must be issued before a pipe-control command that has the State Cache Invalidate bit set." Note that public Bspec has different numbering, it's Vol2Part1, Section 1.10.4.1 "PIPE_CONTROL" there. There's also a second workaround for the PIPE_CONTROL command itself: "[DevIVB, DevVLV, DevHSW] {WA}: Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with only read-cache-invalidate bit(s) set, must have a CS_STALL bit set" For simplicity we simply set the CS_STALL bit on every pipe_control on gen7+ Note that this massively helps on some hsw machines, together with the following patch to unconditionally set the CS_STALL bit on every pipe_control it prevents a gpu hang every few seconds. This is a regression that has been introduced in the pipe_control cleanup: commit 6c6cf5aa9c583478b19e23149feaa92d01fb8c2d Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 18:02:28 2012 +0100 drm/i915: Only apply the SNB pipe control w/a to gen6 It looks like the massive snb pipe_control workaround also papered over any issues on ivb and hsw. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [danvet: squashed both workarounds together, pimped commit message with Bsepc citations, regression commit citation and changed the comment in the code a bit to clarify that we unconditionally set CS_STALL to avoid being hurt by trying to be clever.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: add workarounds directly to gen6_render_ring_flush	Paulo Zanoni	2012-09-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since gen 7+ now run the new gen7_render_ring_flush function. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: add gen7_render_ring_flush	Paulo Zanoni	2012-09-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For now, just a copy of gen6_render_ring_flush. Different gens have different workarounds, so we want different functions. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: Only pwrite through the GTT if there is space in the aperture	Chris Wilson	2012-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	Merge tag 'v3.6-rc2' into drm-intel-next	Daniel Vetter	2012-08-17
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backmerge Linux 3.6-rc2 to resolve a few funny conflicts before we put even more madness on top: - drivers/gpu/drm/i915/i915_irq.c: Just a spurious WARN removed in -fixes, that has been changed in a variable-rename in -next, too. - drivers/gpu/drm/i915/intel_ringbuffer.c: -next remove scratch_addr (since all their users have been extracted in another fucntion), -fixes added another user for a hw workaroudn. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Lazily apply the SNB+ seqno w/a	Chris Wilson	2012-08-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid the forcewake overhead when simply retiring requests, as often the last seen seqno is good enough to satisfy the retirment process and will be promptly re-run in any case. Only ensure that we force the coherent seqno read when we are explicitly waiting upon a completion event to be sure that none go missing, and also for when we are reporting seqno values in case of error or debugging. This greatly reduces the load for userspace using the busy-ioctl to track active buffers, for instance halving the CPU used by X in pushing the pixels from a software render (flash). The effect will be even more magnified with userptr and so providing a zero-copy upload path in that instance, or in similar instances where X is simply compositing DRI buffers. v2: Reverse the polarity of the tachyon stream. Daniel suggested that 'force' was too generic for the parameter name and that 'lazy_coherency' better encapsulated the semantics of it being an optimization and its purpose. Also notice that gen6_get_seqno() is only used by gen6/7 chipsets and so the test for IS_GEN6 \|\| IS_GEN7 is redundant in that function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Only apply the SNB pipe control w/a to gen6	Chris Wilson	2012-08-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The requirements for the sync flush to be emitted prior to the render cache flush is only true for SandyBridge. On IvyBridge and friends we can just emit the flushes with an inline CS stall. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Macro to determine DPF support	Ben Widawsky	2012-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally I had a macro specifically for DPF support, and Daniel, with good reason asked me to change it to this. It's not the way I would have gone (and indeed I didn't), but for now there is no distinction as all platforms with L3 also have DPF. Note: The good reasons are that dpf is a l3$ feature (at least on currrent hw), hence I don't expect one to go without the other. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: added note] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcs	Chris Wilson	2012-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Remove the per-ring write list	Chris Wilson	2012-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/	David Howells	2012-10-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
* \| \|	UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/.	David Howells	2012-10-02
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding patch. Without this patch and the patch to make include the UAPI headers from the core headers, after the UAPI split, the DRM C sources cannot find these UAPI headers because the DRM code relies on specific -I flags to make #include "..." work on headers in include/drm/ - but that does not work after the UAPI split without adding more -I flags. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
* \|	drm/i915: Apply post-sync write for pipe control invalidates	Chris Wilson	2012-08-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When invalidating the TLBs it is documentated as requiring a post-sync write. Failure to do so seems to result in a GPU hang. Exposure to this hang on IVB seems to be a result of removing the extra stalls required for SNB pipecontrol workarounds: commit 6c6cf5aa9c583478b19e23149feaa92d01fb8c2d Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 18:02:28 2012 +0100 drm/i915: Only apply the SNB pipe control w/a to gen6 Note: Manually switch the pipe_control cmd to 4 dwords to avoid a (silent) functional conflict with -next. This way will get a loud (but conflict with next (since the scratch_addr has been deleted there). Reported-and-tested-by: yex.tian@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53322 Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: added note about merge conflict with -next.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: correctly order the ring init sequence	Daniel Vetter	2012-08-08
\|/ \| \| \| \| \| \| \| \| \| \| \| \|	We may only start to set up the new register values after having confirmed that the ring is truely off. Otherwise the hw might lose the newly written register values. This is caught later on in the init sequence, when we check whether the register writes have stuck. Cc: stable@vger.kernel.org Reviewed-by: Jani Nikula <jani.nikula@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Tested-by: Yang Guang <guang.a.yang@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: missing error case in init status page	Ben Widawsky	2012-07-20
\| \| \| \| \|	Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: Add comments to explain the BSD tail write workaround	Chris Wilson	2012-07-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Having had to dive into the bspec to understand what each stage of the workaround meant, and how that the ring broadcasting IDLE corresponded with the GT powering down the ring (i.e. rc6) add comments to aide the next reader. And since the register "is used to control all aspects of PSMI and power saving functions" that makes it quite interesting to inspect with regards to RC6 hangs, so add it to the error-state. v2: Rediscover the piece of magic, set the RNCID to 0 before waiting for the ring to wake up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: don't return a spurious -EIO from intel_ring_begin	Daniel Vetter	2012-07-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The issue with this check is that it results in userspace receiving an -EIO while the gpu reset hasn't completed, resulting in fallback to sw rendering or worse. Now there's also a stern comment in intel_ring_wait_seqno saying that intel_ring_begin should not return -EAGAIN, ever, because some callers can't handle that. But after an audit of the callsites I don't see any issues. I guess the last problematic spot disappeared with the removal of the pipelined fencing code. So do the right thing and call check_wedge, which should properly decide whether an -EAGAIN or -EIO is appropriate if wedged is set. Note that the early check for a wedged gpu before touching the ring is rather important (and it took me quite some time of acting like the densest doofus to figure that out): If we don't do that and the gpu died for good, not having been resurrect by the reset code, userspace can merrily fill up the entire ring until it notices that something is amiss. Allowing userspace to emit more render, despite that we know that it will fail can't lead to anything good (and by experience can lead to all sorts of havoc, including angering the OOM gods and hard-hanging the hw for good). v2: Fix EAGAIN mispell, noticed by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: non-interruptible sleeps can't handle -EAGAIN	Daniel Vetter	2012-07-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	So don't return -EAGAIN, even in the case of a gpu hang. Remap it to -EIO instead. Note that this isn't really an issue with interruptability, but more that we have quite a few codepaths (mostly around kms stuff) that simply can't handle any errors and hence not even -EAGAIN. Instead of adding proper failure paths so that we could restart these ioctls we've opted for the cheap way out of sleeping non-interruptibly. Which works everywhere but when the gpu dies, which this patch fixes. So essentially interruptible == false means 'wait for the gpu or die trying'.' This patch is a bit ugly because intel_ring_begin is all non-interruptible and hence only returns -EIO. But as the comment in there says, auditing all the callsites would be a pain. To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno and intel_wait_ring_buffer. Also use the opportunity to clarify the different cases in i915_gem_check_wedge a bit with comments. v2: Don't access dev_priv->mm.interruptible from check_wedge - we might not hold dev->struct_mutex, making this racy. Instead pass interruptible in as a parameter. I've noticed this because I've hit a BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been added in commit b4aca0106c466b5a0329318203f65bac2d91b682 Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Apr 25 20:50:12 2012 -0700 drm/i915: extract some common olr+wedge code although that commit is missing any justification for this. I guess it's just copy&paste, because the same commit add the same BUG_ON check to check_olr, where it indeed makes sense. But in check_wedge everything we access is protected by other means, so this is superflous. And because it now gets in the way (we add a new caller in __wait_seqno, which can be called without dev->struct_mutext) let's just remove it. v3: Group all the i915_gem_check_wedge refactoring into this patch, so that this patch here is all about not returning -EAGAIN to callsites that can't handle syscall restarting. v4: Add clarification what interuptible == fales means in our code, requested by Ben Widawsky. v5: Fix EAGAIN mispell noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: "Flush Me Harder" required on gen6+	Daniel Vetter	2012-06-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The prep to remove the flushing list in commit cc889e0f6ce6a63c62db17d702ecfed86d58083f Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jun 13 20:45:19 2012 +0200 drm/i915: disable flushing_list/gpu_write_list causes quite some decent regressions. We can fix this by setting the CS_STALL bit to ensure that the following seqno write happens only after the cache flush has completed. But only do that when the caller actually wants the flush (and not also when we invalidate caches before starting the next batch). I've looked through all our ancient scrolls about gen6+ pipe control workarounds, and this seems to be indeed a legal combination: We're allowed to set the CS_STALL bit when we flush the render cache (which we do). While yelling at this code, also pass back the return value from intel_emit_post_sync_nonzero_flush properly. v2: Instead of emitting more pipe controls, set the CS_STALL bit on the write flush as suggested by Chris Wilson. It seems to work, too. Cc: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51436 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51429 Tested-by: Lu Hua <huax.lu@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	Merge tag 'v3.5-rc4' into drm-intel-next-queued	Daniel Vetter	2012-06-25
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I want to merge the "no more fake agp on gen6+" patches into drm-intel-next (well, the last pieces). But a patch in 3.5-rc4 also adds a new use of dev->agp. Hence the backmarge to sort this out, for otherwise drm-intel-next merged into Linus' tree would conflict in the relevant code, things would compile but nicely OOPS at driver load :( Conflicts in this merge are just simple cases of "both branches changed/added lines at the same place". The only tricky part is to keep the order correct wrt the unwind code in case of errors in intel_ringbuffer.c (and the MI_DISPLAY_FLIP #defines in i915_reg.h together, obviously). Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ringbuffer.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: hold forcewake around ring hw init	Daniel Vetter	2012-06-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Empirical evidence suggests that we need to: On at least one ivb machine when running the hangman i-g-t test, the rings don't properly initialize properly - the RING_START registers seems to be stuck at all zeros. Holding forcewake around this register init sequences makes chip reset reliable again. Note that this is not the first such issue: commit f01db988ef6f6c70a6cc36ee71e4a98a68901229 Author: Sean Paul <seanpaul@chromium.org> Date: Fri Mar 16 12:43:22 2012 -0400 drm/i915: Add wait_for in init_ring_common added delay loops to make RING_START and RING_CTL initialization reliable on the blt ring at boot-up. So I guess it won't hurt if we do this unconditionally for all force_wake needing gpus. To avoid copy&pasting of the HAS_FORCE_WAKE check I've added a new intel_info bit for that. v2: Fixup missing commas in static struct and properly handling the error case in init_ring_common, both noticed by Jani Nikula. Cc: stable@vger.kernel.org Reported-and-tested-by: Yang Guang <guang.a.yang@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: Mark the ringbuffers as being in the GTT domain	Chris Wilson	2012-06-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	By correctly describing the rinbuffers as being in the GTT domain, it appears that we are more careful with the management of the CPU cache upon resume and so prevent some coherency issue when submitting commands to the GPU later. A secondary effect is that the debug logs are then consistent with the actual usage (i.e. they no longer describe the ringbuffers as being in the CPU write domain when we are accessing them through an wc iomapping.) Reported-and-tested-by: Daniel Gnoutcheff <daniel@gnoutcheff.name> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41092 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: Reset last_retired_head when resetting ring	Chris Wilson	2012-05-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we reset the ring control registers, including the HEAD and TAIL of the ring, we also need to reset associated state. In this instance, we were failing to reset the cached value of ring->last_retired_head and so upon the first request for more space following a resume would potentially (depending on a narrow race window) believe that the HEAD had advanced much further than reality. This is a regression from: commit a71d8d94525e8fd855c0466fb586ae1cb008f3a2 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Feb 15 11:25:36 2012 +0000 drm/i915: Record the tail at each request and use it to estimate the head Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org # 3.4 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: possibly invalidate TLB before context switch	Ben Widawsky	2012-06-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From http://intellinuxgraphics.org/documentation/SNB/IHD_OS_Vol1_Part3.pdf [DevSNB] If Flush TLB invalidation Mode is enabled it's the driver's responsibility to invalidate the TLBs at least once after the previous context switch after any GTT mappings changed (including new GTT entries). This can be done by a pipelined PIPE_CONTROL with TLB inv bit set immediately before MI_SET_CONTEXT. On GEN7 the invalidation mode is explicitly set, but this appears to be lacking for GEN6. Since I don't know the history on this, I've decided to dynamically read the value at ring init time, and use that value throughout. v2: better comment (daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
* \|	drm/i915: PIPE_CONTROL_TLB_INVALIDATE	Ben Widawsky	2012-06-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This has showed up in several other patches. It's required for the next context workaround. I tested this one on its own and saw no differences in basic tests (performance or otherwise). This patch is relatively likely to cause regressions, hence why it's split out. Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
* \|	drm/i915: stop using dev->agp->base	Daniel Vetter	2012-06-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For that to work we need to export the base address of the gtt mmio window from intel-gtt. Also replace all other uses of dev->agp by values we already have at hand. Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: enable parity error interrupts	Ben Widawsky	2012-05-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous patch put all the code, and handlers in place. It should now be safe to enable the parity error interrupt. The parity error must be unmasked in both the GTIMR, and the CS IMR. Unfortunately, the docs aren't clear about this; nevertheless it's the truth. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: s/i915_wait_request/i915_wait_seqno/g	Ben Widawsky	2012-05-25
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wait request is poorly named IMO. After working with these functions for some time, I feel it's much clearer to name the functions more appropriately. Of course we must update the callers to use the new name as well. This leaves room within our namespace for a real wait request function at some point. Note to maintainer: this patch is optional. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	Merge remote-tracking branch 'airlied/drm-core-next' into drm-intel-next-queued	Daniel Vetter	2012-05-08
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backmerge of drm-next to resolve a few ugly conflicts and to get a few fixes from 3.4-rc6 (which drm-next has already merged). Note that this merge also restricts the stencil cache lra evict policy workaround to snb (as it should) - I had to frob the code anyway because the CM0_MASK_SHIFT define died in the masked bit cleanups. We need the backmerge to get Paulo Zanoni's infoframe regression fix for gm45 - further bugfixes from him touch the same area and would needlessly conflict. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	Merge tag 'v3.4-rc6' into drm-intel-next	Daniel Vetter	2012-05-07
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: drivers/gpu/drm/i915/intel_display.c Ok, this is a fun story of git totally messing things up. There /shouldn't/ be any conflict in here, because the fixes in -rc6 do only touch functions that have not been changed in -next. The offending commits in drm-next are 14415745b2..1fa611065 which simply move a few functions from intel_display.c to intel_pm.c. The problem seems to be that git diff gets completely confused: $ git diff 14415745b2..1fa611065 is a nice mess in intel_display.c, and the diff leaks into totally unrelated functions, whereas $git diff --minimal 14415745b2..1fa611065 is exactly what we want. Unfortunately there seems to be no way to teach similar smarts to the merge diff and conflict generation code, because with the minimal diff there really shouldn't be any conflicts. For added hilarity, every time something in that area changes the + and - lines in the diff move around like crazy, again resulting in new conflicts. So I fear this mess will stay with us for a little longer (and might result in another backmerge down the road). Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| \| *	drm/i915: Set the Stencil Cache eviction policy to non-LRA mode.	Kenneth Graunke	2012-04-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clearing bit 5 of CACHE_MODE_0 is necessary to prevent GPU hangs in OpenGL programs such as Google MapsGL, Google Earth, and gzdoom when using separate stencil buffers. Without it, the GPU tries to use the LRA eviction policy, which isn't supported. This was supposed to be off by default, but seems to be on for many machines. This cannot be done in gen6_init_clock_gating with most of the other workaround bits; the render ring needs to exist. Otherwise, the register write gets dropped on the floor (one printk will show it changed, but a second printk immediately following shows the value reverts to the old one). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47535 Cc: stable@vger.kernel.org Cc: Rob Castle <futuredub@gmail.com> Cc: Eric Appleman <erappleman@gmail.com> Cc: aaron667@gmx.net Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
* \| \|	drm/i915: add interface to simulate gpu hangs	Daniel Vetter	2012-05-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gpu reset is a very important piece of our infrastructure. Unfortunately we only really it test by actually hanging the gpu, which often has bad side-effects for the entire system. And the gpu hang handling code is one of the rather complicated pieces of code we have, consisting of - hang detection - error capture - actual gpu reset - reset of all the gem bookkeeping - reinitialition of the entire gpu This patch adds a debugfs to selectively stopping rings by ceasing to update the hw tail pointer, which will result in the gpu no longer updating it's head pointer and eventually to the hangcheck firing. This way we can exercise the gpu hang code under controlled conditions without a dying gpu taking down the entire systems. Patch motivated by me forgetting to properly reinitialize ppgtt after a gpu reset. Usage: echo $((1 << $ringnum)) > i915_ring_stop # stops one ring echo 0xffffffff > i915_ring_stop # stops all, future-proof version then run whatever testload is desired. i915_ring_stop automatically resets after a gpu hang is detected to avoid hanging the gpu to fast and declaring it wedged. v2: Incorporate feedback from Chris Wilson. v3: Add the missing cleanup. v4: Fix up inconsistent size of ring_stop_read vs _write, noticed by Eugeni Dodonov. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: fixup __iomem mixups in ringbuffer.c	Daniel Vetter	2012-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Two things: - ring->virtual start is an __iomem pointer, treat it accordingly. - dev_priv->status_page.page_addr is now always a cpu addr, no pointer casting needed for that. Take the opportunity to remove the unnecessary drm indirection when setting up the ringbuffer iomapping. v2: Add a compiler barrier before reading the hw status page. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: kill pointless clearing of dev_priv->hws_map	Daniel Vetter	2012-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We kzalloc dev_priv, and we never use hws_map in intel_ringbuffer.c. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: remove do_retire from i915_wait_request	Ben Widawsky	2012-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This originates from a hack by me to quickly fix a bug in an earlier patch where we needed control over whether or not waiting on a seqno actually did any retire list processing. Since the two operations aren't clearly related, we should pull the parameter out of the wait function, and make the caller responsible for retiring if the action is desired. The only function call site which did not get an explicit retire_request call (on purpose) is i915_gem_inactive_shrink(). That code was already calling retire_request a second time. v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: rip out GEM drm feature checks	Daniel Vetter	2012-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We always set it so there's no point in checking. We could instead add a bit that tells us whether gem is actually initialized (i.e. either kms or gem_init_ioctl called), but that's imho not worth it. So just rip it out. There's a little change in the wait_ring timeout, but we've never run with anything else than the 60 second timeout, even on dri1 userspace. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: Use a global lock for modifying global irq flags	Chris Wilson	2012-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were attempting to use a per-ring spinlock whilst modifying global IRQ flags. A recipe for rare missed interrupts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: create macros to handle masked bits	Daniel Vetter	2012-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	... and put them to so good use. Note that there's functional change in vlv clock gating code, we now no longer spuriously read back the current value of the bit. According to Bspec the high bits should always read zero, so ORing this in should have no effect. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: i8xx interrupt handler	Chris Wilson	2012-05-03
\|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gen2 hardware has some significant differences from the other interrupt routines that were glossed over and then forgotten about in the transition to KMS. Such as - 16bit IIR - PendingFlip status bit This patch reintroduces a handler specifically for gen2 for the purpose of handling pageflips correctly, simplifying code in the process. v2: Also fixup ring get/put irq to only access 16bit registers (Daniel) Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=24202 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41793 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: use posting_read16 in intel_ringbuffer.c and kill _driver from the function names.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: invalidate render cache on gen2	Daniel Vetter	2012-04-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It looks like we also need to flush the render cache when we just invalidate it. This fixes a regression in i-g-t/gem_tiled_blits on my i855gm. I guess the render cache there is virtually indexed, so we need to clean it when changing gtt mappings. This regression has been introduce in commit 46f0f8d120c4afae53a5670bf3ac80a928340ff3 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Apr 18 11:12:11 2012 +0100 drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSH Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSH	Chris Wilson	2012-04-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On gen2 MI_EXE_FLUSH is actually an AGP flush bit and on gen3 marked as reserved. On both it is documented as being must-be-zero. So obey the documentation, and separate the gen2 flush into its own little routine and share with gen3. This means that we can rename the existing render_ring_flush() to reflect the generation from which it first applies and remove the code for handling earlier generations from it. v2: Applies to gen3 as well v3: Make it compile and improve the commit message. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Replace open coded MI_BATCH_GTT	Chris Wilson	2012-04-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The (2<<6) virtual memory space selector harks back to gen3 and is mandatory given our use of GTT space for batchbuffers. On gen4+, use of the GTT became mandatory and bit6 marked reserved. However the code must now explicitly set (1<<7), which conveniently is also (2<<6). To clarify the meaning for future readers, replace the open coded (2<<6) with MI_BATCH_GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: [sparse] trivial sparse fixes	Ben Widawsky	2012-04-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This should contain all the changes which require no thought to make sparse happy. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	Merge tag 'v3.4-rc3' into drm-intel-next-queued	Daniel Vetter	2012-04-17
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backmerge Linux 3.4-rc3 into drm-intel-next to resolve a few things that conflict/depend upon patches in -rc3: - Second part of the Sandybridge workaround series - it changes some of the same registers. - Preparation for Chris Wilson's fencing cleanup - we need the fix from -rc3 merged before we can move around all that code. - Resolve the gmbus conflict - gmbus has been disabled in 3.4 again, but should be enabled on all generations in 3.5. Conflicts: drivers/gpu/drm/i915/intel_i2c.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g	Chris Wilson	2012-04-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 845g shares the errata with i830 whereby executing a command within 2 cachelines of the end of the ringbuffer may cause a GPU hang. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: inline enable/disable_irq into ring->get/put_irq	Daniel Vetter	2012-04-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that these are properly refactored this additional indirection doesn't really buy us anything but confusion. Hence inline them. This duplicates the ironlake gt enable/disable code snippet, but we've already separate ilk from gen6+ gt irq in i915_irq.c, so I think this makes more sense. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: don't set up gem ring functions on gen5 for !kms	Daniel Vetter	2012-04-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already disallow initialition of gem in this case in the corresponding ioctl, so don't bother setting up the gem support ring functions in the legacy dri render ring init. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: consolidate ring->add_request a bit	Daniel Vetter	2012-04-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	They're indentical, so just kill one. Also give the other a prefix to distinguish it from the gen6+ functions - this add_request function is not really generic code. v2: Fixup commit message as noted by Ben Widawsky. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>