nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: move gpfifo submit wait to userspace	Aingara Paramakuru	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of blocking for gpfifo space in the nvgpu driver, return -EAGAIN and allow userspace to decide the blocking policy. Bug 1795076 Change-Id: Ie091caa92aad3f68bc01a3456ad948e76883bc50 Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1202591 (cherry picked from commit 8056f422c6a34a4239fc4993c40c2e517c932714) Reviewed-on: http://git-master/r/1203800 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: separate IOCTL to set preemption mode	Deepak Nibade	2016-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE to allow setting preemption modes from UMD Define preemption modes in nvgpu.h and use them everywhere Remove mode definitions from mm_gk20a.h Also, we support setting only one preemption mode in a channel But it is possible to have multiple preemption modes (one from graphics and one from compute) set simultaneously Hence, update struct gr_ctx_desc to include two separate preemption modes (graphics_preempt_mode and compute_preempt_mode) API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports setting two separate preemption modes i.e. one for graphics and one for compute Make necessary changes in code to support two preemption modes Bug 1646259 Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1131805 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add trace and debugfs for sched params	Thomas Fleury	2016-05-05
\| \| \| \| \| \| \| \| \| \| \| \|	JIRA EVLR-244 JIRA EVLR-318 Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1120603 GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Ken Adams <kadams@nvidia.com>
*	gpu: nvgpu: add trace event for channel reset	Thomas Fleury	2016-04-07
\| \| \| \| \| \| \| \|	Change-Id: I319e877978b7f483108ef8f67c05702b71709f62 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1120501 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add support for FECS ctxsw tracing	Anton Vorontsov	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1648908 This commit adds support for FECS ctxsw tracing. Code is compiled conditionnaly under CONFIG_GK20_CTXSW_TRACE. This feature requires an updated FECS ucode that writes one record to a ring buffer on each context switch. On RM/Kernel side, the GPU driver reads records from the master ring buffer and generates trace entries into a user-facing VM ring buffer. For each record in the master ring buffer, RM/Kernel has to retrieve the vmid+pid of the user process that submitted related work. Features currently implemented: - master ring buffer allocation - debugfs to dump master ring buffer - FECS record per context switch (with both current and new contexts) - dedicated device for ctxsw tracing (access to VM ring buffer) - SOF generation (and access to PTIMER) - VM ring buffer allocation, and reconfiguration - enable/disable tracing at user level - event-based trace filtering - context_ptr to vmid+pid mapping - read system call for ctxsw dev - mmap system call for ctxsw dev (direct access to VM ring buffer) - poll system call for ctxsw dev - save/restore register on ELPG/CG6 - separate user ring from FECS ring handling Features requiring ucode changes: - enable/disable tracing at FECS level - actual busy time on engine (bug 1642354) - master ring buffer threshold interrupt (P1) - API for GPU to CPU timestamp conversion (P1) - vmid/pid/uid based filtering (P1) Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1022737 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add per-channel refcounting	Konsta Holtta	2015-06-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add reference counting for channels, and wait for reference count to get to 0 in gk20a_channel_free() before actually freeing the channel. Also, change free channel tracking a bit by employing a list of free channels, which simplifies the procedure of finding available channels with reference counting. Each use of a channel must have a reference taken before use or held by the caller. Taking a reference of a wild channel pointer may fail, if the channel is either not opened or in a process of being closed. Also, add safeguards for protecting accidental use of closed channels, specifically, by setting ch->g = NULL in channel free. This will make it obvious if freed channel is attempted to be used. The last user of a channel might be the deferred interrupt handler, so wait for deferred interrupts to be processed twice in the channel free procedure: once for providing last notifications to the channel and once to make sure there are no stale pointers left after referencing to the channel has been denied. Finally, fix some races in channel and TSG force reset IOCTL path, by pausing the channel scheduler in gk20a_fifo_recover_ch() and gk20a_fifo_recover_tsg(), while the affected engines have been identified, the appropriate MMU faults triggered, and the MMU faults handled. In this case, make sure that the MMU fault does not attempt to query the hardware about the failing channel or TSG ids. This should make channel recovery more safe also in the regular (i.e., not in the interrupt handler) context. Bug 1530226 Bug 1597493 Bug 1625901 Bug 200076344 Bug 200071810 Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/448463 (cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353) Reviewed-on: http://git-master/r/755147 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Use busy looping for flush operations	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use busy looping for l2 tag flush and elpg flush operations. This is making total flash time more accurate and reduced overall time compared with usleep. Also added trace points to measure performance for these operations. Also corrected timeout error check for non-silicon platforms. Bug 200081799 Change-Id: I63410bb7528db9258501633996fbdee5fdec1c74 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/710472 (cherry picked from commit 18684cf9d5d6870a1a1fd5711c4fc2d733caad20) Reviewed-on: http://git-master/r/710986 GVS: Gerrit_Virtual_Submit Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Use busy looping on memory ops	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Use busy looping on L2 and TLB maintenance operations. This speeds them up by an order of magnitude. Add also trace points to measure performance for memory ops and interrupt processing. Change-Id: Ic4a8525d3d946b2b8f57b4b8ddcfc61605619399 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/681640
*	gpu: nvgpu: gk20a: Add a gpfifo wait trace point	Janne Hellsten	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a couple of trace points for tracking when we need to wait for space in the gpfifo ring buffer. This wait can introduce significant latencies to rendering with large gpfifo entry inputs so it's good to be able to measure how often this path is taken. Bug 1592391 Bug 1550886 Change-Id: I7f362e9c307eeffeeecaaba268ef2e3613e54597 Signed-off-by: Janne Hellsten <jhellsten@nvidia.com> Reviewed-on: http://git-master/r/674021 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cde: add trace events for ctx allocs	Konsta Holtta	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \|	Trace cde context allocation and deallocation with ftrace. Bug 200052943 Change-Id: Ieeb625166662971fb3eb3fb29c986fdb6809c10b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/602886 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add trace event for channel update	Konsta Holtta	2015-03-18
\| \| \| \| \| \| \| \| \| \|	Bug 200052943 Change-Id: Ied6454bbfb5df9ab29497ecbf2aac495f6d89362 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/602887 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement NVGPU_AS_IOCTL_GET_VA_REGIONS	Sami Kiminki	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_IOCTL_GET_VA_REGIONS which returns a list of GPU VA regions for different page sizes. This is required for the userspace for safe fixed-address address space allocation. Bug 1551752 Change-Id: I63ddde30935db2471bec498dae0caa870e89c1a5 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/590814 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add rail gating trace events	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \|	Change-Id: I661f14b2858fb7bc993157a597d4a278859da837 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/418789 Reviewed-by: Automatic_Commit_Validation_User
*	video: tegra: host: Abstract gk20a channel synchronization	Lauri Peltonen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move all channel synchronization code to new channel_sync_gk20a.c/h files, and access all synchronization functions through function pointers. This is groundwork for supporting semaphore based channel synchronization. Bug 1434573 Bug 1450122 Change-Id: Ic21709c1ee8cf85d018616787988e7eebb399fbe Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-on: http://git-master/r/374841 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	video: tegra: host: gk20a: Add own trace events	Arto Merilainen	2015-03-18
	This patch modifies gk20a driver to use its own trace events instead of trace events of nvhost. Bug 1468086 Change-Id: I1f40ba957ec7e25904e9f4621346cb7dc256a550 Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/374289 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>