nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
...
*	gpu: nvgpu: use vidmem for gpfifos if available	Konsta Holtta	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the common gk20a_gmmu_alloc() that tries vidmem too. Jira DNVGPU-21 Change-Id: Ie22cb0f5ed70ec71567fc85d348b3526c9a32b02 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1204304 (cherry picked from commit 07cb99baeb10194c520addd77517841a6f99df93) Reviewed-on: http://git-master/r/1169310 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gk20a: Use spin_lock for jobs_lock	Bharat Nihalani	2016-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is done to boost performance of the GPU submit time, which is critical for compute use-cases. Bug 200215465 Bug 1804898 Conflicts: drivers/gpu/nvgpu/gk20a/channel_gk20a.c Change-Id: Ic4884ee4eac910b92b84a47fdc1b2e9f26b2f1f0 Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/1199860 Reviewed-on: http://git-master/r/1209834 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix ctxsw timeout handling for TSGs	Thomas Fleury	2016-08-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While collecting failing engine data, id type (is_tsg) was not set for ctxsw and save engine states. This could result in some ctxsw timeout interrupts to be ignored (id reported with wrong is_tsg). For TSGs, check if we made some progress on any of the channels before kicking fifo recovery. Bug 200228310 Jira EVLR-597 Change-Id: I231549ae68317919532de0f87effb78ee9c119c6 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1204035 (cherry picked from commit 7221d256fd7e9b418f7789b3d81eede8faa16f0b) Reviewed-on: http://git-master/r/1204037 Reviewed-by: Richard Zhao <rizhao@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use force_reset_ch in ch wdt handler	Richard Zhao	2016-08-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- let force_reset_ch pass down err code - force_reset_ch callback can cover vgpu too. Bug 1776876 JIRA VFND-2151 Change-Id: I48f7890294c6455247198e0cab5f21f83f61f0e1 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1202255 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix deferred engine reset sequence	Deepak Nibade	2016-08-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently store fault_id into fifo.deferred_fault_engines and use that in gk20a_fifo_reset_engine() which is incorrect Also, in deferred engine reset path during channel close, we do not check if channel is loaded on engine or not fix this with below - store engine_id bits into fifo.deferred_fault_engines - define new API gk20a_fifo_deferred_reset() to perform deferred engine reset - get all engines on which channel is loaded with gk20a_fifo_engines_on_id() - for each set bit/engine_id in fifo.deferred_fault_engines, check if channel is loaded on that engine, and if yes, reset the engine Bug 1791696 Change-Id: I1b8b1a9e3aa538fe6903a352aa732b47c95ec7d5 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1195087 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Free the channel used semaphore index	Lakshmanan M	2016-08-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Free the channel used semaphore index during gk20a_free_channel(). Bug 1793819 Change-Id: I4215d05f7f3ba0636e2abb1803011711c8a38301 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1196877 (cherry picked from commit 2c5720de506caac29629f6a1c578e6da80b1a135) Reviewed-on: http://git-master/r/1198883 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: initialize local variable	Deepak Nibade	2016-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initialize character array buf in gk20a_channel_ioctl() to zero Keeping it uninitialized can result in leaking kernel stack info to user space since we pass this buffer to UMD Bug 1793398 Change-Id: Iffd654dbaca3b4e3c8fd2ac270d0febd01c165b8 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1195862 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Add nvgpu infra to allow kernel to create privileged CE channels	Lakshmanan M	2016-07-20
\| \| \| \| \| \| \| \| \| \| \| \| \|	Added interface to allow kernel to create privileged CE channels for page migration and clearing support between sysmem and videmem. JIRA DNVGPU-53 Change-Id: I3e18d18403809c9e64fa45d40b6c4e3844992506 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1173085 GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: process granularity for FECS traces	Thomas Fleury	2016-07-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When processing FECS traces, a hash table is used to retrieve the 'pid' of the process that created the channel/TSG. Report process identifer (aka tgid in kernel) instead of thread identifier (aka pid) for FECS traces. Bug 1736423 Change-Id: I54cb9d298b9fe3e1cccdd7145604cd01c5758c9d Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1166501 (cherry picked from commit f7fd1f6d7ad0753b787ec20604a08a1f4882fe6f) Reviewed-on: http://git-master/r/1168728 (cherry picked from commit 97a62e5b89352fce576f1bca71b38bf2242ff047) Reviewed-on: http://git-master/r/1177823 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Richard Zhao <rizhao@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: use nvgpu_free for gpfifo pipe cleanup on error	Konsta Holtta	2016-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace kfree with nvgpu_free in error handling path in gk20a_alloc_channel_gpfifo where the gpfifo pipe buffer is being allocated, because it's allocated with nvgpu_alloc. Jira DNVGPU-21 Change-Id: I73100394b67da2ab064e4e9df6b430d818abce56 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1182401 GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: use vidmem by default in gmmu_alloc variants	Konsta Holtta	2016-07-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For devices that have vidmem available, use the vidmem allocator in gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem. Because all of the buffers haven't been tested to work in vidmem yet, rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at the end to declare explicitly that vidmem is used. Enabling vidmem for each now is a matter of removing "_sys" from the function call. Jira DNVGPU-18 Change-Id: Ibe42f67eff2c2b68c36582e978ace419dc815dc5 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1176805 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: initial support for vidmem apertures	Konsta Holtta	2016-07-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	add gk20a_aperture_mask() for memory target selection now that buffers can actually be allocated from vidmem, and use it in all cases that have a mem_desc available. Jira DNVGPU-76 Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1169306 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cancel job clean up before aborting channel	Deepak Nibade	2016-07-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is possible that when we abort the channel, we have job clean up worker running, which could race with abort and sometimes result in below panic [ 245.483566] Unable to handle kernel paging request at virtual address 800000000 ... [ 245.548991] PC is at gk20a_channel_abort_clean_up+0xb8/0x140 [ 245.554683] LR is at gk20a_channel_abort_clean_up+0xac/0x140 ... [ 247.301860] [<ffffffc000479390>] gk20a_channel_abort_clean_up+0xb8/0x140 [ 247.312853] [<ffffffc0004794d4>] gk20a_channel_abort+0xbc/0xc8 [ 247.322970] [<ffffffc0004794f8>] gk20a_disable_channel+0x18/0x30 [ 247.333267] [<ffffffc000479628>] gk20a_free_channel+0x118/0x584 [ 247.343473] [<ffffffc000479aa0>] gk20a_channel_close+0xc/0x14 [ 247.353479] [<ffffffc000479b80>] gk20a_channel_release+0xd8/0x104 Fix this by cancelling the job clean up worker before aborting the channel Bug 1777281 Change-Id: Ic24c7c03b27cfb5cd164a52efdb1e2813a41a10a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1174416 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Revamp semaphore support	Alex Waterman	2016-06-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Revamp the support the nvgpu driver has for semaphores. The original problem with nvgpu's semaphore support is that it required a SW based wait for every semaphore release. This was because for every fence that gk20a_channel_semaphore_wait_fd() waited on a new semaphore was created. This semaphore would then get released by SW when the fence signaled. This meant that for every release there was necessarily a sync_fence_wait_async() call which could block. The latency of this SW wait was enough to cause massive degredation in performance. To fix this a fast path was implemented. When a fence is passed to gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore a semaphore acquire is directly used to block the GPU. No longer is a sync_fence_wait_async() performed nor is there an extra semaphore created. To implement this fast path the semaphore memory had to be shared between channels. Previously since a new semaphore was created every time through gk20a_channel_semaphore_wait_fd() what address space a semaphore was mapped into was irrelevant. However, when using the fast path a sempahore may be released on one address space but acquired in another. Sharing the semaphore memory was done by making a fixed GPU mapping in all channels. This mapping points to the semaphore memory (the so called semaphore sea). This global fixed mapping is read-only to make sure no semaphores can be incremented (i.e released) by a malicious channel. Each channel then gets a RW mapping of it's own semaphore. This way a channel may only acquire other channel's semaphores but may both acquire and release its own semaphore. The gk20a fence code was updated to allow introspection of the GPU backed fences. This allows detection of when the fast path can be taken. If the fast path cannot be used (for example when a fence is sync-pt backed) the original slow path is still present. This gets used when the GPU needs to wait on an event from something which only understands how to use sync-pts. Bug 1732449 JIRA DNVGPU-12 Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1133792 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add interface for privileged channel allocation	Lakshmanan M	2016-06-23
\| \| \| \| \| \| \| \| \| \| \| \| \|	Added interface for privileged channel allocation to excute the privileged method (ex. CE phys mode transfer). JIRA DNVGPU-53 Change-Id: I07f9181720b14345cf5890919c2818dfcf505d86 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1169315 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: disable semaphore acquire timeout when channel wdt is disabled	Richard Zhao	2016-06-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	CUDA needs it disabled. Bug 1775453 Change-Id: Ic6d5050f9fda259337668e2a245c05e27d65e047 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1162765 (cherry picked from commit 44b48d84e75ced2fd9eecebbe94a0289c527c0c2) Reviewed-on: http://git-master/r/1169049 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cancel channel wdt during suspend	Deepak Nibade	2016-06-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cancel channel watchdog timeout during channel suspend This should help fix race conditions when watchdog is triggered during shutdown Bug 200209309 Change-Id: I6cf740d854c27985217a1a76afa822e3126d4153 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1168613 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Sparse fixes in gpfifo_mem user gpfifo	Konsta Holtta	2016-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	1) Keep the __user tag in the type of the user gpfifo when copying, 2) use NULL instead of 0 for initializing user_gpfifo pointer. Bug 200067946 Change-Id: I631b4bca44ded0900204134338fa1d62d0017df0 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1168441 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use gpfifo_mem via gk20a_mem_{rd,wr}	Konsta Holtta	2016-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use gk20a_mem_*() accessors for gpfifo memory in work submission instead of direct cpu accesses in order to support other apertures than sysmem. The gpfifo memory is still allocated from sysmem for dgpus too. Split the copying of priv_cmds and the main gpfifo to be submitted in gk20a_submit_channel_gpfifo() into separate functions. JIRA DNVGPU-21 Change-Id: If271ca8e7e34235f00d31855dbccf77c0008e10b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1145923 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Revert "gpu: nvgpu: Disable channel watchdog"	Deepak Nibade	2016-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit edd080b05ab118307c7c7b01426ea1e7c1cc9be7. Re-enable the watchdog since power management races are now resolved Bug 200198908 Change-Id: I74b97e564583aaedd858bc968adcfcaa275ea739 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1165746 GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: export gk20a_free_priv_cmdbuf	Alex Waterman	2016-06-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Export gk20a_free_priv_cmdbuf() so that the channel_sync_gk20a.c code can call this function. This is necessary for error paths in the semaphore wait/incr functions. Bug 1732449 JIRA DNVGPU-12 Change-Id: Id2ea13e5553d50475ee1bbf94781e18590321fdf Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1162686 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Disable channel watchdog	Terje Bergstrom	2016-06-14
\| \| \| \| \| \| \| \| \| \|	Bug 200198908 Change-Id: I4dfb3517f5467f8b5449e65290453ba1c828243d Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1163439 Tested-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Add uapi support for non-graphics engines	Lakshmanan M	2016-06-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Extend the existing NVGPU_GPU_IOCTL_OPEN_CHANNEL interface to allow opening channels for other than the primary (i.e., the graphics) runlists. This is required to push work to dGPU engines that have their own runlists, such as the asynchronous copy engines and the multimedia engines. Minor change - Added active_engines_list allocation and assignment for fifo_vgpu back end. JIRA DNVGPU-25 Change-Id: I3ed377e2c9a2b4dd72e8256463510a62c64e7a8f Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1161541 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Rework the channel timeout handler messages	Alex Waterman	2016-06-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rework how the messages in the channel timeout handler to be a little bit more verbose and more clear about what is happening. Bug 1732449 JIRA DNVGPU-12 Change-Id: Ifc018d99c647b3036caa8ad453e5e3dfc4151396 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1153669 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Remove dead priv_cmdbuf code	Alex Waterman	2016-06-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the gp_get and gp_put pointers from the priv_cmdbuf code. These pointers appear to track the position of th the priv_cmdbuf in the gp_fifo. However, these pointers are not used for anything nor are they needed for anything in the future. This code appears to be a relic left over from the past. Change-Id: Ibed1a6d51fa0cac12c5e0429760e8e2f611fc899 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1161859 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix event id polling	Deepak Nibade	2016-06-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_event_id_poll(), we always set the mask value and return it. This causes poll() from UMD to be always successful irrespective of event is really generated or not Fix this by adding a flag event_posted for each event Set this flag while posting the event In gk20a_event_id_poll(), set the mask value only if this flag is set. If flag is set, set mask and clear the flag Bug 200089620 Change-Id: If14236547c611fe4bfa1410ff5b69c9fa02d43bb Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1160253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add multiple engine and runlist support	Lakshmanan M	2016-06-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL covers the following modification, 1) Added multiple engine_info support 2) Added multiple runlist_info support 3) Initial changes for ASYNC CE support 4) Added ASYNC CE interrupt handling support for gm206 GPU family 5) Added generic mechanism to identify the CE engine pri_base address for gm206 (CE0, CE1 and CE2) 6) Removed hard coded engine_id logic and made generic way 7) Code cleanup for readability JIRA DNVGPU-26 Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1155963 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix TSG abort sequence	Deepak Nibade	2016-06-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_fifo_abort_tsg(), we loop through channels of TSG and call gk20a_channel_abort() for each channel This is incorrect since we disable and preempt each channel separately, whereas we should disable all channels at once and use TSG specific API to preempt TSG Fix this with below sequence : - gk20a_disable_tsg() to disable all channels - preempt tsg if required - for each channel in TSG - set has_timedout flag - call gk20a_channel_abort_clean_up() to clean up channel state Also, separate out common gk20a_channel_abort_clean_up() API which can be called from both channel and TSG abort routines In gk20a_channel_abort(), call gk20a_fifo_abort_tsg() if the channel is part of TSG Add new argument "preempt" to gk20a_fifo_abort_tsg() and preempt TSG if flag is set Bug 200205041 Change-Id: I4eff5394d26fbb53996f2d30b35140b75450f338 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1157190 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use correct APIs for disable and preempt	Deepak Nibade	2016-06-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gr_gk20a_ctx_zcull_setup(), gr_gk20a_update_smpc_ctxsw_mode(), and in gk20a_channel_suspend(), we call channel specific APIs to disable/preempt/enable channel But we do not consider TSGs in this case Hence use correct (below) APIs in above functions which will handle channel or TSG internally : gk20a_disable_channel_tsg() gk20a_fifo_preempt() gk20a_enable_channel_tsg() Bug 200205041 Change-Id: Ieed378dac4ad2322b35f9102706176ec326d386c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1157189 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Free unbound channels	Terje Bergstrom	2016-05-31
\| \| \| \| \| \| \| \| \| \| \|	If a channel cannot be allocated a GPFIFO, its status will remain unbound. We still need to free the channel when the fd gets closed. Bug 1769481 Change-Id: I8a9170f431cfa98dcf9e3cbc082393f02d2203db Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1154725
*	gpu: nvgpu: add tsg support for vgpu	Richard Zhao	2016-05-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- make tsg_gk20a.c call HAL for enable/disable channels - add preempt_tsg HAL callbacks - add tsg bind/unbind channel HAL callbacks - add according tsg callbacks for vgpu Bug 1702773 JIRA VFND-1003 Change-Id: I2cba74b3ebd3920ef09219a168e6433d9574dbe8 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1144932 (cherry picked from commit c3787de7d38651d46969348f5acae2ba86b31ec7) Reviewed-on: http://git-master/r/1126942 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: suppress prints in submit path	Deepak Nibade	2016-05-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we run out of gpfifo space or private command buffer space, we have error spew like below : __gk20a_channel_syncpt_incr: not enough priv cmd buffer space gk20a_submit_channel_gpfifo: fail Dumping these prints to UART cause increase in submit latencies But on these failures, we return -ENOSPC to UMD and then UMD retries the submit, hence it might be unnecessary to dump these prints Hence, remove the error prints of insufficient space and use gk20a_dbg_fn() instead of gk20a_err() to print failure in gk20a_submit_channel_gpfifo() Bug 200202653 Change-Id: I49efd7c6c554dd4fbfa4e66d196eb352e69f92c6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1152378 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update trace for sched params	Thomas Fleury	2016-05-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use an inline function instead of a macro to "expand" all channel parameters. Jira EVLR-244 Jira EVLR-318 Change-Id: I4e8c5ee6bc9da36564af171be809f50dd2dfd439 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1150050 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix engine reset in FECS trace	Thomas Fleury	2016-05-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In virtualization case, VM server is the only one allowed to write to ctxsw ring buffer. It will also generate an event in case of engine reset. Only generate a tracepoint on Guest OS side. EVLR-314 Change-Id: I2cb09780a9b41237fe196ef1f3515923f36a24a4 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1130743 (cherry picked from commit 4bbf9538e2a3375eb86b2feea6c605c3eec2ca40) Reviewed-on: http://git-master/r/1133614 (cherry picked from commit 2076d944db41b37143c27795b3cffd88e99e0b00) Reviewed-on: http://git-master/r/1150046 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add ctxsw channel reset event	Thomas Fleury	2016-05-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generate a ctxsw channel reset when engine needs to be reset. This event is generated by the driver, while other events are generated by FECS. JIRA ELVR-314 Change-Id: I7791cf3e538782464c37c442c871acb177484566 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1129029 (cherry picked from commit 114038a1a5d9e8941bc53f3e95115b01dd1f8c6e) Reviewed-on: http://git-master/r/1134379 (cherry picked from commit 15fa2a7b48a0937dfd449ca0c4ed5ad3a863d6bf) Reviewed-on: http://git-master/r/1123916 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use mem_desc in priv_cmd_entry	Konsta Holtta	2016-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace the plain cpu pointer accesses with gk20a_mem_wr32(), and use a reference to the underlying mem_desc (within priv_cmd_queue) paired with an offset, for buffer aperture flexibility. JIRA DNVGPU-21 JIRA DNVGPU-23 Change-Id: I317672c94bb682bb895f9ed3e8116729c8bb7f4b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1145922 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem	Konsta Holtta	2016-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To support vidmem, pass g and mem_desc to the buffer memory accessor functions. This allows the functions to select the memory access method based on the buffer aperture instead of using the cpu pointer directly (like until now). The selection and aperture support will be in another patch; this patch only refactors these accessors, but keeps the underlying functionality as-is. gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}() for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like functionality, and gk20a_memset() for filling buffers with a constant. The 8 and 16 bit accessor functions are removed. vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to support other types of mappings or conditions where mapping the buffer is unnecessary or different. Several function arguments that would access these buffers are also changed to take a mem_desc instead of a plain cpu pointer. Some relevant occasions are changed to use the accessor functions instead of cpu pointers without them (e.g., memcpying to and from), but the majority of direct accesses will be adjusted later, when the buffers are moved to support vidmem. JIRA DNVGPU-23 Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1121143 Reviewed-by: Ken Adams <kadams@nvidia.com> Tested-by: Ken Adams <kadams@nvidia.com>
*	gpu: nvgpu: separate IOCTL to set preemption mode	Deepak Nibade	2016-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE to allow setting preemption modes from UMD Define preemption modes in nvgpu.h and use them everywhere Remove mode definitions from mm_gk20a.h Also, we support setting only one preemption mode in a channel But it is possible to have multiple preemption modes (one from graphics and one from compute) set simultaneously Hence, update struct gr_ctx_desc to include two separate preemption modes (graphics_preempt_mode and compute_preempt_mode) API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports setting two separate preemption modes i.e. one for graphics and one for compute Make necessary changes in code to support two preemption modes Bug 1646259 Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1131805 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	nvgpu: vgpu: create fifo.force_reset_ch in gpu_ops	Haley Teng	2016-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	gk20a_fifo_force_reset_ch() does not support vgpu now, so we need to create a function pointer in gpu_ops and assign it differently for vgpu and non-vgpu. Bug 200184349 Change-Id: I5f8f4f731b4b970c4ff8de65531f25568e7691b6 Signed-off-by: Haley Teng <hteng@nvidia.com> Reviewed-on: http://git-master/r/1130420 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update priv_cmdbuff computation	Alex Waterman	2016-05-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update the priv_cmdbuff computation to take into account the amount of memory semaphores take. Since semaphores always require more memory than sync-pts the sync-pt computation has been dropped. Bug 1732449 JIRA DNVGPU-12 Change-Id: Ic05c26b4d1ed9cbd03d3239655c4607bb418396c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1141420 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add trace and debugfs for sched params	Thomas Fleury	2016-05-05
\| \| \| \| \| \| \| \| \| \| \| \|	JIRA EVLR-244 JIRA EVLR-318 Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1120603 GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Ken Adams <kadams@nvidia.com>
*	gpu: nvgpu: cancel clean up before update_fn_work	Deepak Nibade	2016-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_channel_suspend(), we first cancel worker thread update_fn_work and then cancel job clean up worker But it is possible that we re-schedule update_fn_work from job cleanup worker after it was cancelled And in that case worker update_fn_work might run after we poweroff GPU Fix this by first cancelling job clean up worker and then update_fn_work worker Bug 200187905 Change-Id: Ia192c515702f14becf60d92c6471d8c0e892551e Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120426 (cherry picked from commit b132b6c5aa9ab88c6733f36906f1b874114ec72d) Reviewed-on: http://git-master/r/1134888 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: return from worker if gpu is not up	Deepak Nibade	2016-05-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During GPU shutdown path, it is possible that we shut down the GPU while worker thread is still running gk20a_channel_update() Hence before accessing gp_put/get, check if GPU is up or not Bug 200166139 Change-Id: Iba3ec173041a84527c4700a93f20564a842cfb01 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/935193 (cherry picked from commit c81ea5fe383c44e872754b363968af57d84225ac) Reviewed-on: http://git-master/r/1121917 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Flush FB before checking semaphores	Alex Waterman	2016-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before checking semaphore values to determine if jobs have been completed flush the FB. If this is not done, despite the sempahore memory being mapped as volatile in the GMMU, outstanding writes can still be pending. Bug 1732449 JIRA DNVGPU-12 Change-Id: I67b596cd23a5465af05d6d173641a579cb7f168c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1133787 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Schedule channel update when jobs actually finish	Alex Waterman	2016-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do not schedule channel update call backs unless a job is actually finished. This saves a lot of call backs to the CDE code that don't do anything when semaphores are enabled. Bug 1732449 JIRA DNVGPU-12 Change-Id: I2f9a78498b08ebca44ee6a5171931a07721767f1 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1133786 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix sparse warnings	Deepak Nibade	2016-04-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix below sparse warnings : drivers/gpu/nvgpu/gk20a/gk20a.c:764:5: warning: symbol 'gk20a_pm_finalize_poweron' was not declared. Should it be static? drivers/gpu/nvgpu/gk20a/channel_gk20a.c:2504:14: warning: symbol 'gk20a_event_id_poll' was not declared. Should it be static? drivers/gpu/nvgpu/gk20a/channel_gk20a.c:2538:5: warning: symbol 'gk20a_event_id_release' was not declared. Should it be static? Bug 200067946 Bug 200088648 Change-Id: I5c23e7ee09c1a18fe2eeff12f80a3c2bf73120ef Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1128060 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sri Krishna Chowdary <schowdary@nvidia.com>
*	gpu: nvgpu: make jobs_lock more fine grained	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While processing all the jobs in gk20a_channel_clean_up_jobs(), We currently acquire jobs_lock, traverse the list, clean up the jobs, and then release the lock But in this case we might hold the lock for too long blocking the submit path Hence make jobs_lock more fine grained by restricting it for list accesses only Bug 200187553 Change-Id: If82af8ff386f7bc29061cfd57fdda7df62f11c17 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120412 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: remove submit lock	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove submit lock since we have moved to use more fine-grained locks Remove API check_gp_put() since we cannot call it in submit path due to latencies and we cannot call it in gk20a_channel_clean_up_jobs() anymore since it will fail there without the lock Bug 200187553 Change-Id: I05b9fa95c9009000e13232d8fa567336eeee11c6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120411 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: implement sync refcounting	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently free sync when we find job list empty If aggressive_sync is set to true, we try to free sync during channel unbind() call But we rarely free sync from channel_unbind() call since freeing it when job list is empty is aggressive enough Hence remove sync free code from channel_unbind() Implement refcounting for sync: - get a refcount while submitting a job (and allocate sync if it is not allocated already) - put a refcount while freeing the job - if refcount==0 and if aggressive_sync_destroy is set, free the sync - if aggressive_sync_destroy is not set, we will free the sync during channel close time Bug 200187553 Change-Id: I74e24adb15dc26a375ebca1fdd017b3ad6d57b61 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120410 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add lock for fences	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All pre/post fence accesses in last_submit are currently protected by submit lock In order to remove the submit lock, move all fence accesses under own lock i.e. fence_lock Bug 200187553 Change-Id: I0132d1933dc92db8c5ed8c9311e49a030aa2d38c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120409 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>