nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: Fix semaphore race condition	Alex Waterman	2016-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A race condition existed in gk20a_channel_semaphore_wait_fd(). In some instances the semaphore underlying the sync_fence being waited on would have already signaled. This would cause the subsequent sync_fence_wait_async() call to return 1 and do nothing. Normally, the sync_fence_wait_async() call would release the newly created semaphore but in the above case that would not happen and hang any channel waiting on that semaphore. To fix this problem if sync_fence_wait_async() returns 1 immediately release the newly created semaphore. Bug 1604892 Change-Id: I1f5e811695bb099f71b7762835aba4a7e27362ec Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/935910 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Fix gk20a_sync_pt_has_signaled()	Alex Waterman	2016-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a reentrancy problem in gk20a_sync_pt_has_signaled() where one thread could clear a pointer before another thread tried to access it. A spinlock was added to the gk20a_sync_pt struct which is used to ensure that the underyling gk20a_sync_pt data is accessed in a sane manner. Bug 1604892 Change-Id: I270d89def7b986405a3167285d51ceda950c7b82 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/935909 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: set set_sm_debug_mode() for gm20b	Deepak Nibade	2016-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Set function pointer gops->gr.set_sm_debug_mode() for gm20b Bug 200168107 Change-Id: I40eebbc55b0f82f793fcea90245ae6dad0f5779c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/935773 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Richard Zhao <rizhao@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add support for therm gate ctrl	Seshendra Gadagottu	2016-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	During gpu init, therm gate control is required to add delay cycles before clock gating. Bug 1717152 Change-Id: Ifabc428cf7b49e49964dc994eba2c38af4aa1a91 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/936443 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: add channel_set_priority support	Richard Zhao	2016-01-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- add gops.fifo.channel_set_priority and move current code as native callback. - implement the callback for vgpu Bug 1701079 Change-Id: If1cd13ea4478d11d578da2f682598e0c4522bcaf Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/932829 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: let gk20a_fifo_preempt call gops callbacks	Richard Zhao	2016-01-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It fixed vgpu regression that vgpu tried to call native channel preemption function. Bug 1617046 Change-Id: Ia5a5486d8b95a34ca6ecc75f8d3b5fea76919405 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/935897 Tested-by: Damian Halas <dhalas@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: pmu version update	Mahantesh Kumbar	2016-01-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- ucode CL http://git-master/r/#/c/935012/ - EXTERR exception for ZBC L2 regsiters access during ELPG entry/exit. FIX : ZBC L2 is not part of GR, so ZBC L2 rigsters save/restore not required for ELPG entry/exit, P4 CL 20360931 - 10 msec as GR_FECS_SUBMIT_METHOD_TIMEOUT_US, P4 CL 20313730 - keep disabled ELCG till Clear DAT_RESTORE interrupt at ELPG exit path, P4 CL 20313676 Bug 1712507 Bug 200166877 Change-Id: I2c9843cfd18cd3b513ee6587d1a79e7034b19cae Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/935019 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: move pmu_load_update() to get_dev_status()	Deepak Nibade	2016-01-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently call gk20a_pmu_load_update() before calling update_devfreq() But it is possible to disable governor and set a constant/max frequency. In that case we will unnecessarily keep executing gk20a_pmu_load_update() for each submit Hence. move gk20a_pmu_load_update() to gk20a_scale_get_dev_status() so that we call gk20a_pmu_load_update() only when we really have to scale the frequency Bug 200161377 Change-Id: Ifac5a659a3a2d088b636f048213c2fbec801bdb9 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/929509 (cherry picked from commit f857a1b31400dfc0c35c58c6424aaac36bc09e7c) Reviewed-on: http://git-master/r/933704 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: move resetup_ramfc() out of sync_lock	Deepak Nibade	2016-01-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently have this sequence : - acquire sync_lock - sync_create - resetup_ramfc() - release sync_lock but this can lead to deadlock in case resetup_ramfc() triggers below stack : - resetup_ramfc() - channel_preempt() - preemption fails - trigger recovery - channel_abort() - acquire sync_lock Fix this by moving resetup_ramfc() out of sync_lock. resetup_ramfc() is still protected by submit_lock and hence we cannot free sync after allocation and before resetup Bug 200165811 Change-Id: Iebf74d950d6f6902b6d180c2cd8cd2d50493062c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/931726 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: correct thermal slowdown factor	Seshendra Gadagottu	2016-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	With extended mode enable, correct thermal slowdown factors to have divideby2, divideby4 and divideby8 slowdown. Bug 1719974 Change-Id: I5723b3972d34de13ffc456195b001fffe9fb56ec Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/933293 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: correct thermal slowdown factor	Seshendra Gadagottu	2016-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	With extended mode enable, correct thermal slowdown factors to have divideby2, divideby4 and divideby8 slowdown. Bug 1719974 Change-Id: I1e3a3f869657ce7c6409851df0ccd1523a06544b Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/933282 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: suspend cde cleanly	Seshendra Gadagottu	2016-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Few times cde is getting deadlocked because of pending cde operation. So do the things cleanly, first suspend cde then do channel suspend. Bug 1709757 Change-Id: Iaf566b63d9efb13aa2691c19e2df676c70f26afc Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/926574 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: bitmap allocator for comptags	Konsta Holtta	2016-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Restore comptags to be bitmap-allocated, like they were before we had the buddy allocator. The new buddy allocator introduced by e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally 6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but unsuitable for the small compbit store. This commit reverts partially the combination of the above commit and also one after it, 86fc7ec9a05999bea8de320840b962db3ee11410, that fixed a bug which is not present when using a bitmap. With a bitmap allocator, pruning the extra allocation necessary for user-mapped mode is possible, so that is also restored. The original generic bitmap allocator is not restored; instead, a comptag-only allocator is introduced. Bug 200145635 Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/837180 (cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2) Reviewed-on: http://git-master/r/839742 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix tsg bugs	Richard Zhao	2016-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- correct runlist entry type for tsg - consider tsg when preempt channel Bug 1617046 Change-Id: Ie067df17fb53ae91c49403637a5f35fc3710e0b3 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/926571 GVS: Gerrit_Virtual_Submit Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Do not readback L2 ZBC RAM	Terje Bergstrom	2016-01-15
\| \| \| \| \| \| \| \| \| \| \| \|	Do not read back L2 ZBC RAM. That can conflict with in-flight transactions causing a live-lock. Change-Id: I6122af48513b5a4b801202dc611eba58ce86aa4d Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/929580 GVS: Gerrit_Virtual_Submit Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
*	drivers: allow selected drivers to async probe	dmitry pervushin	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	List of drivers that now want async probe: - sdhci-tegra - qspi-mtd - nvmap - gk20a - dc Bug 200083391 Change-Id: Ie0a0677961b704c78d4eb2cdab9f0e9a925a3ca1 Reviewed-on: http://git-master/r/923738 (cherry-picked from 75c067e83c7cde2a37c4fae01719e40c5b7d2835) Signed-off-by: dmitry pervushin <dpervushin@nvidia.com> Reviewed-on: http://git-master/r/923121 Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com> Tested-by: Sumeet Gupta <sumeetg@nvidia.com>
*	gpu: nvgpu: return ENOSPC if no private command buffer space	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we run out of gpfifo space or private command buffer space, we currently return EAGAIN as error code Instead of EAGAIN, return ENOSPC as error code so that caller (user space) can read the error code and do some re-trials As the jobs are processed, it is possible to free up some space. And hence such re-trials could succeed Bug 1715291 Change-Id: I9a2ed7134d2496b383899b3c02c0e70452b26115 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/929402 Reviewed-by: Sachin Nikam <snikam@nvidia.com> Tested-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: enable wdt for each channel open	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently enable per-channel wdt flag at channel initialization time only But if process disables channel's wdt via per-channel IOCTL, and closes the channel without re-enabling it, we leave the wdt disabled on that channel And if same channel is assigned to some other process, then that process might have wdt disabled already Fix this by setting ch->wdt_enabled = true during gk20a_open_new_channel() Bug 200165797 Change-Id: I3ab482ce7cfbcbbd2178041f01f97457ff24f7bb Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/931128 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sachin Nikam <snikam@nvidia.com> Tested-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: API to push fecs sideband methods	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new API gr_gk20a_submit_fecs_sideband_method_op() to support pushing fecs sideband methods Bug 200156699 Change-Id: Ibacd7d03e05b3b67416aa2148a741ffc6e2215c9 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/927135 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support masking hww_warp_esr	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below API pointer to support masking of hww_warp_esr after hardware read of register and before using it further u32 (*mask_hww_warp_esr)(u32 hww_warp_esr) If needed, this API will mask value of hww_warp_esr appropriately and return it Bug 200156699 Change-Id: I1afb1347e650fab607009c1ee55691484653a4c1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/927133 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: API to post channel events	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new API gk20a_channel_post_event() which adds channel event and also calls wake_up() for channel's semaphore wq Bug 200156699 Change-Id: If56f1bf8edcce79c9248809f8476ed853b7d2d9d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/927132 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: APIs to enable/disable TSG	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	export below APIs for TSGs : gk20a_enable_tsg() - enable only TSG gk20a_disable_tsg() - disable only TSG gk20a_enable_channel_tsg() - if channel is part of TSG, enable TSG otherwise enable channel gk20a_disable_channel_tsg() - if channel is part of TSG, disable TSG otherwise disable channel Bug 200156699 Change-Id: Icdaca35235c3f323687f839fe32c6c5fe964b230 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/927131 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: API to extract context id	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add new API gr_gk20a_get_ctx_id() to get/extract context id from GR context Bug 200156699 Change-Id: If0e8887a9a6b139cd795bf03f5def64fd664d12b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/927130 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: APIs to suspend/resume single SM	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below APIs to suspend or resume single SM : gk20a_suspend_single_sm() gk20a_resume_single_sm() Also, update gk20a_suspend_all_sms() to make it more generic by passing global_esr_mask and check_errors flag as parameter Bug 200156699 Change-Id: If40f4bcae74a8132673b4dca10b7d9898f23c164 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/925884 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support preprocessing of SM exceptions	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support preprocessing of SM exceptions if API pointer pre_process_sm_exception() is defined Also, expose some common APIs Bug 200156699 Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/925883 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: stop timer on failing channel	Deepak Nibade	2016-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_channel_timeout_handler(), below deadlock scenario is possible : thread 1: - take global lock g->ch_wdt_lock - identify timed out channel (as ch1) - check engine status which is stuck - identify failing channel on engine as ch2 - we need to trigger recovery with ch2 - as part of recovery, call channel_abort() for ch2 - in channel_abort(), we wait to cancel the timer wq - but timer wq for ch2 never completes due to thread 2 thread 2: - ch2 has already timed out - to process, we wait for global lock g->ch_wdt_lock - this lock needs to be released by thread 1 To fix this, cancel the timer (through flag) of ch2 (failing channel on engine) before triggering recovery on that channel Bug 200164753 Change-Id: Idb42d01c8440a53f43cb5e87e41f1c283f7e8fcf Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/929924 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: disable ctxsw instead of all engines activity	Deepak Nibade	2016-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_channel_timeout_handler(), we currently disable all engine activity before checking for fence completion and before we identify timed out channel But disabling all engine activity could be overkill for this process. Also, as part of disabling engine activity we preempt the channel on engine. But it is possible that channel preemption times out since channel has already timed out And this can lead to races and deadlock Hence, instead of disabling all engine activity, just disable the context switch which should also do the same trick Bug 1716062 Change-Id: I596515ed670a2e134f7bcd9758488a4aa0bf16f7 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/929421 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add high priority channel interleave	Peter Pipkorn	2016-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interleave all high priority channels between all other channels. This reduces the latency for high priority work when there are a lot of lower priority work present, imposing an upper bound on the latency. Change the default high priority timeslice from 5.2ms to 3.0 in the process, to prevent long running high priority apps from hogging the GPU too much. Introduce a new debugfs node to enable/disable high priority channel interleaving. It is currently enabled by default. Adds new runlist length max register, used for allocating suitable sized runlist. Limit the number of interleaved channels to 32. This change reduces the maximum time a lower priority job is running (one timeslice) before we check that high priority jobs are running. Tested with gles2_context_priority (still passes) Basic sanity testing is done with graphics_submit (one app is high priority) Also more functional testing using lots of parallel runs with: NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 –finish plus multiple: NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 -finish Previous to this change, the relative performance between high priority work and normal priority work comes down to timeslice value. This means that when there are many low priority channels, the high priority work will still drop quite a lot. But with this change, the high priority work will roughly get about half the entire GPU time, meaning that after the initial lower performance, it is less likely to get lower in performance due to more apps running on the system. This change makes a large step towards real priority levels. It is not perfect and there are no guarantees on anything, but it is a step forwards without any additional CPU overhead or other complications. It will also serve as a baseline to judge other algorithms against. Support for priorities with TSG is future work. Support for interleave mid + high priority channels, instead of just high, is also future work. Bug 1419900 Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98 Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com> Reviewed-on: http://git-master/r/805961 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: enable semaphore acquire timeout	Richard Zhao	2016-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It'll detect dead semaphore acquire. The worst case is when ACQUIRE_SWITCH is disabled, semaphore acquire will poll and consume full gpu timeslicees. The timeout value is set to half of channel WDT. Bug 1636800 Change-Id: Ida6ccc534006a191513edf47e7b82d4b5b758684 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/928827 GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: vgpu: add regops support	Richard Zhao	2016-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added new RM Server command for regops. JIRA VFND-1128 Bug 1700139 Change-Id: Ia1cc63e993c29c91f87440c241077fa91edb9e53 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/923235 (cherry picked from commit 7de22e42cfd2e419ad64178b9f1f1ee16273bd03) Reviewed-on: http://git-master/r/841330 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: vgpu: add SM exception support	Richard Zhao	2016-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When TEGRA_VGPU_GR_INTR_SM_EXCEPTION comes, post debugger event. Bug 1594604 JIRA VFND-1120 Change-Id: I7229c3994220a7c6f117d38a1af2e766187a47c6 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/923234 (cherry picked from commit bdd414d9366133380a202d88b1a50038b70c068d) Reviewed-on: http://git-master/r/840646 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: vgpu: add set sm debug mode support	Richard Zhao	2016-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	JIRA VFND-1006 Bug 1594604 Change-Id: If6eb7ae22b5b0557faddd3d68deb791abb24bec4 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/923233 (cherry picked from commit 9e14ca393c3044be702c50524a9ef3a2c3a6270c) Reviewed-on: http://git-master/r/841866 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: abstract set sm debug mode	Richard Zhao	2016-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new operation g->ops.gr.set_sm_debug_mode and move native implementation to gr_gk20a.c It's preparing for adding vgpu set sm debug mode hook. JIRA VFND-1006 Bug 1594604 Change-Id: Ia5ca06a86085a690e70bfa9c62f57ec3830ea933 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/923232 (cherry picked from commit 032552b54c570952d1e36c08191e9f70b9c59447) Reviewed-on: http://git-master/r/835614 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	Revert "gpu: nvgpu: Enable ELPG when disabled due to reset"	Seshendra Gadagottu	2016-01-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit f6ab5bd17d16f3605b78c3c2ee80513d5823c594. Fix for graphics_submit regresssion. Bug 200164812 Change-Id: I5e37b8263758ee389cdba3ec6e3758afbdd9c910 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/929605 Tested-by: Hoang Pham <hopham@nvidia.com> Reviewed-by: Hoang Pham <hopham@nvidia.com>
*	gpu: nvgpu: Control comptagline assignment from kernel	Terje Bergstrom	2016-01-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Maxwell comptaglines are assigned per 128k, but preferred big page size for graphics is 64k. Bit 16 of GPU VA is used for determining which half of comptagline is used. This creates problems if user space wants to map a page multiple times and to arbitrary GPU VA. In one mapping the page might be mapped to lower half of 128k comptagline, and in another mapping the page might be mapped to upper half. Turn on mode where MSB of comptagline in PTE is used instead of bit 16 for determining the comptagline lower/upper half selection. Bug 1704834 Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/924322 Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: disable secure allocations on linsim	Deepak Nibade	2016-01-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Disable all secure allocations on linsim by returning an error from gk20a_tegra_secure_page_alloc() With this failure, no more secure allocations will be done from nvgpu Bug 200163671 Change-Id: I26604e45a684dde29c092dc34cc89259f5de5d91 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/928280 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
*	gpu: nvgpu: Enable ELPG when disabled due to reset	Mahantesh Kumbar	2016-01-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable ELPG back whenever ELPG disable is done due to reset or recovery. Otherwise elpg_refcnt mismatch doesn’t engage ELPG correctly Bug 200156347 Change-Id: Ic01f85b9e1eff10cfb9cb180b50b045f67d4b33c Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/925763 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: use jiffies for wait on PMU	Vijayakumar	2015-12-30
\| \| \| \| \| \| \| \| \| \| \|	bug 200157852 Change-Id: Ib5ab6ed5f3d8356efd527ce5ff6e4134ac60da7d Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/921711 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
*	gpu: nvgpu: Add comptag offset to part mappings	Terje Bergstrom	2015-12-14
\| \| \| \| \| \| \| \| \| \|	Add offset to comptags when mapping partial buffers. Bug 1704834 Change-Id: I3405b465bb1373bcc79eb5ecbd93dd1b866abfb4 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/837401
*	gpu: nvgpu: Add 3 functions to regops interface.	Ashutosh Jain	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds the following IOCTLS: - NVGPU_GPU_IOCTL_RESUME_FROM_PAUSE - NVGPU_GPU_IOCTL_TRIGGER_SUSPEND - NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS Bug 1619430 Change-Id: Iac37d515a753d8b799e631224eae2fa168b43e2c Signed-off-by: ashutosh jain <ashutoshj@nvidia.com> Reviewed-on: http://git-master/r/921378 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: make local function static	Amit Sharma	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed the following sparse warning by restricting the scope of API's within file. - gr_gk20a.c:7273: warning: symbol 'gr_gk20a_bpt_reg_info' was not declared. Should it be static? - gr_gm20b.c:1053: warning: symbol 'gr_gm20b_bpt_reg_info' was not declared. Should it be static? Bug 200088648 Change-Id: I63bba55b1432e4284c9074d2729a176f1767a83a Signed-off-by: Amit Sharma <amisharma@nvidia.com> Reviewed-on: http://git-master/r/842260 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: fix possible NULL dereference	Deepak Nibade	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \|	Coverity id : 20260 Bug 1416640 Change-Id: I6ca8df2ed001df99ad46e476b1fe4de9f1346786 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/922362 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: fix dereference after null check	Deepak Nibade	2015-12-11
\| \| \| \| \| \| \| \| \| \| \| \|	Coverity id : 20234 Bug 1703084 Change-Id: I7b82a44dde39bf6b811ae144815c0c45468aa740 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/921326 Reviewed-by: Sachin Nikam <snikam@nvidia.com> Tested-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: preempt before adjusting fences	Deepak Nibade	2015-12-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current sequence in gk20a_disable_channel() is - disable channel in gk20a_channel_abort() - adjust pending fence in gk20a_channel_abort() - preempt channel But this leads to scenarios where syncpoint has min > max value Hence to fix this, make sequence in gk20a_disable_channel() - disable channel in gk20a_channel_abort() - preempt channel in gk20a_channel_abort() - adjust pending fence in gk20a_channel_abort() If gk20a_channel_abort() is called from other API where preemption is not needed, then use channel_preempt flag and do not preempt channel in those cases Bug 1683059 Change-Id: I4d46d4294cf8597ae5f05f79dfe1b95c4187f2e3 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/921290 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: call channel_update() always in channel_abort()	Deepak Nibade	2015-12-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_channel_abort(), we disable the channel and adjust the fences. And then we call gk20a_channel_update() only if we released the post-fence semaphore Instead of this, call gk20a_channel_update() always to ensure that all the clean up after fence completion is done always Bug 1683059 Change-Id: Ife00ba43e808c0833264d79c98c74417ffadf802 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/842999 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Immediate channel release	Terje Bergstrom	2015-12-10
\| \| \| \| \| \| \| \| \| \| \| \|	When closing channel, disable and preempt it immediately instead of waiting for it to finish all work. Bug 1683059 Change-Id: Ia5f5fc6a072dc3ddb1e9bf63534814ff0a60b5b4 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/836746
*	gpu: nvgpu: optimize map calls in patch smpc	Konsta Holtta	2015-12-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Copy the necessary ctx patching prologues and epilogues from patch_write to gr_gk20a_exec_ctx_ops next to mapping of gr ctx, around a loop that applies patches multiple times, in order to optimize the number of maps/unmaps on the channel's patch context. Bug 200075565 Change-Id: I7125be5c778192d639f0bbed1731bb900c7015da Signed-off-by: Konsta Holtta <kholtta@nvidia.com> (cherry picked from commit TODO-FILL-THIS-WHEN-MERGED) Reviewed-on: http://git-master/r/839203 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add API to extract GPU timeout mode	Chris Dragan	2015-12-09
\| \| \| \| \| \| \| \| \| \| \|	Bug 1706457 Change-Id: Iab76bcb7cabc55d99b5acd932716d30da6f01b46 Signed-off-by: Chris Dragan <kdragan@nvidia.com> Reviewed-on: http://git-master/r/835852 Reviewed-on: http://git-master/r/836454 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add grad slowdown HW register masks	Terje Bergstrom	2015-12-08
\| \| \| \| \| \| \| \| \| \|	Change-Id: I8d64c36d569e79cad3648bad248624290319ac2d Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/839367 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
*	gpu: nvgpu: create sync_fence only if needed	Deepak Nibade	2015-12-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we create sync_fence (from nvhost_sync_create_fence()) for every submit But not all submits request for a sync_fence. Also, nvhost_sync_create_fence() API takes about 1/3rd of the total submit path. Hence to optimize, we can allocate sync_fence only when user explicitly asks for it using (NVGPU_SUBMIT_GPFIFO_FLAGS_FENCE_GET && NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE) Also, in CDE path from gk20a_prepare_compressible_read(), we reuse existing fence stored in "state" and that can result into not returning sync_fence_fd when user asked for it Hence, force allocation of sync_fence when job submission comes from CDE path Bug 200141116 Change-Id: Ia921701bf0e2432d6b8a5e8b7d91160e7f52db1e Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/812845 (cherry picked from commit 5fd47015eeed00352cc8473eff969a66c94fee98) Reviewed-on: http://git-master/r/837662 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>