| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A race condition existed in gk20a_channel_semaphore_wait_fd().
In some instances the semaphore underlying the sync_fence being
waited on would have already signaled. This would cause the
subsequent sync_fence_wait_async() call to return 1 and do
nothing. Normally, the sync_fence_wait_async() call would
release the newly created semaphore but in the above case that
would not happen and hang any channel waiting on that semaphore.
To fix this problem if sync_fence_wait_async() returns 1
immediately release the newly created semaphore.
Bug 1604892
Change-Id: I1f5e811695bb099f71b7762835aba4a7e27362ec
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/935910
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a reentrancy problem in gk20a_sync_pt_has_signaled()
where one thread could clear a pointer before another
thread tried to access it. A spinlock was added to the
gk20a_sync_pt struct which is used to ensure that the
underyling gk20a_sync_pt data is accessed in a sane
manner.
Bug 1604892
Change-Id: I270d89def7b986405a3167285d51ceda950c7b82
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/935909
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set function pointer gops->gr.set_sm_debug_mode()
for gm20b
Bug 200168107
Change-Id: I40eebbc55b0f82f793fcea90245ae6dad0f5779c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935773
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During gpu init, therm gate control is required to
add delay cycles before clock gating.
Bug 1717152
Change-Id: Ifabc428cf7b49e49964dc994eba2c38af4aa1a91
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/936443
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- add gops.fifo.channel_set_priority and move current code
as native callback.
- implement the callback for vgpu
Bug 1701079
Change-Id: If1cd13ea4478d11d578da2f682598e0c4522bcaf
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/932829
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It fixed vgpu regression that vgpu tried to call native
channel preemption function.
Bug 1617046
Change-Id: Ia5a5486d8b95a34ca6ecc75f8d3b5fea76919405
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/935897
Tested-by: Damian Halas <dhalas@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- ucode CL http://git-master/r/#/c/935012/
- EXTERR exception for ZBC L2 regsiters access
during ELPG entry/exit.
FIX : ZBC L2 is not part of GR, so ZBC L2 rigsters
save/restore not required for ELPG entry/exit,
P4 CL 20360931
- 10 msec as GR_FECS_SUBMIT_METHOD_TIMEOUT_US, P4 CL 20313730
- keep disabled ELCG till Clear DAT_RESTORE
interrupt at ELPG exit path, P4 CL 20313676
Bug 1712507
Bug 200166877
Change-Id: I2c9843cfd18cd3b513ee6587d1a79e7034b19cae
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/935019
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently call gk20a_pmu_load_update() before calling
update_devfreq()
But it is possible to disable governor and set a
constant/max frequency.
In that case we will unnecessarily keep executing
gk20a_pmu_load_update() for each submit
Hence. move gk20a_pmu_load_update() to gk20a_scale_get_dev_status()
so that we call gk20a_pmu_load_update() only when we really
have to scale the frequency
Bug 200161377
Change-Id: Ifac5a659a3a2d088b636f048213c2fbec801bdb9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/929509
(cherry picked from commit f857a1b31400dfc0c35c58c6424aaac36bc09e7c)
Reviewed-on: http://git-master/r/933704
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently have this sequence :
- acquire sync_lock
- sync_create
- resetup_ramfc()
- release sync_lock
but this can lead to deadlock in case resetup_ramfc()
triggers below stack :
- resetup_ramfc()
- channel_preempt() - preemption fails
- trigger recovery
- channel_abort()
- acquire sync_lock
Fix this by moving resetup_ramfc() out of sync_lock.
resetup_ramfc() is still protected by submit_lock and
hence we cannot free sync after allocation and
before resetup
Bug 200165811
Change-Id: Iebf74d950d6f6902b6d180c2cd8cd2d50493062c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/931726
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With extended mode enable, correct thermal slowdown factors
to have divideby2, divideby4 and divideby8 slowdown.
Bug 1719974
Change-Id: I5723b3972d34de13ffc456195b001fffe9fb56ec
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/933293
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With extended mode enable, correct thermal slowdown factors
to have divideby2, divideby4 and divideby8 slowdown.
Bug 1719974
Change-Id: I1e3a3f869657ce7c6409851df0ccd1523a06544b
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/933282
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Few times cde is getting deadlocked because of pending cde
operation. So do the things cleanly, first suspend cde then
do channel suspend.
Bug 1709757
Change-Id: Iaf566b63d9efb13aa2691c19e2df676c70f26afc
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/926574
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Restore comptags to be bitmap-allocated, like they were before we had
the buddy allocator.
The new buddy allocator introduced by
e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally
6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but
unsuitable for the small compbit store.
This commit reverts partially the combination of the above commit and
also one after it, 86fc7ec9a05999bea8de320840b962db3ee11410, that fixed
a bug which is not present when using a bitmap. With a bitmap allocator,
pruning the extra allocation necessary for user-mapped mode is possible,
so that is also restored.
The original generic bitmap allocator is not restored; instead, a
comptag-only allocator is introduced.
Bug 200145635
Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/837180
(cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2)
Reviewed-on: http://git-master/r/839742
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- correct runlist entry type for tsg
- consider tsg when preempt channel
Bug 1617046
Change-Id: Ie067df17fb53ae91c49403637a5f35fc3710e0b3
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/926571
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not read back L2 ZBC RAM. That can conflict with in-flight
transactions causing a live-lock.
Change-Id: I6122af48513b5a4b801202dc611eba58ce86aa4d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/929580
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
List of drivers that now want async probe:
- sdhci-tegra
- qspi-mtd
- nvmap
- gk20a
- dc
Bug 200083391
Change-Id: Ie0a0677961b704c78d4eb2cdab9f0e9a925a3ca1
Reviewed-on: http://git-master/r/923738
(cherry-picked from 75c067e83c7cde2a37c4fae01719e40c5b7d2835)
Signed-off-by: dmitry pervushin <dpervushin@nvidia.com>
Reviewed-on: http://git-master/r/923121
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
Tested-by: Sumeet Gupta <sumeetg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we run out of gpfifo space or private command
buffer space, we currently return EAGAIN as error code
Instead of EAGAIN, return ENOSPC as error code so
that caller (user space) can read the error code
and do some re-trials
As the jobs are processed, it is possible to free up
some space. And hence such re-trials could succeed
Bug 1715291
Change-Id: I9a2ed7134d2496b383899b3c02c0e70452b26115
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/929402
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Tested-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently enable per-channel wdt flag at channel
initialization time only
But if process disables channel's wdt via per-channel
IOCTL, and closes the channel without re-enabling it,
we leave the wdt disabled on that channel
And if same channel is assigned to some other process,
then that process might have wdt disabled already
Fix this by setting ch->wdt_enabled = true during
gk20a_open_new_channel()
Bug 200165797
Change-Id: I3ab482ce7cfbcbbd2178041f01f97457ff24f7bb
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/931128
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Tested-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new API gr_gk20a_submit_fecs_sideband_method_op()
to support pushing fecs sideband methods
Bug 200156699
Change-Id: Ibacd7d03e05b3b67416aa2148a741ffc6e2215c9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/927135
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add below API pointer to support masking of hww_warp_esr
after hardware read of register and before using it
further
u32 (*mask_hww_warp_esr)(u32 hww_warp_esr)
If needed, this API will mask value of hww_warp_esr
appropriately and return it
Bug 200156699
Change-Id: I1afb1347e650fab607009c1ee55691484653a4c1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/927133
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new API gk20a_channel_post_event() which adds
channel event and also calls wake_up() for channel's
semaphore wq
Bug 200156699
Change-Id: If56f1bf8edcce79c9248809f8476ed853b7d2d9d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/927132
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
export below APIs for TSGs :
gk20a_enable_tsg() - enable only TSG
gk20a_disable_tsg() - disable only TSG
gk20a_enable_channel_tsg() -
if channel is part of TSG, enable TSG
otherwise enable channel
gk20a_disable_channel_tsg() -
if channel is part of TSG, disable TSG
otherwise disable channel
Bug 200156699
Change-Id: Icdaca35235c3f323687f839fe32c6c5fe964b230
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/927131
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new API gr_gk20a_get_ctx_id() to get/extract
context id from GR context
Bug 200156699
Change-Id: If0e8887a9a6b139cd795bf03f5def64fd664d12b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/927130
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add below APIs to suspend or resume single SM :
gk20a_suspend_single_sm()
gk20a_resume_single_sm()
Also, update gk20a_suspend_all_sms() to make it
more generic by passing global_esr_mask and
check_errors flag as parameter
Bug 200156699
Change-Id: If40f4bcae74a8132673b4dca10b7d9898f23c164
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/925884
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support preprocessing of SM exceptions if API
pointer pre_process_sm_exception() is defined
Also, expose some common APIs
Bug 200156699
Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/925883
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_channel_timeout_handler(), below deadlock scenario
is possible :
thread 1:
- take global lock g->ch_wdt_lock
- identify timed out channel (as ch1)
- check engine status which is stuck
- identify failing channel on engine as ch2
- we need to trigger recovery with ch2
- as part of recovery, call channel_abort() for ch2
- in channel_abort(), we wait to cancel the timer wq
- but timer wq for ch2 never completes due to thread 2
thread 2:
- ch2 has already timed out
- to process, we wait for global lock g->ch_wdt_lock
- this lock needs to be released by thread 1
To fix this, cancel the timer (through flag) of ch2
(failing channel on engine) before triggering recovery
on that channel
Bug 200164753
Change-Id: Idb42d01c8440a53f43cb5e87e41f1c283f7e8fcf
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/929924
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_channel_timeout_handler(), we currently disable
all engine activity before checking for fence completion
and before we identify timed out channel
But disabling all engine activity could be overkill for
this process.
Also, as part of disabling engine activity we preempt
the channel on engine.
But it is possible that channel preemption times out
since channel has already timed out
And this can lead to races and deadlock
Hence, instead of disabling all engine activity, just
disable the context switch which should also do the
same trick
Bug 1716062
Change-Id: I596515ed670a2e134f7bcd9758488a4aa0bf16f7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/929421
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Interleave all high priority channels between all other channels.
This reduces the latency for high priority work when there
are a lot of lower priority work present, imposing an upper
bound on the latency. Change the default high priority timeslice
from 5.2ms to 3.0 in the process, to prevent long running high priority
apps from hogging the GPU too much.
Introduce a new debugfs node to enable/disable high priority
channel interleaving. It is currently enabled by default.
Adds new runlist length max register, used for allocating
suitable sized runlist.
Limit the number of interleaved channels to 32.
This change reduces the maximum time a lower priority job
is running (one timeslice) before we check that high priority
jobs are running.
Tested with gles2_context_priority (still passes)
Basic sanity testing is done with graphics_submit
(one app is high priority)
Also more functional testing using lots of parallel runs with:
NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw
–drawsperframe 20000 –triangles 50 –runtime 30 –finish
plus multiple:
NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw
–drawsperframe 20000 –triangles 50 –runtime 30 -finish
Previous to this change, the relative performance between
high priority work and normal priority work comes down
to timeslice value. This means that when there are many
low priority channels, the high priority work will still
drop quite a lot. But with this change, the high priority
work will roughly get about half the entire GPU time, meaning
that after the initial lower performance, it is less likely
to get lower in performance due to more apps running on the system.
This change makes a large step towards real priority levels.
It is not perfect and there are no guarantees on anything,
but it is a step forwards without any additional CPU overhead
or other complications. It will also serve as a baseline to
judge other algorithms against.
Support for priorities with TSG is future work.
Support for interleave mid + high priority channels,
instead of just high, is also future work.
Bug 1419900
Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98
Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com>
Reviewed-on: http://git-master/r/805961
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It'll detect dead semaphore acquire. The worst case is when
ACQUIRE_SWITCH is disabled, semaphore acquire will poll and
consume full gpu timeslicees.
The timeout value is set to half of channel WDT.
Bug 1636800
Change-Id: Ida6ccc534006a191513edf47e7b82d4b5b758684
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/928827
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added new RM Server command for regops.
JIRA VFND-1128
Bug 1700139
Change-Id: Ia1cc63e993c29c91f87440c241077fa91edb9e53
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/923235
(cherry picked from commit 7de22e42cfd2e419ad64178b9f1f1ee16273bd03)
Reviewed-on: http://git-master/r/841330
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When TEGRA_VGPU_GR_INTR_SM_EXCEPTION comes, post
debugger event.
Bug 1594604
JIRA VFND-1120
Change-Id: I7229c3994220a7c6f117d38a1af2e766187a47c6
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/923234
(cherry picked from commit bdd414d9366133380a202d88b1a50038b70c068d)
Reviewed-on: http://git-master/r/840646
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
JIRA VFND-1006
Bug 1594604
Change-Id: If6eb7ae22b5b0557faddd3d68deb791abb24bec4
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/923233
(cherry picked from commit 9e14ca393c3044be702c50524a9ef3a2c3a6270c)
Reviewed-on: http://git-master/r/841866
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new operation g->ops.gr.set_sm_debug_mode and move native
implementation to gr_gk20a.c
It's preparing for adding vgpu set sm debug mode hook.
JIRA VFND-1006
Bug 1594604
Change-Id: Ia5ca06a86085a690e70bfa9c62f57ec3830ea933
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/923232
(cherry picked from commit 032552b54c570952d1e36c08191e9f70b9c59447)
Reviewed-on: http://git-master/r/835614
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit f6ab5bd17d16f3605b78c3c2ee80513d5823c594.
Fix for graphics_submit regresssion.
Bug 200164812
Change-Id: I5e37b8263758ee389cdba3ec6e3758afbdd9c910
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/929605
Tested-by: Hoang Pham <hopham@nvidia.com>
Reviewed-by: Hoang Pham <hopham@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Maxwell comptaglines are assigned per 128k, but preferred big page
size for graphics is 64k. Bit 16 of GPU VA is used for determining
which half of comptagline is used.
This creates problems if user space wants to map a page multiple times
and to arbitrary GPU VA. In one mapping the page might be mapped to
lower half of 128k comptagline, and in another mapping the page might
be mapped to upper half.
Turn on mode where MSB of comptagline in PTE is used instead of bit 16
for determining the comptagline lower/upper half selection.
Bug 1704834
Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/924322
Reviewed-by: Alex Waterman <alexw@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Disable all secure allocations on linsim by returning
an error from gk20a_tegra_secure_page_alloc()
With this failure, no more secure allocations will be
done from nvgpu
Bug 200163671
Change-Id: I26604e45a684dde29c092dc34cc89259f5de5d91
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/928280
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable ELPG back whenever ELPG disable is done due to reset or recovery.
Otherwise elpg_refcnt mismatch doesn’t engage ELPG correctly
Bug 200156347
Change-Id: Ic01f85b9e1eff10cfb9cb180b50b045f67d4b33c
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/925763
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
bug 200157852
Change-Id: Ib5ab6ed5f3d8356efd527ce5ff6e4134ac60da7d
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/921711
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Add offset to comptags when mapping partial buffers.
Bug 1704834
Change-Id: I3405b465bb1373bcc79eb5ecbd93dd1b866abfb4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/837401
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds the following IOCTLS:
- NVGPU_GPU_IOCTL_RESUME_FROM_PAUSE
- NVGPU_GPU_IOCTL_TRIGGER_SUSPEND
- NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS
Bug 1619430
Change-Id: Iac37d515a753d8b799e631224eae2fa168b43e2c
Signed-off-by: ashutosh jain <ashutoshj@nvidia.com>
Reviewed-on: http://git-master/r/921378
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed the following sparse warning by restricting the scope of
API's within file.
- gr_gk20a.c:7273: warning: symbol 'gr_gk20a_bpt_reg_info' was not declared.
Should it be static?
- gr_gm20b.c:1053: warning: symbol 'gr_gm20b_bpt_reg_info' was not declared.
Should it be static?
Bug 200088648
Change-Id: I63bba55b1432e4284c9074d2729a176f1767a83a
Signed-off-by: Amit Sharma <amisharma@nvidia.com>
Reviewed-on: http://git-master/r/842260
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity id : 20260
Bug 1416640
Change-Id: I6ca8df2ed001df99ad46e476b1fe4de9f1346786
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/922362
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity id : 20234
Bug 1703084
Change-Id: I7b82a44dde39bf6b811ae144815c0c45468aa740
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/921326
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Tested-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current sequence in gk20a_disable_channel() is
- disable channel in gk20a_channel_abort()
- adjust pending fence in gk20a_channel_abort()
- preempt channel
But this leads to scenarios where syncpoint has
min > max value
Hence to fix this, make sequence in gk20a_disable_channel()
- disable channel in gk20a_channel_abort()
- preempt channel in gk20a_channel_abort()
- adjust pending fence in gk20a_channel_abort()
If gk20a_channel_abort() is called from other API where
preemption is not needed, then use channel_preempt
flag and do not preempt channel in those cases
Bug 1683059
Change-Id: I4d46d4294cf8597ae5f05f79dfe1b95c4187f2e3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/921290
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_channel_abort(), we disable the channel and
adjust the fences. And then we call gk20a_channel_update()
only if we released the post-fence semaphore
Instead of this, call gk20a_channel_update() always to ensure
that all the clean up after fence completion is done always
Bug 1683059
Change-Id: Ife00ba43e808c0833264d79c98c74417ffadf802
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/842999
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When closing channel, disable and preempt it immediately instead of
waiting for it to finish all work.
Bug 1683059
Change-Id: Ia5f5fc6a072dc3ddb1e9bf63534814ff0a60b5b4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/836746
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Copy the necessary ctx patching prologues and epilogues from patch_write
to gr_gk20a_exec_ctx_ops next to mapping of gr ctx, around a loop that
applies patches multiple times, in order to optimize the number of
maps/unmaps on the channel's patch context.
Bug 200075565
Change-Id: I7125be5c778192d639f0bbed1731bb900c7015da
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
(cherry picked from commit TODO-FILL-THIS-WHEN-MERGED)
Reviewed-on: http://git-master/r/839203
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 1706457
Change-Id: Iab76bcb7cabc55d99b5acd932716d30da6f01b46
Signed-off-by: Chris Dragan <kdragan@nvidia.com>
Reviewed-on: http://git-master/r/835852
Reviewed-on: http://git-master/r/836454
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I8d64c36d569e79cad3648bad248624290319ac2d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/839367
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we create sync_fence (from nvhost_sync_create_fence())
for every submit
But not all submits request for a sync_fence.
Also, nvhost_sync_create_fence() API takes about 1/3rd of the total
submit path.
Hence to optimize, we can allocate sync_fence
only when user explicitly asks for it using
(NVGPU_SUBMIT_GPFIFO_FLAGS_FENCE_GET &&
NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE)
Also, in CDE path from gk20a_prepare_compressible_read(),
we reuse existing fence stored in "state" and that can
result into not returning sync_fence_fd when user asked
for it
Hence, force allocation of sync_fence when job submission
comes from CDE path
Bug 200141116
Change-Id: Ia921701bf0e2432d6b8a5e8b7d91160e7f52db1e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/812845
(cherry picked from commit 5fd47015eeed00352cc8473eff969a66c94fee98)
Reviewed-on: http://git-master/r/837662
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|