| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
| |
Setting max_ways_evict reserves some of L2 for CB. In gk20a CB is in
dedicated RAM, so we don't need to reserve space for it.
The code gets invoked only on gk20a.
Change-Id: Ib8efec8c5e90c135bd0c10bb1eaa3f797ec68698
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1144993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added support for multiple PBDMAs handling during
fifo_pbdma_isr and gk20a_init_fifo_reset_enable_hw
use case.
JIRA DNVGPU-26
Change-Id: I5f013c5373f7a4b80a8de8863f0e175576ed4c22
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1145591
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enabling L2 CYA15 is necessary only in another GPU for enabling
an HW fix. gk20a does not have this problem, so enabling CYA15
is not necessary.
Change-Id: I7318e8541ad392f9a34f3650beac05a39d7bba68
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1146086
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(like until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.
gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8 and 16 bit accessor functions are removed.
vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.
Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some relevant
occasions are changed to use the accessor functions instead of cpu
pointers without them (e.g., memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.
JIRA DNVGPU-23
Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the offset calculation code to save/restore proper number of LTCs
to conform to FECS microcode
Bug 1736910
Change-Id: I46ae1e0b45ffe354693db6ceb0aeeeac3d344b34
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1133896
(cherry picked from commit 97751ee7a44537f5aa5198ac362d9ee416adc172)
Reviewed-on: http://git-master/r/1146244
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
When programming GR ZROP and CROP settings for number of active LTCs
we use FB count, which is always 1. Use LTC count instead.
Change-Id: I4d6f9b2b59d44b82ba098b243d6390cf321dbc36
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1144698
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- add fuse_override in gops. Implement it starting from gm20b.
- set cwd fs register, so cuda won't use disabled TPCs
Bug 1757262
Bug 200169697
Change-Id: If7bac58bd3a6bcf2925197ea5b7c2d10a77e0933
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
(cherry picked from commit 66cde7724815e9e5e85ab9b07fc985a78530222f)
Reviewed-on: http://git-master/r/1132177
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Tested-by: Adeel Raza <araza@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Skip initialization of ctxsw tracing if it's not supported for that
chip.
Change-Id: I5197295c2a5d0f039347e1784137b3fd99894dc7
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1143837
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add below flag fields to gpu characteristics to indicate
supported and default preemption modes on platform for
graphics and compute
__u32 graphics_preemption_mode_flags;
__u32 compute_preemption_mode_flags;
__u32 default_graphics_preempt_mode;
__u32 default_compute_preempt_mode;
Add struct nvgpu_preemption_modes_rec to struct gr_gk20a
to store these values locally
Use platform specific get_preemption_mode_flags() to
get the flags and define gk20a/gm20b specific
get_preemption_mode_flags() API
Bug 1646259
Change-Id: I80193c0d988dc93bd96585f9aa631fd817f4dfa3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1133595
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE
to allow setting preemption modes from UMD
Define preemption modes in nvgpu.h and use them everywhere
Remove mode definitions from mm_gk20a.h
Also, we support setting only one preemption mode in a channel
But it is possible to have multiple preemption modes (one from
graphics and one from compute) set simultaneously
Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode)
API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports
setting two separate preemption modes i.e. one for graphics
and one for compute
Make necessary changes in code to support two preemption
modes
Bug 1646259
Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Old code maxed at 2 PEs per GPC. Support 3 PEs and add code to make
sure we will get a warning if hardware supports more than that.
JIRA DNVGPU-6
Change-Id: Id6061567bad20474f4b4a7a0959be3426e5e4828
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1142440
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Deassert reset in L2 and FB before initializing L2. In gk20a L2 can
be off and thus writing registers results in a priv ring failure.
Change-Id: I680b8b1e77cf67a8269c6de59a15d9817301300e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1140482
(cherry picked from commit d85edcf4170d7bc59d2c080f4343bc2f959be023)
Reviewed-on: http://git-master/r/1143684
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gk20a_fifo_force_reset_ch() does not support vgpu now, so we need to
create a function pointer in gpu_ops and assign it differently for
vgpu and non-vgpu.
Bug 200184349
Change-Id: I5f8f4f731b4b970c4ff8de65531f25568e7691b6
Signed-off-by: Haley Teng <hteng@nvidia.com>
Reviewed-on: http://git-master/r/1130420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix below sparse warnings:
drivers/gpu/nvgpu/gk20a/fifo_gk20a.c:2628:29:
warning: symbol 'gk20a_fifo_sched_debugfs_seq_
ops' was not declared. Should it be static?
drivers/gpu/nvgpu/gk20a/fifo_gk20a.c:2657:30:
warning: symbol 'gk20a_fifo_sched_debugfs_fops'
was not declared. Should it be static?
Bug 200088648
Change-Id: I4e12ca0988b2d9dd7962d026a3a0f9f674e89ada
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1142909
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the priv_cmdbuff computation to take into account the amount
of memory semaphores take. Since semaphores always require more
memory than sync-pts the sync-pt computation has been dropped.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic05c26b4d1ed9cbd03d3239655c4607bb418396c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1141420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Watchdog timeout is now zero for PCIe devices. This makes also
semaphore acquire timeout to be zero. Fix the timeout to be the
same as for platform devices.
JIRA DNVGPU-7
Change-Id: I5c2921b65f4f957c08a4e9f815deeed2ba231013
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1134441
|
|
|
|
|
|
|
|
|
|
|
|
| |
GPU characteristics hard coded GPC mask of 1 and returned number of
enabled GPCs as maximum number of GPCs. Fix both gpc_count and
max_gpc_count to be returned correctly.
JIRA DNVGPU-6
Change-Id: I41598b2f2d2ada26b0ad433f40a51e59b14deadd
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1142185
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix below sparse warning:
drivers/gpu/nvgpu/pci.c:52:23: warning: symbol 'nvgpu_pci_device' was
not declared. Should it be static?
Bug 200067946
Bug 200088648
Change-Id: Ie30c952e0addf16fe3639e9372ecaace552d6e46
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1142612
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If bar1 resource remap fails during initialization, return the correct
error code from g->bar1 instead of g->regs.
Coverity ID 32068
Bug 200192125
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Change-Id: Ia326bfa122259c7b3402f78746673612f0ac0009
Reviewed-on: http://git-master/r/1141078
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use device name to distinguish between different instances of PCIe
devices.
JIRA DNVGPU-7
Change-Id: Ice0a9dae41396c8faada1815afd1237907896036
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1140685
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
| |
JIRA EVLR-244
JIRA EVLR-318
Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120603
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use gk20a_gmmu_{alloc,free}() instead of duplicating their code for page
tables. The linsim-specific special case is kept as-is.
JIRA DNVGPU-23
JIRA DNVGPU-20
Change-Id: I66d772337bad5d081256b13877c4e713ea8b634a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1139695
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For upcoming vidmem refactor, replace struct gk20a_mm_entry's contents
identical to struct mem_desc, with a struct mem_desc member. This makes
it possible to use the page table buffers like the others too.
JIRA DNVGPU-23
JIRA DNVGPU-20
Change-Id: I714ee5dcb33c27aaee932e8e3ac367e84610b102
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1139694
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_channel_suspend(), we first cancel worker
thread update_fn_work and then cancel job clean up worker
But it is possible that we re-schedule update_fn_work from
job cleanup worker after it was cancelled
And in that case worker update_fn_work might run after
we poweroff GPU
Fix this by first cancelling job clean up worker and
then update_fn_work worker
Bug 200187905
Change-Id: Ia192c515702f14becf60d92c6471d8c0e892551e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120426
(cherry picked from commit b132b6c5aa9ab88c6733f36906f1b874114ec72d)
Reviewed-on: http://git-master/r/1134888
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_pm_shutdown(), we currently wait for 2s
for all channels to finish their work
But we already cancel all the nvgpu workers, freeze
user processes during shutdown
So the waiting should not be required, and hence
remove it
Bug 200166139
Change-Id: I0012f1b3c0f4f676958d083f8c60a001f7015fb0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1121918
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During GPU shutdown path, it is possible that we
shut down the GPU while worker thread is still
running gk20a_channel_update()
Hence before accessing gp_put/get, check if GPU
is up or not
Bug 200166139
Change-Id: Iba3ec173041a84527c4700a93f20564a842cfb01
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935193
(cherry picked from commit c81ea5fe383c44e872754b363968af57d84225ac)
Reviewed-on: http://git-master/r/1121917
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gr_gk20a_elpg_protected_call(), we return with error value
if we fail to disable elpg
But since this is a #define'd function, we end up returning
from function which is using gr_gk20a_elpg_protected_call()
So in some cases it is possible that parent function does
not free up resources due to return statement in
gr_gk20a_elpg_protected_call()
Fix this by removing return statement, and execute rest
of the code if there is no error
Coverity id : 31980
Bug 200192125
Change-Id: Ic003b160b76820cdf9355f44658c23bfb2f3815f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1133404
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clear the FIFO interrupt before prcessing the job list after
receiving a nonstalling interrupt. This prevents a race in which
some non-stalling interrupts after a semaphore incr can get lost.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I03df56b2ebca4ed8a0aeb26dd5480c91ffb42d8b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133791
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not put any ctag data in the PTEs unless compression is
actually enabled for the mapping.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I2abfbf9d1282af24541f8199bd9fbf2133c12899
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133790
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Wake up the correct workqueue during the nonstalling interrupt
handler. Previously the stalling workqueue was woken up which
lead to any process waiting on the nonstalling workqueue hanging
indefinitely.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I8744ceddd7957bbaee0b8203f9a3aaf8ad3792fc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133788
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before checking semaphore values to determine if jobs have been
completed flush the FB. If this is not done, despite the sempahore
memory being mapped as volatile in the GMMU, outstanding writes
can still be pending.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I67b596cd23a5465af05d6d173641a579cb7f168c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133787
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not schedule channel update call backs unless a job is actually
finished. This saves a lot of call backs to the CDE code that don't
do anything when semaphores are enabled.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I2f9a78498b08ebca44ee6a5171931a07721767f1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133786
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a function to allow the kernel to do fixed mappings. Necessary
for the semaphore functionality since there needs to be a common
address in each VM for the semaphores.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I2b451db2d3cb3c003d951f7b0ffc87f6c91db7dc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133789
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Channel table can be bigger than one page, so allocate it with
vmalloc.
Also add a free for tsg table, which did not exist before, and
remove per-channel remove_channel callback which was never used.
JIRA DNVGPU-50
Change-Id: I3ee84b65d94881df52bf0618bf4c5f2e85758223
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129244
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Ken Adams <kadams@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for changing a TSG's timeslice, within reasonable
limits imposed by the kernel driver.
JIRA VFND-1494
Bug 1749744
Change-Id: Ifca1b63a00da7a5872483bb56692da70a5f18bdf
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1129837
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were multiple bugs in dealing with a GPU with more than one
GPC.
* Beta CB size was set to wrong PPC
* TPC mask did not shift fields correctly
* PD skip table used || instead of | operator
Change-Id: I849e2331a943586df16996fe573da2a0ac4cce19
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1132109
|
|
|
|
|
|
|
|
|
|
| |
Add support for probing PCIe graphics cards.
JIRA DNVGPU-7
Change-Id: Iad3d31a1dc0ca6575d8a9916857022cac9181948
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1127684
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On gk20a when PMU is updating ZBC colors it is reading them from L2.
But L2 has one port, and ZBC reads can race with other transactions.
Idle graphics before sending PMU the ZBC_UPDATE request.
Also makes pmu_save_zbc a HAL, because PMU ucode has changes to bypass
this problem on some chips.
Bug 1746047
Change-Id: Id8fcd6850af7ef1d8f0a6aafa0fe6b4f88b5f2d9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129017
|
|
|
|
|
|
|
|
|
|
|
|
| |
Program sysmem flush address to prevent random accesses of
address 0.
Change-Id: I886170395f036805f02e0bce7ecd3c8c46b921df
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129216
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hw_proj_gk20a.h and hw_proj_gm20b.h should not be
included, hence remove the includes and APIs used
from the header
Use nvgpu_get_litter_value() API to replace use
of header
Bug 200156699
Change-Id: I5e88f71657682dd94ac7f0a45f940b70cf8222e7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1129611
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set platform data for soc memory aperture type, whether
soc memory aperture seen as sysmem or vidmem. For gk20a/gm20b,
soc memory aperture seen as vidmem.
Bug 1749338
Change-Id: I407562ca484c1a4bae1bee12089d2b19f378ca53
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1129167
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Ken Adams <kadams@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix below sparse warnings :
drivers/gpu/nvgpu/gk20a/gk20a.c:764:5: warning: symbol
'gk20a_pm_finalize_poweron' was not declared. Should it be static?
drivers/gpu/nvgpu/gk20a/channel_gk20a.c:2504:14:
warning: symbol 'gk20a_event_id_poll' was not declared. Should it be
static?
drivers/gpu/nvgpu/gk20a/channel_gk20a.c:2538:5:
warning: symbol 'gk20a_event_id_release' was not declared. Should it be
static?
Bug 200067946
Bug 200088648
Change-Id: I5c23e7ee09c1a18fe2eeff12f80a3c2bf73120ef
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1128060
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sri Krishna Chowdary <schowdary@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 1735765
Change-Id: I6adf9cbe8ba636d5e05e2aa3ac46f7f20b1de7ed
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1128303
Reviewed-by: Ken Adams <kadams@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While processing all the jobs in gk20a_channel_clean_up_jobs(),
We currently acquire jobs_lock, traverse the list,
clean up the jobs, and then release the lock
But in this case we might hold the lock for too long
blocking the submit path
Hence make jobs_lock more fine grained by restricting
it for list accesses only
Bug 200187553
Change-Id: If82af8ff386f7bc29061cfd57fdda7df62f11c17
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120412
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove submit lock since we have moved to use
more fine-grained locks
Remove API check_gp_put() since we cannot call
it in submit path due to latencies and we cannot
call it in gk20a_channel_clean_up_jobs() anymore
since it will fail there without the lock
Bug 200187553
Change-Id: I05b9fa95c9009000e13232d8fa567336eeee11c6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120411
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently free sync when we find job list empty
If aggressive_sync is set to true, we try to free
sync during channel unbind() call
But we rarely free sync from channel_unbind() call
since freeing it when job list is empty is
aggressive enough
Hence remove sync free code from channel_unbind()
Implement refcounting for sync:
- get a refcount while submitting a job (and
allocate sync if it is not allocated already)
- put a refcount while freeing the job
- if refcount==0 and if aggressive_sync_destroy is
set, free the sync
- if aggressive_sync_destroy is not set, we will
free the sync during channel close time
Bug 200187553
Change-Id: I74e24adb15dc26a375ebca1fdd017b3ad6d57b61
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120410
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All pre/post fence accesses in last_submit are
currently protected by submit lock
In order to remove the submit lock, move all fence
accesses under own lock i.e. fence_lock
Bug 200187553
Change-Id: I0132d1933dc92db8c5ed8c9311e49a030aa2d38c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120409
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add gk20a/gm20b accessors for various global_esr values
and for sm_dbgr_control modes
Bug 200156699
Change-Id: If7fd8cd7567f8bcd1f645facf9553bdc0a153526
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120333
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add below IOCTL to suspend/resume a context
NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS:
Suspend sequence :
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, suspend all SMs
- otherwise, disable channel/TSG
- enable ctxsw
Resume sequence :
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, resume all SMs
- otherwise, enable channel/TSG
- enable ctxsw
Bug 200156699
Change-Id: Iacf1bf7877b67ddf87cc6891c37c758a4644b014
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120332
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently bind only one channel to a debug session
But some use cases might need multiple channels bound
to same debug session
Add this support by adding a list of channels to debug session.
List structure is implemented as struct dbg_session_channel_data
List node dbg_s_list_node is currently defined in struct
dbg_session_gk20a. But this is inefficient when we need to
add debug session to multiple channels
Hence add new reference structure dbg_session_data to
store dbg_session pointer and list entry
For each NVGPU_DBG_GPU_IOCTL_BIND_CHANNEL call, create
two reference structure dbg_session_channel_data for channel
and dbg_session_data for debug session and bind them together
Define API nvgpu_dbg_gpu_get_session_channel() which will
get first channel in the list of debug session
Use this API wherever we refer to channel bound to debug
session
Remove dbg_sessions define in struct gk20a since it is
not being used anywhere
Add new API NVGPU_DBG_GPU_IOCTL_UNBIND_CHANNEL to support
unbinding of channel from debug sesssion
Bug 200156699
Change-Id: I3bfa6f9cd5b90e7254a75c7e64ac893739776b7f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120331
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|