| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Disabling / enabling of PFIFO must stay inside the isr. It cannot be held
disabled outside the isr -- this causes any kind of preemption mechanism to
fail in the presence of an MMU fault until the channel resets the engine.
Bug 1791696
Change-Id: I16600a8571f6555262a75deb305c1d67eb29581a
Signed-off-by: Cory Perry <cperry@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1191026
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added interface to allow kernel to create privileged CE channels for
page migration and clearing support between sysmem and videmem.
JIRA DNVGPU-53
Change-Id: I3e18d18403809c9e64fa45d40b6c4e3844992506
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1173085
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When processing FECS traces, a hash table is used
to retrieve the 'pid' of the process that created
the channel/TSG. Report process identifer (aka
tgid in kernel) instead of thread identifier (aka
pid) for FECS traces.
Bug 1736423
Change-Id: I54cb9d298b9fe3e1cccdd7145604cd01c5758c9d
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1166501
(cherry picked from commit f7fd1f6d7ad0753b787ec20604a08a1f4882fe6f)
Reviewed-on: http://git-master/r/1168728
(cherry picked from commit 97a62e5b89352fce576f1bca71b38bf2242ff047)
Reviewed-on: http://git-master/r/1177823
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 1781383
CID 37989
- Changed for_each_set_bit addr parameter to unsigned long.
Change-Id: I3f3f314a1aea9d376d45699f870a9e372854f069
Signed-off-by: George Bauernschmidt <georgeb@nvidia.com>
Reviewed-on: http://git-master/r/1177417
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For devices that have vidmem available, use the vidmem allocator in
gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem.
Because all of the buffers haven't been tested to work in vidmem yet,
rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at
the end to declare explicitly that vidmem is used. Enabling vidmem for
each now is a matter of removing "_sys" from the function call.
Jira DNVGPU-18
Change-Id: Ibe42f67eff2c2b68c36582e978ace419dc815dc5
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1176805
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
add gk20a_aperture_mask() for memory target selection now that buffers
can actually be allocated from vidmem, and use it in all cases that have
a mem_desc available.
Jira DNVGPU-76
Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169306
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add CE engine to vgpu engine list. CE engine is defined differently
for different GPUs, so we also add HAL for initializing the engine
info.
Bug 1780185
Change-Id: I5ae265551feac08d0c4d45402dd3277514e62b2d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1169720
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Tested-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Lakshmanan M <lm@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not spew an error when choosing the default runlist for engine. That is
normal behavior.
Change-Id: Ide786712f3f74bf59aee48de98c2186db1d97378
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1163511
GVS: Gerrit_Virtual_Submit
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Tested-by: Lakshmanan M <lm@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the existing NVGPU_GPU_IOCTL_OPEN_CHANNEL interface to allow
opening channels for other than the primary (i.e., the graphics)
runlists. This is required to push work to dGPU engines that have
their own runlists, such as the asynchronous copy engines and the
multimedia engines.
Minor change - Added active_engines_list allocation
and assignment for fifo_vgpu back end.
JIRA DNVGPU-25
Change-Id: I3ed377e2c9a2b4dd72e8256463510a62c64e7a8f
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1161541
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This CL covers the following modification,
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt handling support
for gm206 GPU family
5) Added generic mechanism to identify the
CE engine pri_base address for gm206
(CE0, CE1 and CE2)
6) Removed hard coded engine_id logic and
made generic way
7) Code cleanup for readability
JIRA DNVGPU-26
Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1155963
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_fifo_abort_tsg(), we loop through channels of
TSG and call gk20a_channel_abort() for each channel
This is incorrect since we disable and preempt each
channel separately, whereas we should disable all channels
at once and use TSG specific API to preempt TSG
Fix this with below sequence :
- gk20a_disable_tsg() to disable all channels
- preempt tsg if required
- for each channel in TSG
- set has_timedout flag
- call gk20a_channel_abort_clean_up() to clean up channel state
Also, separate out common gk20a_channel_abort_clean_up() API
which can be called from both channel and TSG abort routines
In gk20a_channel_abort(), call gk20a_fifo_abort_tsg() if the
channel is part of TSG
Add new argument "preempt" to gk20a_fifo_abort_tsg() and
preempt TSG if flag is set
Bug 200205041
Change-Id: I4eff5394d26fbb53996f2d30b35140b75450f338
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1157190
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When resetting GR engine flush FECS trace before
halting the pipeline. Otherwise FECS remains in
sideband method processing loop, and we get a
timeout on FECS trace flush
Bug 200193891
Change-Id: I137ea20eb1fb4ef6d618cd01cd3c096471eb8fb0
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1155240
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Channels belonging to a TSG did not have their error notifier
set correctly. This was due to using an incorrect TSG id.
Bug 1617046
Change-Id: Icb6911c7d79a9d02d7713bb47a7cbb24c3098dc1
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1155293
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- make tsg_gk20a.c call HAL for enable/disable channels
- add preempt_tsg HAL callbacks
- add tsg bind/unbind channel HAL callbacks
- add according tsg callbacks for vgpu
Bug 1702773
JIRA VFND-1003
Change-Id: I2cba74b3ebd3920ef09219a168e6433d9574dbe8
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1144932
(cherry picked from commit c3787de7d38651d46969348f5acae2ba86b31ec7)
Reviewed-on: http://git-master/r/1126942
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add below notifier for pbcrc mismatch
NVGPU_CHANNEL_PBDMA_PUSHBUFFER_CRC_MISMATCH
And use this notifier value when we have
pbdma pbcrc interrupt pending
Bug 200179981
Change-Id: I289351e990afb0a4e002902881b99023530f6443
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1156210
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added device_info_data parsing
support for maxwell GPU series.
JIRA DNVGPU-26
Change-Id: I06dbec6056d4c26501e607c2c3d67ef468d206f4
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1151602
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In virtualization case, VM server is the only one
allowed to write to ctxsw ring buffer. It will
also generate an event in case of engine reset.
Only generate a tracepoint on Guest OS side.
EVLR-314
Change-Id: I2cb09780a9b41237fe196ef1f3515923f36a24a4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1130743
(cherry picked from commit 4bbf9538e2a3375eb86b2feea6c605c3eec2ca40)
Reviewed-on: http://git-master/r/1133614
(cherry picked from commit 2076d944db41b37143c27795b3cffd88e99e0b00)
Reviewed-on: http://git-master/r/1150046
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generate a ctxsw channel reset when engine needs to be reset.
This event is generated by the driver, while other events are
generated by FECS.
JIRA ELVR-314
Change-Id: I7791cf3e538782464c37c442c871acb177484566
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1129029
(cherry picked from commit 114038a1a5d9e8941bc53f3e95115b01dd1f8c6e)
Reviewed-on: http://git-master/r/1134379
(cherry picked from commit 15fa2a7b48a0937dfd449ca0c4ed5ad3a863d6bf)
Reviewed-on: http://git-master/r/1123916
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
Enable GRCE when enabling GR. Also use the reset mask read from
device info instead of using the hard coded value.
Change-Id: I4812c32d09ea8b5e07abd1b2c6f1efdbe00cb36e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1149359
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were not using the engine_type field in device info, and the code
did not handle chained entries properly. The code assumed that first
entry is for graphics and second for CE, which is not always true.
Improve the code to go through all entries of device_info, and
preserve values across entries until we reach the last entry.
Only last entry triggers a write to fifo engine info.
There can also be multiple engines with same type, so accumulate
interrupts and reset ids from all of them.
As the code got fixed, now it reads the engine enum correctly from
hardware. We used to compare that against CE0, but we should compare
against CE2.
gk20a_fifo_reset_engine() uses wrong constants - it is passed a
internal numbering of engines, but it compares them against hardware
engine enum.
Change-Id: Ia59273921c602d2a090f7a5b1404afb0fca2532c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1147746
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added support for multiple PBDMAs handling during
fifo_pbdma_isr and gk20a_init_fifo_reset_enable_hw
use case.
JIRA DNVGPU-26
Change-Id: I5f013c5373f7a4b80a8de8863f0e175576ed4c22
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1145591
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(like until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.
gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8 and 16 bit accessor functions are removed.
vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.
Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some relevant
occasions are changed to use the accessor functions instead of cpu
pointers without them (e.g., memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.
JIRA DNVGPU-23
Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE
to allow setting preemption modes from UMD
Define preemption modes in nvgpu.h and use them everywhere
Remove mode definitions from mm_gk20a.h
Also, we support setting only one preemption mode in a channel
But it is possible to have multiple preemption modes (one from
graphics and one from compute) set simultaneously
Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode)
API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports
setting two separate preemption modes i.e. one for graphics
and one for compute
Make necessary changes in code to support two preemption
modes
Bug 1646259
Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gk20a_fifo_force_reset_ch() does not support vgpu now, so we need to
create a function pointer in gpu_ops and assign it differently for
vgpu and non-vgpu.
Bug 200184349
Change-Id: I5f8f4f731b4b970c4ff8de65531f25568e7691b6
Signed-off-by: Haley Teng <hteng@nvidia.com>
Reviewed-on: http://git-master/r/1130420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix below sparse warnings:
drivers/gpu/nvgpu/gk20a/fifo_gk20a.c:2628:29:
warning: symbol 'gk20a_fifo_sched_debugfs_seq_
ops' was not declared. Should it be static?
drivers/gpu/nvgpu/gk20a/fifo_gk20a.c:2657:30:
warning: symbol 'gk20a_fifo_sched_debugfs_fops'
was not declared. Should it be static?
Bug 200088648
Change-Id: I4e12ca0988b2d9dd7962d026a3a0f9f674e89ada
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1142909
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
JIRA EVLR-244
JIRA EVLR-318
Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120603
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clear the FIFO interrupt before prcessing the job list after
receiving a nonstalling interrupt. This prevents a race in which
some non-stalling interrupts after a semaphore incr can get lost.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I03df56b2ebca4ed8a0aeb26dd5480c91ffb42d8b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133791
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Channel table can be bigger than one page, so allocate it with
vmalloc.
Also add a free for tsg table, which did not exist before, and
remove per-channel remove_channel callback which was never used.
JIRA DNVGPU-50
Change-Id: I3ee84b65d94881df52bf0618bf4c5f2e85758223
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129244
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Ken Adams <kadams@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
| |
In Tegra GPU, SoC memory has to be accessed as vidmem. In discrete GPU, it
has to be accessed as sysmem.
Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120468
|
|
|
|
|
|
|
|
|
| |
Move per-chip constants to be returned by a chip specific function.
Implement get_litter_value() for each chip.
Change-Id: I2a2730fce14010924d2507f6fa15cc2ea0795113
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1121383
|
|
|
|
|
|
|
|
|
|
|
| |
Support GPUs which cannot choose between SMMU and physical
addressing.
Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120469
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.
Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
|
|
|
|
|
|
|
|
| |
Change-Id: I319e877978b7f483108ef8f67c05702b71709f62
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120501
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Post EVENT_ID_BPT_INT when bpt.int is pending
Post EVENT_ID_BPT_PAUSE when bpt.pause is pending
Post EVENT_ID_BLOCKING_SYNC whenever there is
non-stalling semaphore interrupt indicating work
completion from GR/CE2 engine
Bug 200089620
Change-Id: I91b7bf48f8585f0d318298fc0c4a66d42055f0a7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1112274
(cherry picked from commit d2b744b1f9acac56435cd7e7ab9a7a845579ef24)
Reviewed-on: http://git-master/r/1120321
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bug 1648908
This commit adds support for FECS ctxsw tracing. Code is compiled
conditionnaly under CONFIG_GK20_CTXSW_TRACE.
This feature requires an updated FECS ucode that writes one record to a ring
buffer on each context switch. On RM/Kernel side, the GPU driver reads records
from the master ring buffer and generates trace entries into a user-facing
VM ring buffer. For each record in the master ring buffer, RM/Kernel has
to retrieve the vmid+pid of the user process that submitted related work.
Features currently implemented:
- master ring buffer allocation
- debugfs to dump master ring buffer
- FECS record per context switch (with both current and new contexts)
- dedicated device for ctxsw tracing (access to VM ring buffer)
- SOF generation (and access to PTIMER)
- VM ring buffer allocation, and reconfiguration
- enable/disable tracing at user level
- event-based trace filtering
- context_ptr to vmid+pid mapping
- read system call for ctxsw dev
- mmap system call for ctxsw dev (direct access to VM ring buffer)
- poll system call for ctxsw dev
- save/restore register on ELPG/CG6
- separate user ring from FECS ring handling
Features requiring ucode changes:
- enable/disable tracing at FECS level
- actual busy time on engine (bug 1642354)
- master ring buffer threshold interrupt (P1)
- API for GPU to CPU timestamp conversion (P1)
- vmid/pid/uid based filtering (P1)
Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1022737
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Bug 1727687
Change-Id: I7a7a4a2011b029474122fdbfbeb02b6302a5902b
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1011486
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, only "high" priority bare channels were interleaved
between all other bare channels and TSGs. This patch decouples
priority from interleaving and introduces 3 levels for interleaving
a bare channel or TSG: high, medium, and low. The levels define
the number of times a channel or TSG will appear on a runlist (see
nvgpu.h for details).
By default, all bare channels and TSGs are set to interleave level
low. Userspace can then request the interleave level to be increased
via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will
be added later).
As timeslice settings will soon be coming from userspace, the default
timeslice for "high" priority channels has been restored.
JIRA VFND-1302
Bug 1729664
Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1014962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable ELPG back whenever ELPG disable is done due to reset or recovery.
Otherwise elpg_refcnt mismatch doesn't engage ELPG correctly
Bug 200156347
Bug 1716764
Change-Id: I9284bb52b32fe911bb8eb260f138b616f4a564be
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1020617
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Export separate API gk20a_fifo_issue_preempt() to issue
preempt request to a channel or TSG
Bug 200156699
Change-Id: Ib3b097ef66a6411d75c1fe213cdbe8b1d08d3418
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935771
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It fixed vgpu regression that vgpu tried to call native
channel preemption function.
Bug 1617046
Change-Id: Ia5a5486d8b95a34ca6ecc75f8d3b5fea76919405
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/935897
Tested-by: Damian Halas <dhalas@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- correct runlist entry type for tsg
- consider tsg when preempt channel
Bug 1617046
Change-Id: Ie067df17fb53ae91c49403637a5f35fc3710e0b3
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/926571
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support preprocessing of SM exceptions if API
pointer pre_process_sm_exception() is defined
Also, expose some common APIs
Bug 200156699
Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/925883
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Interleave all high priority channels between all other channels.
This reduces the latency for high priority work when there
are a lot of lower priority work present, imposing an upper
bound on the latency. Change the default high priority timeslice
from 5.2ms to 3.0 in the process, to prevent long running high priority
apps from hogging the GPU too much.
Introduce a new debugfs node to enable/disable high priority
channel interleaving. It is currently enabled by default.
Adds new runlist length max register, used for allocating
suitable sized runlist.
Limit the number of interleaved channels to 32.
This change reduces the maximum time a lower priority job
is running (one timeslice) before we check that high priority
jobs are running.
Tested with gles2_context_priority (still passes)
Basic sanity testing is done with graphics_submit
(one app is high priority)
Also more functional testing using lots of parallel runs with:
NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw
–drawsperframe 20000 –triangles 50 –runtime 30 –finish
plus multiple:
NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw
–drawsperframe 20000 –triangles 50 –runtime 30 -finish
Previous to this change, the relative performance between
high priority work and normal priority work comes down
to timeslice value. This means that when there are many
low priority channels, the high priority work will still
drop quite a lot. But with this change, the high priority
work will roughly get about half the entire GPU time, meaning
that after the initial lower performance, it is less likely
to get lower in performance due to more apps running on the system.
This change makes a large step towards real priority levels.
It is not perfect and there are no guarantees on anything,
but it is a step forwards without any additional CPU overhead
or other complications. It will also serve as a baseline to
judge other algorithms against.
Support for priorities with TSG is future work.
Support for interleave mid + high priority channels,
instead of just high, is also future work.
Bug 1419900
Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98
Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com>
Reviewed-on: http://git-master/r/805961
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It'll detect dead semaphore acquire. The worst case is when
ACQUIRE_SWITCH is disabled, semaphore acquire will poll and
consume full gpu timeslicees.
The timeout value is set to half of channel WDT.
Bug 1636800
Change-Id: Ida6ccc534006a191513edf47e7b82d4b5b758684
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/928827
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit f6ab5bd17d16f3605b78c3c2ee80513d5823c594.
Fix for graphics_submit regresssion.
Bug 200164812
Change-Id: I5e37b8263758ee389cdba3ec6e3758afbdd9c910
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/929605
Tested-by: Hoang Pham <hopham@nvidia.com>
Reviewed-by: Hoang Pham <hopham@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable ELPG back whenever ELPG disable is done due to reset or recovery.
Otherwise elpg_refcnt mismatch doesn’t engage ELPG correctly
Bug 200156347
Change-Id: Ic01f85b9e1eff10cfb9cb180b50b045f67d4b33c
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/925763
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current sequence in gk20a_disable_channel() is
- disable channel in gk20a_channel_abort()
- adjust pending fence in gk20a_channel_abort()
- preempt channel
But this leads to scenarios where syncpoint has
min > max value
Hence to fix this, make sequence in gk20a_disable_channel()
- disable channel in gk20a_channel_abort()
- preempt channel in gk20a_channel_abort()
- adjust pending fence in gk20a_channel_abort()
If gk20a_channel_abort() is called from other API where
preemption is not needed, then use channel_preempt
flag and do not preempt channel in those cases
Bug 1683059
Change-Id: I4d46d4294cf8597ae5f05f79dfe1b95c4187f2e3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/921290
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we are handlling a fake mmu fault, then mention
so in the error print
Bug 1696529
Change-Id: Ife0106c3b83d18b315fc08dafadeca0129187624
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/828460
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of depending on clock frame-work, use platform data
for ptimer source rate. Removed ptimerscaling10x platform
data, and use ptimer source frequency to calculate
ptimerscaling factor.
Reviewed-on: http://git-master/r/819030
(cherry picked from commit dd291334d54dab80cab7eb1656dffc48a59610b4)
Change-Id: I7638ce9875a6e440bbfc2ba2da0d0b094b2700ff
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/827300
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new IOCTL NVGPU_IOCTL_TSG_SET_PRIORITY to allow
setting timeslice for entire TSG
Return error from channel specific IOCTL_CHANNEL_SET_PRIORITY
if the channel is part of TSG
Separate out API gk20a_channel_get_timescale_from_timeslice()
to get timeslice_timeout and scale from timeslice period
Use this API to get timeslice_timeout and scale for TSG and
store it in tsg_gk20a structure
Then trigger runlist update so that new timeslice values
will be re-written to runlist for TSG
Bug 200146615
Change-Id: I555467d034f81b372b31372f0835d72b1c159508
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/824206
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|