| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move the submit synchornization code into it's own function. This should
help keep the submit code path a little more readable and understandable.
Bug 1732449
Reviewed-on: http://git-master/r/1203833
(cherry picked from commit f931c65c166aeca3b8fe2996dba4ea5133febc5a)
Change-Id: I4111252d242a4dbffe7f9c31e397a27b66403efc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1221043
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use gk20a_gmmu_alloc() in gk20a_alloc_inst_block() so that
we always try to allocate all inst blocks in vidmem first
Also use common API gk20a_alloc_inst_block() in
channel_gk20a_alloc_inst() as well
Jira DNVGPU-22
Change-Id: I6c47c19aae1189d7e57f47a51d21a32e2df53c1f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1216140
(cherry picked from commit 6c84961a50eb8a8b080b2db08f87e58143f5a6e8)
Reviewed-on: http://git-master/r/1219704
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use gk20a_gmmu_alloc() to allocate channel inst block
which first tries to allocate in vidmem
Jira DNVGPU-22
Change-Id: Ib4d92bf4d2bc0c3d53a82812d635fa8abca4340a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206274
(cherry picked from commit 0c81c8984c42df27d3520f800eb87728f67d4453)
Reviewed-on: http://git-master/r/1219701
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When trying to power down GPU the engine might be still busy. In this
case delay power down by returning -EBUSY from
gk20a_pm_runtime_suspend().
Bug 200224907
Change-Id: Ibad74c090add24a185bc1a7a02df367af9b95ced
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1213042
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of blocking for gpfifo space in the nvgpu driver,
return -EAGAIN and allow userspace to decide the blocking
policy.
Bug 1795076
Change-Id: Ie091caa92aad3f68bc01a3456ad948e76883bc50
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1202591
(cherry picked from commit 8056f422c6a34a4239fc4993c40c2e517c932714)
Reviewed-on: http://git-master/r/1203800
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The channel timeout lock guards a very small critical section. Use a
spinlock instead of a mutex for performance.
Bug 1795076
Change-Id: I94940f3fbe84ed539bcf1bc76ca6ae7a0ef2fe13
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1200803
(cherry picked from commit 4fa9e973da141067be145d9eba2ea74e96869dcd)
Reviewed-on: http://git-master/r/1203799
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for cyclestats snapshots in the virtual case
Bug 1700143
JIRA EVLR-278
Change-Id: I376a8804d57324f43eb16452d857a3b7bb0ecc90
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1211547
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the common gk20a_gmmu_alloc() that tries vidmem too.
Jira DNVGPU-21
Change-Id: Ie22cb0f5ed70ec71567fc85d348b3526c9a32b02
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204304
(cherry picked from commit 07cb99baeb10194c520addd77517841a6f99df93)
Reviewed-on: http://git-master/r/1169310
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is done to boost performance of the GPU submit time, which
is critical for compute use-cases.
Bug 200215465
Bug 1804898
Conflicts:
drivers/gpu/nvgpu/gk20a/channel_gk20a.c
Change-Id: Ic4884ee4eac910b92b84a47fdc1b2e9f26b2f1f0
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/1199860
Reviewed-on: http://git-master/r/1209834
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While collecting failing engine data, id type (is_tsg) was not
set for ctxsw and save engine states. This could result in some
ctxsw timeout interrupts to be ignored (id reported with wrong
is_tsg).
For TSGs, check if we made some progress on any of the channels
before kicking fifo recovery.
Bug 200228310
Jira EVLR-597
Change-Id: I231549ae68317919532de0f87effb78ee9c119c6
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1204035
(cherry picked from commit 7221d256fd7e9b418f7789b3d81eede8faa16f0b)
Reviewed-on: http://git-master/r/1204037
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- let force_reset_ch pass down err code
- force_reset_ch callback can cover vgpu too.
Bug 1776876
JIRA VFND-2151
Change-Id: I48f7890294c6455247198e0cab5f21f83f61f0e1
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1202255
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently store fault_id into fifo.deferred_fault_engines
and use that in gk20a_fifo_reset_engine() which is incorrect
Also, in deferred engine reset path during channel close,
we do not check if channel is loaded on engine or not
fix this with below
- store engine_id bits into fifo.deferred_fault_engines
- define new API gk20a_fifo_deferred_reset() to perform
deferred engine reset
- get all engines on which channel is loaded with
gk20a_fifo_engines_on_id()
- for each set bit/engine_id in fifo.deferred_fault_engines,
check if channel is loaded on that engine, and if yes,
reset the engine
Bug 1791696
Change-Id: I1b8b1a9e3aa538fe6903a352aa732b47c95ec7d5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1195087
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Free the channel used semaphore index during gk20a_free_channel().
Bug 1793819
Change-Id: I4215d05f7f3ba0636e2abb1803011711c8a38301
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1196877
(cherry picked from commit 2c5720de506caac29629f6a1c578e6da80b1a135)
Reviewed-on: http://git-master/r/1198883
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Initialize character array buf in gk20a_channel_ioctl() to zero
Keeping it uninitialized can result in leaking kernel stack
info to user space since we pass this buffer to UMD
Bug 1793398
Change-Id: Iffd654dbaca3b4e3c8fd2ac270d0febd01c165b8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1195862
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added interface to allow kernel to create privileged CE channels for
page migration and clearing support between sysmem and videmem.
JIRA DNVGPU-53
Change-Id: I3e18d18403809c9e64fa45d40b6c4e3844992506
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1173085
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When processing FECS traces, a hash table is used
to retrieve the 'pid' of the process that created
the channel/TSG. Report process identifer (aka
tgid in kernel) instead of thread identifier (aka
pid) for FECS traces.
Bug 1736423
Change-Id: I54cb9d298b9fe3e1cccdd7145604cd01c5758c9d
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1166501
(cherry picked from commit f7fd1f6d7ad0753b787ec20604a08a1f4882fe6f)
Reviewed-on: http://git-master/r/1168728
(cherry picked from commit 97a62e5b89352fce576f1bca71b38bf2242ff047)
Reviewed-on: http://git-master/r/1177823
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace kfree with nvgpu_free in error handling path in
gk20a_alloc_channel_gpfifo where the gpfifo pipe buffer is being
allocated, because it's allocated with nvgpu_alloc.
Jira DNVGPU-21
Change-Id: I73100394b67da2ab064e4e9df6b430d818abce56
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1182401
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For devices that have vidmem available, use the vidmem allocator in
gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem.
Because all of the buffers haven't been tested to work in vidmem yet,
rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at
the end to declare explicitly that vidmem is used. Enabling vidmem for
each now is a matter of removing "_sys" from the function call.
Jira DNVGPU-18
Change-Id: Ibe42f67eff2c2b68c36582e978ace419dc815dc5
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1176805
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
add gk20a_aperture_mask() for memory target selection now that buffers
can actually be allocated from vidmem, and use it in all cases that have
a mem_desc available.
Jira DNVGPU-76
Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169306
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that when we abort the channel, we have
job clean up worker running, which could race with abort
and sometimes result in below panic
[ 245.483566] Unable to handle kernel paging request at virtual address
800000000
...
[ 245.548991] PC is at gk20a_channel_abort_clean_up+0xb8/0x140
[ 245.554683] LR is at gk20a_channel_abort_clean_up+0xac/0x140
...
[ 247.301860] [<ffffffc000479390>]
gk20a_channel_abort_clean_up+0xb8/0x140
[ 247.312853] [<ffffffc0004794d4>] gk20a_channel_abort+0xbc/0xc8
[ 247.322970] [<ffffffc0004794f8>] gk20a_disable_channel+0x18/0x30
[ 247.333267] [<ffffffc000479628>] gk20a_free_channel+0x118/0x584
[ 247.343473] [<ffffffc000479aa0>] gk20a_channel_close+0xc/0x14
[ 247.353479] [<ffffffc000479b80>] gk20a_channel_release+0xd8/0x104
Fix this by cancelling the job clean up worker before aborting
the channel
Bug 1777281
Change-Id: Ic24c7c03b27cfb5cd164a52efdb1e2813a41a10a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1174416
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Revamp the support the nvgpu driver has for semaphores.
The original problem with nvgpu's semaphore support is that it
required a SW based wait for every semaphore release. This was
because for every fence that gk20a_channel_semaphore_wait_fd()
waited on a new semaphore was created. This semaphore would then
get released by SW when the fence signaled. This meant that for
every release there was necessarily a sync_fence_wait_async() call
which could block. The latency of this SW wait was enough to cause
massive degredation in performance.
To fix this a fast path was implemented. When a fence is passed to
gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore
a semaphore acquire is directly used to block the GPU. No longer is
a sync_fence_wait_async() performed nor is there an extra semaphore
created.
To implement this fast path the semaphore memory had to be shared
between channels. Previously since a new semaphore was created
every time through gk20a_channel_semaphore_wait_fd() what address
space a semaphore was mapped into was irrelevant. However, when
using the fast path a sempahore may be released on one address
space but acquired in another.
Sharing the semaphore memory was done by making a fixed GPU mapping
in all channels. This mapping points to the semaphore memory (the
so called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e released) by a
malicious channel. Each channel then gets a RW mapping of it's own
semaphore. This way a channel may only acquire other channel's
semaphores but may both acquire and release its own semaphore.
The gk20a fence code was updated to allow introspection of the GPU
backed fences. This allows detection of when the fast path can be
taken. If the fast path cannot be used (for example when a fence is
sync-pt backed) the original slow path is still present. This gets
used when the GPU needs to wait on an event from something which
only understands how to use sync-pts.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added interface for privileged channel allocation to excute
the privileged method (ex. CE phys mode transfer).
JIRA DNVGPU-53
Change-Id: I07f9181720b14345cf5890919c2818dfcf505d86
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1169315
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CUDA needs it disabled.
Bug 1775453
Change-Id: Ic6d5050f9fda259337668e2a245c05e27d65e047
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1162765
(cherry picked from commit 44b48d84e75ced2fd9eecebbe94a0289c527c0c2)
Reviewed-on: http://git-master/r/1169049
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cancel channel watchdog timeout during channel suspend
This should help fix race conditions when watchdog is triggered
during shutdown
Bug 200209309
Change-Id: I6cf740d854c27985217a1a76afa822e3126d4153
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1168613
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Keep the __user tag in the type of the user gpfifo when copying,
2) use NULL instead of 0 for initializing user_gpfifo pointer.
Bug 200067946
Change-Id: I631b4bca44ded0900204134338fa1d62d0017df0
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1168441
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use gk20a_mem_*() accessors for gpfifo memory in work submission instead
of direct cpu accesses in order to support other apertures than sysmem.
The gpfifo memory is still allocated from sysmem for dgpus too.
Split the copying of priv_cmds and the main gpfifo to be submitted in
gk20a_submit_channel_gpfifo() into separate functions.
JIRA DNVGPU-21
Change-Id: If271ca8e7e34235f00d31855dbccf77c0008e10b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1145923
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit edd080b05ab118307c7c7b01426ea1e7c1cc9be7.
Re-enable the watchdog since power management races
are now resolved
Bug 200198908
Change-Id: I74b97e564583aaedd858bc968adcfcaa275ea739
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1165746
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Export gk20a_free_priv_cmdbuf() so that the channel_sync_gk20a.c code
can call this function. This is necessary for error paths in the
semaphore wait/incr functions.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Id2ea13e5553d50475ee1bbf94781e18590321fdf
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1162686
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Bug 200198908
Change-Id: I4dfb3517f5467f8b5449e65290453ba1c828243d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1163439
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the existing NVGPU_GPU_IOCTL_OPEN_CHANNEL interface to allow
opening channels for other than the primary (i.e., the graphics)
runlists. This is required to push work to dGPU engines that have
their own runlists, such as the asynchronous copy engines and the
multimedia engines.
Minor change - Added active_engines_list allocation
and assignment for fifo_vgpu back end.
JIRA DNVGPU-25
Change-Id: I3ed377e2c9a2b4dd72e8256463510a62c64e7a8f
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1161541
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rework how the messages in the channel timeout handler to be a little
bit more verbose and more clear about what is happening.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ifc018d99c647b3036caa8ad453e5e3dfc4151396
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1153669
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the gp_get and gp_put pointers from the priv_cmdbuf code. These
pointers appear to track the position of th the priv_cmdbuf in the gp_fifo.
However, these pointers are not used for anything nor are they needed for
anything in the future. This code appears to be a relic left over from the
past.
Change-Id: Ibed1a6d51fa0cac12c5e0429760e8e2f611fc899
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1161859
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_event_id_poll(), we always set the mask value
and return it. This causes poll() from UMD to be always
successful irrespective of event is really generated or not
Fix this by adding a flag event_posted for each event
Set this flag while posting the event
In gk20a_event_id_poll(), set the mask value only if
this flag is set. If flag is set, set mask and clear the flag
Bug 200089620
Change-Id: If14236547c611fe4bfa1410ff5b69c9fa02d43bb
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1160253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This CL covers the following modification,
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt handling support
for gm206 GPU family
5) Added generic mechanism to identify the
CE engine pri_base address for gm206
(CE0, CE1 and CE2)
6) Removed hard coded engine_id logic and
made generic way
7) Code cleanup for readability
JIRA DNVGPU-26
Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1155963
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_fifo_abort_tsg(), we loop through channels of
TSG and call gk20a_channel_abort() for each channel
This is incorrect since we disable and preempt each
channel separately, whereas we should disable all channels
at once and use TSG specific API to preempt TSG
Fix this with below sequence :
- gk20a_disable_tsg() to disable all channels
- preempt tsg if required
- for each channel in TSG
- set has_timedout flag
- call gk20a_channel_abort_clean_up() to clean up channel state
Also, separate out common gk20a_channel_abort_clean_up() API
which can be called from both channel and TSG abort routines
In gk20a_channel_abort(), call gk20a_fifo_abort_tsg() if the
channel is part of TSG
Add new argument "preempt" to gk20a_fifo_abort_tsg() and
preempt TSG if flag is set
Bug 200205041
Change-Id: I4eff5394d26fbb53996f2d30b35140b75450f338
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1157190
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gr_gk20a_ctx_zcull_setup(), gr_gk20a_update_smpc_ctxsw_mode(),
and in gk20a_channel_suspend(), we call channel specific APIs
to disable/preempt/enable channel
But we do not consider TSGs in this case
Hence use correct (below) APIs in above functions which
will handle channel or TSG internally :
gk20a_disable_channel_tsg()
gk20a_fifo_preempt()
gk20a_enable_channel_tsg()
Bug 200205041
Change-Id: Ieed378dac4ad2322b35f9102706176ec326d386c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1157189
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
If a channel cannot be allocated a GPFIFO, its status will remain
unbound. We still need to free the channel when the fd gets closed.
Bug 1769481
Change-Id: I8a9170f431cfa98dcf9e3cbc082393f02d2203db
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1154725
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- make tsg_gk20a.c call HAL for enable/disable channels
- add preempt_tsg HAL callbacks
- add tsg bind/unbind channel HAL callbacks
- add according tsg callbacks for vgpu
Bug 1702773
JIRA VFND-1003
Change-Id: I2cba74b3ebd3920ef09219a168e6433d9574dbe8
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1144932
(cherry picked from commit c3787de7d38651d46969348f5acae2ba86b31ec7)
Reviewed-on: http://git-master/r/1126942
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we run out of gpfifo space or private command buffer
space, we have error spew like below :
__gk20a_channel_syncpt_incr: not enough priv cmd buffer space
gk20a_submit_channel_gpfifo: fail
Dumping these prints to UART cause increase in submit
latencies
But on these failures, we return -ENOSPC to UMD and then
UMD retries the submit, hence it might be unnecessary to dump
these prints
Hence, remove the error prints of insufficient space
and use gk20a_dbg_fn() instead of gk20a_err() to print failure
in gk20a_submit_channel_gpfifo()
Bug 200202653
Change-Id: I49efd7c6c554dd4fbfa4e66d196eb352e69f92c6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1152378
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use an inline function instead of a macro to
"expand" all channel parameters.
Jira EVLR-244
Jira EVLR-318
Change-Id: I4e8c5ee6bc9da36564af171be809f50dd2dfd439
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1150050
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In virtualization case, VM server is the only one
allowed to write to ctxsw ring buffer. It will
also generate an event in case of engine reset.
Only generate a tracepoint on Guest OS side.
EVLR-314
Change-Id: I2cb09780a9b41237fe196ef1f3515923f36a24a4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1130743
(cherry picked from commit 4bbf9538e2a3375eb86b2feea6c605c3eec2ca40)
Reviewed-on: http://git-master/r/1133614
(cherry picked from commit 2076d944db41b37143c27795b3cffd88e99e0b00)
Reviewed-on: http://git-master/r/1150046
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generate a ctxsw channel reset when engine needs to be reset.
This event is generated by the driver, while other events are
generated by FECS.
JIRA ELVR-314
Change-Id: I7791cf3e538782464c37c442c871acb177484566
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1129029
(cherry picked from commit 114038a1a5d9e8941bc53f3e95115b01dd1f8c6e)
Reviewed-on: http://git-master/r/1134379
(cherry picked from commit 15fa2a7b48a0937dfd449ca0c4ed5ad3a863d6bf)
Reviewed-on: http://git-master/r/1123916
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace the plain cpu pointer accesses with gk20a_mem_wr32(), and use a
reference to the underlying mem_desc (within priv_cmd_queue) paired with
an offset, for buffer aperture flexibility.
JIRA DNVGPU-21
JIRA DNVGPU-23
Change-Id: I317672c94bb682bb895f9ed3e8116729c8bb7f4b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1145922
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(like until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.
gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8 and 16 bit accessor functions are removed.
vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.
Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some relevant
occasions are changed to use the accessor functions instead of cpu
pointers without them (e.g., memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.
JIRA DNVGPU-23
Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE
to allow setting preemption modes from UMD
Define preemption modes in nvgpu.h and use them everywhere
Remove mode definitions from mm_gk20a.h
Also, we support setting only one preemption mode in a channel
But it is possible to have multiple preemption modes (one from
graphics and one from compute) set simultaneously
Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode)
API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports
setting two separate preemption modes i.e. one for graphics
and one for compute
Make necessary changes in code to support two preemption
modes
Bug 1646259
Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gk20a_fifo_force_reset_ch() does not support vgpu now, so we need to
create a function pointer in gpu_ops and assign it differently for
vgpu and non-vgpu.
Bug 200184349
Change-Id: I5f8f4f731b4b970c4ff8de65531f25568e7691b6
Signed-off-by: Haley Teng <hteng@nvidia.com>
Reviewed-on: http://git-master/r/1130420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the priv_cmdbuff computation to take into account the amount
of memory semaphores take. Since semaphores always require more
memory than sync-pts the sync-pt computation has been dropped.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic05c26b4d1ed9cbd03d3239655c4607bb418396c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1141420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
JIRA EVLR-244
JIRA EVLR-318
Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120603
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_channel_suspend(), we first cancel worker
thread update_fn_work and then cancel job clean up worker
But it is possible that we re-schedule update_fn_work from
job cleanup worker after it was cancelled
And in that case worker update_fn_work might run after
we poweroff GPU
Fix this by first cancelling job clean up worker and
then update_fn_work worker
Bug 200187905
Change-Id: Ia192c515702f14becf60d92c6471d8c0e892551e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120426
(cherry picked from commit b132b6c5aa9ab88c6733f36906f1b874114ec72d)
Reviewed-on: http://git-master/r/1134888
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During GPU shutdown path, it is possible that we
shut down the GPU while worker thread is still
running gk20a_channel_update()
Hence before accessing gp_put/get, check if GPU
is up or not
Bug 200166139
Change-Id: Iba3ec173041a84527c4700a93f20564a842cfb01
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935193
(cherry picked from commit c81ea5fe383c44e872754b363968af57d84225ac)
Reviewed-on: http://git-master/r/1121917
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|