| Commit message | Author | Age |
| |
Revamp the support the nvgpu driver has for semaphores.
The original problem with nvgpu's semaphore support is that it
required a SW based wait for every semaphore release. This was
because for every fence that gk20a_channel_semaphore_wait_fd()
waited on a new semaphore was created. This semaphore would then
get released by SW when the fence signaled. This meant that for
every release there was necessarily a sync_fence_wait_async() call
which could block. The latency of this SW wait was enough to cause
massive degradation in performance.
To fix this a fast path was implemented. When a fence is passed to
gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore
a semaphore acquire is directly used to block the GPU. No longer is
a sync_fence_wait_async() performed nor is there an extra semaphore
created.
To implement this fast path the semaphore memory had to be shared
between channels. Previously since a new semaphore was created
every time through gk20a_channel_semaphore_wait_fd() what address
space a semaphore was mapped into was irrelevant. However, when
using the fast path a semaphore may be released on one address
space but acquired in another.
Sharing the semaphore memory was done by making a fixed GPU mapping
in all channels. This mapping points to the semaphore memory (the
so called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e. released) by a
malicious channel. Each channel then gets a RW mapping of its own
semaphore. This way a channel may only acquire other channels'
semaphores but may both acquire and release its own semaphore.
The gk20a fence code was updated to allow introspection of the GPU
backed fences. This allows detection of when the fast path can be
taken. If the fast path cannot be used (for example when a fence is
sync-pt backed) the original slow path is still present. This gets
used when the GPU needs to wait on an event from something which
only understands how to use sync-pts.
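The path selection and the mapping permissions described above can be sketched as follows. This is a minimal model in C under stated assumptions: `struct fence`, the two enums, `choose_wait_path()` and `may_release()` are hypothetical names for illustration, not the nvgpu driver's API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are nvgpu API. */
enum fence_backing { FENCE_GPU_SEMAPHORE, FENCE_SYNCPT };

enum wait_path { WAIT_FAST_GPU_ACQUIRE, WAIT_SLOW_SW_RELEASE };

struct fence {
	enum fence_backing backing;
};

/* Take the fast path only when the fence is backed by a GPU semaphore. */
enum wait_path choose_wait_path(const struct fence *f)
{
	if (f != NULL && f->backing == FENCE_GPU_SEMAPHORE)
		return WAIT_FAST_GPU_ACQUIRE;
	return WAIT_SLOW_SW_RELEASE;
}

/*
 * The shared mapping is read-only and the per-channel mapping is RW,
 * so a channel can release only the semaphore it owns.
 */
bool may_release(int channel_id, int semaphore_owner_id)
{
	return channel_id == semaphore_owner_id;
}
```

In this model a sync-pt backed fence always falls back to the slow path, matching the description above.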
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
In some of the conditionally compiled code in the nvgpu driver there
are places where the code looks like:
#if LINUX_VERSION_CODE < KERNEL_VERSION(3,18,0)
some-loop {
#else
a-diff-loop {
#endif
/* Some code... */
}
This leaves unbalanced curly braces: two open braces for one close
brace. This messes up some editors' syntax highlighting and
auto-indentation features.
This patch puts in the extra brace. It's not necessary for compiling
code but it makes some editors much happier.
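One possible way to keep the braces balanced is to leave only the loop headers inside the conditional and share a single brace pair. The sketch below is an assumption about the general technique, not the change the patch actually made; `USE_OLD_LOOP` is a hypothetical stand-in for the kernel version check and `sum_items()` is an illustrative function.

```c
#include <stddef.h>

/* Hypothetical stand-in for the LINUX_VERSION_CODE comparison. */
#ifndef USE_OLD_LOOP
#define USE_OLD_LOOP 0
#endif

/* Sum an array; both preprocessor branches share one brace pair. */
int sum_items(const int *items, size_t n)
{
	int total = 0;
	size_t i;

#if USE_OLD_LOOP
	for (i = n; i-- > 0; )
#else
	for (i = 0; i < n; i++)
#endif
	{
		total += items[i];
	}

	return total;
}
```

Whichever branch the preprocessor selects, an editor sees exactly one opening and one closing brace for the loop body.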
Change-Id: Ida28bc001cc840fe52a43982db934d49c07cc7d3
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1153668
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
When we run out of gpfifo space or private command buffer
space, we have error spew like below :
__gk20a_channel_syncpt_incr: not enough priv cmd buffer space
gk20a_submit_channel_gpfifo: fail
Dumping these prints to UART causes an increase in submit
latencies
But on these failures, we return -ENOSPC to UMD and then
UMD retries the submit, hence it might be unnecessary to dump
these prints
Hence, remove the error prints of insufficient space
and use gk20a_dbg_fn() instead of gk20a_err() to print failure
in gk20a_submit_channel_gpfifo()
Bug 200202653
Change-Id: I49efd7c6c554dd4fbfa4e66d196eb352e69f92c6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1152378
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Replace manual buffer allocation and cpu_va pointer accesses with
gk20a_gmmu_{alloc,free}() and gk20a_mem_{rd,wr}() using a struct
mem_desc in gk20a_semaphore_pool, for buffer aperture flexibility.
JIRA DNVGPU-23
Change-Id: I394c38f407a9da02480bfd35062a892eec242ea3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1146684
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Replace the plain cpu pointer accesses with gk20a_mem_wr32(), and use a
reference to the underlying mem_desc (within priv_cmd_queue) paired with
an offset, for buffer aperture flexibility.
JIRA DNVGPU-21
JIRA DNVGPU-23
Change-Id: I317672c94bb682bb895f9ed3e8116729c8bb7f4b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1145922
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
We currently free sync when we find job list empty
If aggressive_sync is set to true, we try to free
sync during channel unbind() call
But we rarely free sync from channel_unbind() call
since freeing it when job list is empty is
aggressive enough
Hence remove sync free code from channel_unbind()
Implement refcounting for sync:
- get a refcount while submitting a job (and
allocate sync if it is not allocated already)
- put a refcount while freeing the job
- if refcount==0 and if aggressive_sync_destroy is
set, free the sync
- if aggressive_sync_destroy is not set, we will
free the sync during channel close time
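The refcounting scheme listed above can be sketched as follows. This is a simplified single-threaded model: `struct channel`, `struct sync_obj` and the three functions are hypothetical names, and a real driver would need locking or atomic counters around these operations.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures. */
struct sync_obj {
	int unused;
};

struct channel {
	struct sync_obj *sync;
	int sync_refcount;
	bool aggressive_sync_destroy;
};

/* Job submit: allocate the sync lazily and take a reference. */
int channel_sync_get(struct channel *ch)
{
	if (ch->sync == NULL) {
		ch->sync = calloc(1, sizeof(*ch->sync));
		if (ch->sync == NULL)
			return -1;
	}
	ch->sync_refcount++;
	return 0;
}

/* Job free: drop a reference; free the sync at zero if aggressive. */
void channel_sync_put(struct channel *ch)
{
	if (ch->sync_refcount > 0)
		ch->sync_refcount--;
	if (ch->sync_refcount == 0 && ch->aggressive_sync_destroy) {
		free(ch->sync);
		ch->sync = NULL;
	}
}

/* Channel close: the sync is always freed here if it still exists. */
void channel_close(struct channel *ch)
{
	free(ch->sync);
	ch->sync = NULL;
	ch->sync_refcount = 0;
}
```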
Bug 200187553
Change-Id: I74e24adb15dc26a375ebca1fdd017b3ad6d57b61
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120410
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Make necessary changes to support nvgpu on kernel-3.10
This includes below changes
- PROBE_PREFER_ASYNCHRONOUS is defined only for K3.10
- Fence handling and struct sync_fence is different between
K3.10 and K3.18
- variable status in struct sync_fence is atomic on K3.18
whereas it is int on K3.10
- if SOC == T132, set soc_name = "tegra13x"
- ioremap_cache() is not defined on K3.10 ARM versions,
hence use ioremap_cached()
Bug 200188753
Change-Id: I18d77eb1404e15054e8510d67c9a61c0f1883e2b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1121092
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.
Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
| |
Move nvhost out of the common Linux repo and into its own git repo. By
doing this, the same nvhost driver can work on different Linux kernel
versions.
Previously android/sync.h was referenced relative to
kernel/drivers/video/tegra/host. However, host moved to a completely
different part of the tree. Instead, reference it relative to kernel/include.
bug 1749413
Change-Id: Ic7f94093c712e5b64c9b3b660d6fce5d18e59bc0
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/1120544
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Tested-by: Arto Merilainen <amerilainen@nvidia.com>
| |
Fixes a bug where reference to sync_fence is not closed before return.
Bug 200171146
Change-Id: If174eb124bd69692bab4cc8629a103517d7cfef1
Signed-off-by: Yunbo Wang <yunbow@nvidia.com>
Reviewed-on: http://git-master/r/1029844
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Eric Miao <emiao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Increase the semaphore count per channel. Some channels
were running out of semaphores. The original limit was
255 (256 fits in 1 page, but the 0th semaphore is used
to return error codes from the allocator).
Easy fix was to simply increase the number of semaphores
each channel is allocated to 1024.
Bug 1604892
Change-Id: I163e24b8d42a3dc1bb9b418dadc0c8532aff9adb
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/935911
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
| |
A race condition existed in gk20a_channel_semaphore_wait_fd().
In some instances the semaphore underlying the sync_fence being
waited on would have already signaled. This would cause the
subsequent sync_fence_wait_async() call to return 1 and do
nothing. Normally, the sync_fence_wait_async() call would
release the newly created semaphore but in the above case that
would not happen and hang any channel waiting on that semaphore.
To fix this problem if sync_fence_wait_async() returns 1
immediately release the newly created semaphore.
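The fix can be illustrated with a small model. The sketch assumes, as the message states, that the asynchronous wait returns 1 when the fence has already signaled and then never runs its callback; `fence_wait_async()`, `fence_signal()` and `wait_on_fence()` are hypothetical stand-ins, not the kernel's sync framework API.

```c
#include <stdbool.h>
#include <stddef.h>

struct semaphore {
	int value; /* 1 means released */
};

struct fence {
	bool signaled;
	struct semaphore *waiter; /* released when the fence signals */
};

static void semaphore_release(struct semaphore *s)
{
	s->value = 1;
}

/* Model of the async wait: 1 if already signaled, 0 if a waiter was queued. */
int fence_wait_async(struct fence *f, struct semaphore *s)
{
	if (f->signaled)
		return 1; /* no callback will ever run for this waiter */
	f->waiter = s;
	return 0;
}

/* Signaling releases the queued waiter, if any. */
void fence_signal(struct fence *f)
{
	f->signaled = true;
	if (f->waiter != NULL) {
		semaphore_release(f->waiter);
		f->waiter = NULL;
	}
}

/* The fix: release the new semaphore right away when the wait returns 1. */
int wait_on_fence(struct fence *f, struct semaphore *s)
{
	int ret = fence_wait_async(f, s);

	if (ret == 1) {
		semaphore_release(s);
		return 0;
	}
	return ret;
}
```

Without the `ret == 1` branch, the semaphore for an already-signaled fence would stay at 0 and anything waiting on it would hang.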
Bug 1604892
Change-Id: I1f5e811695bb099f71b7762835aba4a7e27362ec
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/935910
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
| |
If we run out of gpfifo space or private command
buffer space, we currently return EAGAIN as error code
Instead of EAGAIN, return ENOSPC as error code so
that caller (user space) can read the error code
and retry the submit
As the jobs are processed, it is possible to free up
some space. And hence such retries could succeed
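A caller-side retry loop might look like the sketch below. It is only an illustration: `submit_fn`, `submit_with_retry()` and the `fake_submit()` test double are hypothetical, and the model returns a negative error code directly rather than going through an ioctl and `errno`.

```c
#include <errno.h>

/* Hypothetical submit callback: returns 0 on success or a negative errno. */
typedef int (*submit_fn)(void *ctx);

/* Retry while the submit fails with -ENOSPC, up to max_tries attempts. */
int submit_with_retry(submit_fn submit, void *ctx, int max_tries)
{
	int ret = -EINVAL;
	int i;

	for (i = 0; i < max_tries; i++) {
		ret = submit(ctx);
		if (ret != -ENOSPC)
			break;
		/* A real caller would wait here until some jobs complete. */
	}
	return ret;
}

/* Test double: fails with -ENOSPC until the counter in ctx reaches zero. */
int fake_submit(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		return -ENOSPC;
	}
	return 0;
}
```

Any other error code ends the loop immediately, since only a lack of space is expected to clear up as jobs are processed.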
Bug 1715291
Change-Id: I9a2ed7134d2496b383899b3c02c0e70452b26115
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/929402
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Tested-by: Sachin Nikam <snikam@nvidia.com>
| |
Currently, we create sync_fence (from nvhost_sync_create_fence())
for every submit
But not all submits request for a sync_fence.
Also, nvhost_sync_create_fence() API takes about 1/3rd of the total
submit path.
Hence to optimize, we can allocate sync_fence
only when user explicitly asks for it using
(NVGPU_SUBMIT_GPFIFO_FLAGS_FENCE_GET &&
NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE)
Also, in CDE path from gk20a_prepare_compressible_read(),
we reuse existing fence stored in "state" and that can
result in not returning sync_fence_fd when user asked
for it
Hence, force allocation of sync_fence when job submission
comes from CDE path
Bug 200141116
Change-Id: Ia921701bf0e2432d6b8a5e8b7d91160e7f52db1e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/812845
(cherry picked from commit 5fd47015eeed00352cc8473eff969a66c94fee98)
Reviewed-on: http://git-master/r/837662
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
| |
We currently set "aggressive_destroy" flag to destroy
sync object statically and for each sync object
Move this flag to per-platform structure so that it
can be set per-platform for all the sync objects
Also, set the default value of this flag as "false"
and set it to "true" once we have more than 64
channels in use
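The threshold behaviour can be sketched as below. `struct platform` and `platform_channel_opened()` are hypothetical names; whether the flag is ever cleared again when channels are closed is not stated in the message, so this sketch never resets it.

```c
#include <stdbool.h>

/* Channel count taken from the commit text. */
#define AGGRESSIVE_DESTROY_THRESHOLD 64

/* Hypothetical per-platform state. */
struct platform {
	bool aggressive_sync_destroy; /* defaults to false */
	int channels_in_use;
};

/* Count a newly opened channel and enable the flag past the threshold. */
void platform_channel_opened(struct platform *p)
{
	p->channels_in_use++;
	if (p->channels_in_use > AGGRESSIVE_DESTROY_THRESHOLD)
		p->aggressive_sync_destroy = true;
}
```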
Bug 200141116
Change-Id: I1bc271df4f468a4087a06a27c7289ee0ec3ef29c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/822041
(cherry picked from commit 98741e7e88066648f4f14490c76b61dbff745103)
Reviewed-on: http://git-master/r/835800
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Replace sprintf with snprintf in gk20a_channel_syncpt_create(), since the
sync point name can be long.
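A bounded version of the name formatting might look like this. The `"%s_%d"` pattern (device name plus channel index) and the function `format_syncpt_name()` are assumptions for illustration; only `snprintf()` itself is standard.

```c
#include <stdio.h>

/*
 * Format "<device>_<channel>" into buf without overrunning it.
 * Returns 1 if the whole name fit, 0 if it was truncated or failed.
 */
int format_syncpt_name(char *buf, size_t size, const char *dev_name, int chid)
{
	int n = snprintf(buf, size, "%s_%d", dev_name, chid);

	return n >= 0 && (size_t)n < size;
}
```

Unlike `sprintf()`, `snprintf()` writes at most `size` bytes including the terminating NUL, so an overlong name is truncated instead of overflowing the buffer.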
Bug 1638853
Change-Id: Ie305d04edfbb299c8b1241eca52101439bb4a6c6
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/769113
Reviewed-on: http://git-master/r/776424
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
| |
gk20a_busy() is already called on all the paths to
__gk20a_channel_syncpt_incr() i.e. in gk20a_submit_channel_gpfifo()
hence remove the redundant gk20a_busy() call since it causes
deadlock scenario with VPR resize use case
Bug 200128257
Bug 1645760
Bug 200114947
Bug 200124519
Change-Id: I4cd47b7e7cdc92aaeda17256a99f2ba93833a3b3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/778341
(cherry picked from commit 5a5dc5b5a9d38a5e8d5c1ca29dc6de425c00b605)
Reviewed-on: http://git-master/r/779070
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
| |
gk20a_busy() was added to gk20a_channel_syncpt_update() for possible
case of channel deletion
But API to delete a channel (i.e. gk20a_free_channel()) is already
called in paths which ensure gk20a_busy() is called before
deleting the channel
Hence, remove redundant gk20a_busy()/idle() calls
This also fixes a deadlock scenario with VPR resize use case
Bug 200128257
Bug 1645760
Bug 200114947
Bug 200124519
Change-Id: I05dc739b3be88af2ba22b0a667e5004d8100bf6f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/778340
(cherry picked from commit 306282aa950201cf1ae91a5cc48d75719b179d19)
Reviewed-on: http://git-master/r/779069
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
| |
Add reference counting for channels, and wait for reference count to
get to 0 in gk20a_channel_free() before actually freeing the channel.
Also, change free channel tracking a bit by employing a list of free
channels, which simplifies the procedure of finding available channels
with reference counting.
Each use of a channel must have a reference taken before use or held
by the caller. Taking a reference of a wild channel pointer may fail, if
the channel is either not opened or in a process of being closed. Also,
add safeguards for protecting accidental use of closed channels,
specifically, by setting ch->g = NULL in channel free. This will make it
obvious if freed channel is attempted to be used.
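The reference rules in this paragraph can be modelled roughly as follows. The sketch is single-threaded and uses hypothetical names (`channel_get()`, `channel_put()`, `channel_free()`, the `referenceable` flag); the real driver would use atomics or locks and would wait for the count to drop rather than return early.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical channel model. */
struct channel {
	bool referenceable; /* false when not opened or being closed */
	int ref_count;
	void *g;            /* set to NULL on free to expose stale use */
};

/* Taking a reference can fail for a closed or closing channel. */
struct channel *channel_get(struct channel *ch)
{
	if (ch == NULL || !ch->referenceable)
		return NULL;
	ch->ref_count++;
	return ch;
}

void channel_put(struct channel *ch)
{
	ch->ref_count--;
}

/*
 * Deny new references first, then free once no references remain.
 * Returns false if references are still held.
 */
bool channel_free(struct channel *ch)
{
	ch->referenceable = false;
	if (ch->ref_count != 0)
		return false;
	ch->g = NULL;
	return true;
}
```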
The last user of a channel might be the deferred interrupt handler,
so wait for deferred interrupts to be processed twice in the channel
free procedure: once for providing last notifications to the channel
and once to make sure there are no stale pointers left after referencing
to the channel has been denied.
Finally, fix some races in channel and TSG force reset IOCTL path,
by pausing the channel scheduler in gk20a_fifo_recover_ch() and
gk20a_fifo_recover_tsg(), while the affected engines have been identified,
the appropriate MMU faults triggered, and the MMU faults handled. In this
case, make sure that the MMU fault does not attempt to query the hardware
about the failing channel or TSG ids. This should make channel recovery
more safe also in the regular (i.e., not in the interrupt handler) context.
Bug 1530226
Bug 1597493
Bug 1625901
Bug 200076344
Bug 200071810
Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/448463
(cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353)
Reviewed-on: http://git-master/r/755147
Reviewed-by: Automatic_Commit_Validation_User
| |
Pass host1x device pointer for below APIs :
nvhost_get_syncpt_host_managed()
nvhost_syncpt_put_ref_ext()
Also, compose a name for syncpoints in GPU driver itself.
This name will be created as combination of device name
and channel index
Bug 1611482
Change-Id: Id1ddd0e87e0272ddb0758713d6b6c2544bc36bf4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/751908
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Tested-by: Arto Merilainen <amerilainen@nvidia.com>
| |
With the addition of the buddy allocator often times push buffers are
allocated by the kernel in high GVA memory regions. These addresses,
when written into a GPFIFO entry, have bits set in entry1 of the GPFIFO
command.
As a result, if no length is set, then these address bits will be
interpreted as opcodes by the GPU. The bug fixed by this patch was
caused by a wait_cmd being inserted into the GPFIFO with an address
of a pushbuffer above 4GB and a zero length. This occurred because
the code that creates the wait_cmd was able to return what appeared
to be a valid priv_cmd_entry even though there was nothing in that
command.
This bug does not appear before the buddy allocator because the FFF
allocator always starts allocating from low addresses. As such when a
channel's GPFIFO is allocated it gets an address that fits in 32 bits. Then,
because no higher address bits are set, entry1 of the GPFIFO is simply
0 and the GPU treats the command as a no-op.
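The effect of a high address with a zero length can be seen in a small encoding sketch. The field layout here is an assumption for illustration only (low address word in entry0, high address bits in the low bits of entry1, length shifted by `GP_ENTRY1_LENGTH_SHIFT`); consult the hardware headers for the real definition.

```c
#include <stdint.h>

/* Assumed bit position of the length field, for illustration only. */
#define GP_ENTRY1_LENGTH_SHIFT 10

struct gpfifo_entry {
	uint32_t entry0;
	uint32_t entry1;
};

/* Pack a push buffer address and its length in words into one entry. */
struct gpfifo_entry make_gpfifo_entry(uint64_t gpu_va, uint32_t len_words)
{
	struct gpfifo_entry e;

	e.entry0 = (uint32_t)(gpu_va & 0xffffffffu);
	e.entry1 = (uint32_t)(gpu_va >> 32) |
		   (len_words << GP_ENTRY1_LENGTH_SHIFT);
	return e;
}
```

With a zero length, an address below 4GB leaves entry1 at 0, while an address above 4GB leaves stray address bits in entry1, which is the situation the message describes.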
Change-Id: I9c1e600c368b55626e99f6f712f1821148bbb76d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/752080
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Drop host1x syncpoint refcount with nvhost_syncpt_put_ref_ext()
instead of freeing it directly
Bug 1646883
Change-Id: Ib213e58031a9302e683f8d13ebb4e1f913206464
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/747150
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
| |
Resetup RAMFC once sync point id is allocated for a channel.
Change-Id: Idbac406bea1c94c89ef587dda08fddc740c1fadb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/711302
Reviewed-on: http://git-master/r/737526
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
| |
Below check assumes that available syncpt range
starts from 0
id >= nvhost_syncpt_nb_pts_ext()
Instead of using this API, use nvhost_syncpt_is_valid_pt_ext()
which validates the syncpt id against both upper and lower
boundaries
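The difference between the two checks can be sketched as below. `struct syncpt_range` and both functions are hypothetical models, not the nvhost API; they only illustrate why an upper-bound-only test is wrong when the valid ids do not start at 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the usable syncpoint id range. */
struct syncpt_range {
	uint32_t base;  /* first valid id */
	uint32_t count; /* number of valid ids */
};

/* Old-style check: upper bound only, assumes the range starts at 0. */
bool syncpt_id_below_count(const struct syncpt_range *r, uint32_t id)
{
	return id < r->count;
}

/* Check against both the lower and the upper boundary. */
bool syncpt_id_is_valid(const struct syncpt_range *r, uint32_t id)
{
	return id >= r->base && id - r->base < r->count;
}
```

For a range starting at id 64, the old-style check would accept id 10 even though it lies below the range.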
Bug 1611482
Change-Id: I7c4465a2bc84b63fefaa17c64f02582885924c5e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/711211
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
| |
Change-Id: I8753e47ef4d3f4b3645ed6c6e604449d81d3da4b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/709061
(cherry picked from commit cc07f316334b88cc18070fba9dd9149ba193bd38)
Reviewed-on: http://git-master/r/707980
| |
This must have occurred while rebasing dev-kernel-3.10
over kernel 3.18.
This change corrects the mistake.
Change-Id: I11fbc11105a032198828e8bc31da5ab92af0ffdb
Signed-off-by: Ishan Mittal <imittal@nvidia.com>
Reviewed-on: http://git-master/r/720240
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
| |
Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
| |
This reverts commit 8eefb93c21934b101d7f423c38d9ea384a45fad6.
Bug 1585422
Change-Id: I217e0ffe6c230ee3c63d9aec1c48ce9c41770468
Signed-off-by: Timo Alho <talho@nvidia.com>
Reviewed-on: http://git-master/r/659426
| |
gm20b has more channels than sync points. We use aggressive reclaim
of sync points to offset that. Disable aggressive reclaim for gk20a
because it is not needed there.
Bug 1583849
Change-Id: I2a74b0504150a54cb8a97016effe20c5d905ac95
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/657095
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
| |
Fix below sparse warnings :
warning: Using plain integer as NULL pointer
warning: symbol <variable/function> was not declared. Should it be static?
warning: Initializer entry defined twice
Also, remove dead functions
Bug 1573254
Change-Id: I29d71ecc01c841233cf6b26c9088ca8874773469
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/593363
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
| |
Using KEPLER_C for doing sync point increment has side effects.
It adds a SetObject method, which changes channel state that not all
user space accounts for.
Bug 1462255
Bug 1497928
Bug 1559462
Change-Id: I5c422ad8ca94fba15cad9bd232f7a10d94aa0973
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/554478
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
| |
This patch fixes a refcounting issue in semaphore handling.
Change-Id: I03327c60ed6923a90663f0b845566e81af4b94d4
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/453056
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
When moving compression state tracking and compbit management ops to
kernel, we need to attach a fence to dma-buf metadata, along with the
compbit state.
To make in-kernel fence management easier, introduce a new gk20a_fence
abstraction. A gk20a_fence may be backed by a semaphore or a syncpoint
(id, value) pair. If the kernel is configured with CONFIG_SYNC, it will
also contain a sync_fence. The gk20a_fence can easily be converted back
to a syncpoint (id, value) pair or sync FD when we need to return it to
user space.
Change gk20a_submit_channel_gpfifo to return a gk20a_fence instead of
nvhost_fence. This is to facilitate work submission initiated from
kernel.
Bug 1509620
Change-Id: I6154764a279dba83f5e91ba9e0cb5e227ca08e1b
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/439846
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
gm20b/gm20x requires incrementing syncpoints twice to ensure that
the data has reached memory in all cases. This patch modifies
increment push buffer to account for this requirement.
Bug 1491360
Change-Id: I5c2899b26ce0e1cdf9408bb9aaa576fc3054480f
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/437675
Reviewed-by: Automatic_Commit_Validation_User
| |
gk20a_busy() call in channel_syncpt_incr() and corresponding
gk20a_idle() call in channel_update() are redundant since they
are already encapsulated inside another pair of busy/idle calls
This busy/idle pair will be called only from submit_gpfifo()
and submit_gpfifo() already has its own busy/idle which it
preserves for whole path and hence this redundant pair can be
removed
Also, this prevents a deadlock scenario while do_idle() is in
progress as follows :
- in submit_gpfifo() we call first gk20a_busy() which acquires
busy read semaphore
- in do_idle() we acquire busy write semaphore and wait for
current jobs to finish
- now submit_gpfifo() encounters second gk20a_busy() and requests
busy read semaphore again
- this results in deadlock where do_idle() is waiting for
submit_gpfifo() to complete and submit_gpfifo() is waiting for
busy lock held by do_idle() and hence it cannot complete
bug 1529160
Change-Id: I96e4368352f693e93524f0f61689b4447e5331ea
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/434191
(cherry picked from commit c4315c6caa42bab72ba6017c7ded25f4e9363dec)
Reviewed-on: http://git-master/r/435132
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Tested-by: Sachin Nikam <snikam@nvidia.com>
| |
Add semaphore implementation of the gk20a_channel_sync interface.
Each channel has one semaphore pool, which is mapped as read-write to
the channel vm. We allocate one or two semaphores from the pool for each
submit.
The first semaphore is only needed if we need to wait for an opaque sync
fd. In that case, we allocate the semaphore, and ask GPU to wait for
its value to become 1 (semaphore acquire method). We also queue a
kernel work that waits on the fence fd, and subsequently releases the
semaphore (sets its value to 1) so that the command buffer can proceed.
The second semaphore is used on every submit, and is used for work
completion tracking. The GPU sets its value to 1 when the command buffer
has been processed.
The channel jobs need to hold references to both semaphores so that
their backing semaphore pool slots are not reused while the job is in
flight. Therefore gk20a_channel_fence will keep a reference to the
semaphore that it represents (channel fences are stored in the job
structure). This means that we must diligently close and dup the
gk20a_channel_fence objects to avoid leaking semaphores.
Bug 1450122
Bug 1445450
Change-Id: Ib61091a1b7632fa36efe0289011040ef7c4ae8f8
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/374844
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Allow suppressing WFI when submitting work and requesting a fence
back.
Bug 1491545
Change-Id: Ic3d061bb4f116cf7ea68dbd6a1b2ace9f11d0ab5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/390457
| |
nvhost_get_syncpt_host_managed() creates syncpt name based on
platform_device pointer passed to it
Passing host1x's pointer to this API results in setting gk20a
syncpt names as "host1x_0" which is conflicting
Hence to restore this pass gk20a's device pointer
which gives syncpt names as "gk20a.0_0"
Also, add a validity check for the syncpt received.
Bug 1305024
Change-Id: I4ff96c7c9ebff2dca385c5787a85b4a9451b9514
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/410121
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Add gk20a as a sub power domain of host1x. This enforces keeping
host1x on when using gk20a.
Bug 200003112
Change-Id: I08db595bc7b819d86d33fb98af0d8fb4de369463
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/407006
(cherry picked from commit 009812b3e510518740e9c7e89b8b8b80439fe26a)
Reviewed-on: http://git-master/r/408013
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
This reverts commit 20d48a759b032116e3092e1df76518065da59879.
Change-Id: I93718a314b70ee9284a83ca69964883e670ad78d
Signed-off-by: Matt Pedro <mapedro@nvidia.com>
Reviewed-on: http://git-master/r/407969
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Currently the gpu driver assumes that the GPU is a child of host1x.
This is an invalid assumption and therefore we need to get the host1x
device from device tree based on nvidia,host1x property.
Bug 1311528
Bug 1434573
Change-Id: I097e39369aaa15ab6652cd23f353f88f7c2b9c48
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/395664
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Set the flag syncpt_aggressive_destroy to enable
destroying syncpts aggressively
Bug 1305024
Change-Id: Iedff9c6bdb6bbe02057972733126ce685daa8d7f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/400234
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Add gk20a as a sub power domain of host1x. This enforces keeping
host1x on when using gk20a.
Bug 200003112
Change-Id: I08db595bc7b819d86d33fb98af0d8fb4de369463
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/407543
Reviewed-by: Riham Haidar <rhaidar@nvidia.com>
Tested-by: Riham Haidar <rhaidar@nvidia.com>
| |
Bug 1503225
Change-Id: I52fd660de9bd251ceb936ad4edc34359753a0074
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/399460
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Set syncpt_aggressive_destroy = true and enable gpu channels'
syncpt free at channel_unbind() time.
This is a more aggressive level at which to free a syncpt.
Bug 1305024
Change-Id: I20296590454fcbf6556c5bd08b7e47156f7a1e65
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/395154
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Fix regression caused by commit 67fa249b419d32bfd0873fe5d924f4f01d9048de
"video: tegra: host: Abstract gk20a channel synchronization".
The above change unintentionally modified the channel synchronization
logic so that an nvhost interrupt handler was scheduled also when idling
the channel in gk20a_channel_submit_wfi. That appears to cause
intermittent hangs when running CUDA tests.
Bug 1484824
Change-Id: I4a1f85dd9e6215350f93710a2be9b0bbaef24b8f
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/394127
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
add valid syncpt id checks when syncpt id is
extracted from fence fd
Bug 1448825
Change-Id: I0f1722aad60e7644b8f490f24cf18a3b80f8583c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/390572
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
nvhost_get_syncpt_host_managed() creates syncpt name based on
platform_device pointer passed to it
Passing host1x's pointer to this API results in setting gk20a
syncpt names as "host1x_0" which is conflicting
Hence to restore this pass gk20a's device pointer
which gives syncpt names as "gk20a.0_0"
Bug 1305024
Change-Id: I40325f2e4e2d9ea8de1d44e136edcb48a431e45c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/389671
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
This patch moves the NVIDIA GPU driver to a new location.
Bug 1482562
Change-Id: I24293810b9d0f1504fd9be00135e21dad656ccb6
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/383722
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|