| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_channel_abort(), we disable the channel and
adjust the fences. And then we call gk20a_channel_update()
only if we released the post-fence semaphore
Instead of this, call gk20a_channel_update() always to ensure
that all the clean up after fence completion is done always
Bug 1683059
Change-Id: Ife00ba43e808c0833264d79c98c74417ffadf802
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/842999
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When closing channel, disable and preempt it immediately instead of
waiting for it to finish all work.
Bug 1683059
Change-Id: Ia5f5fc6a072dc3ddb1e9bf63534814ff0a60b5b4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/836746
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Copy the necessary ctx patching prologues and epilogues from patch_write
to gr_gk20a_exec_ctx_ops next to mapping of gr ctx, around a loop that
applies patches multiple times, in order to optimize the number of
maps/unmaps on the channel's patch context.
Bug 200075565
Change-Id: I7125be5c778192d639f0bbed1731bb900c7015da
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
(cherry picked from commit TODO-FILL-THIS-WHEN-MERGED)
Reviewed-on: http://git-master/r/839203
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 1706457
Change-Id: Iab76bcb7cabc55d99b5acd932716d30da6f01b46
Signed-off-by: Chris Dragan <kdragan@nvidia.com>
Reviewed-on: http://git-master/r/835852
Reviewed-on: http://git-master/r/836454
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I8d64c36d569e79cad3648bad248624290319ac2d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/839367
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we create sync_fence (from nvhost_sync_create_fence())
for every submit
But not all submits request for a sync_fence.
Also, nvhost_sync_create_fence() API takes about 1/3rd of the total
submit path.
Hence to optimize, we can allocate sync_fence
only when user explicitly asks for it using
(NVGPU_SUBMIT_GPFIFO_FLAGS_FENCE_GET &&
NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE)
Also, in CDE path from gk20a_prepare_compressible_read(),
we reuse existing fence stored in "state" and that can
result into not returning sync_fence_fd when user asked
for it
Hence, force allocation of sync_fence when job submission
comes from CDE path
Bug 200141116
Change-Id: Ia921701bf0e2432d6b8a5e8b7d91160e7f52db1e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/812845
(cherry picked from commit 5fd47015eeed00352cc8473eff969a66c94fee98)
Reviewed-on: http://git-master/r/837662
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently call check_gp_put() and update_gp_get()
in submit path and this takes about 5uS for both checks
check_gp_put() - 3.5 uS
update_gp_get() - 1.5 uS
But this book-keeping can be moved to gk20a_channel_update()
to save some submit time
Note that check_gp_put() needs to be done inside submit
lock
Bug 200141116
Change-Id: I276400111be0421eb673695e2f2899ff52e344b4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/839232
(cherry picked from commit 289617e8bf01bde9aab45dfa3a1c6a1241e6eb78)
Reviewed-on: http://git-master/r/839713
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Add tile caching related registers to access map.
Bug 1692373
Change-Id: I4516812dd571bed3be2dfa2b210abe3177e794fe
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/812354
|
|
|
|
|
|
|
|
| |
Bug 1692373
Change-Id: Ie3fc3e02fa7b0636da464d6ee1c28da7a4543ec2
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/812353
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we could not find the channel for GR interrupt we skipped
resetting it. Change the logic to do the reset, and find the channel
via FIFO engine status.
Bug 1706962
Change-Id: I8a0d283b36d88ea1cec541dbf90db7929104ad7c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/839465
|
|
|
|
|
|
|
|
|
|
|
|
| |
SM locking & register reads Order has been changed.
Also, functions have been implemented based on gk20a
and gm20b.
Change-Id: Iaf720d088130f84c4b2ca318d9860194c07966e1
Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com>
Signed-off-by: ashutosh jain <ashutoshj@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/837236
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
JIRA VFND-1005
Bug 1594604
Change-Id: Ic159a1aff9cee508194f1f5dff7a16eb0e47ad64
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/833498
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch modifies TegraDRM to compile on NVIDIA downstream kernel.
Currently buffers are pinned to virtual drm device. In upstream this
is ok since the buffers will be remapped into the TegraDRM maintained
IOMMU domain. However, NVIDIA downstream kernel relies on DMA mapping
API and hence requires that the buffers are pinned to the real
hardware device.
In addition, this patch modifies code to use downstream
power-management APIs if the downstream option is enabled.
Bug 1698151
Change-Id: I1cb7a1a0ad0ae767c48bfecb608d899e484d6b40
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/823640
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable
watchdog per-channel
Also, if watchdog is disabled, we currently schedule
the worker with MAX timeout.
Instead of this, do not schedule any worker if
watchdog is disabled
Bug 1683059
Bug 1700277
Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/837962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix possible memory leak in error condition
Coverity id : 20022
Bug 1703084
Change-Id: Ie1085a6e9c206ba0d71fe6677986df2a357bb5d1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/838776
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
bug 200153970
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Change-Id: Ia5f616269bfeb834540bf4da6ecfc6e399682819
Reviewed-on: http://git-master/r/836966
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- operands not affecting result (id = 12845)
- logically dead code (id = 12890)
- dereference after null check (id = 12968)
- unsigned compared to 0 (id = 13176)
- resource leak (id = 13338, 18673)
- unused pointer value (id = 13916)
Bug 1703084
Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/835143
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- UCODE WAR to disable ELCG during a brief
time instant during ELPG entry and exit.
- UCODE app version - 20120791
Bug 1696192
Change-Id: Ia6ddf5cd86f3024d40dfa75ec610ba0d1dd4f1fe
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/835894
(cherry picked from commit 7bb55ae2b1a59f062f2875d1eebd113d66c2af14)
Reviewed-on: http://git-master/r/836577
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new operaton g->ops.mm.set_debug_mode and let other places
that set debug mode call this callback.
It's preparing for adding vgpu set mmu debug mode hook.
JIRA VFND-1005
Bug 1594604
Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/833497
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
correct register settings for both set mmu debug mode and
set sm debug mode.
JIRA VFND-1005
Bug 1594604
Change-Id: I1d4b1d4b4cdd9d24d3b00481e0e22c4217f5a4b3
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/833490
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently set "aggressive_destroy" flag to destroy
sync object statically and for each sync object
Move this flag to per-platform structure so that it
can be set per-platform for all the sync objects
Also, set the default value of this flag as "false"
and set it to "true" once we have more than 64
channels in use
Bug 200141116
Change-Id: I1bc271df4f468a4087a06a27c7289ee0ec3ef29c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/822041
(cherry picked from commit 98741e7e88066648f4f14490c76b61dbff745103)
Reviewed-on: http://git-master/r/835800
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create constant name for gpu debugfs node across all chip.
Bug n/a
Change-Id: I359b82b5389c49d8fe2a31ace49ff6daa1edfb10
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/805397
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
(cherry-picked from commit 17a3882cde09412c68f7a0ee4765f45be1a51c45)
Reviewed-on: http://git-master/r/817014
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently allocate private command buffers (wait_cmd
and incr_cmd) before submitting the job but we never
free them explicitly.
When private command queue of the channel is full, we
then try to recycle/remove free command buffers.
But this recycling happens during submit path, and
hence that particular submit path takes much longer
Rework this as below :
- add reference of command buffers to job structure
- when job completes, free the command buffers
explicitly
- remove the code to recycle buffers since it should
not be needed now
Note that command buffers need to be freed in order of
their allocation. Ensure this with error print before
freeing the command buffer entry
Bug 200141116
Bug 1698667
Change-Id: Id4b69429d7ad966307e0d122a71ad55076684307
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/827638
(cherry picked from commit c6cefd69b71c9b70d6df5343b13dfcfb3fa99598)
Reviewed-on: http://git-master/r/835802
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In job submission path, we always take refcount on all
the mapped buffers to safeguard against case where user
space releases the buffer early
But in case user space itself is doing proper buffer
management, kernel need not take refcounts on all the
buffers - which is also a overhead in submit path
Hence, provide a new submit flag
NVGPU_SUBMIT_GPFIFO_FLAGS_SKIP_BUFFER_REFCOUNTING to
optionally skip taking refcounts on all the buffers
Also, if we do not take refcounts, then no need to drop
any refcounts in gk20a_channel_update() as well
Bug 1698667
Bug 200141116
Change-Id: I81bb7a03240300b691c70bcec04ea1badd5934f4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/824718
(cherry picked from commit 8c8978fa303ec4e6db0233becdbdcbad4a248173)
Reviewed-on: http://git-master/r/835801
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In GPU job submit path gk20a_ioctl_channel_submit_gpfifo(),
we currently allocate a temporary gpfifo, copy user space
gpfifo content into this temporary buffer, and then copy
temp buffer content into channel's gpfifo.
Allocation/copy/free of temporary buffer adds additional
overhead
Rewrite this sequence such that gk20a_submit_channel_gpfifo()
can receive either a pre-filled gpfifo or pointer to
user provided args.
And then we can direclty copy the user provided gpfifo
into the channel's gpfifo
Also, if command buffer tracing is enabled, we still need
to copy user provided gpfifo into temporaty buffer for reading
But that should not cause overhead in real world use case
Bug 200141116
Change-Id: I7166c9271da2694059da9853ab8839e98457b941
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/823386
(cherry picked from commit 3e0702db006c262dd8737a567b8e06f7ff005e2c)
Reviewed-on: http://git-master/r/835799
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 200150865
Change-Id: If4f0e01bdeb95c303675b63444bd497b65d934f3
Signed-off-by: Ari Hirvonen <ahirvonen@nvidia.com>
Reviewed-on: http://git-master/r/835151
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update scaling driver to support to differnt
clock frameworks.
Bug 200147662
Reviewed-on: http://git-master/r/816929
(cherry picked from commit cbd4cb575fb2d27870089797ff2a8f22540b87e8)
Change-Id: Ie50304b4a593d74bd43b271005cc9616fdb52a6e
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/834748
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds wrappers to use nvhost functions even if the
upstream host1x driver was used. In this case the functions
are re-routed to corresponding functions in the host1x driver.
Bug 1698151
Change-Id: Id49ea78f925e9b14e54e67743cac42610b3ecbeb
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/823638
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
enables creating userspace-managed GPU address spaces.
When an address space is marked as userspace-managed, the following
changes are in effect:
- Only fixed-address mappings are allowed.
- VA space allocation for fixed-address mappings is not required,
except to mark space as sparse.
- Maps and unmaps are always immediate. In particular, the mapping
ref increments at kickoffs and decrements at job completion are
skipped.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/738558
Reviewed-on: http://git-master/r/833253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we are handlling a fake mmu fault, then mention
so in the error print
Bug 1696529
Change-Id: Ife0106c3b83d18b315fc08dafadeca0129187624
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/828460
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Do ZBC updates without forcing engine idle first.
Bug 1698013
Change-Id: I99218c8cfd02be05dace2003b8d91921765f7ca9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/829145
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used
to identify buffers and retrieve their sizes. This allows the
userspace to be agnostic to the dmabuf implementation, as the generic
dmabuf fd interface does not have a reliable way for buffer
identification.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/822845
Reviewed-on: http://git-master/r/833252
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It fixed kernel dump when run CUDA L0 test.
Bug 1594604
Change-Id: Ic986b34629052e915f4ccc5a5b6df198afaf2ff9
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/831391
(cherry picked from commit 43d4ba4d6ffc6043e8425dc40967975afe3a95f1)
Reviewed-on: http://git-master/r/832416
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 1695718
Change-Id: I595694e4a92ccaf8b8cd389e215d28709ed98c50
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/827829
(cherry picked from commit be5397109404b632c51a57758ee0dcdd2d2f5cc7)
Reviewed-on: http://git-master/r/831556
GVS: Gerrit_Virtual_Submit
Reviewed-by: Kirill Artamonov <kartamonov@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add dbg_map debug spew for all mapping calls. This plugs the hole where
kernel mappings were not logged, because the debug log is added only in
ioctl path.
Change-Id: I036bf41f92ba5b612d32805020ca7a16fe54f9f4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/812288
(cherry picked from commit c37b2892d6d967ad48076b20e5a9ef97dc600b31)
Reviewed-on: http://git-master/r/831333
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allocate a separate VM for CDE channels instead of using the system
(PMU) vm, and make it much bigger than the PMU's to fit the maximum
number of CDE channels there.
Bug 1566740
Change-Id: I4f487c40c9ec79cc9ffb880b0ecd3f47eb450336
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/815149
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
G_ELPG_FLUSH is protected in some chips. Use L2 flush operations
instead.
Bug 1698618
Change-Id: I984a8ace8bcd0ad2d4a4e2d63af75a342bdeb75a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/828656
(cherry picked from commit ba9075fa43975112a221d37d246f0b8f5af40fab)
Reviewed-on: http://git-master/r/829415
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit c44947b1314bb2afa1f116b4928f4e8a4c34d7b1. It
introduces a regression in T124.
Bug 1702063
Change-Id: I64e333f66d98bd4dbcfe40a60f1aa825d90376a5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/830786
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of depending on clock frame-work, use platform data
for ptimer source rate. Removed ptimerscaling10x platform
data, and use ptimer source frequency to calculate
ptimerscaling factor.
Reviewed-on: http://git-master/r/819030
(cherry picked from commit dd291334d54dab80cab7eb1656dffc48a59610b4)
Change-Id: I7638ce9875a6e440bbfc2ba2da0d0b094b2700ff
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/827300
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new IOCTL NVGPU_IOCTL_TSG_SET_PRIORITY to allow
setting timeslice for entire TSG
Return error from channel specific IOCTL_CHANNEL_SET_PRIORITY
if the channel is part of TSG
Separate out API gk20a_channel_get_timescale_from_timeslice()
to get timeslice_timeout and scale from timeslice period
Use this API to get timeslice_timeout and scale for TSG and
store it in tsg_gk20a structure
Then trigger runlist update so that new timeslice values
will be re-written to runlist for TSG
Bug 200146615
Change-Id: I555467d034f81b372b31372f0835d72b1c159508
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/824206
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add IOCTL NVGPU_DBG_GPU_IOCTL_TIMEOUT to support
disabling/re-enabling scheduler timeout from user space
If user space application is closed without re-enabling
the timeouts, kernel will restore the timeouts' state
while releasing the debug session
This is needed for debugging purpose
Bug 1514061
Change-Id: I32efb47ad09d793f3e7fd8f0aaa9720c8bc91272
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/788176
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
| |
Change-Id: Idfeb3bf95751902d52a895d77045a529f69abc0b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/758651
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add required fileds and values for thermal slow-down
settings in thermal header file and implemented chip
specific thermal register programming
Reviewed-on: http://git-master/r/822199
(cherry picked from commit 9e8a745b8295af002b9780c83caa8dc7b22cc737)
Change-Id: I016b18ed230fa6c104eada2e166ccd1a5f2ace36
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/823012
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Get rid of the duplicate gpfifo struct to emphasize the fact that
nvgpu_gpfifo is the only memory layout for gpfifo entries that works.
This is the same layout that HW uses.
Also, add a local pointer to the gpfifo memory in
gk20a_submit_channel_gpfifo to get rid of repeated typecasts.
Bug 1592391
Bug 1550886
Change-Id: I5432859ef8e7c1aab5907e44098994d7bb807f50
Signed-off-by: Janne Hellsten <jhellsten@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/677341
(cherry picked from commit 724c8c6228af81dd440e825bddf545dd6b2b8bd7)
Reviewed-on: http://git-master/r/822548
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add sysfs node "railgate_enable" to enable
gpu railgating dynamically.
Bug 1552469
Reviewed-on: http://git-master/r/804746
(cherry picked from commit 53d76d1d1576a96c70f66b744411d2909ec8414f)
Change-Id: I69ac63958b00ae4ea5d9ccbaed03316d35ddc5eb
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/822208
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_do_idle(), if can_railgate = false, we do
reset_assert() on the GPU
But asserting reset might have dependencies on
specific h/w seqeunce for ensuring proper reset
Hence avoid calling reset_assert() and directly
call platform specific unrailgate() and railgate()
APIs which take care of the correct reset sequence
Bug 200142989
Bug 200137963
Bug 1678611
Change-Id: Ide886dd88b8422ad36de52d54378b1edd9c7bbd6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/820322
(cherry picked from commit d621ddd49da976a75c14aa7aaa37f700fb4e83f2)
Reviewed-on: http://git-master/r/822515
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we unrailgate the GPU if railgating is not enabled
or pm_domains are not enabled
But in case if railgating is not enabled and pm_domains
are enabled, we explicitly unrailgate GPU in gk20a_pm_init()
and then runtime PM unrailgates it again when first user
space request arrives - setting unrailgate refcount to 2
Now for gk20a_do_idle(), we need to railgate the GPU in
fist call but that does not happen since unrailgate
refcount != 1
hence, in case railgating is not enabled, we should
unrailgate the GPU from only one place i.e. when first user
space request arrives
Bug 200142989
Bug 200137963
Bug 1678611
Reviewed-on: http://git-master/r/820321
(cherry picked from commit 452a1ff8da8e3f47caed2371440f9ad150bf8699)
Change-Id: Ic9fe2267c9df5629315c30c1404c2b3044c1265a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/822296
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At zcull bind we disable whole GR engine. This is unnecessary, so
instead disable only the channel and make sure it's unloaded.
Introduces also an API in fifo_gk20a.c to do the channel disable.
gr_gk20a_ctx_zcull_setup() was always passed true as last parameter,
so remove parameter.
Change-Id: I7ae6e101ec7d1ab3f6ee4e9bcc442d23dbd21247
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/787570
|
|
|
|
|
|
|
|
|
|
|
| |
Protect creation and deletion of sync by an own mutex. This prevents
deadlock in channel abort when abort is called from submit path.
Bug 200147887
Change-Id: I5d6308b773c1d1a6a89d4590e2e74c74d691f79d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/821127
|
|
|
|
|
|
|
|
|
| |
Program clock slowdown to happen using gradual slowdown. It is
significantly faster than the default slowdown.
Change-Id: I9e5259889637fce2c0b083a424b54af12bb45c25
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/819698
|