| Commit message | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Export the gpu's safe fmax_at_vmin frequency so it can be
queried from userspace (e.g. PHS).
Bug 1566108
Change-Id: I47326588ebd443f189a6051edbf95b35b35636d1
Signed-off-by: Anders Kugler <akugler@nvidia.com>
Reviewed-on: http://git-master/r/743501
(cherry picked from commit a977495878a486ca45c7de969582fd9ea949b0f0)
Reviewed-on: http://git-master/r/753279
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For both adding and querying zbc entry, added callbacks in gr ops.
Native gpu driver (gk20a) and vgpu will both hook there. For vgpu, it
will add or query zbc entry from RM server.
Bug 1558561
Change-Id: If8a4850ecfbff41d8592664f5f93ad8c25f6fbce
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/732775
(cherry picked from commit a3787cf971128904c2712338087685b02673065d)
Reviewed-on: http://git-master/r/737880
(cherry picked from commit fca2a0457c968656dc29455608f35acab094d816)
Reviewed-on: http://git-master/r/753278
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The packaging of register's value in 64 bit variable
needs the reversal of 32-bit-word.
Bug 200083334
Change-Id: Id938f2a2fcffc90ef135ae963ae288c9a655069a
Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com>
Reviewed-on: http://git-master/r/744455
(cherry picked from commit dfd3a752ea6a0943be499410010a176756221593)
Reviewed-on: http://git-master/r/753277
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
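
The word reversal described in the commit above can be illustrated with a minimal user-space sketch. The helper names `swap_words64` and `pack_reg64` are hypothetical and are not taken from the driver; the sketch only shows what swapping the two 32-bit words of a 64-bit value means.

```c
#include <stdint.h>

/* Swap the upper and lower 32-bit words of a 64-bit value. */
static inline uint64_t swap_words64(uint64_t v)
{
	return (v << 32) | (v >> 32);
}

/*
 * Pack two 32-bit register values into one 64-bit variable.
 * Hypothetical helper: "first" ends up in the upper word.
 */
static inline uint64_t pack_reg64(uint32_t first, uint32_t second)
{
	return ((uint64_t)first << 32) | second;
}
```

If a consumer expects the opposite word order, `swap_words64(pack_reg64(a, b))` yields the same result as `pack_reg64(b, a)`.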
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the allocation size is 4k or below, we should use kmalloc. vmalloc
should be used only for large allocations.
Introduce nvgpu_alloc, which checks the size and decides which API
to use.
Change-Id: I593110467cd319851b27e57d1bfe8d228d3f2909
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743974
(cherry picked from commit 7f56aa1f0ecafbfde7286353b60e25e494674d26)
Reviewed-on: http://git-master/r/753276
Reviewed-by: Automatic_Commit_Validation_User
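
The size check described above can be sketched in user space as follows. This is not the nvgpu_alloc implementation: the names `pick_backend`, `BACKEND_KMALLOC`, `BACKEND_VMALLOC` and `SMALL_ALLOC_LIMIT` are hypothetical, and the 4096-byte threshold is an assumption read from the "4k or below" wording of the commit message.

```c
#include <stddef.h>

/* Assumed threshold, from "4k or below" in the commit message. */
#define SMALL_ALLOC_LIMIT 4096u

/* Which kernel allocator a request would be routed to. */
enum alloc_backend {
	BACKEND_KMALLOC,	/* small, physically contiguous */
	BACKEND_VMALLOC		/* large, only virtually contiguous */
};

/* Decide on the allocator purely from the requested size. */
static inline enum alloc_backend pick_backend(size_t size)
{
	return size <= SMALL_ALLOC_LIMIT ? BACKEND_KMALLOC : BACKEND_VMALLOC;
}
```

A wrapper of this kind also needs a matching free path that knows which allocator was used for a given pointer.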
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two minor cleanups of the cyclestats snapshot code:
1. An unacceptably small buffer passed as a cyclestats snapshot
caused a kernel panic during list element removal:
NvRmGpuTest_Channel_Cyclestats_Snapshot_Gen for 1 clients,
each has 4 KB mappings and 1 perfmons
[ 304.533073] Unable to handle kernel NULL .... address 00000008
[ 304.541825] pgd = ffffffc04fc9f000
[ 304.545277] [00000008] *pgd=0000000000000000
[ 304.549554] Internal error: Oops: 96000045 [#1] PREEMPT SMPa
....
[ 304.584978] PC is at css_gr_free_client_data+0x28/0xe4
[ 304.590105] LR is at gr_gk20a_css_attach+0x6e0/0x700
2. Improve the allocation of perfmon IDs.
Bug 1573150
Change-Id: I58b753434141bf573463563fdd699c11ea914943
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/751385
(cherry picked from commit e9314c29df3fb708a20fff58cfa64c2ead857b0f)
Reviewed-on: http://git-master/r/753275
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the kernel support code for cyclestats mode E.
Cyclestats mode E is implemented in user space following the Windows
design and requires the following operations to be implemented:
- attach a client to the shared hardware buffer of the device
- detach a client from the shared hardware buffer
- flush: copy the available data from the hardware buffer to the private
client buffers according to the perfmon IDs assigned to clients
- perfmon ID management for user-space clients
- a new NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability
Bug 1573150
Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/740653
(cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f)
Reviewed-on: http://git-master/r/753274
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pass the host1x device pointer to the following APIs:
nvhost_get_syncpt_host_managed()
nvhost_syncpt_put_ref_ext()
Also, compose the name for syncpoints in the GPU driver itself.
The name is a combination of the device name
and the channel index.
Bug 1611482
Change-Id: Id1ddd0e87e0272ddb0758713d6b6c2544bc36bf4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/751908
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Tested-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.
Bug 200112195
Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the addition of the buddy allocator, push buffers are often
allocated by the kernel in high GVA memory regions. These addresses,
when written into a GPFIFO entry, have bits set in entry1 of the GPFIFO
command.
As a result, if no length is set, then these address bits will be
interpreted as opcodes by the GPU. The bug fixed by this patch was
caused by a wait_cmd being inserted into the GPFIFO with an address
of a pushbuffer above 4GB and a zero length. This occurred because
the code that creates the wait_cmd was able to return what appeared
to be a valid priv_cmd_entry even though there was nothing in that
command.
This bug does not appear before the buddy allocator because the FFF
allocator always starts allocating from low addresses. As such, when a
channel's GPFIFO is allocated it gets an address below 32 bits. Then,
because no higher address bits are set, entry1 of the GPFIFO is simply
0 and the GPU treats the command as a no-op.
Change-Id: I9c1e600c368b55626e99f6f712f1821148bbb76d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/752080
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
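
The effect described above can be reproduced with a small user-space sketch. The field layout is an assumption made for illustration (low address word in entry0, high address bits in the low bits of entry1, length shifted by a hypothetical `LEN_SHIFT`); it is not copied from the hardware headers, and `make_entry` and `entry_is_submittable` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of the length field inside entry1. */
#define LEN_SHIFT 10

struct gpfifo_entry {
	uint32_t entry0;	/* low 32 bits of the pushbuffer address */
	uint32_t entry1;	/* high address bits plus length field */
};

/* Build an entry from a GPU virtual address and a length in words. */
static inline struct gpfifo_entry make_entry(uint64_t gpu_va, uint32_t len_words)
{
	struct gpfifo_entry e;

	e.entry0 = (uint32_t)gpu_va;
	e.entry1 = (uint32_t)(gpu_va >> 32) | (len_words << LEN_SHIFT);
	return e;
}

/* A zero-length command must not be placed into the GPFIFO. */
static inline bool entry_is_submittable(uint32_t len_words)
{
	return len_words != 0;
}
```

With a pushbuffer below 4GB and zero length, entry1 is 0; with a pushbuffer above 4GB and zero length, entry1 carries only address bits, which is the case the patch guards against.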
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes GPU pbdma interrupt to be generated.
Bug 200106514
Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pass flags parameter to channel_setup_ramfc for
indicating nvgpu_alloc_gpfifo_args characteristics.
Bug 1645628
Change-Id: Ia40b37c5c7b208d459aa84f1b022036dd5e1b599
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/744526
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create a HAL for waiting for GR to become quiet. Use it for all cases
where we require GR to be quiet, but where it does not need to be
idle.
Bug 1640378
Change-Id: Ic0222d595a2d049e0fa8864b069ab94a97fac143
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/745640
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
|
|
|
|
|
|
|
|
| |
Change-Id: I96282b4e047ba8b5369dac039f0f51856c69235b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/747935
(cherry-picked from commit 0bb090745b4122fc4149b1bd6026138a1b9a32bc)
Reviewed-on: http://git-master/r/749235
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Drop host1x syncpoint refcount with nvhost_syncpt_put_ref_ext()
instead of freeing it directly
Bug 1646883
Change-Id: Ib213e58031a9302e683f8d13ebb4e1f913206464
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/747150
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a GPU channel is closed immediately after opening without
performing any operation on it, we leak that channel
e.g. below command leaks a channel
echo > /dev/nvhost-gpu
Fix this leak by releasing the channel before returning
Change-Id: I2598e3cabec6996cb1cf8066a1e6d7d5864ae02b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/743235
(cherry picked from commit 428771509b4431ebe88e38061b495cabc5192327)
Reviewed-on: http://git-master/r/744279
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.
The original commit was actually fine.
Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_pm_shutdown(), we currently call __pm_runtime_disable(),
which prevents h/w access for new requests made after the shutdown() call.
Also, once gk20a_pm_shutdown() completes, platform code will
just rail gate the GPU.
But it is possible that some other thread is already accessing
h/w when shutdown() is triggered, and this can result in a hang.
Hence, wait until all currently executing jobs are finished
before returning from gk20a_pm_shutdown().
Also, we need to wait for the GPU's usage count to become 1, since
platform code will increase the usage count and then call
shutdown(). Hence a usage count of 1 indicates that the GPU is idle.
Bug 200099940
Change-Id: I1f2457829e2737c07302d13f355353a30c3b4e67
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/734920
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 200080684
Keep it disabled by default.
Also trim the code by removing the redundant
variable used to check recovery. The pmu quick wait
now checks only for irqs which are serviced
by the kernel. Also request the pmu to bit bang the gpccs
ucode.
Change-Id: I12ef23d6d59b507e86a129b69eab65b21d0438c6
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/729622
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function validate_fixed_buffer used to do a linear search for
collision detection of already mapped buffers. Optimize this by doing
a nice logarithmic search instead.
Change-Id: Ifbf2ec015741d44883da27bc6f8cc090c48da145
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/739682
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
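
A logarithmic collision check of the kind mentioned above can be sketched as follows. Note the swapped-in technique: this sketch uses a binary search over a sorted array of non-overlapping ranges, which is an assumption for illustration and not necessarily the data structure used by validate_fixed_buffer. The names `va_range` and `range_collides` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open mapped range [start, end). */
struct va_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true if [start, end) overlaps any entry of r[0..n).
 * Precondition: r is sorted by start and its entries do not overlap,
 * so the end values are sorted as well.
 */
static inline bool range_collides(const struct va_range *r, size_t n,
				  uint64_t start, uint64_t end)
{
	size_t lo = 0, hi = n;

	/* Binary search for the first range that ends after 'start'. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (r[mid].end <= start)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Only that range can be the first one to overlap. */
	return lo < n && r[lo].start < end;
}
```

The search takes O(log n) steps instead of visiting every mapped buffer.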
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit c18edd3686115ca0b7d8bb08b35f23264f865358.
Proper fixes for shutdown issue are being added with below changes
http://git-master/r/#/c/738509/
http://git-master/r/#/c/734920/
Hence revert this workaround
Bug 200099940
Change-Id: I74b29c804af2bdb9d95c6b93c5308a323575ae57
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/739082
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we execute pm_runtime_get_sync() and then
gk20a_scale_notify_busy() without checking the return value of
pm_runtime_get_sync().
If shutdown of the GPU has already been initiated, we get
a hard hang due to this, as per the sequence below:
- one thread invokes GPU shutdown and then forcibly rail
gates the GPU
- another thread (unaware of shutdown) calls gk20a_busy()
- since runtime PM is disabled in shutdown path,
pm_runtime_get_sync() fails
- but we still go on running gk20a_scale_notify_busy() which
tries to access some GPU registers and hangs
Fix this by jumping to failure path in case
pm_runtime_get_sync() fails
Bug 200099940
Change-Id: I022f2dfa9408f640fb44e6f4b10a437688779c0a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/738509
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
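
The control flow of the fix can be sketched in user space with stand-in functions. Everything prefixed with `fake_` is hypothetical and only mimics the roles of pm_runtime_get_sync(), gk20a_scale_notify_busy() and gk20a_busy() from the commit message; the -EACCES error value is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Minimal stand-in for the device state relevant to this fix. */
struct fake_gpu {
	bool runtime_pm_disabled;	/* set by the shutdown path */
	int usage_count;
	int scale_notifications;	/* counts register-touching calls */
};

/* Stand-in for pm_runtime_get_sync(): fails once runtime PM is disabled. */
static inline int fake_pm_runtime_get_sync(struct fake_gpu *g)
{
	if (g->runtime_pm_disabled)
		return -EACCES;
	g->usage_count++;
	return 0;
}

/* Stand-in for gk20a_scale_notify_busy(): would access GPU registers. */
static inline void fake_scale_notify_busy(struct fake_gpu *g)
{
	g->scale_notifications++;
}

/* Stand-in for gk20a_busy(): bail out before touching the hardware. */
static inline int fake_gpu_busy(struct fake_gpu *g)
{
	int ret = fake_pm_runtime_get_sync(g);

	if (ret < 0)
		return ret;	/* failure path: no register access */

	fake_scale_notify_busy(g);
	return 0;
}
```

After shutdown has disabled runtime PM, `fake_gpu_busy()` returns an error and the register-touching step is never reached.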
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the system is low on memory, kzalloc can fail if the
kernel requests a contiguous memory block larger than PAGE_SIZE.
Bug 200096099
Change-Id: I44e217ffa6aa6c453a4d4afba45a8ee3b5756cc1
Signed-off-by: Kerwin Wan <kerwinw@nvidia.com>
Reviewed-on: http://git-master/r/732197
(cherry picked from commit 62861976421415f93e98a0a9f977ac1f66046714)
Reviewed-on: http://git-master/r/737057
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Tested-by: Krishna Reddy <vdumpa@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Return zero for missing sgl (sgt is already checked) instead of
attempting to dereference NULL. Those NULL conditions should be almost
nonexistent, and zero is not normally used.
When reading gk20a_mem_phys() in gk20a_gr_get_chid_from_ctx() from an
isr, the mem desc may race with channel deletion and get suddenly
zeroed, even if the channel's in_use flag would be set. Plain zero
results in expected behaviour.
Change-Id: I7033979091951cba3e3004ddc7550cd327ad0baf
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/737759
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
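
The NULL handling described above can be sketched with simplified stand-in types. The `fake_` structures are hypothetical and only mirror the sgt/sgl nesting named in the commit message; they are not the kernel's scatterlist types.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the scatter-gather structures. */
struct fake_sgl {
	uint64_t phys;
};

struct fake_sgt {
	struct fake_sgl *sgl;
};

struct fake_mem_desc {
	struct fake_sgt *sgt;
};

/* Return the physical address, or 0 if the descriptor has been zeroed. */
static inline uint64_t fake_mem_phys(const struct fake_mem_desc *mem)
{
	/* Read each pointer once into a local before dereferencing it. */
	const struct fake_sgt *sgt = mem->sgt;
	const struct fake_sgl *sgl;

	if (!sgt)
		return 0;

	sgl = sgt->sgl;
	if (!sgl)
		return 0;

	return sgl->phys;
}
```

Callers can then treat 0 as "no context", since the commit message notes that zero is not normally a valid value.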
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info
on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS,
which enables mapping compbits to the GPU address space of said
buffers. This, subsequently, enables moving comptag swizzling from GPU
to CDEH/CDEV formats to userspace.
Compbits mapping is conservative and it may map more than what is
strictly needed. This is for two reasons: 1) mapping must be done
on small page alignment (4kB), and 2) GPU comptags are swizzled all
around the aggregate cache line, which means that the whole cache line
must be visible even if only some comptag lines are required from
it. Cache line size is not necessarily a multiple of the small page
size.
Bug 200077571
Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/719710
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
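
Why the mapping is conservative can be shown with a small arithmetic sketch. This is an assumption-based illustration, not the driver's algorithm: the requested byte range is first widened to whole cache lines and then to 4kB small pages. The names `compbits_window`, `align_down` and `align_up` are hypothetical, and the cache line size is a caller-supplied parameter.

```c
#include <stdint.h>

#define SMALL_PAGE_SIZE 4096u

/* Half-open byte window [start, end). */
struct window {
	uint64_t start;
	uint64_t end;
};

/* Round v down to a multiple of unit (unit need not be a power of two). */
static inline uint64_t align_down(uint64_t v, uint64_t unit)
{
	return v - v % unit;
}

/* Round v up to a multiple of unit. */
static inline uint64_t align_up(uint64_t v, uint64_t unit)
{
	return align_down(v + unit - 1, unit);
}

/* Widen [off, off + len) to whole cache lines, then to small pages. */
static inline struct window compbits_window(uint64_t off, uint64_t len,
					    uint64_t cacheline)
{
	struct window w;
	uint64_t cl_start = align_down(off, cacheline);
	uint64_t cl_end = align_up(off + len, cacheline);

	w.start = align_down(cl_start, SMALL_PAGE_SIZE);
	w.end = align_up(cl_end, SMALL_PAGE_SIZE);
	return w;
}
```

For example, with a hypothetical 6144-byte cache line, a request for 100 bytes at offset 7000 widens to the cache line [6144, 12288) and then to the page-aligned window [4096, 12288).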
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
o emc clock scaling (bug fix):
Take the gpu load into account for gpu frequencies less
than or equal to fmax @ Vmin.
Bug 1591643
Change-Id: I0298adfdd4b7111557907c3bd6022fd6005355f0
Signed-off-by: Anders Kugler <akugler@nvidia.com>
Reviewed-on: http://git-master/r/735846
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Allow querying and setting default betacb size via debugfs. For global buffers
the value takes effect upon first boot of GPU, and has no effect after that.
Bug 1628352
Change-Id: Ib63f4299249c41eab1b36cc501b525cc54211195
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/733328
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With CONFIG_PM_GENERIC_DOMAINS_OF enabled, device reboot
was hanging while shutting down gk20a. This
happened because genpd_dev_pm_detach() was railgating
gk20a while another thread was still accessing it.
So assign NULL to dev->pm_domain->detach for gk20a,
so that genpd_dev_pm_detach() is not called during gk20a
shutdown and therefore does not railgate it.
This patch will be reverted once we have clean shutdown
for gk20a.
Bug 200070810
Bug 200099940
Change-Id: Ie2e89ea01a98a9d4f2f68a3ab07b6923ffa374f6
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
Reviewed-on: http://git-master/r/735455
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Specified locking timeout and IDDQ exit delay as GK20A PLL parameters,
and used this data instead of hard-coded numbers.
Change-Id: I59e16ed11fdba6911f2751195d182e68aed96851
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: http://git-master/r/735481
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gr_*__set_alpha_circular_buffer_size() left max_batches field of
gr_pd_ab_dist_cfg1_r as 0 which results in too many alpha beta
transitions and poor performance when tessellation or geometry
shaders are used
Change-Id: If18feb1119e9672005455155dc56337cd444a1f1
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: http://git-master/r/735476
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The message "per-write ctx patch begin?" is a legacy message for warning
about probably inefficient code, but it's written at error loglevel.
Silence it out a bit by using gk20a_dbg_info(). The inefficient paths
can be fixed later.
Bug 200075565
Change-Id: Idae821aef3001ea5016de22a1a87fec747c42d31
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/734248
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The channel sync object can get deleted before all channel updates have
finished if the channel is freed before them, so work around a null
dereference by testing if the sync exists. Channel and/or c->sync
refcounting would be necessary for proper fix.
Bug 200076344
Change-Id: Ica8ef2df9cd95cfa593cd4f41768dbb6641357b2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/734266
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- pmu version 19494277 is from CL 19495746
- updated gpmu interface data struct with
respect to latest pmu ucode interface headers.
gpmuifpg.h - 19199047
gpmuifperfmon.h - 18238819
gpmuifpmu.h - 19199047
gpmuifacr.h - 19343196
gpmuifcmn.h - 19264862
rmflcnbl.h - 19317152
Bug 200085428
Change-Id: I7db56dcf5a3038b40da37a69e8723a2e9a652e4b
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/728461
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4b6f83704f054f5b21e05873fa5862c667a9992e tried to fix ACR related
leak. It fell short, because the data structures related were local
and thus the leak was not really fixed.
This patch stores the ACR ucode blob in a global variable, which
survives across rail gating.
Change-Id: Iec3ac9d41156baa26048e079732568c0a95264f4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/733732
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move the fifo engine activity disabling and wait-for-idle from the
lowest-level functions higher, into the ioctl path of zbc operations, so
that the sw initialization path wouldn't call them. During the init
path, the disable isn't necessary, and the code path could result in a
deadlock in the fifo runlist mutex.
Change-Id: Icf5c270ba29bc1c7f88874fba2d176d68e11278a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/733668
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added delays definitions to GPCPLL parameters structure:
- locking timeout delay (applied to locking in fixed frequency mode and
to PLL dynamic ramp in any mode)
- lock delay for GPCPLL NA mode
- IDDQ exit delay in any mode
Specified delay parameters for GM20B PLL, and used this data instead of
hard-coded numbers.
Change-Id: I63ce0abc9ee900c36ec34b8641513db3cbb6f7d5
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: http://git-master/r/732094
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Added GPU voltage debug print to the initial locking of GPCPLL under
bypass (available only when GPCPLL is in NA mode).
- Added /sys/kernel/debug/gpu.0/voltage debugfs node to read voltage
through GPCPLL (available only when GPCPLL is in NA mode).
Change-Id: I6643ad4d1b228ec4cbc4ff5e8716cce3ef9dccfc
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: http://git-master/r/731572
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes all direct access to the MC registers. This requires
that the MC be loaded before the GPU.
Bug 1540908
Change-Id: I90bcde62f65a0c0d73a2bbe92cbf4a980c671c7d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/453653
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Supriya Sharatkumar <ssharatkumar@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 259842f9d222dd2ca2e66bddaceef4a2fd626bc7.
The commit clears some init values that are never restored.
Change-Id: I4efee115863cbfb08b2e280a58b525cb49adc0b6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/732428
|
|
|
|
|
|
|
|
|
|
|
| |
The GPU does not need to be powered up if user space calls the kernel and
there is no new work to be done.
Bug 1623918
Change-Id: I531aa7033530ae652d13684d8f8568a0e05fc2e1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/732748
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix compile time error of missing argument when
ALLOCATOR_DEBUG is enabled
Bug 200095967
Change-Id: I600330f3a75cf777d9cd35ec1f00fdd926fba429
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/731320
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sri Krishna Chowdary <schowdary@nvidia.com>
Tested-by: Sri Krishna Chowdary <schowdary@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
__REGOPS_GK20A_H_ -> __REGOPS_GM20B_H_
Bug 1634208
Change-Id: Ic623563492c084162bfad10f895896d77b4192ed
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: http://git-master/r/729749
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While writing to sysfs "tpc_fs_mask", we need to have the
GPU initialized (we need to have called gk20a_busy()
at least once before).
If this has not happened yet, return an error.
Bug 1456969
Change-Id: I09db6bcaa44b8939246cb5ed1205f3fbc0ee0552
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/731327
(cherry picked from commit 0dbbcf60bbad6b9a31392d2290a3e26c5daa1e5d)
Reviewed-on: http://git-master/r/731671
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We call prepare_ucode_blob() once each time we un-railgate. We
allocate and prepare the header for ACR ucode there, but the header
never gets freed.
Allocate and prepare the ACR header only once.
Change-Id: I948da8b47d6bb2fa021868d7038d2cc35eccb460
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/729745
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Return zero for missing sgt instead of attempting to dereference NULL.
Those NULL conditions should be almost nonexistent, and zero is not
normally used.
When reading gk20a_mem_phys() in gk20a_gr_get_chid_from_ctx() from an
isr, the mem desc may race with channel deletion and get suddenly
zeroed, even if the channel's in_use flag would be set. Plain zero
results in expected behaviour.
Change-Id: Id8ce37798d6fd3ceeb96a3f521c82569fccf30aa
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/729006
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Bug 1550628
Change-Id: I8daed555704b49ee0d50530e3d51c03027d31fc5
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/719892
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the return code for both gk20a_ and gm20b_ltc_cbc_ctrl()
functions. Before, a positive return would always happen. Now,
if there's a timeout, -EBUSY is returned.
Change-Id: Id76dc44af1376fceebf5043afb057c153cb0752e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/729165
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
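
The return-code convention described above can be sketched with a bounded polling loop. The `fake_` names are hypothetical stand-ins and the retry-count form of the timeout is an assumption; only the use of -EBUSY on timeout is taken from the commit message.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for a hardware operation that completes after some polls. */
struct fake_op {
	int polls_left;
};

/* Poll once; returns true when the operation has completed. */
static inline bool fake_op_done(struct fake_op *op)
{
	if (op->polls_left > 0) {
		op->polls_left--;
		return false;
	}
	return true;
}

/* Return 0 on completion, -EBUSY if the operation did not finish in time. */
static inline int fake_cbc_ctrl(struct fake_op *op, int max_retries)
{
	for (int i = 0; i < max_retries; i++) {
		if (fake_op_done(op))
			return 0;
	}
	return -EBUSY;
}
```

Callers can then rely on the usual kernel convention that zero means success and a negative errno value means failure.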
|
|
|
|
|
|
|
|
|
|
|
| |
The flush timeout should have been compared against the current
time (jiffies), not the snapshot in time when the L2 flush started.
Change-Id: Idba0ccbfeeab9e3fadd0b5bed7073acefbd403e3
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/729090
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce amount of duplicate code around memory allocation by using
common helpers, and common data structure for storing results of
allocations.
Bug 1605769
Change-Id: Ib70db4dff782176ed7f92b6809c8415b8c35abe1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/721120
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have below race condition during __gk20a_do_idle()
and force_reset case :
- before execution of __gk20a_do_idle(), a process drops the last
usage count of GPU, which triggers GPU railgate process
- but before GPU is really railgated (there is a 500 ms delay),
some process calls __gk20a_do_idle()
- in __gk20a_do_idle(), we first take railgate_lock
- then we check if GPU is already railgated or not
- since it is not railgated yet (due to the 500 ms delay), this
returns false
- then we call pm_runtime_get_noresume() which just increases the
usage counter
- in this particular case, this call just increases usage count to
1 from 0, even though GPU is already on its way to railgate
- while we check if GPU usage count drops to one, GPU gets railgated
- now if we have force_reset=true case, we will end up calling
pm_runtime_get_sync() which will take railgate_lock lock _again_
and try to unrailgate GPU
- this causes a deadlock on railgate_lock
To fix this, use below sequence :
- take railgate_lock
- check if GPU is already railgated
- release railgate_lock
- call pm_runtime_get_sync() which will keep GPU active even if
railgating is already triggered
- take railgate_lock again to prevent unrailgate in further processing
Also, add more descriptive comments to explain the flow
Bug 1624537
Change-Id: I0febc65d7bfac03ee738be200cf321322ffbe5a6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/719625
(cherry picked from commit 480284eda16e2b50ee6368bad3d15574e098b231)
Reviewed-on: http://git-master/r/719620
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
If the clock is null, calling the reset function will crash the
kernel. So, don't call the reset function.
Change-Id: I37ef25c8dca67bec8bf6654eb6e275b866bdae53
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/742361
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|