| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the gpu is railgated, the usermode region mappings must be cleared.
This is already done with zap_vma_ptes() but as an extra measure the vm
flags are also zeroed. That is an oversight, so delete that code; in
particular the VM_DONTCOPY flag is important so that the mapping does
not follow fork, as the design does not allow that.
Bug 200726443
Change-Id: I84ed4e38b7de1f0c8cbf4cca6276abfa2409ac3b
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538481
(cherry picked from commit e44ece25ba405505b7fd537b41bd0ad52f1250ce)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548631
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't store the return value of elpg re-enable if disable fails; this
could make the local status value zero again, causing the elpg-protected
call to be executed with elpg still enabled and elpg re-enabled twice.
Commit c90585856567 ("gpu: nvgpu: add cg and pg function") introduced
this bug; failure of re-enabling after a failed disable might be another
problem (and it's not clear why this is done in the first place) which
isn't propagated to the caller, but that would belong to another patch.
Bug 200565050
Change-Id: I7cf7a0887ae59e85bf0c56c38aaaadfefd16cc1c
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2541859
(cherry picked from commit 4b3591aafb8a53b051ce75629b15d2957943bd76)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543030
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following locking sequence leads to deadlock:
1. gk20a_pm_prepare_poweroff (alter_usermode_mappings):
ctrl_privs_lock -> mmap_lock
2. __do_munmap (usermode_vma_close):
mmap_lock -> ctrl_privs_lock
This lock contention can be resolved by retrying the usermode mapping
alteration after a while releasing the ctrl_priv_lock for munmap to
proceed.
Below is the kernel panic log with deadlock.
[] INFO: task kworker/1:1:116 blocked for more than 120 seconds.
[] Tainted: G W 5.10.17-tegra #1
[] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[] task:kworker/1:1 state:D stack: 0 pid: 116 ppid: 2 flags:0x00000028
[] Workqueue: pm pm_runtime_work
[] Call trace:
[] __switch_to+0x104/0x160
[] __schedule+0x3d4/0x900
[] schedule+0x74/0x100
[] rwsem_down_write_slowpath+0x250/0x4b0
[] down_write+0x6c/0x80
[] alter_usermode_mappings+0xb4/0x160 [nvgpu]
[] nvgpu_hide_usermode_for_poweroff+0x24/0x30 [nvgpu]
[] gk20a_pm_prepare_poweroff+0xe8/0x140 [nvgpu]
[] gk20a_pm_runtime_suspend+0x78/0xf0 [nvgpu]
[] pm_generic_runtime_suspend+0x3c/0x60
[] genpd_runtime_suspend+0xb0/0x2c0
[] __rpm_callback+0x90/0x150
[] rpm_callback+0x34/0xa0
[] rpm_suspend+0xe0/0x5e0
[] pm_runtime_work+0xbc/0xc0
[] process_one_work+0x1c0/0x4a0
[] worker_thread+0x11c/0x430
[] kthread+0x148/0x170
[] ret_from_fork+0x10/0x18
[] INFO: task nvrm_gpu_tests:1273 blocked for more than 121 seconds.
[] Tainted: G W 5.10.17-tegra #1
[] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[] task:nvrm_gpu_tests state:D stack: 0 pid: 1273 ppid: 1245 flags:0x00000000
[] Call trace:
[] __switch_to+0x104/0x160
[] __schedule+0x3d4/0x900
[] schedule+0x74/0x100
[] schedule_preempt_disabled+0x28/0x40
[] __mutex_lock.isra.0+0x184/0x5c0
[] __mutex_lock_slowpath+0x24/0x30
[] mutex_lock+0x5c/0x70
[] usermode_vma_close+0x30/0x50 [nvgpu]
[] remove_vma+0x34/0x60
[] __do_munmap+0x1f4/0x4a0
[] __vm_munmap+0x74/0xd0
[] __arm64_sys_munmap+0x3c/0x50
[] el0_svc_common.constprop.0+0x7c/0x1a0
[] do_el0_svc+0x34/0xa0
[] el0_svc+0x1c/0x30
[] el0_sync_handler+0xa8/0xb0
[] el0_sync+0x160/0x180
[] ---[ end Kernel panic - not syncing: hung_task: blocked tasks ]---
Bug 200703921
Change-Id: Ie7f017c92f20061d3bf891079f7fc7fe390f7cf7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2533853
(cherry picked from commit 1dd3e0761c1995c88e9f8e1a26cf5eaf197510be)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2540111
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure file->private_data is set before installing file into file
descriptor with fd_install().
Bug 200724607
Bug 200725718
Change-Id: I03e79a3f8981f959ab5f75f442911253d166aa87
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2520465
(cherry picked from commit c78efae5e721287a3c7e9c9ca045220d6e433a30)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535099
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Harsh Sinha <hsinha@nvidia.com>
Reviewed-by: Thomas Steinle <tsteinle@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Byungkuk Seo <bseo@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement nvgpu plumbing to allow reporting ECC errors(corrected
and uncorrected) to a L1SS service(if one exists).
This patch includes the following
1) Added code that submits ECC error reports via the Interrupt context
directly to a L1SS service in linux OS.
2) Added support for enabling/disabling the error reports via L1SS's
registration/deregistration API. Nvgpu simply invokes an empty function
until the registration is successful.
3) Added Spinlock to correctly handle concurrency for accessing the
correct Ops for submitting requests.
4) Adds error reporting for a subset of interrupts that can be verified
via external ECC injection logic. A subsequent patch will add the
API for rest of the interrupts.
5) In case of critical(uncorrected errors), change nvgpu's state to
quiesce state.
Jira L4T-1187
Bug 200700400
Change-Id: Id31f70531fba355e94e72c4f9762593e7667a11c
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2530411
Tested-by: Bibek Basu <bbasu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Engine reset is skipped if channel is disassociated from the tsg.
During unbind, tsg is disassociated before calling deferred
engine reset. Hence any deferred resets don't work
actually.
Engines to be reset is already set in the variable
deferred_fault_engines. Use it.
Bug 200711183
Change-Id: I0c2bdcad1770e0ccd001c208a9ac0cf499a374e1
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521974
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit 668bd75c1ac84d18f0aa0d0b22145d24405fdc8e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2524252
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nvgpu_tsg_unbind_channel_common failure handling missed
channel.clear & nvgpu_tsg_set_mmu_debug_mode calls.
Bug 200711183
Change-Id: I19fd53be55db9df725b7cf467b2673e4cd29deb5
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521972
(cherry picked from commit 89ec2afbd4538fe5036bef7affed5871d45ec734)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2524251
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some of the engine stalling interrupts can block the context save off
the engine if not handled during fifo.preempt_tsg. They need to be
handled while polling for engine ctxsw status.
Bug 200711183
Bug 200726848
Change-Id: Ie45d76d9d1d8be3ffb842670843507f2d9aea6d0
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521971
(cherry picked from commit I7418a9e0354013b81fbefd8c0cab5068404fc44e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2523938
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
completion
In order to process stalling interrupts during TSG unbind, we need a API
to wait for the stalling interrupts to complete within certain duration.
Prepare these APIs for stalling and non-stalling interrupts.
Bug 200711183
Bug 200726848
Change-Id: I634738249ade64224326b356d6244ad4299f1baf
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521970
(cherry picked from commit I0b7a64c0f3761bbd0ca0843aea28a591ed23739f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2523937
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NEXT bit can remain set for the channel if timeslice expires before
scheduler clears it. Due to this nvgpu fails TSG unbind and in turn
nvrm_gpu fails channel close. In this case, checking the channel hw
state after some time can help see NEXT bit cleared by scheduler.
Reenable the tsg and return -EAGAIN to nvrm_gpu for it to retry again.
Bug 3144960
Bug 200520811
Change-Id: I35f417f02270e371a4e632986b73a00f8a4f921a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2468391
(cherry picked from commit cf287a4ef592e7329f813c076ec8bdad18dc5933)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2479106
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- ZBC save/restore registers are removed in GP10B PMU ucode.
- These registers are saved/restored from CTXSW ucode during
ELPG entry/exit.
- Accessing the ZBC registers will cause PMU EXTERR error.
- To resolve this, ZBC functionality is removed from GP10B
feature list in PMU ucode.
- From NvGPU driver, set NVGPU_PMU_ZBC_SAVE bit to false
for GP10B
- Updated the GP10B PMU app version for the ucode:
https://git-master.nvidia.com/r/c/tegra/kernel-firmware-t18x/+/2476260
P4 CL link related to this PMU ucode change:
https://p4sw-swarm.nvidia.com/changes/29594520
Bug 3233071
Bug 200696431
Change-Id: If3f1707b79699e7e2e65367418b25ac71b09cf0b
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2476259
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As sprintf() is used to populate pool_name[20], it can overflow
for larger u32 values (u32 max decimal number chars are 10) i.e.
20 < strlen("semaphore_pool-") i.e. 15 + 10.
Fix this overflow by removing pool_name as it's not used.
Bug 2626446
Bug 3273414
Change-Id: I4e0a222a2cd34dcd09e69294bc46e2242abb04bb
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2205356
(cherry picked from commit baa86cf134ee6753beabfa974a10faffc5775ee8)
Signed-off-by: ByungKuk Seo <bseo@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2496976
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Harsh Sinha <hsinha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fix add NULL-ptr checks for some of the user-accessible
ioctl.
Bug 3240771
Bug 200696704
Change-Id: Ibe7f75b31b2521a530883253a93ba832f010dc80
Signed-off-by: Thomas Steinle <tsteinle@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2483635
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Phoenix Jung <pjung@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Wrong acquire/release sequence.
DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current)
....
CPU: 4 PID: 5404 Comm: cyclictest.sh Not tainted 4.9.201-rt134-tegra #1
Hardware name: Jetson-AGX (DT)
....
Call trace:
[<ffffff800810e4f8>] debug_rt_mutex_unlock+0x58/0x68
[<ffffff8008f34d0c>] rt_mutex_unlock+0x4c/0xb0
[<ffffff8008f36ea8>] _mutex_unlock+0x20/0x2c
[<ffffff8000f69d80>] nvgpu_cg_elcg_set_elcg_enabled+0x78/0xf0 [nvgpu]
[<ffffff8000f7bd44>] nvgpu_intr_nonstall_cb+0x21bc/0x22f0 [nvgpu]
[<ffffff800875b304>] dev_attr_store+0x44/0x60
[<ffffff80082dca44>] sysfs_kf_write+0x5c/0x78
[<ffffff80082dbd28>] kernfs_fop_write+0xc0/0x1d8
[<ffffff8008245b60>] __vfs_write+0x48/0x128
[<ffffff8008246b3c>] vfs_write+0xac/0x1b8
[<ffffff800824808c>] SyS_write+0x5c/0xc8
Bug 3227296
Suggested-by: Bibek Basu <bbasu@nvidia.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Change-Id: I932a23700539422c07de045dde516c52dd8348cf
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2472903
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Bibek Basu <bbasu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When try to read '/sys/kernel/debug/gpu.0/railgate_residency'
debug fs node, NULL pointer access error can be happened if
is_railgated function is not assinged.
Add check for is_railgated before calling the function pointer.
Bug 200682233
Change-Id: I4a03d4e19b04d02815b792d7d967d4a1d5f42c35
Signed-off-by: Alvin Park <apark@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2459751
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Hardik T Shah <hardikts@nvidia.com>
Reviewed-by: Phoenix Jung <pjung@nvidia.com>
Reviewed-by: Jay Kumar Bajaj <jbajaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Host scheduler might be confused if more than one channels are present
in TSG and one of the unbound channel has NEXT set.
This is not so much of an issue if there is single channel in the TSG.
So don't fail unbind in that case. ctx_reload and engine_faulted check
can also be skipped for single channel TSG.
Bug 3144960
Change-Id: I85eb9025ea53706ce8fda6d9b4bcf6a15a300d17
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2442970
(cherry picked from commit ad4624aae3f109fc3c8c03653cb691e09f086930)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2445445
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Register list for ACB SLCG is auto generated with scripts.
Add HAL operations to enable/disable ACB clock gating.
Cherry-pick/manually port from dev-main
Bug 200647909
Change-Id: I4be4c14cc072fcccd91031a5a40321f5ff11f549
Signed-off-by: Prateek sethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2420355
(cherry picked from commit c7c04d3a28c2eb0edc8e015dd0130fa50d3496c7)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2434464
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-by: Phoenix Jung <pjung@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For Linux, limit the use of the cache to entries less than the page size, to
avoid potential problems with running out of CMA memory when allocating large,
contiguous slabs, as would be required for non-iommmuable chips.
Also, in nvgpu_pd_cache_do_free(), zero out entries only if iommu is in use
and PTE entries use the cache (since it's the prefetch of invalid PTEs by
iommu that needs to be avoided).
Bug 3093183
Bug 3100907
Change-Id: I363031db32e11bc705810a7e87fc9e9ac1dc00bd
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2422039
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Satish Arora <satisha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Running out of priv cmd buffer allocation capacity is typically a
recoverable "error" caused by extra pressure wrt. allocation sizes based
on number of inflight jobs chosen by userspace. These conditions return
-EAGAIN and further retries will succeed as long as the channel advances
with submitted jobs. Remove the unnecessary debug spew.
Bug 200641803
Bug 200651329
Change-Id: I4dfc38cfc3eb10d57ac11c1b7164c3d84f9034d3
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2388799
(cherry picked from commit 29ad324f8226ed3326f5de9117b9115a15cdd032)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2410069
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_free_channel, destroy notifier_wq and
semaphore_wq
In __nvgpu_vm_remove, destroy the update_gmmu_lock mutex
Bug 200647668
Change-Id: Icbb4e626c0fa9fa2dcf1430b3112b51829b00e4f
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2414820
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Satish Arora <satisha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Large buffers being mapped to GMMU end up needing many
pages for the PTE tables. Allocating these pages one
by one can end up being a performance bottleneck, particularly
in the virtualized case.
Add support for page-sized PTEs to the existing PD cache:
- define NVGPU_PD_CACHE_SIZE, the allocation size for a new slab
for the PD cache, effectively set to 64K bytes
- Use the PD cache for any allocation < NVGPU_PD_CACHE_SIZE
- When freeing up cached entries, avoid prefetch errors by
invalidating the entry (memset to 0)
Bug 3093183
Bug 3100907
Change-Id: I2302a1dfeb056b9461159121bbae1be70524a357
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2401783
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Satish Arora <satisha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The device node permission for the ctxsw should be set to "root:debug"
instead.
Bug 2823941
Change-Id: I523fdd298b70cac82c0a8d853f3e241a80a2ebf5
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2372943
(cherry picked from commit 692eafdd03af2f7ab4164732f878d2699867ac63)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392715
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Below change added capability check in the ioctl. nvgpu is advertising
the support for RESCHEDULE_RUNLIST for all processes even though it
fails the ioctl for non-realtime processes.
Clear the ioctl flag for RESCHEDULE_RUNLIST for non-realtime processes.
commit 838ba0a14d61f ("gpu: nvgpu: check capability for reschedule runlist submit flag")
Author: David Li <davli@nvidia.com>
Date: Tue Sep 12 18:37:00 2017 -0700
NVGPU_SUBMIT_GPFIFO_FLAGS_RESCHEDULE_RUNLIST is only used by realtime
priority EGL context, which checks for CAP_SYS_NICE during context
creation in userspace, so it wasn't secure against unprivileged program
spoofing submit ioctl with this flag to stall GPU progress of others.
This flag does increase duration of submit by approx 16us,
mostly due to register accesses and PMU FIFO mutex.
Bug 2823941
Change-Id: Iecee3989e5af035264b1ed5c1aa9a8576dd90883
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2372957
(cherry picked from commit 864213ae55b009b0a026ac380b26276332f79177)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392714
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Debugfs can be mounted with root-only permissions hence remove the extra
cap checks in the debugfs open calls for fifo_sched & ctxsw_ring.
Bug 2823941
Change-Id: I41668a887635f34897886b872ad435b183b85959
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2372982
(cherry picked from commit f34037a09f5996762c69bb4ce86751ed7df24ee7)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392713
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With MSS Nvlink set for force snoop, check for the coherency flag in
gmmu attribute and setting pte aperture to coherent type based on that
checking is not relevant.
coherent variable removed from nvgpu_gmmu_attrs struct.
Bug 200473147
Bug 3057980
Change-Id: Idf76cac901ef7c70faa2c4f7f11a046d94b9466a
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2013212
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry-picked from 4e1769097526e5203f7c18a663ab3c29f5568ae5
in rel-32)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2387272
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Aayush Rajoria <arajoria@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the code that set default aperture mask as coherent.
MSS nvlink is set for force snoop, so default aperture mask is set as
non-coherent.
Bug 200473147
Bug 3057980
Change-Id: Ia8f826b8414826d2642f9c35c14ffba1cd0b9353
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2011966
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry-picked from aec64d8f8bcda4a5270a16032005c7b5d1742656
in dev-main)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2387271
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Aayush Rajoria <arajoria@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cycle stats and cycle stats snapshot ioctls have been moved to
debug node. Removing channel ioctls.
Bug 2660206
Bug 220464613
Change-Id: I3aecdf4a8310eeb38de2de5ac076048891afe436
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2030992
(cherry picked from commit f20424ea6a7c6fcf977630e3e95d9e78418f13b8)
Signed-off-by: Gagan Grover <ggrover@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2092020
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Phoenix Jung <pjung@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Peter Daifuku <pdaifuku@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add NVGPU_DBG_GPU_IOCTL_CYCLE_STATS to debugger node, to
install/uninstall a buffer for cycle stats.
Add NVGPU_DBG_GPU_IOCTL_CYCLE_STATS_SNAPSHOT to debugger
node, to attach/flush/detach a buffer for Mode-E streamout.
Those ioctls will apply to the first channel in the debug session.
Bug 2660206
Bug 200464613
Change-Id: I0b96d9a07c016690140292fa5886fda545697ee6
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2002060
(cherry picked from commit 90b0bf98ac01d7fa24c40f6a1f20bfe5fa481d36)
Signed-off-by: Gagan Grover <ggrover@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2092008
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Phoenix Jung <pjung@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Peter Daifuku <pdaifuku@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Volta, nvgpu needs to wait for explicit ACK from CTXSW while
setting FECS watchdog timeoout
This is manual port of the fixes 4d7e5026e38528b88a4a168eca9a8b180475b368
and ad89436b03428a42e43042b6a849c15843fdebc4 on dev-main since clean
cherry-pick is not possible due to huge file and structure differences.
Bug 200603566
Change-Id: Icba69998ab45eee5fdf2a29e1ac1067589301be6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2371708
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The timeout message of nvgpu_timeout_expired_msg() leaks
a stack value (%llx) in error log on timeout. As the format
expects 1 argument and none is given, fix this by specifying
the required argument.
Manual port of https://git-master.nvidia.com/r/c/linux-nvgpu/+/2205423
Bug 2780861
Bug 3051385
Change-Id: Ic223e4b79bde718108826f095740b10b54a5e84d
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366452
(cherry picked from commit 372837506af77e2c5b8489ee2123292778abe75d)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2370285
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sungwook Kim <sungwookk@nvidia.com>
Reviewed-by: Rahul Jain (SW-TEGRA) <rahuljain@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
File tu104/gr_tu104.c was added with commit f56874aec2 owing to
incorrect conflict resolution. Delete it.
Bug 200447167
Change-Id: I24648a3130bc76731c888328d5742229f6f6c928
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2371610
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Data can be speculativerly stored and
code flow can be hijacked.
To mitigate this problem insert a
speculation barrier.
Bug 200447167
Change-Id: Ia865ff2add8b30de49aa970715625b13e8f71c08
Signed-off-by: Ranjanikar Nikhil Prabhakarrao <rprabhakarra@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1972221
(cherry picked from commit f0762ed4831b3fe6cc953a4a4ec26c2537dcb69f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/1996052
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Initialize the perfmon counters #3 masks to be same values as ELPG.
Hardware boots up with value NV_PPWR_PMU_IDLE_MASK_1(3) (0x10aa4c) = 0x1030,
but ELPG NV_PPWR_PMU_IDLE_MASK_1_SUPP(0) (0x10a9f4) boots up with 0.
Bug 2833620
Change-Id: I3a424345aec6176a97dd20fb2c68a6e2faf955ad
Signed-off-by: David Ung <davidu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2335299
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add pmu_idle_mask_1, pmu_idle_mask_2 and pmu_idle_mask_2_supp
Bug 2833620
Change-Id: Icceb99b48a227d32653fd9fbc3da9e27065e9fe2
Signed-off-by: David Ung <davidu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2334219
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Regenerating header from register generator
Bug 2833620
Change-Id: Idc8e922bb611ed5acae66b6ca38db4bb9c8a1904
Signed-off-by: David Ung <davidu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2335263
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no point to depend on strlen of second argument,
otherwise it could be a simple strcpy. Instead, let's make
it depending on sizeof(destination)
...and make sure that result is NUL-terminated, too
Bug 2973859
Change-Id: Ifc941fab07e503b7b980696950d65b8bb10bf4ff
Signed-off-by: dmitry pervushin <dpervushin@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342281
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Phoenix Jung <pjung@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, for platforms with canRailgate device characteristics disabled,
suspend can block as deterministic channels hold busy references. This
patch makes the change to first hold off any new jobs for deterministic
channels and then reverts back the busy references taken by those
channels. Following this, suspend also waits for the device to get idle
by waiting (with timeout) for the nvgpu's internal usage counter to be
come zero. This ensures there are no further jobs in progress and
allows the system to go into a suspend state.
Bug 200598228
Bug 2930266
Change-Id: Id02b4d41a9c2dd64303b2e2449dbed48c12aea4c
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328489
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Background: There is a race that occurs when l2_fb_ops ioctl is
invoked. The race occurs as part of the flush() call while a
gk20_idle() is in progress.
This patch handles the race by making changes in the l2_fb_ops
ioctl itself. For cases where pm_runtime is disabled or railgate is
disabled, we allow this ioctl call to always go ahead as power is
assumed to be always on.
For the other case, we first check the status of g->power_on. In the
driver, g->power_on is set to true, once unrailgate is completed and is
set to false just before calling railgate.
For linux, the driver invokes gk20a_idle() but there is a delay after
which the call to the rpm_suspend()'s callback gets triggered. This
leads to a scenario where we cannot efficiently rely on the
runtime_pm's APIs to allow us to block an imminent suspend or exit if
the suspend is currently in progress. Previous attempts at solving this
has lead to ineffective solutions and make it much complicated to
maintain the code.
With regards to the above, this patch attempts to simplify the way this
can be solved. The patch calls gk20a_busy() when g->power_on = true.
This prevents the race with gk20a_idle(). Based on the rpm_resume and
rpm_suspend's upstream code, resume is prioritized over a suspend
unless a suspend is already in progress i.e. the delay period has been
served and the suspend invokes the callback. There is a very small
window for this to happen and the ioctl can then power_up the device as
evident from the gk20a_busy's calls.
A new function gk20a_check_poweron() is added. This function protects
the access to g->power_on via a mutex. By preventing a read from
happening simulatenously as a write on g->power_on, the likelihood of
an runtime_suspend triggering before a runtime_resume is further
reduced.
Bug 200507468
Change-Id: I5c02dfa8ea855732e59b759d167152cf45a1131f
Signed-off-by:Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2299545
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some of the APIs that access debugger register are not protected
from ELPG. This might trigger PRI access timeouts for corresponding
registers if GR engine is power gated.
Add gr_gk20a_elpg_protected_call() to protect against ELPG.
Bug 2820066
Change-Id: I467ea28aaea1c0e36c2d6aabce6a2daea6ee9911
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2306383
(cherry picked from commit 0c0eb25ee798db3a8dcd8cab7db06312a220240e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2313210
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When nvgpu_vm_unmap_sync fails, nvgpu_unmap_sync currently bails
out without decreasing the buffer refcount. This prevents from
releasing the buffer, in case a deferred job completes after the
timeout (which was observed 2 times during overnight
stress tests). This also means that the fixed address is not
re-useable.
Throw out a warning when nvgpu_vm_unmap_sync fails, but proceed
with decreasing refcount.
Bug 200578193
Change-Id: Ie0cc7caa7d12ca0a3b42123a5f7a28bda72dabbc
Signed-off-by: ddutta <ddutta@nvidia.com>
(cherry picked from commit a433f26d5bb1ec3253fc2655998b1ef7fb2847cb
in dev-main)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2291352
Tested-by: Naveen Kumar S <nkumars@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
blcg is always enabled by default and there is no need for disabling
this during gr init or gr reset.
Bug 2866010
Change-Id: Iaf17b7fdf05ad04fe435e1a1fda758deedc6484c
Signed-off-by: ddutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2303114
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check kzalloc() allocations for failures and return
an error if an allocation fails.
Bug 2279948
Change-Id: I8a2c3b84904da897ad6118900c11489c8656c20f
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2020123
(cherry picked from commit fadd0014da39cb9498472494e52590db4b0bd7b9)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2298066
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When unbinding a channel from a tsg when virtual, vgpu_tsg_unbind_channel
would return an error if unbinding the channel on the guest side failed,
and did so before notifying the RM server of the unbind.
Later on in the recovery process, the guest OS would remove the channel from the
TSG's list, but this would leave the RM server with an out-of-date channel list.
Fix this by making the tsg_unbind_channel HAL optional and implemented only for vgpu:
the vgpu version now just notifies the RM server so that it can clean up its version
of the TSG; if vgpu, always call the tsg_unbind_channel HAL whether or not
the local unbind succeeded.
Minimal port from dev-main of https://git-master.nvidia.com/r/c/linux-nvgpu/+/2084029
Bug 2766920
Bug 200587845
Change-Id: I75bddf3a28ac20bf4fb7510ff64097a32c7eec3f
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287774
(cherry picked from commit 471c72c1efcc4fe6d547f556edf7773827fd2674)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2289928
Reviewed-by: Thomas Steinle <tsteinle@nvidia.com>
Reviewed-by: Satish Arora <satisha@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tsg_release hal_fn was missing in vgpu_gp10b causing
proper cleanup not to happen at the rm-server.
Bug 2766920
Bug 200587845
Change-Id: Ic0e57d1d37e0f92eea23087299c8c22c094199b0
Signed-off-by: aalex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1830192
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287773
(cherry picked from commit db765a8b898e1fd74a447081ea5e7de195ad3ce7)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2289927
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch help resolve the boot time failures happening with
pmu_exterr for porg. cg_enable can race with pmu_init thread,
cg_enable is moved post pmu init thread to avoid the above race.
Bug 200565050
Change-Id: I2192053eff8767847ea012ca20b3607d2f6cd26f
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2239959
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added two new IVC commands that set gr and fb mmu debug mode.
Bug 2586624
Change-Id: I358fb04713a9754fb209c0a90d02130dd4a1caf6
Reviewed-on: https://git-master.nvidia.com/r/2204980
(cherry picked from commit db4e5b09891aff075dfffb7cc2fe0630a71ab9a6)
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2288347
Reviewed-by: Kajetan Dutka <kdutka@nvidia.com>
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Kajetan Dutka <kdutka@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replaced ch->mmu_debug_mode_enabled with ch->mmu_debug_mode_refcnt.
If channel is enabled multiple times by userspace, then ref count is
updated accordingly. There is an expectation that enable/disable
calls are balanced for setting channel's mmu debug mode.
When unbinding the channel, decrease refcnt for the channel until it
reaches 0.
Also, removed tsg parameter from nvgpu_tsg_set_mmu_debug_mode as it
can be retrieved from ch.
Bug 2515097
Bug 2713590
Change-Id: If334e374a55bd14ae219edbfd3b1fce5ff25c226
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2184702
(cherry picked from commit f422aee39387a5aa337de69cc21a67f16697ae0e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2208772
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Kajetan Dutka <kdutka@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Kajetan Dutka <kdutka@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set NV_PFB_HSMMU_PRI_MMU_DEBUG_CTRL and NV_PFB_PRI_MMU_DEBUG_CTRL
in addition to NV_PGRAPH_PRI_GPCS_MMU_DEBUG_CTRL, in
NVGPU_DBG_GPU_IOCTL_SET_CTX_MMU_DEBUG_MODE
Bug 2515097
Bug 2713590
Change-Id: I1763b43e79fac3edb68a35980683d58bfa89519f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2115785
(cherry picked from commit 8057514a9f7fc5f175e2e0571dfa91d78ebb6410)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2208771
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Kajetan Dutka <kdutka@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Kajetan Dutka <kdutka@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add accessors for NV_PFB_HSMMU_PRI_MMU_DEBUG_CTRL and
NV_PFB_PRI_MMU_DEBUG_CTRL
Bug 2515097
Bug 2713590
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Change-Id: Ieadee041854bc9a17721b5b17938a106e205517f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2208770
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Kajetan Dutka <kdutka@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Kajetan Dutka <kdutka@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GPC MMU debug mode should be set if at least one channel
in the TSG has requested it. Add refcounting for MMU debug
mode, to make sure debug mode is disabled only when no
channel in the TSG is using it.
Bug 2515097
Bug 2713590
Change-Id: Ic5530f93523a9ec2cd3bfebc97adf7b7000531e0
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2123017
(cherry picked from commit a1248d87fe6e20aab3e5f2e0764f9fe8d80d0552)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2208769
Reviewed-by: Kajetan Dutka <kdutka@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Kajetan Dutka <kdutka@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
|