| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, nvgpu can
raise events to User space. But user space
cannot distinguish between various types of events.
To overcome this, we need finer-grained API to
deliver various events to user space.
Remove old API NVGPU_IOCTL_CHANNEL_EVENTS_CTRL,
and all the support for this API (we can remove
this since User space has not started using this
API at all)
Add new API NVGPU_IOCTL_CHANNEL_EVENT_ID_CTRL
which will accept an event_id (like BPT.INT or
BPT.PAUSE), a command to enable the event,
and return a file descriptor on which
we can raise the event (if cmd=enable)
Event is disabled when file descriptor is closed
Add file operations "gk20a_event_id_ops"
to support polling on event fd
Also add API gk20a_channel_get_event_data_from_id()
to get event_data of event from its id
Bug 200089620
Change-Id: I5288f19f38ff49448c46338c33b2a927c9e02254
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1030775
(cherry picked from commit 5721ce2735950440bedc2b86f851db08ed593275)
Reviewed-on: http://git-master/r/1120318
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add function pointer for chip specific GPU characteristics init.
Bug 1637486
Change-Id: I6ce5eea124d8057393dec6e86e72412cc87e1cfa
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: http://git-master/r/780535
(cherry picked from commit f5c240d6ed19b5b9eedff05767c885ad5812c71e)
Reviewed-on: http://git-master/r/1120428
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This change performs merge of 'PM_RUNTIME_Removal' dev-branch with
'dev-kernel-3.18' branch. It replaces CONFIG_PM_RUNTIME with CONFIG_PM.
JIRA TPM-704
Change-Id: I306e254716f275c283f727fc232d7244939542b6
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks
depending on CONFIG_PM_RUNTIME may now be changed to depend on
CONFIG_PM.
Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere under
drivers/gpu/nvgpu/.
JIRA TPM-704
Change-Id: I23965838ff6ec77829076cd834e87641fb68e268
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Wait for 500 usec before ce reset to ensure that
no memory outstanding requests are pending.
Bug 1699365
Change-Id: I9f73f87cbbdca0208e95ebaee32dd1f764a3cd4f
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1116679
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Allow a special address space node to be split out from the
user adress space or fixed allocations. A debugfs node,
/d/<gpu>/separate_fixed_allocs
Controls this feature. To enable it:
# echo <SPLIT_ADDR> > /d/<gpu>/separate_fixed_allocs
Where <SPLIT_ADDR> is the address to do the split on in the
GVA address range. This will cause the split to be made in
all subsequent address space ranges that get created until it
is turned off. To turn this off just echo 0x0 into the same
debugfs node.
Change-Id: I21a3f051c635a90a6bfa8deae53a54db400876f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1030303
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
bug 1648908
This commit adds support for FECS ctxsw tracing. Code is compiled
conditionnaly under CONFIG_GK20_CTXSW_TRACE.
This feature requires an updated FECS ucode that writes one record to a ring
buffer on each context switch. On RM/Kernel side, the GPU driver reads records
from the master ring buffer and generates trace entries into a user-facing
VM ring buffer. For each record in the master ring buffer, RM/Kernel has
to retrieve the vmid+pid of the user process that submitted related work.
Features currently implemented:
- master ring buffer allocation
- debugfs to dump master ring buffer
- FECS record per context switch (with both current and new contexts)
- dedicated device for ctxsw tracing (access to VM ring buffer)
- SOF generation (and access to PTIMER)
- VM ring buffer allocation, and reconfiguration
- enable/disable tracing at user level
- event-based trace filtering
- context_ptr to vmid+pid mapping
- read system call for ctxsw dev
- mmap system call for ctxsw dev (direct access to VM ring buffer)
- poll system call for ctxsw dev
- save/restore register on ELPG/CG6
- separate user ring from FECS ring handling
Features requiring ucode changes:
- enable/disable tracing at FECS level
- actual busy time on engine (bug 1642354)
- master ring buffer threshold interrupt (P1)
- API for GPU to CPU timestamp conversion (P1)
- vmid/pid/uid based filtering (P1)
Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1022737
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Reset controller is not enabled in all builds, so make its use
optional.
Change-Id: I88df11d0aae0552eb4c7f3acee5be70885ea2901
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1028348
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, only "high" priority bare channels were interleaved
between all other bare channels and TSGs. This patch decouples
priority from interleaving and introduces 3 levels for interleaving
a bare channel or TSG: high, medium, and low. The levels define
the number of times a channel or TSG will appear on a runlist (see
nvgpu.h for details).
By default, all bare channels and TSGs are set to interleave level
low. Userspace can then request the interleave level to be increased
via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will
be added later).
As timeslice settings will soon be coming from userspace, the default
timeslice for "high" priority channels has been restored.
JIRA VFND-1302
Bug 1729664
Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1014962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bug 200139995
Any GR register access should disable ELPG and clock gating before
access and enable it back after it is done. Disable ELPG while tweaking
perf parameters in gk20a_alloc_obj_ctx.
Also output NV_PBUS_INTR_0 in case of interrupt (including fix to
display correct value on pbus isr).
Change-Id: I81d2eb4461e92fbb33db8554779f6566f6b002c1
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/835307
(cherry picked from commit 6acc35bd1bcc706fbde8d11521cf1d0f64a16fe4)
Reviewed-on: http://git-master/r/921299
(cherry picked from commit 73afd520445bb1f4757fd167b38289143fd46d80)
Reviewed-on: http://git-master/r/930040
(cherry picked from commit 7a784ebea0dd60a88469f51eaa61c33b356e499c)
Reviewed-on: http://git-master/r/1023529
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Adding support for FECS mem overrides
Bug 1699676
Change-Id: I6c9ddcd98d57b29059513ee508c6f92b194c4fc7
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: http://git-master/r/921253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 1726406
Change-Id: Ia03b0a174e92b28c471164cefcde514e6db94bdf
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1002700
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NVGPU_GPU_FLAGS_SUPPORT_TSG indicates both the kernel driver and
device support time slice group (TSG).
Bug 1617046
Bug 200155618
Change-Id: Ib3490a32b773222560c58f1fd6d32bffcb97d6cd
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1010173
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Bug 200097029
Change-Id: Id63dad1629b1d1919cbbfb20b0cb85d4855f526d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1000724
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When gpu rail-gating is enabled, it is possible that
both rail gating code and system shudown can start
executing gk20a_pm_prepare_poweroff() in parallel.
To synchronize this execution, protect gk20a_pm_prepare_poweroff()
with a mutex lock.
Bug 200168805
Change-Id: I19536a43ed20c3e82b32c316922dc3e19e3f59bb
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/999548
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Few times cde is getting deadlocked because of pending cde
operation. So do the things cleanly, first suspend cde then
do channel suspend.
Bug 1709757
Change-Id: Iaf566b63d9efb13aa2691c19e2df676c70f26afc
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/926574
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
List of drivers that now want async probe:
- sdhci-tegra
- qspi-mtd
- nvmap
- gk20a
- dc
Bug 200083391
Change-Id: Ie0a0677961b704c78d4eb2cdab9f0e9a925a3ca1
Reviewed-on: http://git-master/r/923738
(cherry-picked from 75c067e83c7cde2a37c4fae01719e40c5b7d2835)
Signed-off-by: dmitry pervushin <dpervushin@nvidia.com>
Reviewed-on: http://git-master/r/923121
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
Tested-by: Sumeet Gupta <sumeetg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Interleave all high priority channels between all other channels.
This reduces the latency for high priority work when there
are a lot of lower priority work present, imposing an upper
bound on the latency. Change the default high priority timeslice
from 5.2ms to 3.0 in the process, to prevent long running high priority
apps from hogging the GPU too much.
Introduce a new debugfs node to enable/disable high priority
channel interleaving. It is currently enabled by default.
Adds new runlist length max register, used for allocating
suitable sized runlist.
Limit the number of interleaved channels to 32.
This change reduces the maximum time a lower priority job
is running (one timeslice) before we check that high priority
jobs are running.
Tested with gles2_context_priority (still passes)
Basic sanity testing is done with graphics_submit
(one app is high priority)
Also more functional testing using lots of parallel runs with:
NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw
–drawsperframe 20000 –triangles 50 –runtime 30 –finish
plus multiple:
NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw
–drawsperframe 20000 –triangles 50 –runtime 30 -finish
Previous to this change, the relative performance between
high priority work and normal priority work comes down
to timeslice value. This means that when there are many
low priority channels, the high priority work will still
drop quite a lot. But with this change, the high priority
work will roughly get about half the entire GPU time, meaning
that after the initial lower performance, it is less likely
to get lower in performance due to more apps running on the system.
This change makes a large step towards real priority levels.
It is not perfect and there are no guarantees on anything,
but it is a step forwards without any additional CPU overhead
or other complications. It will also serve as a baseline to
judge other algorithms against.
Support for priorities with TSG is future work.
Support for interleave mid + high priority channels,
instead of just high, is also future work.
Bug 1419900
Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98
Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com>
Reviewed-on: http://git-master/r/805961
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable
watchdog per-channel
Also, if watchdog is disabled, we currently schedule
the worker with MAX timeout.
Instead of this, do not schedule any worker if
watchdog is disabled
Bug 1683059
Bug 1700277
Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/837962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- operands not affecting result (id = 12845)
- logically dead code (id = 12890)
- dereference after null check (id = 12968)
- unsigned compared to 0 (id = 13176)
- resource leak (id = 13338, 18673)
- unused pointer value (id = 13916)
Bug 1703084
Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/835143
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new operaton g->ops.mm.set_debug_mode and let other places
that set debug mode call this callback.
It's preparing for adding vgpu set mmu debug mode hook.
JIRA VFND-1005
Bug 1594604
Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/833497
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create constant name for gpu debugfs node across all chip.
Bug n/a
Change-Id: I359b82b5389c49d8fe2a31ace49ff6daa1edfb10
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/805397
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
(cherry-picked from commit 17a3882cde09412c68f7a0ee4765f45be1a51c45)
Reviewed-on: http://git-master/r/817014
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
enables creating userspace-managed GPU address spaces.
When an address space is marked as userspace-managed, the following
changes are in effect:
- Only fixed-address mappings are allowed.
- VA space allocation for fixed-address mappings is not required,
except to mark space as sparse.
- Maps and unmaps are always immediate. In particular, the mapping
ref increments at kickoffs and decrements at job completion are
skipped.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/738558
Reviewed-on: http://git-master/r/833253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a_do_idle(), if can_railgate = false, we do
reset_assert() on the GPU
But asserting reset might have dependencies on
specific h/w seqeunce for ensuring proper reset
Hence avoid calling reset_assert() and directly
call platform specific unrailgate() and railgate()
APIs which take care of the correct reset sequence
Bug 200142989
Bug 200137963
Bug 1678611
Change-Id: Ide886dd88b8422ad36de52d54378b1edd9c7bbd6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/820322
(cherry picked from commit d621ddd49da976a75c14aa7aaa37f700fb4e83f2)
Reviewed-on: http://git-master/r/822515
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we unrailgate the GPU if railgating is not enabled
or pm_domains are not enabled
But in case if railgating is not enabled and pm_domains
are enabled, we explicitly unrailgate GPU in gk20a_pm_init()
and then runtime PM unrailgates it again when first user
space request arrives - setting unrailgate refcount to 2
Now for gk20a_do_idle(), we need to railgate the GPU in
fist call but that does not happen since unrailgate
refcount != 1
hence, in case railgating is not enabled, we should
unrailgate the GPU from only one place i.e. when first user
space request arrives
Bug 200142989
Bug 200137963
Bug 1678611
Reviewed-on: http://git-master/r/820321
(cherry picked from commit 452a1ff8da8e3f47caed2371440f9ad150bf8699)
Change-Id: Ic9fe2267c9df5629315c30c1404c2b3044c1265a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/822296
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, if can_railgate = false, then we have below
sequence to allocate secure_page
- unrailgate GPU (forever)
- reset_assert()
- allocate secure_page
- reset_deassert()
But if we allocate secure page even before unrailgating GPU
for first time, then we can avoid reset_assert()/deassert()
calls since GPU should already be in reset/railgated at
boot time
hence, rework this sequence as below
- init required mutex, set platform->reset_control
- allocate secure page (GPU should already be in reset
at this point)
- gk20a_pm_init() which unrailgates GPU in case of
can_railgate = false
Bug 200137963
Bug 1678611
Change-Id: I79d0543bb5cf1eaf1009e1e6ac142532d84514a5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797153
(cherry picked from commit 368004501943d38c003747f6bec0384fed57ee65)
Reviewed-on: http://git-master/r/816005
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scale timeslice register value based on platform
specific ptimer scale koefficient.
Expose timeslice values through debugfs to simplify performance
tuning.
bug 1605552
bug 1603226
Change-Id: I49f86f22d58d26a366ee1b5f5a9ab9d7f896ad25
Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com>
Reviewed-on: http://git-master/r/800007
(cherry picked from commit 00c85ef24cf28ffaa81eb53fff7edef1c699220a)
Reviewed-on: http://git-master/r/808251
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 200137963
Change-Id: I3197af905c945540b97ba191e5695d970d77af8e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797154
(cherry picked from commit 8a50245ea636deb87a3d9435fb115b4eac88fac9)
Reviewed-on: http://git-master/r/808247
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It fixed the issue that system hang when reboot.
Bug 1638850
Change-Id: If53a31e86c10b2fce4a22fe4fcf92106d86c95ef
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/803234
(cherry picked from commit 4dbea2c7037a5244ccb9d6e886023c29ba584892)
Reviewed-on: http://git-master/r/808245
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create debugfs node before platform->probe() is called.
Allow chip specific debugfs entries go to correct
directory.
bug 1525327
bug 1581799
Change-Id: I2d91bdc1e72dac6787938eff01218c9f871029cb
Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com>
Reviewed-on: http://git-master/r/796092
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/778729
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement per-channel watchdog/timer as per below rules :
- start the timer while submitting first job on channel or if
no timer is already running
- cancel the timer when job completes
- re-start the timer if there is any incomplete job left
in the channel's queue
- trigger appropriate recovery method as part of timeout
handling mechanism
Handle the timeout as per below :
- get timed out channel, and job data
- disable activity on all engines
- check if fence is really pending
- get information on failing engine
- if no engine is failing, just abort the channel
- if engine is failing, trigger the recovery
Also, add flag "ch_wdt_enabled" to enable/disable channel
watchdog mechanism. Watchdog can also be disabled using
global flag "timeouts_enabled"
Set the watchdog time to be 5s using macro
NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS
Bug 200133289
Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797072
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cyclestats snapshot feature is expected for new devices.
The detection code was isolated in separate function and run-time
check added to validate/allow ioctl calls on the current GPU.
Bug 1674079
Change-Id: Icc2f1e5cc50d39b395d31d5292c314f99d67f3eb
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/781697
(cherry picked from commit bdd23136b182c933841f91dd2829061e278a46d4)
Reviewed-on: http://git-master/r/793630
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dumps NV_PGRAPH_PRI_GPC0_GPCCS_FS_GPC
whenever pbus sends the 0xbadf13 error
bug 1662268
Change-Id: I302ffe5c86098e7235ecc8c071a5e2c852455565
Signed-off-by: Sam Payne <spayne@nvidia.com>
Reviewed-on: http://git-master/r/789090
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Inject function addresses of gk20a_do_idle and
gk20a_do_unidle once the nvgpu module loads.
Bug 1476801
Change-Id: I67a8ae7fb654524616c2c2c710013cbc097a3f32
Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com>
Reviewed-on: http://git-master/r/785047
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add device-tree support for tegra power-domains and power-gating
for t186, then perform the related cleanup.
Also enable TEGRA_MC_DOMIANS, PM_GENERIC_DOMAINS_OF and
TEGRA_POWERGATE for t186.
Bug 200105664
Change-Id: I548c6b71a1577afa439a39a0eafc317a1c3cbc68
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
As CONFIG_PM_GENERIC_DOMAINS_OF is enabled, so cleaning-up
the code which remains unused when this config is enabled.
Bug 200070810
Change-Id: I884ca3d6fb8fa6acdff8c1b2fbe66a672758274a
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Make modification to add DT support for gpu
power-domain for T186 chip.
Bug 200105664
Change-Id: Ief8d0a6c84918578c52d153db7eac02587b67ee7
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gk20a_pm_shutdown() is the last callback before
GPU railgate will be forced by platform code
Hence we need to call prepare_poweroff() before
returning from shutdown() to clean up below things
mainly,
1. disable interrupts to ensure that GPU is not
processing any interrupts while railgating
2. disable clocks (and related flags) to ensure
no h/w access from exported clock ops
Note that GPU railgate will be triggered by platform
code since config CONFIG_PM_GENERIC_DOMAINS_OF is
enabled by default
Bug 200123584
Change-Id: Ifaa0d1ba9b01d49bf5cc85d9c9a9feb3815866d8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/770485
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cyclestats mode-e feature supported by userspace only
for t210 devices, so kernel should advertize it only for t210.
Also small check added to prevent BUG in dma-buf.c:826
if device has lack of memory.
Bug 1662506
Change-Id: I8417a8cdd9092e64126382f379d171932e4592a1
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/767073
(cherry picked from commit 06f86b6e78bae5e26e32466716c18e7918efb1b1)
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/767148
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clock bypass divider was changed just before resetting priv ring.
Move the code to a new clk op instead so that it is executed only on
gk20a.
Change-Id: Ic8084a4a5fac23770f50b50f910ced2543ba0f28
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/764970
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Aleksandr Frid <afrid@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add batch support for mapping and unmapping. Batching essentially
helps transform some per-map/unmap overhead to per-batch overhead,
namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB
invalidates. Batching with size 64 has been measured to yield >20x
speed-up in low-level fixed-address mapping microbenchmarks.
Bug 1614735
Bug 1623949
Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/733231
(cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91)
Reviewed-on: http://git-master/r/763812
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
As upstream has removed them, but we are still using these.
So uncommenting these callback assignment.
Bug 200070810
Change-Id: I26a221f9d76f6acef70095eb8afcf440057f464c
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make modification to add DT support for gpu
power-domain for T132 chip.
Bug 200070810
Change-Id: Iac63c8fb5fc5280e9a9f5758e63c9da009f3813d
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
Reviewed-on: http://git-master/r/739698
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 83699a4ec9ebf55f6cc12c76e57dad1d4ec2fbfa.
This hack was put in place as upstream has removed of_node
field from generic_pm_domain structure. But as we are still
using it, so removing this hack.
Bug 200100078
Change-Id: I14e533786fb814e361c580e2883ceff1f63d251f
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add reference counting for channels, and wait for reference count to
get to 0 in gk20a_channel_free() before actually freeing the channel.
Also, change free channel tracking a bit by employing a list of free
channels, which simplifies the procedure of finding available channels
with reference counting.
Each use of a channel must have a reference taken before use or held
by the caller. Taking a reference of a wild channel pointer may fail, if
the channel is either not opened or in a process of being closed. Also,
add safeguards for protecting accidental use of closed channels,
specifically, by setting ch->g = NULL in channel free. This will make it
obvious if freed channel is attempted to be used.
The last user of a channel might be the deferred interrupt handler,
so wait for deferred interrupts to be processed twice in the channel
free procedure: once for providing last notifications to the channel
and once to make sure there are no stale pointers left after referencing
to the channel has been denied.
Finally, fix some races in channel and TSG force reset IOCTL path,
by pausing the channel scheduler in gk20a_fifo_recover_ch() and
gk20a_fifo_recover_tsg(), while the affected engines have been identified,
the appropriate MMU faults triggered, and the MMU faults handled. In this
case, make sure that the MMU fault does not attempt to query the hardware
about the failing channel or TSG ids. This should make channel recovery
more safe also in the regular (i.e., not in the interrupt handler) context.
Bug 1530226
Bug 1597493
Bug 1625901
Bug 200076344
Bug 200071810
Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/448463
(cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353)
Reviewed-on: http://git-master/r/755147
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That is a kernel supporting code for cyclestats mode E.
Cyclestats mode E implemented following Windows-design in user-space
and required the following operations to be implemented:
- attach a client for shared hardware buffer of device
- detach client from shared hardware buffer
- flush means copy of available data from hardware buffer to private
client buffers according to perfmon IDs assigned for clients
- perfmon IDs management for user-space clients
- a NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability added
Bug 1573150
Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/740653
(cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f)
Reviewed-on: http://git-master/r/753274
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.
Bug 200112195
Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes GPU pbdma interrupt to be generated.
Bug 200106514
Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
|
|
|
|
|
|
|
|
| |
Change-Id: I96282b4e047ba8b5369dac039f0f51856c69235b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/747935
(cherry-picked from commit 0bb090745b4122fc4149b1bd6026138a1b9a32bc)
Reviewed-on: http://git-master/r/749235
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.
The original commit was actually fine.
Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|