| Commit message | Author | Age |
| |
Use new APIs from <nvgpu/list.h> to access free
channel list
Define channel_gk20a_from_free_chs() to convert
a list node to struct channel_gk20a
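As an illustration of the node-to-struct conversion described here, a minimal userspace sketch follows. The node type sketch_list_node and the cut-down struct channel_gk20a layout are assumptions made for the example; only the name channel_gk20a_from_free_chs() comes from the commit message, and the real helper may be implemented differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, standing in for the node type from <nvgpu/list.h>. */
struct sketch_list_node {
    struct sketch_list_node *prev;
    struct sketch_list_node *next;
};

/* Cut-down channel: only the fields the example needs (assumed layout). */
struct channel_gk20a {
    int hw_chid;
    struct sketch_list_node free_chs; /* links the channel into the free list */
};

/* Recover the enclosing channel from a pointer to its embedded free_chs node. */
static struct channel_gk20a *
channel_gk20a_from_free_chs(struct sketch_list_node *node)
{
    assert(node != NULL);
    return (struct channel_gk20a *)
        ((char *)node - offsetof(struct channel_gk20a, free_chs));
}
```

The pointer arithmetic subtracts the offset of the embedded member, which is the same idea as the container_of() pattern used throughout the Linux kernel.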
Jira NVGPU-13
Change-Id: Idaf58f04be1c7fc553bea7c8de45951bf82bb340
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1303025
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Move code that touches host registers and instance block to fifo HAL.
This involves adding HAL ops for the fifo HAL functions that get
called from outside fifo. This narrows the responsibility of channel,
leaving it to manage only channels in software and push buffers.
channel had a member ramfc defined, but it was not used, so remove it.
pbdma_acquire_val consisted of both channel logic and hardware
programming. The channel logic was moved to the caller and only
the hardware programming was moved to the HAL.
Change-Id: Id005787f6cc91276b767e8e86325caf966913de9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1322423
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
fifo reset_enable_hw is reorganized to clear and enable pbdma/fifo interrupts
only after all the required configuration, such as configuring timeouts
and enabling timeout detection, has been taken care of.
JIRA GPUT19X-74
JIRA GPUT19X-47
Change-Id: Id780cc11d858db18f8d748c037954ede73298506
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1325351
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
For fake faults, error notifiers are expected to be set
before triggering the fake mmu fault.
JIRA GPUT19X-7
Change-Id: I458af8d95c5960f20693b6923e1990fe3aa59857
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1323413
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
- gk20a_fifo_set_runlist_state() can be used to enable/disable the runlist
scheduler. This change will also be needed for t19x fifo recovery.
- Also delete the gk20a_fifo_disable_all_engine_activity() function, as it is
not used anywhere.
JIRA GPUT19X-7
Change-Id: I6bb9a7574a473327f0e47060f32d52cd90551c6d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1315180
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
The is_preempt_pending fifo op is added because the t19x
preempt-done sequence is different from that of legacy
chips.
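A minimal sketch of how a per-chip op like this can be dispatched, assuming a function-pointer table. Everything except the op name is_preempt_pending is hypothetical: the struct layouts, the fake register fields, and both check functions are invented for illustration and do not describe what any real chip checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_gpu;

/* Hypothetical cut-down fifo HAL: one op, filled in per chip. */
struct sketch_fifo_ops {
    bool (*is_preempt_pending)(const struct sketch_gpu *g);
};

struct sketch_gpu {
    struct sketch_fifo_ops fifo;
    unsigned int preempt_reg; /* stand-in for a hardware status register */
    unsigned int extra_busy;  /* stand-in for extra state a newer chip checks */
};

/* Assumed legacy-style check: a single pending bit. */
static bool sketch_legacy_is_preempt_pending(const struct sketch_gpu *g)
{
    return (g->preempt_reg & 0x1u) != 0;
}

/* Assumed newer-style check: the pending bit plus one more condition. */
static bool sketch_newer_is_preempt_pending(const struct sketch_gpu *g)
{
    return (g->preempt_reg & 0x1u) != 0 || g->extra_busy != 0;
}

/* Common code calls through the op and never needs to know the chip. */
static bool sketch_preempt_done(const struct sketch_gpu *g)
{
    assert(g->fifo.is_preempt_pending != NULL);
    return !g->fifo.is_preempt_pending(g);
}
```

With this shape, supporting a chip with a different preempt-done sequence only requires installing a different function in the ops table.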
Change-Id: I6b46be1f5b911ae11bbe806968cb8fabb21848e0
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1309678
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
This change is required to support t19x mmu fault
Change-Id: I3953dcf02c71ace606ba81896e56ea98683eb2ca
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1313482
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
required to support t19x mmu fault
Change-Id: Ibe621d924717696a359d7e2065beb6501a9f9b5e
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1315928
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
JIRA: EVLR-1004
(*) Refactor the non-stalling interrupt path to execute clear on the
top half, so that on dGPU, processing of stalling interrupts does not
block non-stalling ones.
(*) Use a worker thread to do semaphore wakeups and allow batching of
the non-stalling operations.
(*) Fix a bug where some gpus will not properly track the completion
of interrupts, preventing safe driver unloads
Change-Id: Icc90a3acba544c97ec6a9285ab235d337ab9eefa
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1312796
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Navneet Kumar <navneetk@nvidia.com>
| |
Fifo ops added for dumping channel & ramfc status
and pbdma & engine status.
Change-Id: Icc739f4f05f0864721954489517fefdfa2fa608a
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1302369
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
Add a debugfs interface to profile the kickoff ioctl.
It provides the probability distribution and breaks the information
down into time spent in: the full ioctl, the kickoff function,
job tracking, and pushbuffer
copies.
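The commit message does not say how the distribution is computed. As one possible illustration, the sketch below derives nearest-rank percentiles from a set of timing samples; the function name, the sample type, and the choice of the nearest-rank method are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for uint64_t samples. */
static int sketch_cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x > y) - (x < y);
}

/*
 * Nearest-rank percentile of n timing samples (for example nanoseconds
 * spent in one profiled stage). Sorts the array in place.
 */
static uint64_t sketch_percentile(uint64_t *samples, size_t n, unsigned int pct)
{
    size_t rank;

    assert(n > 0 && pct <= 100);
    qsort(samples, n, sizeof(*samples), sketch_cmp_u64);
    rank = (pct * n + 99) / 100; /* ceil(pct * n / 100) */
    if (rank == 0)
        rank = 1;
    return samples[rank - 1];
}
```

Calling this once per stage (full ioctl, kickoff, job tracking, pushbuffer copy) would give one distribution summary per stage.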
JIRA: EVLR-1003
Change-Id: I9888b114c3fbced61b1cf134c79f7a8afce15f56
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1308997
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
*client_type_gpc_v is different for t19x
Change-Id: Ic8f8eff2d98138a877ef95c6f7f40226f0d61a61
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1313436
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
-Init pbdma and engine bit mask per runlist.
-Organize debug info to print supported pbdma instances
for particular runlist.
JIRA GV11B-3
Change-Id: Ie34dd98ccbe2c779ca1c795855c2a7df4abd2715
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1309706
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
Instead of using Linux APIs for mutexes and spinlocks
directly, use the new APIs defined in <nvgpu/lock.h>.
Replace Linux-specific mutex/spinlock declaration,
init, lock, and unlock APIs with the new APIs,
e.g.
struct mutex is replaced by struct nvgpu_mutex and
mutex_lock() is replaced by nvgpu_mutex_acquire().
Also include <nvgpu/lock.h> instead of including
<linux/mutex.h> and <linux/spinlock.h>.
Add explicit nvgpu/lock.h includes to the files below
to fix compilation failures:
gk20a/platform_gk20a.h
include/nvgpu/allocator.h
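To show the shape of such a wrapper layer, here is a minimal userspace sketch that wraps a POSIX mutex instead of the Linux kernel one. struct nvgpu_mutex and nvgpu_mutex_acquire() are named in the commit message; nvgpu_mutex_init(), nvgpu_mutex_release(), the pthread backend, and the counter example are assumptions for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* OS-neutral mutex type; in this sketch it wraps a pthread mutex. */
struct nvgpu_mutex {
    pthread_mutex_t mutex;
};

/* Assumed init helper; returns 0 on success. */
static int nvgpu_mutex_init(struct nvgpu_mutex *m)
{
    return pthread_mutex_init(&m->mutex, NULL);
}

static void nvgpu_mutex_acquire(struct nvgpu_mutex *m)
{
    int err = pthread_mutex_lock(&m->mutex);

    assert(err == 0);
    (void)err;
}

/* Assumed counterpart of nvgpu_mutex_acquire(). */
static void nvgpu_mutex_release(struct nvgpu_mutex *m)
{
    int err = pthread_mutex_unlock(&m->mutex);

    assert(err == 0);
    (void)err;
}

/* Example user: the caller only sees the wrapper, never the OS API. */
struct sketch_counter {
    struct nvgpu_mutex lock;
    int value;
};

static int sketch_counter_inc(struct sketch_counter *c)
{
    int v;

    nvgpu_mutex_acquire(&c->lock);
    v = ++c->value;
    nvgpu_mutex_release(&c->lock);
    return v;
}
```

Because callers depend only on the wrapper names, the backend can be swapped per operating system without touching driver code.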
Jira NVGPU-13
Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1293187
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
Setting timeslice for virtualized case was not effective,
because both ioctls NVGPU_TSG_IOCTL_SET_TIMESLICE and
NVGPU_SCHED_IOCTL_TSG_SET_TIMESLICE were calling the
native function to set TSG timeslice.
- Fixed wrapper function to call HAL
- Defined HAL function for "native" set TSG timeslice
- Also, properly update timeout_us in TSG context, in
virtualized case.
This change also moves the min/max bounds checking for
tsg timeslice into the native function implementation.
There is no sysfs node for these parameters for vgpu,
as RM server is ultimately responsible for this check.
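The split between the wrapper, the native implementation, and the virtualized one can be sketched as follows. All names, the struct layouts, and the bound values are hypothetical; the commit message does not give the real limits or signatures.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed bounds for illustration only; the real limits are not stated. */
#define SKETCH_TIMESLICE_MIN_US 1000u
#define SKETCH_TIMESLICE_MAX_US 50000u

struct sketch_tsg {
    unsigned int timeslice_us;
};

/* Hypothetical HAL entry for setting a TSG timeslice. */
struct sketch_ops {
    int (*tsg_set_timeslice)(struct sketch_tsg *tsg, unsigned int us);
};

/* Native implementation: owns the min/max bounds check. */
static int sketch_native_set_timeslice(struct sketch_tsg *tsg, unsigned int us)
{
    if (us < SKETCH_TIMESLICE_MIN_US || us > SKETCH_TIMESLICE_MAX_US)
        return -EINVAL;
    tsg->timeslice_us = us;
    return 0;
}

/* Virtualized implementation: no local bounds check, the server decides. */
static int sketch_vgpu_set_timeslice(struct sketch_tsg *tsg, unsigned int us)
{
    /* A real driver would send a request to the RM server here. */
    tsg->timeslice_us = us; /* keep the software copy in sync */
    return 0;
}

/* Wrapper shared by both ioctls: always dispatch through the HAL. */
static int sketch_tsg_set_timeslice(const struct sketch_ops *ops,
                                    struct sketch_tsg *tsg, unsigned int us)
{
    return ops->tsg_set_timeslice(tsg, us);
}
```

Routing both ioctls through the same wrapper is what prevents the virtualized case from silently falling back to the native path.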
Bug 200263575
Change-Id: Ibceab9427561ad58ec28abfff0c96ca8f592bdb9
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1283180
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occurring in the future.
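For readers unfamiliar with this class of bug, the sketch below shows what C does when an int is compared with an unsigned int, and one way to get the intended result. The function names are hypothetical. The commit message does not say which warning flag was enabled; GCC's -Wsign-compare is one flag that reports such comparisons.

```c
#include <stdbool.h>

/*
 * What "value < limit" does when value is int and limit is unsigned int:
 * value is converted to unsigned first, so -1 becomes a very large number.
 * The cast only makes that implicit conversion visible.
 */
static bool sketch_less_than_implicit(int value, unsigned int limit)
{
    return (unsigned int)value < limit;
}

/* Intended semantics: compare the two as mathematical integers. */
static bool sketch_less_than_fixed(int value, unsigned int limit)
{
    if (value < 0)
        return true; /* any negative number is below an unsigned limit */
    return (unsigned int)value < limit;
}
```

With value = -1 and limit = 1, the implicit form reports "not less than", which is the kind of surprise the stricter warning level is meant to catch at compile time.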
Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
When bar1 memory is not supported then userd will be
allocated from sysmem.
Functions gp_get and gp_put are updated accordingly.
JIRA GV11B-1
Change-Id: Ia895712a110f6cca26474228141488f5f8ace756
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1225384
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
To handle chip-specific runlist entry size and structure,
add and implement the relevant function pointers.
Bug 1735760
Change-Id: I01f3ea78fb21d9fe30c82ba51ef24d7d95ebf90a
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1214473
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
When trying to power down GPU the engine might be still busy. In this
case delay power down by returning -EBUSY from
gk20a_pm_runtime_suspend().
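A minimal sketch of the busy check, assuming a boolean stand-in for the hardware engine status. The struct and function names are hypothetical; only the -EBUSY return value comes from the commit message.

```c
#include <errno.h>
#include <stdbool.h>

struct sketch_gpu_state {
    bool engine_busy; /* stand-in for reading engine status from hardware */
    bool powered;
};

/* Returns 0 once powered down, or -EBUSY to ask the caller to retry later. */
static int sketch_runtime_suspend(struct sketch_gpu_state *g)
{
    if (g->engine_busy)
        return -EBUSY; /* leave the GPU powered, do not touch any state */
    g->powered = false;
    return 0;
}
```

The point of the early return is that nothing is torn down while work is still in flight.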
Bug 200224907
Change-Id: Ibad74c090add24a185bc1a7a02df367af9b95ced
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1213042
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
| |
- let force_reset_ch pass down err code
- force_reset_ch callback can cover vgpu too.
Bug 1776876
JIRA VFND-2151
Change-Id: I48f7890294c6455247198e0cab5f21f83f61f0e1
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1202255
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
We currently store fault_id into fifo.deferred_fault_engines
and use that in gk20a_fifo_reset_engine(), which is incorrect.
Also, in the deferred engine reset path during channel close,
we do not check whether the channel is loaded on the engine.
Fix this as follows:
- store engine_id bits into fifo.deferred_fault_engines
- define new API gk20a_fifo_deferred_reset() to perform
deferred engine reset
- get all engines on which channel is loaded with
gk20a_fifo_engines_on_id()
- for each set bit/engine_id in fifo.deferred_fault_engines,
check if channel is loaded on that engine, and if yes,
reset the engine
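The steps above can be sketched as a loop over the deferred bitmask. The struct layout, the 32-engine limit, the reset counter, and the clearing of the mask at the end are assumptions for illustration; the engines_on_ch argument stands in for the result of gk20a_fifo_engines_on_id().

```c
#include <stdint.h>

#define SKETCH_MAX_ENGINES 32u

struct sketch_fifo {
    uint32_t deferred_fault_engines;          /* one bit per engine_id */
    uint32_t reset_count[SKETCH_MAX_ENGINES]; /* records resets for the demo */
};

/* Stand-in for the real engine reset. */
static void sketch_reset_engine(struct sketch_fifo *f, uint32_t engine_id)
{
    f->reset_count[engine_id]++;
}

/*
 * engines_on_ch: bitmask of engine_ids the closing channel is loaded on.
 * Returns the mask of engines that were actually reset.
 */
static uint32_t sketch_deferred_reset(struct sketch_fifo *f,
                                      uint32_t engines_on_ch)
{
    uint32_t done = 0;

    for (uint32_t id = 0; id < SKETCH_MAX_ENGINES; id++) {
        uint32_t bit = UINT32_C(1) << id;

        if ((f->deferred_fault_engines & bit) == 0)
            continue; /* no deferred fault recorded for this engine */
        if ((engines_on_ch & bit) != 0) {
            sketch_reset_engine(f, id);
            done |= bit;
        }
    }
    f->deferred_fault_engines = 0; /* assumed: deferred state is consumed */
    return done;
}
```

Storing engine_id bits rather than fault ids is what makes the mask directly comparable with the set of engines the channel is loaded on.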
Bug 1791696
Change-Id: I1b8b1a9e3aa538fe6903a352aa732b47c95ec7d5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1195087
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Added an interface to allow the kernel to create privileged CE channels for
page migration and clearing support between sysmem and vidmem.
JIRA DNVGPU-53
Change-Id: I3e18d18403809c9e64fa45d40b6c4e3844992506
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1173085
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
| |
Add CE engine to vgpu engine list. CE engine is defined differently
for different GPUs, so we also add HAL for initializing the engine
info.
Bug 1780185
Change-Id: I5ae265551feac08d0c4d45402dd3277514e62b2d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1169720
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Tested-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Lakshmanan M <lm@nvidia.com>
| |
Extend the existing NVGPU_GPU_IOCTL_OPEN_CHANNEL interface to allow
opening channels for other than the primary (i.e., the graphics)
runlists. This is required to push work to dGPU engines that have
their own runlists, such as the asynchronous copy engines and the
multimedia engines.
Minor change - Added active_engines_list allocation
and assignment for fifo_vgpu back end.
JIRA DNVGPU-25
Change-Id: I3ed377e2c9a2b4dd72e8256463510a62c64e7a8f
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1161541
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
This CL covers the following modifications:
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt handling support
for gm206 GPU family
5) Added generic mechanism to identify the
CE engine pri_base address for gm206
(CE0, CE1 and CE2)
6) Removed hard-coded engine_id logic and
made it generic
7) Code cleanup for readability
JIRA DNVGPU-26
Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1155963
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
In gk20a_fifo_abort_tsg(), we loop through the channels of the
TSG and call gk20a_channel_abort() for each channel.
This is incorrect since we disable and preempt each
channel separately, whereas we should disable all channels
at once and use the TSG-specific API to preempt the TSG.
Fix this with the sequence below:
- gk20a_disable_tsg() to disable all channels
- preempt tsg if required
- for each channel in TSG
  - set has_timedout flag
  - call gk20a_channel_abort_clean_up() to clean up channel state
Also, separate out common gk20a_channel_abort_clean_up() API
which can be called from both channel and TSG abort routines
In gk20a_channel_abort(), call gk20a_fifo_abort_tsg() if the
channel is part of TSG
Add new argument "preempt" to gk20a_fifo_abort_tsg() and
preempt TSG if flag is set
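A minimal sketch of the abort ordering described in this commit, using plain flags in place of hardware state. The struct layouts and the sketch_ function names are hypothetical stand-ins for gk20a_disable_tsg(), the TSG preempt, and gk20a_channel_abort_clean_up().

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_CH 4

struct sketch_channel {
    bool enabled;
    bool has_timedout;
    bool cleaned_up;
};

struct sketch_tsg {
    struct sketch_channel ch[SKETCH_MAX_CH];
    size_t num_ch;
    unsigned int preempt_count; /* how many times the TSG was preempted */
};

/* Disable every channel of the TSG in one step. */
static void sketch_disable_tsg(struct sketch_tsg *t)
{
    for (size_t i = 0; i < t->num_ch; i++)
        t->ch[i].enabled = false;
}

/* One preempt for the whole TSG, not one per channel. */
static void sketch_preempt_tsg(struct sketch_tsg *t)
{
    t->preempt_count++;
}

/* Common clean-up shared by channel and TSG abort paths. */
static void sketch_channel_abort_clean_up(struct sketch_channel *c)
{
    c->cleaned_up = true;
}

static void sketch_abort_tsg(struct sketch_tsg *t, bool preempt)
{
    sketch_disable_tsg(t);
    if (preempt)
        sketch_preempt_tsg(t);
    for (size_t i = 0; i < t->num_ch; i++) {
        t->ch[i].has_timedout = true;
        sketch_channel_abort_clean_up(&t->ch[i]);
    }
}
```

The key difference from a per-channel abort loop is that disable and preempt happen once for the group before any per-channel clean-up starts.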
Bug 200205041
Change-Id: I4eff5394d26fbb53996f2d30b35140b75450f338
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1157190
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Added device_info_data parsing
support for maxwell GPU series.
JIRA DNVGPU-26
Change-Id: I06dbec6056d4c26501e607c2c3d67ef468d206f4
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1151602
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
We were not using the engine_type field in device info, and the code
did not handle chained entries properly. The code assumed that first
entry is for graphics and second for CE, which is not always true.
Improve the code to go through all entries of device_info, and
preserve values across entries until we reach the last entry.
Only last entry triggers a write to fifo engine info.
There can also be multiple engines with same type, so accumulate
interrupts and reset ids from all of them.
As the code got fixed, now it reads the engine enum correctly from
hardware. We used to compare that against CE0, but we should compare
against CE2.
gk20a_fifo_reset_engine() uses the wrong constants: it is passed an
internal numbering of engines, but it compares them against the hardware
engine enum.
Change-Id: Ia59273921c602d2a090f7a5b1404afb0fca2532c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1147746
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
| |
JIRA EVLR-244
JIRA EVLR-318
Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120603
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
| |
Previously, only "high" priority bare channels were interleaved
between all other bare channels and TSGs. This patch decouples
priority from interleaving and introduces 3 levels for interleaving
a bare channel or TSG: high, medium, and low. The levels define
the number of times a channel or TSG will appear on a runlist (see
nvgpu.h for details).
By default, all bare channels and TSGs are set to interleave level
low. Userspace can then request the interleave level to be increased
via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will
be added later).
As timeslice settings will soon be coming from userspace, the default
timeslice for "high" priority channels has been restored.
JIRA VFND-1302
Bug 1729664
Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1014962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Export separate API gk20a_fifo_issue_preempt() to issue
preempt request to a channel or TSG
Bug 200156699
Change-Id: Ib3b097ef66a6411d75c1fe213cdbe8b1d08d3418
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935771
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Support preprocessing of SM exceptions if API
pointer pre_process_sm_exception() is defined
Also, expose some common APIs
Bug 200156699
Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/925883
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Interleave all high priority channels between all other channels.
This reduces the latency for high priority work when there
is a lot of lower priority work present, imposing an upper
bound on the latency. Change the default high priority timeslice
from 5.2 ms to 3.0 ms in the process, to prevent long-running high priority
apps from hogging the GPU too much.
Introduce a new debugfs node to enable/disable high priority
channel interleaving. It is currently enabled by default.
Adds a new runlist length max register, used for allocating
a suitably sized runlist.
Limit the number of interleaved channels to 32.
This change reduces the maximum time a lower priority job
is running (one timeslice) before we check that high priority
jobs are running.
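A minimal sketch of the interleaving idea: every non-high-priority entry is preceded by the full set of high priority entries. The function name, the use of plain int channel ids, and the fallback to a flat list when the limit of 32 is exceeded are assumptions; the commit message states the limit but not what happens beyond it.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_INTERLEAVED 32u /* limit quoted in the commit message */

/*
 * Fill 'out' (capacity 'cap') with channel ids. When interleaving applies,
 * all high priority ids are emitted before each other id.
 * Returns the number of entries written, or 0 if 'out' is too small.
 */
static size_t sketch_build_runlist(const int *high, size_t n_high,
                                   const int *other, size_t n_other,
                                   int *out, size_t cap)
{
    size_t n = 0;
    bool interleave = n_high <= SKETCH_MAX_INTERLEAVED && n_other > 0;

    if (!interleave) {
        /* Assumed fallback: flat list, every channel appears once. */
        if (n_high > cap || n_other > cap - n_high)
            return 0;
        for (size_t i = 0; i < n_high; i++)
            out[n++] = high[i];
        for (size_t i = 0; i < n_other; i++)
            out[n++] = other[i];
        return n;
    }

    /* Interleaved list needs n_other * (n_high + 1) entries. */
    if (n_other > cap / (n_high + 1))
        return 0;
    for (size_t i = 0; i < n_other; i++) {
        for (size_t j = 0; j < n_high; j++)
            out[n++] = high[j];
        out[n++] = other[i];
    }
    return n;
}
```

With one high priority channel and three others, the list alternates between the high priority channel and each of the others, so a lower priority job runs for at most one timeslice before a high priority entry comes up again.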
Tested with gles2_context_priority (still passes)
Basic sanity testing is done with graphics_submit
(one app is high priority)
Also more functional testing using lots of parallel runs with:
NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw
--drawsperframe 20000 --triangles 50 --runtime 30 --finish
plus multiple:
NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw
--drawsperframe 20000 --triangles 50 --runtime 30 -finish
Previous to this change, the relative performance between
high priority work and normal priority work comes down
to timeslice value. This means that when there are many
low priority channels, the high priority work will still
drop quite a lot. But with this change, the high priority
work will roughly get about half the entire GPU time, meaning
that after the initial lower performance, it is less likely
to get lower in performance due to more apps running on the system.
This change makes a large step towards real priority levels.
It is not perfect and there are no guarantees on anything,
but it is a step forwards without any additional CPU overhead
or other complications. It will also serve as a baseline to
judge other algorithms against.
Support for priorities with TSG is future work.
Support for interleave mid + high priority channels,
instead of just high, is also future work.
Bug 1419900
Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98
Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com>
Reviewed-on: http://git-master/r/805961
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
We currently set the "aggressive_destroy" flag, which controls sync
object destruction, statically and for each sync object.
Move this flag to the per-platform structure so that it
can be set per-platform for all the sync objects.
Also, set the default value of this flag to "false"
and set it to "true" once we have more than 64
channels in use.
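A minimal sketch of the threshold behaviour, assuming a simple in-use counter. The struct, the field names, and the function are hypothetical; the commit message does not say whether the flag is cleared again when the channel count drops, so this sketch leaves it set.

```c
#include <stdbool.h>

#define SKETCH_AGGRESSIVE_DESTROY_THRESHOLD 64u /* value from the commit message */

struct sketch_platform {
    bool aggressive_sync_destroy; /* defaults to false */
    unsigned int channels_in_use;
};

/* Called whenever a channel is opened. */
static void sketch_channel_opened(struct sketch_platform *p)
{
    p->channels_in_use++;
    if (p->channels_in_use > SKETCH_AGGRESSIVE_DESTROY_THRESHOLD)
        p->aggressive_sync_destroy = true;
}
```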
Bug 200141116
Change-Id: I1bc271df4f468a4087a06a27c7289ee0ec3ef29c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/822041
(cherry picked from commit 98741e7e88066648f4f14490c76b61dbff745103)
Reviewed-on: http://git-master/r/835800
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Implement per-channel watchdog/timer as per below rules :
- start the timer while submitting first job on channel or if
no timer is already running
- cancel the timer when job completes
- re-start the timer if there is any incomplete job left
in the channel's queue
- trigger appropriate recovery method as part of timeout
handling mechanism
Handle the timeout as per below :
- get timed out channel, and job data
- disable activity on all engines
- check if fence is really pending
- get information on failing engine
- if no engine is failing, just abort the channel
- if engine is failing, trigger the recovery
Also, add flag "ch_wdt_enabled" to enable/disable channel
watchdog mechanism. Watchdog can also be disabled using
global flag "timeouts_enabled"
Set the watchdog time to be 5s using macro
NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS
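The start, cancel, and re-start rules can be sketched as a small state machine driven by explicit timestamps. The struct, the function names, and the use of a caller-supplied clock are assumptions; the 5 s timeout is the value quoted in the commit message, and the recovery step itself is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_WDT_TIMEOUT_MS 5000u /* 5 s, as stated in the commit message */

struct sketch_wdt {
    bool enabled; /* plays the role of ch_wdt_enabled */
    bool running;
    unsigned int pending_jobs;
    uint64_t deadline_ms;
};

static void sketch_wdt_start(struct sketch_wdt *w, uint64_t now_ms)
{
    if (!w->enabled)
        return;
    w->running = true;
    w->deadline_ms = now_ms + SKETCH_WDT_TIMEOUT_MS;
}

/* Start the timer on submit only if no timer is already running. */
static void sketch_on_submit(struct sketch_wdt *w, uint64_t now_ms)
{
    w->pending_jobs++;
    if (!w->running)
        sketch_wdt_start(w, now_ms);
}

/* Cancel on completion, then re-start if incomplete jobs remain. */
static void sketch_on_job_complete(struct sketch_wdt *w, uint64_t now_ms)
{
    assert(w->pending_jobs > 0);
    w->pending_jobs--;
    w->running = false;
    if (w->pending_jobs > 0)
        sketch_wdt_start(w, now_ms);
}

/* True when the timeout handling path should run. */
static bool sketch_wdt_expired(const struct sketch_wdt *w, uint64_t now_ms)
{
    return w->running && now_ms >= w->deadline_ms;
}
```

Re-arming on each completion means the deadline always measures time since the channel last made progress, not time since the first submit.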
Bug 200133289
Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797072
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Add below APIs to disable/re-enable activity on all
engines
gk20a_fifo_disable_all_engine_activity()
gk20a_fifo_enable_all_engine_activity()
Bug 200133289
Change-Id: Ie01a260d587807a3c1712ee32fe870fbcb08f9cd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/798747
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
bug 1625901
1) disable ELPG before doing GR reset when runlist update times out
2) add mutex for GR reset to avoid multiple threads resetting GR
3) protect GR reset with FECS mutex so that no one else submits methods
Change-Id: I02993fd1eabe6875ab1c58a40a06e6c79fcdeeae
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/793643
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
In gk20a_fifo_handle_sched_error(), we currently have a sequence
to identify failing engine (stuck on context switch) and
corresponding failing channel with its type
Separate out this sequence into a new API,
gk20a_fifo_get_failing_engine_data(), so that it can be
reused from elsewhere too
Bug 200133289
Change-Id: I3cef395170cf8990c014c7505c798fd6f2e37921
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797070
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
bug 200114561
1) when handling sched error, if CTXSW status reads "switch",
check the FECS mailbox register to know whether the next or current
channel caused the error
2) Update recovery function to use ch id passed to it
3) Recovery function now passes mmu_engine_id to mmu fault
handler instead of fifo_engine_id
Change-Id: I3576cc4a90408b2f76b2c42cce19c27344531b1c
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/763538
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
| |
Add reference counting for channels, and wait for reference count to
get to 0 in gk20a_channel_free() before actually freeing the channel.
Also, change free channel tracking a bit by employing a list of free
channels, which simplifies the procedure of finding available channels
with reference counting.
Each use of a channel must have a reference taken before use or held
by the caller. Taking a reference of a wild channel pointer may fail, if
the channel is either not opened or in a process of being closed. Also,
add safeguards for protecting accidental use of closed channels,
specifically, by setting ch->g = NULL in channel free. This will make it
obvious if freed channel is attempted to be used.
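A minimal sketch of a reference count whose "get" can fail, using C11 atomics. All names and the struct layout are hypothetical, and the driver's own locking is not shown; in particular, the real free path waits for the count to reach zero, whereas this sketch only exposes a check the caller could wait on.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_channel {
    atomic_int ref_count;
    atomic_bool referenceable; /* false while not opened or being closed */
};

static void sketch_channel_open(struct sketch_channel *ch)
{
    atomic_init(&ch->ref_count, 0);
    atomic_init(&ch->referenceable, true);
}

static void sketch_channel_put(struct sketch_channel *ch)
{
    atomic_fetch_sub(&ch->ref_count, 1);
}

/*
 * Take a reference, or return NULL if the channel must not be used.
 * The count is raised before the flag is checked, so a concurrent close
 * either sees this reference or this call sees the cleared flag.
 */
static struct sketch_channel *sketch_channel_get(struct sketch_channel *ch)
{
    atomic_fetch_add(&ch->ref_count, 1);
    if (!atomic_load(&ch->referenceable)) {
        sketch_channel_put(ch);
        return NULL;
    }
    return ch;
}

/* First step of closing: deny any new references. */
static void sketch_channel_deny_refs(struct sketch_channel *ch)
{
    atomic_store(&ch->referenceable, false);
}

/* The free path may proceed only once this returns true. */
static bool sketch_channel_idle(struct sketch_channel *ch)
{
    return atomic_load(&ch->ref_count) == 0;
}
```

A caller that receives NULL from the get function simply skips the channel, which is how stale or wild pointers are kept from touching a channel that is going away.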
The last user of a channel might be the deferred interrupt handler,
so wait for deferred interrupts to be processed twice in the channel
free procedure: once for providing last notifications to the channel
and once to make sure there are no stale pointers left after referencing
to the channel has been denied.
Finally, fix some races in the channel and TSG force reset IOCTL path
by pausing the channel scheduler in gk20a_fifo_recover_ch() and
gk20a_fifo_recover_tsg() while the affected engines are identified,
the appropriate MMU faults are triggered, and the MMU faults are handled. In this
case, make sure that MMU fault handling does not attempt to query the hardware
about the failing channel or TSG ids. This should also make channel recovery
safer in the regular (i.e., not in the interrupt handler) context.
Bug 1530226
Bug 1597493
Bug 1625901
Bug 200076344
Bug 200071810
Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/448463
(cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353)
Reviewed-on: http://git-master/r/755147
Reviewed-by: Automatic_Commit_Validation_User
| |
Introduce mem_desc, which holds all information needed for a buffer.
Implement helper functions for allocation and freeing that use this
data type.
Change-Id: I82c88595d058d4fb8c5c5fbf19d13269e48e422f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/712699
| |
PBDMA HW signature depends on the chip.
Change-Id: If57d721d9bb77a090f967930a1aa2037bf4a16fe
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/672922
| |
Make recovery a more straightforward process. When we detect a fault,
trigger MMU fault, and wait for it to trigger, and complete recovery.
Also reset engines before aborting channel to ensure no stray sync
point increments can happen.
Change-Id: Iac685db6534cb64fe62d9fb452391f43100f2999
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/709060
(cherry picked from commit 95c62ffd9ac30a0d2eb88d033dcc6e6ff25efd6f)
Reviewed-on: http://git-master/r/707443
| |
Query interrupt number and reset id from HW. Use the number
from HW when enabling and detecting interrupts.
Bug 200036089
Bug 1567274
Change-Id: If9cb4db79a19dcb193ba7ad9db7081f4fe1ab433
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/600988
| |
Runlist event is not sent in gm20b for updated runlist. Polling is
the preferred way also for gk20a.
Bug 1555239
Change-Id: I60de084db69f848f63451f1f3078f183ca51ba50
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/500241
| |
Use TSG specific API gk20a_fifo_recover_tsg() in following cases :
- IOCTL_CHANNEL_FORCE_RESET
to force reset a channel in TSG, reset all the channels
- handle pbdma intr
while resetting in case of pbdma intr, if channel is part of
TSG, recover entire TSG
- TSG preempt failure
when TSG preempt times out, use TSG recover API
Use preempt_tsg() API to preempt if channel is part of TSG
Add below two generic APIs which will take care of preempting/
recovering either of channel or TSG as required
gk20a_fifo_preempt()
gk20a_fifo_force_reset_ch()
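A minimal sketch of the "channel or TSG" dispatch that a generic preempt entry point performs. The struct layouts, the counters, and the sketch_ function names are hypothetical; only the idea of choosing the TSG path when the channel belongs to a TSG comes from the commit message.

```c
#include <stdbool.h>

struct sketch_fifo {
    unsigned int ch_preempts;  /* times the channel-level preempt ran */
    unsigned int tsg_preempts; /* times the TSG-level preempt ran */
};

struct sketch_channel {
    unsigned int hw_chid;
    bool in_tsg;
    unsigned int tsgid; /* only meaningful when in_tsg is true */
};

static int sketch_preempt_channel(struct sketch_fifo *f, unsigned int hw_chid)
{
    (void)hw_chid; /* a real driver would program the preempt here */
    f->ch_preempts++;
    return 0;
}

static int sketch_preempt_tsg(struct sketch_fifo *f, unsigned int tsgid)
{
    (void)tsgid;
    f->tsg_preempts++;
    return 0;
}

/* Generic entry point: callers no longer need to know about TSGs. */
static int sketch_fifo_preempt(struct sketch_fifo *f,
                               const struct sketch_channel *ch)
{
    if (ch->in_tsg)
        return sketch_preempt_tsg(f, ch->tsgid);
    return sketch_preempt_channel(f, ch->hw_chid);
}
```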
Bug 1470692
Change-Id: I8d46e252af79136be85a9a2accf8b51bd924ca8c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/497875
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
- add and export API "gk20a_fifo_recover_tsg()" to
recover a TSG
- if TSG is running on any engine, then trigger MMU fault
on those engines
- otherwise, abort each channel in TSG
- modify channel specific API engines_on_ch() to generic
engines_on_id() which will take an ID and a flag to specify
whether ID is for channel or TSG and return engines running
on that ID
- modify channel specific API get_faulty_channel() to generic
get_faulty_id_type() which will take pointers to ID and type
of ID (either a regular channel or TSG)
- remove runlist update from recover_ch() since there is
no need to touch the runlist during recovery
- set error notifier first and then only abort the channels
for TSG recovery path
- also, add necessary accessors to get engine
status type as TSG
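A minimal sketch of a generic engines_on_id()-style lookup that takes an ID plus a flag saying whether the ID is a channel or a TSG. The status struct, its fields, and the function signature are hypothetical; in the driver the status would come from hardware registers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of one engine's status. */
struct sketch_engine_status {
    bool busy;
    bool id_is_tsg; /* type of the ID currently on the engine */
    uint32_t id;
};

/* Returns a bitmask of engines currently running the given channel or TSG. */
static uint32_t sketch_engines_on_id(const struct sketch_engine_status *eng,
                                     size_t num_engines,
                                     uint32_t id, bool is_tsg)
{
    uint32_t mask = 0;

    for (size_t i = 0; i < num_engines && i < 32; i++) {
        if (eng[i].busy && eng[i].id == id && eng[i].id_is_tsg == is_tsg)
            mask |= UINT32_C(1) << i;
    }
    return mask;
}
```

Comparing the type as well as the ID matters because a channel and a TSG can share the same numeric ID.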
Bug 1470692
Change-Id: I7137f611f80916b3d256d4b0dc6e5cf1e93eef6f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/497873
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Add API gk20a_fifo_preempt_tsg() which takes ID of tsg
and preempts it
Bug 1514064
Bug 1470692
Change-Id: I1d52c1dd7a9aecc1314b0f223fe4eedecc033629
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/495583
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Currently we clear the runlist and re-create it in
scheduled work during the fifo recovery process.
But we can postpone this runlist re-generation until a
later time, i.e. when the channel is closed.
Hence, remove runlist locks and re-generation from the
handle_mmu_fault() methods. Instead, disable
gr fifo access at the start of recovery and re-enable
it at the end of the recovery process.
Also, delete the scheduled work to re-create the runlist.
Re-enable ELPG and fifo access in
finish_mmu_fault_handling() itself.
bug 1470692
Change-Id: I705a6a5236734c7207a01d9a9fa9eca22bdbe7eb
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/449225
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
| |
Wait for engine idle via FIFO's engine status instead of submitting
WFI to channel. Submitting WFI and waiting is not robust, and wait
might invoke debug dump which cannot be done while powering down.
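A minimal sketch of waiting for idle by polling a status source with a bounded number of attempts. The callback type, the fake engine used to drive it, and the poll limit are hypothetical; a real driver would read the FIFO engine status register and delay between polls.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical status reader: returns true while the engine is busy. */
typedef bool (*sketch_engine_busy_fn)(void *ctx);

/* Poll until idle; returns 0 when idle, -ETIMEDOUT after max_polls tries. */
static int sketch_wait_engine_idle(sketch_engine_busy_fn busy, void *ctx,
                                   unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (!busy(ctx))
            return 0;
        /* a driver would sleep or delay here before polling again */
    }
    return -ETIMEDOUT;
}

/* Fake engine for demonstration: busy for a fixed number of polls. */
struct sketch_fake_engine {
    unsigned int busy_polls_left;
};

static bool sketch_fake_engine_busy(void *ctx)
{
    struct sketch_fake_engine *e = ctx;

    if (e->busy_polls_left > 0) {
        e->busy_polls_left--;
        return true;
    }
    return false;
}
```

Unlike submitting a WFI and waiting on it, this path only reads status, so it does not depend on the channel being able to make progress.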
Bug 1499214
Change-Id: I4d52e8558e1a862ad4292036594d81ebfbd5f36b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/432151
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Tested-by: Sachin Nikam <snikam@nvidia.com>
|