| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MISRA Rule 11.6 prohibits the casting of an integer value to a
void *.
The nvgpu allocator used for the fence pool stores the base
address of the associated memory as a u64 and returns it via
nvgpu_alloc_base().
In gk20a_free_fence_pool() this u64 value was cast to a void *
before being passed to nvgpu_vfree() (leading to the violation).
This change modifies gk20a_free_fence_pool() to cast the base
address back to the original struct gk20a_fence * to eliminate
the violation.
JIRA NVGPU-895: MISRA Rule 11.6 violations
Change-Id: If89cf2c1bc8ea4b0b59da4cf8b1c167738f6badc
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1774530
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sync_gk20a.* files are no longer used by core code and only invoked
from linux specific implementations of the OS_FENCE framework which are
under the common/linux directory. Hence, sync_gk20a.* files are also
moved under common/linux.
JIRA NVGPU-66
Change-Id: If623524611373d2da39b63cfb3c1e40089bf8d22
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1712900
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replaced all instances of sync_fence in gk20a_fence* code with
nvgpu_os_fence. Added the API install_fence for the nvgpu_os_fence
abstraction. sync_fence mechanism and its dependencies are completely
removed from the fence_gk20a methods. Due to the recent os_fence changes
and the changes to fence_gk20a, we can finally get rid of all the CONFIG_SYNCS
present in the submit path.
JIRA NVGPU-66
Change-Id: I3551dab04b93b1e94db83fc102a41872be89e9ed
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1701245
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch deals with cleanups meant to make things simpler for the
upcoming os abstraction patches for the sync framework. This patch
causes some substantial changes which are listed out as follows.
1) sync_timeline is moved out of gk20a_fence into struct
nvgpu_channel_linux. New function pointers are created to facilitate os
independent methods for enabling/disabling timeline and are now named
as os_fence_framework. These function pointers are located in the struct
os_channel under struct gk20a.
2) construction of the channel_sync require nvgpu_finalize_poweron_linux()
to be invoked before invocations to nvgpu_init_mm_ce_context(). Hence,
these methods are now moved away from gk20a_finalize_poweron() and
invoked after nvgpu_finalize_poweron_linux().
3) sync_fence creation is now delinked from fence construction and move
to the channel_sync_gk20a's channel_incr methods. These sync_fences are
mainly associated with post_fences.
4) In case userspace requires the sync_fences to be constructed, we
try to obtain an fd before the gk20a_channel_submit_gpfifo() instead of
trying to do that later. This is used to avoid potential after effects
of duplicate work submission due to failure to obtain an unused fd.
JIRA NVGPU-66
Change-Id: I42a3e4e2e692a113b1b36d2b48ab107ae4444dfa
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1678400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The boolean wfi field in struct gk20a_fence is not used for anything.
Delete it and a couple of function parameters that carried the flag.
Jira NVGPU-43
Change-Id: I399c8709102a3f944cab669ff806761aedaeb6d3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1636344
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In fence_gk20a.c protect <linux/file.h> and <linux/fs.h> includes with config
CONFIG_SYNC since they are only needed with this config enabled
Jira NVGPU-487
Change-Id: I6c26aa0fbb4ee284129109c625a0e324d5caf235
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635471
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The functions gk20a_fence_from_semaphore and gk20a_fence_from_syncpt
return errno-like codes, so replace two conditions with better-fitting
errors than -1.
Change-Id: Ic9a43cd0365c1eb187e7dc19da14acdd2fbc3f1c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1605563
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For nvhost_sync_create_fence, num_pts corresponds to the number of
syncpoints in the array given to it, and the wrapper
nvgpu_nvhost_sync_create_fence only supports one syncpoint at a time.
Use 1 explicitly and make it impossible for the caller of this wrapper
to use something else by mistake.
Change-Id: I2497c1dd4fed0906e3bb07e8f5ddd3a9346cb381
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1604339
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change license of OS independent source code files to MIT.
JIRA NVGPU-218
Change-Id: I1474065f4b552112786974a16cdf076c5179540e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1565880
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- added wrapper struct nvgpu_ref over nvgpu_atomic_t
- added nvgpu_ref_* APIs to access the above struct
JIRA NVGPU-140
Change-Id: Id47f897995dd4721751f7610b6d4d4fbfe4d6b9a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1540899
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Remove support for pre-4.4 kernels. This allows deleting the checks
for kernel version, and usage of linux/version.h.
Change-Id: I4d6cb30512ea164d27549f4f4d096e5931bb1379
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1543499
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
construct wrapper nvgpu_* methods to replace
mb,rmb,wmb,smp_mb,smp_rmb,smp_wmb,read_barrier_depends and
smp_read_barrier_depends.
NVGPU-122
Change-Id: I8d24dd70fef5cb0fadaacc15f3ab11531667a0df
Signed-off-by: Debarshi <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1541199
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix some simple pointer arithmetic errors in the deterministic submit
path. The lockless allocator was doing a subtraction to determine the
correct offset of the element to free. However, this subtraction was
using the base address and a numeric offset - not a pointer offset. Thus
the difference computed was in bytes not in elements of the block size.
The fix is simple: just divide by the block size.
Also this modifies the debugging statement a bit so that a bit more
information is printed at more useful times.
Lastly, a pointer to numeric cast was fixed in the fence code.
Change-Id: I514724205f1b73805b21e979481a13ac689f3482
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1538905
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To make do_idle work when nvgpu is built as a module, reverse the order
of call dependencies for do_idle. Don't provide visible
gk20a_do_{idle,unidle}() functions for the kernel but instead call the
kernel for registering and unregistering pointers to them when the
driver loads and unloads.
Refactor the internal __gk20a_do_{idle,unidle} functions to take a
struct gk20a * instead of struct device *, and use the callback api for
providing that g instead of retrieving the plat device from device tree.
Bug 200290850
Change-Id: Ibef8b069302e547b298069cbb97734f461a10cc3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1493774
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove use of linux specifix header files
<linux/nvhost.h> and <linux/nvhost_ioctl.h>
and use nvgpu specific header file <nvgpu/nvhost.h>
instead
This is needed to remove all Linux dependencies
from nvgpu driver
Replace all nvhost_*() calls by
nvgpu_nvhost_*() calls from new nvgpu library
Remove platform device pointer host1x_dev
from struct gk20a and add struct
nvgpu_nvhost_dev instead
Jira NVGPU-29
Change-Id: Ia7af70602cfc16f9ccc380752538c05a9cbb8a67
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1489726
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The timeout parameter to NVGPU_COND_WAIT()
was passed directly to wait_event_timeout(), which takes jiffies.
Also allows zero timeout to disable timeout.
The return value of NVGPU_COND_WAIT() was defined in a way specific
to how Linux wait_event_() calls work. Replace that with proper error
reporting and change the callers to check against error codes.
JIRA NVGPU-14
Change-Id: Idbd2c8fbbef7589c3ca4f4c5732852bc71217515
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1484927
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change semaphore wait queue to use nvgpu_cond instead of Linux wait
queue.
JIRA NVGPU-14
Change-Id: I3be5097ded168300b4480e986218d9f4fd6104b1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1469852
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix issues related with wrong storage type for
64 bit variables.
(1) Fixed width of HZ_TO_MHZ constant
(2) changed fence_wait timeout to store unsigned
long
bug 200299572
Change-Id: Ie8f2386b738f3aafce75fc2440947e36befac273
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1471611
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Put all debugfs dependencies inside #ifdef CONFIG_DEBUG_FS. This
includes some functions in allocators that were used only for
debugging.
Remove include of linux/debugfs.h on files that do not deal with
debugfs.
linux/debugfs.h implicitly included linux/fs.h, which we relied on.
Add explicit include of linux/fs.h for all files where this is the
case.
Change-Id: I16feffae6b0e3a2edf366075cdc01ade86be06f9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1467897
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add nvgpu_* wrappers for determining if we're running in simulation
or silicon, and if we're running in hypervisor.
The new wrappers require struct gk20a pointer, and gk20a_fence_wait()
did not have access to one. Add struct gk20a pointer as the first
parameter.
JIRA NVGPU-16
Change-Id: I73b2b8f091ca29fb1827054abd2adaf583710331
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1331565
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for better abstraction in job synchronization, drop
support for the dependency fences tracked via submit pre-fences in
semaphore-based syncs. This has only worked for semaphores, not nvhost
syncpoints, and hasn't really been used. The dependency was printed in
the sync framework's sync pt value string.
Remove also the userspace-visible gk20a_sync_pt_info which is not used
and depends on this feature (providing a duration since the dependency
fence's timestamp).
Jira NVGPU-43
Change-Id: Ia2b26502a9dc8f5bef5470f94b1475001f621da1
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1456880
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the new kmem API functions in misc gk20a code. Some additional
modifications were also made:
o Add a struct gk20a pointer to gk20a_fence to enable proper
kmem free usage.
o Add gk20a pointer to alloc_session() in dbg_gpu_gk20a.c to
use kmem API for allocating a session.
o Plumb a gk20a pointer through the fence creation and deletion.
o Use statically allocated buffers for names in file creation.
Bug 1799159
Bug 1823380
Change-Id: I3678080e3ffa1f9bcf6934e3f4819a1bc531689b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1318323
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add CONFIG_TEGRA_GK20A_NVHOST and remove the TEGRA_GRHOST ||
TEGRA_HOST1X dependency in CONFIG_TEGRA_GK20A to allow using the iGPU
without the nvhost driver. Use the new config to guard syncpt-related
code.
Also make TEGRA_ACR depend on GK20A too so that it aligns properly under
gk20a in menuconfig.
Bug 1853519
Change-Id: I9e9b0a7915d000aae7930821627b7a01d08d3f5c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1321303
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the prefix in the semaphore code to 'nvgpu_' since this code
is global to all chips.
Bug 1799159
Change-Id: Ic1f3e13428882019e5d1f547acfe95271cc10da5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284628
Reviewed-by: Varun Colbert <vcolbert@nvidia.com>
Tested-by: Varun Colbert <vcolbert@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move semaphore_gk20a.c drivers/gpu/nvgpu/common/ since the semaphore
code is common to all chips.
Move the semaphore_gk20a.h header file to drivers/gpu/nvgpu/include/nvgpu
and rename it to semaphore.h. Also update all places where the header
is inluced to use the new path.
This revealed an odd location for the enum gk20a_mem_rw_flag. This should
be in the mm headers. As a result many places that did not need anything
semaphore related had to include the semaphore header file. Fixing this
oddity allowed the semaphore include to be removed from many C files that
did not need it.
Bug 1799159
Change-Id: Ie017219acf34c4c481747323b9f3ac33e76e064c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284627
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move the GPU allocators to common/mm/ since the allocators are common
code across all GPUs. Also rename the allocator code to move away from
gk20a_ prefixed structs and functions.
This caused one issue with the nvgpu_alloc() and nvgpu_free() functions.
There was a function for allocating either with kmalloc() or vmalloc()
depending on the size of the allocation. Those have now been renamed to
nvgpu_kalloc() and nvgpu_kfree().
Bug 1799159
Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1274400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occuring in future.
Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When allocating a fence fails, free sync_fence only if one has been
created.
Change-Id: I2ecefd25c4e000f415b28c7c2b01b91654d6ef43
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249958
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove a global debugfs variable and instead save the allocator
debugfs root node in the gk20a struct.
Bug 1799159
Change-Id: If4eed34fa24775e962001e34840b334658f2321c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225611
(cherry picked from commit 1908fde10bb1fb60ce898ea329f5a441a3e4297a)
Reviewed-on: http://git-master/r/1242390
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change makes the invocation of the deferred job clean-up
mechanism conditional. For submissions that require job tracking,
deferred clean-up is only required if any of the following
conditions are met:
1) Channel's deterministic flag is not set
2) Rail-gating is enabled
3) Channel WDT is enabled
4) Buffer refcounting is enabled
5) Dependency on Sync Framework
In case deferred clean-up is not needed, we clean-up
a single job tracking resource in the submit path. For
deterministic channels, we do not allow deferred clean-up to
occur and fail any submits that require it.
Bug 1795076
Change-Id: I4021dffe8a71aa58f12db6b58518d3f4021f3313
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1220920
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit b09f7589d5ad3c496e7350f1ed583a4fe2db574a)
Reviewed-on: http://git-master/r/1223941
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes to fence framework's usage of allocator APIs to be
compatible w/ 32-bit architectures.
Bug 1795076
Change-Id: Ia677f9842c36d482d4e82e9fa09613702f3111b3
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1237904
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for pre-allocation of job tracking resources
w/ new (extended) ioctl. Goal is to avoid dynamic memory
allocation in the submit path. This patch does the following:
1) Intoduces a new ioctl, NVGPU_IOCTL_CHANNEL_ALLOC_GPFIFO_EX,
which enables pre-allocation of tracking resources per job:
a) 2x priv_cmd_entry
b) 2x gk20a_fence
2) Implements circular ring buffer for job
tracking to avoid lock contention between producer
(submitter) and consumer (clean-up)
Bug 1795076
Change-Id: I6b52e5c575871107ff380f9a5790f440a6969347
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1203300
(cherry picked from commit 9fd270c22b860935dffe244753dabd87454bef39)
Reviewed-on: http://git-master/r/1223934
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change is the first of a series of changes to
support the usage of pre-allocated job tracking resources
in the submit path. With this change, we still maintain a
dynamically-allocated joblist, but make the necessary changes
in the channel_sync & fence framework to use in-place
allocations. Specifically, we:
1) Update channel sync framework routines to take in
pre-allocated priv_cmd_entry(s) & gk20a_fence(s) rather
than dynamically allocating themselves
2) Move allocation of priv_cmd_entry(s) & gk20a_fence(s)
to gk20a_submit_prepare_syncs
3) Modify fence framework to have seperate allocation
and init APIs. We expose allocation as a seperate API, so
the client can allocate the object before passing it into
the channel sync framework.
4) Fix clean_up logic in channel sync framework
Bug 1795076
Change-Id: I96db457683cd207fd029c31c45f548f98055e844
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1206725
(cherry picked from commit 9d196fd10db6c2f934c2a53b1fc0500eb4626624)
Reviewed-on: http://git-master/r/1223933
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only create sync-fences in the semaphore synchronization path
when they are actually needed (i.e requested by userspace).
Bug 1795076
Reviewed-on: http://git-master/r/1201564
(cherry picked from commit dc52d424a839e6c064c02b7f02905dd6a59a50af)
Change-Id: Ieac6aef415678d4ea982683a955897c64959436e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1221041
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Revamp the support the nvgpu driver has for semaphores.
The original problem with nvgpu's semaphore support is that it
required a SW based wait for every semaphore release. This was
because for every fence that gk20a_channel_semaphore_wait_fd()
waited on a new semaphore was created. This semaphore would then
get released by SW when the fence signaled. This meant that for
every release there was necessarily a sync_fence_wait_async() call
which could block. The latency of this SW wait was enough to cause
massive degredation in performance.
To fix this a fast path was implemented. When a fence is passed to
gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore
a semaphore acquire is directly used to block the GPU. No longer is
a sync_fence_wait_async() performed nor is there an extra semaphore
created.
To implement this fast path the semaphore memory had to be shared
between channels. Previously since a new semaphore was created
every time through gk20a_channel_semaphore_wait_fd() what address
space a semaphore was mapped into was irrelevant. However, when
using the fast path a sempahore may be released on one address
space but acquired in another.
Sharing the semaphore memory was done by making a fixed GPU mapping
in all channels. This mapping points to the semaphore memory (the
so called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e released) by a
malicious channel. Each channel then gets a RW mapping of it's own
semaphore. This way a channel may only acquire other channel's
semaphores but may both acquire and release its own semaphore.
The gk20a fence code was updated to allow introspection of the GPU
backed fences. This allows detection of when the fast path can be
taken. If the fast path cannot be used (for example when a fence is
sync-pt backed) the original slow path is still present. This gets
used when the GPU needs to wait on an event from something which
only understands how to use sync-pts.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename alloc_fence() to gk20a_alloc_fence() and allow this function
to be called by the channel_sync_gk20a.c code.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic17131db2c8545832a2e8caacbd092cf970af4d1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1162687
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace manual buffer allocation and cpu_va pointer accesses with
gk20a_gmmu_{alloc,free}() and gk20a_mem_{rd,wr}() using a struct
mem_desc in gk20a_semaphore_pool, for buffer aperture flexibility.
JIRA DNVGPU-23
Change-Id: I394c38f407a9da02480bfd35062a892eec242ea3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1146684
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make necessary changes to support nvgpu on kernel-3.10
This includes below changes
- PROBE_PREFER_ASYNCHRONOUS is defined only for K3.10
- Fence handling and struct sync_fence is different between
K3.10 and K3.18
- variable status in struct sync_fence is atomic on K3.18
whereas it is int on K3.10
- if SOC == T132, set soc_name = "tegra13x"
- ioremap_cache() is not defined on K3.10 ARM versions,
hence use ioremap_cached()
Bug 200188753
Change-Id: I18d77eb1404e15054e8510d67c9a61c0f1883e2b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1121092
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
get_unused_fd() is deprecated on kernel-4.4,
but get_unused_fd_flags() is supported on both
kernel-4.4 and previous versions
hence remove use of get_unused_fd() and use
get_unused_fd_flags()
Change-Id: I132aa67d2bc23a698848ac51d2d176d7d33e1695
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1126847
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move nvhost out of the common Linux repo and into its own git repo. By
doing this, the same nvhost driver can work on different Linux kernel
versions.
Previously android/sync.h was referenced relative to
kernel/drivers/video/tegra/host. However, host moved to a completely
different part of the tree. Instead, reference it relative to kernel/include.
bug 1749413
Change-Id: Ic7f94093c712e5b64c9b3b660d6fce5d18e59bc0
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/1120544
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Tested-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A race condition existed in gk20a_channel_semaphore_wait_fd().
In some instances the semaphore underlying the sync_fence being
waited on would have already signaled. This would cause the
subsequent sync_fence_wait_async() call to return 1 and do
nothing. Normally, the sync_fence_wait_async() call would
release the newly created semaphore but in the above case that
would not happen and hang any channel waiting on that semaphore.
To fix this problem if sync_fence_wait_async() returns 1
immediately release the newly created semaphore.
Bug 1604892
Change-Id: I1f5e811695bb099f71b7762835aba4a7e27362ec
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/935910
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity id : 20260
Bug 1416640
Change-Id: I6ca8df2ed001df99ad46e476b1fe4de9f1346786
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/922362
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we create sync_fence (from nvhost_sync_create_fence())
for every submit
But not all submits request for a sync_fence.
Also, nvhost_sync_create_fence() API takes about 1/3rd of the total
submit path.
Hence to optimize, we can allocate sync_fence
only when user explicitly asks for it using
(NVGPU_SUBMIT_GPFIFO_FLAGS_FENCE_GET &&
NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE)
Also, in CDE path from gk20a_prepare_compressible_read(),
we reuse existing fence stored in "state" and that can
result into not returning sync_fence_fd when user asked
for it
Hence, force allocation of sync_fence when job submission
comes from CDE path
Bug 200141116
Change-Id: Ia921701bf0e2432d6b8a5e8b7d91160e7f52db1e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/812845
(cherry picked from commit 5fd47015eeed00352cc8473eff969a66c94fee98)
Reviewed-on: http://git-master/r/837662
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Handling null pointer in gk20a_fence_is_expired.
Bug 200117724
Change-Id: I0f9307a5f8b82bf990b6ddaea1a408d4f3f376fb
Signed-off-by: Gagan Grover <ggrover@nvidia.com>
Reviewed-on: http://git-master/r/777796
(cherry picked from commit dbf5bae53e0e7862754faba78eab84284786ecb3)
Reviewed-on: http://git-master/r/795356
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
On presilicon, syncpt waits should have infinite timeout.
Change-Id: Ifa9b2fa0ef164e2f87a631bca77941e995b06ad4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717947
Reviewed-by: Kirill Artamonov <kartamonov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Semaphore fences use int for timeout. It should be long instead.
Bug 1567274
Change-Id: Ia2b2c5ceeb03b4d09c1d8933ce33310356dd7e01
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/595980
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To help remove the nvhost dependency from nvgpu, rename ioctl defines
and structures used by nvgpu such that nvhost is replaced by nvgpu.
Duplicate some structures as needed.
Update header guards and such accordingly.
Change-Id: Ifc3a867713072bae70256502735583ab38381877
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542620
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
nvhost_sync_create_fence returns ERR_PTRs instead of NULLs on error;
check for its errors with IS_ERR.
Change-Id: I9752e0d8fa703b2872918b23721ae973be58bf35
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/533794
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a refcounting issue in semaphore handling.
Change-Id: I03327c60ed6923a90663f0b845566e81af4b94d4
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/453056
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
When moving compression state tracking and compbit management ops to
kernel, we need to attach a fence to dma-buf metadata, along with the
compbit state.
To make in-kernel fence management easier, introduce a new gk20a_fence
abstraction. A gk20a_fence may be backed by a semaphore or a syncpoint
(id, value) pair. If the kernel is configured with CONFIG_SYNC, it will
also contain a sync_fence. The gk20a_fence can easily be converted back
to a syncpoint (id, value) parir or sync FD when we need to return it to
user space.
Change gk20a_submit_channel_gpfifo to return a gk20a_fence instead of
nvhost_fence. This is to facilitate work submission initiated from
kernel.
Bug 1509620
Change-Id: I6154764a279dba83f5e91ba9e0cb5e227ca08e1b
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/439846
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|