| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add CONFIG_TEGRA_GK20A_NVHOST and remove the TEGRA_GRHOST ||
TEGRA_HOST1X dependency in CONFIG_TEGRA_GK20A to allow using the iGPU
without the nvhost driver. Use the new config to guard syncpt-related
code.
Also make TEGRA_ACR depend on GK20A too so that it aligns properly under
gk20a in menuconfig.
Bug 1853519
Change-Id: I9e9b0a7915d000aae7930821627b7a01d08d3f5c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1321303
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the prefix in the semaphore code to 'nvgpu_' since this code
is global to all chips.
Bug 1799159
Change-Id: Ic1f3e13428882019e5d1f547acfe95271cc10da5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284628
Reviewed-by: Varun Colbert <vcolbert@nvidia.com>
Tested-by: Varun Colbert <vcolbert@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move semaphore_gk20a.c drivers/gpu/nvgpu/common/ since the semaphore
code is common to all chips.
Move the semaphore_gk20a.h header file to drivers/gpu/nvgpu/include/nvgpu
and rename it to semaphore.h. Also update all places where the header
is inluced to use the new path.
This revealed an odd location for the enum gk20a_mem_rw_flag. This should
be in the mm headers. As a result many places that did not need anything
semaphore related had to include the semaphore header file. Fixing this
oddity allowed the semaphore include to be removed from many C files that
did not need it.
Bug 1799159
Change-Id: Ie017219acf34c4c481747323b9f3ac33e76e064c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284627
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move the GPU allocators to common/mm/ since the allocators are common
code across all GPUs. Also rename the allocator code to move away from
gk20a_ prefixed structs and functions.
This caused one issue with the nvgpu_alloc() and nvgpu_free() functions.
There was a function for allocating either with kmalloc() or vmalloc()
depending on the size of the allocation. Those have now been renamed to
nvgpu_kalloc() and nvgpu_kfree().
Bug 1799159
Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1274400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occuring in future.
Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When allocating a fence fails, free sync_fence only if one has been
created.
Change-Id: I2ecefd25c4e000f415b28c7c2b01b91654d6ef43
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249958
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove a global debugfs variable and instead save the allocator
debugfs root node in the gk20a struct.
Bug 1799159
Change-Id: If4eed34fa24775e962001e34840b334658f2321c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225611
(cherry picked from commit 1908fde10bb1fb60ce898ea329f5a441a3e4297a)
Reviewed-on: http://git-master/r/1242390
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change makes the invocation of the deferred job clean-up
mechanism conditional. For submissions that require job tracking,
deferred clean-up is only required if any of the following
conditions are met:
1) Channel's deterministic flag is not set
2) Rail-gating is enabled
3) Channel WDT is enabled
4) Buffer refcounting is enabled
5) Dependency on Sync Framework
In case deferred clean-up is not needed, we clean-up
a single job tracking resource in the submit path. For
deterministic channels, we do not allow deferred clean-up to
occur and fail any submits that require it.
Bug 1795076
Change-Id: I4021dffe8a71aa58f12db6b58518d3f4021f3313
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1220920
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit b09f7589d5ad3c496e7350f1ed583a4fe2db574a)
Reviewed-on: http://git-master/r/1223941
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes to fence framework's usage of allocator APIs to be
compatible w/ 32-bit architectures.
Bug 1795076
Change-Id: Ia677f9842c36d482d4e82e9fa09613702f3111b3
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1237904
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for pre-allocation of job tracking resources
w/ new (extended) ioctl. Goal is to avoid dynamic memory
allocation in the submit path. This patch does the following:
1) Intoduces a new ioctl, NVGPU_IOCTL_CHANNEL_ALLOC_GPFIFO_EX,
which enables pre-allocation of tracking resources per job:
a) 2x priv_cmd_entry
b) 2x gk20a_fence
2) Implements circular ring buffer for job
tracking to avoid lock contention between producer
(submitter) and consumer (clean-up)
Bug 1795076
Change-Id: I6b52e5c575871107ff380f9a5790f440a6969347
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1203300
(cherry picked from commit 9fd270c22b860935dffe244753dabd87454bef39)
Reviewed-on: http://git-master/r/1223934
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change is the first of a series of changes to
support the usage of pre-allocated job tracking resources
in the submit path. With this change, we still maintain a
dynamically-allocated joblist, but make the necessary changes
in the channel_sync & fence framework to use in-place
allocations. Specifically, we:
1) Update channel sync framework routines to take in
pre-allocated priv_cmd_entry(s) & gk20a_fence(s) rather
than dynamically allocating themselves
2) Move allocation of priv_cmd_entry(s) & gk20a_fence(s)
to gk20a_submit_prepare_syncs
3) Modify fence framework to have seperate allocation
and init APIs. We expose allocation as a seperate API, so
the client can allocate the object before passing it into
the channel sync framework.
4) Fix clean_up logic in channel sync framework
Bug 1795076
Change-Id: I96db457683cd207fd029c31c45f548f98055e844
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1206725
(cherry picked from commit 9d196fd10db6c2f934c2a53b1fc0500eb4626624)
Reviewed-on: http://git-master/r/1223933
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only create sync-fences in the semaphore synchronization path
when they are actually needed (i.e requested by userspace).
Bug 1795076
Reviewed-on: http://git-master/r/1201564
(cherry picked from commit dc52d424a839e6c064c02b7f02905dd6a59a50af)
Change-Id: Ieac6aef415678d4ea982683a955897c64959436e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1221041
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Revamp the support the nvgpu driver has for semaphores.
The original problem with nvgpu's semaphore support is that it
required a SW based wait for every semaphore release. This was
because for every fence that gk20a_channel_semaphore_wait_fd()
waited on a new semaphore was created. This semaphore would then
get released by SW when the fence signaled. This meant that for
every release there was necessarily a sync_fence_wait_async() call
which could block. The latency of this SW wait was enough to cause
massive degredation in performance.
To fix this a fast path was implemented. When a fence is passed to
gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore
a semaphore acquire is directly used to block the GPU. No longer is
a sync_fence_wait_async() performed nor is there an extra semaphore
created.
To implement this fast path the semaphore memory had to be shared
between channels. Previously since a new semaphore was created
every time through gk20a_channel_semaphore_wait_fd() what address
space a semaphore was mapped into was irrelevant. However, when
using the fast path a sempahore may be released on one address
space but acquired in another.
Sharing the semaphore memory was done by making a fixed GPU mapping
in all channels. This mapping points to the semaphore memory (the
so called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e released) by a
malicious channel. Each channel then gets a RW mapping of it's own
semaphore. This way a channel may only acquire other channel's
semaphores but may both acquire and release its own semaphore.
The gk20a fence code was updated to allow introspection of the GPU
backed fences. This allows detection of when the fast path can be
taken. If the fast path cannot be used (for example when a fence is
sync-pt backed) the original slow path is still present. This gets
used when the GPU needs to wait on an event from something which
only understands how to use sync-pts.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename alloc_fence() to gk20a_alloc_fence() and allow this function
to be called by the channel_sync_gk20a.c code.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic17131db2c8545832a2e8caacbd092cf970af4d1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1162687
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace manual buffer allocation and cpu_va pointer accesses with
gk20a_gmmu_{alloc,free}() and gk20a_mem_{rd,wr}() using a struct
mem_desc in gk20a_semaphore_pool, for buffer aperture flexibility.
JIRA DNVGPU-23
Change-Id: I394c38f407a9da02480bfd35062a892eec242ea3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1146684
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make necessary changes to support nvgpu on kernel-3.10
This includes below changes
- PROBE_PREFER_ASYNCHRONOUS is defined only for K3.10
- Fence handling and struct sync_fence is different between
K3.10 and K3.18
- variable status in struct sync_fence is atomic on K3.18
whereas it is int on K3.10
- if SOC == T132, set soc_name = "tegra13x"
- ioremap_cache() is not defined on K3.10 ARM versions,
hence use ioremap_cached()
Bug 200188753
Change-Id: I18d77eb1404e15054e8510d67c9a61c0f1883e2b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1121092
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
get_unused_fd() is deprecated on kernel-4.4,
but get_unused_fd_flags() is supported on both
kernel-4.4 and previous versions
hence remove use of get_unused_fd() and use
get_unused_fd_flags()
Change-Id: I132aa67d2bc23a698848ac51d2d176d7d33e1695
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1126847
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move nvhost out of the common Linux repo and into its own git repo. By
doing this, the same nvhost driver can work on different Linux kernel
versions.
Previously android/sync.h was referenced relative to
kernel/drivers/video/tegra/host. However, host moved to a completely
different part of the tree. Instead, reference it relative to kernel/include.
bug 1749413
Change-Id: Ic7f94093c712e5b64c9b3b660d6fce5d18e59bc0
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/1120544
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Tested-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A race condition existed in gk20a_channel_semaphore_wait_fd().
In some instances the semaphore underlying the sync_fence being
waited on would have already signaled. This would cause the
subsequent sync_fence_wait_async() call to return 1 and do
nothing. Normally, the sync_fence_wait_async() call would
release the newly created semaphore but in the above case that
would not happen and hang any channel waiting on that semaphore.
To fix this problem if sync_fence_wait_async() returns 1
immediately release the newly created semaphore.
Bug 1604892
Change-Id: I1f5e811695bb099f71b7762835aba4a7e27362ec
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/935910
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity id : 20260
Bug 1416640
Change-Id: I6ca8df2ed001df99ad46e476b1fe4de9f1346786
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/922362
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we create sync_fence (from nvhost_sync_create_fence())
for every submit
But not all submits request for a sync_fence.
Also, nvhost_sync_create_fence() API takes about 1/3rd of the total
submit path.
Hence to optimize, we can allocate sync_fence
only when user explicitly asks for it using
(NVGPU_SUBMIT_GPFIFO_FLAGS_FENCE_GET &&
NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE)
Also, in CDE path from gk20a_prepare_compressible_read(),
we reuse existing fence stored in "state" and that can
result into not returning sync_fence_fd when user asked
for it
Hence, force allocation of sync_fence when job submission
comes from CDE path
Bug 200141116
Change-Id: Ia921701bf0e2432d6b8a5e8b7d91160e7f52db1e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/812845
(cherry picked from commit 5fd47015eeed00352cc8473eff969a66c94fee98)
Reviewed-on: http://git-master/r/837662
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Handling null pointer in gk20a_fence_is_expired.
Bug 200117724
Change-Id: I0f9307a5f8b82bf990b6ddaea1a408d4f3f376fb
Signed-off-by: Gagan Grover <ggrover@nvidia.com>
Reviewed-on: http://git-master/r/777796
(cherry picked from commit dbf5bae53e0e7862754faba78eab84284786ecb3)
Reviewed-on: http://git-master/r/795356
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
On presilicon, syncpt waits should have infinite timeout.
Change-Id: Ifa9b2fa0ef164e2f87a631bca77941e995b06ad4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717947
Reviewed-by: Kirill Artamonov <kartamonov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Semaphore fences use int for timeout. It should be long instead.
Bug 1567274
Change-Id: Ia2b2c5ceeb03b4d09c1d8933ce33310356dd7e01
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/595980
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To help remove the nvhost dependency from nvgpu, rename ioctl defines
and structures used by nvgpu such that nvhost is replaced by nvgpu.
Duplicate some structures as needed.
Update header guards and such accordingly.
Change-Id: Ifc3a867713072bae70256502735583ab38381877
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542620
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
nvhost_sync_create_fence returns ERR_PTRs instead of NULLs on error;
check for its errors with IS_ERR.
Change-Id: I9752e0d8fa703b2872918b23721ae973be58bf35
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/533794
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a refcounting issue in semaphore handling.
Change-Id: I03327c60ed6923a90663f0b845566e81af4b94d4
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/453056
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
When moving compression state tracking and compbit management ops to
kernel, we need to attach a fence to dma-buf metadata, along with the
compbit state.
To make in-kernel fence management easier, introduce a new gk20a_fence
abstraction. A gk20a_fence may be backed by a semaphore or a syncpoint
(id, value) pair. If the kernel is configured with CONFIG_SYNC, it will
also contain a sync_fence. The gk20a_fence can easily be converted back
to a syncpoint (id, value) parir or sync FD when we need to return it to
user space.
Change gk20a_submit_channel_gpfifo to return a gk20a_fence instead of
nvhost_fence. This is to facilitate work submission initiated from
kernel.
Bug 1509620
Change-Id: I6154764a279dba83f5e91ba9e0cb5e227ca08e1b
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/439846
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|