If page tables are allocated from vidmem, CPU cache flushing doesn't
make sense, so skip it. Also unify the map/unmap actions for the case
where the pages are not CPU-mapped.
Jira DNVGPU-20
Change-Id: I36b22749aab99a7bae26c869075f8073eab0f860
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1178830
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
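
A minimal sketch of the decision described above (the enum and struct
names are assumptions, not taken from the patch):

    /* Aperture tells which memory a buffer lives in; assumed names. */
    enum gk20a_aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

    struct mem_desc {
            enum gk20a_aperture aperture;
            /* ... */
    };

    /* Vidmem is never CPU-cached, so CPU cache maintenance only
     * applies to sysmem-backed page tables. */
    static int pte_needs_cpu_cache_flush(const struct mem_desc *mem)
    {
            return mem->aperture == APERTURE_SYSMEM;
    }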
The allocator doesn't give us zeroed pages, so make sure that they're
full of zeros, just as the sysmem alloc path does.
Jira DNVGPU-16
Change-Id: I0ff8a0718829b13973535ba1111a8a11b91be04d
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1178829
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
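
In terms of the driver's accessors, the fix amounts to something like
this sketch (gk20a_memset() is the constant-fill accessor introduced
elsewhere in this series; the wrapper name here is made up):

    struct gk20a;
    struct mem_desc;

    /* fill accessor, writing through the proper sysmem/vidmem path */
    void gk20a_memset(struct gk20a *g, struct mem_desc *mem,
                      unsigned int offset, unsigned int c,
                      unsigned int size);

    /* zero a freshly allocated vidmem buffer, matching the guarantee
     * the sysmem alloc path already provides */
    static void zero_new_vidmem(struct gk20a *g, struct mem_desc *mem,
                                unsigned int size)
    {
            gk20a_memset(g, mem, 0, 0, size);
    }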
For devices that have vidmem available, use the vidmem allocator in
gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem.
Because not all of the buffers have been tested to work in vidmem yet,
rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at
the end to state explicitly that sysmem is used. Enabling vidmem for
each buffer is now a matter of removing "_sys" from the function call.
Jira DNVGPU-18
Change-Id: Ibe42f67eff2c2b68c36582e978ace419dc815dc5
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1176805
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
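
The dispatch could look roughly like this (a sketch; the _vid variant
and the query helper are assumed names):

    struct gk20a;   /* driver context, fields elided */
    struct mem_desc;

    int gk20a_gmmu_alloc_vid(struct gk20a *g, unsigned long size,
                             struct mem_desc *mem);
    int gk20a_gmmu_alloc_sys(struct gk20a *g, unsigned long size,
                             struct mem_desc *mem);
    int gk20a_has_vidmem(struct gk20a *g);  /* assumed query helper */

    /* default allocator: prefer vidmem when the device has some */
    int gk20a_gmmu_alloc(struct gk20a *g, unsigned long size,
                         struct mem_desc *mem)
    {
            if (gk20a_has_vidmem(g))
                    return gk20a_gmmu_alloc_vid(g, size, mem);
            return gk20a_gmmu_alloc_sys(g, size, mem);
    }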
This function is only used in mm_gk20a.c and as a result should be
static (fixes a sparse issue).
Bug 200088648
Change-Id: I6787b4ebc5925a503d8ef2fed90c3d7cd5027589
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1176309
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that
buffers represented as a mem_desc and present in vidmem can be mapped
to the GPU.
JIRA DNVGPU-18
JIRA DNVGPU-76
Change-Id: I46cf87e27229123016727339b9349d5e2c835b3e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169308
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Modify page table updates to take an aperture flag (up to
gk20a_locked_gmmu_map()) instead of hard-assuming sysmem, and
propagate the aperture to hardware.
Jira DNVGPU-76
Change-Id: Ifcb22900c96db993068edd110e09368f72b06f69
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169307
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add gk20a_aperture_mask() for memory target selection now that buffers
can actually be allocated from vidmem, and use it in all cases that
have a mem_desc available.
Jira DNVGPU-76
Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169306
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
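
A plausible shape for the helper, based only on the description above
(field and enum names are assumptions):

    enum gk20a_aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

    struct mem_desc {
            enum gk20a_aperture aperture;
    };

    /* pick the HW target-field value matching the buffer's aperture */
    static unsigned int gk20a_aperture_mask(const struct mem_desc *mem,
                                            unsigned int sysmem_mask,
                                            unsigned int vidmem_mask)
    {
            return mem->aperture == APERTURE_VIDMEM ?
                    vidmem_mask : sysmem_mask;
    }

Callers pass the two candidate register field values (e.g., the
sysmem vs. vidmem target bits of a PTE or a HW pointer register) and
get back the one matching the buffer. The real helper may also need to
treat sysmem as the "vidmem" target on iGPUs whose only memory is SoC
DRAM, as another commit in this log notes.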
Cast cacheline_start to u64 to use 64-bit maths for cacheline offset.
Coverity ID 24250
Change-Id: Ic10c92ebb737bd39486a83e4de53cc1191193667
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1172053
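
The overflow pattern being fixed, in isolation (illustrative code
only, not the patch's actual context):

    #include <stdint.h>

    /* BAD: with 32-bit operands the multiply wraps around before the
     * result is widened to 64 bits */
    static uint64_t offset_32bit(uint32_t cacheline_start,
                                 uint32_t line_size)
    {
            return cacheline_start * line_size;  /* 32-bit multiply */
    }

    /* GOOD: casting one operand first makes the whole expression
     * evaluate in 64 bits */
    static uint64_t offset_64bit(uint32_t cacheline_start,
                                 uint32_t line_size)
    {
            return (uint64_t)cacheline_start * line_size;
    }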
Revamp the support the nvgpu driver has for semaphores.
The original problem with nvgpu's semaphore support was that it
required a SW-based wait for every semaphore release. This was
because for every fence that gk20a_channel_semaphore_wait_fd()
waited on, a new semaphore was created. This semaphore would then
get released by SW when the fence signaled. This meant that for
every release there was necessarily a sync_fence_wait_async() call
which could block. The latency of this SW wait was enough to cause
massive degradation in performance.
To fix this, a fast path was implemented. When a fence that is
backed by a GPU semaphore is passed to
gk20a_channel_semaphore_wait_fd(), a semaphore acquire is directly
used to block the GPU. No longer is a sync_fence_wait_async()
performed, nor is there an extra semaphore created.
To implement this fast path, the semaphore memory had to be shared
between channels. Previously, since a new semaphore was created on
every pass through gk20a_channel_semaphore_wait_fd(), the address
space a semaphore was mapped into was irrelevant. However, when
using the fast path, a semaphore may be released in one address
space but acquired in another.
Sharing the semaphore memory was done by making a fixed GPU mapping
in all channels. This mapping points to the semaphore memory (the
so-called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e., released) by a
malicious channel. Each channel then gets a RW mapping of its own
semaphore. This way a channel may only acquire other channels'
semaphores but may both acquire and release its own semaphore.
The gk20a fence code was updated to allow introspection of the
GPU-backed fences. This allows detection of when the fast path can
be taken. If the fast path cannot be used (for example, when a fence
is sync-pt backed), the original slow path is still present. This
gets used when the GPU needs to wait on an event from something
which only understands how to use sync-pts.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
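
The heart of the fast/slow path split, sketched with assumed function
names (the real fence and channel types live in the driver):

    struct channel_gk20a;
    struct gk20a_fence;

    /* introspection added by this patch: GPU-semaphore backed? */
    int gk20a_fence_is_semaphore_backed(struct gk20a_fence *f);

    int push_semaphore_acquire(struct channel_gk20a *c,
                               struct gk20a_fence *f);
    int wait_fence_sw_async(struct channel_gk20a *c,
                            struct gk20a_fence *f);

    static int semaphore_wait_fd(struct channel_gk20a *c,
                                 struct gk20a_fence *f)
    {
            if (gk20a_fence_is_semaphore_backed(f))
                    /* fast path: the GPU blocks on a semaphore
                     * acquire; no SW wait, no extra semaphore */
                    return push_semaphore_acquire(c, f);

            /* slow path, e.g. sync-pt backed fences: SW wait */
            return wait_fence_sw_async(c, f);
    }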
Add in-nvgpu APIs for allocating and freeing mem_descs in video memory.
Support for gmmu tables etc. will be added in upcoming changes.
Video memory is allocated via nvmap by initially registering the
aperture size with it and binding it to a struct device, and then
going via the usual dma alloc. The API also allows fixed-address
allocations, meant for reserving special memory areas at boot.
The aperture registration is skipped completely if vidmem isn't found
for the particular device.
gk20a_gmmu_alloc_attr() still uses sysmem, and the unmap/free paths
internally select the correct path based on the mem_desc's aperture.
Video memory allocation is off by default and can be turned on with
CONFIG_GK20A_VIDMEM.
JIRA DNVGPU-16
Change-Id: I77eae5ea90cbed6f4b5db0da86c5f70ddf2a34f9
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1157216
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Instead of going via gk20a_mem_{wr,rd}32() on each iteration, do a
direct memcpy()/memset() for sysmem, and minimize the enter/exit
overhead for vidmem.
JIRA DNVGPU-23
Change-Id: I5437e35f8393a746777a40636c1e9b5d93ced1f6
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1159524
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
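
A sketch of the per-aperture bulk write described above (names are
assumed; the batching helper stands in for the real vidmem path):

    #include <string.h>

    enum gk20a_aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

    struct gk20a;
    struct mem_desc {
            enum gk20a_aperture aperture;
            void *cpu_va;       /* valid for sysmem buffers only */
    };

    /* assumed helper: enters the PRAMIN window once, then batches
     * the word writes inside it to amortize the setup cost */
    void pramin_write_batched(struct gk20a *g, struct mem_desc *mem,
                              unsigned int offset, const void *src,
                              unsigned int size);

    static void mem_wr_n(struct gk20a *g, struct mem_desc *mem,
                         unsigned int offset, const void *src,
                         unsigned int size)
    {
            if (mem->aperture == APERTURE_SYSMEM)
                    /* CPU-mapped: one memcpy, no per-word calls */
                    memcpy((char *)mem->cpu_va + offset, src, size);
            else
                    pramin_write_batched(g, mem, offset, src, size);
    }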
Read video memory size from hardware during initialization for devices
that support it.
JIRA DNVGPU-14
Change-Id: If190f2d89f7148520ee274ca674f972987c8056d
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1157215
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Save the whole BAR0 window register, which also encodes the target
aperture (vidmem/sysmem), instead of only the base address, which
could overlap between the two.
JIRA DNVGPU-23
Change-Id: I2ccbea0e1f7c7310c1ca6b158afafe8fd974a615
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1159523
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
To support vidmem, implement a way to access buffers via the PRAMIN
window instead of only via kernel-mapped sysmem buffers, which is all
that the iGPU has needed so far. Depending on the buffer aperture,
choose between the two access types in the buffer memory accessor
functions.
vmap()/vunmap() pairs are no-ops for buffers that can't be CPU-mapped.
Two uses of DMA_ATTR_READ_ONLY are removed in the ucode loading path
so that those buffers can also be written via the PRAMIN indirection,
in addition to the CPU.
JIRA DNVGPU-23
Change-Id: I282dba6741c6b8224bc12e69c1fb3936bde7e6ed
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1141314
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
This reverts commit df05d2a7c214bc8cdb887f1609853d0f424ef6f1. It causes
intermittent failures on laguna_t124.
Bug 1766083
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Change-Id: Idabcf96d2eaddc989b029c429cec213bcabbf28c
Reviewed-on: http://git-master/r/1147683
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access
method based on the buffer aperture instead of using the CPU pointer
directly (as until now). The selection and aperture support will be in
another patch; this patch only refactors these accessors and keeps the
underlying functionality as-is.
gk20a_mem_{rd,wr}32() work as previously; also add gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8- and 16-bit accessor functions are removed.
vmap()/vunmap() pairs are abstracted into gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.
Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain CPU pointer. Some
relevant call sites are changed to use the accessor functions instead
of raw CPU pointers (e.g., for memcpying to and from buffers), but the
majority of direct accesses will be adjusted later, when the buffers
are moved to support vidmem.
JIRA DNVGPU-23
Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
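
For reference, the accessor surface as described above, plus a typical
call site (signatures reconstructed from this message, so treat the
exact parameter types as assumptions):

    struct gk20a;
    struct mem_desc;

    /* word-indexed, as before */
    unsigned int gk20a_mem_rd32(struct gk20a *g, struct mem_desc *mem,
                                unsigned int w);
    void gk20a_mem_wr32(struct gk20a *g, struct mem_desc *mem,
                        unsigned int w, unsigned int data);

    /* new: byte-indexed, bulk, and fill variants */
    unsigned int gk20a_mem_rd(struct gk20a *g, struct mem_desc *mem,
                              unsigned int offset);
    void gk20a_mem_wr(struct gk20a *g, struct mem_desc *mem,
                      unsigned int offset, unsigned int data);
    void gk20a_mem_rd_n(struct gk20a *g, struct mem_desc *mem,
                        unsigned int offset, void *dest,
                        unsigned int size);
    void gk20a_mem_wr_n(struct gk20a *g, struct mem_desc *mem,
                        unsigned int offset, void *src,
                        unsigned int size);
    void gk20a_memset(struct gk20a *g, struct mem_desc *mem,
                      unsigned int offset, unsigned int c,
                      unsigned int size);

    /* new: replaces open-coded vmap()/vunmap() pairs */
    int gk20a_mem_begin(struct gk20a *g, struct mem_desc *mem);
    void gk20a_mem_end(struct gk20a *g, struct mem_desc *mem);

A call site then brackets its accesses instead of mapping manually:

    if (gk20a_mem_begin(g, mem))
            return -1;
    gk20a_mem_wr32(g, mem, 0, data);
    gk20a_mem_end(g, mem);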
Deassert reset in L2 and FB before initializing L2. On gk20a, L2 can
be powered off, and writing its registers then results in a priv ring
failure.
Change-Id: I680b8b1e77cf67a8269c6de59a15d9817301300e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1140482
(cherry picked from commit d85edcf4170d7bc59d2c080f4343bc2f959be023)
Reviewed-on: http://git-master/r/1143684
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Use gk20a_gmmu_{alloc,free}() instead of duplicating their code for page
tables. The linsim-specific special case is kept as-is.
JIRA DNVGPU-23
JIRA DNVGPU-20
Change-Id: I66d772337bad5d081256b13877c4e713ea8b634a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1139695
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
For the upcoming vidmem refactor, replace struct gk20a_mm_entry's
contents, which were identical to struct mem_desc, with a struct
mem_desc member. This makes it possible to use the page table buffers
like the others too.
JIRA DNVGPU-23
JIRA DNVGPU-20
Change-Id: I714ee5dcb33c27aaee932e8e3ac367e84610b102
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1139694
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
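
Schematically (the surrounding fields are assumptions):

    /* common buffer descriptor used driver-wide (abbreviated) */
    struct mem_desc {
            void *cpu_va;
            unsigned long size;
    };

    /* before, gk20a_mm_entry carried these same fields inline;
     * after, it embeds the descriptor, so the generic buffer paths
     * (alloc/free/accessors) work on page tables too */
    struct gk20a_mm_entry {
            struct mem_desc mem;
            int pgsz;          /* assumed extra bookkeeping */
    };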
Do not put any ctag data in the PTEs unless compression is
actually enabled for the mapping.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I2abfbf9d1282af24541f8199bd9fbf2133c12899
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133790
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
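
The gist, as a sketch:

    /* Only emit comptag bits into the PTE when the mapping really is
     * compressed; uncompressed mappings get a zero ctag field. */
    static unsigned int pte_ctag_field(unsigned int ctag_offset,
                                       int compression_enabled)
    {
            return compression_enabled ? ctag_offset : 0;
    }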
Add a function to allow the kernel to do fixed mappings. Necessary
for the semaphore functionality since there needs to be a common
address in each VM for the semaphores.
Bug 1732449
JIRA DNVGPU-12
Change-Id: I2b451db2d3cb3c003d951f7b0ffc87f6c91db7dc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133789
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Program the sysmem flush address to prevent stray accesses to
address 0.
Change-Id: I886170395f036805f02e0bce7ecd3c8c46b921df
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129216
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Wait for the BAR1 bind to complete before continuing. The register to
poll exists from Maxwell onwards.
Change-Id: Ie3736033fdb748c5da8d7a6085ad6d63acaf41f5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1123941
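
Shape of the added wait, as a hedged sketch (the status query and the
timeout policy are assumptions, not the patch's actual register
accesses):

    struct gk20a;

    int bar1_bind_pending(struct gk20a *g);  /* assumed status read */
    void gk20a_udelay(unsigned int us);      /* assumed delay helper */

    static int bar1_bind_wait(struct gk20a *g)
    {
            int retries = 1000;

            while (retries--) {
                    if (!bar1_bind_pending(g))
                            return 0;        /* bind completed */
                    gk20a_udelay(5);
            }
            return -1;                       /* timed out */
    }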
On Tegra GPUs, SoC memory has to be accessed as vidmem; on discrete
GPUs, it has to be accessed as sysmem.
Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120468
Support GPUs which cannot choose between SMMU and physical
addressing.
Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120469
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.
Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
Allow a special address space node to be split out from the user
address space for fixed allocations. A debugfs node,
/d/<gpu>/separate_fixed_allocs
controls this feature. To enable it:
# echo <SPLIT_ADDR> > /d/<gpu>/separate_fixed_allocs
where <SPLIT_ADDR> is the address in the GVA address range at which to
make the split. The split will then be made in all subsequently
created address space ranges until the feature is turned off. To turn
it off, just echo 0x0 into the same debugfs node.
Change-Id: I21a3f051c635a90a6bfa8deae53a54db400876f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1030303
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Clear comptags for whole buffer when nvgpu sees the buffer for the
first time.
Change-Id: I67108ce0f0def46ddda1aa9b9bb5ea22549cce13
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1013517
(cherry picked from commit 544446aacdc695dc2e27c42a0086292cd69c2eee)
Reviewed-on: http://git-master/r/1031009
GVS: Gerrit_Virtual_Submit
Use right shift instead of division for computing the ctag offset.
Bug 1704834
Change-Id: Id57526a08bad34e41b2335a21e299d1c0a2ffba1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1024467
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
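
Both forms compute the same value when the comptag granularity is a
power of two; the 128 KiB granularity below is taken from another
commit in this log and is only illustrative:

    #include <stdint.h>

    #define COMPTAG_SHIFT 17  /* 128 KiB per comptagline, illustrative */

    /* before: 64-bit division */
    static uint32_t ctag_offset_div(uint64_t buf_offset)
    {
            return (uint32_t)(buf_offset / (1ULL << COMPTAG_SHIFT));
    }

    /* after: the same result with a cheap right shift */
    static uint32_t ctag_offset_shr(uint64_t buf_offset)
    {
            return (uint32_t)(buf_offset >> COMPTAG_SHIFT);
    }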
While mapping a buffer, if the kind argument is -1, we extract the
kind value from nvmap. The kind information in nvmap is going away, so
remove the respective call to nvmap.
Bug 1616899
Change-Id: I2764655f60df691ac8a86484c6ec929d2b83b2e3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1012239
GVS: Gerrit_Virtual_Submit
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Restore comptags to be bitmap-allocated, like they were before we had
the buddy allocator.
The new buddy allocator introduced by
e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally
6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but
unsuitable for the small compbit store.
This commit partially reverts the combination of the above commit and
a later one, 86fc7ec9a05999bea8de320840b962db3ee11410, which fixed a
bug that is not present when using a bitmap. With a bitmap allocator,
pruning the extra allocation necessary for user-mapped mode is
possible, so that is also restored.
The original generic bitmap allocator is not restored; instead, a
comptag-only allocator is introduced.
Bug 200145635
Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/837180
(cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2)
Reviewed-on: http://git-master/r/839742
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
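
A comptag-only bitmap allocator can be very small; a sketch using the
kernel bitmap API (the allocator struct and function names are
assumptions):

    #include <linux/bitmap.h>
    #include <linux/errno.h>

    struct comptag_allocator {
            unsigned long *bitmap;  /* one bit per comptagline */
            unsigned long size;     /* total number of comptaglines */
    };

    static int comptags_alloc(struct comptag_allocator *a,
                              unsigned int len, unsigned long *offset)
    {
            unsigned long addr;

            addr = bitmap_find_next_zero_area(a->bitmap, a->size,
                                              0, len, 0);
            if (addr >= a->size)
                    return -ENOMEM;

            bitmap_set(a->bitmap, addr, len);
            *offset = addr;
            return 0;
    }

    static void comptags_free(struct comptag_allocator *a,
                              unsigned long offset, unsigned int len)
    {
            bitmap_clear(a->bitmap, offset, len);
    }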
On Maxwell, comptaglines are assigned per 128k, but the preferred big
page size for graphics is 64k. Bit 16 of the GPU VA is used for
determining which half of the comptagline is used.
This creates problems if user space wants to map a page multiple times
and to arbitrary GPU VAs. In one mapping the page might be mapped to
the lower half of a 128k comptagline, and in another mapping the page
might be mapped to the upper half.
Turn on the mode where the MSB of the comptagline in the PTE is used
instead of bit 16 for the comptagline lower/upper half selection.
Bug 1704834
Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/924322
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Add offset to comptags when mapping partial buffers.
Bug 1704834
Change-Id: I3405b465bb1373bcc79eb5ecbd93dd1b866abfb4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/837401
Fix the following Coverity issues:
- operands not affecting result (id = 12845)
- logically dead code (id = 12890)
- dereference after null check (id = 12968)
- unsigned compared to 0 (id = 13176)
- resource leak (id = 13338, 18673)
- unused pointer value (id = 13916)
Bug 1703084
Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/835143
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Add a new operation, g->ops.mm.set_debug_mode, and make the other
places that set debug mode call this callback.
This prepares for adding a vgpu MMU debug mode hook.
JIRA VFND-1005
Bug 1594604
Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/833497
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
Bug 200150865
Change-Id: If4f0e01bdeb95c303675b63444bd497b65d934f3
Signed-off-by: Ari Hirvonen <ahirvonen@nvidia.com>
Reviewed-on: http://git-master/r/835151
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
enables creating userspace-managed GPU address spaces.
When an address space is marked as userspace-managed, the following
changes are in effect:
- Only fixed-address mappings are allowed.
- VA space allocation for fixed-address mappings is not required,
except to mark space as sparse.
- Maps and unmaps are always immediate. In particular, the mapping
ref increments at kickoffs and decrements at job completion are
skipped.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/738558
Reviewed-on: http://git-master/r/833253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used
to identify buffers and retrieve their sizes. This allows userspace to
be agnostic to the dmabuf implementation, as the generic dmabuf fd
interface does not provide a reliable way to identify buffers.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/822845
Reviewed-on: http://git-master/r/833252
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add dbg_map debug spew for all mapping calls. This plugs the hole
where kernel mappings were not logged, because the debug log was
previously emitted only in the ioctl path.
Change-Id: I036bf41f92ba5b612d32805020ca7a16fe54f9f4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/812288
(cherry picked from commit c37b2892d6d967ad48076b20e5a9ef97dc600b31)
Reviewed-on: http://git-master/r/831333
Allocate a separate VM for CDE channels instead of using the system
(PMU) VM, and make it much bigger than the PMU's so that the maximum
number of CDE channels fits.
Bug 1566740
Change-Id: I4f487c40c9ec79cc9ffb880b0ecd3f47eb450336
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/815149
Reviewed-by: Automatic_Commit_Validation_User
G_ELPG_FLUSH is protected in some chips. Use L2 flush operations
instead.
Bug 1698618
Change-Id: I984a8ace8bcd0ad2d4a4e2d63af75a342bdeb75a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/828656
(cherry picked from commit ba9075fa43975112a221d37d246f0b8f5af40fab)
Reviewed-on: http://git-master/r/829415
This reverts commit c44947b1314bb2afa1f116b4928f4e8a4c34d7b1. It
introduces a regression in T124.
Bug 1702063
Change-Id: I64e333f66d98bd4dbcfe40a60f1aa825d90376a5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/830786
GVS: Gerrit_Virtual_Submit
Change-Id: Idfeb3bf95751902d52a895d77045a529f69abc0b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/758651
GVS: Gerrit_Virtual_Submit
Add support for removing the BAR2 mm on GPU module removal.
Change-Id: Id5f680b1abf7056da9871d5460d9fbc40422673e
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/814571
(cherry picked from commit e7c6c87dd6b0893d26a9a3b4568121a691e1eb3c)
Reviewed-on: http://git-master/r/815429
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Return immediately in case there are no buffers to put. This skips
acquiring mutexes and map batch start/finish overheads.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: Ief04e36d995e65c1510496c17cb3f5bb90486c69
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/815376
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Bug 1689976
Change-Id: I97ad14c9698030b630d3396199a2a5296c661392
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/806590
(cherry picked from commit c90cd5ee674d6357db3be2243950ff0d81ef15ef)
Reviewed-on: http://git-master/r/808249
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
It is possible that user space requests more unmaps on a buffer than
it requested maps. In this case, we end up dropping one extra
refcount, which could lead to releasing the buffer early.
Fix this by checking and returning if the buffer's user_mapped
refcount is already zero.
Bug 200130521
Change-Id: Ic8ef2dbfe0476b16d852ad899b1ed0404b5bb7de
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/788904
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
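
The guard itself is tiny; a sketch with assumed names:

    struct mapped_buffer_node {
            unsigned int user_mapped;  /* user-space mapping refcount */
            /* ... */
    };

    void buffer_put(struct mapped_buffer_node *buf);  /* assumed */

    static void unmap_user(struct mapped_buffer_node *buf)
    {
            /* user space sent more unmaps than maps: nothing to drop */
            if (buf->user_mapped == 0)
                    return;

            buf->user_mapped--;
            buffer_put(buf);
    }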
Correct the logic for supporting unmapped PTEs during gmmu map.
Bug 1587825
Change-Id: I1b0b603f7758a65d9666046d0d908663f8e460e3
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/796577
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/759345
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Separate the kernel and userspace regions in the GPU virtual address
space. Do this by reserving the last part of the GPU VA aperture for
the kernel and extending the GPU VA aperture accordingly for regular
address spaces. This prevents the kernel from polluting the
userspace-visible GPU VA regions and thus makes the success of
fixed-address mapping more predictable.
Bug 200077571
Change-Id: I63f0e73d4c815a4a9fa4a9ce568709974690ef0f
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/747191
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
This reverts commit b12efd059070b942a33e23d06e9050145a0694ef.
Bug 1492689
Change-Id: Iae07341f246010ca0b69eddbbb9cd434b8b5f05a
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Reviewed-on: http://git-master/r/795112
Reviewed-by: Sachin Nikam <snikam@nvidia.com>