| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Maxwell comptaglines are assigned per 128k, but preferred big page
size for graphics is 64k. Bit 16 of GPU VA is used for determining
which half of comptagline is used.
This creates problems if user space wants to map a page multiple times
and to arbitrary GPU VA. In one mapping the page might be mapped to
lower half of 128k comptagline, and in another mapping the page might
be mapped to upper half.
Turn on mode where MSB of comptagline in PTE is used instead of bit 16
for determining the comptagline lower/upper half selection.
Bug 1704834
Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/924322
Reviewed-by: Alex Waterman <alexw@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Add offset to comptags when mapping partial buffers.
Bug 1704834
Change-Id: I3405b465bb1373bcc79eb5ecbd93dd1b866abfb4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/837401
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- operands not affecting result (id = 12845)
- logically dead code (id = 12890)
- dereference after null check (id = 12968)
- unsigned compared to 0 (id = 13176)
- resource leak (id = 13338, 18673)
- unused pointer value (id = 13916)
Bug 1703084
Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/835143
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new operaton g->ops.mm.set_debug_mode and let other places
that set debug mode call this callback.
It's preparing for adding vgpu set mmu debug mode hook.
JIRA VFND-1005
Bug 1594604
Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/833497
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 200150865
Change-Id: If4f0e01bdeb95c303675b63444bd497b65d934f3
Signed-off-by: Ari Hirvonen <ahirvonen@nvidia.com>
Reviewed-on: http://git-master/r/835151
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
enables creating userspace-managed GPU address spaces.
When an address space is marked as userspace-managed, the following
changes are in effect:
- Only fixed-address mappings are allowed.
- VA space allocation for fixed-address mappings is not required,
except to mark space as sparse.
- Maps and unmaps are always immediate. In particular, the mapping
ref increments at kickoffs and decrements at job completion are
skipped.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/738558
Reviewed-on: http://git-master/r/833253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used
to identify buffers and retrieve their sizes. This allows the
userspace to be agnostic to the dmabuf implementation, as the generic
dmabuf fd interface does not have a reliable way for buffer
identification.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/822845
Reviewed-on: http://git-master/r/833252
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add dbg_map debug spew for all mapping calls. This plugs the hole where
kernel mappings were not logged, because the debug log is added only in
ioctl path.
Change-Id: I036bf41f92ba5b612d32805020ca7a16fe54f9f4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/812288
(cherry picked from commit c37b2892d6d967ad48076b20e5a9ef97dc600b31)
Reviewed-on: http://git-master/r/831333
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allocate a separate VM for CDE channels instead of using the system
(PMU) vm, and make it much bigger than the PMU's to fit the maximum
number of CDE channels there.
Bug 1566740
Change-Id: I4f487c40c9ec79cc9ffb880b0ecd3f47eb450336
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/815149
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
G_ELPG_FLUSH is protected in some chips. Use L2 flush operations
instead.
Bug 1698618
Change-Id: I984a8ace8bcd0ad2d4a4e2d63af75a342bdeb75a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/828656
(cherry picked from commit ba9075fa43975112a221d37d246f0b8f5af40fab)
Reviewed-on: http://git-master/r/829415
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit c44947b1314bb2afa1f116b4928f4e8a4c34d7b1. It
introduces a regression in T124.
Bug 1702063
Change-Id: I64e333f66d98bd4dbcfe40a60f1aa825d90376a5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/830786
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
| |
Change-Id: Idfeb3bf95751902d52a895d77045a529f69abc0b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/758651
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding support to remove bar2 mm on gpu module
remove.
Change-Id: Id5f680b1abf7056da9871d5460d9fbc40422673e
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/814571
(cherry picked from commit e7c6c87dd6b0893d26a9a3b4568121a691e1eb3c)
Reviewed-on: http://git-master/r/815429
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Return immediately in case there are no buffers to put. This skips
acquiring mutexes and map batch start/finish overheads.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: Ief04e36d995e65c1510496c17cb3f5bb90486c69
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/815376
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 1689976
Change-Id: I97ad14c9698030b630d3396199a2a5296c661392
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/806590
(cherry picked from commit c90cd5ee674d6357db3be2243950ff0d81ef15ef)
Reviewed-on: http://git-master/r/808249
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that user space requests more unmaps on a buffer
than it requested maps
In this case, we end up dropping one extra refcount which could
lead to releasing buffer early
Fix this by checking and returning if buffer's user_mapped
refcount is already zero
Bug 200130521
Change-Id: Ic8ef2dbfe0476b16d852ad899b1ed0404b5bb7de
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/788904
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Correct logic for supporting unmapped
ptes during gmmu map.
Bug 1587825
Change-Id: I1b0b603f7758a65d9666046d0d908663f8e460e3
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/796577
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/759345
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Separate the kernel and userspace regions in the GPU virtual address
space. Do this by reserving the last part of the GPU VA aperture for
the kernel, and extend GPU VA aperture accordingly for regular address
spaces. This prevents the kernel polluting the userspace-visible GPU
VA regions, and thus, makes the success of fixed-address mapping more
predictable.
Bug 200077571
Change-Id: I63f0e73d4c815a4a9fa4a9ce568709974690ef0f
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/747191
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit b12efd059070b942a33e23d06e9050145a0694ef.
Bug 1492689
Change-Id: Iae07341f246010ca0b69eddbbb9cd434b8b5f05a
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Reviewed-on: http://git-master/r/795112
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
consider buffer size as well when calculating the required alignment
for a buffer else we would be mapping a VA range greater than requested
thus allowing access to entire large page even when not needed creating
a security hole.
Bug 1492689
Change-Id: Ic404708d238621ea64c26cafd05bc30ba8e02e12
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Reviewed-on: http://git-master/r/793229
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Before destroying the allocator for PMU dmem check if it was already
initialized. It is only initialized through certain paths like PMU ISRs.
So while testing the nvgpu module using nvgpu_submit_twod test I found
that it was never initialized.
2. Inside gk20a_init_gr_setup_sw, cleanup part calls for de-allocating
the already allocated chunk of memory. Whereas, cleanup also gets called
when memory allocation inside the same function fails. In such cases,
we should have a non-null check else we attempt to free a non-allocated
memory and kernel panics.
Bug 1476801
Change-Id: Ia2f0599ac0c35d58709acd149033e114b898b426
Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com>
Reviewed-on: http://git-master/r/777118
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The address space limit was being computed with the assumption
that the va_limit field is inclusive. The va_limit field is
actually not inclusive. It points to the first invalid byte.
Thus when generating the adr_limit register the code incorrectly
calculated that the address limit should be 0. To fix this the
computation now just uses va_limit - 1.
Also, the bitwise OR of 0xfff into the lower limit word was
incorrect. The bottom 12 bits of the lower 32 bit word are
ignored by the GPU and as such should not be populated.
Change-Id: Ifcc13343aaf50776f3cf1a1e3726e73ffde5003f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/756690
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/771151
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix an issue where large ( > 4GB) allocations were not being computed
correctly. The two fields, pages and page_size, were both 32 bits so
when multiplied they easily overflowed. Simple fix is to cast them to
64 bits before multiplying them.
Change-Id: I63fa54679e485de5c3a99684cbeb72c6cdc65504
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/747429
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/771148
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if GPU is not powered before L2 is flushed, then
L2 cache flush is a noop. Same behavior as
gk20a_mm_L2_Invalidate()
bug 1661228
Change-Id: I0f590628928a73b7277d1b16a5a79a86e0213648
Signed-off-by: Sam Payne <spayne@nvidia.com>
Reviewed-on: http://git-master/r/768068
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
(cherry picked from commit cb4d29d34d0736aa753afa323bfb216481cc8640)
Reviewed-on: http://git-master/r/771113
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
| |
Implement support for privileged pages. Use them for kernel allocated buffers.
Change-Id: I720fc441008077b8e2ed218a7a685b8aab2258f0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/761919
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add batch support for mapping and unmapping. Batching essentially
helps transform some per-map/unmap overhead to per-batch overhead,
namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB
invalidates. Batching with size 64 has been measured to yield >20x
speed-up in low-level fixed-address mapping microbenchmarks.
Bug 1614735
Bug 1623949
Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/733231
(cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91)
Reviewed-on: http://git-master/r/763812
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The WAR put into simulation to avoid a simulator crash can now be
removed (c85be1a0968de813fe9b99ebd5c261dcb0ca8875). The first issue
with the failing test was found to be GPFIFO entries that were not
invalid.
Other issues are still present with the test and are fixed in a
later commit.
Change-Id: I7d3def2e384eede82cfc82b961f09ca23b239d30
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/753378
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/755815
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
| |
Bug 1587825
Change-Id: I66f2988b7f1884b53bb8f3cd09ad1ead1652ffda
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/751484
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With 4K hole T186 PMU does not boot in NS
T186 has 64 bit DMA Base. We subtract IMEM
offset from GPUVA for PMU boot DMABASE setup
It becomes above 4GB because of that
So we will use a hole which is bigger than
IMEM size.
Change-Id: Ib87c39881299a4f5b14e28415195e00800250c46
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/740656
(cherry picked from commit 6504934d5f90719a5d564174aeb92da90aafbd5b)
Reviewed-on: http://git-master/r/747742
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Increase sync-unmap wait time from 5 ms to 50 ms.
6ccac11b4dd1a4eaf9c914fd567cdf7922184e28 decreased the wait tenfold, so
this puts it back.
Bug 1650025
Bug 200078514
Change-Id: I53a4ea115536ca2ff5d6aa701547c7477ac6e4ea
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/748224
(cherry picked from commit 7c22a24817f0880941e6f4343059fa303ec9eff5)
Reviewed-on: http://git-master/r/753285
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In non-silicon wait infinitely for all jobs to complete before
unmapping a fixed allocation.
Bug 200078514
Change-Id: I9196afb1d3c5f0c999113a4a17ada2989ac55707
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/744067
(cherry picked from commit 6ccac11b4dd1a4eaf9c914fd567cdf7922184e28)
Reviewed-on: http://git-master/r/753284
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff broke compbits
mapping. So, let's fix it.
Bug 200077571
Change-Id: I02dc150fbcb4cd59660f510adde9f029290efdfb
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/745001
(cherry picked from commit 86fc7ec9a05999bea8de320840b962db3ee11410)
Reviewed-on: http://git-master/r/753281
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When allocation size is 4k or below, we should use kmalloc. vmalloc
should be used only for larged allocations.
Introduce nvgpu_alloc, which checks the size, and decides the API
to use.
Change-Id: I593110467cd319851b27e57d1bfe8d228d3f2909
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743974
(cherry picked from commit 7f56aa1f0ecafbfde7286353b60e25e494674d26)
Reviewed-on: http://git-master/r/753276
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.
Bug 200112195
Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes GPU pbdma interrupt to be generated.
Bug 200106514
Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.
The original commit was actually fine.
Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function validate_fixed_buffer used to do a linear search for
collision detection of already mapped buffers. Optimize this by doing
a nice logarithmic search instead.
Change-Id: Ifbf2ec015741d44883da27bc6f8cc090c48da145
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/739682
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When system is in low memory, kzalloc will fail if
kernel requests more than PAGE_SIZE continous memory block.
Bug 200096099
Change-Id: I44e217ffa6aa6c453a4d4afba45a8ee3b5756cc1
Signed-off-by: Kerwin Wan <kerwinw@nvidia.com>
Reviewed-on: http://git-master/r/732197
(cherry picked from commit 62861976421415f93e98a0a9f977ac1f66046714)
Reviewed-on: http://git-master/r/737057
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Tested-by: Krishna Reddy <vdumpa@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info
on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS,
which enables mapping compbits to the GPU address space of said
buffers. This, subsequently, enables moving comptag swizzling from GPU
to CDEH/CDEV formats to userspace.
Compbits mapping is conservative and it may map more than what is
strictly needed. This is because two reasons: 1) mapping must be done
on small page alignment (4kB), and 2) GPU comptags are swizzled all
around the aggregate cache line, which means that the whole cache line
must be visible even if only some comptag lines are required from
it. Cache line size is not necessarily a multiple of the small page
size.
Bug 200077571
Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/719710
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5.
Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/741469
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement a new buddy allocation scheme for the GPU's VA space.
The bitmap allocator was using too much memory and is not a scaleable
solution as the GPU's address space keeps getting bigger. The buddy
allocation scheme is much more memory efficient when the majority
of the address space is not allocated.
The buddy allocator is not constrained by the notion of a split
address space. The bitmap allocator could only manage either small
pages or large pages but not both at the same time. Thus the bottom
of the address space was for small pages, the top for large pages.
Although, that split is not removed quite yet, the new allocator
enables that to happen.
The buddy allocator is also very scalable. It manages the relatively
small comptag space to the enormous GPU VA space and everything in
between. This is important since the GPU has lots of different sized
spaces that need managing.
Currently there are certain limitations. For one the allocator does
not handle the fixed allocations from CUDA very well. It can do so
but with certain caveats. The PTE page size is always set to small.
This means the BA may place other small page allocations in the
buddies around the fixed allocation. It does this to avoid having
large and small page allocations in the same PDE.
Change-Id: I501cd15af03611536490137331d43761c402c7f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740694
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On linsim, when the push buffers are allowed to be allocated with small
pages above 4GB the simulator crashes. This patch ensures that for
linsim all small page allocations are forced to be below 4GB in the
GPU VA space. By doing so the simulator no longer crashes.
This bug has come up because the GPU buddy allocator work generates
allocations at the top of the address space first. Thus push buffers
were located at between 12GB and 16GB in the GPU VA space.
Change-Id: Iaef0af3fda3f37ac09a66b5e1179527d6fe08ccc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740728
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The number of entries in the next level PDE data structure was one
half of what was needed since the bit shift was 1 bit too small.
Change-Id: Id4981f230dd206ae94336cddab117312e143e6a1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740727
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement NVGPU_AS_MAP_BUFFER_FLAGS_MAPPABLE_COMPBITS, which adds
extra alignment to compbits allocation for safe compbits mapping.
Bug 200077571
Change-Id: I3a74ebb81412e4e1e69501debeb9ef4e2056ef1a
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/730763
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/740693
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve GMMU mapping code to cope with discontiguous buffers.
Add debugfs entry that allows bypassing SMMU and disabling big pages.
Bug 1605769
Change-Id: I14d32c62293a16ff8c7195377c75a85fa8061083
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717503
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/737533
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bug N/A
with 128MB hole we are running into PDE
errors when 64K big page is used instead
of 128k
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Change-Id: Id887b32484e2114a8707e7d534e6ebf5e108b83f
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/733497
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/737532
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Record size of each page table level. The size of level 0 depends
on size of the address space, and we generally do not support the
whole address space.
Change-Id: Iab47505af1a641e193d9e98a2246e522813f221a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/729730
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-on: http://git-master/r/737531
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce amount of duplicate code around memory allocation by using
common helpers, and common data structure for storing results of
allocations.
Bug 1605769
Change-Id: Idf51831e8be9cabe1ab9122b18317137fde6339f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/721030
Reviewed-on: http://git-master/r/737530
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that the GPU VA for a buffer is aligned correctly if
compression is enabled.
Bug 1605769
Change-Id: I12566ddd554da7cc9fb41dd553576c534ac96ba8
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/725767
Reviewed-on: http://git-master/r/737529
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Convert the loop to free page tables into a recursive loop that goes
through all levels.
Change-Id: I3ab8f021bd8263f2f6dad29b5fbd0e6212c55a86
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/711393
Reviewed-on: http://git-master/r/737528
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
|