| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce the usage of nvgpu_vidmem_get_page_alloc() and friends as much
as possible. This reduces the dependency of nvgpu on Linux SGLs. SGLs
still need to be used, however, since sharing buffers in userspace is
done by dma_buf FD. The best way to pass the vidmem buf through the
dma_buf is by SGL pointer.
JIRA NVGPU-30
JIRA NVGPU-138
Change-Id: Ide0e9e5a557f00aa63b063be085042101a5b34ee
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1540709
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split the core vidmem allocation from the Linux component of vidmem
allocation. The core vidmem allocation allocates the nvgpu_mem struct
that defines the vidmem buffer in the core MM code. The Linux code
now allocates some Linux specific stuff (dma_buf, etc) and also
allocates the core vidmem buf.
JIRA NVGPU-30
JIRA NVGPU-138
Change-Id: I88e87e0abd5ec714610eacc6eac17e148bcee3ce
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1540708
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename the VIDMEM APIs to be prefixed by nvgpu_ to ensure
consistency and that all the non-static vidmem functions are
properly namespaced.
JIRA NVGPU-30
JIRA NVGPU-138
Change-Id: I9986ee8f2c8f95a4b7c5e2b9607bc1e77933ccfc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1540707
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For sync-point read map:
1. Added nvgpu_mem memory allocator in gk20a struct and
allocated memory for this in gk20a_finalize_poweron()
and freed this memory in gk20a_remove().
2. Added "u64 syncpt_ro_map_gpu_va" in vm_gk20a struct
for read map in vm.
Added nvgpu_quiesce() in nvgpu_remove() before freeing
syncpoint read map to ensure that nvgpu is idle.
JIRA GPUT19X-2
Change-Id: I7cbfec57f0992682dd833a1b0d92d694bcaf1eb3
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1514338
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split VIDMEM support into its own code files organized as such:
common/mm/vidmem.c - Base vidmem support
common/linux/vidmem.c - Linux specific user-space interaction
include/nvgpu/vidmem.h - Vidmem API definitions
Also use the config to enable/disable VIDMEM support in the makefile
and remove as many CONFIG_GK20A_VIDMEM preprocessor checks as possible
from the source code.
And lastly update a while-loop that iterated over an SGT to use the
new for_each construct for iterating over SGTs.
Currently this organization is not perfectly adhered to. More patches
will fix that.
JIRA NVGPU-30
JIRA NVGPU-138
Change-Id: Ic0f4d2cf38b65849c7dc350a69b175421477069c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1540705
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
__nvgpu_gmmu_do_update_page_table() uses nvgpu_sgt_for_each_sgl to loop
through the entries of a buffer to be mapped, so when continue is used,
the sgl entry must not be reassigned again like it was before with a
pure while-and-update loop. Delete a reassignment to fix a case where
sgl = sgl->next could happen twice.
Bug 2002279
Bug 2001466
Change-Id: I47c8b853d4b35304740cd4e8a840df02fcd23054
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1575279
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Timo Alho <talho@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Combine the VIDMEM and SYSMEM GMMU mapping paths into one
single function. With the proper nvgpu_sgt abstraction in
place the high level mapping code can treat these two types
of maps identically. The benefits of this are immediate: a
change to the mapping code will have much more testing
coverage - the same code runs on both vidmem and sysmem.
This also makes it easier to make changes to mapping code
since the changes will be testable no matter what type of
mem is being mapped.
JIRA NVGPU-68
Change-Id: I308eda03d426966ba8e1d4892c99cf97b45c5993
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1566706
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a macro to iterate across nvgpu_sgts. This makes it easier on
developers who may accidentally forget to move to the next SGL.
JIRA NVGPU-243
Change-Id: I90154a5d23f0014cb79bbcd5b6e8d8dbda303820
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1566627
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename get_physical_addr_bits and related functions to something that
more clearly conveys what they are doing. The basic idea of these
functions is to translate from a physical GPU address to a IOMMU GPU
address. To do that a particular bit (that varies from chip to chip)
is added to the physical address.
JIRA NVGPU-68
Change-Id: I536cc595c4397aad69a24f740bc74db03f52bc0a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1542966
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the last sg_phys() call from the GMMU code and replace it
with a generic nvgpu_mem API. This new API, nvgpu_mem_get_phys_addr(),
returns the physical address of an nvgpu_mem struct.
Also, implement this new API in the Linux specific nvgpu_mem code
since it requires access to the underlying SGT/SGL.
JIRA NVGPU-68
Change-Id: Idf88701a2a8515464c658c26e0de493c82ff850d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1542964
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nvgpu_mem and pmu_debug should be MIT licensed. Change the license
boilerplate.
JIRA NVGPU-218
Change-Id: I7750368674faa4c4e8bf071e136b80fd53d9a0c4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1568779
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change license of OS independent source code files to MIT.
JIRA NVGPU-218
Change-Id: I1474065f4b552112786974a16cdf076c5179540e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1565880
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The basic nvgpu_mem_sgl implementation provides support
for OS specific scatter-gather list implementations by
simply copying them node by node. This is inefficient,
taking extra time and memory.
This patch implements an nvgpu_mem_sgt struct to act as
a header which is inserted at the front of any scatter-
gather list implementation. This labels every struct
with a set of ops which can be used to interact with
the attached scatter gather list.
Since nvgpu common code only has to interact with these
function pointers, any sgl implementation can be used.
Initialization only requires the allocation of a single
struct, removing the need to copy or iterate through the
sgl being converted.
Jira NVGPU-186
Change-Id: I2994f804a4a4cc141b702e987e9081d8560ba2e8
Signed-off-by: Sunny He <suhe@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1541426
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The last major item preventing the core MM code in the nvgpu
driver from being platform agnostic is the usage of Linux
scattergather tables and scattergather lists. These data
structures are used throughout the mapping code to handle
discontiguous DMA allocations and also overloaded to represent
VIDMEM allocs.
The notion of a scatter gather table is crucial to a HW device
that can handle discontiguous DMA. The GPU has a MMU which
allows the GPU to do page gathering and present a virtually
contiguous buffer to the GPU HW. As a result it makes sense
for the GPU driver to use some sort of scatter gather concept
so maximize memory usage efficiency.
To that end this patch keeps the notion of a scatter gather
list but implements it in the nvgpu common code. It is based
heavily on the Linux SGL concept. It is a singly linked list
of blocks - each representing a chunk of memory. To map or
use a DMA allocation SW must iterate over each block in the
SGL.
This patch implements the most basic level of support for this
data structure. There are certainly easy optimizations that
could be done to speed up the current implementation. However,
this patches' goal is to simply divest the core MM code from
any last Linux'isms. Speed and efficiency come next.
Change-Id: Icf44641db22d87fa1d003debbd9f71b605258e42
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1530867
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure that the block count is the length / block_size since the
length is passed in bytes.
Change-Id: Ibb132b16b70b9cd7117c441f37a7947052d279ee
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1538976
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Construct a wrapper macro NV_ACCESS_ONCE(x) which uses OS specific
versions of ACCESS_ONCE. e.g for linux, ACCESS_ONCE(x) is used.
Jira NVGPU-125
Change-Id: Ia5c67baae111c1a7978c530bf279715fc808287d
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1549928
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove debugging features that did not really get used and make
the debugging code use the nvgpu_log() functionality. This ties
the allocator debugging into the larger nvgpu debug framework.
Also modify many of the places CONFIG_DEBUG_FS was used to
conditionally compile allocator debug code to use __KERNEL__
instead. This is because that debug code can still be called even
when debugfs is not present in Linux.
Change-Id: I112ebe1cae22d6f8db96d023993498093e18d74a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1544439
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove an unused tracing feature from the allocator code. This
should make multi-OS support slightly easier.
Change-Id: I8b57f037b66227f2110e457bff7aa6b443d89e82
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1544438
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- added wrapper struct nvgpu_ref over nvgpu_atomic_t
- added nvgpu_ref_* APIs to access the above struct
JIRA NVGPU-140
Change-Id: Id47f897995dd4721751f7610b6d4d4fbfe4d6b9a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1540899
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the amount of memory used to the page allocator debug dump.
Change-Id: Icd4b4a0489068aaa3f60221b792de7f8dbf0092c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1543695
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
construct wrapper nvgpu_* methods to replace
mb,rmb,wmb,smp_mb,smp_rmb,smp_wmb,read_barrier_depends and
smp_read_barrier_depends.
NVGPU-122
Change-Id: I8d24dd70fef5cb0fadaacc15f3ab11531667a0df
Signed-off-by: Debarshi <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1541199
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- added wrapper structs nvgpu_atomic_t and nvgpu_atomic64_t over
atomic_t and atomic64_t
- added nvgpu_atomic_* and nvgpu_atomic64_* APIs to access the above
wrappers.
JIRA NVGPU-121
Change-Id: I61667bb0a84c2fc475365abb79bffb42b8b4786a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1533044
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix some simple pointer arithmetic errors in the deterministic submit
path. The lockless allocator was doing a subtraction to determine the
correct offset of the element to free. However, this subtraction was
using the base address and a numeric offset - not a pointer offset. Thus
the difference computed was in bytes not in elements of the block size.
The fix is simple: just divide by the block size.
Also this modifies the debugging statement a bit so that a bit more
information is printed at more useful times.
Lastly, a pointer to numeric cast was fixed in the fence code.
Change-Id: I514724205f1b73805b21e979481a13ac689f3482
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1538905
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The call to __set_pd_level() for vidmem allocs had the wrong
length being passed in. This was a silent error since the subsequent
__set_pd_level() calls overwrote the bad mappings. However this
caused significantly more PDE/PTE writes than necessary since
each chunk could be mapped N times where N is the number of chunks
in an SGL.
Change-Id: Ied7247b70825dc91b9eea1c3350f4ef370ab1a52
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1537078
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the mm.get_iova_addr() HAL and replace it with a new HAL
called mm.gpu_phys_addr(). This new HAL provides the real phys
address that should be passed to the GPU from a physical address
obtained from a scatter list. It also provides a mechanism by
which the HAL code can add extra bits to a GPU physical address
based on the attributes passed in. This is necessary during GMMU
page table programming.
Also remove the flags argument from the various address functions.
This flag was used for adding an IO coherence bit to the GPU
physical address which is not supported.
JIRA NVGPU-30
Change-Id: I69af5b1c6bd905c4077c26c098fac101c6b41a33
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1530864
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix kernel warnings for GPUs with real vidmem:
- dma.c: in nvgpu_dma_alloc_flags, ignore incoming flags when using vidmem,
since anything but NVGPU_DMA_NO_KERNEL_MAPPING will end up generating
kernel warnings, and the vidmem mapping functions ignore the other flags
anyway.
- gmmu.c: in __nvgpu_gmmu_update_page_table, use appropriate function for
memory type to retrieve physical address
Bug 1967748
Change-Id: I6fc01fd5f2c5cd7b81cba70ab59cc3c8fe4cda19
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1530877
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the PD caching code use a non-contiguous DMA alloc for PAGE_SIZE
and below allocations. There's no need for using the special contig
pool of mem for these page sized allocs so wasting said mem can lead
us to OOM problems pretty quickly (think large sparse textures, for
example).
Also turn several pd_dbg() statements for printing OOM errors into
nvgpu_err()s since knowing exactly where an alloc fails is very
convenient.
Bug 200326705
Change-Id: Ib7c45020894d4bdd73cc92179ef707e472714d61
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1527294
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add new routines for accessing and modifying PTEs in situ. They are:
__nvgpu_pte_words()
__nvgpu_get_pte()
__nvgpu_set_pte()
All the details of modifying a page table entry are handled within.
Note, however, that these routines will not build page tables. If a PTE
does not exist then said PTE will not be created. Instead -EINVAL will
be returned. But, keep in mind, a PTE marked as invalid still exists.
So this API can be used to mark an invalid PTE valid.
JIRA NVGPU-30
Change-Id: Ic8615f209a0c4eb6fa64af9abadcfb3b2c11ee73
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1510447
Reviewed-by: Automatic_Commit_Validation_User
Tested-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that all debug prints are consistent from chip to chip
and function to function. The following maps letters in the
debug print to their meaning:
C Mapping is cachable
v Mapping is volatile
S Mapping is sparse
P Mapping is private (VPR/WPR)
c Mapping is coherent
V Mapping is valid
JIRA NVGPU-30
Change-Id: Ia890af88677c3e6d3fdd8c4fe266158c35b8afcd
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master/r/1514903
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add t19x specific flags into the GMMU attributes struct.
Jira GPUT19X-10
Bug 200279508
Change-Id: Ib45b83705fa1ca4ff6d14da0a2f132050e7d2cd5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master/r/1514876
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On some GPUs certain physical address bits have special meaning. This
patch adds support for setting those bits based on the GMMU attributes
struct.
Jira GPUT19X-10
Bug 200279508
Change-Id: I32b8a028be7fd62af06a60c393a8c9251de0ef3c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master/r/1512600
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use sysmem_coherent aperture if the buffer mappings are requested
to be IO coherent. Use sysmem_noncoherent aperture otherwise. This
is implemented by adding a new coherent field to the GMMU attrs
struct.
Jira GPUT19X-17
Bug 1651331
Bug 200283998
Change-Id: I5cfb71b5913d4db50ebf10331b19f5a4216456bf
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master/r/1514438
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases page directories require less than a full page of memory.
For example, on Pascal, the final PD level for large pages is only 256 bytes;
thus 16 PDs can fit in a single page. To allocate an entire page for each of
these 256 B PDs is extremely wasteful. This patch aims to alleviate the
wasted DMA memory from having small PDs in a full page by packing multiple
small PDs into a single page.
The packing is implemented as a slab allocator - each page is a slab and
from each page multiple PD instances can be allocated. Several modifications
to the nvgpu_gmmu_pd struct also needed to be made to support this. The
nvgpu_mem is now a pointer and there's an explicit offset into the nvgpu_mem
struct so that each nvgpu_gmmu_pd knows what portion of the memory it's
using.
The nvgpu_pde_phys_addr() function and the pd_write() functions also require
some changes since the PD no longer is always situated at the start of the
nvgpu_mem.
Initialization and cleanup of the page tables for each VM was slightly
modified to work through the new pd_cache implementation. Some PDs (i.e
the PDB), despite not being a full page, still require a full page for
alignment purposes (HW requirements). Thus a direct allocation method for
PDs is still provided. This is also used when a PD that could in principle
be cached is greater than a page in size.
Lastly a new debug flag was added for the pd_cache code.
JIRA NVGPU-30
Change-Id: I64c8037fc356783c1ef203cc143c4d71bbd5d77c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master/r/1506610
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the high level mapping logic. Instead of iterating over the
GPU VA iterate over the scatter-gather table chunks. As a result
each GMMU page table update call is simplified dramatically.
This also modifies the chip level code to no longer require an SGL
as an argument. Each call to the chip level code will be guaranteed
to be contiguous so it only has to worry about making a mapping from
virt -> phys.
This removes the dependency on Linux that the chip code currently
has. With this patch the core GMMU code still uses the Linux SGL but
the logic is highly transferable to a different, nvgpu specific,
scatter gather list format in the near future.
The last major update is to push most of the page table attribute
arguments to a struct. That struct is passed on through the various
mapping levels. This makes the funtions calls more simple and
easier to follow.
JIRA NVGPU-30
Change-Id: Ibb6b11755f99818fe642622ca0bd4cbed054f602
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master/r/1484104
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the special cases for fmodel in the GMMU allocation code. There
is no reason to treat fmodel any different than regular DMA memory.
If there is no IOMMU the DMA api will handle that perfectly acceptably.
JIRA NVGPU-30
Change-Id: Icceb832735a98b601b9f41064dd73a6edee29002
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master/r/1507562
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixup a minor spacing issue in the VM code.
Change-Id: Ia4aaa3cdd7cb63958870cebb51821bf640d56123
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1499446
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Separate the non-chip specific GMMU mapping implementation code
out of mm_gk20a.c. This puts all of the chip-agnostic code into
common/mm/gmmu.c in preparation for rewriting it.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: I6f7fdac3422703f5e80bb22ad304dc27bba4814d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1480228
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support only VM pointers and ref-counting for maintaining VMs. This
dramatically reduces the complexity of the APIs, avoids the API
abuse that has existed, and ensures that future VM usage is
consistent with current usage.
Also remove the combined VM free/instance block deletion. Any place
where this was done is now replaced with an explict free of the
instance block and a nvgpu_vm_put().
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: Ib73e8d574ecc9abf6dad0b40a2c5795d6396cc8c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1480227
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unify the initialization routines for the vGPU and regular GPU paths.
This helps avoid any further code divergence. This also assumes that
the code running on the regular GPU essentially works for the vGPU.
The only addition is that the regular GPU path calls an API in the
vGPU code that sends the necessary RM server message.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: I37af1993fd8b50f666ae27524d382cce49cf28f7
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1480226
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the way big_pages is handled in nvgpu_vm_init(). Prior to this
patch big_pages could wind up being true even if disable_bigpages is
set in the mm struct. Clearly this is wrong.
The logic is now simplified a little bit and makes it so that if the
disable_bigpages field is set then the resulting VM created by
nvgpu_vm_init() has only one user area (no need for a second LP user
area) and the big_pages field for the VM is set to false.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: If5dc7fcf3fa4e070f87295406f0afe414269b702
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1493318
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since all debugfs code is Linux specific, remove
it from common code and move it to Linux module
Debugfs code is now divided into below
module specific files :
common/linux/debug.c
common/linux/debug_cde.c
common/linux/debug_ce.c
common/linux/debug_fifo.c
common/linux/debug_gr.c
common/linux/debug_mm.c
common/linux/debug_allocator.c
common/linux/debug_kmem.c
common/linux/debug_pmu.c
common/linux/debug_sched.c
Add corresponding header files for above modules too
And compile all of above files only if CONFIG_DEBUG_FS is set
Some more details of the changes made
- Move and rename gk20a/debug_gk20a.c to common/linux/debug.c
- Move and rename gk20a/debug_gk20a.h to include/nvgpu/debug.h
- Remove gm20b/debug_gm20b.c and gm20b/debug_gm20b.h and call
gk20a_init_debug_ops() directly from gm20b_init_hal()
- Update all debug APIs to receive struct gk20a as parameter
instead of receiving struct device pointer
- Update API gk20a_dmabuf_get_state() to receive struct gk20a
pointer instead of struct device
- Include <nvgpu/debug.h> explicitly in all files where debug
operations are used
- Remove "gk20a/platform_gk20a.h" include from HAL files
which no longer need this include
- Add new API gk20a_debug_deinit() to deinitialize debugfs
and call it from gk20a_remove()
- Move API gk20a_debug_dump_all_channel_status_ramfc() to
gk20a/fifo_gk20a.c
Jira NVGPU-62
Change-Id: I076975d3d7f669bdbe9212fa33d98529377feeb6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1488902
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the enabled flags API to handle the unify_address_sapce spaces
flag.
JIRA NVGPU-84
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: Id1b59aed4b349d6067615991597d534936cc5ce9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1488307
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Begin removing all of the myriad flag variables in struct gk20a and
replace that with one API that checks for flags being enabled or
disabled. The API is as follows:
bool nvgpu_is_enabled(struct gk20a *g, int flag);
bool __nvgpu_set_enabled(struct gk20a *g, int flag, bool state);
These APIs allow many of the gk20a flags to be replaced by defines.
This makes flag usage consistent and saves a small amount of memory in
struct gk20a. Also it makes struct gk20a easier to read since there's
less clutter scattered through out.
JIRA NVGPU-84
Change-Id: I6525cecbe97c4e8379e5f53e29ef0b4dbd1a7fc2
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1488049
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the VM init/de-init from the HAL and instead use a single
set of routines that init/de-init VMs. This prevents code divergence
between vGPUs and regular GPUs.
This patch also clears up the naming of the routines a little bit.
Since some VMs are used inplace and others are dynamically allocated
the APIs for freeing them were confusing. Also some free calls also
clean up an instance block (this is API abuse - but this is how it
currently exists).
The new API looks like this:
void __nvgpu_vm_remove(struct vm_gk20a *vm);
void nvgpu_vm_remove(struct vm_gk20a *vm);
void nvgpu_vm_remove_inst(struct vm_gk20a *vm,
struct nvgpu_mem *inst_block);
void nvgpu_vm_remove_vgpu(struct vm_gk20a *vm);
int nvgpu_init_vm(struct mm_gk20a *mm,
struct vm_gk20a *vm,
u32 big_page_size,
u64 low_hole,
u64 kernel_reserved,
u64 aperture_size,
bool big_pages,
bool userspace_managed,
char *name);
void nvgpu_deinit_vm(struct vm_gk20a *vm);
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: Ia4016384c54746bfbcaa4bdd0d29d03d5d7f7f1b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1477747
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactor the API for initializing and cleaning up VMs.
This also involved moving a bunch of GMMU code out into the
gmmu code since part of initializing a VM involves initializing
the page tables for the VM.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: I4710f08c26a6e39806f0762a35f6db5c94b64c50
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1477746
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function is an internal function to the VM manager that allocates
virtual memory space in the GVA allocator. It is unfortunately used in
the vGPU code, though. In any event, this patch cleans up and moves the
implementation of these functions into the VM common code.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: I24a3d29b5fcb12615df27d2ac82891d1bacfe541
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1477745
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a build failure caused by missing definition for struct device.
Instead of including device.h wrap the debugfs init function with
CONFIG_DEBUG_FS and forward declare struct device. We don't use any
struct device internals here so we only need to let the compiler
know that this type does exist.
Bug 200310575
Change-Id: I1ae45a8f191d920d9b606fefd5029fad84869cff
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1486012
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vm_reserve_va_node struct is essentially a special VM area that
can be used for sparse mappings and fixed mappings. The name of this
struct is somewhat confusing (as node is typically used for list
items). Though this struct is a part of a list it doesn't really
make sense to call this a list item since it's much more. Based on
that the struct has been renamed to nvgpu_vm_area to capture the
actual use of the struct more accurately.
This also moves all of the management code of vm areas to a new file
devoted solely to vm_area management.
Also add a brief overview of the VM architecture. This should help
other people follow along the hierachy of ownership and lifetimes in
the rather complex MM code.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: If85e1cf868031d0dc265e7bed50b58a2aed2602e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1477744
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch begins splitting out the VM implementation from mm_gk20a.c and
moves it to common/linux/vm.c and common/mm/vm.c. This split is necessary
because the VM code has two portions: first, an interface for the OS
specific code to use (i.e userspace mappings), and second, a set of APIs
for the driver to use (init, cleanup, etc) which are not OS specific.
This is only the beginning of the split - there's still a lot of things
that need to be carefully moved around.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: I3b57cba245d7daf9e4326a143b9c6217e0f28c96
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1477743
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch begins the major rework of the GPU's virtual memory manager
(VMM). The VMM is the piece of code that handles the userspace interface
to buffers and their mappings into the GMMU. The core data structure is
the VM - for now still known as 'struct vm_gk20a'. Each one of these
structs represents one addres space to which channels or TSGs may bind
themselves to.
The VMM splits the interface up into two broad categories. First there's
the common, OS independent interfaces; and second there's the OS specific
interfaces.
OS independent
--------------
This is the code that manages the lifetime of VMs, the buffers inside
VMs (search, batch mapping) creation, destruction, etc.
OS Specific
-----------
This handles mapping of buffers represented as they are represented by
the OS (dma_buf's for example on Linux).
This patch is by no means complete. There's still Linux specific functions
scattered in ostensibly OS independent code. This is the first step. A
patch that rewrites everything in one go would simply be too big to
effectively review.
Instead the goal of this change is to simply separate out the basic
OS specific and OS agnostic interfaces into their own header files. The
next series of patches will start to pull the relevant implementations
into OS specific C files and common C files.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: I242c7206047b6c769296226d855b7e44d5c4bfa8
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1464939
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|