summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/nvgpu/common/mm/gmmu.c
Commit message (Collapse)AuthorAge
* gpu: nvgpu: Cleanup usage of bypass_smmuAlex Waterman2018-02-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GPU has multiple different operating modes in respect to IOMMU'ability. As such there needs to be a clean way to tell the driver whether it is IOMMU'able or not. This state also does not always reflect what is possible: all becasue the GPU can generate IOMMU'ed memory requests doesn't mean it wants to. The nvgpu_iommuable() API has now existed for a little while which is a useful way to convey whether nvgpu should consider the GPU as IOMMU'able. However, there is also the g->mm.bypass_smmu flag which used to be able to override what the GPU decided it should do. Typically it was assigned the same value as nvgpu_iommuable() but that was not necessarily a requirment. This patch removes all the usages of g->mm.bypass_smmu and instead uses the nvgpu_iommuable() function. All places where the check against g->mm.bypass_smmu have been replaced with nvgpu_iommuable(). The code should now be much cleaner. Subsequently other checks can also be placed in the nvgpu_iommuable() function. For example, when NVLINK comes online and the GPU should no longer consider DMA addresses and instead use scatter-gather lists directly the ngpu_iommuable() function will be able to check the state of NVLINK and then act accordingly. Change-Id: I0da6262386de15709decac89d63d3eecfec20cd7 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1648332 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Fold T19x code back to main code pathsTerje Bergstrom2018-01-23
| | | | | | | | | | | | | Lots of code paths were split to T19x specific code paths and structs due to split repository. Now that repositories are merged, fold all of them back to main code paths and structs and remove the T19x specific Kconfig flag. Change-Id: Id0d17a5f0610fc0b49f51ab6664e716dc8b222b6 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1640606 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: fix non-IOMMU mappingsDeepak Nibade2017-12-21
| | | | | | | | | | | | | | | | | | | | | | | | | In __nvgpu_gmmu_do_update_page_table(), and in case of non-IOMMU mappings, we call nvgpu_sgt_get_phys() to get physical address But this API ignores mapping attributes including l3_alloc attribute specified by user space, and this breaks L3 cache allocations Fix this by using g->ops.mm.gpu_phys_addr() which also considers the mapping attributes and returns appropriate physical address Jira GPUT19X-10 Bug 200279508 Change-Id: Ibc0d29f7cb576a9d6893a97b1912d9ff4bc78e02 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1621245 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: fix indexing in locate pte functionDavid Nieto2017-12-05
| | | | | | | | | | | | | | | | | | | The current code does not properly calculate the indexes within the PDE to access the proper entry, and it has a bug in assignement of the big page entries. This change fixes the issue by: (1) Passing a pointer to the level structure and dereferencing the index offset to the next level. (2) Changing the format of the address. (3) Ensuring big pages are only selected if their address is set. Bug 200364599 Change-Id: I46e32560ee341d8cfc08c077282dcb5549d2a140 Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1610562 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Deepak Bhosale <dbhosale@nvidia.com>
* gpu: nvgpu: Increase programmed ctagline at compr page boundariesSami Kiminki2017-12-01
| | | | | | | | | | | | | | | | | Increase the ctagline that is programmed in the page tables when the buffer offset crosses the compression page boundaries. This fixes compressible-kind fixed-address mapping with 4k pages when the GPU VA is not aligned by the compression page size. Bug 1995897 Bug 2011640 Bug 2011668 Change-Id: I1f1f9750635a20a916527c9d18fda7f8aa6b1b1f Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1608465 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Introduce include/nvpgu/sizes.hTerje Bergstrom2017-12-01
| | | | | | | | | | | | We use SZ_* #defines in some parts of nvgpu, but we don't explicitly include a header that defines it. Add include/nvgpu/sizes.h that in Linux #includes linux/sizes.h. Change-Id: I8f506d85c7eaa12e649f5874a87533e2f0fe9438 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1607575 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Add nvgpu/bug.h include to some MM filesAlex Waterman2017-11-30
| | | | | | | | | | | | | | | | | | Add <nvgpu/bug.h> to MM files that use any of the BUG, BUG_ON, WARN, WARN_ON, etc, macros but do not yet include <nvgpu/bug.h>. JIRA NVGPU-401 Change-Id: I538219683d2a52b15abf147ff4bcf6375b6cb8a0 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1599960 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Add translation for NVGPU MM flagsAlex Waterman2017-11-17
| | | | | | | | | | | | | | | | | | | | | | | Add a translation layer to convert from the NVGPU_AS_* flags to to new set of NVGPU_VM_MAP_* and NVGPU_VM_AREA_ALLOC_* flags. This allows the common MM code to not depend on the UAPI header defined for Linux. In addition to this change a couple of other small changes were made: 1. Deprecate, print a warning, and ignore usage of the NVGPU_AS_MAP_BUFFER_FLAGS_MAPPABLE_COMPBITS flag. 2. Move the t19x IO coherence flag from the t19x UAPI header to the regular UAPI header. JIRA NVGPU-293 Change-Id: I146402b0e8617294374e63e78f8826c57cd3b291 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1599802 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Fix some barrier usageAlex Waterman2017-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | Commit 81868a187fa3b217368206f17b19309846e8e7fb updated barrier usage to use the nvgpu wrappers and in doing so downgraded many plain barriers {mb(), wmb(), rmb()} to the SMP versions of these barriers. The SMP version of the barriers in question are only issued when running on an SMP machine. In most of the cases mentioned above this is fine since the barriers are present to faciliate proper ordering across CPUs. A single CPU is always coherent with itself, so on a non-SMP case we don't need those barriers. However, there are a few places where the barriers in use (GMMU page table programming, IO accessors, userd) where the barrier usage is for communicating and establishing ordering for the GPU. We need these barriers for both SMP machines and non-SMP machines. Therefor we must use the plain barrier versions. Change-Id: I376129840b7dc64af8f3f23f88057e4e81360f89 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1599744 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Mark nvgpu_pde_phys_addr staticAlex Waterman2017-11-16
| | | | | | | | | | | | | | | | | | nvgpu_pde_phys_addr() is only used in gmmu.c and as such can be marked static. JIRA NVGPU-402 Change-Id: I7adba6f54ebd4e06d176f23b9a959c04a8770338 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1599040 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Include UAPI explicitlyTerje Bergstrom2017-11-13
| | | | | | | | | | | | | Add explicit #includes for <uapi/linux/nvgpu.h> for source code files that depend on it. JIRA NVGPU-259 Change-Id: I717d5f1493423fd3a7a34b6dd3380d33a9307a09 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1596254 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Make buf alignment genericAlex Waterman2017-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drastically simplify and move the aligment computation for buffers getting mapped into the SGT code. An SGT is all that is needed for computing the alignment. However, this did require that a new SGT op was added: nvgpu_sgt_iommuable() This function returns true if the passed SGT is IOMMU'able and must be implemented by an SGT implementation that has IOMMU'able buffers. If this function is left as NULL then it is assumed that the buffer is not IOMMU'able. Also cleanup the parameter ordering convention among all nvgpu_sgt functions. Previously there was a mishmash of different parameter orderings. This patch now standardizes on the gk20a first approach seen everywhere else in the driver. JIRA NVGPU-30 JIRA NVGPU-246 JIRA NVGPU-71 Change-Id: Ic4ab7b752847cf795c7cfafed5a07818217bba86 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1583985 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Remove support for legacy mappingSami Kiminki2017-11-08
| | | | | | | | | | | | | | | | | | | | Make NVGPU_AS_MAP_BUFFER_FLAGS_DIRECT_KIND_CTRL mandatory for all map IOCTLs. We'll clean up the legacy kernel code in subsequent patches. Remove support for NVGPU_AS_IOCTL_MAP_BUFFER. It has been superseded by NVGPU_AS_IOCTL_MAP_BUFFER_EX. Remove legacy definitions to nvgpu_map_buffer_args and the related flags, and update the in-kernel map calls accordingly by switching to the newer definitions. Bug 1902982 Change-Id: Ie9a7f02b8d5d0ec7c3722c4481afab6d39b4fbd0 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1560932 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: fix pte location functionsDavid Nieto2017-11-01
| | | | | | | | | | | | | | | | | | Modify the recursive loop in pte_find to make sure it is targeting the proper pde page size. JIRA NVGPUGV100-36 Change-Id: Ib3673d8d9f1bd3c907d532f9e2562ecdc5dda4af Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1586739 Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Split VIDMEM support from mm_gk20a.cAlex Waterman2017-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Split VIDMEM support into its own code files organized as such: common/mm/vidmem.c - Base vidmem support common/linux/vidmem.c - Linux specific user-space interaction include/nvgpu/vidmem.h - Vidmem API definitions Also use the config to enable/disable VIDMEM support in the makefile and remove as many CONFIG_GK20A_VIDMEM preprocessor checks as possible from the source code. And lastly update a while-loop that iterated over an SGT to use the new for_each construct for iterating over SGTs. Currently this organization is not perfectly adhered to. More patches will fix that. JIRA NVGPU-30 JIRA NVGPU-138 Change-Id: Ic0f4d2cf38b65849c7dc350a69b175421477069c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1540705 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: fix double iteration in gmmu updateKonsta Holtta2017-10-10
| | | | | | | | | | | | | | | | | | | | __nvgpu_gmmu_do_update_page_table() uses nvgpu_sgt_for_each_sgl to loop through the entries of a buffer to be mapped, so when continue is used, the sgl entry must not be reassigned again like it was before with a pure while-and-update loop. Delete a reassignment to fix a case where sgl = sgl->next could happen twice. Bug 2002279 Bug 2001466 Change-Id: I47c8b853d4b35304740cd4e8a840df02fcd23054 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1575279 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Timo Alho <talho@nvidia.com>
* gpu: nvgpu: Combine VIDMEM and SYSMEM map pathsAlex Waterman2017-10-04
| | | | | | | | | | | | | | | | | | | | | | | | Combine the VIDMEM and SYSMEM GMMU mapping paths into one single function. With the proper nvgpu_sgt abstraction in place the high level mapping code can treat these two types of maps identically. The benefits of this are immediate: a change to the mapping code will have much more testing coverage - the same code runs on both vidmem and sysmem. This also makes it easier to make changes to mapping code since the changes will be testable no matter what type of mem is being mapped. JIRA NVGPU-68 Change-Id: I308eda03d426966ba8e1d4892c99cf97b45c5993 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1566706 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
* gpu: nvgpu: Add for_each construct for nvgpu_sgtsAlex Waterman2017-10-04
| | | | | | | | | | | | | Add a macro to iterate across nvgpu_sgts. This makes it easier on developers who may accidentally forget to move to the next SGL. JIRA NVGPU-243 Change-Id: I90154a5d23f0014cb79bbcd5b6e8d8dbda303820 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1566627 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Remove sg_phys() from GMMU codeAlex Waterman2017-10-04
| | | | | | | | | | | | | | | | | | Remove the last sg_phys() call from the GMMU code and replace it with a generic nvgpu_mem API. This new API, nvgpu_mem_get_phys_addr(), returns the physical address of an nvgpu_mem struct. Also, implement this new API in the Linux specific nvgpu_mem code since it requires access to the underlying SGT/SGL. JIRA NVGPU-68 Change-Id: Idf88701a2a8515464c658c26e0de493c82ff850d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1542964 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Change license for common files to MITTerje Bergstrom2017-09-26
| | | | | | | | | | | | Change license of OS independent source code files to MIT. JIRA NVGPU-218 Change-Id: I1474065f4b552112786974a16cdf076c5179540e Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1565880 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: SGL passthrough implementationSunny He2017-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | The basic nvgpu_mem_sgl implementation provides support for OS specific scatter-gather list implementations by simply copying them node by node. This is inefficient, taking extra time and memory. This patch implements an nvgpu_mem_sgt struct to act as a header which is inserted at the front of any scatter- gather list implementation. This labels every struct with a set of ops which can be used to interact with the attached scatter gather list. Since nvgpu common code only has to interact with these function pointers, any sgl implementation can be used. Initialization only requires the allocation of a single struct, removing the need to copy or iterate through the sgl being converted. Jira NVGPU-186 Change-Id: I2994f804a4a4cc141b702e987e9081d8560ba2e8 Signed-off-by: Sunny He <suhe@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1541426 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: nvgpu SGL implementationAlex Waterman2017-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The last major item preventing the core MM code in the nvgpu driver from being platform agnostic is the usage of Linux scattergather tables and scattergather lists. These data structures are used throughout the mapping code to handle discontiguous DMA allocations and also overloaded to represent VIDMEM allocs. The notion of a scatter gather table is crucial to a HW device that can handle discontiguous DMA. The GPU has a MMU which allows the GPU to do page gathering and present a virtually contiguous buffer to the GPU HW. As a result it makes sense for the GPU driver to use some sort of scatter gather concept so maximize memory usage efficiency. To that end this patch keeps the notion of a scatter gather list but implements it in the nvgpu common code. It is based heavily on the Linux SGL concept. It is a singly linked list of blocks - each representing a chunk of memory. To map or use a DMA allocation SW must iterate over each block in the SGL. This patch implements the most basic level of support for this data structure. There are certainly easy optimizations that could be done to speed up the current implementation. However, this patches' goal is to simply divest the core MM code from any last Linux'isms. Speed and efficiency come next. Change-Id: Icf44641db22d87fa1d003debbd9f71b605258e42 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1530867 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Nvgpu abstraction for linux barriers.Debarshi Dutta2017-08-22
| | | | | | | | | | | | | | | | | construct wrapper nvgpu_* methods to replace mb,rmb,wmb,smp_mb,smp_rmb,smp_wmb,read_barrier_depends and smp_read_barrier_depends. NVGPU-122 Change-Id: I8d24dd70fef5cb0fadaacc15f3ab11531667a0df Signed-off-by: Debarshi <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1541199 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Sourab Gupta <sourabg@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
* gpu: nvgpu: Fix length passed to VIDMEM mapAlex Waterman2017-08-14
| | | | | | | | | | | | | | | | | | The call to __set_pd_level() for vidmem allocs had the wrong length being passed in. This was a silent error since the subsequent __set_pd_level() calls overwrote the bad mappings. However this caused significantly more PDE/PTE writes than necessary since each chunk could be mapped N times where N is the number of chunks in an SGL. Change-Id: Ied7247b70825dc91b9eea1c3350f4ef370ab1a52 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1537078 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Remove mm.get_iova_addrAlex Waterman2017-08-04
| | | | | | | | | | | | | | | | | | | | | | Remove the mm.get_iova_addr() HAL and replace it with a new HAL called mm.gpu_phys_addr(). This new HAL provides the real phys address that should be passed to the GPU from a physical address obtained from a scatter list. It also provides a mechanism by which the HAL code can add extra bits to a GPU physical address based on the attributes passed in. This is necessary during GMMU page table programming. Also remove the flags argument from the various address functions. This flag was used for adding an IO coherence bit to the GPU physical address which is not supported. JIRA NVGPU-30 Change-Id: I69af5b1c6bd905c4077c26c098fac101c6b41a33 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1530864 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: fix warnings for GPUs with real vidmemPeter Daifuku2017-08-03
| | | | | | | | | | | | | | | | | | | | | Fix kernel warnings for GPUs with real vidmem: - dma.c: in nvgpu_dma_alloc_flags, ignore incoming flags when using vidmem, since anything but NVGPU_DMA_NO_KERNEL_MAPPING will end up generating kernel warnings, and the vidmem mapping functions ignore the other flags anyway. - gmmu.c: in __nvgpu_gmmu_update_page_table, use appropriate function for memory type to retrieve physical address Bug 1967748 Change-Id: I6fc01fd5f2c5cd7b81cba70ab59cc3c8fe4cda19 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1530877 Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Add get/set PTE routinesAlex Waterman2017-07-12
| | | | | | | | | | | | | | | | | | | | | | | | | | Add new routines for accessing and modifying PTEs in situ. They are: __nvgpu_pte_words() __nvgpu_get_pte() __nvgpu_set_pte() All the details of modifying a page table entry are handled within. Note, however, that these routines will not build page tables. If a PTE does not exist then said PTE will not be created. Instead -EINVAL will be returned. But, keep in mind, a PTE marked as invalid still exists. So this API can be used to mark an invalid PTE valid. JIRA NVGPU-30 Change-Id: Ic8615f209a0c4eb6fa64af9abadcfb3b2c11ee73 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1510447 Reviewed-by: Automatic_Commit_Validation_User Tested-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
* gpu: nvgpu: Cleanup GMMU debug printingAlex Waterman2017-07-07
| | | | | | | | | | | | | | | | | | | | | | | Ensure that all debug prints are consistent from chip to chip and function to function. The following maps letters in the debug print to their meaning: C Mapping is cachable v Mapping is volatile S Mapping is sparse P Mapping is private (VPR/WPR) c Mapping is coherent V Mapping is valid JIRA NVGPU-30 Change-Id: Ia890af88677c3e6d3fdd8c4fe266158c35b8afcd Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master/r/1514903 GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Tested-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
* gpu: nvgpu: Add t19x GMMU attributesAlex Waterman2017-07-07
| | | | | | | | | | | | | | | | Add t19x specific flags into the GMMU attributes struct. Jira GPUT19X-10 Bug 200279508 Change-Id: Ib45b83705fa1ca4ff6d14da0a2f132050e7d2cd5 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master/r/1514876 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Tested-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
* gpu: nvgpu: support platform specific physical address translationDeepak Nibade2017-07-07
| | | | | | | | | | | | | | | On some GPUs certain physical address bits have special meaning. This patch adds support for setting those bits based on the GMMU attributes struct. Jira GPUT19X-10 Bug 200279508 Change-Id: I32b8a028be7fd62af06a60c393a8c9251de0ef3c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master/r/1512600 GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
* gpu: nvgpu: use coherent aperture for coherent buffersDeepak Nibade2017-07-07
| | | | | | | | | | | | | | | | | Use sysmem_coherent aperture if the buffer mappings are requested to be IO coherent. Use sysmem_noncoherent aperture otherwise. This is implemented by adding a new coherent field to the GMMU attrs struct. Jira GPUT19X-17 Bug 1651331 Bug 200283998 Change-Id: I5cfb71b5913d4db50ebf10331b19f5a4216456bf Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master/r/1514438 GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
* gpu: nvgpu: Implement PD packingAlex Waterman2017-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases page directories require less than a full page of memory. For example, on Pascal, the final PD level for large pages is only 256 bytes; thus 16 PDs can fit in a single page. To allocate an entire page for each of these 256 B PDs is extremely wasteful. This patch aims to alleviate the wasted DMA memory from having small PDs in a full page by packing multiple small PDs into a single page. The packing is implemented as a slab allocator - each page is a slab and from each page multiple PD instances can be allocated. Several modifications to the nvgpu_gmmu_pd struct also needed to be made to support this. The nvgpu_mem is now a pointer and there's an explicit offset into the nvgpu_mem struct so that each nvgpu_gmmu_pd knows what portion of the memory it's using. The nvgpu_pde_phys_addr() function and the pd_write() functions also require some changes since the PD no longer is always situated at the start of the nvgpu_mem. Initialization and cleanup of the page tables for each VM was slightly modified to work through the new pd_cache implementation. Some PDs (i.e the PDB), despite not being a full page, still require a full page for alignment purposes (HW requirements). Thus a direct allocation method for PDs is still provided. This is also used when a PD that could in principle be cached is greater than a page in size. Lastly a new debug flag was added for the pd_cache code. JIRA NVGPU-30 Change-Id: I64c8037fc356783c1ef203cc143c4d71bbd5d77c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master/r/1506610 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit
* gpu: nvgpu: gmmu programming rewriteAlex Waterman2017-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the high level mapping logic. Instead of iterating over the GPU VA iterate over the scatter-gather table chunks. As a result each GMMU page table update call is simplified dramatically. This also modifies the chip level code to no longer require an SGL as an argument. Each call to the chip level code will be guaranteed to be contiguous so it only has to worry about making a mapping from virt -> phys. This removes the dependency on Linux that the chip code currently has. With this patch the core GMMU code still uses the Linux SGL but the logic is highly transferable to a different, nvgpu specific, scatter gather list format in the near future. The last major update is to push most of the page table attribute arguments to a struct. That struct is passed on through the various mapping levels. This makes the funtions calls more simple and easier to follow. JIRA NVGPU-30 Change-Id: Ibb6b11755f99818fe642622ca0bd4cbed054f602 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master/r/1484104 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit
* gpu: nvgpu: Remove fmodel GMMU allocationAlex Waterman2017-06-27
| | | | | | | | | | | | | | | | | Remove the special cases for fmodel in the GMMU allocation code. There is no reason to treat fmodel any different than regular DMA memory. If there is no IOMMU the DMA api will handle that perfectly acceptably. JIRA NVGPU-30 Change-Id: Icceb832735a98b601b9f41064dd73a6edee29002 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master/r/1507562 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Separate GMMU mapping impl from mm_gk20a.cAlex Waterman2017-06-06
| | | | | | | | | | | | | | | Separate the non-chip specific GMMU mapping implementation code out of mm_gk20a.c. This puts all of the chip-agnostic code into common/mm/gmmu.c in preparation for rewriting it. JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: I6f7fdac3422703f5e80bb22ad304dc27bba4814d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1480228 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Begin removing variables in struct gk20aAlex Waterman2017-05-30
| | | | | | | | | | | | | | | | | | | | | | Begin removing all of the myriad flag variables in struct gk20a and replace that with one API that checks for flags being enabled or disabled. The API is as follows: bool nvgpu_is_enabled(struct gk20a *g, int flag); bool __nvgpu_set_enabled(struct gk20a *g, int flag, bool state); These APIs allow many of the gk20a flags to be replaced by defines. This makes flag usage consistent and saves a small amount of memory in struct gk20a. Also it makes struct gk20a easier to read since there's less clutter scattered through out. JIRA NVGPU-84 Change-Id: I6525cecbe97c4e8379e5f53e29ef0b4dbd1a7fc2 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1488049 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Refactor VM init/cleanupAlex Waterman2017-05-26
| | | | | | | | | | | | | | | | | Refactor the API for initializing and cleaning up VMs. This also involved moving a bunch of GMMU code out into the gmmu code since part of initializing a VM involves initializing the page tables for the VM. JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: I4710f08c26a6e39806f0762a35f6db5c94b64c50 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1477746 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Separate GMMU out of mm_gk20a.cAlex Waterman2017-05-11
Begin moving (and renaming) the GMMU code into common/mm/gmmu.c. This block of code will be responsible for handling the platform/OS independent GMMU operations. JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: Ide761bab75e5d84be3dcb977c4842ae4b3a7c1b3 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1464083 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>