path: root/drivers/gpu/nvgpu/gk20a/mm_gk20a.h
Commit message | Author | Age
...
* gpu: nvgpu: remove unused list in priv_cmd_entry (Deepak Nibade, 2017-04-19)
  The list_head "list" in struct priv_cmd_entry is unused, so remove it.
  Jira NVGPU-13 Change-Id: I268998d84a43d2e22fdfae53a9c7795b20fcf93f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1462081 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Add wrapper nvgpu/kref.h (Deepak Nibade, 2017-04-17)
  Add wrapper header file nvgpu/kref.h. On Linux it #includes <linux/kref.h>.
  JIRA NVGPU-13 Change-Id: Ib8b002268b1960646986551ecb9f286e1e21e7f6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1463770 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
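  Such a wrapper is typically just a thin forwarding header. A minimal sketch of the shape nvgpu/kref.h could take on Linux (the include-guard name below is an assumption, not taken from the commit):

      /* Sketch only: OS-abstraction wrapper that forwards to the
       * Linux kref implementation on Linux builds. */
      #ifndef __NVGPU_KREF_H__
      #define __NVGPU_KREF_H__

      #include <linux/kref.h>

      #endif /* __NVGPU_KREF_H__ */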
* gpu: nvgpu: remove unused variable (Deepak Nibade, 2017-04-12)
  The list node unmap_list in struct mapped_buffer_node is unused, so remove it.
  Jira NVGPU-13 Change-Id: Ie81a30691e45095ac72bd9ab8a8ebb88d5ea47c1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1460577 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: use nvgpu list for buffer states (Deepak Nibade, 2017-04-10)
  Use the nvgpu list APIs instead of the Linux list APIs to store buffer states in gk20a_dmabuf_priv.
  Jira NVGPU-13 Change-Id: I9666b2435804b132bb86bb74c0b20590749b153f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1454689 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
* gpu: nvgpu: Move DMA API to dma.h (Alex Waterman, 2017-04-06)
  Make an nvgpu DMA API include file so that the intricacies of the Linux DMA API can be hidden from the calling code. Also document the nvgpu DMA API.
  JIRA NVGPU-12 Change-Id: I7578e4c726ad46344b7921179d95861858e9a27e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1323326 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
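  The point of such a header is that common code never touches dma_alloc_coherent() directly. A rough sketch of the idea, using hypothetical wrapper names rather than the actual nvgpu dma.h API:

      #include <linux/dma-mapping.h>

      /* Illustrative wrappers that keep the Linux DMA API out of common code. */
      static int example_dma_alloc_sys(struct device *dev, size_t size,
                                       void **cpu_va, dma_addr_t *iova)
      {
              *cpu_va = dma_alloc_coherent(dev, size, iova, GFP_KERNEL);
              return *cpu_va ? 0 : -ENOMEM;
      }

      static void example_dma_free_sys(struct device *dev, size_t size,
                                       void *cpu_va, dma_addr_t iova)
      {
              dma_free_coherent(dev, size, cpu_va, iova);
      }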
* gpu: nvgpu: rename mem_desc to nvgpu_mem (Alex Waterman, 2017-04-06)
  Renaming was done with the following command:

      $ find -type f | \
          xargs sed -i 's/struct mem_desc/struct nvgpu_mem/g'

  Also rename mem_desc.[ch] to nvgpu_mem.[ch].
  JIRA NVGPU-12 Change-Id: I69395758c22a56aa01e3dffbcded70a729bf559a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1325547 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Rename gk20a_mem_* functions (Alex Waterman, 2017-04-06)
  Rename the functions used for mem_desc access to nvgpu_mem_*.
  JIRA NVGPU-12 Change-Id: Ibfdc1112d43f0a125e4487c250e3f977ffd2cd75 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1323325 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Split mem_desc from MM code (Alex Waterman, 2017-04-06)
  Split the mem_desc code out from the MM code. This is to help simplify the MM code and make it easier to abstract the DMA allocation routines.
  JIRA NVGPU-12 Change-Id: I2ccb643efe6bbed80d1360a580ff5593acb407bd Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1323324 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: use nvgpu rbtree to store mapped buffers (Deepak Nibade, 2017-04-06)
  Use the nvgpu rbtree instead of the Linux rbtree to store mapped buffers for each VM. Move to "struct nvgpu_rbtree_node" instead of "struct rb_node", and similarly use the rbtree APIs from <nvgpu/rbtree.h> instead of the Linux APIs.
  Jira NVGPU-13 Change-Id: Id96ba76e20fa9ecad016cd5d5a6a7d40579a70f2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1453043 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Remove vmalloc.h and slab.h usage (Alex Waterman, 2017-04-04)
  Remove all usage of vmalloc.h and slab.h outside of the Linux-specific kmem API implementation code.
  Bug 1799159 Bug 1823380 Change-Id: I5b2a91bd1057b272efeaddc24902f6133b35024f Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1331703 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: add teardown_ch_tsg fifo ops (Seema Khowala, 2017-04-04)
  Add teardown_ch_tsg fifo ops, since the t19x s/w recovery procedure is different from that of legacy chips.
  JIRA GPUT19X-7 Change-Id: I5b88f2c1a19d309e5c97c588ddf9689163a75fea Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: http://git-master/r/1327932 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: use nvgpu list for VA lists (Deepak Nibade, 2017-04-03)
  Use nvgpu list APIs instead of Linux list APIs for the reserved VA list and the buffer VA list.
  Jira NVGPU-13 Change-Id: I83c02345d54bca03b00270563567227510cfce6b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1454013 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: use nvgpu list for vidmem clear list (Deepak Nibade, 2017-04-03)
  Use nvgpu list APIs instead of Linux list APIs for the vidmem clear list.
  Jira NVGPU-13 Change-Id: I13f7c5a54fb199d15ad1402216f3275f0f0474af Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1454012 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Split out pramin code (Alex Waterman, 2017-03-31)
  Split out the pramin interface code in preparation for splitting out the mem_desc code.
  JIRA NVGPU-12 Change-Id: I3f03447ea213cc15669b0934fa706e7cb22599b7 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1323323 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Use new kmem API functions (gk20a mm) (Alex Waterman, 2017-03-30)
  Use the new kmem API functions in gk20a's mm code. Add a struct gk20a pointer to the dmabuf priv struct so that the cleanup function has access to the gk20a struct. Also add a gk20a pointer to some of the sg table functions so that they can use the nvgpu kmem APIs.
  Bug 1799159 Bug 1823380 Change-Id: I85a307c6bf862627c5b1af0e077283b48d78fa72 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1318321 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Use nvgpu_timeout for all loops (Terje Bergstrom, 2017-03-27)
  There were still a few remaining loops where we did not use nvgpu_timeout and required Tegra-specific functions for detecting whether the timeout should be skipped. Replace all of them with nvgpu_timeout and remove including chip-id.h where possible. The FE power mode timeout loop also used a wrong delay value: it always waited for the whole max timeout instead of looping with smaller increments. If SEC2 ACR boot fails to halt, we should not try to check the ACR result from the mailbox; add an early return for that case.
  JIRA NVGPU-16 Change-Id: I9f0984250d7d01785755338e39822e6631dcaa5a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1323227
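  The common replacement is a poll loop bounded by a struct nvgpu_timeout. A rough sketch of the pattern, assuming the nvgpu timeout header is available; the condition helper, delay value, and exact init flag are assumptions, not taken from this commit:

      /* Sketch of a bounded poll loop using nvgpu_timeout (signatures approximate). */
      static int example_poll(struct gk20a *g)
      {
              struct nvgpu_timeout timeout;

              nvgpu_timeout_init(g, &timeout, 2000 /* ms */, NVGPU_TIMER_CPU_TIMER);
              do {
                      if (example_condition_met(g))   /* hypothetical check */
                              return 0;
                      udelay(10);
              } while (!nvgpu_timeout_expired(&timeout));

              return -ETIMEDOUT;
      }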
* gpu: nvgpu: abstract away dma alloc attrs (Konsta Holtta, 2017-03-21)
  Don't use enum dma_attr in the gk20a_gmmu_alloc_attr* functions, but define nvgpu-internal flags for no kernel mapping, force contiguous, and read only modes. Store the flags in the allocated struct mem_desc and only use gk20a_gmmu_free; remove gk20a_gmmu_free_attr. This helps in OS abstraction. Rename the notion of attr to flags. Add an implicit NVGPU_DMA_NO_KERNEL_MAPPING to all vidmem buffers allocated via gk20a_gmmu_alloc_vid for consistency. Fix a bug in gk20a_gmmu_alloc_map_attr that dropped the attr parameter accidentally.
  Bug 1853519 Change-Id: I1ff67dff9fc425457ae445ce4976a780eb4dcc9f Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1321101 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
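  Only NVGPU_DMA_NO_KERNEL_MAPPING is named in the message above; a plausible sketch of the nvgpu-internal flag set it describes, with the other two names being assumptions:

      /* Sketch of OS-neutral allocation flags replacing enum dma_attr. */
      #define NVGPU_DMA_NO_KERNEL_MAPPING   (1U << 0)
      #define NVGPU_DMA_FORCE_CONTIGUOUS    (1U << 1)   /* assumed name */
      #define NVGPU_DMA_READ_ONLY           (1U << 2)   /* assumed name */

      /* The flags chosen at alloc time are stored in struct mem_desc so that a
       * single gk20a_gmmu_free() can undo whatever the alloc variant set up,
       * without a separate _attr free path. */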
* gpu: nvgpu: Move all FB programming to FB HAL (Terje Bergstrom, 2017-03-17)
  Move all programming of FB to fb_*.c files, and remove the inclusion of FB hardware headers from other files. The TLB invalidate function previously took a pointer to a VM, but the new API takes only a PDB mem_desc, because FB does not need to know about the higher-level VM. GPC MMU is programmed from the same function as FB MMU, so a dependency on the GR hardware header was added to FB. GP106 ACR was also triggering a VPR fetch, but that is not applicable to dGPU, so that call was removed.
  Change-Id: I4eb69377ac3745da205907626cf60948b7c5392a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1321516 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Remove unused function gk20a_get_phys_from_iova (Terje Bergstrom, 2017-03-14)
  Remove the unused function gk20a_get_phys_from_iova. At the same time remove the #include for iommu.h, which was only needed by this function.
  Change-Id: Ia858b0ad5fe7e423d650aa9f82e430f419f2a492 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1319070 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Give nvgpu_kalloc a less generic name (Alex Waterman, 2017-03-03)
  Change nvgpu_kalloc() to nvgpu_big_[mz]alloc(). This is necessary since the natural free function name for this is nvgpu_kfree(), but that conflicts with nvgpu_k[mz]alloc() (implemented in a subsequent patch). This API exists because not all allocation sizes can be determined at compile time and in some cases sizes may vary across the system page size. Thus always using kmalloc() could lead to OOM errors due to fragmentation, while always using vmalloc() is wasteful of memory for small allocations. This API tries to alleviate those problems.
  Bug 1799159 Bug 1823380 Change-Id: I49ec5292ce13bcdecf112afbb4a0cfffeeb5ecfc Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1283827 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
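  The underlying idea is a size-based choice between kmalloc() and vmalloc(). A minimal sketch of that choice; the threshold and helper names are illustrative, not the actual nvgpu_big_malloc() implementation:

      #include <linux/slab.h>
      #include <linux/vmalloc.h>
      #include <linux/mm.h>

      /* Sketch: small requests use kmalloc(), large ones vmalloc(), so neither
       * fragmentation nor per-page overhead dominates. */
      static void *example_big_malloc(size_t size)
      {
              if (size > PAGE_SIZE)
                      return vmalloc(size);
              return kmalloc(size, GFP_KERNEL);
      }

      static void example_big_free(void *p)
      {
              if (is_vmalloc_addr(p))
                      vfree(p);
              else
                      kfree(p);
      }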
* gpu: nvgpu: remove use of DEFINE_MUTEX() (Deepak Nibade, 2017-02-22)
  The API DEFINE_MUTEX() is defined in Linux and might not be available in other OSes, hence remove its usage from nvgpu. Declare and explicitly initialize the following mutexes for both nvgpu and vgpu: g->mm.priv_lock, g->mm.tlb_lock.
  Jira NVGPU-13 Change-Id: If72885a6da0227a1552303206172f1f2b751471d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1298042 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: use common nvgpu mutex/spinlock APIs (Deepak Nibade, 2017-02-22)
  Instead of using Linux APIs for mutexes and spinlocks directly, use the new APIs defined in <nvgpu/lock.h>. Replace the Linux-specific mutex/spinlock declaration, init, lock, and unlock APIs with the new APIs; e.g. struct mutex is replaced by struct nvgpu_mutex and mutex_lock() is replaced by nvgpu_mutex_acquire(). Also include <nvgpu/lock.h> instead of including <linux/mutex.h> and <linux/spinlock.h>. Add explicit nvgpu/lock.h includes to the files below to fix compilation failures: gk20a/platform_gk20a.h, include/nvgpu/allocator.h.
  Jira NVGPU-13 Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1293187 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
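  The converted locking pattern looks roughly as follows; nvgpu_mutex_release() and nvgpu_mutex_init() are assumed to accompany the nvgpu_mutex_acquire() named above:

      #include <nvgpu/lock.h>

      /* Sketch of the converted locking pattern. */
      static void example_locked_section(struct nvgpu_mutex *lock)
      {
              nvgpu_mutex_acquire(lock);    /* was: mutex_lock(lock) */
              /* ... critical section ... */
              nvgpu_mutex_release(lock);    /* was: mutex_unlock(lock) */
      }
      /* The lock itself is a struct nvgpu_mutex (was: struct mutex) and is
       * initialized explicitly, e.g. via nvgpu_mutex_init(), instead of
       * DEFINE_MUTEX(). */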
* gpu: nvgpu: Move from gk20a_ to nvgpu_ in semaphore code (Alex Waterman, 2017-02-13)
  Change the prefix in the semaphore code to 'nvgpu_' since this code is global to all chips.
  Bug 1799159 Change-Id: Ic1f3e13428882019e5d1f547acfe95271cc10da5 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1284628 Reviewed-by: Varun Colbert <vcolbert@nvidia.com> Tested-by: Varun Colbert <vcolbert@nvidia.com>
* gpu: nvgpu: Organize semaphore_gk20a.[ch] (Alex Waterman, 2017-02-13)
  Move semaphore_gk20a.c to drivers/gpu/nvgpu/common/, since the semaphore code is common to all chips. Move the semaphore_gk20a.h header file to drivers/gpu/nvgpu/include/nvgpu and rename it to semaphore.h. Also update all places where the header is included to use the new path. This revealed an odd location for the enum gk20a_mem_rw_flag: it should be in the mm headers, and as a result many places that did not need anything semaphore related had to include the semaphore header file. Fixing this oddity allowed the semaphore include to be removed from many C files that did not need it.
  Bug 1799159 Change-Id: Ie017219acf34c4c481747323b9f3ac33e76e064c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1284627 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Simplify ref-counting on VMs (Alex Waterman, 2017-02-07)
  Simplify ref-counting on VMs: take a ref when a VM is bound to a channel and drop a ref when a channel is freed. Previously ref-counts were scattered over the driver. Also, the CE and CDE code would bind channels with custom-rolled code. This was because the gk20a_vm_bind_channel() function took an as_share as the VM argument (the VM was then inferred from that as_share). However, it is trivial to abstract that bit out and allow a central bind-channel function that just takes a VM and a channel.
  Bug 1846718 Change-Id: I156aab259f6c7a2fa338408c6c4a3a464cd44a0c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1261886 Reviewed-by: Richard Zhao <rizhao@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
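  Conceptually this reduces to a single get/put pair. A sketch of the idea using the kernel's kref; the ref field, the release callback, and the exact struct members shown are assumptions for illustration:

      #include <linux/kref.h>

      /* Sketch: one VM reference per bound channel, dropped when the channel dies. */
      static void example_vm_release(struct kref *ref)
      {
              /* hypothetical: tear down the VM once the last user is gone */
      }

      static void example_bind_channel(struct vm_gk20a *vm,
                                       struct channel_gk20a *ch)
      {
              kref_get(&vm->ref);     /* assumed kref member in vm_gk20a */
              ch->vm = vm;
      }

      static void example_channel_free(struct channel_gk20a *ch)
      {
              kref_put(&ch->vm->ref, example_vm_release);
              ch->vm = NULL;
      }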
* gpu: nvgpu: Conditional address space unification (Alex Waterman, 2017-01-31)
  Allow platforms to choose whether or not to have unified GPU VA spaces. This is useful for the dGPU, where having a unified address space has no problems. On iGPUs, testing issues are getting in the way of enabling this feature.
  Bug 1396644 Bug 1729947 Change-Id: I65985f1f9a818f4b06219715cc09619911e4824b Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1265303 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Remove separate fixed address VMA (Alex Waterman, 2017-01-31)
  Remove the special VMA that could be used for allocating fixed addresses. This feature was never used and is not worth maintaining.
  Bug 1396644 Bug 1729947 Change-Id: I06f92caa01623535516935acc03ce38dbdb0e318 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1265302 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Unify the small and large page address spaces (Alex Waterman, 2017-01-31)
  The basic structure of this patch is to make the small page allocator and the large page allocator into pointers (where they used to be just structs), then assign each of those pointers to the same actual allocator, since the buddy allocator has supported mixed page sizes since its inception. For the rest of the driver some changes had to be made in order to actually support mixed pages in a single address space.

  1. Unifying the allocation page size determination
     Since the allocation and map operations happen at distinct times, both mapping and allocation of GVA space must agree on page size. This is because the allocation has to separate allocations into separate PDEs to avoid the necessity of supporting mixed PDEs. To this end a function __get_pte_size() was introduced, which is used both by the balloc code and the core GPU MM code. It determines page size based only on the length of the mapping/allocation (a simplified sketch follows this entry).

  2. Fixed address allocation + page size
     Similar to regular mappings/GVA allocations, fixed address mapping page size determination had to be modified. In the past the address of the mapping determined page size, since the address space split was by address (low addresses were small pages, high addresses large pages). Since that is no longer the case, the page size field in the reserve memory ioctl is now honored by the mapping code. When, for instance, CUDA makes a memory reservation it specifies small or large pages. When CUDA requests mappings to be made within that address range, the page size is then looked up in the reserved memory struct. Fixed address reservations were also modified to now always allocate at a PDE granularity (64M or 128M depending on large page size). This prevents non-fixed allocations from ending up in the same PDE and causing kernel panics or GMMU faults.

  3. The rest
     The rest of the changes are just by-products of the above. Lots of places required minor updates to use a pointer to the GVA allocator struct instead of the struct itself.

  Lastly, this change is not truly complete. More work remains to be done in order to fully remove the notion that there was such a thing as separate address spaces for different page sizes. Basically, after this patch what remains is cleanup and proper documentation.
  Bug 1396644 Bug 1729947 Change-Id: If51ab396a37ba16c69e434adb47edeef083dce57 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1265300 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
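  As described in item 1 above, the page size choice now depends only on the requested length. A simplified sketch of what a length-based __get_pte_size() amounts to; the struct fields and the exact policy are illustrative, not the committed implementation:

      /* Sketch: choose the GMMU page size from the mapping length alone. */
      static enum gmmu_pgsz_gk20a example_get_pte_size(struct vm_gk20a *vm,
                                                       u64 size)
      {
              if (!vm->big_pages)                     /* assumed flag */
                      return gmmu_page_size_small;

              if (size >= vm->big_page_size)          /* assumed field */
                      return gmmu_page_size_big;

              return gmmu_page_size_small;
      }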
* gpu: nvgpu: Remove circular dependency in PMU includes (Terje Bergstrom, 2017-01-27)
  Remove including gk20a.h from pmu_gk20a.h. This causes a fallout, as some #includes were missing: gr_gp10b.h uses mem_desc but did not include mm_gk20a.h, so add the include. Including mm_gk20a.h in gr_gp10b.h causes a recursive include, as mm_gk20a.h has some gr defines; move the defines to gr_gk20a.h to remove the dependency. gr_ctx_gk20a.h used struct gk20a pointers but did not forward declare it, so add a forward declaration. gr_gk20a.h uses dbg_session_gk20a but was missing a forward declaration. gr_gk20a.h did not include nvgpu.h but uses preemption types from that header, so add the include.
  Change-Id: I2168e2303b55e0d187b816bcb26f37c8af1649ba Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1283717 Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
* gpu: nvgpu: use soc/tegra/chip-id.h for soc header (Shardar Shariff Md, 2017-01-20)
  The Tegra SoC headers are unified: all content of linux/tegra-soc.h is moved to soc/tegra/chip-id.h so that there is a single SoC header for Tegra.
  Change-Id: I281e19dd3eb1538b8dfbea4eb0779fb64d1fcffa Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com> Reviewed-on: http://git-master/r/1288365 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Move allocators to common/mm/ (Alex Waterman, 2017-01-09)
  Move the GPU allocators to common/mm/ since the allocators are common code across all GPUs. Also rename the allocator code to move away from gk20a_-prefixed structs and functions. This caused one issue with the nvgpu_alloc() and nvgpu_free() functions: there was a function for allocating either with kmalloc() or vmalloc() depending on the size of the allocation. Those have now been renamed to nvgpu_kalloc() and nvgpu_kfree().
  Bug 1799159 Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1274400 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: make preemption modes unsigned (Deepak Nibade, 2016-12-21)
  Graphics and compute preemption modes are currently defined as int, but it is more logical to have them as unsigned int; we also treat preemption modes as unsigned almost everywhere in the code. Fix prints in gk20a_fifo_sched_debugfs_seq_show() to print U32_MAX with %d, which is the same as printing -1.
  Bug 200263471 Change-Id: Iabd0ee3923b76d81620898e90a9b1fc5dd75b530 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1272514 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: copy data into channel context header (Seshendra Gadagottu, 2016-12-20)
  If the channel context has a separate context header, copy the required info into that context header instead of the main context header.
  JIRA GV11B-21 Change-Id: I5e0bdde132fb83956fd6ac473148ad4de498e830 Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1229243 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: Use end of vidmem as bootstrap region (Terje Bergstrom, 2016-12-09)
  Instead of hard coding the bootstrap region, it should always be set to the last 256MB of vidmem.
  Bug 200244445 Change-Id: I91779d1bf861f4f23a0b646f70b1febbbc4581b5 Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: http://git-master/r/1242409 Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-on: http://git-master/r/1267124 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: API to access fb memory (Deepak Nibade, 2016-11-30)
  Add IOCTL API NVGPU_DBG_GPU_IOCTL_ACCESS_FB_MEMORY to read/write fb/vidmem memory. The interface accepts the dmabuf_fd of the buffer in vidmem, the offset into the buffer to access, a temporary buffer to copy data across the API, the size of the read/write, and a command indicating either a read or a write operation. The API first parses all the inputs and then calls gk20a_vidbuf_access_memory() to complete the fb access. gk20a_vidbuf_access_memory() then just uses gk20a_mem_rd_n() or gk20a_mem_wr_n() depending on the command issued.
  Bug 1804714 Jira DNVGPU-192 Change-Id: Iba3c42410abe12c2884d3b603fa33d27782e4c56 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1255556 (cherry picked from commit 2c49a8a79d93fc526adbf6f808484fa9a3fa2498) Reviewed-on: http://git-master/r/1260471 GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
* Revert "Revert "gpu: nvgpu: vgpu: alloc hwpm ctxt buf on client""Peter Daifuku2016-11-14
| | | | | | | | | | | | | | | | | This reverts commit 5f1c2bc27fb9dd66ed046b0590afc365be5011bf. Added back now that matching RM server has been updated: In hypervisor mode, all GPU VA allocations must be done by client; fix this for the allocation of the hwpm ctxt buffer Bug 200231611 Change-Id: Ie5ce2c2562401b1f00821231d37608e3fc30d4a4 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1252138 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* Revert "gpu: nvgpu: vgpu: alloc hwpm ctxt buf on client"Sivaram Nair2016-11-04
| | | | | | | | | | This reverts commit 57821e215756b3df7acc9c0eb5017e39f141d381. Change-Id: Ic4801115064ccbcd1435298a61871921d056b8ea Signed-off-by: Sivaram Nair <sivaramn@nvidia.com> Reviewed-on: http://git-master/r/1247825 Reviewed-by: Rakesh Babu Bodla <rbodla@nvidia.com> Tested-by: Rakesh Babu Bodla <rbodla@nvidia.com>
* gpu: nvgpu: vgpu: alloc hwpm ctxt buf on client (Peter Daifuku, 2016-11-03)
  In hypervisor mode, all GPU VA allocations must be done by the client; fix this for the allocation of the hwpm ctxt buffer.
  Bug 200231611 Change-Id: I0270b1298308383a969a47d0a859ed53c20594ef Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1240913 (cherry picked from commit 49314d42b13e27dc2f8c1e569a8c3e750173148d) Reviewed-on: http://git-master/r/1245867 (cherry picked from commit d0b10e84d90d0fd61eca8be0f9e879d9cec71d3e) Reviewed-on: http://git-master/r/1246700 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Move CE cleanup (Alex Waterman, 2016-10-26)
  Move the CE cleanup to before the FIFO cleanup. Since the CE closes a channel during its cleanup, the FIFO must still be initialized at that point, because the FIFO code maintains the vmalloc()'ed channels.
  Bug 1816516 Change-Id: Ia7a97059a12a0c2b52368ffe411e597f803e8e6e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1225613 (cherry picked from commit 707bd2a6d4672c6a7b7a8b2e581ea3a606ed971d) Reviewed-on: http://git-master/r/1240106 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: use inplace allocation in sync framework (Sachit Kadle, 2016-10-20)
  This change is the first of a series of changes to support the usage of pre-allocated job tracking resources in the submit path. With this change, we still maintain a dynamically-allocated joblist, but make the necessary changes in the channel_sync & fence framework to use in-place allocations. Specifically, we:

  1) Update channel sync framework routines to take in pre-allocated priv_cmd_entry(s) & gk20a_fence(s) rather than dynamically allocating them themselves
  2) Move allocation of priv_cmd_entry(s) & gk20a_fence(s) to gk20a_submit_prepare_syncs
  3) Modify the fence framework to have separate allocation and init APIs. We expose allocation as a separate API, so the client can allocate the object before passing it into the channel sync framework.
  4) Fix clean_up logic in the channel sync framework

  Bug 1795076 Change-Id: I96db457683cd207fd029c31c45f548f98055e844 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1206725 (cherry picked from commit 9d196fd10db6c2f934c2a53b1fc0500eb4626624) Reviewed-on: http://git-master/r/1223933 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: allow skipping pramin barriers (Konsta Holtta, 2016-10-17)
  A wmb() next to each gk20a_mem_wr32() via PRAMIN may be overly careful, so support not inserting these barriers, for performance, in cases where they are not necessary, where the caller would do an explicit barrier after a bunch of reads. Also, move those optional wmb()s to be done at the end of the whole internally batched write for gk20a_mem_{wr_n,memset}, from the per-batch subloops that may run multiple times.
  Jira DNVGPU-23 Change-Id: I61ee65418335863110bca6f036b2e883b048c5c2 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1225149 (cherry picked from commit d2c40327d1995f76e8ab9cb4cd8c76407dabc6de) Reviewed-on: http://git-master/r/1227474 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
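  The batching idea is to pay for the barrier once per run of writes instead of once per word. A rough sketch of the caller-side pattern; the helper name and word-offset convention are illustrative:

      /* Sketch: write a run of words through PRAMIN, then issue one wmb(). */
      static void example_write_words(struct gk20a *g, struct mem_desc *mem,
                                      u32 start_w, const u32 *data, u32 n)
      {
              u32 i;

              for (i = 0; i < n; i++)
                      gk20a_mem_wr32(g, mem, start_w + i, data[i]);

              wmb();  /* single barrier covering the whole batch */
      }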
* gpu: nvgpu: add ioctl for querying memory state (Konsta Holtta, 2016-10-14)
  Add NVGPU_GPU_IOCTL_GET_MEMORY_STATE to read the amount of free device-local video memory, if applicable. Some reserved fields are added to support different types of queries in the future (e.g. context-local free amount).
  Bug 1787771 Bug 200233138 Change-Id: Id5ffd02ad4d6ed3a6dc196541938573c27b340ac Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1223762 (cherry picked from commit 96221d96c7972c6387944603e974f7639d6dbe70) Reviewed-on: http://git-master/r/1235980 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: track pending bytes for vidmem clears (Konsta Holtta, 2016-10-14)
  Change clears_pending to bytes_pending and track accordingly the number of bytes to be freed instead of the number of buffers. This, atomically combined with the amount of space in the allocator, is the total amount of free memory available.
  Bug 200233138 Change-Id: Ibbb4e80a32728781ba19a74307d8a8ac1a4d7431 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1231422 (cherry picked from commit 025e765f312c253b201ecf2dbbe0f4972fe1d4bc) Reviewed-on: http://git-master/r/1235957 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
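  The counter only needs simple atomic add/sub updates from the free and clear paths. A minimal sketch of that bookkeeping; the field placement and helper names are illustrative:

      #include <linux/atomic.h>

      /* Sketch: bytes queued for clearing/freeing, updated atomically. */
      static atomic64_t example_bytes_pending;

      static void example_queue_free(size_t size)
      {
              atomic64_add(size, &example_bytes_pending);
      }

      static void example_free_done(size_t size)
      {
              atomic64_sub(size, &example_bytes_pending);
      }

      /* Free-memory query: allocator free space + atomic64_read(&example_bytes_pending). */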
* gpu: nvgpu: mark vidmem.cleared volatile (Konsta Holtta, 2016-10-14)
  The boolean flag mm_gk20a.vidmem.cleared is shared across threads, so mark it volatile to prevent the compiler from wrongly optimizing accesses to it.
  Jira DNVGPU-84 Change-Id: I1fe66b26966685d3f74ed95ba53b198f810231b9 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1233016 (cherry picked from commit dc6c9db56ea8a5f55f28f97fdfc3c1ac60d8b195) Reviewed-on: http://git-master/r/1235317 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: compact pte buffers (Konsta Holtta, 2016-10-13)
  The lowest page table level may hold very few entries for mappings of large pages, but a new page is allocated for each list of entries at the lowest level, wasting memory and performance. Compact these so that the new "allocation" of ptes is appended at the end of the previous allocation, if there is space. A 4 KB page is still the smallest size requested from the allocator; any possible overhead in the allocator (e.g., internally allocating big pages only) is not taken into account.
  Bug 1736604 Change-Id: I03fb795cbc06c869fcf5f1b92def89a04583ee83 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1221841 (cherry picked from commit fa92017ed48e1d5f48c1a12c512641c6ce9924af) Reviewed-on: http://git-master/r/1234996 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: make sure vidmem is cleared only once (Konsta Holtta, 2016-10-12)
  Protect the initial vidmem zeroing performed during the first userspace alloc with a mutex, so that it blocks next concurrent users and is run only once. Otherwise, multiple clears could end up running in parallel, so that the next ones corrupt memory allocated by the thread that has finished earlier and advanced to allocate and use memory.
  Jira DNVGPU-84 Change-Id: If497749abf481b230835250191d011c4a9d1483b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1232461 (cherry picked from commit 79435a68e6d2713b78acdb0ec6f77cfd78651d7f) Reviewed-on: http://git-master/r/1234990 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
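  The guard is a classic check-under-lock, run-once pattern. A sketch of the shape it takes, where the mutex and helper names are assumptions (only vidmem.cleared is named in this log):

      static int example_do_full_clear(struct mm_gk20a *mm);   /* hypothetical */

      /* Sketch: the first caller clears all of vidmem, later callers skip. */
      static int example_vidmem_clear_all_once(struct mm_gk20a *mm)
      {
              int err = 0;

              mutex_lock(&mm->vidmem.first_clear_mutex);        /* assumed name */
              if (!mm->vidmem.cleared) {
                      err = example_do_full_clear(mm);
                      if (!err)
                              mm->vidmem.cleared = true;
              }
              mutex_unlock(&mm->vidmem.first_clear_mutex);

              return err;
      }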
* gpu: nvgpu: test free user vidmem atomically (Konsta Holtta, 2016-09-14)
  An empty list of soon-to-be-freed userspace vidmem buffers is not enough to safely assume that an allocation may succeed or not if tried again, because removal from the list and actually marking the memory freed is not atomic. Fix this by using an atomic counter for the number of pending frees (so that it's still safe to first remove from the job list and then perform the free), and by making allocation attempts combined with a test of pending frees atomic. This still does not guarantee that there is memory available (as the actual amount of pending memory in bytes plus the current free amount isn't computed), but it removes the race that produces false negatives in case a single program expects repeated frees and allocs to succeed.
  Bug 1809939 Change-Id: I6a92da2e21cbf3f886b727000c924d56f35ce55b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1217078 (cherry picked from commit 83c1f1e70dccd92fdd4481132cf5b6717760d432) Reviewed-on: http://git-master/r/1220545 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: add new API to get base address for sysmem/vidmem buffers (Deepak Nibade, 2016-09-01)
  Add a new API gk20a_mem_get_base_addr() which returns the vidmem base address in case of vidmem and the IOVA address in case of sysmem. Even though vidmem allocations are non-contiguous, this API is useful (and should only be used) for allocations with one chunk (e.g. page tables). Also, since page tables could reside either in sysmem or vidmem, use this API to get the address of page tables.
  Jira DNVGPU-20 Change-Id: Ie04af9ca7bfccfec1a8a8e4be2c507cef5cef8e1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1206403 (cherry picked from commit a8c74dc188878f2948fa1e0e47bf1837fba6c5e0) Reviewed-on: http://git-master/r/1210957 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: track allocator and user for each mem (Deepak Nibade, 2016-09-01)
  Store an allocator pointer for each mem_desc; this pointer should be used while freeing the mem instead of assuming a common allocator. Add a flag user_mem to mem_desc which is set only in case of user vidmem allocations. We delay the free of a mem in the worker only if this flag is set on the mem; otherwise, we free it immediately. This is needed so that all kernel allocations can work with both sysmem and vidmem.
  Jira DNVGPU-84 Change-Id: Ib9a9209b164bc56b7880448f86bd6d42b324cc86 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1203099 (cherry picked from commit 8f0b0122f36a0b6f1932fa9a98d7eb03b1f623d1) Reviewed-on: http://git-master/r/1210953 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
* gpu: nvgpu: clear vidmem buffers in worker (Deepak Nibade, 2016-09-01)
  We clear buffers allocated in vidmem in the buffer free path. But to clear buffers we need to submit CE jobs, and this could cause issues/races if the free is called from a critical path. Hence solve this by moving the buffer clear/free to a worker: gk20a_gmmu_free_attr_vid() now just puts the mem_desc into a list and schedules a worker, and the worker thread traverses the list and clears/frees the allocations. In struct gk20a_vidmem_buf, the mem variable was statically allocated; since we now delay the free of mem, convert this variable into a pointer and allocate it dynamically. Since we delay the free of vidmem memory, it is now possible to face OOM conditions during allocations; hence, while allocating, block until sufficient memory is available, with an upper limit of 1 second.
  Jira DNVGPU-84 Change-Id: I7925590644afae50b6fc04c6e1e43bbaa1c220fd Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1201346 (cherry picked from commit b4dec4a30de2431369d677acca00e420f8e581a5) Reviewed-on: http://git-master/r/1210950 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
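  The deferred-free scheme is queue-and-schedule on the free path, with the actual clear done later in process context. A rough sketch of that split, where the list, lock, and worker member names are illustrative, not the committed field names:

      #include <linux/list.h>
      #include <linux/workqueue.h>

      /* Sketch: the free path only queues the buffer and kicks the worker. */
      static void example_queue_vidmem_free(struct mm_gk20a *mm,
                                            struct mem_desc *mem)
      {
              mutex_lock(&mm->vidmem.clear_list_mutex);       /* assumed */
              list_add_tail(&mem->clear_list_entry,           /* assumed */
                            &mm->vidmem.clear_list_head);
              mutex_unlock(&mm->vidmem.clear_list_mutex);

              schedule_work(&mm->vidmem.clear_mem_worker);    /* assumed */
      }

      static void example_clear_mem_worker(struct work_struct *work)
      {
              /* Drain the list: CE-clear each buffer, then free its pages
               * and the dynamically allocated mem_desc. */
      }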