path: root/drivers/gpu/nvgpu/gk20a/mm_gk20a.c
Commit message    Author    Age
* gpu: nvgpu: fix gk20a_mm_smmu_vaddr_translate()  Richard Zhao  2016-07-18

    - remove checking of has_physical_mode
    - check whether get_physical_addr_bits is null

    JIRA VFND-1965
    Change-Id: If19b297dc853b9e0b5879c5b2e0a350b5d9b279a
    Signed-off-by: Richard Zhao <rizhao@nvidia.com>
    Reviewed-on: http://git-master/r/1175738
    Reviewed-by: Automatic_Commit_Validation_User
    Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
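A minimal standalone sketch of the pattern this change moves to: instead of consulting a has_physical_mode capability flag, the translation helper checks whether the per-chip get_physical_addr_bits hook is populated. Only the hook name comes from the commit message; the surrounding types, the example bit, and main() are simplified and hypothetical.

    #include <stdint.h>
    #include <stdio.h>

    struct gk20a;

    /* Per-chip MM hooks; only the one relevant here is shown. */
    struct mm_ops {
        uint64_t (*get_physical_addr_bits)(struct gk20a *g, uint64_t addr);
    };

    struct gk20a {
        struct mm_ops mm;
    };

    /*
     * If the chip provides no get_physical_addr_bits hook, the address
     * needs no adjustment and is returned unchanged; otherwise the hook
     * decides which aperture bits to fold in.
     */
    static uint64_t mm_smmu_vaddr_translate(struct gk20a *g, uint64_t addr)
    {
        if (!g->mm.get_physical_addr_bits)
            return addr;
        return g->mm.get_physical_addr_bits(g, addr);
    }

    /* Example hook: sets an arbitrary high bit (purely illustrative). */
    static uint64_t example_physical_addr_bits(struct gk20a *g, uint64_t addr)
    {
        (void)g;
        return addr | (1ULL << 34);
    }

    int main(void)
    {
        struct gk20a with_hook = { { example_physical_addr_bits } };
        struct gk20a without_hook = { { 0 } };

        printf("%#llx\n",
               (unsigned long long)mm_smmu_vaddr_translate(&with_hook, 0x1000));
        printf("%#llx\n",
               (unsigned long long)mm_smmu_vaddr_translate(&without_hook, 0x1000));
        return 0;
    }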
* gpu: nvgpu: handle map/unmap for vidmem gmmu pages  Konsta Holtta  2016-07-14

    If page tables are allocated from vidmem, cpu cache flushing doesn't
    make sense, so skip it. Also unify the map/unmap actions for the case
    where the pages are not mapped.

    Jira DNVGPU-20
    Change-Id: I36b22749aab99a7bae26c869075f8073eab0f860
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1178830
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
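A hedged sketch of the conditional described above, using placeholder types and helper names: CPU cache maintenance is only meaningful for sysmem-backed, CPU-mapped page-table memory, so vidmem-backed (or unmapped) tables skip it.

    #include <stddef.h>
    #include <stdio.h>

    enum aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

    struct mem_desc {
        enum aperture aperture;
        void *cpu_va;   /* NULL when not CPU-mapped */
        size_t size;
    };

    /* Placeholder for the real CPU cache maintenance call. */
    static void cpu_cache_flush(void *va, size_t size)
    {
        printf("flushing %zu bytes at %p\n", size, va);
    }

    /*
     * After the CPU has written page-table entries, flush only when the
     * backing memory is sysmem and actually CPU-mapped; vidmem-backed
     * tables are written through the GPU-side window, so there is
     * nothing to flush.
     */
    static void pd_cache_flush(struct mem_desc *pd)
    {
        if (pd->aperture == APERTURE_VIDMEM || pd->cpu_va == NULL)
            return;
        cpu_cache_flush(pd->cpu_va, pd->size);
    }

    int main(void)
    {
        char backing[4096];
        struct mem_desc sys = { APERTURE_SYSMEM, backing, sizeof(backing) };
        struct mem_desc vid = { APERTURE_VIDMEM, NULL, 4096 };

        pd_cache_flush(&sys);   /* flushes */
        pd_cache_flush(&vid);   /* skipped */
        return 0;
    }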
* gpu: nvgpu: zero vidmem pages on allocation  Konsta Holtta  2016-07-12

    The allocator doesn't give us empty pages, so make sure that they're
    full of zeros, just like the sysmem alloc path does.

    Jira DNVGPU-16
    Change-Id: I0ff8a0718829b13973535ba1111a8a11b91be04d
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1178829
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
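A rough illustration of the zero-fill idea under simplified, hypothetical types: since a vidmem buffer has no plain CPU pointer to memset(), the scrub goes through a word-granular writer (a stand-in for gk20a_mem_wr32()).

    #include <stddef.h>
    #include <stdint.h>

    struct gk20a;
    struct mem_desc {
        size_t size;        /* bytes */
        uint32_t *backing;  /* stand-in for whatever backs the buffer */
    };

    /* Stand-in for gk20a_mem_wr32(): write one 32-bit word of the buffer. */
    static void mem_wr32(struct gk20a *g, struct mem_desc *mem,
                         size_t word, uint32_t val)
    {
        (void)g;
        mem->backing[word] = val;
    }

    /*
     * The vidmem allocator hands back whatever happened to be in memory,
     * so a fresh allocation is scrubbed to zero, matching the sysmem path
     * (which gets zeroed pages from the DMA API).
     */
    static void zero_new_alloc(struct gk20a *g, struct mem_desc *mem)
    {
        size_t w;

        for (w = 0; w < mem->size / sizeof(uint32_t); w++)
            mem_wr32(g, mem, w, 0);
    }

    int main(void)
    {
        uint32_t backing[8] = { 0xdeadbeef, 1, 2, 3, 4, 5, 6, 7 };
        struct mem_desc mem = { sizeof(backing), backing };

        zero_new_alloc(NULL, &mem);
        return (int)backing[0];    /* now 0 */
    }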
* gpu: nvgpu: use vidmem by default in gmmu_alloc variants  Konsta Holtta  2016-07-08

    For devices that have vidmem available, use the vidmem allocator in
    gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem.

    Because not all of the buffers have been tested to work in vidmem yet,
    rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at
    the end to declare explicitly that sysmem is used. Enabling vidmem for
    each of these is then just a matter of removing "_sys" from the
    function call.

    Jira DNVGPU-18
    Change-Id: Ibe42f67eff2c2b68c36582e978ace419dc815dc5
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1176805
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
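A sketch of the allocation policy described above, assuming made-up helper names: the default allocator prefers vidmem when the device has it and falls back to sysmem, while an explicit _sys variant pins a call site to sysmem until it has been verified with vidmem.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    struct gk20a { bool has_vidmem; };
    struct mem_desc { int aperture; };  /* 0 = sysmem, 1 = vidmem */

    static int alloc_sysmem(struct gk20a *g, size_t size, struct mem_desc *mem)
    {
        (void)g; (void)size;
        mem->aperture = 0;
        return 0;
    }

    static int alloc_vidmem(struct gk20a *g, size_t size, struct mem_desc *mem)
    {
        (void)size;
        if (!g->has_vidmem)
            return -1;
        mem->aperture = 1;
        return 0;
    }

    /* Explicit sysmem variant: callers not yet verified with vidmem. */
    static int gmmu_alloc_sys(struct gk20a *g, size_t size, struct mem_desc *mem)
    {
        return alloc_sysmem(g, size, mem);
    }

    /* Default variant: prefer vidmem when the device has it, else sysmem. */
    static int gmmu_alloc(struct gk20a *g, size_t size, struct mem_desc *mem)
    {
        if (g->has_vidmem && alloc_vidmem(g, size, mem) == 0)
            return 0;
        return alloc_sysmem(g, size, mem);
    }

    int main(void)
    {
        struct gk20a dgpu = { true }, igpu = { false };
        struct mem_desc m;

        gmmu_alloc(&dgpu, 4096, &m);
        printf("dgpu default aperture: %d\n", m.aperture);
        gmmu_alloc(&igpu, 4096, &m);
        printf("igpu default aperture: %d\n", m.aperture);
        gmmu_alloc_sys(&dgpu, 4096, &m);
        printf("dgpu _sys aperture: %d\n", m.aperture);
        return 0;
    }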
* gpu: nvgpu: Make gk20a_init_sema_pool() static  Alex Waterman  2016-07-06

    This function is only used in mm_gk20a.c and as a result should be
    static (fixes a sparse issue).

    Bug 200088648
    Change-Id: I6787b4ebc5925a503d8ef2fed90c3d7cd5027589
    Signed-off-by: Alex Waterman <alexw@nvidia.com>
    Reviewed-on: http://git-master/r/1176309
    Reviewed-by: Automatic_Commit_Validation_User
    Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Richard Zhao <rizhao@nvidia.com>
* gpu: nvgpu: support in-kernel vidmem mappings  Konsta Holtta  2016-07-06

    Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that
    buffers represented as a mem_desc and present in vidmem can be mapped
    to gpu.

    JIRA DNVGPU-18
    JIRA DNVGPU-76
    Change-Id: I46cf87e27229123016727339b9349d5e2c835b3e
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1169308
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: add support for vidmem in page tables  Konsta Holtta  2016-07-05

    Modify page table updates to take an aperture flag (up until
    gk20a_locked_gmmu_map()), don't hard-assume sysmem, and propagate it
    to hardware.

    Jira DNVGPU-76
    Change-Id: Ifcb22900c96db993068edd110e09368f72b06f69
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1169307
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: initial support for vidmem apertures  Konsta Holtta  2016-07-05

    Add gk20a_aperture_mask() for memory target selection now that buffers
    can actually be allocated from vidmem, and use it in all cases that
    have a mem_desc available.

    Jira DNVGPU-76
    Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1169306
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
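A simplified sketch of what an aperture-mask helper like gk20a_aperture_mask() can look like: the caller passes the register- or PTE-specific encodings for each target and the helper picks one based on where the buffer lives. The field values here are placeholders, not real hardware encodings.

    #include <stdint.h>
    #include <stdio.h>

    enum aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

    struct mem_desc { enum aperture aperture; };

    /*
     * Pick the caller-supplied field encoding that matches the buffer's
     * aperture; the caller passes the per-register (or per-PTE) values,
     * so the helper itself stays register-agnostic.
     */
    static uint32_t aperture_mask(struct mem_desc *mem,
                                  uint32_t sysmem_mask, uint32_t vidmem_mask)
    {
        return mem->aperture == APERTURE_VIDMEM ? vidmem_mask : sysmem_mask;
    }

    int main(void)
    {
        struct mem_desc vid = { APERTURE_VIDMEM };
        struct mem_desc sys = { APERTURE_SYSMEM };

        /* 0x1 / 0x2 are placeholder encodings for some target field */
        printf("vid -> %#x, sys -> %#x\n",
               (unsigned)aperture_mask(&vid, 0x1, 0x2),
               (unsigned)aperture_mask(&sys, 0x1, 0x2));
        return 0;
    }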
* gpu: nvgpu: Use 64-bit math for cacheline offset  Terje Bergstrom  2016-07-04

    Cast cacheline_start to u64 to use 64-bit math for the cacheline
    offset.

    Coverity ID 24250
    Change-Id: Ic10c92ebb737bd39486a83e4de53cc1191193667
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/1172053
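The class of bug behind this fix, shown in a self-contained example with illustrative numbers: multiplying two 32-bit values and only then assigning to a 64-bit variable truncates, so one operand is cast to u64 first.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t cacheline_start = 0x00400000;  /* example value */
        uint32_t cacheline_size = 0x400;

        /* the multiply happens in 32 bits and wraps before the assignment */
        uint64_t bad = cacheline_start * cacheline_size;

        /* cast first so the multiplication itself is done in 64 bits */
        uint64_t good = (uint64_t)cacheline_start * cacheline_size;

        printf("bad  = %#llx\n", (unsigned long long)bad);
        printf("good = %#llx\n", (unsigned long long)good);
        return 0;
    }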
* gpu: nvgpu: Revamp semaphore support  Alex Waterman  2016-06-28

    Revamp the support the nvgpu driver has for semaphores.

    The original problem with nvgpu's semaphore support is that it
    required a SW based wait for every semaphore release. This was because
    for every fence that gk20a_channel_semaphore_wait_fd() waited on a new
    semaphore was created. This semaphore would then get released by SW
    when the fence signaled. This meant that for every release there was
    necessarily a sync_fence_wait_async() call which could block. The
    latency of this SW wait was enough to cause massive degradation in
    performance.

    To fix this a fast path was implemented. When a fence is passed to
    gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore, a
    semaphore acquire is directly used to block the GPU. No longer is a
    sync_fence_wait_async() performed, nor is there an extra semaphore
    created.

    To implement this fast path the semaphore memory had to be shared
    between channels. Previously, since a new semaphore was created every
    time through gk20a_channel_semaphore_wait_fd(), what address space a
    semaphore was mapped into was irrelevant. However, when using the fast
    path a semaphore may be released on one address space but acquired in
    another.

    Sharing the semaphore memory was done by making a fixed GPU mapping in
    all channels. This mapping points to the semaphore memory (the so
    called semaphore sea). This global fixed mapping is read-only to make
    sure no semaphores can be incremented (i.e. released) by a malicious
    channel. Each channel then gets a RW mapping of its own semaphore.
    This way a channel may only acquire other channels' semaphores but may
    both acquire and release its own semaphore.

    The gk20a fence code was updated to allow introspection of the GPU
    backed fences. This allows detection of when the fast path can be
    taken. If the fast path cannot be used (for example when a fence is
    sync-pt backed) the original slow path is still present. This gets
    used when the GPU needs to wait on an event from something which only
    understands how to use sync-pts.

    Bug 1732449
    JIRA DNVGPU-12
    Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
    Signed-off-by: Alex Waterman <alexw@nvidia.com>
    Reviewed-on: http://git-master/r/1133792
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
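A heavily condensed sketch of the fast-path decision described above, with all types and helpers reduced to placeholders: a GPU-semaphore-backed fence turns into a hardware acquire pushed to the channel, while anything else (e.g. a sync-pt-backed fence) still takes the software wait.

    #include <stdint.h>
    #include <stdio.h>

    struct semaphore { uint64_t gpu_va; uint32_t payload; };

    struct fence {
        struct semaphore *sema;  /* non-NULL when GPU-semaphore backed */
    };

    struct channel { const char *name; };

    /* Stand-in for emitting an acquire method sequence into the pushbuffer. */
    static void emit_hw_semaphore_acquire(struct channel *c, struct semaphore *s)
    {
        printf("%s: HW acquire at %#llx until payload >= %u\n",
               c->name, (unsigned long long)s->gpu_va, (unsigned)s->payload);
    }

    /* Stand-in for the slow path: block in SW until the fence signals. */
    static void sw_fence_wait(struct channel *c, struct fence *f)
    {
        (void)f;
        printf("%s: SW wait (sync_fence_wait_async-style)\n", c->name);
    }

    /*
     * Fast path: semaphore-backed fences become a GPU-side acquire, so no
     * kernel thread has to wait. Anything else still takes the SW wait.
     */
    static void semaphore_wait_fd(struct channel *c, struct fence *f)
    {
        if (f->sema) {
            emit_hw_semaphore_acquire(c, f->sema);
            return;
        }
        sw_fence_wait(c, f);
    }

    int main(void)
    {
        struct semaphore s = { 0x2004000000ULL, 1 };
        struct fence gpu_backed = { &s }, syncpt_backed = { NULL };
        struct channel ch = { "ch0" };

        semaphore_wait_fd(&ch, &gpu_backed);
        semaphore_wait_fd(&ch, &syncpt_backed);
        return 0;
    }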
* gpu: nvgpu: add vidmem allocation API  Konsta Holtta  2016-06-15

    Add in-nvgpu APIs for allocating and freeing mem_descs in video
    memory. Changes for gmmu tables etc. will be added in upcoming
    changes.

    Video memory is allocated via nvmap by initially registering the
    aperture size to it and binding it to a struct device, and then going
    via the usual dma alloc. This API also allows fixed-address
    allocations, meant for reserving special memory areas at boot. The
    aperture registration is skipped completely if vidmem isn't found for
    the particular device.

    gk20a_gmmu_alloc_attr() still uses sysmem, and the unmap/free paths
    select internally the correct path by the mem_desc's aperture.

    Video memory allocation is off by default, and can be turned on with
    CONFIG_GK20A_VIDMEM.

    JIRA DNVGPU-16
    Change-Id: I77eae5ea90cbed6f4b5db0da86c5f70ddf2a34f9
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1157216
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: optimize mem_desc accessor loops  Konsta Holtta  2016-06-13

    Instead of going via gk20a_mem_{wr,rd}32() on each iteration, do
    direct memcpy/memset with sysmem, and minimize the enter/exit overhead
    with vidmem.

    JIRA DNVGPU-23
    Change-Id: I5437e35f8393a746777a40636c1e9b5d93ced1f6
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1159524
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
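A sketch of the optimization under assumed, simplified types: for sysmem the per-word gk20a_mem_wr32() loop collapses into one memcpy(), while vidmem still goes word by word but with the window set up once around the loop instead of per access.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    enum aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

    struct mem_desc {
        enum aperture aperture;
        uint32_t *cpu_va;   /* valid for sysmem */
        uint32_t *win;      /* stand-in for the PRAMIN window */
    };

    static void pramin_wr32(struct mem_desc *mem, size_t w, uint32_t v)
    {
        mem->win[w] = v;    /* real code writes through BAR0/PRAMIN */
    }

    /*
     * Copy 'words' 32-bit values into the buffer at word offset 'offset'.
     * Sysmem: one memcpy. Vidmem: per-word writes, but the window
     * setup/teardown happens once around the loop rather than per access.
     */
    static void mem_wr_n(struct mem_desc *mem, size_t offset,
                         const uint32_t *src, size_t words)
    {
        if (mem->aperture == APERTURE_SYSMEM) {
            memcpy(mem->cpu_va + offset, src, words * sizeof(*src));
            return;
        }

        /* pramin_enter(mem); -- done once, not per word */
        for (size_t w = 0; w < words; w++)
            pramin_wr32(mem, offset + w, src[w]);
        /* pramin_exit(mem); */
    }

    int main(void)
    {
        uint32_t sysbuf[4] = { 0 }, vidbuf[4] = { 0 };
        const uint32_t src[4] = { 1, 2, 3, 4 };
        struct mem_desc sys = { APERTURE_SYSMEM, sysbuf, NULL };
        struct mem_desc vid = { APERTURE_VIDMEM, NULL, vidbuf };

        mem_wr_n(&sys, 0, src, 4);
        mem_wr_n(&vid, 0, src, 4);
        printf("%u %u\n", (unsigned)sysbuf[3], (unsigned)vidbuf[3]);
        return 0;
    }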
* gpu: nvgpu: detect vidmem configuration from HW  Konsta Holtta  2016-06-08

    Read video memory size from hardware during initialization for devices
    that support it.

    JIRA DNVGPU-14
    Change-Id: If190f2d89f7148520ee274ca674f972987c8056d
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1157215
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cache whole bar0_window for mem accesses  Konsta Holtta  2016-06-07

    Save the whole bar0 window register, which also encodes the target
    aperture (vid/sys mem), instead of only the base address, which could
    overlap between the two.

    JIRA DNVGPU-23
    Change-Id: I2ccbea0e1f7c7310c1ca6b158afafe8fd974a615
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1159523
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
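Why caching the full register value matters, in a small sketch with placeholder target-field encodings: a sysmem window and a vidmem window can share the same base, so comparing only the base would wrongly skip the reprogramming.

    #include <stdint.h>
    #include <stdio.h>

    struct pramin { uint32_t bar0_window_val;  /* last value written */ };

    /* Stand-in for writel() to the bar0_window register. */
    static void write_bar0_window(uint32_t val)
    {
        printf("writing bar0_window = %#x\n", (unsigned)val);
    }

    /*
     * 'win' encodes both the window base and the target aperture bits.
     * Caching and comparing the whole value means that switching between
     * a sysmem and a vidmem window at the same base still reprograms the
     * register.
     */
    static void set_bar0_window(struct pramin *p, uint32_t win)
    {
        if (p->bar0_window_val == win)
            return;
        write_bar0_window(win);
        p->bar0_window_val = win;
    }

    int main(void)
    {
        struct pramin p = { 0 };
        uint32_t base = 0x00001234;        /* example base field */
        uint32_t target_vid = 0u << 24;    /* placeholder target bits */
        uint32_t target_sys = 2u << 24;

        set_bar0_window(&p, base | target_vid);
        set_bar0_window(&p, base | target_vid);  /* skipped */
        set_bar0_window(&p, base | target_sys);  /* reprogrammed */
        return 0;
    }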
* gpu: nvgpu: add PRAMIN support for mem accessors  Konsta Holtta  2016-05-24

    To support vidmem, implement a way to access buffers via the PRAMIN
    window instead of only kernel-mapped sysmem buffers, as has been the
    case for iGPU so far. Depending on the buffer aperture, choose between
    the two access types in the buffer memory accessor functions.
    vmap()/vunmap() pairs are no-ops for buffers that can't be cpu-mapped.

    Two uses of DMA_ATTR_READ_ONLY are removed in the ucode loading path
    to support writing to those buffers via the indirection too, in
    addition to the cpu.

    JIRA DNVGPU-23
    Change-Id: I282dba6741c6b8224bc12e69c1fb3936bde7e6ed
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1141314
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
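A toy model of the aperture-dependent access path; the window machinery is simulated here, and the window size, register behavior, and helper names are all assumptions. Sysmem reads go through the kernel CPU mapping, vidmem reads slide a BAR0/PRAMIN-style window over the framebuffer and read through it.

    #include <stdint.h>
    #include <stdio.h>

    enum aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

    #define WINDOW_SIZE 0x10000u  /* 64KB-style window (placeholder) */

    struct mem_desc {
        enum aperture aperture;
        uint32_t *cpu_va;   /* sysmem: kernel mapping */
        uint64_t gpu_pa;    /* vidmem: physical address in FB */
    };

    /* Simulated FB and window state for the sketch. */
    static uint32_t fake_fb[0x10000];
    static uint64_t window_base;

    static void set_window(uint64_t base)   /* stand-in for bar0_window write */
    {
        window_base = base;
    }

    static uint32_t window_read(uint64_t offset)  /* stand-in for PRAMIN readl */
    {
        return fake_fb[(window_base + offset) / sizeof(uint32_t)];
    }

    /* Read one 32-bit word at byte offset 'off', choosing by aperture. */
    static uint32_t mem_rd32(struct mem_desc *mem, uint64_t off)
    {
        if (mem->aperture == APERTURE_SYSMEM)
            return mem->cpu_va[off / sizeof(uint32_t)];

        set_window((mem->gpu_pa + off) & ~(uint64_t)(WINDOW_SIZE - 1));
        return window_read((mem->gpu_pa + off) & (WINDOW_SIZE - 1));
    }

    int main(void)
    {
        uint32_t sysbuf[2] = { 7, 8 };
        struct mem_desc sys = { APERTURE_SYSMEM, sysbuf, 0 };
        struct mem_desc vid = { APERTURE_VIDMEM, NULL, 0x100 };

        fake_fb[0x100 / 4] = 42;
        printf("%u %u\n", (unsigned)mem_rd32(&sys, 4),
               (unsigned)mem_rd32(&vid, 0));
        return 0;
    }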
* Revert "gpu: nvgpu: Enable FB before initializing L2"  Terje Bergstrom  2016-05-13

    This reverts commit df05d2a7c214bc8cdb887f1609853d0f424ef6f1. It
    causes intermittent failures on laguna_t124.

    Bug 1766083
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Change-Id: Idabcf96d2eaddc989b029c429cec213bcabbf28c
    Reviewed-on: http://git-master/r/1147683
    Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
    Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
* gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem  Konsta Holtta  2016-05-13

    To support vidmem, pass g and mem_desc to the buffer memory accessor
    functions. This allows the functions to select the memory access
    method based on the buffer aperture instead of using the cpu pointer
    directly (like until now). The selection and aperture support will be
    in another patch; this patch only refactors these accessors, but keeps
    the underlying functionality as-is.

    gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
    for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
    functionality, and gk20a_memset() for filling buffers with a constant.
    The 8 and 16 bit accessor functions are removed.

    vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
    support other types of mappings or conditions where mapping the buffer
    is unnecessary or different.

    Several function arguments that would access these buffers are also
    changed to take a mem_desc instead of a plain cpu pointer. Some
    relevant occasions are changed to use the accessor functions instead
    of cpu pointers without them (e.g., memcpying to and from), but the
    majority of direct accesses will be adjusted later, when the buffers
    are moved to support vidmem.

    JIRA DNVGPU-23
    Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1121143
    Reviewed-by: Ken Adams <kadams@nvidia.com>
    Tested-by: Ken Adams <kadams@nvidia.com>
* gpu: nvgpu: Enable FB before initializing L2  Terje Bergstrom  2016-05-09

    Deassert reset in L2 and FB before initializing L2. In gk20a L2 can be
    off and thus writing registers results in a priv ring failure.

    Change-Id: I680b8b1e77cf67a8269c6de59a15d9817301300e
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1140482
    (cherry picked from commit d85edcf4170d7bc59d2c080f4343bc2f959be023)
    Reviewed-on: http://git-master/r/1143684
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
* gpu: nvgpu: don't alloc gmmu bufs manually  Konsta Holtta  2016-05-03

    Use gk20a_gmmu_{alloc,free}() instead of duplicating their code for
    page tables. The linsim-specific special case is kept as-is.

    JIRA DNVGPU-23
    JIRA DNVGPU-20
    Change-Id: I66d772337bad5d081256b13877c4e713ea8b634a
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1139695
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: adapt gk20a_mm_entry for mem_desc  Konsta Holtta  2016-05-03

    For the upcoming vidmem refactor, replace struct gk20a_mm_entry's
    contents, which are identical to struct mem_desc, with a struct
    mem_desc member. This makes it possible to use the page table buffers
    like the others too.

    JIRA DNVGPU-23
    JIRA DNVGPU-20
    Change-Id: I714ee5dcb33c27aaee932e8e3ac367e84610b102
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/1139694
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Do not generate any ctag info unless enabled  Alex Waterman  2016-04-29

    Do not put any ctag data in the PTEs unless compression is actually
    enabled for the mapping.

    Bug 1732449
    JIRA DNVGPU-12
    Change-Id: I2abfbf9d1282af24541f8199bd9fbf2133c12899
    Signed-off-by: Alex Waterman <alexw@nvidia.com>
    Reviewed-on: http://git-master/r/1133790
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Add gk20a_gmmu_fixed_map() function  Alex Waterman  2016-04-29

    Add a function to allow the kernel to do fixed mappings. Necessary for
    the semaphore functionality since there needs to be a common address
    in each VM for the semaphores.

    Bug 1732449
    JIRA DNVGPU-12
    Change-Id: I2b451db2d3cb3c003d951f7b0ffc87f6c91db7dc
    Signed-off-by: Alex Waterman <alexw@nvidia.com>
    Reviewed-on: http://git-master/r/1133789
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Program NISO sysmem flush addr  Terje Bergstrom  2016-04-25

    Program sysmem flush address to prevent random accesses of address 0.

    Change-Id: I886170395f036805f02e0bce7ecd3c8c46b921df
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/1129216
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
    Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
* gpu: nvgpu: Wait for BAR1 bind  Terje Bergstrom  2016-04-15

    Wait for BAR1 bind to complete before continuing. The register to wait
    on exists from Maxwell onwards.

    Change-Id: Ie3736033fdb748c5da8d7a6085ad6d63acaf41f5
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/1123941
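A sketch of a bounded poll of the kind this change adds; the register read, status bit, and retry count below are invented for illustration and are not the real Maxwell fields.

    #include <stdio.h>
    #include <stdint.h>

    #define BIND_STATUS_OUTSTANDING (1u << 0)   /* placeholder bit */
    #define BIND_TIMEOUT_TRIES      1000

    /* Stand-in for reading a bind-status register over BAR0. */
    static uint32_t read_bind_status(void)
    {
        static int reads;
        return ++reads < 3 ? BIND_STATUS_OUTSTANDING : 0;
    }

    /*
     * After writing the new BAR1 instance block pointer, poll until the
     * hardware reports the bind has completed, giving up after a bounded
     * number of tries.
     */
    static int wait_for_bar1_bind(void)
    {
        for (int i = 0; i < BIND_TIMEOUT_TRIES; i++) {
            if (!(read_bind_status() & BIND_STATUS_OUTSTANDING))
                return 0;
            /* udelay()/cpu_relax() in real kernel code */
        }
        return -1;  /* -ETIMEDOUT in kernel terms */
    }

    int main(void)
    {
        printf("bind %s\n", wait_for_bar1_bind() == 0 ? "done" : "timed out");
        return 0;
    }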
* gpu: nvgpu: Use sysmem aperture for SoC memory  Terje Bergstrom  2016-04-15

    In Tegra GPU, SoC memory has to be accessed as vidmem. In discrete
    GPU, it has to be accessed as sysmem.

    Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/1120468
* gpu: nvgpu: Support GPUs with no physical mode  Terje Bergstrom  2016-04-13

    Support GPUs which cannot choose between SMMU and physical addressing.

    Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/1120469
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Alex Waterman <alexw@nvidia.com>
* gpu: nvgpu: Use device instead of platform_device  Terje Bergstrom  2016-04-08

    Use struct device instead of struct platform_device wherever possible.
    This allows adding other bus types later.

    Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/1120466
* gpu: nvgpu: split address space for fixed allocs  Alex Waterman  2016-03-25

    Allow a special address space node to be split out from the user
    address space for fixed allocations. A debugfs node,

      /d/<gpu>/separate_fixed_allocs

    controls this feature. To enable it:

      # echo <SPLIT_ADDR> > /d/<gpu>/separate_fixed_allocs

    Where <SPLIT_ADDR> is the address to do the split on in the GVA
    address range. This will cause the split to be made in all subsequent
    address space ranges that get created until it is turned off. To turn
    this off just echo 0x0 into the same debugfs node.

    Change-Id: I21a3f051c635a90a6bfa8deae53a54db400876f9
    Signed-off-by: Alex Waterman <alexw@nvidia.com>
    Reviewed-on: http://git-master/r/1030303
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Clear comptags for whole buffer  Terje Bergstrom  2016-03-22

    Clear comptags for whole buffer when nvgpu sees the buffer for the
    first time.

    Change-Id: I67108ce0f0def46ddda1aa9b9bb5ea22549cce13
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/1013517
    (cherry picked from commit 544446aacdc695dc2e27c42a0086292cd69c2eee)
    Reviewed-on: http://git-master/r/1031009
    GVS: Gerrit_Virtual_Submit
* gpu: nvgpu: Use shift instead of div for comptag  Terje Bergstrom  2016-03-08

    Use right shift instead of division for computing the ctag offset.

    Bug 1704834
    Change-Id: Id57526a08bad34e41b2335a21e299d1c0a2ffba1
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/1024467
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
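The arithmetic behind the change, with illustrative numbers: the divisor (a compression page size) is a power of two, so the division can be a right shift, which also avoids pulling in a 64-bit division helper on 32-bit kernels.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* illustrative: one comptag line covers a 64KB compression page */
        const uint64_t comp_page_size = 64 * 1024;
        const unsigned int comp_page_shift = 16;   /* log2(64KB) */

        uint64_t buffer_offset = 5 * comp_page_size + 123;

        /* 64-bit division: needs a helper (do_div) on 32-bit kernels */
        uint64_t by_div = buffer_offset / comp_page_size;

        /* equivalent for a power-of-two divisor, and cheaper */
        uint64_t by_shift = buffer_offset >> comp_page_shift;

        printf("%llu %llu\n", (unsigned long long)by_div,
               (unsigned long long)by_shift);
        return 0;
    }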
* gpu: nvgpu: skip extracting kind from nvmap  Deepak Nibade  2016-02-16

    While mapping a buffer, if the kind argument is -1, we extract the
    kind value from nvmap. Kind information from nvmap is going away, so
    remove the respective call to nvmap.

    Bug 1616899
    Change-Id: I2764655f60df691ac8a86484c6ec929d2b83b2e3
    Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
    Reviewed-on: http://git-master/r/1012239
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: bitmap allocator for comptags  Konsta Holtta  2016-01-19

    Restore comptags to be bitmap-allocated, like they were before we had
    the buddy allocator. The new buddy allocator introduced by
    e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally
    6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but
    unsuitable for the small compbit store.

    This commit partially reverts the combination of the above commit and
    also one after it, 86fc7ec9a05999bea8de320840b962db3ee11410, that
    fixed a bug which is not present when using a bitmap. With a bitmap
    allocator, pruning the extra allocation necessary for user-mapped mode
    is possible, so that is also restored.

    The original generic bitmap allocator is not restored; instead, a
    comptag-only allocator is introduced.

    Bug 200145635
    Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/837180
    (cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2)
    Reviewed-on: http://git-master/r/839742
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
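A standalone first-fit bitmap allocator in the spirit of the comptag allocator described above; the in-kernel version would build on bitmap_find_next_zero_area()/bitmap_set(), whereas this toy version uses a plain bool array.

    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_COMPTAGS 128    /* tiny store for the sketch */

    static bool used[MAX_COMPTAGS];  /* one flag per comptag line */

    /* Find 'len' contiguous free comptag lines, mark them used, return start. */
    static int comptag_alloc(unsigned int len)
    {
        unsigned int run = 0;

        for (unsigned int i = 0; i < MAX_COMPTAGS; i++) {
            run = used[i] ? 0 : run + 1;
            if (run == len) {
                unsigned int start = i + 1 - len;

                for (unsigned int j = start; j <= i; j++)
                    used[j] = true;
                return (int)start;
            }
        }
        return -1;  /* no space */
    }

    static void comptag_free(unsigned int start, unsigned int len)
    {
        for (unsigned int i = start; i < start + len; i++)
            used[i] = false;
    }

    int main(void)
    {
        int a = comptag_alloc(8);
        int b = comptag_alloc(4);

        comptag_free((unsigned int)a, 8);
        printf("a=%d b=%d c=%d\n", a, b, comptag_alloc(2));
        return 0;
    }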
* gpu: nvgpu: Control comptagline assignment from kernel  Terje Bergstrom  2016-01-05

    On Maxwell comptaglines are assigned per 128k, but preferred big page
    size for graphics is 64k. Bit 16 of GPU VA is used for determining
    which half of comptagline is used. This creates problems if user space
    wants to map a page multiple times and to arbitrary GPU VA. In one
    mapping the page might be mapped to lower half of 128k comptagline,
    and in another mapping the page might be mapped to upper half.

    Turn on mode where MSB of comptagline in PTE is used instead of bit 16
    for determining the comptagline lower/upper half selection.

    Bug 1704834
    Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/924322
    Reviewed-by: Alex Waterman <alexw@nvidia.com>
* gpu: nvgpu: Add comptag offset to part mappings  Terje Bergstrom  2015-12-14

    Add offset to comptags when mapping partial buffers.

    Bug 1704834
    Change-Id: I3405b465bb1373bcc79eb5ecbd93dd1b866abfb4
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/837401
* gpu: nvgpu: fix Coverity issues  Deepak Nibade  2015-11-25

    - operands not affecting result (id = 12845)
    - logically dead code (id = 12890)
    - dereference after null check (id = 12968)
    - unsigned compared to 0 (id = 13176)
    - resource leak (id = 13338, 18673)
    - unused pointer value (id = 13916)

    Bug 1703084
    Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b
    Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
    Reviewed-on: http://git-master/r/835143
    Reviewed-by: Automatic_Commit_Validation_User
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Sachin Nikam <snikam@nvidia.com>
* gpu: nvgpu: abstract set mmu debug mode  Richard Zhao  2015-11-23

    Add a new operation g->ops.mm.set_debug_mode and let other places that
    set debug mode call this callback. This prepares for adding a vgpu set
    mmu debug mode hook.

    JIRA VFND-1005
    Bug 1594604
    Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7
    Signed-off-by: Richard Zhao <rizhao@nvidia.com>
    Reviewed-on: http://git-master/r/833497
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
* gpu: nvgpu: Fix alignment calculation overflow  Ari Hirvonen  2015-11-20

    Bug 200150865
    Change-Id: If4f0e01bdeb95c303675b63444bd497b65d934f3
    Signed-off-by: Ari Hirvonen <ahirvonen@nvidia.com>
    Reviewed-on: http://git-master/r/835151
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
* gpu: nvgpu: User-space managed address space support  Sami Kiminki  2015-11-18

    Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
    enables creating userspace-managed GPU address spaces.

    When an address space is marked as userspace-managed, the following
    changes are in effect:

    - Only fixed-address mappings are allowed.
    - VA space allocation for fixed-address mappings is not required,
      except to mark space as sparse.
    - Maps and unmaps are always immediate. In particular, the mapping ref
      increments at kickoffs and decrements at job completion are skipped.

    Bug 1614735
    Bug 1623949
    Bug 1660392
    Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
    Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
    Reviewed-on: http://git-master/r/738558
    Reviewed-on: http://git-master/r/833253
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO  Sami Kiminki  2015-11-17

    Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used
    to identify buffers and retrieve their sizes. This allows the
    userspace to be agnostic to the dmabuf implementation, as the generic
    dmabuf fd interface does not have a reliable way for buffer
    identification.

    Bug 1614735
    Bug 1623949
    Bug 1660392
    Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208
    Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
    Reviewed-on: http://git-master/r/822845
    Reviewed-on: http://git-master/r/833252
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Report all mapping calls  Terje Bergstrom  2015-11-12

    Add dbg_map debug spew for all mapping calls. This plugs the hole
    where kernel mappings were not logged, because the debug log is added
    only in ioctl path.

    Change-Id: I036bf41f92ba5b612d32805020ca7a16fe54f9f4
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/812288
    (cherry picked from commit c37b2892d6d967ad48076b20e5a9ef97dc600b31)
    Reviewed-on: http://git-master/r/831333
* gpu: nvgpu: use a separate big vm for cde  Konsta Holtta  2015-11-11

    Allocate a separate VM for CDE channels instead of using the system
    (PMU) vm, and make it much bigger than the PMU's to fit the maximum
    number of CDE channels there.

    Bug 1566740
    Change-Id: I4f487c40c9ec79cc9ffb880b0ecd3f47eb450336
    Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
    Reviewed-on: http://git-master/r/815149
    Reviewed-by: Automatic_Commit_Validation_User
* gpu: nvgpu: Do not use G_ELPG_FLUSH  Terje Bergstrom  2015-11-10

    G_ELPG_FLUSH is protected in some chips. Use L2 flush operations
    instead.

    Bug 1698618
    Change-Id: I984a8ace8bcd0ad2d4a4e2d63af75a342bdeb75a
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/828656
    (cherry picked from commit ba9075fa43975112a221d37d246f0b8f5af40fab)
    Reviewed-on: http://git-master/r/829415
* Revert "gpu: nvgpu: Implement sparse PDEs"  Terje Bergstrom  2015-11-09

    This reverts commit c44947b1314bb2afa1f116b4928f4e8a4c34d7b1. It
    introduces a regression in T124.

    Bug 1702063
    Change-Id: I64e333f66d98bd4dbcfe40a60f1aa825d90376a5
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/830786
    GVS: Gerrit_Virtual_Submit
* gpu: nvgpu: Implement sparse PDEs  Terje Bergstrom  2015-10-30

    Change-Id: Idfeb3bf95751902d52a895d77045a529f69abc0b
    Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/758651
    GVS: Gerrit_Virtual_Submit
* gpu: nvgpu: add support to remove bar2 mm  Seshendra Gadagottu  2015-10-12

    Add support for removing the bar2 mm on gpu module removal.

    Change-Id: Id5f680b1abf7056da9871d5460d9fbc40422673e
    Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
    Reviewed-on: http://git-master/r/814571
    (cherry picked from commit e7c6c87dd6b0893d26a9a3b4568121a691e1eb3c)
    Reviewed-on: http://git-master/r/815429
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* nvgpu: gk20a: Optimize vm_put_buffers for zero buffers  Sami Kiminki  2015-10-12

    Return immediately in case there are no buffers to put. This skips
    acquiring mutexes and map batch start/finish overheads.

    Bug 1614735
    Bug 1623949
    Bug 1660392
    Change-Id: Ief04e36d995e65c1510496c17cb3f5bb90486c69
    Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
    Reviewed-on: http://git-master/r/815376
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: fix ctag computation overflow with 8GB  Jussi Rasanen  2015-10-06

    Bug 1689976
    Change-Id: I97ad14c9698030b630d3396199a2a5296c661392
    Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
    Reviewed-on: http://git-master/r/806590
    (cherry picked from commit c90cd5ee674d6357db3be2243950ff0d81ef15ef)
    Reviewed-on: http://git-master/r/808249
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: prevent extra user unmaps  Deepak Nibade  2015-09-30

    It is possible that user space requests more unmaps on a buffer than
    it requested maps. In this case, we end up dropping one extra
    refcount, which could lead to releasing the buffer early.

    Fix this by checking and returning if the buffer's user_mapped
    refcount is already zero.

    Bug 200130521
    Change-Id: Ic8ef2dbfe0476b16d852ad899b1ed0404b5bb7de
    Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
    Reviewed-on: http://git-master/r/788904
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
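A minimal sketch of the guard described above, with simplified bookkeeping: if user_mapped is already zero, the extra unmap is ignored instead of dropping a reference that was never taken.

    #include <stdio.h>

    struct mapped_buffer {
        unsigned int user_mapped;   /* how many user mappings exist */
        unsigned int ref_count;     /* overall refcount on the buffer */
    };

    /*
     * Called for each user-requested unmap. If user space asks for more
     * unmaps than it ever mapped, user_mapped is already 0 and no further
     * reference must be dropped, or the buffer would be released early.
     */
    static void user_unmap(struct mapped_buffer *b)
    {
        if (b->user_mapped == 0) {
            printf("ignoring extra unmap\n");
            return;
        }
        b->user_mapped--;
        b->ref_count--;
    }

    int main(void)
    {
        struct mapped_buffer b = { 1, 2 };

        user_unmap(&b);     /* legitimate */
        user_unmap(&b);     /* extra: ignored */
        printf("refs=%u\n", b.ref_count);
        return 0;
    }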
* gpu: nvgpu: unmapped ptes handling  Seshendra Gadagottu  2015-09-30

    Correct logic for supporting unmapped ptes during gmmu map.

    Bug 1587825
    Change-Id: I1b0b603f7758a65d9666046d0d908663f8e460e3
    Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
    Reviewed-on: http://git-master/r/796577
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Reviewed-on: http://git-master/r/759345
    Reviewed-by: Automatic_Commit_Validation_User
    GVS: Gerrit_Virtual_Submit
* gpu: nvgpu: Separate kernel and user GPU VA regions  Sami Kiminki  2015-09-07

    Separate the kernel and userspace regions in the GPU virtual address
    space. Do this by reserving the last part of the GPU VA aperture for
    the kernel, and extend the GPU VA aperture accordingly for regular
    address spaces. This prevents the kernel polluting the
    userspace-visible GPU VA regions, and thus, makes the success of
    fixed-address mapping more predictable.

    Bug 200077571
    Change-Id: I63f0e73d4c815a4a9fa4a9ce568709974690ef0f
    Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
    Reviewed-on: http://git-master/r/747191
    Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
    Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>