nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: Use accessor for finding struct device	Terje Bergstrom	2017-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \|	Use dev_from_gk20a() accessor whenever accessing struct device * from struct gk20a. JIRA NVGPU-38 Change-Id: Ide9fca3a56436c8f62e7872580a766c4c1e2353e Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master/r/1507930 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Remove extraneous VM init/deinit APIs	Alex Waterman	2017-06-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support only VM pointers and ref-counting for maintaining VMs. This dramatically reduces the complexity of the APIs, avoids the API abuse that has existed, and ensures that future VM usage is consistent with current usage. Also remove the combined VM free/instance block deletion. Any place where this was done is now replaced with an explict free of the instance block and a nvgpu_vm_put(). JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: Ib73e8d574ecc9abf6dad0b40a2c5795d6396cc8c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1480227 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: move debugfs code to linux module	Deepak Nibade	2017-06-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since all debugfs code is Linux specific, remove it from common code and move it to Linux module Debugfs code is now divided into below module specific files : common/linux/debug.c common/linux/debug_cde.c common/linux/debug_ce.c common/linux/debug_fifo.c common/linux/debug_gr.c common/linux/debug_mm.c common/linux/debug_allocator.c common/linux/debug_kmem.c common/linux/debug_pmu.c common/linux/debug_sched.c Add corresponding header files for above modules too And compile all of above files only if CONFIG_DEBUG_FS is set Some more details of the changes made - Move and rename gk20a/debug_gk20a.c to common/linux/debug.c - Move and rename gk20a/debug_gk20a.h to include/nvgpu/debug.h - Remove gm20b/debug_gm20b.c and gm20b/debug_gm20b.h and call gk20a_init_debug_ops() directly from gm20b_init_hal() - Update all debug APIs to receive struct gk20a as parameter instead of receiving struct device pointer - Update API gk20a_dmabuf_get_state() to receive struct gk20a pointer instead of struct device - Include <nvgpu/debug.h> explicitly in all files where debug operations are used - Remove "gk20a/platform_gk20a.h" include from HAL files which no longer need this include - Add new API gk20a_debug_deinit() to deinitialize debugfs and call it from gk20a_remove() - Move API gk20a_debug_dump_all_channel_status_ramfc() to gk20a/fifo_gk20a.c Jira NVGPU-62 Change-Id: I076975d3d7f669bdbe9212fa33d98529377feeb6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1488902 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Fix build failure by missing headers	skadamati	2017-05-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the platform_gk20a.h include out of the ifdef CONFIG_DEBUG_FS in the CDE code since dev_from_gk20a() is used regardless of whether debugfs is enabled. Also modify some of the CE ops to take a struct gk20a instead of a struct device. This avoids any requirement for including linux/device.h or platform_gk20a.h. Bug 200310575 Change-Id: Ifef963cd0f66d05094a698200386cc6140920eac Signed-off-by: skadamati <skadamati@nvidia.com> Reviewed-on: http://git-master/r/1487830 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Split VM implementation out	Alex Waterman	2017-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch begins splitting out the VM implementation from mm_gk20a.c and moves it to common/linux/vm.c and common/mm/vm.c. This split is necessary because the VM code has two portions: first, an interface for the OS specific code to use (i.e userspace mappings), and second, a set of APIs for the driver to use (init, cleanup, etc) which are not OS specific. This is only the beginning of the split - there's still a lot of things that need to be carefully moved around. JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: I3b57cba245d7daf9e4326a143b9c6217e0f28c96 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1477743 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Split VM interface out	Alex Waterman	2017-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch begins the major rework of the GPU's virtual memory manager (VMM). The VMM is the piece of code that handles the userspace interface to buffers and their mappings into the GMMU. The core data structure is the VM - for now still known as 'struct vm_gk20a'. Each one of these structs represents one addres space to which channels or TSGs may bind themselves to. The VMM splits the interface up into two broad categories. First there's the common, OS independent interfaces; and second there's the OS specific interfaces. OS independent -------------- This is the code that manages the lifetime of VMs, the buffers inside VMs (search, batch mapping) creation, destruction, etc. OS Specific ----------- This handles mapping of buffers represented as they are represented by the OS (dma_buf's for example on Linux). This patch is by no means complete. There's still Linux specific functions scattered in ostensibly OS independent code. This is the first step. A patch that rewrites everything in one go would simply be too big to effectively review. Instead the goal of this change is to simply separate out the basic OS specific and OS agnostic interfaces into their own header files. The next series of patches will start to pull the relevant implementations into OS specific C files and common C files. JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: I242c7206047b6c769296226d855b7e44d5c4bfa8 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1464939 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Scrub gk20a_platform dependencies	Terje Bergstrom	2017-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove gk20a_platform dependencies from gk20a.h. This makes gk20a_platform a Linux platform specific data structure. Add #include for platform_gk20a.h in the source files that still depend on Linux. JIRA NVGPU-16 Change-Id: Ib098accd34a1f5066eb8680c387f9b178169f3f0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1463547 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Separate GMMU out of mm_gk20a.c	Alex Waterman	2017-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Begin moving (and renaming) the GMMU code into common/mm/gmmu.c. This block of code will be responsible for handling the platform/OS independent GMMU operations. JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: Ide761bab75e5d84be3dcb977c4842ae4b3a7c1b3 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1464083 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Put debugfs dependencies inside #ifdef	Terje Bergstrom	2017-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Put all debugfs dependencies inside #ifdef CONFIG_DEBUG_FS. This includes some functions in allocators that were used only for debugging. Remove include of linux/debugfs.h on files that do not deal with debugfs. linux/debugfs.h implicitly included linux/fs.h, which we relied on. Add explicit include of linux/fs.h for all files where this is the case. Change-Id: I16feffae6b0e3a2edf366075cdc01ade86be06f9 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1467897 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Move Linux nvgpu_mem fields	Alex Waterman	2017-04-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Hide the Linux specific nvgpu_mem fields so that in subsequent patches core code can instead of using struct sg_table it can use mem_desc. Routines for accessing system specific fields will be added as needed. This is the first step in a fairly major overhaul of the GMMU mapping routines. There are numerous issues with the current design (or lack there of): massively coupled code, system dependencies, disorganization, etc. JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: I2e7d3ae3a07468cfc17c1c642d28ed1b0952474d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1464076 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add abstraction for firmware loading	Terje Bergstrom	2017-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add nvgpu_firmware data structure, and return it instead of Linux struct firmare from nvgpu_request_firmware. Also add abstraction for releasing firmware: nvgpu_release_firmware. JIRA NVGPU-16 Change-Id: I6dae8262957c0d4506f710289e3a43a6c1729fc7 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1463538 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add wrapper nvgpu/bug.h	Terje Bergstrom	2017-04-13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add wrapper header file nvgpu/bug.h. It #includes <linux/bug.h> in Linux. JIRA NVGPU-13 Change-Id: I7bf02ba554333f7cbd79d72bd1cb423c81ebcb49 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1461545 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gk20a: Use new error macro	Terje Bergstrom	2017-04-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gk20a_err() and gk20a_warn() require a struct device pointer, which is not portable across operating systems. The new nvgpu_err() and nvgpu_warn() macros take struct gk20a pointer. Convert code to use the more portable macros. JIRA NVGPU-16 Change-Id: Ia51f36d94c5ce57a5a0ab83b3c83a6bce09e2d5c Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1331694 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: remove logically dead code	Deepak Nibade	2017-04-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_cde_convert, we already unmap surface and then again try to unmap if it is mapped Remove this logically dead code Coverity id : 2477002 Bug 200291879 Change-Id: I66dbf43d0475403be27d5acee826d8037e71d4d8 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1455974 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: Rename nvgpu DMA APIs	Alex Waterman	2017-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename the nvgpu DMA APIs from gk20a_gmmu_alloc* to nvgpu_dma_alloc*. This better reflects the purpose of the APIs (to allocate DMA suitable memory) and avoids confusion with GMMU related code. JIRA NVGPU-12 Change-Id: I673d607db56dd6e44f02008dc7b5293209ef67bf Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1325548 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Move DMA API to dma.h	Alex Waterman	2017-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make an nvgpu DMA API include file so that the intricacies of the Linux DMA API can be hidden from the calling code. Also document the nvgpu DMA API. JIRA NVGPU-12 Change-Id: I7578e4c726ad46344b7921179d95861858e9a27e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1323326 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: rename mem_desc to nvgpu_mem	Alex Waterman	2017-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Renaming was done with the following command: $ find -type f \| \ xargs sed -i 's/struct mem_desc/struct nvgpu_mem/g' Also rename mem_desc.[ch] to nvgpu_mem.[ch]. JIRA NVGPU-12 Change-Id: I69395758c22a56aa01e3dffbcded70a729bf559a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1325547 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Move channel IOCTL code to Linux module	Terje Bergstrom	2017-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Move channel IOCTL specific code to Linux module. This clears some Linux dependencies from channel_gk20a.c. JIRA NVGPU-32 Change-Id: I41817d612b959709365bcabff9c8a15f2bfe4c60 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1330804 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: use nvgpu list for CDE contexts	Deepak Nibade	2017-04-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use nvgpu list APIs instead of linux list APIs to store CDE contexts in free_contexts/used_contexts lists Jira NVGPU-13 Change-Id: If1c5d8d8ca70afc90379b33232ceccf9ac4fb155 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1454009 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use new kmem API functions (misc)	Alex Waterman	2017-03-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the new kmem API functions in misc gk20a code. Some additional modifications were also made: o Add a struct gk20a pointer to gk20a_fence to enable proper kmem free usage. o Add gk20a pointer to alloc_session() in dbg_gpu_gk20a.c to use kmem API for allocating a session. o Plumb a gk20a pointer through the fence creation and deletion. o Use statically allocated buffers for names in file creation. Bug 1799159 Bug 1823380 Change-Id: I3678080e3ffa1f9bcf6934e3f4819a1bc531689b Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1318323 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: pass gk20a struct to gk20a_busy	David Nieto	2017-03-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After driver remove, the device structure passed in gk20a_busy can be invalid. To solve this the prototype of the function is modified to pass the gk20a struct instead of the device pointer. bug 200277762 JIRA: EVLR-1023 Change-Id: I08eb74bd3578834d45115098ed9936ebbb436fdf Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: http://git-master/r/1320194 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: take power refcount in gk20a_cde_convert()	Deepak Nibade	2017-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have a gk20a_busy() call in gk20a_buffer_convert_gpu_to_cde() and we again call gk20a_busy() in gk20a_submit_channel_gpfifo() If gk20a_do_idle() is triggered in between these two calls, then this leads to a deadlock and results in idle failure Hence to avoid this take power refcount in a more fine-grained way i.e. in gk20a_cde_convert() instead of taking in gk20a_buffer_convert_gpu_to_cde() Keep gk20a_cde_execute_buffer() out of the gk20a_busy()/ gk20a_idle() pair since we take power refcount in submit path anyways Add correct error handling path in gk20a_cde_convert() Bug 200287073 Change-Id: Iffea2d4c03f42b47dbf05e7fe8fe2994f9c6b37c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1324329 (cherry picked from commit ce057d784d40a6ce57e892d58e211ed2fd9826f8) Reviewed-on: http://git-master/r/1320408 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: hold mutex to get cde context only	Deepak Nibade	2017-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_cde_convert(), we hold cde_app.mutex almost for everything which is unnecessary Also, this causes a deadlock scenario when gk20a_do_idle() is called In some cases it is possible that Thread 1 holds the lock and is trying to power on GPU, and Thread 2 is trying to power off GPU and then grab cde_app.mutex to cleanup GPU state To fix this, grab the mutex only for gk20a_cde_get_context() and then release it Bug 200287073 Change-Id: Ic4856604652d0d4024abdeb5c2f8f03910c601a5 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1324328 (cherry picked from commit 3c8eef0d0e9ef3307e2792e3e127c10f3ff118a7) Reviewed-on: http://git-master/r/1322030 GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: check return value of mutex_init in CDE code	Deepak Nibade	2017-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- check return value of nvgpu_mutex_init in cde_gk20a.c - add corresponding nvgpu_mutex_destroy calls Jira NVGPU-13 Change-Id: I99f59d191cc81eff4a330557b864925d36fc4b3d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1321287 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: in-kernel kickoff profiling	David Nieto	2017-03-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a debugfs interface to profile the kickoff ioctl it provides the probability distribution and separates the information between time spent in: the full ioctl, the kickoff function, the amount of time spent in job tracking and the amount of time doing pushbuffer copies JIRA: EVLR-1003 Change-Id: I9888b114c3fbced61b1cf134c79f7a8afce15f56 Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: http://git-master/r/1308997 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use common nvgpu mutex/spinlock APIs	Deepak Nibade	2017-02-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of using Linux APIs for mutex and spinlocks directly, use new APIs defined in <nvgpu/lock.h> Replace Linux specific mutex/spinlock declaration, init, lock, unlock APIs with new APIs e.g struct mutex is replaced by struct nvgpu_mutex and mutex_lock() is replaced by nvgpu_mutex_acquire() And also include <nvgpu/lock.h> instead of including <linux/mutex.h> and <linux/spinlock.h> Add explicit nvgpu/lock.h includes to below files to fix complilation failures. gk20a/platform_gk20a.h include/nvgpu/allocator.h Jira NVGPU-13 Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1293187 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Organize semaphore_gk20a.[ch]	Alex Waterman	2017-02-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move semaphore_gk20a.c drivers/gpu/nvgpu/common/ since the semaphore code is common to all chips. Move the semaphore_gk20a.h header file to drivers/gpu/nvgpu/include/nvgpu and rename it to semaphore.h. Also update all places where the header is inluced to use the new path. This revealed an odd location for the enum gk20a_mem_rw_flag. This should be in the mm headers. As a result many places that did not need anything semaphore related had to include the semaphore header file. Fixing this oddity allowed the semaphore include to be removed from many C files that did not need it. Bug 1799159 Change-Id: Ie017219acf34c4c481747323b9f3ac33e76e064c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1284627 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Organize nvgpu_common.[ch]	Alex Waterman	2017-02-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move nvgpu_common.c to drivers/gpu/nvgpu/common since it is a common C file to all drivers. Similarly move nvgpu_common.h to drivers/gpu/nvgpu/include/nvgpu since this follows the new include guidelines. Bug 1799159 Change-Id: I00ebed289973b27704c2cff073526e36505bf699 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1284612 Reviewed-by: Varun Colbert <vcolbert@nvidia.com> Tested-by: Varun Colbert <vcolbert@nvidia.com>
*	gpu: nvgpu: Simplify ref-counting on VMs	Alex Waterman	2017-02-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simplify ref-counting on VMs: take a ref when a VM is bound to a channel and drop a ref when a channel is freed. Previously ref-counts were scattered over the driver. Also the CE and CDE code would bind channels with custom rolled code. This was because the gk20a_vm_bind_channel() function took an as_share as the VM argument (the VM was then inferred from that as_share). However, it is trivial to abtract that bit out and allow a central bind channel function that just takes a VM and a channel. Bug 1846718 Change-Id: I156aab259f6c7a2fa338408c6c4a3a464cd44a0c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1261886 Reviewed-by: Richard Zhao <rizhao@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use timer API in gk20a code	Alex Waterman	2017-01-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the timers API in the gk20a code instead of Linux specific API calls. This also changes the behavior of several functions to wait for the full timeout for each operation that can timeout. Previously the timeout was shared across each operation. Bug 1799159 Change-Id: I2bbed54630667b2b879b56a63a853266afc1e5d8 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1273826 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Start re-organizing the HW headers	Alex Waterman	2017-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reorganize the HW headers of gk20a. The headers are moved to a new directory: include/nvgpu/hw/gk20a And from the code are included like so: #include <nvgpu/hw/gk20a/hw_pwr_gk20a.h> This is the first step in reorganizing all of the HW headers for gm20b, gm206, etc. This is part of a larger effort to re-structure and make the driver more readable and scalable. Bug 1799159 Change-Id: Ic151155cbc2e6f75009f2d9d597b364a1bed2c4c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1244790 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Fix signed comparison bugs	Terje Bergstrom	2016-11-17
\| \| \| \| \| \| \| \| \| \| \| \|	Fix small problems related to signed versus unsigned comparisons throughout the driver. Bump up the warning level to prevent such problems from occuring in future. Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1252068 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Return error code on zero IOVA	Terje Bergstrom	2016-11-11
\| \| \| \| \| \| \| \| \| \| \| \| \|	When buffer's IOVA is zero, treat that as error condition instead of ignoring and continuing. Change-Id: I2ede9921945645f526b0600f61f7e5ed19af6d73 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1249963 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com>
*	gpu: nvgpu: Prevent integer overflow of GPU config	Terje Bergstrom	2016-11-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In CDE GPU CONFIGURATION the result is computed using 32-bit arithmetic and returned as 64-bit unsigned integer. Cast intermediate result to u64 to prevent unintentional overflow. Change-Id: Iebe53e2b17c1aaa498245a52962c3dbad7ce893e Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1249962 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com>
*	gpu: nvgpu: add support for pre-allocated resources	Sachit Kadle	2016-10-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for pre-allocation of job tracking resources w/ new (extended) ioctl. Goal is to avoid dynamic memory allocation in the submit path. This patch does the following: 1) Intoduces a new ioctl, NVGPU_IOCTL_CHANNEL_ALLOC_GPFIFO_EX, which enables pre-allocation of tracking resources per job: a) 2x priv_cmd_entry b) 2x gk20a_fence 2) Implements circular ring buffer for job tracking to avoid lock contention between producer (submitter) and consumer (clean-up) Bug 1795076 Change-Id: I6b52e5c575871107ff380f9a5790f440a6969347 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1203300 (cherry picked from commit 9fd270c22b860935dffe244753dabd87454bef39) Reviewed-on: http://git-master/r/1223934 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Suppress error msg from VBIOS overlay	Terje Bergstrom	2016-10-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Suppress error message when nvgpu tries to load VBIOS overlay, but one is not found. This situation is normal. This is done by moving gk20a_request_firmware() to be nvgpu generic function nvgpu_request_firmware(), and adding a NO_WARN flag to it. Introduce also a NO_SOC flag to suppress attempt to load firmware from SoC specific directory in addition to the chip specific directory. Use it for dGPU firmware files. Bug 200236777 Change-Id: I0294d3308f029a6a6d3c2effa579d5f69a91e418 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1223840 (cherry picked from commit cca44c3f010f15918cdd2259c15170ba1917828a) Reviewed-on: http://git-master/r/1233353 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: unify nvgpu and pci probe	Deepak Nibade	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have completely different versions of probe for nvgpu and pci device Extract out common steps into nvgpu_probe() function and separate it out in new file nvgpu_common.c Divide task of nvgpu_probe() into further smaller functions Do platform specific things (like irq handling, memresource management, power management) only in individual probes and then call nvgpu_probe() to complete the common initialization Move all debugfs initialization to common gk20a_debug_init() This also helps to bringup all debug nodes to pci device Pass debugfs_symlink name as a parameter to gk20a_debug_init() This allows us to set separate debugfs symlink for nvgpu and pci device In case of railgating, cde and ce debugfs, check if platform supports them or not Copy vidmem_is_vidmem from platform to mm structure and set it to true for pci device Return from gk20a_scale_init() if we don't have either of governor or qos_notifier Fix gk20a_alloc_debugfs_init() and gk20a_secure_page_alloc() to receive device pointer instead of platform_device Export gk20a_railgating_debugfs_init() so that we can call it from gk20a_debug_init() Jira DNVGPU-56 Jira DNVGPU-58 Change-Id: I3cc048082b0a1e57415a9fb8bfb9eec0f0a280cd Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1204207 (cherry picked from commit add6bb0a3d5bd98131bbe6f62d4358d4d722b0fe) Reviewed-on: http://git-master/r/1204462 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: Use spin_lock for jobs_lock	Bharat Nihalani	2016-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is done to boost performance of the GPU submit time, which is critical for compute use-cases. Bug 200215465 Bug 1804898 Conflicts: drivers/gpu/nvgpu/gk20a/channel_gk20a.c Change-Id: Ic4884ee4eac910b92b84a47fdc1b2e9f26b2f1f0 Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/1199860 Reviewed-on: http://git-master/r/1209834 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add nvgpu infra to allow kernel to create privileged CE channels	Lakshmanan M	2016-07-20
\| \| \| \| \| \| \| \| \| \| \| \| \|	Added interface to allow kernel to create privileged CE channels for page migration and clearing support between sysmem and videmem. JIRA DNVGPU-53 Change-Id: I3e18d18403809c9e64fa45d40b6c4e3844992506 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1173085 GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: use vidmem by default in gmmu_alloc variants	Konsta Holtta	2016-07-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For devices that have vidmem available, use the vidmem allocator in gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem. Because all of the buffers haven't been tested to work in vidmem yet, rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at the end to declare explicitly that vidmem is used. Enabling vidmem for each now is a matter of removing "_sys" from the function call. Jira DNVGPU-18 Change-Id: Ibe42f67eff2c2b68c36582e978ace419dc815dc5 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1176805 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support in-kernel vidmem mappings	Konsta Holtta	2016-07-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that buffers represented as a mem_desc and present in vidmem can be mapped to gpu. JIRA DNVGPU-18 JIRA DNVGPU-76 Change-Id: I46cf87e27229123016727339b9349d5e2c835b3e Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1169308 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use device instead of platform_device	Terje Bergstrom	2016-04-08
\| \| \| \| \| \| \| \| \|	Use struct device instead of struct platform_device wherever possible. This allows adding other bus types later. Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120466
*	gpu: nvgpu: create sync_fence only if needed	Deepak Nibade	2015-12-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we create sync_fence (from nvhost_sync_create_fence()) for every submit But not all submits request for a sync_fence. Also, nvhost_sync_create_fence() API takes about 1/3rd of the total submit path. Hence to optimize, we can allocate sync_fence only when user explicitly asks for it using (NVGPU_SUBMIT_GPFIFO_FLAGS_FENCE_GET && NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE) Also, in CDE path from gk20a_prepare_compressible_read(), we reuse existing fence stored in "state" and that can result into not returning sync_fence_fd when user asked for it Hence, force allocation of sync_fence when job submission comes from CDE path Bug 200141116 Change-Id: Ia921701bf0e2432d6b8a5e8b7d91160e7f52db1e Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/812845 (cherry picked from commit 5fd47015eeed00352cc8473eff969a66c94fee98) Reviewed-on: http://git-master/r/837662 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: remove temporary gpfifo allocation in submit path	Deepak Nibade	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In GPU job submit path gk20a_ioctl_channel_submit_gpfifo(), we currently allocate a temporary gpfifo, copy user space gpfifo content into this temporary buffer, and then copy temp buffer content into channel's gpfifo. Allocation/copy/free of temporary buffer adds additional overhead Rewrite this sequence such that gk20a_submit_channel_gpfifo() can receive either a pre-filled gpfifo or pointer to user provided args. And then we can direclty copy the user provided gpfifo into the channel's gpfifo Also, if command buffer tracing is enabled, we still need to copy user provided gpfifo into temporaty buffer for reading But that should not cause overhead in real world use case Bug 200141116 Change-Id: I7166c9271da2694059da9853ab8839e98457b941 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/823386 (cherry picked from commit 3e0702db006c262dd8737a567b8e06f7ff005e2c) Reviewed-on: http://git-master/r/835799 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use a separate big vm for cde	Konsta Holtta	2015-11-11
\| \| \| \| \| \| \| \| \| \| \| \| \|	Allocate a separate VM for CDE channels instead of using the system (PMU) vm, and make it much bigger than the PMU's to fit the maximum number of CDE channels there. Bug 1566740 Change-Id: I4f487c40c9ec79cc9ffb880b0ecd3f47eb450336 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/815149 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: fix 4k compression	Jussi Rasanen	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add CPU dcache flush after populating scatterBuffer so that the GPU will see the buffer contents. Bug 1679453 Change-Id: I564394ed1fcff4d08d753e753bd3243b460d76df Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/805197 (cherry picked from commit d6a5513745aa77c84ac5408a62f72f24839ef439) Reviewed-on: http://git-master/r/808246 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add CDE bits in FECS header	sujeet baranwal	2015-09-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case of CDE channel, T1 (Tex) unit needs to be promoted to 128B aligned, otherwise causes a HW deadlock. Gpu driver makes changes in FECS header which FECS uses to configure the T1 promotions to aligned 128B accesses. Bug 200096226 Change-Id: I8a8deaf6fb91f4bbceacd491db7eb6f7bca5001b Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/804625 Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add support for CDE scatter buffers	Jussi Rasanen	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for CDE scatter buffers. When the bus addresses for surfaces are not contiguous as seen by the GPU (e.g., when SMMU is bypassed), CDE swizzling needs additional per-page information. This information is populated in a scatter buffer when required. Bug 1604102 Change-Id: I3384e2cfb5d5f628ed0f21375bdac8e36b77ae4f Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/789436 Reviewed-on: http://git-master/r/791243 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Revert "gpu: nvgpu: Add CDE bits in FECS header"	Terje Bergstrom	2015-09-24
\| \| \| \| \| \| \| \|	This reverts commit 882975f7f1b4e050be79b0a047a2daa8b53a9187. Change-Id: I4940fc9f7a837840be1ea8e42d58d603235d88d5 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/804616
*	gpu: nvgpu: Add CDE bits in FECS header	sujeet baranwal	2015-09-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case of CDE channel, T1 (Tex) unit needs to be promoted to 128B aligned, otherwise causes a HW deadlock. Gpu driver makes changes in FECS header which FECS uses to configure the T1 promotions to aligned 128B accesses. Bug 200096226 Change-Id: Ic006b2c7035bbeabe1081aeed968a6c6d11f9995 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/802327 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>