nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: vgpu: profiler reservation support	Peter Daifuku	2017-03-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support for hwpm reservations in the virtual case: - Add session ops for checking and setting global and context reservations, and releasing reservations - in the native case, these just update reservation counts and flags - in the vgpu case, when the reservation count is 0, check with the RM server that a reservation is possible: for global reservations, no other guest can have a reservation; for context reservations, no other guest can have a global reservation - in the vgpu case, when the reservation count is decremented to 0, notify the RM server that the guest no longer has any reservations Bug 1775465 JIRA VFND-3428 Change-Id: Idf115b730e465e35d0745c96a8f8ab6b645c7cae Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1323375 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add bus HAL	Terje Bergstrom	2017-03-23
\| \| \| \| \| \| \| \| \| \| \|	Add bus HAL and move all bus related hardware sequencing to that file: BAR1 binding, timer access, and interrupt handling. Change-Id: Ibc5f5797dc338de10749b446a7bdbcae600fecb4 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1323353 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: profiler create/free, hwpm reserve	Peter Daifuku	2017-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for creating/freeing profiler objects, hwpm reservations Bug 1775465 JIRA EVLR-680 JIRA EVLR-682 Change-Id: I4db83d00e4b0b552b05b9aae96dc553dd1257d88 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1294401 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add refcounting to driver fds	David Nieto	2017-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main driver structure is not refcounted properly, so when the driver unload, file desciptors associated to the driver are kept open with dangling references to the main object. This change adds referencing to the gk20a structure. bug 200277762 JIRA: EVLR-1023 Change-Id: Id892e9e1677a344789e99bf649088c076f0bf8de Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: http://git-master/r/1317420 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Move all FB programming to FB HAL	Terje Bergstrom	2017-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move all programming of FB to fb_*.c files, and remove the inclusion of FB hardware headers from other files. TLB invalidate function took previously a pointer to VM, but the new API takes only a PDB mem_desc, because FB does not need to know about higher level VM. GPC MMU is programmed from the same function as FB MMU, so added dependency to GR hardware header to FB. GP106 ACR was also triggering a VPR fetch, but that's not applicable to dGPU, so removed that call. Change-Id: I4eb69377ac3745da205907626cf60948b7c5392a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1321516 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: vgpu: force gpu preepmtion policy	Aparna Das	2017-03-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Query the RM server to retrieve gpu preemption policy of guest based on pct configuration. If guest is not allowed to request wfi preemption mode then set context with either gfxp or cta preemption mode only. Jira VFND-3079 Jira VFND-3081 Change-Id: I60cbf121d6f0e2373568cf40b3dfdb4df76fe02d Signed-off-by: Aparna Das <aparnad@nvidia.com> Reviewed-on: http://git-master/r/1280903 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: Sachit Kadle <skadle@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: vgpu: add clear single SM error state	Thomas Fleury	2017-03-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for clearing single SM error state for CUDA debugger. In addition to clearing local copy of SM error state, vgpu_gr_clear_sm_error_state now sends a command to RM server (TEGRA_VGPU_CMD_CLEAR_SM_ERROR_STATE), to clear global ESR and warp ESR. Bug 1791111 Change-Id: I3a1f0644787fd900ec59a0e7974037d46a603487 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1296311 (cherry picked from commit fd07e03c3d086f396e4d65575c576a4dd68c920a) Reviewed-on: http://git-master/r/1299060 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Cory Perry <cperry@nvidia.com> Tested-by: Cory Perry <cperry@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: suspend/resume contexts	Thomas Fleury	2017-03-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add ability to suspend/resume contexts for a debug session (NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS), in virtualized case: - added hal function to resume contexts. - added vgpu support for suspend contexts, i.e. build a list of channel ids, and send TEGRA_VGPU_CMD_SUSPEND_CONTEXTS - added vgpu support for resume contexts, i.e. build a list of channel ids, and send TEGRA_VGPU_CMD_RESUME_CONTEXTS Bug 1791111 Change-Id: Icc1c00d94a94dab6384ac263fb811c00fa4b07bf Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1294761 (cherry picked from commit d17a38eda312ffa92ce92e5bafc30727a8b76c4e) Reviewed-on: http://git-master/r/1299059 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Cory Perry <cperry@nvidia.com> Tested-by: Cory Perry <cperry@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add NVGPU_GPU_FLAGS_SUPPORT_MAP_COMPBITS	Richard Zhao	2017-03-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	native gpu driver supports map compbits but vgpu does not. Bug 1778448 Bug 200275051 JIRA VFND-3513 Change-Id: I433a6f8631b495875ba899af9609203ab36187ef Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1314065 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: vgpu: remove cycle stats from characteristics	Richard Zhao	2017-03-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vgpu does not support cycle stats. Bug 1781434 Bug 200275051 JIRA VFND-3513 Change-Id: Id1d566027913632fc8a4f0285609be5f56b26288 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1314064 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: kmem abstraction and tracking	Alex Waterman	2017-03-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement kmem abstraction and tracking in nvgpu. The abstraction helps move nvgpu's core code away from being Linux dependent and allows kmem allocation tracking to be done for Linux and any other OS supported by nvgpu. Bug 1799159 Bug 1823380 Change-Id: Ieaae4ca1bbd1d4db4a1546616ab8b9fc53a4079d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1283828 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Remove nvgpu_gpuid_t18x.h	Alex Waterman	2017-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove nvgpu_gpuid_t18x.h since this file is now visible. Migrate the relevant definitions and defines into their expected places and make the code use the real defines. No longer is hiding t18x specific stuff necessary. Bug 1799159 Change-Id: I47fa2392e46fdb7aacc70aeb0cc8c3f5ca0dc22f Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1300976 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add and move FECS trace flag	vishnu Reddy	2017-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added fecs trace characteristic flag to vgpu to determine for platform support Also moved fecs flag from ctxsw init to fecs init for native linux. This makes flag init uniform across both platforms. JIRA EVLR-993 Change-Id: Id62f31954cb5d7028ef01a199548f5f908b51eb0 Signed-off-by: Vishnu Reddy Mandalapu <vmandalapu@nvidia.com> Reviewed-on: http://git-master/r/1305045 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add worker for watchdog and job cleanup	Konsta Holtta	2017-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement a worker thread to replace the delayed works in channel watchdog and job cleanups. Watchdog runs by polling the channel states periodically, and job cleanup is performed on channels that are appended on a work queue consumed by the worker thread. Handling both of these two in the same thread makes it impossible for them to cause a deadlock, as has previously happened. The watchdog takes references to channels during checking and possibly recovering channels. Jobs in the cleanup queue have an additional reference taken which is released after the channel is processed. The worker is woken up from periodic sleep when channels are added to the queue. Currently, the queue is only used for job cleanups, but it is extendable for other per-channel works too. The worker can also process other periodic actions dependent on channels. Neither the semantics of timeout handling or of job cleanups are yet significantly changed - this patch only serializes them into one background thread. Each job that needs cleanup is tracked and holds a reference to its channel and a power reference, and timeouts can only be processed on channels that are tracked, so the thread will always be idle if the system is going to be suspended, so there is currently no need to explicitly suspend or stop it. Bug 1848834 Bug 1851689 Bug 1814773 Bug 200270332 Jira NVGPU-21 Change-Id: I355101802f50841ea9bd8042a017f91c931d2dc7 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1297183 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: ignore set preempt mode to default	Thomas Fleury	2017-02-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In native case, attempting to set graphics/compute preempt mode to default is ignored if preemption mode has already been set for the context. In virtualized case an error is currently returned. Align behaviour for native and virtualized case, by ignoring such request. Bug 200186530 Change-Id: Ieb3a37107bdbd3284804ee9fd392786409224082 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1306506 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: remove use of DEFINE_MUTEX()	Deepak Nibade	2017-02-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	API DEFINE_MUTEX() is defined in Linux and might not be available in other OSs. Hence remove its usage from nvgpu Declare and explicitly initialize below mutexes for both nvgpu and vgpu g->mm.priv_lock g->mm.tlb_lock Jira NVGPU-13 Change-Id: If72885a6da0227a1552303206172f1f2b751471d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1298042 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: use common nvgpu mutex/spinlock APIs	Deepak Nibade	2017-02-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of using Linux APIs for mutex and spinlocks directly, use new APIs defined in <nvgpu/lock.h> Replace Linux specific mutex/spinlock declaration, init, lock, unlock APIs with new APIs e.g struct mutex is replaced by struct nvgpu_mutex and mutex_lock() is replaced by nvgpu_mutex_acquire() And also include <nvgpu/lock.h> instead of including <linux/mutex.h> and <linux/spinlock.h> Add explicit nvgpu/lock.h includes to below files to fix complilation failures. gk20a/platform_gk20a.h include/nvgpu/allocator.h Jira NVGPU-13 Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1293187 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add NVGPU_IOCTL_CHANNEL_SET_BOOSTED_CTX	Peter Boonstoppel	2017-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This ioctl can be used on gp10b to set a flag in the context header indicating this context should be run at elevated clock frequency. FECS ctxsw ucode will read this flag as part of the context switch and will request higher GPU clock frequencies from BPMP for the duration of the context execution. Bug 1819874 Change-Id: I84bf580923d95585095716d49cea24e58c9440ed Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com> Reviewed-on: http://git-master/r/1292746 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: remove call to invalidate tlb	Aparna Das	2017-02-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Guest doesn't explicitly send command to the RM server to invalidate tlb which is done implicitly when mapping or unmapping buffer. Remove support for this call. Bug 1665111 Change-Id: Icf2edae7feffa35b1dbf87c227b3e98b506e6519 Signed-off-by: Aparna Das <aparnad@nvidia.com> Reviewed-on: http://git-master/r/1287728 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Organize semaphore_gk20a.[ch]	Alex Waterman	2017-02-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move semaphore_gk20a.c drivers/gpu/nvgpu/common/ since the semaphore code is common to all chips. Move the semaphore_gk20a.h header file to drivers/gpu/nvgpu/include/nvgpu and rename it to semaphore.h. Also update all places where the header is inluced to use the new path. This revealed an odd location for the enum gk20a_mem_rw_flag. This should be in the mm headers. As a result many places that did not need anything semaphore related had to include the semaphore header file. Fixing this oddity allowed the semaphore include to be removed from many C files that did not need it. Bug 1799159 Change-Id: Ie017219acf34c4c481747323b9f3ac33e76e064c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1284627 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Simplify ref-counting on VMs	Alex Waterman	2017-02-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simplify ref-counting on VMs: take a ref when a VM is bound to a channel and drop a ref when a channel is freed. Previously ref-counts were scattered over the driver. Also the CE and CDE code would bind channels with custom rolled code. This was because the gk20a_vm_bind_channel() function took an as_share as the VM argument (the VM was then inferred from that as_share). However, it is trivial to abtract that bit out and allow a central bind channel function that just takes a VM and a channel. Bug 1846718 Change-Id: I156aab259f6c7a2fa338408c6c4a3a464cd44a0c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1261886 Reviewed-by: Richard Zhao <rizhao@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Remove separate fixed address VMA	Alex Waterman	2017-01-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the special VMA that could be used for allocating fixed addresses. This feature was never used and is not worth maintaining. Bug 1396644 Bug 1729947 Change-Id: I06f92caa01623535516935acc03ce38dbdb0e318 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1265302 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Unify the small and large page address spaces	Alex Waterman	2017-01-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The basic structure of this patch is to make the small page allocator and the large page allocator into pointers (where they used to be just structs). Then assign each of those pointers to the same actual allocator since the buddy allocator has supported mixed page sizes since its inception. For the rest of the driver some changes had to be made in order to actually support mixed pages in a single address space. 1. Unifying the allocation page size determination Since the allocation and map operations happen at distinct times both mapping and allocation of GVA space must agree on page size. This is because the allocation has to separate allocations into separate PDEs to avoid the necessity of supporting mixed PDEs. To this end a function __get_pte_size() was introduced which is used both by the balloc code and the core GPU MM code. It determines page size based only on the length of the mapping/ allocation. 2. Fixed address allocation + page size Similar to regular mappings/GVA allocations fixed address mapping page size determination had to be modified. In the past the address of the mapping determined page size since the address space split was by address (low addresses were small pages, high addresses large pages). Since that is no longer the case the page size field in the reserve memory ioctl is now honored by the mapping code. When, for instance, CUDA makes a memory reservation it specifies small or large pages. When CUDA requests mappings to be made within that address range the page size is then looked up in the reserved memory struct. Fixed address reservations were also modified to now always allocate at a PDE granularity (64M or 128M depending on large page size. This prevents non-fixed allocations from ending up in the same PDE and causing kernel panics or GMMU faults. 3. The rest... The rest of the changes are just by products of the above. Lots of places required minor updates to use a pointer to the GVA allocator struct instead of the struct itself. Lastly, this change is not truly complete. More work remains to be done in order to fully remove the notion that there was such a thing as separate address spaces for different page sizes. Basically after this patch what remains is cleanup and proper documentation. Bug 1396644 Bug 1729947 Change-Id: If51ab396a37ba16c69e434adb47edeef083dce57 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1265300 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: retrieve gpu load	Aparna Das	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support to send command to RM server to retrieve GPU load. Bug 200261903 Change-Id: Ie3d0ba7ec91317e9a2911f71613ad78d20f9c1fb Signed-off-by: Aparna Das <aparnad@nvidia.com> Reviewed-on: http://git-master/r/1275045 (cherry picked from commit 5a6c1de1e6997bfd803b4b95b3e44e282ba32f67) Reviewed-on: http://git-master/r/1283279 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: use HAL to set TSG timeslice	Thomas Fleury	2017-01-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Setting timeslice for virtualized case was not effective, because both ioctls NVGPU_TSG_IOCTL_SET_TIMESLICE and NVGPU_SCHED_IOCTL_TSG_SET_TIMESLICE were calling the native function to set TSG timeslice. - Fixed wrapper function to call HAL - Defined HAL function for "native" set TSG timeslice - Also, properly update timeout_us in TSG context, in virtualized case. This change also moves the min/max bounds checking for tsg timeslice into the native function implementation. There is no sysfs node for these parameters for vgpu, as RM server is ultimately responsible for this check. Bug 200263575 Change-Id: Ibceab9427561ad58ec28abfff0c96ca8f592bdb9 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1283180 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Move gp10b HW headers	Alex Waterman	2017-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the gp10b HW headers to a new directory specially for them: include/nvgpu/hw/gp10b And change the code to include like so: #include <nvgpu/hw/gp10b/hw_fb_gp10b.h> This is part of the process to restructure the nvgpu driver. Bug 1799159 Change-Id: Ic80ea5b7f5c280839e502e2178a345181f7a7ef9 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1280326 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Start re-organizing the HW headers	Alex Waterman	2017-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reorganize the HW headers of gk20a. The headers are moved to a new directory: include/nvgpu/hw/gk20a And from the code are included like so: #include <nvgpu/hw/gk20a/hw_pwr_gk20a.h> This is the first step in reorganizing all of the HW headers for gm20b, gm206, etc. This is part of a larger effort to re-structure and make the driver more readable and scalable. Bug 1799159 Change-Id: Ic151155cbc2e6f75009f2d9d597b364a1bed2c4c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1244790 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Move allocators to common/mm/	Alex Waterman	2017-01-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the GPU allocators to common/mm/ since the allocators are common code across all GPUs. Also rename the allocator code to move away from gk20a_ prefixed structs and functions. This caused one issue with the nvgpu_alloc() and nvgpu_free() functions. There was a function for allocating either with kmalloc() or vmalloc() depending on the size of the allocation. Those have now been renamed to nvgpu_kalloc() and nvgpu_kfree(). Bug 1799159 Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1274400 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: vgpu: receive event TEGRA_VGPU_EVENT_SM_ESR	Richard Zhao	2017-01-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- allocate gr.sm_error_state - handle event of sm error state - add callback of clear sm error state JIRA VFND-3291 Bug 200257899 Change-Id: I49b9437013e8c65290750b7fe21fc6819ea93b1c Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1278397 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: restructure event handling	Richard Zhao	2017-01-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Take interrupts as one kind of event message, and make it easier to add new kind of events. JIRA VFND-3291 Bug 200257899 Change-Id: I83482293230c0aa10b05caf61e249a042bf6653c Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1278396 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: no support for sparse mapping	Aparna Das	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently sparse mapping is not supported for gp10b in virtualized environment. Modify gpu characteristics to reflect non-implementation of this functionality. Also fix return value in vgpu_gp10b_locked_gmmu_map() on error condition. Bug 200243373 Change-Id: Ia367b923b87738a5cad0617cdb074f5a24fb1c81 Signed-off-by: Aparna Das <aparnad@nvidia.com> Reviewed-on: http://git-master/r/1269710 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Sachit Kadle <skadle@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: vgpu: fix va leak when call gk20a_vm_free_va	Richard Zhao	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	page size index needs to be set explicitly when call gk20a_vm_free_va. Bug 200255799 JIRA VFND-3033 Change-Id: Ic23ea68905ea423173d1859fd100e7b2c82a1bcc Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1262590 (cherry picked from commit 918aea147b395f7337db348d2616fb4b195dc53a) Reviewed-on: http://git-master/r/1263400 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: vgpu: add set_preemption_mode	Richard Zhao	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	Implement HAL callback set_preemption_mode Bug 200238497 JIRA VFND-2683 Change-Id: I8fca8e1ba112d8782ce18f0899eca38a1d12b512 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1236976 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: move to use vgpu_get_handle helper function	Richard Zhao	2016-12-27
\| \| \| \| \| \| \| \| \| \|	JIRA VFND-2103 Change-Id: Ic11cff40e64849cb6abb193bec54d03857433416 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1185205 GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: gp10x: support in-kernel vidmem mappings	Konsta Holtta	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that buffers represented as a mem_desc and present in vidmem can be mapped to gpu. JIRA DNVGPU-18 JIRA DNVGPU-76 Change-Id: Icd675e83e3c28836f0ed8880425748697713bb0a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1169296 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: Add CE engine to engine list	Terje Bergstrom	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	Initialize CE engine also for gp10b. Change-Id: Ibce2f80b523a09fb1345995c03c5430f3b20844f Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1170453 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Tested-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Lakshmanan M <lm@nvidia.com>
*	gpu: nvgpu: vgpu: manage gr_ctx as independent resource	Richard Zhao	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gr_ctx will managed as independent resource in RM server and vgpu can get a gr_ctx handle. Bug 1702773 Change-Id: Ifceb44b7d9a1ba03fc2a4df847f4a78ac4c4a0d4 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1144934 (cherry picked from commit 0da3101d9b59fe1f9a47ce7b70b30cb8919f35ac) Reviewed-on: http://git-master/r/1150707 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: API to set preemption mode	Deepak Nibade	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Separate out new API gr_gp10b_set_ctxsw_preemption_mode() which will check requested preemption modes and take appropriate action for each preemption mode This API will also do some sanity checking for valid preemption modes and combinations Define API set_preemption_mode() for gp10b which will set the preemption modes passed as argument and then use gr_gp10b_set_ctxsw_preemption_mode() and update_ctxsw_preemption_mode() to update preemption mode Legacy path from gr_gp10b_alloc_gr_ctx() will convert flags NVGPU_ALLOC_OBJ_FLAGS_* into appropriate preemption modes and then call gr_gp10b_set_ctxsw_preemption_mode() New API set_preemption_mode() will use new flags NVGPU_GRAPHICS/COMPUTE_PREEMPTION_MODE_* and set and update ctxsw preemption mode In gr_gp10b_update_ctxsw_preemption_mode(), update graphics context to set CTA premption mode if mode NVGPU_COMPUTE_PREEMPTION_MODE_CTA is set Also, define preemption modes in nvgpu-t18x.h and use them everywhere Remove old definitions of modes from gr_gp10b.h Bug 1646259 Change-Id: Ib4dc1fb9933b15d32f0122a9e52665b69402df18 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1131806 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gp10b: Allow importing makefile via include	Terje Bergstrom	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	Refactor makefiles so that there is one makefile, and that file can be included in the main nvgpu build. Bug 1476801 Change-Id: I23ac451d695fc64064de2300e83b9d9487c52743 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1028353 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: vgpu: fix sparse warnings	Richard Zhao	2016-12-27
\| \| \| \| \| \| \| \| \| \| \|	Bug 200088648 Change-Id: I61be7b4787e9bc9bac310a8739977f43c38a67ee Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1000174 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: update interface to free GR ctx	Aingara Paramakuru	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The server only releases ownership of the ctxsw buffer mappings after the GR ctx has been released. Update the sequence to account for this. JIRA VFND-1117 Bug 1708163 Change-Id: I3aed015805b4ca51433e7d37ad32de2f8353999f Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/922817 Reviewed-by: Richard Zhao <rizhao@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: gp10b: map GfxP buffers as GPU cacheable	Aingara Paramakuru	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some of the allocated buffers are used during normal graphics processing. Mark them as GPU cacheable to improve performance. Bug 1695718 Change-Id: I71d5d1538516e966526abe5e38a557776321597f Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/827087 (cherry picked from commit 60b40ac144c94e24a2c449c8be937edf8865e1ed) Reviewed-on: http://git-master/r/828493 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: set correct page size index for gp10b	Richard Zhao	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	VM server only know big page and small page, so convert gmmu_page_size_kernel to according page size index. JIRA VFND-890 Change-Id: Id1f932752b8ca33d14635ac9d71019364aa89dc4 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/816359 (cherry picked from commit 5bfc4a2a55889f5457bd34aa06861c042ee67421) Reviewed-on: http://git-master/r/827131 GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: vgpu: add interface to alloc ctxsw buffers	Aingara Paramakuru	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gp10b introduces support for preemption (GfxP and CILP). Add a new interface to allow allocating buffers needed to support this functionality. Bug 1677153 Change-Id: I8578a7b0a4327f3496d852eeb8be5fc778e2c225 Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/806963 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: http://git-master/r/817039 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: add gp10b support	Aingara Paramakuru	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for gp10b in a virtualized environment. Bug 1677153 VFND-693 Change-Id: I919ffa44c6773940a7a3411ee8bbc403a992b7cb Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/792556 Reviewed-on: http://git-master/r/806193 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: acquire mutex for notifier read	Deepak Nibade	2016-12-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We use &ch->error_notifier_mutex to protect writes and free of error notifier But we currently do not protect reading of notifier in gk20a_fifo_set_ctx_mmu_error() and vgpu_fifo_set_ctx_mmu_error() Add new API gk20a_set_error_notifier_locked() which is same as gk20a_set_error_notifier() but without the locks. In *_fifo_set_ctx_mmu_error() APIs, acquire the mutex explicitly, and then use this new API gk20a_set_error_notifier() will now just call gk20a_set_error_notifier_locked() within a mutex Bug 1824788 Bug 1844312 Change-Id: I1f3831dc63fe1daa761b2e17e4de3c155f505d6f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1273471 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Richard Zhao <rizhao@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: gk20a: Allow regops lists longer than 128	Sami Kiminki	2016-12-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Process long regops lists in 4-kB fragments, overcoming the overly low limit of 128 reg ops per IOCTL call. Bump the list limit to 1024 and report the limit in GPU characteristics. Bug 200248726 Change-Id: I3ad49139409f32aea8b1226d6562e88edccc8053 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/1253716 (cherry picked from commit 22314619b28f52610cb8769cd4c3f9eb01904eab) Reviewed-on: http://git-master/r/1266652 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: replace tsg list mutex with rwsem	Konsta Holtta	2016-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Lock only for modifications to the tsg channel list, and allow multiple concurrent readers. Bug 1848834 Bug 1814773 Change-Id: Ie3938d4239cfe36a14211f4649ce72b7fc3e2fa4 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1269579 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add tsg_open HAL interface	Sachit Kadle	2016-12-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add HAL interface for TSG open, which is intended to be called from the exisiting gk20a_tsg_open function. The tsg_open entryoint is only implemented for vgpu, as the server needs to clear metadata when a tsg is opened. Bug 200215060 Change-Id: Icc8fd602f31e52d9fa9b2e7786b665b9e7b9294e Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1249218 (cherry picked from commit 35c86f7c796c6574d3dc336e20012ea5c16d7cb4) Reviewed-on: http://git-master/r/1256468 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: vgpu: fix va leak when call gk20a_vm_free_va	Richard Zhao	2016-12-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	page size index needs to be set explicitly when call gk20a_vm_free_va. Bug 200255799 JIRA VFND-3033 Change-Id: I376c63e724b8f59aee389c54ca1589683536f043 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1262586 (cherry picked from commit 82c05633f17fa094d8e08c8a0fa4bad2d3275268) Reviewed-on: http://git-master/r/1263403 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>