nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: cyclestats mode E snapshots support	Leonid Moiseichuk	2015-06-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the kernel support code for cyclestats mode E. Cyclestats mode E is implemented in user space following the Windows design and requires the following operations: - attach a client to the shared hardware buffer of the device - detach a client from the shared hardware buffer - flush, which copies available data from the hardware buffer to private client buffers according to the perfmon IDs assigned to clients - perfmon ID management for user-space clients - a NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability added Bug 1573150 Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/740653 (cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f) Reviewed-on: http://git-master/r/753274 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use updated APIs to get/put syncpoint	Deepak Nibade	2015-06-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pass host1x device pointer for below APIs : nvhost_get_syncpt_host_managed() nvhost_syncpt_put_ref_ext() Also, compose a name for syncpoints in GPU driver itself. This name will be created as combination of device name and channel index Bug 1611482 Change-Id: Id1ddd0e87e0272ddb0758713d6b6c2544bc36bf4 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/751908 Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Tested-by: Arto Merilainen <amerilainen@nvidia.com>
*	Revert "Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space""""	Bharat Nihalani	2015-06-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since the issue seen with bug 200106514 is fixed with change http://git-master/r/#/c/752080/. Bug 200112195 Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/752503 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix invalid GPFIFO entries	Alex Waterman	2015-06-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the addition of the buddy allocator, push buffers are often allocated by the kernel in high GVA memory regions. These addresses, when written into a GPFIFO entry, have bits set in entry1 of the GPFIFO command. As a result, if no length is set, then these address bits will be interpreted as opcodes by the GPU. The bug fixed by this patch was caused by a wait_cmd being inserted into the GPFIFO with an address of a pushbuffer above 4GB and a zero length. This occurred because the code that creates the wait_cmd was able to return what appeared to be a valid priv_cmd_entry even though there was nothing in that command. This bug does not appear before the buddy allocator because the FFF allocator always starts allocating from low addresses. As such, when a channel's GPFIFO is allocated it gets an address below 32 bits. Then, because no higher address bits are set, entry1 of the GPFIFO is simply 0 and the GPU treats the command as a no-op. Change-Id: I9c1e600c368b55626e99f6f712f1821148bbb76d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/752080 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
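The failure mode described in this commit message can be illustrated with a small user-space sketch. The two-word layout used here (low address bits in entry0, high address bits sharing entry1 with a length field) follows the description in the message, but the exact field position `LENGTH_SHIFT` and the names `gpfifo_entry` and `make_gpfifo_entry` are hypothetical and not taken from the driver. The sketch shows one possible guard: refuse to build an entry for an empty command, so entry1 never carries address bits without a length.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word GPFIFO entry; the field positions are assumed. */
struct gpfifo_entry {
	uint32_t entry0; /* low 32 bits of the push buffer GPU VA */
	uint32_t entry1; /* high VA bits plus the length field */
};

#define LENGTH_SHIFT 10u /* assumed position of the length field */

/*
 * Fill *e for a push buffer at gpu_va holding `words` command words.
 * Returns false, leaving *e untouched, when the command is empty, so the
 * caller never queues an entry whose entry1 holds only address bits.
 */
static bool make_gpfifo_entry(struct gpfifo_entry *e, uint64_t gpu_va,
			      uint32_t words)
{
	if (words == 0)
		return false;

	e->entry0 = (uint32_t)(gpu_va & 0xffffffffu);
	e->entry1 = (uint32_t)(gpu_va >> 32) | (words << LENGTH_SHIFT);
	return true;
}
```

With a push buffer below 4GB the high word contributes nothing to entry1, which is why the empty command went unnoticed under the older allocator.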
*	Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space"""	Bharat Nihalani	2015-06-02
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since it causes GPU pbdma interrupt to be generated. Bug 200106514 Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/749291
*	gpu:nvgpu: update channel_setup_ramfc interface	Seshendra Gadagottu	2015-06-02
\| \| \| \| \| \| \| \| \| \| \| \| \|	Pass flags parameter to channel_setup_ramfc for indicating nvgpu_alloc_gpfifo_args characteristics. Bug 1645628 Change-Id: Ia40b37c5c7b208d459aa84f1b022036dd5e1b599 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/744526 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use HAL for waiting for GR quiet	Terje Bergstrom	2015-06-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a HAL for waiting for GR to become quiet. Use it for all cases where we require GR to be quiet, but where it does not need to be idle. Bug 1640378 Change-Id: Ic0222d595a2d049e0fa8864b069ab94a97fac143 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/745640 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: Support >32bit addresses in simulation	Terje Bergstrom	2015-06-01
\| \| \| \| \| \| \| \|	Change-Id: I96282b4e047ba8b5369dac039f0f51856c69235b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/747935 (cherry-picked from commit 0bb090745b4122fc4149b1bd6026138a1b9a32bc) Reviewed-on: http://git-master/r/749235
*	gpu: nvgpu: drop syncpoint refcount instead of direct free	Deepak Nibade	2015-06-01
\| \| \| \| \| \| \| \| \| \| \| \| \|	Drop host1x syncpoint refcount with nvhost_syncpt_put_ref_ext() instead of freeing it directly Bug 1646883 Change-Id: Ib213e58031a9302e683f8d13ebb4e1f913206464 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/747150 GVS: Gerrit_Virtual_Submit Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
*	gpu: nvgpu: fix channel leak with immediate close	Deepak Nibade	2015-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a GPU channel is closed immediately after opening without performing any operation on it, we leak that channel e.g. below command leaks a channel echo > /dev/nvhost-gpu Fix this leak by releasing the channel before returning Change-Id: I2598e3cabec6996cb1cf8066a1e6d7d5864ae02b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/743235 (cherry picked from commit 428771509b4431ebe88e38061b495cabc5192327) Reviewed-on: http://git-master/r/744279 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	Revert "Revert "gpu: nvgpu: New allocator for VA space""	Alex Waterman	2015-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8. The original commit was actually fine. Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/743300 (cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff) Reviewed-on: http://git-master/r/743301 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: wait for running jobs to finish before shutdown	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_pm_shutdown(), we currently call __pm_runtime_disable() which prevents h/w access to new requests made after shutdown() call Also, once gk20a_pm_shutdown() completes, platform code will just rail gate the GPU But it is possible that some other thread is already accessing h/w while shutdown() was triggered and this can result in hang Hence, wait until all currently executing jobs are finished before returning from gk20a_pm_shutdown() Also, we need to wait for GPU's usage count to become 1 since platform code will increase the usage count and then call shutdown(). Hence usage count of 1 indicates that GPU is idle Bug 200099940 Change-Id: I1f2457829e2737c07302d13f355353a30c3b4e67 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/734920 Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: add secure gpccs boot support	Vijayakumar	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 200080684 keeping it disabled by default also trimming the code by removing redundant variable to check recovery. pmu quick wait now checks only for irqs which are serviced by kernel. requests pmu to bit bang gpccs ucode. Change-Id: I12ef23d6d59b507e86a129b69eab65b21d0438c6 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/729622 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Optimize validate_fixed_buffer	Sami Kiminki	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \|	Function validate_fixed_buffer used to do a linear search for collision detection of already mapped buffers. Optimize this by doing a nice logarithmic search instead. Change-Id: Ifbf2ec015741d44883da27bc6f8cc090c48da145 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/739682 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
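The commit message does not say which data structure the logarithmic search runs over. As a rough sketch of the idea, the following uses a binary search over a sorted array of non-overlapping ranges, which stands in for whatever ordered structure the driver keeps. The names `va_range` and `range_collides` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A mapped buffer occupying the half-open GPU VA range [start, end). */
struct va_range {
	uint64_t start;
	uint64_t end;
};

/*
 * `mapped` must be sorted by start and free of overlaps. Returns true if
 * [start, end) overlaps any mapped range, using O(log n) comparisons.
 */
static bool range_collides(const struct va_range *mapped, size_t n,
			   uint64_t start, uint64_t end)
{
	size_t lo = 0, hi = n;

	/* Find the first mapped range whose end lies beyond `start`. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (mapped[mid].end <= start)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Only that range can be the lowest one overlapping [start, end). */
	return lo < n && mapped[lo].start < end;
}
```

Because the ranges do not overlap, their end addresses are sorted as well, which is what makes a single binary search sufficient.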
*	Revert "gpu: nvgpu: Fix gk20a shutdown issue"	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit c18edd3686115ca0b7d8bb08b35f23264f865358. Proper fixes for shutdown issue are being added with below changes http://git-master/r/#/c/738509/ http://git-master/r/#/c/734920/ Hence revert this workaround Bug 200099940 Change-Id: I74b29c804af2bdb9d95c6b93c5308a323575ae57 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/739082 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: jump to fail path if pm_runtime_get_sync() fails	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we execute pm_runtime_get_sync() and then gk20a_scale_notify_busy() without checking the return value of pm_runtime_get_sync(). If shutdown of the GPU is already initiated, we get a hard hang due to this as per below sequence : - one thread invokes GPU shutdown and then forcibly rail gates the GPU - another thread (unaware of shutdown) calls gk20a_busy() - since runtime PM is disabled in shutdown path, pm_runtime_get_sync() fails - but we still go on running gk20a_scale_notify_busy() which tries to access some GPU registers and hangs Fix this by jumping to failure path in case pm_runtime_get_sync() fails Bug 200099940 Change-Id: I022f2dfa9408f640fb44e6f4b10a437688779c0a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/738509 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
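The sequence above can be modelled in user space. Everything in this sketch is a stand-in: `fake_gpu`, `fake_pm_get_sync`, `fake_scale_notify_busy`, and `fake_gpu_busy` are hypothetical names that only imitate pm_runtime_get_sync(), gk20a_scale_notify_busy(), and gk20a_busy(). Dropping the usage count on the failure path is an assumption about what the fail path does; the commit message does not spell it out.

```c
#include <errno.h>

/* Hypothetical stand-in for the device state; not the driver's struct. */
struct fake_gpu {
	int pm_disabled;  /* set once shutdown has disabled runtime PM */
	int usage_count;
	int reg_accesses; /* counts simulated register accesses */
};

/* Models pm_runtime_get_sync(): bumps the count, fails if PM is disabled. */
static int fake_pm_get_sync(struct fake_gpu *g)
{
	g->usage_count++;
	return g->pm_disabled ? -EACCES : 0;
}

/* Models gk20a_scale_notify_busy(), which touches GPU registers. */
static void fake_scale_notify_busy(struct fake_gpu *g)
{
	g->reg_accesses++;
}

/* Models gk20a_busy() with the early exit on a failed power-up. */
static int fake_gpu_busy(struct fake_gpu *g)
{
	int ret = fake_pm_get_sync(g);

	if (ret < 0) {
		g->usage_count--; /* undo the reference taken above */
		return ret;       /* never reach the register access */
	}

	fake_scale_notify_busy(g);
	return 0;
}
```

The point of the model is the ordering: the register access is reachable only after the power-up call has reported success.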
*	gpu: nvgpu: use vzalloc for mm entries	Kerwin Wan	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the system is low on memory, kzalloc will fail if the kernel requests a contiguous memory block larger than PAGE_SIZE. Bug 200096099 Change-Id: I44e217ffa6aa6c453a4d4afba45a8ee3b5756cc1 Signed-off-by: Kerwin Wan <kerwinw@nvidia.com> Reviewed-on: http://git-master/r/732197 (cherry picked from commit 62861976421415f93e98a0a9f977ac1f66046714) Reviewed-on: http://git-master/r/737057 Reviewed-by: Krishna Reddy <vdumpa@nvidia.com> Tested-by: Krishna Reddy <vdumpa@nvidia.com>
*	gpu: nvgpu: protect missing sgl in gk20a_mem_phys	Konsta Holtta	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Return zero for missing sgl (sgt is already checked) instead of attempting to dereference NULL. Those NULL conditions should be almost nonexistent, and zero is not normally used. When reading gk20a_mem_phys() in gk20a_gr_get_chid_from_ctx() from an isr, the mem desc may race with channel deletion and get suddenly zeroed, even if the channel's in_use flag would be set. Plain zero results in expected behaviour. Change-Id: I7033979091951cba3e3004ddc7550cd327ad0baf Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/737759 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement compbits mapping	Sami Kiminki	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS, which enables mapping compbits to the GPU address space of said buffers. This, subsequently, enables moving comptag swizzling from GPU to CDEH/CDEV formats to userspace. Compbits mapping is conservative and it may map more than what is strictly needed. This is because two reasons: 1) mapping must be done on small page alignment (4kB), and 2) GPU comptags are swizzled all around the aggregate cache line, which means that the whole cache line must be visible even if only some comptag lines are required from it. Cache line size is not necessarily a multiple of the small page size. Bug 200077571 Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/719710 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
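The two reasons given for the conservative mapping can be turned into a short arithmetic sketch. It assumes the needed comptag data is described as a half-open byte range and that the cache line size is passed in as a plain byte count. The names `compbits_window` and `window` are hypothetical, and the driver's actual calculation may differ.

```c
#include <stdint.h>

#define SMALL_PAGE_SIZE 4096u /* 4kB small page, as stated in the message */

/* Half-open byte range [start, end) that ends up mapped. */
struct window {
	uint64_t start;
	uint64_t end;
};

/*
 * Widen the needed range [first, last_excl) in two steps: first to whole
 * cache lines (division is used because the cache line size need not be a
 * power of two), then to small page boundaries.
 */
static struct window compbits_window(uint64_t first, uint64_t last_excl,
				     uint64_t cacheline)
{
	struct window w;
	uint64_t cl_start = (first / cacheline) * cacheline;
	uint64_t cl_end = ((last_excl + cacheline - 1) / cacheline) * cacheline;

	w.start = cl_start & ~(uint64_t)(SMALL_PAGE_SIZE - 1);
	w.end = (cl_end + SMALL_PAGE_SIZE - 1) & ~(uint64_t)(SMALL_PAGE_SIZE - 1);
	return w;
}
```

With a made-up cache line size of 6144 bytes, a request for bytes 7000 to 7100 widens to one full cache line and then to the two pages covering it.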
*	gpu: nvgpu: tegra gpu to emc frequency mapping	Anders Kugler	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	o emc clock scaling (bug fix): Take the gpu load into account for gpu frequencies less than or equal to fmax @ Vmin. Bug 1591643 Change-Id: I0298adfdd4b7111557907c3bd6022fd6005355f0 Signed-off-by: Anders Kugler <akugler@nvidia.com> Reviewed-on: http://git-master/r/735846 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Dynamic betacb size	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	Allow querying and setting default betacb size via debugfs. For global buffers the value takes effect upon first boot of GPU, and has no effect after that. Bug 1628352 Change-Id: Ib63f4299249c41eab1b36cc501b525cc54211195 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/733328
*	gpu: nvgpu: Fix gk20a shutdown issue	Sumit Singh	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With CONFIG_PM_GENERIC_DOMAINS_OF enabled, device reboot was getting hung while shutting-down gk20a. It was happening because genpd_dev_pm_detach() was railgating gk20a while other thread was still accessing it. So, assigning NULL to dev->pm_domain->detach for gk20a, so that genpd_dev_pm_detach() is not called during gk20a shutdown, which will not railgate it. This patch will be reverted once we have clean shutdown for gk20a. Bug 200070810 Bug 200099940 Change-Id: Ie2e89ea01a98a9d4f2f68a3ab07b6923ffa374f6 Signed-off-by: Sumit Singh <sumsingh@nvidia.com> Reviewed-on: http://git-master/r/735455 Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Combine delays with GK20A parameters	Alex Frid	2015-05-18
\| \| \| \| \| \| \| \| \| \|	Specified locking timeout and IDDQ exit delay as GK20A PLL parameters, and used this data instead of hard-coded numbers. Change-Id: I59e16ed11fdba6911f2751195d182e68aed96851 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/735481 Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: fix setting gr_pd_ab_dist_cfg1_r()	David Li	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	gr_*__set_alpha_circular_buffer_size() left max_batches field of gr_pd_ab_dist_cfg1_r as 0 which results in too many alpha beta transitions and poor performance when tessellation or geometry shaders are used Change-Id: If18feb1119e9672005455155dc56337cd444a1f1 Signed-off-by: David Li <davli@nvidia.com> Reviewed-on: http://git-master/r/735476 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: dbg level for per-write ctx patch msg	Konsta Holtta	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	The message "per-write ctx patch begin?" is a legacy message for warning about probably inefficient code, but it's written at error loglevel. Silence it out a bit by using gk20a_dbg_info(). The inefficient paths can be fixed later. Bug 200075565 Change-Id: Idae821aef3001ea5016de22a1a87fec747c42d31 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/734248
*	gpu: nvgpu: check sync existence in channel update	Konsta Holtta	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The channel sync object can get deleted before all channel updates have finished if the channel is freed before them, so work around a null dereference by testing if the sync exists. Channel and/or c->sync refcounting would be necessary for proper fix. Bug 200076344 Change-Id: Ica8ef2df9cd95cfa593cd4f41768dbb6641357b2 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/734266 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: updated gpmu interface data struct.	Mahantesh Kumbar	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- pmu version 19494277 is from CL 19495746 - updated gpmu interface data struct with respect to latest pmu ucode interface headers. gpmuifpg.h - 19199047 gpmuifperfmon.h - 18238819 gpmuifpmu.h - 19199047 gpmuifacr.h - 19343196 gpmuifcmn.h - 19264862 rmflcnbl.h - 19317152 Bug 200085428 Change-Id: I7db56dcf5a3038b40da37a69e8723a2e9a652e4b Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/728461 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Do not leak ACR header	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	4b6f83704f054f5b21e05873fa5862c667a9992e tried to fix ACR related leak. It fell short, because the data structures related were local and thus the leak was not really fixed. This patch stores the ACR ucode blob in a global variable, which survives across rail gating. Change-Id: Iec3ac9d41156baa26048e079732568c0a95264f4 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/733732 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: zbc: disable activity only from ioctl	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Move the fifo engine activity disabling and wait-for-idle from the lowest-level functions higher, into the ioctl path of zbc operations, so that the sw initialization path wouldn't call them. During the init path, the disable isn't necessary, and the code path could result in a deadlock in the fifo runlist mutex. Change-Id: Icf5c270ba29bc1c7f88874fba2d176d68e11278a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/733668
*	gpu: nvgpu: Combine delays with GM20B parameters	Alex Frid	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added delays definitions to GPCPLL parameters structure: - locking timeout delay (applied to locking in fixed frequency mode and to PLL dynamic ramp in any mode) - lock delay for GPCPLL NA mode - IDDQ exit delay in any mode Specified delay parameters for GM20B PLL, and used this data instead of hard-coded numbers. Change-Id: I63ce0abc9ee900c36ec34b8641513db3cbb6f7d5 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/732094 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Add GPU voltage debug access	Alex Frid	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Added GPU voltage debug print to the initial locking of GPCPLL under bypass (available only when GPCPLL is in NA mode). - Added /sys/kernel/debug/gpu.0/voltage debugfs node to read voltage through GPCPLL (available only when GPCPLL is in NA mode). Change-Id: I6643ad4d1b228ec4cbc4ff5e8716cce3ef9dccfc Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/731572 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Use MC API for SECURITY_CARVEOUT2	Alex Waterman	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes all direct access to the MC registers. This requires that the MC be loaded before the GPU. Bug 1540908 Change-Id: I90bcde62f65a0c0d73a2bbe92cbf4a980c671c7d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/453653 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Supriya Sharatkumar <ssharatkumar@nvidia.com> Reviewed-by: Krishna Reddy <vdumpa@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Revert "gpu: nvgpu: Skip reg read of gpc2clk"	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \|	This reverts commit 259842f9d222dd2ca2e66bddaceef4a2fd626bc7. The commit clears some init values that are never restored. Change-Id: I4efee115863cbfb08b2e280a58b525cb49adc0b6 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/732428
*	gpu: nvgpu: Power up GPU in CDE only when converting	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	GPU does not need to be powered up if user space calls kernel and there is no new work to be done. Bug 1623918 Change-Id: I531aa7033530ae652d13684d8f8568a0e05fc2e1 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/732748
*	gpu: nvgpu: fix compile error with ALLOCATOR_DEBUG	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix compile time error of missing argument when ALLOCATOR_DEBUG is enabled Bug 200095967 Change-Id: I600330f3a75cf777d9cd35ec1f00fdd926fba429 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/731320 GVS: Gerrit_Virtual_Submit Reviewed-by: Sri Krishna Chowdary <schowdary@nvidia.com> Tested-by: Sri Krishna Chowdary <schowdary@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: correct hdr #define	Scott Long	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	__REGOPS_GK20A_H_ -> __REGOPS_GM20B_H_ Bug 1634208 Change-Id: Ic623563492c084162bfad10f895896d77b4192ed Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: http://git-master/r/729749 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: return error if GPU not initialized	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While writing to sysfs "tpc_fs_mask", we need to have GPU initialized (we need to have called gk20a_busy() at least once before) If this has not happened yet, then return an error Bug 1456969 Change-Id: I09db6bcaa44b8939246cb5ed1205f3fbc0ee0552 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/731327 (cherry picked from commit 0dbbcf60bbad6b9a31392d2290a3e26c5daa1e5d) Reviewed-on: http://git-master/r/731671 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: Fill in ACR header only once	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We call prepare_ucode_blob() once each time we un-railgate. We allocate and prepare the header for ACR ucode there, but the header never gets freed. Allocate and prepare the ACR header only once. Change-Id: I948da8b47d6bb2fa021868d7038d2cc35eccb460 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/729745 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: protect missing sgt in gk20a_mem_phys	Konsta Holtta	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Return zero for missing sgt instead of attempting to dereference NULL. Those NULL conditions should be almost nonexistent, and zero is not normally used. When reading gk20a_mem_phys() in gk20a_gr_get_chid_from_ctx() from an isr, the mem desc may race with channel deletion and get suddenly zeroed, even if the channel's in_use flag would be set. Plain zero results in expected behaviour. Change-Id: Id8ce37798d6fd3ceeb96a3f521c82569fccf30aa Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/729006 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: enable slcg fb	Seshendra Gadagottu	2015-05-18
\| \| \| \| \| \| \| \| \| \|	Bug 1550628 Change-Id: I8daed555704b49ee0d50530e3d51c03027d31fc5 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/719892 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix return code in *_ltc_cbc_ctrl()	Alex Waterman	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \|	Fix the return code for both gk20a_ and gm20b_ltc_cbc_ctrl() functions. Before, a positive return would always happen. Now, if there's a timeout, -EBUSY is returned. Change-Id: Id76dc44af1376fceebf5043afb057c153cb0752e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/729165 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix timeout in gm20b's LTC flush	Alex Waterman	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	The flush timeout check should have been comparing against the current time (jiffies), not the snapshot in time when the L2 flush started. Change-Id: Idba0ccbfeeab9e3fadd0b5bed7073acefbd403e3 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/729090 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
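A minimal sketch of the distinction, assuming a free-running 32-bit tick counter in place of jiffies. The helper names `tick_after` and `flush_timed_out` are hypothetical; `tick_after` imitates the wrap-safe style of the kernel's time_after() macro rather than reproducing it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is later than b" for a free-running 32-bit tick counter. */
static bool tick_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/*
 * True once `now` has passed the deadline start + timeout. `now` must be
 * sampled afresh on every poll: passing the start snapshot as `now` can
 * never report a timeout, which is the mistake the commit describes.
 */
static bool flush_timed_out(uint32_t now, uint32_t start, uint32_t timeout)
{
	return tick_after(now, start + timeout);
}
```

The signed subtraction keeps the comparison correct when the counter wraps between the start of the flush and the deadline.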
*	gpu: nvgpu: Use common allocator for ACR	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \|	Reduce amount of duplicate code around memory allocation by using common helpers, and common data structure for storing results of allocations. Bug 1605769 Change-Id: Ib70db4dff782176ed7f92b6809c8415b8c35abe1 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/721120
*	gpu: nvgpu: fix deadlock on railgate_lock during race condition	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have below race condition during __gk20a_do_idle() and force_reset case : - before execution of __gk20a_do_idle(), a process drops the last usage count of GPU, which triggers GPU railgate process - but before GPU is really railgated (there is 500 mS delay), some process calls __gk20a_do_idle() - in __gk20a_do_idle(), we first take railgate_lock - then we check if GPU is already railgated or not - since it is not railgated yet (due to 500 mS delay), this returns false - then we call pm_runtime_get_noresume() which just increases the usage counter - in this particular case, this call just increases usage count to 1 from 0, whereas GPU is already on its way to railgate - while we check if GPU usage count drops to one, GPU gets railgated - now if we have force_reset=true case, we will end up calling pm_runtime_get_sync() which will take railgate_lock _again_ and try to unrailgate GPU - this causes a deadlock on railgate_lock To fix this, use below sequence : - take railgate_lock - check if GPU is already railgated - release railgate_lock - call pm_runtime_get_sync() which will keep GPU active even if railgating is already triggered - take railgate_lock again to prevent unrailgate in further process Also, add more descriptive comments to explain the flow Bug 1624537 Change-Id: I0febc65d7bfac03ee738be200cf321322ffbe5a6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/719625 (cherry picked from commit 480284eda16e2b50ee6368bad3d15574e098b231) Reviewed-on: http://git-master/r/719620 Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: don't reset clk that doesn't exist	Alex Van Brunt	2015-05-15
\| \| \| \| \| \| \| \| \| \| \|	If the clock is null, calling the reset function will crash the kernel. So, don't call the reset function. Change-Id: I37ef25c8dca67bec8bf6654eb6e275b866bdae53 Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com> Reviewed-on: http://git-master/r/742361 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	Revert "gpu: nvgpu: New allocator for VA space"	Terje Bergstrom	2015-05-12
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5. Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/741469 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com> Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
*	gpu: nvgpu: New allocator for VA space	Alex Waterman	2015-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement a new buddy allocation scheme for the GPU's VA space. The bitmap allocator was using too much memory and is not a scalable solution as the GPU's address space keeps getting bigger. The buddy allocation scheme is much more memory efficient when the majority of the address space is not allocated. The buddy allocator is not constrained by the notion of a split address space. The bitmap allocator could only manage either small pages or large pages but not both at the same time. Thus the bottom of the address space was for small pages, the top for large pages. Although that split is not removed quite yet, the new allocator enables that to happen. The buddy allocator is also very scalable. It manages the relatively small comptag space to the enormous GPU VA space and everything in between. This is important since the GPU has lots of different sized spaces that need managing. Currently there are certain limitations. For one the allocator does not handle the fixed allocations from CUDA very well. It can do so but with certain caveats. The PTE page size is always set to small. This means the BA may place other small page allocations in the buddies around the fixed allocation. It does this to avoid having large and small page allocations in the same PDE. Change-Id: I501cd15af03611536490137331d43761c402c7f9 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/740694 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
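For readers unfamiliar with buddy allocation, the two core calculations of the general technique are sketched below. This is textbook buddy arithmetic, not code from the nvgpu allocator: the 4 kB minimum block size (`MIN_BLOCK_SHIFT`) and the names `buddy_order_for` and `buddy_of` are assumptions made for illustration.

```c
#include <stdint.h>

#define MIN_BLOCK_SHIFT 12u /* assumed smallest block: 4 kB */

/*
 * Smallest order whose block (minimum block size << order) holds `size`
 * bytes. `size` is assumed to be well below 2^63.
 */
static unsigned int buddy_order_for(uint64_t size)
{
	unsigned int order = 0;

	while (((uint64_t)1 << (MIN_BLOCK_SHIFT + order)) < size)
		order++;
	return order;
}

/*
 * Offset of the buddy of the block at `offset`, measured from the base of
 * the managed space. `offset` must be aligned to the block size of `order`.
 */
static uint64_t buddy_of(uint64_t offset, unsigned int order)
{
	return offset ^ ((uint64_t)1 << (MIN_BLOCK_SHIFT + order));
}
```

Since a block and its buddy differ in exactly one address bit, a free block can find its merge partner without any search, and the allocator only needs to track blocks that have actually been split.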
*	gpu: nvgpu: WAR for simulator bug	Alex Waterman	2015-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On linsim, when the push buffers are allowed to be allocated with small pages above 4GB the simulator crashes. This patch ensures that for linsim all small page allocations are forced to be below 4GB in the GPU VA space. By doing so the simulator no longer crashes. This bug has come up because the GPU buddy allocator work generates allocations at the top of the address space first. Thus push buffers were located at between 12GB and 16GB in the GPU VA space. Change-Id: Iaef0af3fda3f37ac09a66b5e1179527d6fe08ccc Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/740728 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix off-by-one error in PDE calculations	Alex Waterman	2015-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \|	The number of entries in the next level PDE data structure was one half of what was needed since the bit shift was 1 bit too small. Change-Id: Id4981f230dd206ae94336cddab117312e143e6a1 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/740727 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
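The kind of mistake described here can be shown with a short sketch. It assumes a page directory level is indexed by an inclusive range of virtual address bits. The bit positions used in the usage note and the names `pde_entries` and `pde_index` are hypothetical; the commit message does not say which levels or bits were involved.

```c
#include <stdint.h>

/*
 * Number of entries in a directory level indexed by VA bits
 * [lo_bit, hi_bit], both inclusive. The "+ 1" matters: shifting by
 * hi_bit - lo_bit alone yields half the required entries.
 */
static uint32_t pde_entries(unsigned int hi_bit, unsigned int lo_bit)
{
	return (uint32_t)1 << (hi_bit - lo_bit + 1);
}

/* Index into that level for a given virtual address. */
static uint32_t pde_index(uint64_t va, unsigned int hi_bit,
			  unsigned int lo_bit)
{
	return (uint32_t)(va >> lo_bit) & (pde_entries(hi_bit, lo_bit) - 1);
}
```

For a level covering bits 17 to 25, for example, the inclusive range spans nine bits and therefore needs 512 entries, whereas a shift of 25 - 17 would size the table at 256.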
*	gpu: nvgpu: Reduce BAR1 kernel size	Alex Waterman	2015-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reduce the BAR1 size in the kernel to match the reserved size in the DTB. This caused problems for the buddy allocator since the allocator can sometimes allocate from higher memory before lower memory in the managed space. This would cause the kernel to access unmapped memory. Change-Id: I70b72ef5bb4db01253e5087757051ef852e99bc6 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/740726 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>