nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space"""	Bharat Nihalani	2015-06-02
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since it causes GPU pbdma interrupt to be generated. Bug 200106514 Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/749291
*	gpu: nvgpu: Use HAL for waiting for GR quiet	Terje Bergstrom	2015-06-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a HAL for waiting for GR to become quiet. Use it forall cases where we require GR to be quiet, but where it does not need to be idle. Bug 1640378 Change-Id: Ic0222d595a2d049e0fa8864b069ab94a97fac143 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/745640 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	Revert "Revert "gpu: nvgpu: New allocator for VA space""	Alex Waterman	2015-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8. The original commit was actually fine. Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/743300 (cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff) Reviewed-on: http://git-master/r/743301 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add secure gpccs boot support	Vijayakumar	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 200080684 keeping it disabled by default also trimming the code by removing redundant variable to check recovery. pmu quick wait now checks only for irqs which are serviced by kernel. requests pmu to bit bang gpccs ucode. Change-Id: I12ef23d6d59b507e86a129b69eab65b21d0438c6 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/729622 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Dynamic betacb size	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	Allow querying and setting default betacb size via debugfs. For global buffers the value takes effect upon first boot of GPU, and has no effect after that. Bug 1628352 Change-Id: Ib63f4299249c41eab1b36cc501b525cc54211195 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/733328
*	gpu: nvgpu: fix setting gr_pd_ab_dist_cfg1_r()	David Li	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	gr_*__set_alpha_circular_buffer_size() left max_batches field of gr_pd_ab_dist_cfg1_r as 0 which results in too many alpha beta transitions and poor performance when tessellation or geometry shaders are used Change-Id: If18feb1119e9672005455155dc56337cd444a1f1 Signed-off-by: David Li <davli@nvidia.com> Reviewed-on: http://git-master/r/735476 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: updated gpmu interface data struct.	Mahantesh Kumbar	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- pmu version 19494277 is from CL 19495746 - updated gpmu interface data struct with respect to latest pmu ucode interface headers. gpmuifpg.h - 19199047 gpmuifperfmon.h - 18238819 gpmuifpmu.h - 19199047 gpmuifacr.h - 19343196 gpmuifcmn.h - 19264862 rmflcnbl.h - 19317152 Bug 200085428 Change-Id: I7db56dcf5a3038b40da37a69e8723a2e9a652e4b Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/728461 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Do not leak ACR header	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	4b6f83704f054f5b21e05873fa5862c667a9992e tried to fix ACR related leak. It fell short, because the data structures related were local and thus the leak was not really fixed. This patch stores the ACR ucode blob in a global variable, which survives across rail gating. Change-Id: Iec3ac9d41156baa26048e079732568c0a95264f4 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/733732 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: Combine delays with GM20B parameters	Alex Frid	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added delays definitions to GPCPLL parameters structure: - locking timeout delay (applied to locking in fixed frequency mode and to PLL dynamic ramp in any mode) - lock delay for GPCPLL NA mode - IDDQ exit delay in any mode Specified delay parameters for GM20B PLL, and used this data instead of hard-coded numbers. Change-Id: I63ce0abc9ee900c36ec34b8641513db3cbb6f7d5 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/732094 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Add GPU voltage debug access	Alex Frid	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Added GPU voltage debug print to the initial locking of GPCPLL under bypass (available only when GPCPLL is in NA mode). - Added /sys/kernel/debug/gpu.0/voltage debugfs node to read voltage through GPCPLL (available only when GPCPLL is in NA mode). Change-Id: I6643ad4d1b228ec4cbc4ff5e8716cce3ef9dccfc Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/731572 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Use MC API for SECURITY_CARVEOUT2	Alex Waterman	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes all direct access to the MC registers. This requires that the MC be loaded before the GPU. Bug 1540908 Change-Id: I90bcde62f65a0c0d73a2bbe92cbf4a980c671c7d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/453653 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Supriya Sharatkumar <ssharatkumar@nvidia.com> Reviewed-by: Krishna Reddy <vdumpa@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: correct hdr #define	Scott Long	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	__REGOPS_GK20A_H_ -> __REGOPS_GM20B_H_ Bug 1634208 Change-Id: Ic623563492c084162bfad10f895896d77b4192ed Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: http://git-master/r/729749 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fill in ACR header only once	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We call prepare_ucode_blob() once each time we un-railgate. We allocate prepare the header for ACR ucode there, but the header never gets freed. Allocate and prepare the ACR header only once. Change-Id: I948da8b47d6bb2fa021868d7038d2cc35eccb460 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/729745 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: gm20b: enable slcg fb	Seshendra Gadagottu	2015-05-18
\| \| \| \| \| \| \| \| \| \|	Bug 1550628 Change-Id: I8daed555704b49ee0d50530e3d51c03027d31fc5 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/719892 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix return code in *_ltc_cbc_ctrl()	Alex Waterman	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \|	Fix the return code for both gk20a_ and gm20b_ltc_cbc_ctrl() functions. Before a positive return woudl always happen. Now, if there's a timeout -EBUSY is returned. Change-Id: Id76dc44af1376fceebf5043afb057c153cb0752e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/729165 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix timeout in gm20b's LTC flush	Alex Waterman	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	The flush timeout should have been comparing between the current time (jiffies) not the snapshot in time when the L2 flush started. Change-Id: Idba0ccbfeeab9e3fadd0b5bed7073acefbd403e3 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/729090 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use common allocator for ACR	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \|	Reduce amount of duplicate code around memory allocation by using common helpers, and common data structure for storing results of allocations. Bug 1605769 Change-Id: Ib70db4dff782176ed7f92b6809c8415b8c35abe1 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/721120
*	Revert "gpu: nvgpu: New allocator for VA space"	Terje Bergstrom	2015-05-12
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5. Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/741469 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com> Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
*	gpu: nvgpu: New allocator for VA space	Alex Waterman	2015-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement a new buddy allocation scheme for the GPU's VA space. The bitmap allocator was using too much memory and is not a scaleable solution as the GPU's address space keeps getting bigger. The buddy allocation scheme is much more memory efficient when the majority of the address space is not allocated. The buddy allocator is not constrained by the notion of a split address space. The bitmap allocator could only manage either small pages or large pages but not both at the same time. Thus the bottom of the address space was for small pages, the top for large pages. Although, that split is not removed quite yet, the new allocator enables that to happen. The buddy allocator is also very scalable. It manages the relatively small comptag space to the enormous GPU VA space and everything in between. This is important since the GPU has lots of different sized spaces that need managing. Currently there are certain limitations. For one the allocator does not handle the fixed allocations from CUDA very well. It can do so but with certain caveats. The PTE page size is always set to small. This means the BA may place other small page allocations in the buddies around the fixed allocation. It does this to avoid having large and small page allocations in the same PDE. Change-Id: I501cd15af03611536490137331d43761c402c7f9 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/740694 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: made gm20b_pmu_init_acr() global.	Mahantesh Kumbar	2015-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-made gm20b_pmu_init_acr() method to global to access in pmu-T18x. Bug 200085428 Change-Id: Ic262997d5c6f97cecf12d17d9a64a9d1cd20c83b Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/732210 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Tested-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/735727 Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com> Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
*	gpu: nvgpu: Per-SoC compressible page size	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Define smallest compressible page size per SoC, and use that for determining if a compressible kind should be downgraded to uncompressed. Bug 1605769 Change-Id: I7c9991ba0ae82fe533641f045e506c0b01a10d8b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/724492
*	gpu: nvgpu: Use common allocator for context	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Reduce amount of duplicate code around memory allocation by using common helpers, and common data structure for storing results of allocations. Bug 1605769 Change-Id: I10c226e2377aa867a5cf11be61d08a9d67206b1d Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/720507
*	gpu: nvgpu: Regenerate HW headers	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \|	Added fuse for FBP and DS exception register. Change-Id: Ie38a84eac40ca2d8cf3ac8f19ed6bad0d6bc1dd9 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/722846
*	gpu: nvgpu: fix sparse tool warnings	Sandarbh Jain	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix the following sparse warning: -drivers/gpu/nvgpu/gk20a/gr_gk20a.c:6028:6: warning: symbol 'gr_gk20a_init_sm_dsm_reg_info' was not declared. Should it be static? -gr_gk20a.c:6174:6: warning: symbol 'gr_gk20a_get_sm_dsm_perf_regs' was not declared. Should it be static? -gr_gk20a.c:6184:6: warning: symbol 'gr_gk20a_get_sm_dsm_perf_ctrl_regs' was not declared. Should it be static? -gr_gm20b.c:465:6: warning: symbol 'gr_gm20b_init_sm_dsm_reg_info' was not declared. Should it be static? -gr_gm20b.c:476:6: warning: symbol 'gr_gm20b_get_sm_dsm_perf_regs' was not declared. Should it be static? -gr_gm20b.c:486:6: warning: symbol 'gr_gm20b_get_sm_dsm_perf_ctrl_regs' was not declared. Should it be static? Bug 200067946 Change-Id: Ic14080e94e343386f4fef379152a623c402984fd Signed-off-by: Sandarbh Jain <sanjain@nvidia.com> Reviewed-on: http://git-master/r/723518 Reviewed-by: Sachin Nikam <snikam@nvidia.com> Tested-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: Fix offset of PC sampling field	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Fix offset of PM word when changing PC sampling field. Bug 1517458 Bug 1573150 Change-Id: I2b8489b1d3c05f3a20416fc1a46ac1827f453cbc Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/721032
*	gpu: nvgpu: add platform specific get_iova_addr()	Deepak Nibade	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add platform specific API pointer (*get_iova_addr)() which can be used to get iova/physical address from given scatterlist and flags Use this API with g->ops.mm.get_iova_addr() instead of calling API gk20a_mm_iova_addr() which makes it platform specific Bug 1605653 Change-Id: I798763db1501bd0b16e84daab68f6093a83caac2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/713089 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use common allocator for compbit store	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Reduce amount of duplicate code around memory allocation by using common helpers, and common data structure for storing results of allocations. Bug 1605769 Change-Id: I7c1662b669ed8c86465254f6001e536141051ee5 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/720435
*	gpu: nvgpu: GM20B extended buffer definition	Sandarbh Jain	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update extended buffer definition for Maxwell. On GM20B only PERF_CONTROL0 and PERF_CONTROL5 registers are restored in extended buffer. They are needed for stopping the counters as late as possible during ctx save and start them as early as possible during context restore. On Maxwell, these registers contain the enable/disable bit. Bug 200086767 Change-Id: I59125a2f04bd0975be8a1ccecf993c9370f20337 Signed-off-by: Sandarbh Jain <sanjain@nvidia.com> Reviewed-on: http://git-master/r/717421 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use gk20a_mem_phys instead of sg_phys	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \|	There were still a couple of places using sg_phys directly. Use new gk20a_mem_phys() to make the code shorter. Change-Id: I6eb9b14e0c14a27ec39bacd06ab24e31e99769ca Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/717502
*	gpu: nvgpu: make the local function static	Amit Sharma (SW-TEGRA)	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed the following sparse warnings by making below APIs static: - gk20a.c: warning: symbol 'gk20a_pm_restore_debug_setting' was not declared. Should it be static? - gr_gk20a.c: warning: symbol 'gr_gk20a_rop_l2_en_mask' was not declared. Should it be static? - gr_gm20b.c: warning: symbol 'gr_gm20b_rop_l2_en_mask' was not declared. Should it be static? Bug 200067946 Change-Id: I334893bb6614171bff835d270716a7dd262c9ba7 Signed-off-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com> Reviewed-on: http://git-master/r/718756 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: Implement common allocator and mem_desc	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \|	Introduce mem_desc, which holds all information needed for a buffer. Implement helper functions for allocation and freeing that use this data type. Change-Id: I82c88595d058d4fb8c5c5fbf19d13269e48e422f Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/712699
*	gpu: nvgpu: Gpu characterstics enhancement	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	New members are added in nvgpu_gpu_characterstics to export more information required specially from CUDA tools. Change-Id: I907f3bcbd272405a13f47ef6236bc2cff01c6c80 Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/679202 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Removal of regops from CUDA driver	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current CUDA drivers have been using the regops to directly accessing the GPU registers from user space through the dbg node. This is a security hole and needs to be avoided. The patch alternatively implements the similar functionality in the kernel and provide an ioctl for it. Bug 200083334 Change-Id: Ic5ff5a215cbabe7a46837bc4e15efcceb0df0367 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/711758 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: do not enable unhandled exceptions	Deepak Nibade	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently have below exceptions enabled but we do not have any handler for them. So if any of these exception is raised, we do not clear it. NV_PGRAPH_EXCEPTION_PD NV_PGRAPH_EXCEPTION_SCC NV_PGRAPH_EXCEPTION_DS NV_PGRAPH_EXCEPTION_MME NV_PGRAPH_EXCEPTION_SKED Hence do not enable above exceptions. Bug 200078514 Change-Id: I0dd3a2299f80f3fe06994818f64151e7cc83a84e Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/714166 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: handle memfmt exception	Deepak Nibade	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_gr_isr(), handle memfmt exception as below : - read NV_PGRAPH_PRI_MEMFMT_HWW_ESR - debug print for contents of above register - write same value back to NV_PGRAPH_PRI_MEMFMT_HWW_ESR and clear the exception Bug 200078514 Change-Id: I5b9afacd7f99b5a37de953041582b3a53b863642 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/713713 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support for dumping vpr/wpr info	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added support for dumping vpr/wpr info for gm20b. This dump info called when ever gk20a_mm_fb_flush is timed-out. Bug 200082817 Change-Id: I21b0372d0e3f976a189c9c428c015165b715bf88 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/711439 (cherry picked from commit b69897d71c8f6119b49ceb8d3273cdb354178cc5) Reviewed-on: http://git-master/r/712675 GVS: Gerrit_Virtual_Submit Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Use busy looping for flush operations	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use busy looping for l2 tag flush and elpg flush operations. This is making total flash time more accurate and reduced overall time compared with usleep. Also added trace points to measure performance for these operations. Also corrected timeout error check for non-silicon platforms. Bug 200081799 Change-Id: I63410bb7528db9258501633996fbdee5fdec1c74 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/710472 (cherry picked from commit 18684cf9d5d6870a1a1fd5711c4fc2d733caad20) Reviewed-on: http://git-master/r/710986 GVS: Gerrit_Virtual_Submit Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Allow enabling PC sampling	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Allow enabling of PC sampling hardware workaround. It is only applicable to gm20b. Bug 1517458 Bug 1573150 Change-Id: Iad6a3ae556489fb7ab9628637d291849d2cd98ea Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/710421
*	gpu: nvgpu: add exception registers to dump	Deepak Nibade	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below exception registers to GR dump : NV_PGRAPH_PRI_BE0_BECS_BE_EXCEPTION NV_PGRAPH_PRI_BE0_BECS_BE_EXCEPTION_EN NV_PGRAPH_PRI_GPC0_GPCCS_GPC_EXCEPTION NV_PGRAPH_PRI_GPC0_GPCCS_GPC_EXCEPTION_EN NV_PGRAPH_PRI_GPC0_TPC0_TPCCS_TPC_EXCEPTION NV_PGRAPH_PRI_GPC0_TPC0_TPCCS_TPC_EXCEPTION_EN Bug 200078514 Change-Id: Ib0ec34f7bf5a136928c53cf8398b4929fb4639c5 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/712480 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: setup chip specific mm hw init	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Add support for setting-up mm hw init per soc. Bug 1587825 Change-Id: Ie5c5e49a767cfb14e3dbbb6902349284cd3dca95 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/681784 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: disable ce2 interrupts when unhandled	Sam Payne	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	ce2 interrupts enabled only on gk20a and gm20b when interrupts are handled through hal Change-Id: Ib570db8f5f41e71e768b95e781153ec8a5d20015 Signed-off-by: Sam Payne <spayne@nvidia.com> Reviewed-on: http://git-master/r/677447 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Refactor page mapping code	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Pass always the directory structure to mm functions instead of pointers to members to it. Also split update_gmmu_ptes_locked() into smaller functions, and turn the hard coded MMU levels (PDE, PTE) into run-time parameters. Change-Id: I315ef7aebbea1e61156705361f2e2a63b5fb7bf1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/672485 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Ensure memory subsystem is enabled	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Ensure that memory subsystem is enabled at init. Bug 1603128 Change-Id: Ie3fcd4d9df4dbd480e44fa8919fc311e61b627ca Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/707027 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
*	gpu: nvgpu: Per-chip PBDMA signature	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \|	PBDMA HW signature depends on the chip. Change-Id: If57d721d9bb77a090f967930a1aa2037bf4a16fe Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/672922
*	gpu: nvgpu: Unify PDE & PTE structs	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \|	Introduce a new struct gk20a_mm_entry. Allocate and store PDE and PTE arrays using the same structure. Always pass pointer to this struct when possible between functions in memory code. Change-Id: Ia4a2a6abdac9ab7ba522dafbf73fc3a3d5355c5f Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/696414
*	gpu: nvgpu: APIs to dump GR status	Deepak Nibade	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below APIs to dump various GR status registers 1. debugfs : /d/gpu.0/gr_status Read this debugfs at runtime to get status registers 2. API gk20a_gr_debug_dump() Add this API in code to dump registers at any point Bug 200062436 Change-Id: Ic1115b5a2fc16362954b5ed8a9e70afb872a8d91 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/486465 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: optimize fecs status polling	Vijayakumar	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 200078367 using udelay for fecs status polling during GR init phase brings down fecs transaction time to < 20usec from few hundred usec. Change-Id: I61a27daaf1187ac086a42779b46aa3fbee3b37f2 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/691918 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: WPR size 0, on railgate exit	Supriya	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 200066741 ACR ucode has mechanism to skip WPR blob copy for second time, in case WPR size is sent as 0 to acr ucode. With above there is a saving of around 0.5 ms, but, in conjunction with acr change to disable LS sig verification, and scrubbing empty spaces in WPR sections to 0. This change can reduce railgate exit latency by 4ms ACR ucodes to be checked in main, as a different CL, and after getting prod signs for ACR Change-Id: I9d662027abf0b2615176d17433ff3ec3ae53d78a Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/681892 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: FECS HALT method	Supriya	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	FECS halt method is used to do graceful FECS shutdown. Bug 1551865 Change-Id: Iec8590e86cb09f9b54c36f85859208fc8650f6a6 Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/682459 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: More robust recovery	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Make recovery a more straightforward process. When we detect a fault, trigger MMU fault, and wait for it to trigger, and complete recovery. Also reset engines before aborting channel to ensure no stray sync point increments can happen. Change-Id: Iac685db6534cb64fe62d9fb452391f43100f2999 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/709060 (cherry picked from commit 95c62ffd9ac30a0d2eb88d033dcc6e6ff25efd6f) Reviewed-on: http://git-master/r/707443