nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: Fix CWD floorsweep programming	Terje Bergstrom	2016-05-16
\| \| \| \| \| \| \| \| \| \| \| \| \|	Program CWD TPC and SM registers correctly. The old code did not work when there are more than 4 TPCs. Refactor init_fs_mask to reduce code duplication. Change-Id: Id93c1f8df24f1b7ee60314c3204e288b91951a88 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1143697 GVS: Gerrit_Virtual_Submit Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
*	gpu: nvgpu: Do not program max ways evict	Terje Bergstrom	2016-05-13
\| \| \| \| \| \| \| \| \| \| \|	Setting max_ways_evict reserves some of L2 for CB. In gk20a CB is in dedicated RAM, so we don't need to reserve space for it. The code gets invoked only on gk20a. Change-Id: Ib8efec8c5e90c135bd0c10bb1eaa3f797ec68698 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1144993
*	gpu: nvgpu: Add support for multiple PBDMAs	Lakshmanan M	2016-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added support for multiple PBDMAs handling during fifo_pbdma_isr and gk20a_init_fifo_reset_enable_hw use case. JIRA DNVGPU-26 Change-Id: I5f013c5373f7a4b80a8de8863f0e175576ed4c22 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1145591 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem	Konsta Holtta	2016-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To support vidmem, pass g and mem_desc to the buffer memory accessor functions. This allows the functions to select the memory access method based on the buffer aperture instead of using the cpu pointer directly (like until now). The selection and aperture support will be in another patch; this patch only refactors these accessors, but keeps the underlying functionality as-is. gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}() for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like functionality, and gk20a_memset() for filling buffers with a constant. The 8 and 16 bit accessor functions are removed. vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to support other types of mappings or conditions where mapping the buffer is unnecessary or different. Several function arguments that would access these buffers are also changed to take a mem_desc instead of a plain cpu pointer. Some relevant occasions are changed to use the accessor functions instead of cpu pointers without them (e.g., memcpying to and from), but the majority of direct accesses will be adjusted later, when the buffers are moved to support vidmem. JIRA DNVGPU-23 Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1121143 Reviewed-by: Ken Adams <kadams@nvidia.com> Tested-by: Ken Adams <kadams@nvidia.com>
*	gpu: nvgpu: Use num LTCs instead of num FBs in GR	Terje Bergstrom	2016-05-11
\| \| \| \| \| \| \| \| \|	When programming GR ZROP and CROP settings for number of active LTCs we use FB count, which is always 1. Use LTC count instead. Change-Id: I4d6f9b2b59d44b82ba098b243d6390cf321dbc36 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1144698
*	gpu: nvgpu: add fuse overrides for tpc disabling	Richard Zhao	2016-05-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- add fuse_override in gops. Implement it starting from gm20b. - set cwd fs register, so cuda won't use disabled TPCs Bug 1757262 Bug 200169697 Change-Id: If7bac58bd3a6bcf2925197ea5b7c2d10a77e0933 Signed-off-by: Richard Zhao <rizhao@nvidia.com> (cherry picked from commit 66cde7724815e9e5e85ab9b07fc985a78530222f) Reviewed-on: http://git-master/r/1132177 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Adeel Raza <araza@nvidia.com> Tested-by: Adeel Raza <araza@nvidia.com>
*	gpu: nvgpu: add supported preemptions to gpu characteristics	Deepak Nibade	2016-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below flag fields to gpu characteristics to indicate supported and default preemption modes on platform for graphics and compute __u32 graphics_preemption_mode_flags; __u32 compute_preemption_mode_flags; __u32 default_graphics_preempt_mode; __u32 default_compute_preempt_mode; Add struct nvgpu_preemption_modes_rec to struct gr_gk20a to store these values locally Use platform specific get_preemption_mode_flags() to get the flags and define gk20a/gm20b specific get_preemption_mode_flags() API Bug 1646259 Change-Id: I80193c0d988dc93bd96585f9aa631fd817f4dfa3 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1133595 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: separate IOCTL to set preemption mode	Deepak Nibade	2016-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE to allow setting preemption modes from UMD Define preemption modes in nvgpu.h and use them everywhere Remove mode definitions from mm_gk20a.h Also, we support setting only one preemption mode in a channel But it is possible to have multiple preemption modes (one from graphics and one from compute) set simultaneously Hence, update struct gr_ctx_desc to include two separate preemption modes (graphics_preempt_mode and compute_preempt_mode) API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports setting two separate preemption modes i.e. one for graphics and one for compute Make necessary changes in code to support two preemption modes Bug 1646259 Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1131805 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	nvgpu: vgpu: create fifo.force_reset_ch in gpu_ops	Haley Teng	2016-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	gk20a_fifo_force_reset_ch() does not support vgpu now, so we need to create a function pointer in gpu_ops and assign it differently for vgpu and non-vgpu. Bug 200184349 Change-Id: I5f8f4f731b4b970c4ff8de65531f25568e7691b6 Signed-off-by: Haley Teng <hteng@nvidia.com> Reviewed-on: http://git-master/r/1130420 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix floorsweeping for multi-GPC GPU	Terje Bergstrom	2016-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	There were multiple bugs in dealing with a GPU with more than one GPC. * Beta CB size was set to wrong PPC * TPC mask did not shift fields correctly * PD skip table used \|\| instead of \| operator Change-Id: I849e2331a943586df16996fe573da2a0ac4cce19 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1132109
*	gpu: nvgpu: Idle GR before calling PMU ZBC save	Terje Bergstrom	2016-04-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On gk20a when PMU is updating ZBC colors it is reading them from L2. But L2 has one port, and ZBC reads can race with other transactions. Idle graphics before sending PMU the ZBC_UPDATE request. Also makes pmu_save_zbc a HAL, because PMU ucode has changes to bypass this problem on some chips. Bug 1746047 Change-Id: Id8fcd6850af7ef1d8f0a6aafa0fe6b4f88b5f2d9 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1129017
*	gpu: nvgpu: do not include hw_proj_*.h	Deepak Nibade	2016-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	hw_proj_gk20a.h and hw_proj_gm20b.h should not be included, hence remove the includes and APIs used from the header Use nvgpu_get_litter_value() API to replace use of header Bug 200156699 Change-Id: I5e88f71657682dd94ac7f0a45f940b70cf8222e7 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1129611 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add accessors for global_esr values and sm_dbgr_control	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add gk20a/gm20b accessors for various global_esr values and for sm_dbgr_control modes Bug 200156699 Change-Id: If7fd8cd7567f8bcd1f645facf9553bdc0a153526 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120333 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to suspend/resume context	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below IOCTL to suspend/resume a context NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS: Suspend sequence : - disable ctxsw - loop through list of channels - if channel is ctx resident, suspend all SMs - otherwise, disable channel/TSG - enable ctxsw Resume sequence : - disable ctxsw - loop through list of channels - if channel is ctx resident, resume all SMs - otherwise, enable channel/TSG - enable ctxsw Bug 200156699 Change-Id: Iacf1bf7877b67ddf87cc6891c37c758a4644b014 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120332 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu; nvgpu: IOCTL to write/clear SM error states	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below IOCTLs to write/clear SM error states NVGPU_DBG_GPU_IOCTL_CLEAR_SINGLE_SM_ERROR_STATE NVGPU_DBG_GPU_IOCTL_WRITE_SINGLE_SM_ERROR_STATE Bug 200156699 Change-Id: I89e3ec51c33b8e131a67d28807d5acf57b3a48fd Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120330 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support storing/reading single SM error state	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support to store error state of single SM before preprocessing SM exception Error state is stored as : struct nvgpu_dbg_gpu_sm_error_state_record { u32 hww_global_esr; u32 hww_warp_esr; u64 hww_warp_esr_pc; u32 hww_global_esr_report_mask; u32 hww_warp_esr_report_mask; } Note that we can safely append new fields to above structure in the future if required Also, add IOCTL NVGPU_DBG_GPU_IOCTL_READ_SINGLE_SM_ERROR_STATE to support reading SM's error state by user space Bug 200156699 Change-Id: I9a62cb01e8a35c720b52d5d202986347706c7308 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120329 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Wait for BAR1 bind	Terje Bergstrom	2016-04-15
\| \| \| \| \| \| \| \| \|	Wait for BAR1 bind to complete before continuing. The register to wait exists Maxwell onwards. Change-Id: Ie3736033fdb748c5da8d7a6085ad6d63acaf41f5 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1123941
*	gpu: nvgpu: Use sysmem aperture for SoC memory	Terje Bergstrom	2016-04-15
\| \| \| \| \| \| \| \| \|	In Tegra GPU, SoC memory has to be accessed as vidmem. In discrete GPU, it has to be accessed as sysmem. Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120468
*	gpu: nvgpu: Add litter values HAL	Terje Bergstrom	2016-04-15
\| \| \| \| \| \| \| \| \|	Move per-chip constants to be returned by a chip specific function. Implement get_litter_value() for each chip. Change-Id: I2a2730fce14010924d2507f6fa15cc2ea0795113 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1121383
*	gpu: nvgpu: make tpc_fs_mask work on production board	Richard Zhao	2016-04-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On production fused boards, it uses gr_fe_tpc_fs_r() to mask TPCs, rather than fues. Bug 1734150 Change-Id: I7b4eb428f1ad0cf841a57214e0c8c1e8f17b2c5a Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1111630 (cherry picked from commit 869ea54967812e03d9f1e69775ca56fd6459216c) Reviewed-on: http://git-master/r/1122121 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Support GPUs with no physical mode	Terje Bergstrom	2016-04-13
\| \| \| \| \| \| \| \| \| \| \|	Support GPUs which cannot choose between SMMU and physical addressing. Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120469 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: vgpu: virtualized SMPC/HWPM ctx switch	Peter Daifuku	2016-04-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for SMPC and HWPM context switching when virtualized Bug 1648200 JIRASW EVLR-219 JIRASW EVLR-253 Change-Id: I80a1613eaad87d8510f00d9aef001400d642ecdf Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1122034 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use device instead of platform_device	Terje Bergstrom	2016-04-08
\| \| \| \| \| \| \| \| \|	Use struct device instead of struct platform_device wherever possible. This allows adding other bus types later. Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120466
*	gpu: nvgpu: hw definitions for FECS trace on gm20b	Thomas Fleury	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \|	Bug 1648908 Change-Id: I3b9a1ebaf062f15db397acd81b8312dc8daa9193 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1117262 Reviewed-on: http://git-master/r/1121233 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support for hwpm context switching	Peter Daifuku	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \|	Add support for hwpm context switching Bug 1648200 Change-Id: I482899bf165cd2ef24bb8617be16df01218e462f Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1120450 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Sync with register generator	Terje Bergstrom	2016-04-07
\| \| \| \| \| \| \| \| \|	Use re-generated register definitions. This synchronizes kernel with the register generator. Change-Id: I85a00f8f5c7bdfbc56cf4df909e5ae892d86f062 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120812
*	gpu: nvgpu: Add Fuse prints on PMU Halt	Supriya	2016-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-Print fuse values in case of PMU halt error -and mailbox reads 0xDEADDEAD Bug 1737044 Change-Id: I59f5fcf4a69bdd2a2eea81a69dd99bb9c4c21e1d Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/1113464 (cherry picked from commit d0320eed72c5070c4fcc7564c02fa38599984751) Reviewed-on: http://git-master/r/1120429 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add HAL for GPU characteristics	Sami Kiminki	2016-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add function pointer for chip specific GPU characteristics init. Bug 1637486 Change-Id: I6ce5eea124d8057393dec6e86e72412cc87e1cfa Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Signed-off-by: Adeel Raza <araza@nvidia.com> Reviewed-on: http://git-master/r/780535 (cherry picked from commit f5c240d6ed19b5b9eedff05767c885ad5812c71e) Reviewed-on: http://git-master/r/1120428 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Merge branch 'PM_RUNTIME-Removal' into 'dev-kernel-3.18'	Sumit Singh	2016-03-30
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change performs merge of 'PM_RUNTIME_Removal' dev-branch with 'dev-kernel-3.18' branch. It replaces CONFIG_PM_RUNTIME with CONFIG_PM. JIRA TPM-704 Change-Id: I306e254716f275c283f727fc232d7244939542b6 Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
\| *	gpu: nvgpu: Replace CONFIG_PM_RUNTIME with CONFIG_PM	Sumit Singh	2016-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks depending on CONFIG_PM_RUNTIME may now be changed to depend on CONFIG_PM. Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere under drivers/gpu/nvgpu/. JIRA TPM-704 Change-Id: I23965838ff6ec77829076cd834e87641fb68e268 Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
* \|	gpu: nvgpu: use gk20a_free_sgtable to free sgtable	Yogish Kulkarni	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use gk20a_free_sgtable to free sgtable Bug 200130473 Change-Id: I6ddffb848a289ce81804502b7628feb5a4a8d000 Signed-off-by: Yogish Kulkarni <yogishk@nvidia.com> Reviewed-on: http://git-master/r/785884 (cherry picked from commit a4f3b53f2ed3971d9b8945f5bc9c1b2822156a89) Reviewed-on: http://git-master/r/833646 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* \|	gpu: nvgpu: add support to set channel timeslice	Aingara Paramakuru	2016-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As part of improving GPU scheduling, userspace can now set a channel's timeslice, within reasonable limits imposed by the kernel driver. JIRA VFND-1312 Bug 1729664 Change-Id: I4c3430c43437889b8685f12988d4b967bb7877bb Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1020917 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* \|	gpu: nvgpu: Provide cpu gpu time correlation via ioctl	Arul Sekar	2016-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1648908 Provides pairs of CPU and GPU timestamps that can be used for correlatiing the two timebases - IOCTL made available /dev/nvhost-ctrl-gpu Change-Id: I1458b9d33d794b1b02ec9fd29ed9426756b94bcd Signed-off-by: Arul Sekar <aruls@nvidia.com> Reviewed-on: http://git-master/r/1029732 Reviewed-by: Arun Gona <agona@nvidia.com> Tested-by: Arun Gona <agona@nvidia.com> Reviewed-on: http://git-master/r/1111715 GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* \|	gpu: nvgpu: Disable illegal comptag interrupt	Terje Bergstrom	2016-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Illegal comptag interrupt is triggered when a page is mapped with two different kinds with incompatible compression status. This can be intentional, so disable the interrupt. Change-Id: I84a212beac147991d09d2d381a9e770b1364f4d8 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1029663 (cherry picked from commit 819607a768f9fccdd0b233d58bcf88b9eee4ee19) Reviewed-on: http://git-master/r/1031010 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
* \|	gpu: nvgpu: improve channel interleave support	Aingara Paramakuru	2016-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, only "high" priority bare channels were interleaved between all other bare channels and TSGs. This patch decouples priority from interleaving and introduces 3 levels for interleaving a bare channel or TSG: high, medium, and low. The levels define the number of times a channel or TSG will appear on a runlist (see nvgpu.h for details). By default, all bare channels and TSGs are set to interleave level low. Userspace can then request the interleave level to be increased via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will be added later). As timeslice settings will soon be coming from userspace, the default timeslice for "high" priority channels has been restored. JIRA VFND-1302 Bug 1729664 Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1014962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* \|	gpu: nvgpu: tegra: fix sparse errors	Seshendra Gadagottu	2016-03-14
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed following sparse errors: - therm_gm20b.c:68:6: warning: symbol 'gm20b_init_therm_ops' was not declared. Should it be static? - platform_gk20a_tegra.c:825:5: warning: symbol 'gk20a_set_clk_rate' was not declared. Should it be static? Bug 200067946 Change-Id: I485d5e76302fb294865854f314db2d27f71520f7 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1026685 GVS: Gerrit_Virtual_Submit Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: LRF, TEX, LTC, DRAM override	Supriya	2016-02-26
\| \| \| \| \| \| \| \| \| \| \| \|	- Adding support for FECS mem overrides Bug 1699676 Change-Id: I6c9ddcd98d57b29059513ee508c6f92b194c4fc7 Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/921253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: enable use_full_comp_tag_line in gpc mmu	mheyer	2016-02-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also GPC MMU needs to have its PRI_MMU_CTRL_USE_FULL_COMP_TAG_LINE control bit set. Bug 1730611 Signed-off-by: Mathias Heyer <mheyer@nvidia.com> Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Change-Id: I01e11de066ea5487bf1d9c8c8eddbf159e4882da Reviewed-on: http://git-master/r/1014881 (cherry picked from commit d1651bbebe1b3e46d2173dec1651b3d2f4307b40) Reviewed-on: http://git-master/r/1017459 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: enable ctxsw_intr1 interrupt	Deepak Nibade	2016-02-05
\| \| \| \| \| \| \| \| \| \| \| \|	Enable NV_PGRAPH_PRI_FECS_HOST_INT_ENABLE_CTXSW_INTR1 Bug 200156699 Change-Id: I170dd6998381897a4b4ca832774eb0f11f92fd86 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/935772 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: SM/TEX exception handling support	Adeel Raza	2016-01-29
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add TEX exception handling support. Also make SM exception handler into a function pointer, which should allow different chips to implement their own SM exception handling routine. Bug 1635727 Bug 1637486 Change-Id: I429905726c1840c11e83780843d82729495dc6a5 Signed-off-by: Adeel Raza <araza@nvidia.com> Reviewed-on: http://git-master/r/935329
*	gpu: nvgpu: set set_sm_debug_mode() for gm20b	Deepak Nibade	2016-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Set function pointer gops->gr.set_sm_debug_mode() for gm20b Bug 200168107 Change-Id: I40eebbc55b0f82f793fcea90245ae6dad0f5779c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/935773 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Richard Zhao <rizhao@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add support for therm gate ctrl	Seshendra Gadagottu	2016-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	During gpu init, therm gate control is required to add delay cycles before clock gating. Bug 1717152 Change-Id: Ifabc428cf7b49e49964dc994eba2c38af4aa1a91 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/936443 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: add channel_set_priority support	Richard Zhao	2016-01-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- add gops.fifo.channel_set_priority and move current code as native callback. - implement the callback for vgpu Bug 1701079 Change-Id: If1cd13ea4478d11d578da2f682598e0c4522bcaf Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/932829 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: correct thermal slowdown factor	Seshendra Gadagottu	2016-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	With extended mode enable, correct thermal slowdown factors to have divideby2, divideby4 and divideby8 slowdown. Bug 1719974 Change-Id: I1e3a3f869657ce7c6409851df0ccd1523a06544b Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/933282 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: bitmap allocator for comptags	Konsta Holtta	2016-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Restore comptags to be bitmap-allocated, like they were before we had the buddy allocator. The new buddy allocator introduced by e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally 6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but unsuitable for the small compbit store. This commit reverts partially the combination of the above commit and also one after it, 86fc7ec9a05999bea8de320840b962db3ee11410, that fixed a bug which is not present when using a bitmap. With a bitmap allocator, pruning the extra allocation necessary for user-mapped mode is possible, so that is also restored. The original generic bitmap allocator is not restored; instead, a comptag-only allocator is introduced. Bug 200145635 Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/837180 (cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2) Reviewed-on: http://git-master/r/839742 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support masking hww_warp_esr	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below API pointer to support masking of hww_warp_esr after hardware read of register and before using it further u32 (*mask_hww_warp_esr)(u32 hww_warp_esr) If needed, this API will mask value of hww_warp_esr appropriately and return it Bug 200156699 Change-Id: I1afb1347e650fab607009c1ee55691484653a4c1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/927133 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: API to extract context id	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add new API gr_gk20a_get_ctx_id() to get/extract context id from GR context Bug 200156699 Change-Id: If0e8887a9a6b139cd795bf03f5def64fd664d12b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/927130 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support preprocessing of SM exceptions	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support preprocessing of SM exceptions if API pointer pre_process_sm_exception() is defined Also, expose some common APIs Bug 200156699 Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/925883 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add high priority channel interleave	Peter Pipkorn	2016-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interleave all high priority channels between all other channels. This reduces the latency for high priority work when there are a lot of lower priority work present, imposing an upper bound on the latency. Change the default high priority timeslice from 5.2ms to 3.0 in the process, to prevent long running high priority apps from hogging the GPU too much. Introduce a new debugfs node to enable/disable high priority channel interleaving. It is currently enabled by default. Adds new runlist length max register, used for allocating suitable sized runlist. Limit the number of interleaved channels to 32. This change reduces the maximum time a lower priority job is running (one timeslice) before we check that high priority jobs are running. Tested with gles2_context_priority (still passes) Basic sanity testing is done with graphics_submit (one app is high priority) Also more functional testing using lots of parallel runs with: NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 –finish plus multiple: NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 -finish Previous to this change, the relative performance between high priority work and normal priority work comes down to timeslice value. This means that when there are many low priority channels, the high priority work will still drop quite a lot. But with this change, the high priority work will roughly get about half the entire GPU time, meaning that after the initial lower performance, it is less likely to get lower in performance due to more apps running on the system. This change makes a large step towards real priority levels. It is not perfect and there are no guarantees on anything, but it is a step forwards without any additional CPU overhead or other complications. It will also serve as a baseline to judge other algorithms against. Support for priorities with TSG is future work. Support for interleave mid + high priority channels, instead of just high, is also future work. Bug 1419900 Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98 Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com> Reviewed-on: http://git-master/r/805961 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: enable semaphore acquire timeout	Richard Zhao	2016-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It'll detect dead semaphore acquire. The worst case is when ACQUIRE_SWITCH is disabled, semaphore acquire will poll and consume full gpu timeslicees. The timeout value is set to half of channel WDT. Bug 1636800 Change-Id: Ida6ccc534006a191513edf47e7b82d4b5b758684 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/928827 GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>