nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: Initialize hwpm perfmons (engine_sel)	Vaibhav Kachore	2018-07-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- For Mode-E ctxsw it is required that engine_sel is set to 0xFFFFFFFF. - Default 0 is a valid signal and causes problems. Bug 2106999 Change-Id: I5cdb4441a8e6d7e8133c31a9e361b54611dd2995 Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1770755 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: enable HWPM Mode-E context switch	Vaibhav Kachore	2018-07-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Write new pm mode to context buffer header. Ucode use this mode to enable mode-e context switch. This is Mode-B context switch of PMs with Mode-E streamout on one context. If this mode is set, Ucode makes sure that Mode-E pipe (perfmons, routers, pma) is idle before it context switches PMs. - This allows us to collect counters in a secure way (i.e. on context basis) with stream out. Bug 2106999 Change-Id: I5a7435f09d1bf053ca428e538b0a57f3a175ac37 Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1760366 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gv11b: enable RMW for gpu atomics	Ashish Srivastava	2018-06-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Separate HAL added in gv11b and gv100 for init_gpc_mmu function. In gv11b HAL, RMW is enabled for gpu atomics as default. In gv100 HAL, GPC atomic capability mode will get set based on the FB MMU capability. If GPU is connected through NVLINK then mmu will be set to RMW mode, else it will be in L2 mode. Bug 200390336 Change-Id: I224934f83d1762ec864ef8da7265dd01d86893a0 Signed-off-by: Ashish Srivastava <assrivastava@nvidia.com> Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1735137 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gv100: consider floorswept FBPA for getting unicast list	Deepak Nibade	2018-04-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gr_gv11b/gk20a_create_priv_addr_table() we do not consider floorswept FBPAs and just calculate the unicast list assuming all FBPAs are present This generates incorrect list of unicast addresses Fix this introducing new HAL ops.gr.split_fbpa_broadcast_addr Set gr_gv100_get_active_fpba_mask() for GV100 Set gr_gk20a_split_fbpa_broadcast_addr() for rest of the chips gr_gv100_get_active_fpba_mask() will first get active FPBA mask and generate unicast list only for active FBPAs Bug 200398811 Jira NVGPU-556 Change-Id: Idd11d6e7ad7b6836525fe41509aeccf52038321f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1694444 GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gv100: fix PMA list alignment in ctxsw buffer	Deepak Nibade	2018-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GV100 ucode is changed so that it expects LIST_nv_perf_pma_ctx_reg list in ctxsw buffer to be 256 byte aligned but same change is not applied to other chip ucodes ADD new HAL (*add_ctxsw_reg_perf_pma) to configure PMA register list and define a common HAL gr_gk20a_add_ctxsw_reg_perf_pma() for all other chips except GV100 Define a separate HAL for GV100 gr_gv100_add_ctxsw_reg_perf_pma() and fix the required alignment in this function Bug 1998067 Change-Id: Ie172fe90e2cdbac2509f2ece953cd8552e66fc56 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1676655 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gv100: fix num_fbpas while adding ctxsw buffer entries	Deepak Nibade	2018-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For LIST_nv_pm_fbpa_ctx_regs, we right now call add_ctxsw_buffer_map_entries_subunits() to add registers corresponding to all the FBPAs But while configuring total number of registers, we do not consider floorswept FBPAs and that causes misalignment in subsequent lists for GV100 Fix this by reading disabled/floorswept FBPAs from fuse and consider only those FBPAs which are active for GV100 Add new HAL (*add_ctxsw_reg_pm_fbpa) to support this setting and define a common HAL gr_gk20a_add_ctxsw_reg_pm_fbpa() for all chips except GV100 Define GV100 specific gr_gv100_add_ctxsw_reg_pm_fbpa() with above mentioned implementation to consider floorsweeping Bug 1998067 Change-Id: Id560551bb0b8142791c117b6d27864566c90b489 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1676654 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: vgpu: get virtual SMs mapping	Thomas Fleury	2018-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On gv11b we can have multiple SMs per TPC. Add sm_per_tpc in vgpu constants to properly dimension the virtual SM to TPC/GPC mapping in virtualization case. Use TEGRA_VGPU_CMD_GET_SMS_MAPPING to query current mapping. Bug 2039676 Change-Id: I817be18f9a28cfb9bd8af207d7d6341a2ec3994b Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1631203 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gv1xx: resize patch buffer	David Nieto	2017-10-26
\| \| \| \| \| \| \| \| \| \| \| \| \|	Follow the sizing consideration in bug 1753763 to support dynamic TPC modes and subcontexts. bug 200350539 Change-Id: Ibbdbf02f9c2ea3f082c1b2810ae7176b0775d461 Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1584034 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix smid generation of perf tables	David Nieto	2017-10-20
	SMID tables were generated according with the local tpc and the pagepool and cb buffers from a different chip and did not take performance in consideration, which made compute kernels hang with CTAs on the fly. This change ensures we are using the right sizes and adds proper enumeration of smids. JIRA: NVGPUGV100-36 bug 2004378 Change-Id: Ic8f50c325d6d6720cca41d9740ae4f5f51e1100a Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1581664 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>