nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: Update patch count after adding	Terje Bergstrom	2017-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When kernel adds patches to a context, kernel needs to update the patch count in order for FECS to pick up the new patches. Previously patching was done only at the context creation time. Now patching is used also when changing preemption mode, but the patches did not take effect due to not updating count. Update patch count every time we end patching of a context. Bug 1852094 Change-Id: Ic2150741609d1d1956769e439ce1c5f2edcacb84 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1280424 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Start re-organizing the HW headers	Alex Waterman	2017-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reorganize the HW headers of gk20a. The headers are moved to a new directory: include/nvgpu/hw/gk20a And from the code are included like so: #include <nvgpu/hw/gk20a/hw_pwr_gk20a.h> This is the first step in reorganizing all of the HW headers for gm20b, gm206, etc. This is part of a larger effort to re-structure and make the driver more readable and scalable. Bug 1799159 Change-Id: Ic151155cbc2e6f75009f2d9d597b364a1bed2c4c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1244790 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Move allocators to common/mm/	Alex Waterman	2017-01-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the GPU allocators to common/mm/ since the allocators are common code across all GPUs. Also rename the allocator code to move away from gk20a_ prefixed structs and functions. This caused one issue with the nvgpu_alloc() and nvgpu_free() functions. There was a function for allocating either with kmalloc() or vmalloc() depending on the size of the allocation. Those have now been renamed to nvgpu_kalloc() and nvgpu_kfree(). Bug 1799159 Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1274400 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix out-of-bound access on gr->map_tiles	Deepak Nibade	2017-01-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix slab-out-of-bounds issue reported by KASAN [ 28.464077] BUG: KASAN: slab-out-of-bounds in gr_gk20a_init_map_tiles+0x624/0x708 at addr ffffffc1a098ee01 ... [ 28.503241] INFO: Allocated in gr_gk20a_init_map_tiles+0x2dc/0x708 age=11 cpu=5 pid=1 out-of-bound access from below 3 stacks : [1] [ 28.782886] [<ffffffc0007d5f64>] gr_gk20a_init_map_tiles+0x624/0x708 [ 28.789228] [<ffffffc0007eadf0>] gk20a_init_gr_support+0x2d0/0xeb0 [ 28.795397] [<ffffffc00079d9c8>] gk20a_pm_finalize_poweron+0x738/0xd10 [2] [ 29.268070] [<ffffffc0007d618c>] gr_gk20a_zcull_init_hw+0x144/0x730 [ 29.274329] [<ffffffc0007d6a00>] gk20a_init_gr_setup_hw+0x288/0x1530 [ 29.280677] [<ffffffc0007eac6c>] gk20a_init_gr_support+0x14c/0xeb0 [ 29.286938] [<ffffffc00079d9c8>] gk20a_pm_finalize_poweron+0x738/0xd10 [3] [ 50.076223] [<ffffffc000d1df14>] gr_gk20a_setup_rop_mapping+0x5e4/0x2018 [ 50.082913] [<ffffffc000d2559c>] gr_gk20a_init_fs_state+0x80c/0x1028 [ 50.089259] [<ffffffc000ddcbc8>] gr_gm20b_init_fs_state+0xc8/0x960 [ 50.095430] [<ffffffc000e413f8>] gr_gp10b_init_fs_state+0x5c0/0x5d8 [ 50.101687] [<ffffffc000d2ed30>] gk20a_init_gr_setup_hw+0x1b48/0x2418 [ 50.108115] [<ffffffc000d50bc0>] gk20a_init_gr_support+0x19e0/0x1ab0 [ 50.114457] [<ffffffc000cc7af8>] gk20a_pm_finalize_poweron+0xd20/0x1558 Fix this by adding below - allocate gr->map_tiles[] with size of (num_gpc * num_tpc_per_gpc) intead of num_gpc - add new static API gr_gk20a_get_map_tile_count() which returns tile count for given index, and returns 0 for out-of-bounds access Bug 200257557 Change-Id: If572837ffb661f92a21be5ce855d0146b2609cb0 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1279411 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix out-of-bound access on gr->gpc_tpc_count	Deepak Nibade	2017-01-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix slab-out-of-bounds issue reported by KASAN [ 29.922710] BUG: KASAN: slab-out-of-bounds in gr_gk20a_init_fs_state+0x1bc/0x898 at addr ffffffc1a0988c04 ... [ 29.961820] INFO: Allocated in gr_gk20a_init_gr_config+0x380/0x1b20 age=374 cpu=5 pid=1 ... Out-of-bound access from [ 30.241943] [<ffffffc0007d2674>] gr_gk20a_init_fs_state+0x1bc/0x898 [ 30.248205] [<ffffffc000839a2c>] gr_gm20b_init_fs_state+0x4c/0x5c8 [ 30.254381] [<ffffffc000871670>] gr_gp10b_init_fs_state+0x160/0x3a8 [ 30.260643] [<ffffffc0007d70ec>] gk20a_init_gr_setup_hw+0x974/0x1530 [ 30.266991] [<ffffffc0007eac6c>] gk20a_init_gr_support+0x14c/0xeb0 [ 30.273164] [<ffffffc00079d9c8>] gk20a_pm_finalize_poweron+0x738/0xd10 [ 30.279684] [<ffffffc00079dfd0>] gk20a_pm_runtime_resume+0x30/0x58 Fix this by using a separate API gr_gk20a_get_tpc_count() which returns tpc count for a gpc and returns 0 if gpc index is greater than available gpcs Bug 200257557 Change-Id: I78856ca93c0381cb4bcef7a56a5210fa269cf3ac Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1277692 GVS: Gerrit_Virtual_Submit Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: copy data into channel context header	seshendra Gadagottu	2016-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If channel context has separate context header then copy required info into context header instead of main context header. JIRA GV11B-21 Change-Id: I5e0bdde132fb83956fd6ac473148ad4de498e830 Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1229243 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Enable signed versus non-signed errors	Terje Bergstrom	2016-12-08
\| \| \| \| \| \| \| \| \| \| \|	Fix a few trivial signed versus unsigned problems, and enable compilation flag to treat them as errors. Change-Id: I68cc327885ef1efb12db7f347a2699a65415f889 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1265291 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix pes_tpc_count	Peter Daifuku	2016-12-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In calculation of pes_tpc_count, accumulate the number of PEs with TPCs connected to them instead of using the architectural maximum number. Bug 200250616 Change-Id: I4b2edc420ac03e24f2c298587d4dd1d77c51f5d6 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1262642 (cherry picked from commit 65723cf5be8fe24bcaf56570883f0880a198efcb) Reviewed-on: http://git-master/r/1263958 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: chip specific channel commit_inst	seshendra Gadagottu	2016-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add function pointer to add chip specific commit_inst. Update this function pointer for gk20a and gm20b. JIRA GV11B-21 Change-Id: Iae7231fae70c7b4f56647fe242776670675de3fd Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1258275 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix setup_rop_mapping for gm20b+	Konsta Holtta	2016-11-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gm20b_init_gr does not inherit the ops set by gk20a_init_gr_ops, and the gr.setup_rop_mapping HAL was not set there, so it was not set for chips that inherit from gm20b_init_gr and do not override it explicitly. Set the pointer in gm20b_init_gr, which other chips inherit, and delete the surrounding if condition from the call, making sure that future users always call it, because there is an implementation since the earliest supported chip. Bug 1833382 Change-Id: I7893c9aac7c5c49ce9a55031ea6baa9382a1b7ca Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1258960 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: free veid bundle init data	seshendra Gadagottu	2016-11-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	During gk20a_remove_gr_support, free veid bundle init data. JIRA GV11B-21 Change-Id: Ie1ea7387202c0bae55d5e5f0e1827b5b7b826e96 Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1254869 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: FBPA broadcast support	tk	2016-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add FBPA broadcast support to hwpm regops Bug 200249125 Change-Id: Iaf413a162a8985bcce94ff96ec6318e129609c4c Signed-off-by: Tejaswi K <tk@nvidia.com> Reviewed-on: http://git-master/r/1247408 (cherry picked from commit 4e0a805f5a8762d1a90f3b5dd76902a04941d9ef) Reviewed-on: http://git-master/r/1252160 Tested-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix signed comparison bugs	Terje Bergstrom	2016-11-17
\| \| \| \| \| \| \| \| \| \| \| \|	Fix small problems related to signed versus unsigned comparisons throughout the driver. Bump up the warning level to prevent such problems from occuring in future. Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1252068 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Remove IOCTL FREE_OBJ_CTX	Terje Bergstrom	2016-11-11
\| \| \| \| \| \| \| \| \| \| \| \| \|	We have never used the IOCTL FREE_OBJ_CTX. Using it leads to context being only partially available, and can lead to use-after-free. Bug 1834225 Change-Id: I9d2b632ab79760f8186d02e0f35861b3a6aae649 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1250004 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Do not use invalid engine ID in bitshift	Terje Bergstrom	2016-11-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In calls to gk20a_fifo_recover() we pass a bitfield of engines to recover. We generate the bitfield by acquiring engine id from FIFO, and using BIT(). If GR engine is now known, the resulting engine ID is u32 with all bits set, which cannot be passed to BIT(). gk20a_fifo_recover() can already deal with all bits set, so pass that verbatim instead. Change-Id: Ib79d8e7e156deef0d483642cfb1ce7bf55f3c572 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1249964 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gk20a: Fix FBP/L2 masks, add GET_FBP_L2_MASKS	Sami Kiminki	2016-11-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix FBP and ROP_L2 enable masks for Maxwell+. Deprecate rop_l2_en_mask in GPU characteristics by adding _DEPRECATED postfix. The array is too small to hold ROP_L2 enable masks for desktop GPUs. Add NVGPU_GPU_IOCTL_GET_FBP_L2_MASKS to expose the ROP_L2 masks for userspace. Bug 200136909 Bug 200241845 Change-Id: I5ad5a5c09f3962ebb631b8d6e7a2f9df02f75ac7 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/1245294 (cherry picked from commit 0823b33e59defec341ea7919dae4e5f73a36d256) Reviewed-on: http://git-master/r/1249883 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: smid programming	seshendra Gadagottu	2016-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \|	Populate chip specific sm id table. JIRA GV11B-21 Change-Id: I58869b2c3e55449a7d999ddf73d6eb7b359b2a07 Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1227095 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: chip specific commit global timeslice	seshendra Gadagottu	2016-11-01
\| \| \| \| \| \| \| \| \| \| \| \|	Implement chip specific commit_global_timeslice function. JIRA GV11B-21 Change-Id: I937dda77870f164d034686d6d41482c875940320 Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1243944 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Move ELCG programming to therm	Terje Bergstrom	2016-10-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move ELCG parameter programming to a new function in therm, elcg_init_idle_filter. Implement gk20a variant and use it for gk20a and gm20b. JIRA DNVGPU-74 Change-Id: I8ef400f3a6195311fb9e7da8db6c34993d62f461 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1220433 (cherry picked from commit f6654ae4d83d31cd40b317bf55922964bbfa575d) Reviewed-on: http://git-master/r/1239421 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: check engine ctx_status in wait_idle	Deepak Nibade	2016-10-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have following bug where GPU Host reports non-idle when it should report engine idle - if a context is preempted off the GPU, and there is no other context to load, NV_PGRAPH_ENGINE_STATUS will not be idle until new context is loaded - this could cause gr_gk20a_wait_idle() to fail since here we rely only on NV_PGRAPH_ENGINE_STATUS to decide if engine is busy or not To fix this, first check if context is valid or not from NV_PFIFO_ENGINE_STATUS_CTX_STATUS If context is invalid, return immediately Otherwise, continue as before Also, add accessors for invalid ctx_status Bug 1826768 Change-Id: Id627be3f02e79f4beac59a8b5195d08eabf651f2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1237521 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add func ptr for gpc exceptions	Seema Khowala	2016-10-17
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add function ptr for enabling gpc exceptions JIRA GV11B-28 JIRA GV11B-27 Change-Id: I4c7e4300825bf096c22f229ae7196f324ce40037 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: http://git-master/r/1236902 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix zcull programming	Seema Khowala	2016-10-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are eight tiles per map tile register and depending on how many tpcs are present, there is a chance that s/w will be accessing un-allocated memory for reading tile values from temp buffers. Bug 1735760 Change-Id: I5c0e09ec75099aaf6ad03dde964b9e93c2dc2408 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: http://git-master/r/1221580 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: program sw veid bundles	seshendra Gadagottu	2016-10-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Query sw veid bundles from sim/netlist and initialize hardware with those bundles. JIRA GV11B-11 Change-Id: I26f174781f0b00b919afac407e2bb9e1fa7b158a Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1231597 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: setup chip specific rop mapping	seshendra Gadagottu	2016-10-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for setting-up chip specific rop mapping. JIRA GV11B-21 Change-Id: If94f0de7d767f572095602a831ad6be4b764fff4 Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1234547 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Suppress error msg from VBIOS overlay	Terje Bergstrom	2016-10-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Suppress error message when nvgpu tries to load VBIOS overlay, but one is not found. This situation is normal. This is done by moving gk20a_request_firmware() to be nvgpu generic function nvgpu_request_firmware(), and adding a NO_WARN flag to it. Introduce also a NO_SOC flag to suppress attempt to load firmware from SoC specific directory in addition to the chip specific directory. Use it for dGPU firmware files. Bug 200236777 Change-Id: I0294d3308f029a6a6d3c2effa579d5f69a91e418 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1223840 (cherry picked from commit cca44c3f010f15918cdd2259c15170ba1917828a) Reviewed-on: http://git-master/r/1233353 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: use vzalloc for golden_ctx_image	Sachit Kadle	2016-09-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the size of the golden_ctx_image is large, the allocation may intermittently fail when using kzalloc. Since we don't need physically continguous memory, use vzalloc instead. Bug 200231436 Change-Id: Ic2fb31dea94c8721832dc257334608e1fc283943 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1207172 (cherry picked from commit 994a7b162ec74518ae0f50dfb5ac197e44019992) Reviewed-on: http://git-master/r/1229472 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Skip calling undefined prod callsbacks	Terje Bergstrom	2016-09-29
\| \| \| \| \| \| \| \| \| \| \| \| \|	Do not call load prod callbacks that are set to NULL. Bug 1799537 Change-Id: Ie951fb71fa8eacd10623abcd058f32db59004c2e Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1208467 (cherry picked from commit c020e16adfa2b2bc2e3e8d0c63527a6089c59906) Reviewed-on: http://git-master/r/1227268 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Post GR_SEMAPHORE_WRITE_AWAKEN event	Nikhil Mahale	2016-09-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Post GR_SEMAPHORE_WRITE_AWAKEN event on semaphore write awken interrupt for channel. BUG 200223530 Change-Id: I19eb61578d1c562be84e20ecaff9fb3bc9ace516 Signed-off-by: Nikhil Mahale <nmahale@nvidia.com> Reviewed-on: http://git-master/r/1193726 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: select target based on aperture	Deepak Nibade	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While programming ucode's inst block in API gr_gk20a_load_falcon_bind_instblk(), use gk20a_aperture_mask() to select target address (i.e. if address is in sysmem or vidmem) based on aperture Also add target accessors for gr_fecs_new_ctx and gr_fecs_arb_ctx_ptr Jira DNVGPU-22 Change-Id: I88198080f188b349a4448a229dff8416a6a18073 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1216139 (cherry picked from commit 42bc14110df17400dd655bc994dc9e61c73048b1) Reviewed-on: http://git-master/r/1219703 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Call init_cbc only when defined	Terje Bergstrom	2016-09-12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Call init_cbc only when it contains a non-NULL pointer. Bug 1799537 Change-Id: Ic23f264e10daff30365bf3cf86ac9c155f50e497 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1208008 (cherry picked from commit ec69fa15c32f49d96939fd9a672faec45e078dfa) Reviewed-on: http://git-master/r/1217298 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: use vidmem for gr ctx if available	Konsta Holtta	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the common gk20a_gmmu_alloc() that tries vidmem too. Jira DNVGPU-24 Change-Id: I5dfd7eaab737a5290b4d21ac575d6b89777a567e Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1209077 (cherry picked from commit e3085d37735c8f1cf4845621f29fe9d2689aad4b) Reviewed-on: http://git-master/r/1184330 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Tested-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: send only one event to the debugger	Cory Perry	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Event notifications on TSGs should only be sent to the channel that caused the event to happen in the first place, not evey channel in the tsg. Any more and the debugger will not be able to tell what channel actually got the event. Worse yet, if all the channels in a tsg are bound to the same debug session (as is the case with cuda-gdb), then multiple nvgpu events for the same gpu event will be triggered, causing events to be buffered and the client to get out of sync. One gpu exception, one nvgpu event per tsg. Bug 1793988 Signed-off-by: Cory Perry <cperry@nvidia.com> Change-Id: I4efb83b0593bd1af38f2342c80793d9db56e42b1 Reviewed-on: http://git-master/r/1194203 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: post bpt events after processing	Deepak Nibade	2016-08-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently post bpt events (bpt.int and bpt.pause) even before we process and clear the interrupts and this could cause races with UMD Fix this by posting bpt events only after we are done processing the interrupts Bug 200209410 Change-Id: Ic3ff7148189fccb796cb6175d6d22ac25a4097fb Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1184109 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add check for is_fmodel	Seema Khowala	2016-07-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	is_fmodel flag will be set in gk20a_probe(). Updated code for is_fmodel check, instead of check for supported simulated platforms. Bug 1735760 Change-Id: I7cbac2196130fe5ce4c1a910504879e6948c13da Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: http://git-master/r/1177869 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Adeel Raza <araza@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: check for valid function pointers	seshendra Gadagottu	2016-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before calling prod settings functions, check for availability of those functions. Similar check is extended for get_clk_freqs. Bug 1735760 Change-Id: Ic4b38079043ab2049a479a2d8bb0cb6091e94f4a Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1181571 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Adeel Raza <araza@nvidia.com>
*	gpu: nvgpu: Full chip support for ctxsw	neha	2016-07-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nvgpu changes needed to handle the newly added ctxsw lists Fix regops support for ppc registers Squashed from: Change-Id: I08e6dec3bb2f7aa51de912c9d1c84a350ce07f72 Signed-off-by: neha <njoshi@nvidia.com> Reviewed-on: http://git-master/r/1151010 (cherry picked from commit fd03ad9f09e66f78db88fb7ece448e26e0515821) and: Change-Id: I75a7f810ee0b613c22ac2cef2d936563d8067f97 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1158888 (cherry picked from commit f00a7fcc57fb937b800e46760087ff6f7637520c) Bug 200180000 Bug 1771830 Reviewed-on: http://git-master/r/1164397 (cherry picked from commit 7028f051e4f37edeff90a9923f022cec6c645a8f) Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Change-Id: I796ddf93ef37170843a4a6b44190cd6780d25852 Reviewed-on: http://git-master/r/1183588 Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: use vidmem by default in gmmu_alloc variants	Konsta Holtta	2016-07-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For devices that have vidmem available, use the vidmem allocator in gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem. Because all of the buffers haven't been tested to work in vidmem yet, rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at the end to declare explicitly that vidmem is used. Enabling vidmem for each now is a matter of removing "_sys" from the function call. Jira DNVGPU-18 Change-Id: Ibe42f67eff2c2b68c36582e978ace419dc815dc5 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1176805 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support in-kernel vidmem mappings	Konsta Holtta	2016-07-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that buffers represented as a mem_desc and present in vidmem can be mapped to gpu. JIRA DNVGPU-18 JIRA DNVGPU-76 Change-Id: I46cf87e27229123016727339b9349d5e2c835b3e Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1169308 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: initial support for vidmem apertures	Konsta Holtta	2016-07-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	add gk20a_aperture_mask() for memory target selection now that buffers can actually be allocated from vidmem, and use it in all cases that have a mem_desc available. Jira DNVGPU-76 Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1169306 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: kmalloc does not return error codes	Terje Bergstrom	2016-07-04
\| \| \| \| \| \| \| \| \|	kmalloc() returns NULL instead of error code on failure. Do not check if the return value is an error code. Change-Id: I31a46080ab51773a22bebe4cf03a5b0c94467204 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1172052
*	gpu: nvgpu: set preempt state in golden ctx init	Thomas Fleury	2016-06-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some parameters like gfxp_wfi_timeout are context switched. Once context has been initialized with default values (sw_ctx_load), we need to ensure that preemption state is properly set before saving golden ctx image. Bug 1593548 Jira VFND-1894 Change-Id: Ib1ba03f4ca1606302b1cf1f0738d3610a162a5c6 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1168662 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add init_preemption_state gr method	Thomas Fleury	2016-06-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This method is called when setting up gr hardware. It is meant to adjust preemption parameters. Bug 1593548 Jira VFND-1894 Change-Id: I0f5aa3212bec3058a0493366bed6fe2a365c9542 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1162625 (cherry picked from commit c2e6d12570af28b3aae087401d7f670df40d40bd) Reviewed-on: http://git-master/r/1166987 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: force clean patch ctx begin/end	Konsta Holtta	2016-06-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch_context map/unmap pair has become a mere wrapper for the more general gk20a_mem_{begin,end}(). To be consistent about mappings, require that each patch_write is surrounded by an explicit begin/end pair, instead of relying on possible inefficient per-write map/unmap. Remove also the cpu_va check from .._write_end() since the buffers may be exist in vidmem without a cpu mapping. JIRA DNVGPU-24 Change-Id: Ia05d52d3d712f2d63730eedc078845fde3e217c1 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1157298 GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add multiple engine and runlist support	Lakshmanan M	2016-06-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This CL covers the following modification, 1) Added multiple engine_info support 2) Added multiple runlist_info support 3) Initial changes for ASYNC CE support 4) Added ASYNC CE interrupt handling support for gm206 GPU family 5) Added generic mechanism to identify the CE engine pri_base address for gm206 (CE0, CE1 and CE2) 6) Removed hard coded engine_id logic and made generic way 7) Code cleanup for readability JIRA DNVGPU-26 Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1155963 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: return if no fecs intr	Deepak Nibade	2016-06-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_gr_handle_fecs_error(), if we do not see any error interrupt from gr_fecs_host_int_status_r(), just return immediately Bug 1646259 Change-Id: Iea037e0dab57111d2a0fb41c5c19529b7d6c83c0 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1158591 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix calculation of timeout	Terje Bergstrom	2016-06-05
\| \| \| \| \| \| \| \| \| \| \|	Fix calculation of timeout in multiple places. The #defines GR_IDLE_CHECK_DEFAULT and GR_IDLE_CHECK_MAX are meant to be used only for defining the frequency of checking for timeout. Using them for actual timeouts makes the timeout really short. Change-Id: I3d0f8cbc91d619be8e5a9168ee1ab1d6298f129b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1158269
*	gpu: nvgpu: Add context reset at golden context init	Terje Bergstrom	2016-06-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Part of golden context initialization is in powerup sequence, and part done as part of first channel creation. The sequence is missing a context reset, which causes initialization of golden context to fail on dGPU. Just moving the code to golden context initialization does not work, because iGPU can be rail gated, and part of the sequence is required in GPU boot. Thus a part of context initialization is replicated to golden context init after a context reset. Change-Id: Ife1b167447018317d3a692b706880e0eda073e43 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1130698
*	gpu: nvgpu: use correct APIs for disable and preempt	Deepak Nibade	2016-06-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gr_gk20a_ctx_zcull_setup(), gr_gk20a_update_smpc_ctxsw_mode(), and in gk20a_channel_suspend(), we call channel specific APIs to disable/preempt/enable channel But we do not consider TSGs in this case Hence use correct (below) APIs in above functions which will handle channel or TSG internally : gk20a_disable_channel_tsg() gk20a_fifo_preempt() gk20a_enable_channel_tsg() Bug 200205041 Change-Id: Ieed378dac4ad2322b35f9102706176ec326d386c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1157189 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: disable/preempt TSG for hwpm update	Deepak Nibade	2016-05-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To update hwpm, we currently disable/preempt only one channel without considering if channel could be part of a TSG Hence, use proper APIs to disable/preempt/enable which will internally handle channel/TSG case Bug 200203191 Change-Id: I329a3c02d635265775f2081abba8e047f491fe7d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1155838 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use correct IO register pointer	Terje Bergstrom	2016-05-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	Use correct IO register pointer in cyclestats code. The code used reg_mem which is not supposed to be used. It is defined only on iGPU. Change-Id: I03cdaf5d2add2bf2c7cc6d7b3c41ac3be0f9a768 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1154708 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com> Reviewed-by: Ken Adams <kadams@nvidia.com>