nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: remove last_submit tracking	Sachit Kadle	2016-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We previously used to wait on the last_submit fence before disabling a channel. Since this part of the code is no longer exercised, we can remove this tracking. Bug 1795076 Change-Id: I54ba2ebaf48772aa775654c0fb4ab614a7167969 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1206585 Reviewed-by: Automatic_Commit_Validation_User (cherry picked from commit e4e236f2b487b8cfa31f7afd29fad3c97de5f844) Reviewed-on: http://git-master/r/1209166 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvpgu: free hw_sema before as binding	Sachit Kadle	2016-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Free the hw_sema before releasing a channel's address space binding when freeing a channel. Since the semaphore pool can be freed after releasing the address space, we need to do this earlier on. Bug 1795076 Change-Id: Ic8ae7510af7be862feb6694130c6ce8fc0b8e411 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1208071 (cherry picked from commit 82a52fb6789b1c9361c1567f082ca36135287294) Reviewed-on: http://git-master/r/1209165 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: improve sync create/destroy logic	Sachit Kadle	2016-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change improves the aggressive sync creation & destruction logic to avoid lock contention in the submit path. It does the following: 1) Removes the global sync destruction (channel) threshold, and adds a per-platform parameter. 2) Avoids lock contention in the clean-up/submit path when aggressive sync destruction is disabled. 3) Creates sync object at gpfifo allocation time (as long as we are not in aggressive sync destroy mode), to enable faster first submits Bug 1795076 Change-Id: Ifdb680100b08d00f37338063355bb2123ceb1b9f Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1202425 (cherry picked from commit ac0978711943a59c6f28c98c76b10759e0bff610) Reviewed-on: http://git-master/r/1202427 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: enable fast submit path	Aingara Paramakuru	2016-09-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Submit job-tracking is necessary for any of the following conditions: - pre- or post-fence functionality - channel wdt - GPU rail-gating - buffer refcounting If none of the conditions are met, then job tracking is not required and a fast submit can be done (ie. only need to write out userspace GPFIFO entries and update GP_PUT). Bug 1795076 Change-Id: If94d195e3a18a6b623e167829d291ec98a7a43a1 Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1203511 (cherry picked from commit 13d7cfe94559dc52cb0bba7f9e48848e0858be81) Reviewed-on: http://git-master/r/1223066 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix handling of return data in pmu messages	Vijayakumar	2016-09-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Irrespective generic MSG ptr, pick up data that PMU sends as response to commands JIRA DNVGPU-85 Change-Id: I97dd2abcd9e2a7ad7bfe1270f9905a5b69e196f3 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/1205119 Reviewed-on: http://git-master/r/1205447 (cherry picked from commit b1130124157acb2cfb4d04a0dd6ee8c4c0c830e5) Reviewed-on: http://git-master/r/1222684 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix pmu msg and command queue init	Vijayakumar Subbu	2016-09-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Offset needs to be calculated for individual queues from init ack content Bug 200229814 Signed-off-by: Vijayakumar Subbu <vsubbu@nvidia.com> Change-Id: I93276b9cbab48e7fc42fb6c2a8edf382afb82f71 Reviewed-on: http://git-master/r/1202291 (cherry picked from commit 0e0abd478a13a5163e2b83d07307ed7136c4920e) Reviewed-on: http://git-master/r/1205442 (cherry picked from commit f402cc2a9d0be05b5b95d5d0acbfc66f3b78b309) Reviewed-on: http://git-master/r/1222683 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: update mclk support	Mahantesh Kumbar	2016-09-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Update payload interface to support mclk - Call mclk after gr init complete JIRA DNVGPU-85 Change-Id: I14c5c6cb438f1a7d56d96daa0fafc09d6abef46b Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/1205461 (cherry picked from commit f1bf1ec946aaacae40ecb405341eb2e169cf5754) Reviewed-on: http://git-master/r/1217989 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Vidmem support for PMU	Mahantesh Kumbar	2016-09-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add vidmem support for PMU. Introduces pmu_surface, which abstracts the memory used, and allocator helpers for both sysmem and vidmem. JIRA DNVGPU-85 Change-Id: I61ce137c7007d82010e900759bf8acaf31fba286 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/1196518 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/1203125 (cherry picked from commit 665f5748108c50fe0c9b4c1486b9d74869477668) Reviewed-on: http://git-master/r/1217628 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: log page table addr only for sysmem	Konsta Holtta	2016-09-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Don't attempt to use get_iova_addr() on vidmem which does not make sense. Jira DNVGPU-20 Change-Id: Ibfe1516b88ed8b60b8134c330e6b0569d52cbb5b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1217077 (cherry picked from commit c912f0349d24fde033dbcd9874948ff14ad89a43) Reviewed-on: http://git-master/r/1221264 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Fix invalid test for signaled sync_fences	Alex Waterman	2016-09-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a check that was backwards for signaled sync_fences. This would cause the code to not wait on some sync_fences that had not already signaled and wait on other fences that had signaled. Bug 1787348 Reviewed-on: http://git-master/r/1204710 (cherry picked from commit 75b94bb30f79c3a7a9992773dc8a93b507121006) Change-Id: I00b0f8a373a9954a5ad9ab31aff6423e91574153 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1221044 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Move submit sync code into its own function	Alex Waterman	2016-09-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the submit synchornization code into it's own function. This should help keep the submit code path a little more readable and understandable. Bug 1732449 Reviewed-on: http://git-master/r/1203833 (cherry picked from commit f931c65c166aeca3b8fe2996dba4ea5133febc5a) Change-Id: I4111252d242a4dbffe7f9c31e397a27b66403efc Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1221043 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Greatly simplify the semaphore detection	Alex Waterman	2016-09-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Greatly simplify and make more robust the gpu semaphore detection in sync_fences. Instead of using a magic number use the parent timeline of sync_pts. This will also work with multi-GPU setups using nvgpu since the timeline ops pointer will be the same across all instances of nvgpu. Bug 1732449 Reviewed-on: http://git-master/r/1203834 (cherry picked from commit 66eeb577eae5d10741fd15f3659e843c70792cd6) Change-Id: I4c6619d70b5531e2676e18d1330724e8f8b9bcb3 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1221042 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Optimize sync fence creation	Alex Waterman	2016-09-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Only create sync-fences in the semaphore synchronization path when they are actually needed (i.e requested by userspace). Bug 1795076 Reviewed-on: http://git-master/r/1201564 (cherry picked from commit dc52d424a839e6c064c02b7f02905dd6a59a50af) Change-Id: Ieac6aef415678d4ea982683a955897c64959436e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1221041 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add safety for vidmem addresses	Deepak Nibade	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new API set_vidmem_page_alloc() which sets BIT(0) in sg_dma_address() only for vidmem allocation Add and use new API get_vidmem_page_alloc() which receives scatterlist and returns pointer to vidmem allocation i.e. struct gk20a_page_alloc *alloc In this API, check if BIT(0) is set or not in sg_dma_address() before converting it to allocation address In gk20a_mm_smmu_vaddr_translate(), ensure that the address is pure IOVA address by verifying that BIT(0) is not set in that address Jira DNVGPU-22 Change-Id: Ib53ff4b63ac59a8d870bc01d0af59839c6143334 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1216142 (cherry picked from commit 03c9fbdaa40746dc43335cd8fbe9f97ef2ef50c9) Reviewed-on: http://git-master/r/1219705 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: move all instance blocks to vidmem	Deepak Nibade	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use gk20a_gmmu_alloc() in gk20a_alloc_inst_block() so that we always try to allocate all inst blocks in vidmem first Also use common API gk20a_alloc_inst_block() in channel_gk20a_alloc_inst() as well Jira DNVGPU-22 Change-Id: I6c47c19aae1189d7e57f47a51d21a32e2df53c1f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1216140 (cherry picked from commit 6c84961a50eb8a8b080b2db08f87e58143f5a6e8) Reviewed-on: http://git-master/r/1219704 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: select target based on aperture	Deepak Nibade	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While programming ucode's inst block in API gr_gk20a_load_falcon_bind_instblk(), use gk20a_aperture_mask() to select target address (i.e. if address is in sysmem or vidmem) based on aperture Also add target accessors for gr_fecs_new_ctx and gr_fecs_arb_ctx_ptr Jira DNVGPU-22 Change-Id: I88198080f188b349a4448a229dff8416a6a18073 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1216139 (cherry picked from commit 42bc14110df17400dd655bc994dc9e61c73048b1) Reviewed-on: http://git-master/r/1219703 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix inst block leak for vidmem	Konsta Holtta	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Test for size, not cpu_va, to check for buffer validity before attempting to free. Jira DNVGPU-22 Change-Id: I416c0963bf4e1819aa2f8d200c69a2d989524f83 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1215575 (cherry picked from commit ce0077feca55bfb5665c82972598a075abd8f2a0) Reviewed-on: http://git-master/r/1219702 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: allocate channel inst blocks in vidmem	Konsta Holtta	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use gk20a_gmmu_alloc() to allocate channel inst block which first tries to allocate in vidmem Jira DNVGPU-22 Change-Id: Ib4d92bf4d2bc0c3d53a82812d635fa8abca4340a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1206274 (cherry picked from commit 0c81c8984c42df27d3520f800eb87728f67d4453) Reviewed-on: http://git-master/r/1219701 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: use get_base_addr() to get inst_block address	Deepak Nibade	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since inst_block could reside either in sysmem or vidmem, use gk20a_mem_get_base_addr() to get it's base address Jira DNVGPU-22 Change-Id: Ic9b4370e0a88b585483e78ea81df0ec6ff799487 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1212702 (cherry picked from commit ecdffa7664f48dba0bcbd15b1340af5bf3b45802) Reviewed-on: http://git-master/r/1219700 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fall back to sysmem for generic alloc-maps	Konsta Holtta	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_gmmu_alloc_map_attr(), which is used for in-kernel allocations combined with immediate gmmu map, fall back to attempting to allocate sysmem when vidmem allocation fails. Bug 1809939 Change-Id: I4ec4fbf93d41fd9681166b47b3ecad24b51ea274 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1216814 (cherry picked from commit a9929682f1f356f7e8a652a2cec8ed73cc492448) Reviewed-on: http://git-master/r/1217688 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fall back to sysmem for generic allocs	Konsta Holtta	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_gmmu_alloc_attr(), which is used for in-kernel allocations, fall back to attempting to allocate sysmem when vidmem allocation fails. Bug 1809939 Change-Id: I0397026fd1b3bc803f6d8bb7409e05ab31ec961d Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1215447 (cherry picked from commit 3ec37992b830cee917e8ad35ede50e048907014a) Reviewed-on: http://git-master/r/1217687 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: When powering down, abort if not idle	Terje Bergstrom	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When trying to power down GPU the engine might be still busy. In this case delay power down by returning -EBUSY from gk20a_pm_runtime_suspend(). Bug 200224907 Change-Id: Ibad74c090add24a185bc1a7a02df367af9b95ced Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1213042 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: move gpfifo submit wait to userspace	Aingara Paramakuru	2016-09-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of blocking for gpfifo space in the nvgpu driver, return -EAGAIN and allow userspace to decide the blocking policy. Bug 1795076 Change-Id: Ie091caa92aad3f68bc01a3456ad948e76883bc50 Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1202591 (cherry picked from commit 8056f422c6a34a4239fc4993c40c2e517c932714) Reviewed-on: http://git-master/r/1203800 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix null access in page table allocation	Konsta Holtta	2016-09-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check entry->mem.sgt for validity before attempting to dereference it in a debug print. Bug 1809939 Change-Id: If7aa7444c162a076d8f23a88dfd2e3e0a9c33813 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1215522 (cherry picked from commit 48c25cd4f1db9d5bb07847af4de29d8f369b52e3) Reviewed-on: http://git-master/r/1220547 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix chunk size mismatch in page allocator	Konsta Holtta	2016-09-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When allocating discontiguous memory composed of several chunks, update also the number of pages used by the current chunk, if a large chunk was not available and a retry is performed with a smaller one. Failing to do this would result in too few chunks reserved for a large enough allocation in certain conditions. Bug 1805067 Change-Id: I9d14864724d228b42c47eb4669fbe0f789334397 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1214914 (cherry picked from commit 9bece931b13e4dad808622462d4d98d421cfb383) Reviewed-on: http://git-master/r/1220546 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: test free user vidmem atomically	Konsta Holtta	2016-09-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An empty list of soon-to-be-freed userspace vidmem buffers is not enough to safely assume that an allocation may succeed or not if tried again, because removal from the list and actually marking the memory freed is not atomic. Fix this by using an atomic counter for the number of pending frees (so that it's still safe to first remove from the job list and then perform the free), and making allocation attempts combined with a test of pending frees atomic. This still does not guarantee that there is memory available (as the actual amount of pending memory in bytes plus the current free amount isn't computed), but removes the race that produces false negatives in case a single program expects repeated frees and allocs to succeed. Bug 1809939 Change-Id: I6a92da2e21cbf3f886b727000c924d56f35ce55b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1217078 (cherry picked from commit 83c1f1e70dccd92fdd4481132cf5b6717760d432) Reviewed-on: http://git-master/r/1220545 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: NULL out unused css entries	Peter Daifuku	2016-09-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix cyclestats snapshots HAL entries in the vgpu case, need to null out the ones that don't apply. Bug 1700143 JIRA EVLR-278 Change-Id: I1b5f4652d1bf3283d96fdb3c2f66c4f69a9f6acc Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1217507 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: use spinlock for ch timeout lock	Aingara Paramakuru	2016-09-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The channel timeout lock guards a very small critical section. Use a spinlock instead of a mutex for performance. Bug 1795076 Change-Id: I94940f3fbe84ed539bcf1bc76ca6ae7a0ef2fe13 Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1200803 (cherry picked from commit 4fa9e973da141067be145d9eba2ea74e96869dcd) Reviewed-on: http://git-master/r/1203799 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: change the usage of tegra_fuse_readl	Shardar Shariff Md	2016-09-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tegra_fuse_readl() prototype is changed to match upstreamed fuse driver, so change implementation accordingly. Bug 200233653 Change-Id: I01f23cfafd5923d86ac48e67b36132ce690e962b Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com> Reviewed-on: http://git-master/r/1217374 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: fix pmu_copy_to_dmem spew	David Nieto	2016-09-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The error check was not taking account of the DMEM address wrap-around JIRA DNVGPU-34 Change-Id: Ibfed5532c3ee785b3061e6837f012939118a7ece Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: http://git-master/r/1206460 (cherry picked from commit 080953c20f91068ccaaa564d9492a1582ffa28fe) Reviewed-on: http://git-master/r/1218297 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Call init_cbc only when defined	Terje Bergstrom	2016-09-12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Call init_cbc only when it contains a non-NULL pointer. Bug 1799537 Change-Id: Ic23f264e10daff30365bf3cf86ac9c155f50e497 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1208008 (cherry picked from commit ec69fa15c32f49d96939fd9a672faec45e078dfa) Reviewed-on: http://git-master/r/1217298 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: refactor pmu include	Vijayakumar Subbu	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	split pmu include files to add lot more APIs pmu_api.h - all the current APIs used in igpu pmu_common.h - common defines for all APIs pmu_gk20a.h - SW defines specific needed for nvgpu like PMU version, PMU SW structure definition etc. Splitting APIs to separate files allows us to use auto generated PMU task headers from RM We have script which generates pmu interface herader files in linux format. It replaces RM with NV. Adding typedef in existing pmu code make auto generated files easy to compile/add JIRA DNVGPU-85 Change-Id: I851b88769fe8d60561a44754ddb7dde45b45959e Signed-off-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/1192702 Reviewed-on: http://git-master/r/1203124 (cherry picked from commit 0fe5f020c3f934cf2cc5336f1b6c3bafaf9e0c2a) Reviewed-on: http://git-master/r/1217301 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Support mclk initialization	Terje Bergstrom	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add ops for calling mclk initialization. JIRA DNVGPU-85 Change-Id: I2e9da80fdb014d916b40513d605c38711818d2f6 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1203975 (cherry picked from commit 9be482c4ece7ffc550ae19f133638c808b3a768f) Reviewed-on: http://git-master/r/1217300 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: get bios perf and clk table ptr	Mahantesh Kumbar	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement support for reading perf and clk tables from VBIOS. JIRA DNVGPU-83 Change-Id: I095fea08479161362e4c2ffa7500ee6a57d6d447 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/1202602 (cherry picked from commit fb7c7356f131a198bd655a25fc6ff17067477e1b) Reviewed-on: http://git-master/r/1217299 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Skip calling undefined prod callbacks	Terje Bergstrom	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix rest of code to not call prod callbacks that are set to NULL. Bug 1799537 Change-Id: I756bb1f7ef58ba753ac43a2be6f125107be3cf34 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1209133 (cherry picked from commit 5f4d7b42b6101407fde8c4a7dcdd3633eca85ae5) Reviewed-on: http://git-master/r/1217297 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Allocate vidmem fds from 1024	Terje Bergstrom	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allocate vidmem fds from 1024 onwards. This prevents us from using up the 0-1023 range which is tracked per process, and fits within FD_SETSIZE. Bug 200222681 Change-Id: I104b81f2831f1816ff66fc245fa63013d78001ec Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1199269 (cherry picked from commit 5d5cbaf6a63dd31538fa35081b70e103d8a658f4) Reviewed-on: http://git-master/r/1217294 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Do not print error on unknown engine	Terje Bergstrom	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unknown engine is expected, as we do not support all dGPU engines. Remove the error spew. JIRA DNVGPU-26 Change-Id: I6f7897c6ead168f1d8100421d16d0540a7f7b542 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1206449 (cherry picked from commit 4cc610755df94065afd28a90c63aca8fff9685b1) Reviewed-on: http://git-master/r/1217292 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: vgpu: cyclestat snapshot support	Peter Daifuku	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for cyclestats snapshots in the virtual case Bug 1700143 JIRA EVLR-278 Change-Id: I376a8804d57324f43eb16452d857a3b7bb0ecc90 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1211547 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: unify nvgpu and pci probe	Deepak Nibade	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have completely different versions of probe for nvgpu and pci device Extract out common steps into nvgpu_probe() function and separate it out in new file nvgpu_common.c Divide task of nvgpu_probe() into further smaller functions Do platform specific things (like irq handling, memresource management, power management) only in individual probes and then call nvgpu_probe() to complete the common initialization Move all debugfs initialization to common gk20a_debug_init() This also helps to bringup all debug nodes to pci device Pass debugfs_symlink name as a parameter to gk20a_debug_init() This allows us to set separate debugfs symlink for nvgpu and pci device In case of railgating, cde and ce debugfs, check if platform supports them or not Copy vidmem_is_vidmem from platform to mm structure and set it to true for pci device Return from gk20a_scale_init() if we don't have either of governor or qos_notifier Fix gk20a_alloc_debugfs_init() and gk20a_secure_page_alloc() to receive device pointer instead of platform_device Export gk20a_railgating_debugfs_init() so that we can call it from gk20a_debug_init() Jira DNVGPU-56 Jira DNVGPU-58 Change-Id: I3cc048082b0a1e57415a9fb8bfb9eec0f0a280cd Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1204207 (cherry picked from commit add6bb0a3d5bd98131bbe6f62d4358d4d722b0fe) Reviewed-on: http://git-master/r/1204462 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: remove blocking wait for vidmem allocation	Deepak Nibade	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have blocking 1sec wait for vidmem allocation Remove this blocking wait and just return proper error code to the caller In case we have some buffers to be cleaned up in the list (clear_list_head), return EAGAIN so that caller can retry Otherwise return ENOMEM indicating that no memory is available right now Jira DNVGPU-84 Change-Id: Ife2b17c989fc80e568f03bb18ad75b93a25be962 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1204969 (cherry picked from commit 2bacdf0bc6d5b1cdcb8be37e574ca5f4f0663cae) Reviewed-on: http://git-master/r/1213451 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use vidmem for gr ctx if available	Konsta Holtta	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the common gk20a_gmmu_alloc() that tries vidmem too. Jira DNVGPU-24 Change-Id: I5dfd7eaab737a5290b4d21ac575d6b89777a567e Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1209077 (cherry picked from commit e3085d37735c8f1cf4845621f29fe9d2689aad4b) Reviewed-on: http://git-master/r/1184330 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Tested-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix non-contiguous pramin access	Deepak Nibade	2016-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In pramin_access_batched(), in each iteration of the loop we first decide size of data that we should write in that iteration. In case this size is equal to length of the chunk, we need to move to use next chunk for subsequent iteration But since we change offset variable before we check above, we end up using same chunk in next iteration Fix this by correcting the sequnce to first check if we should move to next chunk and then only adjust the offset variable Jira DNVGPU-24 Change-Id: I58c2e24678f4c6dfbe33bf111edd06788629eca8 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1210892 (cherry picked from commit 83cc179199692d28a93b3b884c9bc094ff513298) Reviewed-on: http://git-master/r/1213450 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: suppress unbind operation	Sri Krishna chowdary	2016-09-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unbind on nvgpu results in kernel panic. Suppress it to avoid kernel panic. Proper fix should follow later on. bug 1779085 Change-Id: Ibc966ac031f7f04406db63310e2f5ea126649ac0 Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com> Reviewed-on: http://git-master/r/1212759 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: fix compilation errors for 32 bit arch	Deepak Nibade	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Converting return value of sg_dma_address() (which is u64) into a pointer results in compilation failure on 32 bit machines Hence convert address first into uintptr_t and then into pointer Change-Id: I8e036af8f4c936b88883cf8af1491f03025ed356 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1211243 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: enable big page support for pci	Deepak Nibade	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While mapping the buffer, first check if buffer is in vidmem, and if yes convert allocation into base address And then walk through each chunk to decide the alignment Add new API gk20a_mm_get_align() which returns the alignment based on scatterlist and aperture, and use this API to get alignment during mapping Enable big page support for pci by unsetting disable_bigpage Jira DNVGPU-97 Change-Id: I358dc98fac8103fdf9d2bde758e61b363fea9ae9 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1207673 (cherry picked from commit d14d42290eed4aa7a2dd2be25e8e996917a58e82) Reviewed-on: http://git-master/r/1210959 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: make default vidmem page size of 64k	Deepak Nibade	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allocate 64k pages for vidmem by default Also make sure that base address of vidmem is aligned to page size Jira DNVGPU-20 Change-Id: Ie2e5111f942467754db5b45f1518d72c925d3d19 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1206405 (cherry picked from commit 542ebf7f571ba6dc631466e562f7d8e05df4a9a6) Reviewed-on: http://git-master/r/1210958 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: use vidmem for page tables if available	Konsta Holtta	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the common gk20a_gmmu_alloc() that tries vidmem too. Jira DNVGPU-20 Change-Id: I4ea02bc4962d299c6f71444048d4a2a22bd80f55 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1206404 (cherry picked from commit 7297727cce8c5c7b26f82afe98cc5428135b4777) Reviewed-on: http://git-master/r/1178831 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add new API to get base address for sysmem/vidmem buffers	Deepak Nibade	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new API gk20a_mem_get_base_addr() which will return vidmem base address in case of vidmem and IOVA address in case of sysmem Even though vidmem allocations are non-contiguous, this API is useful (and should only be used) for allocations with one chunk (e.g. page tables) Also, since page tables could either reside in sysmem or vidmem, use this API to get address of page tables Jira DNVGPU-20 Change-Id: Ie04af9ca7bfccfec1a8a8e4be2c507cef5cef8e1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1206403 (cherry picked from commit a8c74dc188878f2948fa1e0e47bf1837fba6c5e0) Reviewed-on: http://git-master/r/1210957 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: allocate blob space early	Deepak Nibade	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allocting blob space for pmu might need fixed address allocation in vidmem and during boot up But if some page tables are allocated before blob space, blob space allocation could fail Fix this by allocating blob space early during boot up Jira DNVGPU-20 Change-Id: I30eca1023c8f8f8be101bb7e160ba57a7040911a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1206402 (cherry picked from commit fad4309ce345ed3879f497bda27f2eceb1084dbb) Reviewed-on: http://git-master/r/1210956 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add proper timeout handling for vidmem clear operations	Lakshmanan M	2016-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gk20a_fence_wait() api may be interrupted by a signal before actual its timeout elapsed. This CL does retry (-ERESTARTSYS) mechanism if gk20a_fence_wait() return before its timeout elapsed. Bug 200230544 Change-Id: I347ed2004935a8b9413f95dcb6fca2b74bf49f2a Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1206265 (cherry picked from commit d3ef533942487785d84d109f985ae648eb3c2434) Reviewed-on: http://git-master/r/1210955 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>