nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: Wait for BAR1 bind	Terje Bergstrom	2016-04-15
\| \| \| \| \| \| \| \| \|	Wait for BAR1 bind to complete before continuing. The register to wait exists Maxwell onwards. Change-Id: Ie3736033fdb748c5da8d7a6085ad6d63acaf41f5 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1123941
*	gpu: nvgpu: Add litter values HAL	Terje Bergstrom	2016-04-15
\| \| \| \| \| \| \| \| \|	Move per-chip constants to be returned by a chip specific function. Implement get_litter_value() for each chip. Change-Id: I2a2730fce14010924d2507f6fa15cc2ea0795113 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1121383
*	gpu: nvgpu: make tpc_fs_mask work on production board	Richard Zhao	2016-04-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On production fused boards, it uses gr_fe_tpc_fs_r() to mask TPCs, rather than fues. Bug 1734150 Change-Id: I7b4eb428f1ad0cf841a57214e0c8c1e8f17b2c5a Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1111630 (cherry picked from commit 869ea54967812e03d9f1e69775ca56fd6459216c) Reviewed-on: http://git-master/r/1122121 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	debugfs: Pass bool pointer to debugfs_create_bool()	Alex Van Brunt	2016-04-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Port the change 621a5f7ad9cd1ce7933f1d302067cbd58354173c from kernel.org to the nvgpu driver. bug 200187033 Change-Id: I7d742f614161d9d4ed59c4216d7c730d57ef4116 Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com> Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1118397 GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Support GPUs with only one interrupt line	Terje Bergstrom	2016-04-13
\| \| \| \| \| \| \| \| \| \| \|	Not all GPUs have stalling and non-stalling interrupt. Support ones with just one interrupt line. Change-Id: I0f1e8faa5b353b8d1b10691375bd853152379a3a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120470 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: vgpu: add fecs trace support	Richard Zhao	2016-04-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 1648908 Change-Id: I7901e7bce5f7aa124a188101dd0736241d87bd53 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1031861 Reviewed-on: http://git-master/r/1121261 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: virtualized SMPC/HWPM ctx switch	Peter Daifuku	2016-04-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for SMPC and HWPM context switching when virtualized Bug 1648200 JIRASW EVLR-219 JIRASW EVLR-253 Change-Id: I80a1613eaad87d8510f00d9aef001400d642ecdf Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1122034 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use device instead of platform_device	Terje Bergstrom	2016-04-08
\| \| \| \| \| \| \| \| \|	Use struct device instead of struct platform_device wherever possible. This allows adding other bus types later. Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120466
*	gpu: nvgpu: Add Fuse prints on PMU Halt	Supriya	2016-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-Print fuse values in case of PMU halt error -and mailbox reads 0xDEADDEAD Bug 1737044 Change-Id: I59f5fcf4a69bdd2a2eea81a69dd99bb9c4c21e1d Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/1113464 (cherry picked from commit d0320eed72c5070c4fcc7564c02fa38599984751) Reviewed-on: http://git-master/r/1120429 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add HAL for GPU characteristics	Sami Kiminki	2016-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add function pointer for chip specific GPU characteristics init. Bug 1637486 Change-Id: I6ce5eea124d8057393dec6e86e72412cc87e1cfa Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Signed-off-by: Adeel Raza <araza@nvidia.com> Reviewed-on: http://git-master/r/780535 (cherry picked from commit f5c240d6ed19b5b9eedff05767c885ad5812c71e) Reviewed-on: http://git-master/r/1120428 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: split address space for fixed allocs	Alex Waterman	2016-03-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow a special address space node to be split out from the user adress space or fixed allocations. A debugfs node, /d/<gpu>/separate_fixed_allocs Controls this feature. To enable it: # echo <SPLIT_ADDR> > /d/<gpu>/separate_fixed_allocs Where <SPLIT_ADDR> is the address to do the split on in the GVA address range. This will cause the split to be made in all subsequent address space ranges that get created until it is turned off. To turn this off just echo 0x0 into the same debugfs node. Change-Id: I21a3f051c635a90a6bfa8deae53a54db400876f9 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1030303 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add support for FECS ctxsw tracing	Anton Vorontsov	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1648908 This commit adds support for FECS ctxsw tracing. Code is compiled conditionnaly under CONFIG_GK20_CTXSW_TRACE. This feature requires an updated FECS ucode that writes one record to a ring buffer on each context switch. On RM/Kernel side, the GPU driver reads records from the master ring buffer and generates trace entries into a user-facing VM ring buffer. For each record in the master ring buffer, RM/Kernel has to retrieve the vmid+pid of the user process that submitted related work. Features currently implemented: - master ring buffer allocation - debugfs to dump master ring buffer - FECS record per context switch (with both current and new contexts) - dedicated device for ctxsw tracing (access to VM ring buffer) - SOF generation (and access to PTIMER) - VM ring buffer allocation, and reconfiguration - enable/disable tracing at user level - event-based trace filtering - context_ptr to vmid+pid mapping - read system call for ctxsw dev - mmap system call for ctxsw dev (direct access to VM ring buffer) - poll system call for ctxsw dev - save/restore register on ELPG/CG6 - separate user ring from FECS ring handling Features requiring ucode changes: - enable/disable tracing at FECS level - actual busy time on engine (bug 1642354) - master ring buffer threshold interrupt (P1) - API for GPU to CPU timestamp conversion (P1) - vmid/pid/uid based filtering (P1) Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1022737 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add support to set channel timeslice	Aingara Paramakuru	2016-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As part of improving GPU scheduling, userspace can now set a channel's timeslice, within reasonable limits imposed by the kernel driver. JIRA VFND-1312 Bug 1729664 Change-Id: I4c3430c43437889b8685f12988d4b967bb7877bb Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1020917 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: improve channel interleave support	Aingara Paramakuru	2016-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, only "high" priority bare channels were interleaved between all other bare channels and TSGs. This patch decouples priority from interleaving and introduces 3 levels for interleaving a bare channel or TSG: high, medium, and low. The levels define the number of times a channel or TSG will appear on a runlist (see nvgpu.h for details). By default, all bare channels and TSGs are set to interleave level low. Userspace can then request the interleave level to be increased via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will be added later). As timeslice settings will soon be coming from userspace, the default timeslice for "high" priority channels has been restored. JIRA VFND-1302 Bug 1729664 Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1014962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update slcg/blcg prod setiings	Seshendra Gadagottu	2016-03-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add following missing prod settings: blcg bus blcg ce slcg ce2 slcg chiplet slcg gr Bug 1689806 Change-Id: Ic7c9afdb1fc47ad71ca326384f5d2a4528121abe Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1030987 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: LRF, TEX, LTC, DRAM override	Supriya	2016-02-26
\| \| \| \| \| \| \| \| \| \| \| \|	- Adding support for FECS mem overrides Bug 1699676 Change-Id: I6c9ddcd98d57b29059513ee508c6f92b194c4fc7 Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/921253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add create_gr_sysfs() function pointer	Adeel Raza	2016-02-19
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add create_gr_sysfs() function pointer for creating gr specific sysfs nodes. Bug 1699676 Change-Id: I0a14d3676ebfcd5adebce673e46bdaad8d6aecf7 Signed-off-by: Adeel Raza <araza@nvidia.com> Reviewed-on: http://git-master/r/1008658 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: SM/TEX exception handling support	Adeel Raza	2016-01-29
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add TEX exception handling support. Also make SM exception handler into a function pointer, which should allow different chips to implement their own SM exception handling routine. Bug 1635727 Bug 1637486 Change-Id: I429905726c1840c11e83780843d82729495dc6a5 Signed-off-by: Adeel Raza <araza@nvidia.com> Reviewed-on: http://git-master/r/935329
*	gpu: nvgpu: fix race condition with poweroff	Seshendra Gadagottu	2016-01-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When gpu rail-gating is enabled, it is possible that both rail gating code and system shudown can start executing gk20a_pm_prepare_poweroff() in parallel. To synchronize this execution, protect gk20a_pm_prepare_poweroff() with a mutex lock. Bug 200168805 Change-Id: I19536a43ed20c3e82b32c316922dc3e19e3f59bb Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/999548 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add support for therm gate ctrl	Seshendra Gadagottu	2016-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	During gpu init, therm gate control is required to add delay cycles before clock gating. Bug 1717152 Change-Id: Ifabc428cf7b49e49964dc994eba2c38af4aa1a91 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/936443 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: add channel_set_priority support	Richard Zhao	2016-01-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- add gops.fifo.channel_set_priority and move current code as native callback. - implement the callback for vgpu Bug 1701079 Change-Id: If1cd13ea4478d11d578da2f682598e0c4522bcaf Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/932829 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support masking hww_warp_esr	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below API pointer to support masking of hww_warp_esr after hardware read of register and before using it further u32 (*mask_hww_warp_esr)(u32 hww_warp_esr) If needed, this API will mask value of hww_warp_esr appropriately and return it Bug 200156699 Change-Id: I1afb1347e650fab607009c1ee55691484653a4c1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/927133 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support preprocessing of SM exceptions	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support preprocessing of SM exceptions if API pointer pre_process_sm_exception() is defined Also, expose some common APIs Bug 200156699 Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/925883 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add high priority channel interleave	Peter Pipkorn	2016-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interleave all high priority channels between all other channels. This reduces the latency for high priority work when there are a lot of lower priority work present, imposing an upper bound on the latency. Change the default high priority timeslice from 5.2ms to 3.0 in the process, to prevent long running high priority apps from hogging the GPU too much. Introduce a new debugfs node to enable/disable high priority channel interleaving. It is currently enabled by default. Adds new runlist length max register, used for allocating suitable sized runlist. Limit the number of interleaved channels to 32. This change reduces the maximum time a lower priority job is running (one timeslice) before we check that high priority jobs are running. Tested with gles2_context_priority (still passes) Basic sanity testing is done with graphics_submit (one app is high priority) Also more functional testing using lots of parallel runs with: NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 –finish plus multiple: NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 -finish Previous to this change, the relative performance between high priority work and normal priority work comes down to timeslice value. This means that when there are many low priority channels, the high priority work will still drop quite a lot. But with this change, the high priority work will roughly get about half the entire GPU time, meaning that after the initial lower performance, it is less likely to get lower in performance due to more apps running on the system. This change makes a large step towards real priority levels. It is not perfect and there are no guarantees on anything, but it is a step forwards without any additional CPU overhead or other complications. It will also serve as a baseline to judge other algorithms against. Support for priorities with TSG is future work. Support for interleave mid + high priority channels, instead of just high, is also future work. Bug 1419900 Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98 Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com> Reviewed-on: http://git-master/r/805961 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: abstract set sm debug mode	Richard Zhao	2016-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new operation g->ops.gr.set_sm_debug_mode and move native implementation to gr_gk20a.c It's preparing for adding vgpu set sm debug mode hook. JIRA VFND-1006 Bug 1594604 Change-Id: Ia5ca06a86085a690e70bfa9c62f57ec3830ea933 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/923232 (cherry picked from commit 032552b54c570952d1e36c08191e9f70b9c59447) Reviewed-on: http://git-master/r/835614 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: Control comptagline assignment from kernel	Terje Bergstrom	2016-01-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Maxwell comptaglines are assigned per 128k, but preferred big page size for graphics is 64k. Bit 16 of GPU VA is used for determining which half of comptagline is used. This creates problems if user space wants to map a page multiple times and to arbitrary GPU VA. In one mapping the page might be mapped to lower half of 128k comptagline, and in another mapping the page might be mapped to upper half. Turn on mode where MSB of comptagline in PTE is used instead of bit 16 for determining the comptagline lower/upper half selection. Bug 1704834 Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/924322 Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: Make access map chip specific	Terje Bergstrom	2015-12-07
\| \| \| \| \| \| \| \|	Bug 1692373 Change-Id: Ie3fc3e02fa7b0636da464d6ee1c28da7a4543ec2 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/812353
*	gpu: nvgpu: Wait for pause for SMs	sujeet baranwal	2015-12-04
\| \| \| \| \| \| \| \| \| \| \| \|	SM locking & register reads Order has been changed. Also, functions have been implemented based on gk20a and gm20b. Change-Id: Iaf720d088130f84c4b2ca318d9860194c07966e1 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Signed-off-by: ashutosh jain <ashutoshj@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/837236
*	gpu: nvgpu: IOCTL to disable watchdog per-channel	Deepak Nibade	2015-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable watchdog per-channel Also, if watchdog is disabled, we currently schedule the worker with MAX timeout. Instead of this, do not schedule any worker if watchdog is disabled Bug 1683059 Bug 1700277 Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/837962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: abstract set mmu debug mode	Richard Zhao	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new operaton g->ops.mm.set_debug_mode and let other places that set debug mode call this callback. It's preparing for adding vgpu set mmu debug mode hook. JIRA VFND-1005 Bug 1594604 Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/833497 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: User-space managed address space support	Sami Kiminki	2015-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which enables creating userspace-managed GPU address spaces. When an address space is marked as userspace-managed, the following changes are in effect: - Only fixed-address mappings are allowed. - VA space allocation for fixed-address mappings is not required, except to mark space as sparse. - Maps and unmaps are always immediate. In particular, the mapping ref increments at kickoffs and decrements at job completion are skipped. Bug 1614735 Bug 1623949 Bug 1660392 Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/738558 Reviewed-on: http://git-master/r/833253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: ZBC update without idle	Terje Bergstrom	2015-11-17
\| \| \| \| \| \| \| \| \| \|	Do ZBC updates without forcing engine idle first. Bug 1698013 Change-Id: I99218c8cfd02be05dace2003b8d91921765f7ca9 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/829145
*	gpu: nvgpu: Do not use G_ELPG_FLUSH	Terje Bergstrom	2015-11-10
\| \| \| \| \| \| \| \| \| \| \| \| \|	G_ELPG_FLUSH is protected in some chips. Use L2 flush operations instead. Bug 1698618 Change-Id: I984a8ace8bcd0ad2d4a4e2d63af75a342bdeb75a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/828656 (cherry picked from commit ba9075fa43975112a221d37d246f0b8f5af40fab) Reviewed-on: http://git-master/r/829415
*	gpu: nvgpu: use platform data for ptimer source rate	Seshendra Gadagottu	2015-11-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of depending on clock frame-work, use platform data for ptimer source rate. Removed ptimerscaling10x platform data, and use ptimer source frequency to calculate ptimerscaling factor. Reviewed-on: http://git-master/r/819030 (cherry picked from commit dd291334d54dab80cab7eb1656dffc48a59610b4) Change-Id: I7638ce9875a6e440bbfc2ba2da0d0b094b2700ff Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/827300 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to disable timeouts	Deepak Nibade	2015-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_DBG_GPU_IOCTL_TIMEOUT to support disabling/re-enabling scheduler timeout from user space If user space application is closed without re-enabling the timeouts, kernel will restore the timeouts' state while releasing the debug session This is needed for debugging purpose Bug 1514061 Change-Id: I32efb47ad09d793f3e7fd8f0aaa9720c8bc91272 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/788176 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update thermal programming	Seshendra Gadagottu	2015-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add required fileds and values for thermal slow-down settings in thermal header file and implemented chip specific thermal register programming Reviewed-on: http://git-master/r/822199 (cherry picked from commit 9e8a745b8295af002b9780c83caa8dc7b22cc737) Change-Id: I016b18ed230fa6c104eada2e166ccd1a5f2ace36 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/823012 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Disable only channel at zcull bind	Terje Bergstrom	2015-10-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	At zcull bind we disable whole GR engine. This is unnecessary, so instead disable only the channel and make sure it's unloaded. Introduces also an API in fifo_gk20a.c to do the channel disable. gr_gk20a_ctx_zcull_setup() was always passed true as last parameter, so remove parameter. Change-Id: I7ae6e101ec7d1ab3f6ee4e9bcc442d23dbd21247 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/787570
*	gpu: nvgpu: add support to remove bar2 mm	Seshendra Gadagottu	2015-10-12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Adding support to remove bar2 mm on gpu module remove. Change-Id: Id5f680b1abf7056da9871d5460d9fbc40422673e Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/814571 (cherry picked from commit e7c6c87dd6b0893d26a9a3b4568121a691e1eb3c) Reviewed-on: http://git-master/r/815429 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: set correct timeslice value	Kirill Artamonov	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scale timeslice register value based on platform specific ptimer scale koefficient. Expose timeslice values through debugfs to simplify performance tuning. bug 1605552 bug 1603226 Change-Id: I49f86f22d58d26a366ee1b5f5a9ab9d7f896ad25 Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com> Reviewed-on: http://git-master/r/800007 (cherry picked from commit 00c85ef24cf28ffaa81eb53fff7edef1c699220a) Reviewed-on: http://git-master/r/808251 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: scale ptimer based timeouts	Vijayakumar	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1603226 host based timeouts use ptimer for detecting timeouts. on gk20a and gm20b ptimer runs 2.6x slower. scale the fifo_eng_timeout to account for this Change-Id: Ie44718382953e36436ea47d6e89b9a225d5c2070 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/799983 (cherry picked from commit d1d837fd09ff0f035feff1757c67488404c23cc6) Reviewed-on: http://git-master/r/808250 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update slcg xbar prod settings	Seshendra Gadagottu	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 1689806 Change-Id: I368ad8fb64e49b21ba61c519def1f86e1ca6e492 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/806116 (cherry picked from commit 1a3bbe989a795d379703e7f4b915f6e1bb38c2c3) Reviewed-on: http://git-master/r/805480 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: ELPG init & statistics update	Mahantesh Kumbar	2015-09-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Required init param to start elpg - change in statistics dump Bug 1684939 Change-Id: I26dca52079f08b8962e9cb758831910207610220 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/802456 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/806179 Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add CDE bits in FECS header	sujeet baranwal	2015-09-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case of CDE channel, T1 (Tex) unit needs to be promoted to 128B aligned, otherwise causes a HW deadlock. Gpu driver makes changes in FECS header which FECS uses to configure the T1 promotions to aligned 128B accesses. Bug 200096226 Change-Id: I8a8deaf6fb91f4bbceacd491db7eb6f7bca5001b Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/804625 Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add support for CDE scatter buffers	Jussi Rasanen	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for CDE scatter buffers. When the bus addresses for surfaces are not contiguous as seen by the GPU (e.g., when SMMU is bypassed), CDE swizzling needs additional per-page information. This information is populated in a scatter buffer when required. Bug 1604102 Change-Id: I3384e2cfb5d5f628ed0f21375bdac8e36b77ae4f Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/789436 Reviewed-on: http://git-master/r/791243 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: implement per-channel watchdog	Deepak Nibade	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement per-channel watchdog/timer as per below rules : - start the timer while submitting first job on channel or if no timer is already running - cancel the timer when job completes - re-start the timer if there is any incomplete job left in the channel's queue - trigger appropriate recovery method as part of timeout handling mechanism Handle the timeout as per below : - get timed out channel, and job data - disable activity on all engines - check if fence is really pending - get information on failing engine - if no engine is failing, just abort the channel - if engine is failing, trigger the recovery Also, add flag "ch_wdt_enabled" to enable/disable channel watchdog mechanism. Watchdog can also be disabled using global flag "timeouts_enabled" Set the watchdog time to be 5s using macro NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS Bug 200133289 Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/797072 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: HAL to write DMATRFBASE	Mahantesh Kumbar	2015-09-15
\| \| \| \| \| \| \| \| \| \|	Bug 200137618 Change-Id: I18b980876e93c3f7287082701e1d2b998cd33114 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/798777 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cyclestats snapshot permissions rework	Leonid Moiseichuk	2015-09-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cyclestats snapshot feature is expected for new devices. The detection code was isolated in separate function and run-time check added to validate/allow ioctl calls on the current GPU. Bug 1674079 Change-Id: Icc2f1e5cc50d39b395d31d5292c314f99d67f3eb Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/781697 (cherry picked from commit bdd23136b182c933841f91dd2829061e278a46d4) Reviewed-on: http://git-master/r/793630 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Inject function addresses	Yogesh	2015-08-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Inject function addresses of gk20a_do_idle and gk20a_do_unidle once the nvgpu module loads. Bug 1476801 Change-Id: I67a8ae7fb654524616c2c2c710013cbc097a3f32 Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com> Reviewed-on: http://git-master/r/785047 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix NS boot transcfg	Supriya	2015-08-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 1667322 Accommodate for transcfg address change Change-Id: I7054202b8ce3be1a3fbfe0465e662be6f9740eb3 Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/780326 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Inject function address from nvgpu	Yogesh	2015-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch inserts the function address of gk20a_debug_dump_device into host data struct once the nvgpu module loads and removes it during unload. Bug 1476801 Change-Id: If49262208325b2aa0807705c26086e6d7c81632c Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com> Reviewed-on: http://git-master/r/779397 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User