nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: support kernel-3.10 version	Deepak Nibade	2016-04-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make necessary changes to support nvgpu on kernel-3.10 This includes below changes - PROBE_PREFER_ASYNCHRONOUS is defined only for K3.10 - Fence handling and struct sync_fence is different between K3.10 and K3.18 - variable status in struct sync_fence is atomic on K3.18 whereas it is int on K3.10 - if SOC == T132, set soc_name = "tegra13x" - ioremap_cache() is not defined on K3.10 ARM versions, hence use ioremap_cached() Bug 200188753 Change-Id: I18d77eb1404e15054e8510d67c9a61c0f1883e2b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1121092 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix compilation error with CONFIG_PM disabled	Deepak Nibade	2016-04-14
\| \| \| \| \| \| \| \| \| \| \|	gk20a_gpu_is_virtual() needs to pass struct device *dev and not pdev which is undefined Change-Id: I8835bb1175efa693b468588e91aaef9e5531d0bc Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1125439 GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Support GPUs with only one interrupt line	Terje Bergstrom	2016-04-13
\| \| \| \| \| \| \| \| \| \| \|	Not all GPUs have stalling and non-stalling interrupt. Support ones with just one interrupt line. Change-Id: I0f1e8faa5b353b8d1b10691375bd853152379a3a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120470 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: Support GPUs with no physical mode	Terje Bergstrom	2016-04-13
\| \| \| \| \| \| \| \| \| \| \|	Support GPUs which cannot choose between SMMU and physical addressing. Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120469 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: Use device instead of platform_device	Terje Bergstrom	2016-04-08
\| \| \| \| \| \| \| \| \|	Use struct device instead of struct platform_device wherever possible. This allows adding other bus types later. Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120466
*	gpu: nvgpu: Replace CONFIG_PM_RUNTIME with CONFIG_PM	Sumit Singh	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \|	As a result GPU driver unfication, one instance of CONFIG_PM_RUNTIME got introduced. So replacing it with CONFIG_PM. Bug 200188753 Change-Id: If3f55ac32f6800c54e5bf620684b54b39457a6f4 Signed-off-by: Sumit Singh <sumsingh@nvidia.com> Reviewed-on: http://git-master/r/1121650 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add DT support for gpu power-domain	Sumit Singh	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make modification to add DT support for gpu power-domain for T124 chip. Bug 200070810 Bug 200188753 Change-Id: Iac63c8fb5fc5280e9a9f5758e63c9da009f3813d Signed-off-by: Sumit Singh <sumsingh@nvidia.com> Reviewed-on: http://git-master/r/739698 Tested-by: Bharat Nihalani <bnihalani@nvidia.com> (cherry picked from commit 3d62150337d7e071f4713d8af832b268cda0d258) Reviewed-on: http://git-master/r/1121649 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Revert "gpu: nvgpu: clean-up the code"	Sumit Singh	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As CONFIG_PM_GENERIC_DOMAINS_OF is disabled for l4t, so we have to revert this cleanup, because removed portion of code is getting used by l4t. Bug 200188753 This reverts commit 25f0faeb378a885d2d57f1b667f001edfbea026c. Change-Id: Ib9339b03d5ae55d11597690602802b6f723b7777 Signed-off-by: Sumit Singh <sumsingh@nvidia.com> Reviewed-on: http://git-master/r/1121648 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add channel event id support	Deepak Nibade	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, nvgpu can raise events to User space. But user space cannot distinguish between various types of events. To overcome this, we need finer-grained API to deliver various events to user space. Remove old API NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, and all the support for this API (we can remove this since User space has not started using this API at all) Add new API NVGPU_IOCTL_CHANNEL_EVENT_ID_CTRL which will accept an event_id (like BPT.INT or BPT.PAUSE), a command to enable the event, and return a file descriptor on which we can raise the event (if cmd=enable) Event is disabled when file descriptor is closed Add file operations "gk20a_event_id_ops" to support polling on event fd Also add API gk20a_channel_get_event_data_from_id() to get event_data of event from its id Bug 200089620 Change-Id: I5288f19f38ff49448c46338c33b2a927c9e02254 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1030775 (cherry picked from commit 5721ce2735950440bedc2b86f851db08ed593275) Reviewed-on: http://git-master/r/1120318 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Add HAL for GPU characteristics	Sami Kiminki	2016-04-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add function pointer for chip specific GPU characteristics init. Bug 1637486 Change-Id: I6ce5eea124d8057393dec6e86e72412cc87e1cfa Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Signed-off-by: Adeel Raza <araza@nvidia.com> Reviewed-on: http://git-master/r/780535 (cherry picked from commit f5c240d6ed19b5b9eedff05767c885ad5812c71e) Reviewed-on: http://git-master/r/1120428 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Merge branch 'PM_RUNTIME-Removal' into 'dev-kernel-3.18'	Sumit Singh	2016-03-30
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change performs merge of 'PM_RUNTIME_Removal' dev-branch with 'dev-kernel-3.18' branch. It replaces CONFIG_PM_RUNTIME with CONFIG_PM. JIRA TPM-704 Change-Id: I306e254716f275c283f727fc232d7244939542b6 Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
\| *	gpu: nvgpu: Replace CONFIG_PM_RUNTIME with CONFIG_PM	Sumit Singh	2016-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks depending on CONFIG_PM_RUNTIME may now be changed to depend on CONFIG_PM. Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere under drivers/gpu/nvgpu/. JIRA TPM-704 Change-Id: I23965838ff6ec77829076cd834e87641fb68e268 Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
* \|	gpu: nvgpu: wait for 500 usec before ce reset	Seshendra Gadagottu	2016-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wait for 500 usec before ce reset to ensure that no memory outstanding requests are pending. Bug 1699365 Change-Id: I9f73f87cbbdca0208e95ebaee32dd1f764a3cd4f Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1116679 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* \|	gpu: nvgpu: split address space for fixed allocs	Alex Waterman	2016-03-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow a special address space node to be split out from the user adress space or fixed allocations. A debugfs node, /d/<gpu>/separate_fixed_allocs Controls this feature. To enable it: # echo <SPLIT_ADDR> > /d/<gpu>/separate_fixed_allocs Where <SPLIT_ADDR> is the address to do the split on in the GVA address range. This will cause the split to be made in all subsequent address space ranges that get created until it is turned off. To turn this off just echo 0x0 into the same debugfs node. Change-Id: I21a3f051c635a90a6bfa8deae53a54db400876f9 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1030303 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* \|	gpu: nvgpu: Add support for FECS ctxsw tracing	Anton Vorontsov	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1648908 This commit adds support for FECS ctxsw tracing. Code is compiled conditionnaly under CONFIG_GK20_CTXSW_TRACE. This feature requires an updated FECS ucode that writes one record to a ring buffer on each context switch. On RM/Kernel side, the GPU driver reads records from the master ring buffer and generates trace entries into a user-facing VM ring buffer. For each record in the master ring buffer, RM/Kernel has to retrieve the vmid+pid of the user process that submitted related work. Features currently implemented: - master ring buffer allocation - debugfs to dump master ring buffer - FECS record per context switch (with both current and new contexts) - dedicated device for ctxsw tracing (access to VM ring buffer) - SOF generation (and access to PTIMER) - VM ring buffer allocation, and reconfiguration - enable/disable tracing at user level - event-based trace filtering - context_ptr to vmid+pid mapping - read system call for ctxsw dev - mmap system call for ctxsw dev (direct access to VM ring buffer) - poll system call for ctxsw dev - save/restore register on ELPG/CG6 - separate user ring from FECS ring handling Features requiring ucode changes: - enable/disable tracing at FECS level - actual busy time on engine (bug 1642354) - master ring buffer threshold interrupt (P1) - API for GPU to CPU timestamp conversion (P1) - vmid/pid/uid based filtering (P1) Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1022737 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* \|	gpu: nvgpu: Make use of reset controller optional	Terje Bergstrom	2016-03-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reset controller is not enabled in all builds, so make its use optional. Change-Id: I88df11d0aae0552eb4c7f3acee5be70885ea2901 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1028348
* \|	gpu: nvgpu: improve channel interleave support	Aingara Paramakuru	2016-03-15
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, only "high" priority bare channels were interleaved between all other bare channels and TSGs. This patch decouples priority from interleaving and introduces 3 levels for interleaving a bare channel or TSG: high, medium, and low. The levels define the number of times a channel or TSG will appear on a runlist (see nvgpu.h for details). By default, all bare channels and TSGs are set to interleave level low. Userspace can then request the interleave level to be increased via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will be added later). As timeslice settings will soon be coming from userspace, the default timeslice for "high" priority channels has been restored. JIRA VFND-1302 Bug 1729664 Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1014962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: disable ELPG while accessing gr_gpcs_tpcs_sm_sch_macro_sched_r	Thomas Fleury	2016-03-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 200139995 Any GR register access should disable ELPG and clock gating before access and enable it back after it is done. Disable ELPG while tweaking perf parameters in gk20a_alloc_obj_ctx. Also output NV_PBUS_INTR_0 in case of interrupt (including fix to display correct value on pbus isr). Change-Id: I81d2eb4461e92fbb33db8554779f6566f6b002c1 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/835307 (cherry picked from commit 6acc35bd1bcc706fbde8d11521cf1d0f64a16fe4) Reviewed-on: http://git-master/r/921299 (cherry picked from commit 73afd520445bb1f4757fd167b38289143fd46d80) Reviewed-on: http://git-master/r/930040 (cherry picked from commit 7a784ebea0dd60a88469f51eaa61c33b356e499c) Reviewed-on: http://git-master/r/1023529 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: LRF, TEX, LTC, DRAM override	Supriya	2016-02-26
\| \| \| \| \| \| \| \| \| \| \| \|	- Adding support for FECS mem overrides Bug 1699676 Change-Id: I6c9ddcd98d57b29059513ee508c6f92b194c4fc7 Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/921253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: check null when call clk_round_rate	Richard Zhao	2016-02-16
\| \| \| \| \| \| \| \| \| \| \|	Bug 1726406 Change-Id: Ia03b0a174e92b28c471164cefcde514e6db94bdf Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1002700 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: add characteristics flag NVGPU_GPU_FLAGS_SUPPORT_TSG	Richard Zhao	2016-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NVGPU_GPU_FLAGS_SUPPORT_TSG indicates both the kernel driver and device support time slice group (TSG). Bug 1617046 Bug 200155618 Change-Id: Ib3490a32b773222560c58f1fd6d32bffcb97d6cd Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1010173 Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: add max freq to gpu characteristics	Deepak Nibade	2016-02-02
\| \| \| \| \| \| \| \| \| \|	Bug 200097029 Change-Id: Id63dad1629b1d1919cbbfb20b0cb85d4855f526d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1000724 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix race condition with poweroff	Seshendra Gadagottu	2016-01-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When gpu rail-gating is enabled, it is possible that both rail gating code and system shudown can start executing gk20a_pm_prepare_poweroff() in parallel. To synchronize this execution, protect gk20a_pm_prepare_poweroff() with a mutex lock. Bug 200168805 Change-Id: I19536a43ed20c3e82b32c316922dc3e19e3f59bb Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/999548 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: suspend cde cleanly	Seshendra Gadagottu	2016-01-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Few times cde is getting deadlocked because of pending cde operation. So do the things cleanly, first suspend cde then do channel suspend. Bug 1709757 Change-Id: Iaf566b63d9efb13aa2691c19e2df676c70f26afc Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/926574 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	drivers: allow selected drivers to async probe	dmitry pervushin	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	List of drivers that now want async probe: - sdhci-tegra - qspi-mtd - nvmap - gk20a - dc Bug 200083391 Change-Id: Ie0a0677961b704c78d4eb2cdab9f0e9a925a3ca1 Reviewed-on: http://git-master/r/923738 (cherry-picked from 75c067e83c7cde2a37c4fae01719e40c5b7d2835) Signed-off-by: dmitry pervushin <dpervushin@nvidia.com> Reviewed-on: http://git-master/r/923121 Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com> Tested-by: Sumeet Gupta <sumeetg@nvidia.com>
*	gpu: nvgpu: add high priority channel interleave	Peter Pipkorn	2016-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interleave all high priority channels between all other channels. This reduces the latency for high priority work when there are a lot of lower priority work present, imposing an upper bound on the latency. Change the default high priority timeslice from 5.2ms to 3.0 in the process, to prevent long running high priority apps from hogging the GPU too much. Introduce a new debugfs node to enable/disable high priority channel interleaving. It is currently enabled by default. Adds new runlist length max register, used for allocating suitable sized runlist. Limit the number of interleaved channels to 32. This change reduces the maximum time a lower priority job is running (one timeslice) before we check that high priority jobs are running. Tested with gles2_context_priority (still passes) Basic sanity testing is done with graphics_submit (one app is high priority) Also more functional testing using lots of parallel runs with: NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 –finish plus multiple: NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 -finish Previous to this change, the relative performance between high priority work and normal priority work comes down to timeslice value. This means that when there are many low priority channels, the high priority work will still drop quite a lot. But with this change, the high priority work will roughly get about half the entire GPU time, meaning that after the initial lower performance, it is less likely to get lower in performance due to more apps running on the system. This change makes a large step towards real priority levels. It is not perfect and there are no guarantees on anything, but it is a step forwards without any additional CPU overhead or other complications. It will also serve as a baseline to judge other algorithms against. Support for priorities with TSG is future work. Support for interleave mid + high priority channels, instead of just high, is also future work. Bug 1419900 Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98 Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com> Reviewed-on: http://git-master/r/805961 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to disable watchdog per-channel	Deepak Nibade	2015-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable watchdog per-channel Also, if watchdog is disabled, we currently schedule the worker with MAX timeout. Instead of this, do not schedule any worker if watchdog is disabled Bug 1683059 Bug 1700277 Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/837962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix Coverity issues	Deepak Nibade	2015-11-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- operands not affecting result (id = 12845) - logically dead code (id = 12890) - dereference after null check (id = 12968) - unsigned compared to 0 (id = 13176) - resource leak (id = 13338, 18673) - unused pointer value (id = 13916) Bug 1703084 Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/835143 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: abstract set mmu debug mode	Richard Zhao	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new operaton g->ops.mm.set_debug_mode and let other places that set debug mode call this callback. It's preparing for adding vgpu set mmu debug mode hook. JIRA VFND-1005 Bug 1594604 Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/833497 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: update name for gpu debugfs node	Mahantesh Kumbar	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create constant name for gpu debugfs node across all chip. Bug n/a Change-Id: I359b82b5389c49d8fe2a31ace49ff6daa1edfb10 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/805397 Signed-off-by: Seema Khowala <seemaj@nvidia.com> (cherry-picked from commit 17a3882cde09412c68f7a0ee4765f45be1a51c45) Reviewed-on: http://git-master/r/817014
*	gpu: nvgpu: User-space managed address space support	Sami Kiminki	2015-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which enables creating userspace-managed GPU address spaces. When an address space is marked as userspace-managed, the following changes are in effect: - Only fixed-address mappings are allowed. - VA space allocation for fixed-address mappings is not required, except to mark space as sparse. - Maps and unmaps are always immediate. In particular, the mapping ref increments at kickoffs and decrements at job completion are skipped. Bug 1614735 Bug 1623949 Bug 1660392 Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/738558 Reviewed-on: http://git-master/r/833253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use railgate instead of reset_assert	Deepak Nibade	2015-10-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_do_idle(), if can_railgate = false, we do reset_assert() on the GPU But asserting reset might have dependencies on specific h/w seqeunce for ensuring proper reset Hence avoid calling reset_assert() and directly call platform specific unrailgate() and railgate() APIs which take care of the correct reset sequence Bug 200142989 Bug 200137963 Bug 1678611 Change-Id: Ide886dd88b8422ad36de52d54378b1edd9c7bbd6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/820322 (cherry picked from commit d621ddd49da976a75c14aa7aaa37f700fb4e83f2) Reviewed-on: http://git-master/r/822515 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: unrailgate only if pm_domains not enabled	Deepak Nibade	2015-10-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we unrailgate the GPU if railgating is not enabled or pm_domains are not enabled But in case if railgating is not enabled and pm_domains are enabled, we explicitly unrailgate GPU in gk20a_pm_init() and then runtime PM unrailgates it again when first user space request arrives - setting unrailgate refcount to 2 Now for gk20a_do_idle(), we need to railgate the GPU in fist call but that does not happen since unrailgate refcount != 1 hence, in case railgating is not enabled, we should unrailgate the GPU from only one place i.e. when first user space request arrives Bug 200142989 Bug 200137963 Bug 1678611 Reviewed-on: http://git-master/r/820321 (cherry picked from commit 452a1ff8da8e3f47caed2371440f9ad150bf8699) Change-Id: Ic9fe2267c9df5629315c30c1404c2b3044c1265a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/822296 GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: rework secure_page allocation path	Deepak Nibade	2015-10-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, if can_railgate = false, then we have below sequence to allocate secure_page - unrailgate GPU (forever) - reset_assert() - allocate secure_page - reset_deassert() But if we allocate secure page even before unrailgating GPU for first time, then we can avoid reset_assert()/deassert() calls since GPU should already be in reset/railgated at boot time hence, rework this sequence as below - init required mutex, set platform->reset_control - allocate secure page (GPU should already be in reset at this point) - gk20a_pm_init() which unrailgates GPU in case of can_railgate = false Bug 200137963 Bug 1678611 Change-Id: I79d0543bb5cf1eaf1009e1e6ac142532d84514a5 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/797153 (cherry picked from commit 368004501943d38c003747f6bec0384fed57ee65) Reviewed-on: http://git-master/r/816005 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: set correct timeslice value	Kirill Artamonov	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scale timeslice register value based on platform specific ptimer scale koefficient. Expose timeslice values through debugfs to simplify performance tuning. bug 1605552 bug 1603226 Change-Id: I49f86f22d58d26a366ee1b5f5a9ab9d7f896ad25 Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com> Reviewed-on: http://git-master/r/800007 (cherry picked from commit 00c85ef24cf28ffaa81eb53fff7edef1c699220a) Reviewed-on: http://git-master/r/808251 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support reset_control API	Deepak Nibade	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 200137963 Change-Id: I3197af905c945540b97ba191e5695d970d77af8e Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/797154 (cherry picked from commit 8a50245ea636deb87a3d9435fb115b4eac88fac9) Reviewed-on: http://git-master/r/808247 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: let shutdown callback call vgpu_pm_prepare_poweroff for vgpu	Richard Zhao	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It fixed the issue that system hang when reboot. Bug 1638850 Change-Id: If53a31e86c10b2fce4a22fe4fcf92106d86c95ef Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/803234 (cherry picked from commit 4dbea2c7037a5244ccb9d6e886023c29ba584892) Reviewed-on: http://git-master/r/808245 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: create debugfs node early	Kirill Artamonov	2015-09-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create debugfs node before platform->probe() is called. Allow chip specific debugfs entries go to correct directory. bug 1525327 bug 1581799 Change-Id: I2d91bdc1e72dac6787938eff01218c9f871029cb Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com> Reviewed-on: http://git-master/r/796092 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/778729 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: implement per-channel watchdog	Deepak Nibade	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement per-channel watchdog/timer as per below rules : - start the timer while submitting first job on channel or if no timer is already running - cancel the timer when job completes - re-start the timer if there is any incomplete job left in the channel's queue - trigger appropriate recovery method as part of timeout handling mechanism Handle the timeout as per below : - get timed out channel, and job data - disable activity on all engines - check if fence is really pending - get information on failing engine - if no engine is failing, just abort the channel - if engine is failing, trigger the recovery Also, add flag "ch_wdt_enabled" to enable/disable channel watchdog mechanism. Watchdog can also be disabled using global flag "timeouts_enabled" Set the watchdog time to be 5s using macro NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS Bug 200133289 Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/797072 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cyclestats snapshot permissions rework	Leonid Moiseichuk	2015-09-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cyclestats snapshot feature is expected for new devices. The detection code was isolated in separate function and run-time check added to validate/allow ioctl calls on the current GPU. Bug 1674079 Change-Id: Icc2f1e5cc50d39b395d31d5292c314f99d67f3eb Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/781697 (cherry picked from commit bdd23136b182c933841f91dd2829061e278a46d4) Reviewed-on: http://git-master/r/793630 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: dump PGRAPH_PRI on error	Sam Payne	2015-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \|	dumps NV_PGRAPH_PRI_GPC0_GPCCS_FS_GPC whenever pbus sends the 0xbadf13 error bug 1662268 Change-Id: I302ffe5c86098e7235ecc8c071a5e2c852455565 Signed-off-by: Sam Payne <spayne@nvidia.com> Reviewed-on: http://git-master/r/789090 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Inject function addresses	Yogesh	2015-08-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Inject function addresses of gk20a_do_idle and gk20a_do_unidle once the nvgpu module loads. Bug 1476801 Change-Id: I67a8ae7fb654524616c2c2c710013cbc097a3f32 Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com> Reviewed-on: http://git-master/r/785047 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Merge branch 'power-domain-t186' into 'kernel-3.18'	Sumit Singh	2015-08-04
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add device-tree support for tegra power-domains and power-gating for t186, then perform the related cleanup. Also enable TEGRA_MC_DOMIANS, PM_GENERIC_DOMAINS_OF and TEGRA_POWERGATE for t186. Bug 200105664 Change-Id: I548c6b71a1577afa439a39a0eafc317a1c3cbc68 Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
\| *	gpu: nvgpu: clean-up the code	Sumit Singh	2015-07-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As CONFIG_PM_GENERIC_DOMAINS_OF is enabled, so cleaning-up the code which remains unused when this config is enabled. Bug 200070810 Change-Id: I884ca3d6fb8fa6acdff8c1b2fbe66a672758274a Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
\| *	gpu: nvgpu: Add DT support for gpu power-domain	Sumit Singh	2015-07-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make modification to add DT support for gpu power-domain for T186 chip. Bug 200105664 Change-Id: Ief8d0a6c84918578c52d153db7eac02587b67ee7 Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
* \|	gpu: nvgpu: prepare_poweroff() in shutdown()	Deepak Nibade	2015-07-22
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gk20a_pm_shutdown() is the last callback before GPU railgate will be forced by platform code Hence we need to call prepare_poweroff() before returning from shutdown() to clean up below things mainly, 1. disable interrupts to ensure that GPU is not processing any interrupts while railgating 2. disable clocks (and related flags) to ensure no h/w access from exported clock ops Note that GPU railgate will be triggered by platform code since config CONFIG_PM_GENERIC_DOMAINS_OF is enabled by default Bug 200123584 Change-Id: Ifaa0d1ba9b01d49bf5cc85d9c9a9feb3815866d8 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/770485 Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: cyclestats snapshots are only for t210	Leonid Moiseichuk	2015-07-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cyclestats mode-e feature supported by userspace only for t210 devices, so kernel should advertize it only for t210. Also small check added to prevent BUG in dma-buf.c:826 if device has lack of memory. Bug 1662506 Change-Id: I8417a8cdd9092e64126382f379d171932e4592a1 Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/767073 (cherry picked from commit 06f86b6e78bae5e26e32466716c18e7918efb1b1) Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/767148 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Move clk bypass div code to clk init	Terje Bergstrom	2015-07-03
\| \| \| \| \| \| \| \| \| \| \| \| \|	Clock bypass divider was changed just before resetting priv ring. Move the code to a new clk op instead so that it is executed only on gk20a. Change-Id: Ic8084a4a5fac23770f50b50f910ced2543ba0f28 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/764970 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Aleksandr Frid <afrid@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation	Sami Kiminki	2015-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add batch support for mapping and unmapping. Batching essentially helps transform some per-map/unmap overhead to per-batch overhead, namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB invalidates. Batching with size 64 has been measured to yield >20x speed-up in low-level fixed-address mapping microbenchmarks. Bug 1614735 Bug 1623949 Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/733231 (cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91) Reviewed-on: http://git-master/r/763812 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Uncomment suspend/resume ops	Sumit Singh	2015-06-29
\| \| \| \| \| \| \| \| \| \|	As upstream has removed them, but we are still using these. So uncommenting these callback assignment. Bug 200070810 Change-Id: I26a221f9d76f6acef70095eb8afcf440057f464c Signed-off-by: Sumit Singh <sumsingh@nvidia.com>