nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	drivers: allow selected drivers to async probe	dmitry pervushin	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	List of drivers that now want async probe: - sdhci-tegra - qspi-mtd - nvmap - gk20a - dc Bug 200083391 Change-Id: Ie0a0677961b704c78d4eb2cdab9f0e9a925a3ca1 Reviewed-on: http://git-master/r/923738 (cherry-picked from 75c067e83c7cde2a37c4fae01719e40c5b7d2835) Signed-off-by: dmitry pervushin <dpervushin@nvidia.com> Reviewed-on: http://git-master/r/923121 Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com> Tested-by: Sumeet Gupta <sumeetg@nvidia.com>
*	gpu: nvgpu: add high priority channel interleave	Peter Pipkorn	2016-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interleave all high priority channels between all other channels. This reduces the latency for high priority work when there are a lot of lower priority work present, imposing an upper bound on the latency. Change the default high priority timeslice from 5.2ms to 3.0 in the process, to prevent long running high priority apps from hogging the GPU too much. Introduce a new debugfs node to enable/disable high priority channel interleaving. It is currently enabled by default. Adds new runlist length max register, used for allocating suitable sized runlist. Limit the number of interleaved channels to 32. This change reduces the maximum time a lower priority job is running (one timeslice) before we check that high priority jobs are running. Tested with gles2_context_priority (still passes) Basic sanity testing is done with graphics_submit (one app is high priority) Also more functional testing using lots of parallel runs with: NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 –finish plus multiple: NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 -finish Previous to this change, the relative performance between high priority work and normal priority work comes down to timeslice value. This means that when there are many low priority channels, the high priority work will still drop quite a lot. But with this change, the high priority work will roughly get about half the entire GPU time, meaning that after the initial lower performance, it is less likely to get lower in performance due to more apps running on the system. This change makes a large step towards real priority levels. It is not perfect and there are no guarantees on anything, but it is a step forwards without any additional CPU overhead or other complications. It will also serve as a baseline to judge other algorithms against. Support for priorities with TSG is future work. Support for interleave mid + high priority channels, instead of just high, is also future work. Bug 1419900 Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98 Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com> Reviewed-on: http://git-master/r/805961 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to disable watchdog per-channel	Deepak Nibade	2015-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable watchdog per-channel Also, if watchdog is disabled, we currently schedule the worker with MAX timeout. Instead of this, do not schedule any worker if watchdog is disabled Bug 1683059 Bug 1700277 Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/837962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix Coverity issues	Deepak Nibade	2015-11-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- operands not affecting result (id = 12845) - logically dead code (id = 12890) - dereference after null check (id = 12968) - unsigned compared to 0 (id = 13176) - resource leak (id = 13338, 18673) - unused pointer value (id = 13916) Bug 1703084 Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/835143 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: abstract set mmu debug mode	Richard Zhao	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new operaton g->ops.mm.set_debug_mode and let other places that set debug mode call this callback. It's preparing for adding vgpu set mmu debug mode hook. JIRA VFND-1005 Bug 1594604 Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/833497 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: update name for gpu debugfs node	Mahantesh Kumbar	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create constant name for gpu debugfs node across all chip. Bug n/a Change-Id: I359b82b5389c49d8fe2a31ace49ff6daa1edfb10 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/805397 Signed-off-by: Seema Khowala <seemaj@nvidia.com> (cherry-picked from commit 17a3882cde09412c68f7a0ee4765f45be1a51c45) Reviewed-on: http://git-master/r/817014
*	gpu: nvgpu: User-space managed address space support	Sami Kiminki	2015-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which enables creating userspace-managed GPU address spaces. When an address space is marked as userspace-managed, the following changes are in effect: - Only fixed-address mappings are allowed. - VA space allocation for fixed-address mappings is not required, except to mark space as sparse. - Maps and unmaps are always immediate. In particular, the mapping ref increments at kickoffs and decrements at job completion are skipped. Bug 1614735 Bug 1623949 Bug 1660392 Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/738558 Reviewed-on: http://git-master/r/833253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use railgate instead of reset_assert	Deepak Nibade	2015-10-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_do_idle(), if can_railgate = false, we do reset_assert() on the GPU But asserting reset might have dependencies on specific h/w seqeunce for ensuring proper reset Hence avoid calling reset_assert() and directly call platform specific unrailgate() and railgate() APIs which take care of the correct reset sequence Bug 200142989 Bug 200137963 Bug 1678611 Change-Id: Ide886dd88b8422ad36de52d54378b1edd9c7bbd6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/820322 (cherry picked from commit d621ddd49da976a75c14aa7aaa37f700fb4e83f2) Reviewed-on: http://git-master/r/822515 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: unrailgate only if pm_domains not enabled	Deepak Nibade	2015-10-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we unrailgate the GPU if railgating is not enabled or pm_domains are not enabled But in case if railgating is not enabled and pm_domains are enabled, we explicitly unrailgate GPU in gk20a_pm_init() and then runtime PM unrailgates it again when first user space request arrives - setting unrailgate refcount to 2 Now for gk20a_do_idle(), we need to railgate the GPU in fist call but that does not happen since unrailgate refcount != 1 hence, in case railgating is not enabled, we should unrailgate the GPU from only one place i.e. when first user space request arrives Bug 200142989 Bug 200137963 Bug 1678611 Reviewed-on: http://git-master/r/820321 (cherry picked from commit 452a1ff8da8e3f47caed2371440f9ad150bf8699) Change-Id: Ic9fe2267c9df5629315c30c1404c2b3044c1265a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/822296 GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: rework secure_page allocation path	Deepak Nibade	2015-10-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, if can_railgate = false, then we have below sequence to allocate secure_page - unrailgate GPU (forever) - reset_assert() - allocate secure_page - reset_deassert() But if we allocate secure page even before unrailgating GPU for first time, then we can avoid reset_assert()/deassert() calls since GPU should already be in reset/railgated at boot time hence, rework this sequence as below - init required mutex, set platform->reset_control - allocate secure page (GPU should already be in reset at this point) - gk20a_pm_init() which unrailgates GPU in case of can_railgate = false Bug 200137963 Bug 1678611 Change-Id: I79d0543bb5cf1eaf1009e1e6ac142532d84514a5 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/797153 (cherry picked from commit 368004501943d38c003747f6bec0384fed57ee65) Reviewed-on: http://git-master/r/816005 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: set correct timeslice value	Kirill Artamonov	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scale timeslice register value based on platform specific ptimer scale koefficient. Expose timeslice values through debugfs to simplify performance tuning. bug 1605552 bug 1603226 Change-Id: I49f86f22d58d26a366ee1b5f5a9ab9d7f896ad25 Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com> Reviewed-on: http://git-master/r/800007 (cherry picked from commit 00c85ef24cf28ffaa81eb53fff7edef1c699220a) Reviewed-on: http://git-master/r/808251 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support reset_control API	Deepak Nibade	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 200137963 Change-Id: I3197af905c945540b97ba191e5695d970d77af8e Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/797154 (cherry picked from commit 8a50245ea636deb87a3d9435fb115b4eac88fac9) Reviewed-on: http://git-master/r/808247 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: let shutdown callback call vgpu_pm_prepare_poweroff for vgpu	Richard Zhao	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It fixed the issue that system hang when reboot. Bug 1638850 Change-Id: If53a31e86c10b2fce4a22fe4fcf92106d86c95ef Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/803234 (cherry picked from commit 4dbea2c7037a5244ccb9d6e886023c29ba584892) Reviewed-on: http://git-master/r/808245 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: create debugfs node early	Kirill Artamonov	2015-09-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create debugfs node before platform->probe() is called. Allow chip specific debugfs entries go to correct directory. bug 1525327 bug 1581799 Change-Id: I2d91bdc1e72dac6787938eff01218c9f871029cb Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com> Reviewed-on: http://git-master/r/796092 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/778729 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: implement per-channel watchdog	Deepak Nibade	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement per-channel watchdog/timer as per below rules : - start the timer while submitting first job on channel or if no timer is already running - cancel the timer when job completes - re-start the timer if there is any incomplete job left in the channel's queue - trigger appropriate recovery method as part of timeout handling mechanism Handle the timeout as per below : - get timed out channel, and job data - disable activity on all engines - check if fence is really pending - get information on failing engine - if no engine is failing, just abort the channel - if engine is failing, trigger the recovery Also, add flag "ch_wdt_enabled" to enable/disable channel watchdog mechanism. Watchdog can also be disabled using global flag "timeouts_enabled" Set the watchdog time to be 5s using macro NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS Bug 200133289 Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/797072 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cyclestats snapshot permissions rework	Leonid Moiseichuk	2015-09-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cyclestats snapshot feature is expected for new devices. The detection code was isolated in separate function and run-time check added to validate/allow ioctl calls on the current GPU. Bug 1674079 Change-Id: Icc2f1e5cc50d39b395d31d5292c314f99d67f3eb Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/781697 (cherry picked from commit bdd23136b182c933841f91dd2829061e278a46d4) Reviewed-on: http://git-master/r/793630 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: dump PGRAPH_PRI on error	Sam Payne	2015-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \|	dumps NV_PGRAPH_PRI_GPC0_GPCCS_FS_GPC whenever pbus sends the 0xbadf13 error bug 1662268 Change-Id: I302ffe5c86098e7235ecc8c071a5e2c852455565 Signed-off-by: Sam Payne <spayne@nvidia.com> Reviewed-on: http://git-master/r/789090 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Inject function addresses	Yogesh	2015-08-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Inject function addresses of gk20a_do_idle and gk20a_do_unidle once the nvgpu module loads. Bug 1476801 Change-Id: I67a8ae7fb654524616c2c2c710013cbc097a3f32 Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com> Reviewed-on: http://git-master/r/785047 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Merge branch 'power-domain-t186' into 'kernel-3.18'	Sumit Singh	2015-08-04
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add device-tree support for tegra power-domains and power-gating for t186, then perform the related cleanup. Also enable TEGRA_MC_DOMIANS, PM_GENERIC_DOMAINS_OF and TEGRA_POWERGATE for t186. Bug 200105664 Change-Id: I548c6b71a1577afa439a39a0eafc317a1c3cbc68 Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
\| *	gpu: nvgpu: clean-up the code	Sumit Singh	2015-07-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As CONFIG_PM_GENERIC_DOMAINS_OF is enabled, so cleaning-up the code which remains unused when this config is enabled. Bug 200070810 Change-Id: I884ca3d6fb8fa6acdff8c1b2fbe66a672758274a Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
\| *	gpu: nvgpu: Add DT support for gpu power-domain	Sumit Singh	2015-07-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make modification to add DT support for gpu power-domain for T186 chip. Bug 200105664 Change-Id: Ief8d0a6c84918578c52d153db7eac02587b67ee7 Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
* \|	gpu: nvgpu: prepare_poweroff() in shutdown()	Deepak Nibade	2015-07-22
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gk20a_pm_shutdown() is the last callback before GPU railgate will be forced by platform code Hence we need to call prepare_poweroff() before returning from shutdown() to clean up below things mainly, 1. disable interrupts to ensure that GPU is not processing any interrupts while railgating 2. disable clocks (and related flags) to ensure no h/w access from exported clock ops Note that GPU railgate will be triggered by platform code since config CONFIG_PM_GENERIC_DOMAINS_OF is enabled by default Bug 200123584 Change-Id: Ifaa0d1ba9b01d49bf5cc85d9c9a9feb3815866d8 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/770485 Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: cyclestats snapshots are only for t210	Leonid Moiseichuk	2015-07-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cyclestats mode-e feature supported by userspace only for t210 devices, so kernel should advertize it only for t210. Also small check added to prevent BUG in dma-buf.c:826 if device has lack of memory. Bug 1662506 Change-Id: I8417a8cdd9092e64126382f379d171932e4592a1 Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/767073 (cherry picked from commit 06f86b6e78bae5e26e32466716c18e7918efb1b1) Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/767148 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Move clk bypass div code to clk init	Terje Bergstrom	2015-07-03
\| \| \| \| \| \| \| \| \| \| \| \| \|	Clock bypass divider was changed just before resetting priv ring. Move the code to a new clk op instead so that it is executed only on gk20a. Change-Id: Ic8084a4a5fac23770f50b50f910ced2543ba0f28 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/764970 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Aleksandr Frid <afrid@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation	Sami Kiminki	2015-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add batch support for mapping and unmapping. Batching essentially helps transform some per-map/unmap overhead to per-batch overhead, namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB invalidates. Batching with size 64 has been measured to yield >20x speed-up in low-level fixed-address mapping microbenchmarks. Bug 1614735 Bug 1623949 Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/733231 (cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91) Reviewed-on: http://git-master/r/763812 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Uncomment suspend/resume ops	Sumit Singh	2015-06-29
\| \| \| \| \| \| \| \| \| \|	As upstream has removed them, but we are still using these. So uncommenting these callback assignment. Bug 200070810 Change-Id: I26a221f9d76f6acef70095eb8afcf440057f464c Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
*	gpu: nvgpu: Add DT support for gpu power-domain for T132	Sumit Singh	2015-06-29
\| \| \| \| \| \| \| \| \| \| \| \| \|	Make modification to add DT support for gpu power-domain for T132 chip. Bug 200070810 Change-Id: Iac63c8fb5fc5280e9a9f5758e63c9da009f3813d Signed-off-by: Sumit Singh <sumsingh@nvidia.com> Reviewed-on: http://git-master/r/739698 Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
*	Revert "HACK: Disable genpd_pm_subdomain_attach"	Sumit Singh	2015-06-29
\| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 83699a4ec9ebf55f6cc12c76e57dad1d4ec2fbfa. This hack was put in place as upstream has removed of_node field from generic_pm_domain structure. But as we are still using it, so removing this hack. Bug 200100078 Change-Id: I14e533786fb814e361c580e2883ceff1f63d251f
*	gpu: nvgpu: add per-channel refcounting	Konsta Holtta	2015-06-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add reference counting for channels, and wait for reference count to get to 0 in gk20a_channel_free() before actually freeing the channel. Also, change free channel tracking a bit by employing a list of free channels, which simplifies the procedure of finding available channels with reference counting. Each use of a channel must have a reference taken before use or held by the caller. Taking a reference of a wild channel pointer may fail, if the channel is either not opened or in a process of being closed. Also, add safeguards for protecting accidental use of closed channels, specifically, by setting ch->g = NULL in channel free. This will make it obvious if freed channel is attempted to be used. The last user of a channel might be the deferred interrupt handler, so wait for deferred interrupts to be processed twice in the channel free procedure: once for providing last notifications to the channel and once to make sure there are no stale pointers left after referencing to the channel has been denied. Finally, fix some races in channel and TSG force reset IOCTL path, by pausing the channel scheduler in gk20a_fifo_recover_ch() and gk20a_fifo_recover_tsg(), while the affected engines have been identified, the appropriate MMU faults triggered, and the MMU faults handled. In this case, make sure that the MMU fault does not attempt to query the hardware about the failing channel or TSG ids. This should make channel recovery more safe also in the regular (i.e., not in the interrupt handler) context. Bug 1530226 Bug 1597493 Bug 1625901 Bug 200076344 Bug 200071810 Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/448463 (cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353) Reviewed-on: http://git-master/r/755147 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: cyclestats mode E snapshots support	Leonid Moiseichuk	2015-06-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	That is a kernel supporting code for cyclestats mode E. Cyclestats mode E implemented following Windows-design in user-space and required the following operations to be implemented: - attach a client for shared hardware buffer of device - detach client from shared hardware buffer - flush means copy of available data from hardware buffer to private client buffers according to perfmon IDs assigned for clients - perfmon IDs management for user-space clients - a NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability added Bug 1573150 Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/740653 (cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f) Reviewed-on: http://git-master/r/753274 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Revert "Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space""""	Bharat Nihalani	2015-06-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since the issue seen with bug 200106514 is fixed with change http://git-master/r/#/c/752080/. Bug 200112195 Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/752503 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space"""	Bharat Nihalani	2015-06-02
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since it causes GPU pbdma interrupt to be generated. Bug 200106514 Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com> Reviewed-on: http://git-master/r/749291
*	gpu: nvgpu: Support >32bit addresses in simulation	Terje Bergstrom	2015-06-01
\| \| \| \| \| \| \| \|	Change-Id: I96282b4e047ba8b5369dac039f0f51856c69235b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/747935 (cherry-picked from commit 0bb090745b4122fc4149b1bd6026138a1b9a32bc) Reviewed-on: http://git-master/r/749235
*	Revert "Revert "gpu: nvgpu: New allocator for VA space""	Alex Waterman	2015-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8. The original commit was actually fine. Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/743300 (cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff) Reviewed-on: http://git-master/r/743301 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: wait for running jobs to finish before shutdown	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gk20a_pm_shutdown(), we currently call __pm_runtime_disable() which prevents h/w access to new requests made after shutdown() call Also, once gk20a_pm_shutdown() completes, platform code will just rail gate the GPU But it is possible that some other thread is already accessing h/w while shutdown() was triggered and this can result in hang Hence, wait until all currently executing jobs are finished before returning from gk20a_pm_shutdown() Also, we need to wait for GPU's usage count to become 1 since platform code will increase the usage count and then call shutdown(). Hence usage count of 1 indicates that GPU is idle Bug 200099940 Change-Id: I1f2457829e2737c07302d13f355353a30c3b4e67 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/734920 Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
*	Revert "gpu: nvgpu: Fix gk20a shutdown issue"	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit c18edd3686115ca0b7d8bb08b35f23264f865358. Proper fixes for shutdown issue are being added with below changes http://git-master/r/#/c/738509/ http://git-master/r/#/c/734920/ Hence revert this workaround Bug 200099940 Change-Id: I74b29c804af2bdb9d95c6b93c5308a323575ae57 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/739082 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: jump to fail path if pm_runtime_get_sync() fails	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we execute pm_runtime_get_sync() and then gk20a_scale_notify_busy() without checking return value of pm_runtime_get_sync() In case of shutdown of GPU is already initiate, we get a hard hang due to this as per below sequence : - one thread invokes GPU shutdown and then forcibly rail gates the GPU - another thread (unaware of shutdown) calls gk20a_busy() - since runtime PM is disabled in shutdown path, pm_runtime_get_sync() fails - but we still go on running gk20a_scale_notify_busy() which tries to access some GPU registers and hangs Fix this by jumping to failure path in case pm_runtime_get_sync() fails Bug 200099940 Change-Id: I022f2dfa9408f640fb44e6f4b10a437688779c0a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/738509 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement compbits mapping	Sami Kiminki	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS, which enables mapping compbits to the GPU address space of said buffers. This, subsequently, enables moving comptag swizzling from GPU to CDEH/CDEV formats to userspace. Compbits mapping is conservative and it may map more than what is strictly needed. This is because two reasons: 1) mapping must be done on small page alignment (4kB), and 2) GPU comptags are swizzled all around the aggregate cache line, which means that the whole cache line must be visible even if only some comptag lines are required from it. Cache line size is not necessarily a multiple of the small page size. Bug 200077571 Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/719710 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Dynamic betacb size	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	Allow querying and setting default betacb size via debugfs. For global buffers the value takes effect upon first boot of GPU, and has no effect after that. Bug 1628352 Change-Id: Ib63f4299249c41eab1b36cc501b525cc54211195 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/733328
*	gpu: nvgpu: Fix gk20a shutdown issue	Sumit Singh	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With CONFIG_PM_GENERIC_DOMAINS_OF enabled, device reboot was getting hung while shutting-down gk20a. It was happening because genpd_dev_pm_detach() was railgating gk20a while other thread was still accessing it. So, assigning NULL to dev->pm_domain->detach for gk20a, so that genpd_dev_pm_detach() is not called during gk20a shutdown, which will not railgate it. This patch will be reverted once we have clean shutdown for gk20a. Bug 200070810 Bug 200099940 Change-Id: Ie2e89ea01a98a9d4f2f68a3ab07b6923ffa374f6 Signed-off-by: Sumit Singh <sumsingh@nvidia.com> Reviewed-on: http://git-master/r/735455 Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: fix deadlock on railgate_lock during race condition	Deepak Nibade	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have below race condition during __gk20a_do_idle() and force_reset case : - before execution of __gk20a_do_idle(), a process drops the last usage count of GPU, which triggers GPU railgate process - but before GPU is really railgated (there is 500 mS delay), some process calls __gk20a_do_idle() - in __gk20a_do_idle(), we first take railgate_lock - then we check if GPU is already railgated or not - since it is not railgated yet (due to 500 mS delay), this returns false - then we call pm_runtime_get_noresume() which just increases the usage counter - in this particular case, this call just increases usage count to 1 from 0, but whereas GPU is already on its way to railgate - while we check if GPU usage count drops to one, GPU gets railgated - now if we have force_reset=true case, we will end up calling pm_runtime_get_sync() which will take railgate_lock lock _again_ and try to unrailgate GPU - this causes a deadlock on railgate_lock To fix this, use below sequence : - take railgate_lock - check if GPU is already railgated - release railgate_lock - call pm_runtime_get_sync() which will keep GPU active even if railgating is already triggered - take railgate_lock again to prevent unrailgate in futher process Also, add more descriptive comments to explain the flow Bug 1624537 Change-Id: I0febc65d7bfac03ee738be200cf321322ffbe5a6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/719625 (cherry picked from commit 480284eda16e2b50ee6368bad3d15574e098b231) Reviewed-on: http://git-master/r/719620 Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: don't reset clk that doesn't exist	Alex Van Brunt	2015-05-15
\| \| \| \| \| \| \| \| \| \| \|	If the clock is null, calling the reset function will crash the kernel. So, don't call the reset function. Change-Id: I37ef25c8dca67bec8bf6654eb6e275b866bdae53 Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com> Reviewed-on: http://git-master/r/742361 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	Revert "gpu: nvgpu: New allocator for VA space"	Terje Bergstrom	2015-05-12
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5. Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/741469 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com> Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
*	gpu: nvgpu: New allocator for VA space	Alex Waterman	2015-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement a new buddy allocation scheme for the GPU's VA space. The bitmap allocator was using too much memory and is not a scaleable solution as the GPU's address space keeps getting bigger. The buddy allocation scheme is much more memory efficient when the majority of the address space is not allocated. The buddy allocator is not constrained by the notion of a split address space. The bitmap allocator could only manage either small pages or large pages but not both at the same time. Thus the bottom of the address space was for small pages, the top for large pages. Although, that split is not removed quite yet, the new allocator enables that to happen. The buddy allocator is also very scalable. It manages the relatively small comptag space to the enormous GPU VA space and everything in between. This is important since the GPU has lots of different sized spaces that need managing. Currently there are certain limitations. For one the allocator does not handle the fixed allocations from CUDA very well. It can do so but with certain caveats. The PTE page size is always set to small. This means the BA may place other small page allocations in the buddies around the fixed allocation. It does this to avoid having large and small page allocations in the same PDE. Change-Id: I501cd15af03611536490137331d43761c402c7f9 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/740694 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: SMMU bypass	Terje Bergstrom	2015-05-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improve GMMU mapping code to cope with discontiguous buffers. Add debugfs entry that allows bypassing SMMU and disabling big pages. Bug 1605769 Change-Id: I14d32c62293a16ff8c7195377c75a85fa8061083 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/717503 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/737533 Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com> Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User
*	HACK: Disable genpd_pm_subdomain_attach	Dan Willemsen	2015-04-04
\| \| \| \| \| \|	Upstream doesn't keep track of the DT node in the genpd struct anymore. Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
*	tegra: gpu: disable touch boost for gpu	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Gpu boosting with input events is causing more gpu power consumption than required. To avoid this, touch boot for gpu is disabled by not registering gpu device for cfboost frame work. Current rail gate entry/exit latencies are fast enough to give smooth user experience. Bug 200087243 Change-Id: I18673d9c95a44ce9bee87e860b4edb29212dc766 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/721989 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: make the local function static	Amit Sharma (SW-TEGRA)	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed the following sparse warnings by making below APIs static: - gk20a.c: warning: symbol 'gk20a_pm_restore_debug_setting' was not declared. Should it be static? - gr_gk20a.c: warning: symbol 'gr_gk20a_rop_l2_en_mask' was not declared. Should it be static? - gr_gm20b.c: warning: symbol 'gr_gm20b_rop_l2_en_mask' was not declared. Should it be static? Bug 200067946 Change-Id: I334893bb6614171bff835d270716a7dd262c9ba7 Signed-off-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com> Reviewed-on: http://git-master/r/718756 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: Correct irq and elpg disable sequence	Supriya	2015-04-04
\| \| \| \| \| \| \| \| \|	Bug 200066741 Change-Id: I873835c8aff0c53ac475090d727754ce1ccca0ee Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/715632 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Gpu characterstics enhancement	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	New members are added in nvgpu_gpu_characterstics to export more information required specially from CUDA tools. Change-Id: I907f3bcbd272405a13f47ef6236bc2cff01c6c80 Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/679202 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>