nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: CDE support	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for executing a precompiled GPU program to allow exporting GPU buffers to other graphics units that have color decompression engine (CDE) support. Bug 1409151 Change-Id: Id0c930923f2449b85a6555de71d7ec93eed238ae Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/360418 Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Attach compression state to dma-buf	Lauri Peltonen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Bug 1509620 Change-Id: I694fe43ef5d1f4f329d997a3d60e006785374cc3 Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-on: http://git-master/r/439849 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add gk20a_fence type	Lauri Peltonen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When moving compression state tracking and compbit management ops to kernel, we need to attach a fence to dma-buf metadata, along with the compbit state. To make in-kernel fence management easier, introduce a new gk20a_fence abstraction. A gk20a_fence may be backed by a semaphore or a syncpoint (id, value) pair. If the kernel is configured with CONFIG_SYNC, it will also contain a sync_fence. The gk20a_fence can easily be converted back to a syncpoint (id, value) parir or sync FD when we need to return it to user space. Change gk20a_submit_channel_gpfifo to return a gk20a_fence instead of nvhost_fence. This is to facilitate work submission initiated from kernel. Bug 1509620 Change-Id: I6154764a279dba83f5e91ba9e0cb5e227ca08e1b Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-on: http://git-master/r/439846 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Remove unused code in allocator	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \| \| \| \|	Remove functions that are not used in gk20a allocator. Bug 1523403 Change-Id: I36b2b236258d61602cb3283b59c43b40f237d514 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/432174
*	gpu: nvgpu: gk20a: add address check in allocator.	Kevin Huang	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check the address range before allocation to avoid illegal address range. Bug 1523403 Change-Id: Iff171399a980b69f9b1a18eea5bc37eff4c5d749 Signed-off-by: Kevin Huang <kevinh@nvidia.com> Reviewed-on: http://git-master/r/437871 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: select NETB for final NETLIST	Seshendra Gadagottu	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \|	Final netlist for T210 uses NETB firmware for gpu. Change-Id: Id396f1b6fa53f8d3c7b39ad0f93db230d6ad6d86 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/441355 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: Store LTC configuration	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \|	Change-Id: Ia780e6a7cb3579f0d6ed2dca9949a349799535fd Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/448115 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Update GM20B GPCPLL programming sequence	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Updated GM20B GPCPLL programming sequence to utilize new glitch-less post divider: - No longer bypass PLL for re-locking if it is already enabled, and post divider as well as feedback divider are changing (input divider change is still under bypass only). - Use post divider instead of external linear divider to introduce (VCO min/2) intermediated step when changing PLL frequency. Bug 1450787 Signed-off-by: Alex Frid <afrid@nvidia.com> Change-Id: I4fe60f8eb0d8e59002b641a6bfb29a53467dc8ce
*	gpu: nvgpu: Updated GM20b GPCPLL dynamic ramp setup	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Setup GPCPLL dynamic ramp coefficients based on update rate (instead of hard-coding), since on GM20B high reference clock 38.4MHz allows to use several update rates within supported range. Bug 1450787 Change-Id: I0e14bcb8e3f65cc164fbb66b4adc688fcee9e2d6 Signed-off-by: Alex Frid <afrid@nvidia.com>
*	gpu: nvgpu: Update GM20b GPCPLL locking under bypass	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \|	Moved GPCPLL locking under bypass procedure into separate function. Added SYNC_MODE control during locking. Bug 1450787 Change-Id: I8dbf9427fbdaf55ea20b6876750b518eb738de1b Signed-off-by: Alex Frid <afrid@nvidia.com>
*	gpu: nvgpu: skip WFI for KEPLER_C channels	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In channel_finish(), we submit WFI for all the channels including channels with KEPLER_C class. Since there is no need to submit WFI for channels with KEPLER_C class, we can optimize to skip submitting WFI and directly wait on last submit fence Bug 1534272 Change-Id: I3838416cf22122728e7f1008e01d77b14a35deba Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
*	gpu: nvgpu: poweron host1x explicitly	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently gk20a gets reference of host1x via phandle in Device Tree. But runtime PM does not seem to be handling power dependencies too well in this case and hence some times host1x is off when we need it. To fix this, exlicitly power on host1x while powering gpu up. Do this via "busy" and "idle" callbacks from gk20a_platform Bug 1534272 Bug 200022536 Change-Id: Ia562ee19722cfc8edc5626a5a058ab8edfe3d206 Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
*	gpu: nvgpu: gm20b: update gpu headers	Seshendra Gadagottu	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Keep gm20b headers upto date with script output Change-Id: I0916df7c43616b1d9231436a512290c2fa901d64 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/447725 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Bo Yan <byan@nvidia.com>
*	gpu: nvgpu: Update GM20b GPCPLL parameters	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Updated GPCPLL parameters according to GM20b specification. Modified PLL programming, since on GM20b PLL post divider value is equal to divider setting (which was not the case on GK20a this code was inherited from). Bug 1450787 Change-Id: Ia455ac49040047a3dbcd5d5211f2fbc71dc332ae Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/447751 GVS: Gerrit_Virtual_Submit Reviewed-by: Hoang Pham <hopham@nvidia.com> Tested-by: Hoang Pham <hopham@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Update GM20b GPCPLL initial configuration	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Set initial output rate to 1/3 of VCO minimum. - Cleared global BYPASSCTRL to get ready for enabling PLL (this won't bring PLL out of bypass, since SEL_VCO register is cleared). - Added debugfs nodes for BYPASSCTRL and SEL_VCO state. Bug 1450787 Change-Id: I10b068b006b7e9fbdf7854eff0cfd5cfdc1dd546 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/447750 GVS: Gerrit_Virtual_Submit Reviewed-by: Hoang Pham <hopham@nvidia.com> Tested-by: Hoang Pham <hopham@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Expand GM20b PLL fields header	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added masks for GM20b GPCPLL input and post dividers. Bug 1450787 Change-Id: I39a9c7ffb740fa9ef3a614deb2591412e34ef263 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/447857 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Hoang Pham <hopham@nvidia.com> Reviewed-by: Bo Yan <byan@nvidia.com>
*	nvgpu: new gpmu ucode compatibility	Supriya	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	For LS PMU new ucode needs to be used. Ucode has interface header file changes too. This patch also has fixes for pmu dmem copy failure Bug 1509680 Change-Id: I8c7018f889a82104dea590751e650e53e5524a54 Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/441734 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use GPU device name in clock get operation	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Used GPU device name in clock get operation (instead of fixed name), to make operation is common for GK20A and GM20B. Updated clock ids in tegra clock framework accordingly. Bug 1450787 Change-Id: Ifd5b9c3a6fd8db5b06e6dcd989285e8410794803 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/441711 Reviewed-by: Bo Yan <byan@nvidia.com> Tested-by: Bo Yan <byan@nvidia.com>
*	gpu: nvgpu: Make clock operations static	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Made GK20A and GM20B clock operations static, since they are invoked only via HAL interfaces. Bug 1450787 Change-Id: Ia30218ad4244bd8790b5ef96d1963678d0ba39e1 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/441710 Reviewed-by: Bo Yan <byan@nvidia.com> Tested-by: Bo Yan <byan@nvidia.com>
*	gpu: nvgpu: Switch to use GM20B hw header files	Hoang Pham	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Bug 1450787 Change-Id: Id28bd49eadae7b2310410c1676d73b37f57d1443 Signed-off-by: Hoang Pham <hopham@nvidia.com> Reviewed-on: http://git-master/r/441543 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Aleksandr Frid <afrid@nvidia.com> Reviewed-by: Bo Yan <byan@nvidia.com>
*	Revert "gpu: nvgpu: return error from mutex_acquire() if pmu not initialized"	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 50497d4031103df1067f14ce4c1e14b15713efb9. Simply returning error from mutex_acquire() causes the code to call disable_elpg() which decreases elpg refcount But we already have a race condition between pmu initialization where we initialize elpg and runlist update where we call this mutex_acquire and decrease the refcount As a result of this race and returned error we might mess up with the elpg refcount and cause abnormal behaviour Hence revert this change for now until we have clean fix considering this race as well Bug 200024116 Change-Id: Ie64ca36f70aba6b15c2acc235a5d36d13c9025aa Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/441793 Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
*	gpu: nvgpu: Fork GM20B clock from GK20A clock	Hoang Pham	2015-03-18
\| \| \| \| \| \| \| \| \|	Bug 1450787 Change-Id: Id7fb699d9129a272286d6bc93e0e95844440a628 Signed-off-by: Hoang Pham <hopham@nvidia.com> Reviewed-on: http://git-master/r/440536 Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Init clock debugfs after clock support	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \|	Initialized GK20A clock debugfs after clock support hardware and software are ready. Bug 1450787 Change-Id: I8ec2ef303a84b9151b7ce209a1864f1729382a44 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/440973 Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Add gm20b h/w definitions	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Added SYNC_MODE field, and BYPASSCTRL register; expanded GPC2CLK_OUT_VCODIV field. Bug 1450787 Signed-off-by: Alex Frid <afrid@nvidia.com> Change-Id: Ibf2119a88b0d5f099199920e70b2e88f04b8863b Reviewed-on: http://git-master/r/440928 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: flush write before unlocking	Sang-Hun Lee	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- gk20a_enable is reading the clock after unlocking the spinlock to flush any previous write - This could lead to a race if any write afterwards assume the write has been completed already - Read the clock before unlocking to ensure all previous writes have been completed before letting any other thread use gk20a Bug 200007520 Change-Id: I737fbbe825c68b25ca256c4a8ee2b99aa8baf0f5 Signed-off-by: Sang-Hun Lee <sanlee@nvidia.com> Reviewed-on: http://git-master/r/418485 (cherry picked from commit 2aed542a719caa69620766bf2dceefe50626c189) Reviewed-on: http://git-master/r/437842 Reviewed-by: Mitch Luban <mluban@nvidia.com> Tested-by: Mitch Luban <mluban@nvidia.com>
*	gpu: nvgpu: Double syncpoint increments	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	gm20b/gm20x requires incrementing syncpoints twice to ensure that the data has reached memory in all cases. This patch modifies increment push buffer to account this requirement. Bug 1491360 Change-Id: I5c2899b26ce0e1cdf9408bb9aaa576fc3054480f Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/437675 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Add helpers for backing store access	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds mm helpers to access compression backing store from in-kernel shader. Bug 1409151 Change-Id: Icb4f6dc0b5a35fdb97bc4221ab3657866f775fae Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/440263 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Allow reloading the golden context	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In cases where a kernel channel dies, we can reload the context by just reloading the golden context buffer. This patch makes necessary infrastructural changes to support this behaviour. Bug 1409151 Change-Id: Ibe6a88bf7acea2d3aced2b86a7a687279075c386 Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/440262 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: gk20a: Allow in-kernel channel alloc	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch modifies channel interfaces to allow allocating the channel for kernel use. This is needed if we want to run a shader from kernel space. Bug 1409151 Change-Id: I3544186bb1541120f85e01a19de106ef011c1b11 Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/440261 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: return error from mutex_acquire() if pmu not initialized	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In pmu_mutex_acquire(), we return zero (success) if pmu->initialized is not set Since mutex_acquire() was successful, we then call pmu_mutex_release() If now pmu->initialized is set in some other thread then we proceed to validate the mutex owner and end up causing below warning : pmu_mutex_release: requester 0x00000000 NOT match owner 0x00000008 Hence to fix this return error from mutex_acquire() and mutex_release() if pmu->initialized is not yet set and in that case we proceed to call elpg enable/disable Bug 1533644 Change-Id: Ifbb9e6a8e13f6478a13e3f9d98ced11792cc881f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/439333 GVS: Gerrit_Virtual_Submit Reviewed-by: Naveen Kumar S <nkumars@nvidia.com> Tested-by: Naveen Kumar S <nkumars@nvidia.com> Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
*	gpu: nvgpu: Remove unused GK20A cooling device	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Removed unused, obsolete GK20A cooling device. Bug 1450787 Change-Id: I5b02546d0405dd518ec841d903e650a8d38db8f2 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/437942 Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: Split clk_ops for GK20A and GM20B	Hoang Pham	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Split clk_ops for GK20A and GM20B into different files Bug 1450787 Change-Id: I34d16c54ac40c70854e80588475434c9e50b51a5 Signed-off-by: Hoang Pham <hopham@nvidia.com> Reviewed-on: http://git-master/r/437771 Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Use 1kHz resolution for GPCPLL programming	Alex Frid	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Used 1kHz resolution (instead of 1 MHz) for GPCPLL programming: limits specifications, calculating GPCPLL settings, storing target frequency values, and proving output from debug monitor. Updated comments in clock header to properly reflect frequency units. Bug 1450787 Change-Id: Ica58f794b82522288f2883c40626d82dbd794902 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/437943 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: gm20b: fix CBC clear	Wei Sun	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gm20b_ltc_cbc_ctrl sent the wrong register value to clear the CBC. Bug 1507804 Change-Id: Ib0d867a122466e50cb15fef3b320fb2ee8455ef2 Signed-off-by: Wei Sun <wsun@nvidia.com> Reviewed-on: http://git-master/r/435297 Reviewed-by: Kevin Huang (Eng-SW) <kevinh@nvidia.com> Tested-by: Kevin Huang (Eng-SW) <kevinh@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: force idle if railgate not supported	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a way to force idle and reset the GPU in case where GPU rail gating is not supported (i.e. platform->can_railgate = false) In this case, we follow below sequence : - once GPU is idle, get runtime reference which enables the clocks - call prepare_poweroff() to save the state explicitly - perform explicit reset assert/deassert - call finalize_poweron() to restore the state - drop the runtime reference taken earlier Bug 1525284 Change-Id: Id5f3ec152093acd585631dfbf785d8e0561f9048 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/435620 GVS: Gerrit_Virtual_Submit Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Tested-by: Arto Merilainen <amerilainen@nvidia.com>
*	gpu: nvgpu: increase delays in do_idle()	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Increase the wait delays in do_idle() to 2000 mS and make use of msleep instead of mdelays Also, to check if GPU is rail gated or not, add a do-while() loop which will keep checking the status and bail out as soon as GPU is rail gated This increase in delays is required to allow GPU sufficient time to complete its work and get rail gated These delays are specially needed during stress testing where it is possible that a large amount of GPU work is blocked during do_idle() and then it might take more time to complete it while next do_idle() is waiting for it Also, remove waiting on API gk20a_wait_channel_idle() for each channels since it is sufficient to wait for refcount to be 1 bug 1529160 Change-Id: Ie541485fbdda76d79ae4a75dda928da240fc5d8f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/434192 (cherry picked from commit 5a621bf2aaf3355e1330a662dc98e943d68ef86d) Reviewed-on: http://git-master/r/435133 Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: remove redundant busy()/idle() calls	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gk20a_busy() call in channel_syncpt_incr() and corresponding gk20a_idle() call in channel_update() are redundant since they are already encapsulated inside another pair of busy/idle calls This busy/idle pair will be called only from submit_gpfifo() and submit_gpfifo() already has its own busy/idle which it preserves for whole path and hence this redundant pair can be removed Also, this prevents a dead lock scenario while do_idle() is in progress as follows : - in submit_gpfifo() we call first gk20a_busy() which acquires busy read semaphore - in do_idle() we acquire busy write semaphore and wait for current jobs to finish - now submit_gpfifo() encounters second gk20a_busy() and requests busy read semaphore again - this results in dead lock where do_idle() is waiting for submit_gpfifo() to complete and submit_gpfifo() is waiting for busy lock held by do_idle() and hence it cannot complete bug 1529160 Change-Id: I96e4368352f693e93524f0f61689b4447e5331ea Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/434191 (cherry picked from commit c4315c6caa42bab72ba6017c7ded25f4e9363dec) Reviewed-on: http://git-master/r/435132 Reviewed-by: Sachin Nikam <snikam@nvidia.com> Tested-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: fix race between do_idle() and unrailgate()	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While we are executing do_idle() API, it is possible that unrailgate() gets invoked in midst of idling the GPU and this can result in failure of do_idle() To prevent simultaneous execution of these methods, add a mutex railgate_lock and acquire it during do_idle() and unrailgate() APIs Also, keep this lock held if do_idle() is successful. In success, lock will be released in do_unidle(), otherwise release this lock before returning Note that this lock should not be held in railgate() API since we do not want it to be blocked during do_idle() bug 1529160 Change-Id: I87114b5367eaa217376455a2699c0d21c451c889 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/434190 (cherry picked from commit 561dc8e0933ff2d72573292968b893a52f5f783a) Reviewed-on: http://git-master/r/435131 Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	Revert "gpu: nvgpu: Dump offending push buffer fragment"	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Channel and gpfifo allocations are entirely separated from each other, however, the code here assumes that active channel means that the channel also has a gpfifo. This reverts commit a24602f094380539788696d1b1567a4f4d914b17 which added gpfifo dump. Changing debug dumping to be safe requires refactoring the channel release code to use proper locking. Bug 1530226 Change-Id: I2fb02542a17dd56a0a9ce732b327e34b85ade8b9 Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/434038 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Shridhar Rasal <srasal@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Exclude non-secure boot with CONFIG_TEGRA_ACR	Seshendra Gadagottu	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Do not compile non-secure boot code if CONFIG_TEGRA_ACR is defined. Bug 1524197 Change-Id: Id1ec222e00e2229e1d28e406e4ddad99e368296e Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/433356 Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu:nvgpu: Enable Falcon trace prints	Vijayakumar	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Dump Falcon trace on PMU Crash and add debugfs node falc_trace. This needs Debug trace to be enabled in GPmu binary. Change-Id: I093ef196202958e46d8d9636a848bd6733b5e4cc Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/432732 Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Update generic platform	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \| \|	This patch adds .is_railgated() callback for the generic gpu platform. Change-Id: Ief13a6fba82b376aafbe861e8f3823a19bb7f679 Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/433059 Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Support probing host1x link from apps	Arto Merilainen	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	This patch adds support to check if the host1x link exists and is supported using the gpu characteristics ioctl. Bug 1459653 Change-Id: I832eea217ed7f007e341dfde5769887e0882d6bb Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/433058
*	gpu:nvgpu:sysfs node to update aelpg parameter	Mahantesh Kumbar	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added sysfs node to update aelpg parameter. Pass parameter as below sequence, SAMPLING_PERIOD_PG_DEFAULT_US, MINIMUM_IDLE_FILTER_DEFAULT_US, MINIMUM_TARGET_SAVING_DEFAULT_US, POWER_BREAKEVEN_DEFAULT_US, CYCLES_PER_SAMPLE_MAX_DEFAULT Bug 1464737 Change-Id: I46873c463820f30f190c722d7ed038622cb2710f Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/422702 Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Remove GPU MMIO access on power/rail gate	Alex Waterman	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \|	This is to weed out accesses to the GPU while it is gated. Otherwise the accesses are silently dropped or cause a HW hang (on older chips). Bug 1514949 Change-Id: Ice4cdb9f1f736978ebb3db847f39c7439bf98134 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/416339 Reviewed-by: Mitch Luban <mluban@nvidia.com>
*	gpu: nvgpu: Wait for idle via FIFO registers	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wait for engine idle via FIFO's engine status instead of submitting WFI to channel. Submitting WFI and waiting is not robust, and wait might invoke debug dump which cannot be done while powering down. Bug 1499214 Change-Id: I4d52e8558e1a862ad4292036594d81ebfbd5f36b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/432151 Reviewed-by: Sachin Nikam <snikam@nvidia.com> Tested-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: Bump unmap retries if not silicon	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In simulation and emulation 50ms is not enough to ensure a job is complete. Bump it to 5s when not running on silicon. Bug 1510751 Change-Id: I90883b70ce2a75a8f07344f713d647b3fa0d0c7d Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/432044 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Chris Dragan <kdragan@nvidia.com> Tested-by: Chris Dragan <kdragan@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Shridhar Rasal <srasal@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: Clear channel class on open	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Channel class needs to be cleared when a channel is opened. Otherwise previously used channel remains, and we can accidentally use KEPLER_C methods even if KEPLER_C is not allocated. Bug 1487928 Bug 200000669 Change-Id: I3e1ae8d5edbdd82fa569b38a89a89dedb69ee773 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/428866 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Shridhar Rasal <srasal@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: Free allocated gr_ctx	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gr_ctx is nowadays kalloc()'d separately. Adding kfree() to prevent memory leak. Bug 1528275 Change-Id: I942812a483adad47e82bc75a7bda5942c30c527a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/428890 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Shridhar Rasal <srasal@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: fix possible PMU isr race	Deepak Nibade	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Possible race description : - while PMU is booting, it sends messages to kernel which we process in gk20a_pmu_isr() - but when messages are processed it is possible that we are on the way to rail gate the GPU and we have already called pmu_destroy() - this could lead to hangs if while processing messages, GR is already off To fix this, introduce another mutex isr_enable_lock and a flag to turn on/off ISRs - when we enable PMU, get the lock and set the flag - in pmu_destroy(), get the lock and remove the flag - in pmu_isr(), take the lock, check if flag is set or not. If flag is not set return, otherwise proceed with the messages Bug 200014542 Bug 200014887 Change-Id: I0204d8a00e4563859eebc807d4ac7d26161316ea Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/428371 (cherry picked from commit 9a37528314f2a2504e4530719f817a93db9a5bf0) Reviewed-on: http://git-master/r/428352 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>