nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: Ensure memory subsystem is enabled	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Ensure that memory subsystem is enabled at init. Bug 1603128 Change-Id: Ie3fcd4d9df4dbd480e44fa8919fc311e61b627ca Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/707027 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
*	gpu: nvgpu: add hw perfmon buffer mapping ioctls	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Map/unmap buffers for HWPM and deal with its instance block, the minimum work required to run the HWPM via regops from userspace. Bug 1517458 Bug 1573150 Change-Id: If14086a88b54bf434843d7c2fee8a9113023a3b0 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/673689 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: validate reg ops always	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Call validate_reg_ops() even when allow_all is set, since that function takes care of counting ctxsw regops which would not be executed without the counters set. Bug 1517458 Change-Id: Ie6173229fb6580e8812b7d2a52bfa8661f3d95e5 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/709439 Reviewed-by: Automatic_Commit_Validation_User Tested-by: Sandarbh Jain <sanjain@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix if/else conds if PMU flag is OFF.	Deepak Goyal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 200069748 Invalidating FECS code instblk is required only if FECS uses bootloader to load. Added check for same instead of using PMU support to invalidate. Handle elpg enable/disable call in case PMU is OFF. Change-Id: I28abbbbe1f22edd9e0417df9d0e831bbd770502c Signed-off-by: Deepak Goyal <dgoyal@nvidia.com> Reviewed-on: http://git-master/r/670664 Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Tested-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Supriya Sharatkumar <ssharatkumar@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: bypass smmu for VPR memory access	Krishna Reddy	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	SMMU translation should be bypassed for VPR accesses via GPU. clear sgt dma address to bypass smmu for VPR. Bug 1215470 Change-Id: I22df41a9afc447e2502055b7907cc1848a770f26 Signed-off-by: Krishna Reddy <vdumpa@nvidia.com> Reviewed-on: http://git-master/r/696509 (cherry picked from commit a699f55941fa22e90d41a53798956a542b212659) Reviewed-on: http://git-master/r/707889
*	gpu: nvgpu: Use vzalloc for bitmap	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Allocator bitmap is now larger, and cannot be allocated with kzalloc anymore. Bug 200081843 Change-Id: I9c978ddbdd796e4f1dd5719dbef3a6bd99e64f48 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/709884 Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
*	gpu: nvgpu: register dump in ctxsw timeout	Deepak Nibade	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Dump GR status registers in case of ctxsw timeout. This is helpful in case where ctxsw timeout is encountered during stress testing but we lose the bad state since we do the recovery. So dump as much status as we can when timeouts are seen Bug 200062436 Change-Id: Ie7d320cefa7b272f2cc607cdb5c01ba1f43ba1f2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/708465 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Per-chip PBDMA signature	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \|	PBDMA HW signature depends on the chip. Change-Id: If57d721d9bb77a090f967930a1aa2037bf4a16fe Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/672922
*	gpu: nvgpu: Do not return timedout in emulation	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \|	We have infinite timeouts for loops in emulation. Some functions with the loops still return error if we exceed the original retry count. Change-Id: I1f9ddbfc0acd9f30f6bd49d9e748d8d8fbefa154 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/709491
*	gpu: nvgpu: Unify PDE & PTE structs	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \|	Introduce a new struct gk20a_mm_entry. Allocate and store PDE and PTE arrays using the same structure. Always pass pointer to this struct when possible between functions in memory code. Change-Id: Ia4a2a6abdac9ab7ba522dafbf73fc3a3d5355c5f Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/696414
*	gpu: nvgpu: Install fd after no errors can happen	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	fd_install() should be called only once all other initialization is done and no errors can happen. Bug 1589104 Change-Id: I822511a64d4c6fa59c8e772a896dbd6818459c97 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/706928 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
*	gpu: nvgpu: reduce message severity to info	Naveen Kumar S	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Shader informs user about context switch wait time. This doesn't affect any functionality. Hence changing print to info. bug 200015967 Change-Id: I7fbb562e43ee6ec1bc8ac01a51d3c9f19d5cb4cf Signed-off-by: Naveen Kumar S <nkumars@nvidia.com> Reviewed-on: http://git-master/r/662657 (cherry picked from commit 3a4d2022369f4bfc1701d6543226e01d7f6f8e0d) Reviewed-on: http://git-master/r/671534 Reviewed-by: Venkat Moganty <vmoganty@nvidia.com> Tested-by: Venkat Moganty <vmoganty@nvidia.com>
*	gpu: nvgpu: vgpu: fix AS split	Aingara Paramakuru	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The GVA was increased to 128GB but for vgpu, the split was not updated to reflect the correct small and large page split (16GB for small pages, rest for large pages). Bug 1606860 Change-Id: Ieae056d6a6cfd2f2fc5066d33e1247d2a96a3616 Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/681340 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: APIs to dump GR status	Deepak Nibade	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below APIs to dump various GR status registers 1. debugfs : /d/gpu.0/gr_status Read this debugfs at runtime to get status registers 2. API gk20a_gr_debug_dump() Add this API in code to dump registers at any point Bug 200062436 Change-Id: Ic1115b5a2fc16362954b5ed8a9e70afb872a8d91 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/486465 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: optimize fecs status polling	Vijayakumar	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 200078367 using udelay for fecs status polling during GR init phase brings down fecs transaction time to < 20usec from few hundred usec. Change-Id: I61a27daaf1187ac086a42779b46aa3fbee3b37f2 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/691918 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: WPR size 0, on railgate exit	Supriya	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 200066741 ACR ucode has mechanism to skip WPR blob copy for second time, in case WPR size is sent as 0 to acr ucode. With above there is a saving of around 0.5 ms, but, in conjunction with acr change to disable LS sig verification, and scrubbing empty spaces in WPR sections to 0. This change can reduce railgate exit latency by 4ms ACR ucodes to be checked in main, as a different CL, and after getting prod signs for ACR Change-Id: I9d662027abf0b2615176d17433ff3ec3ae53d78a Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/681892 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: FECS HALT method	Supriya	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	FECS halt method is used to do graceful FECS shutdown. Bug 1551865 Change-Id: Iec8590e86cb09f9b54c36f85859208fc8650f6a6 Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/682459 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Reset sync point at alloc/free	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \|	Change-Id: I8753e47ef4d3f4b3645ed6c6e604449d81d3da4b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/709061 (cherry picked from commit cc07f316334b88cc18070fba9dd9149ba193bd38) Reviewed-on: http://git-master/r/707980
*	gpu: nvgpu: More robust recovery	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Make recovery a more straightforward process. When we detect a fault, trigger MMU fault, and wait for it to trigger, and complete recovery. Also reset engines before aborting channel to ensure no stray sync point increments can happen. Change-Id: Iac685db6534cb64fe62d9fb452391f43100f2999 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/709060 (cherry picked from commit 95c62ffd9ac30a0d2eb88d033dcc6e6ff25efd6f) Reviewed-on: http://git-master/r/707443
*	gpu: nvgpu: TLB invalidate after map/unmap	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Always invalidate TLB after mapping or unmapping, and remove the delayed TLB invalidate. Change-Id: I6df3c5c1fcca59f0f9e3f911168cb2f913c42815 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/696413 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Fix NULL instead of interger	Amit Sharma (SW-TEGRA)	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed the following sparse warning using proper NULL instead of '0': - mm_gk20a.c:1301: warning: Using plain integer as NULL pointer Bug 200067946 Change-Id: Idd84f541711682bf097bb474049d523a5bb01ae2 Signed-off-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com> Reviewed-on: http://git-master/r/682242 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sachin Nikam <snikam@nvidia.com>
*	gpu: nvgpu: Add Fmax at Vmin debugfs node	Alex Frid	2015-04-04
\| \| \| \| \| \| \| \| \|	Change-Id: I547cc02a544d117a4c76bf2541b9594d0769c2ef Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/682822 GVS: Gerrit_Virtual_Submit Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: Remove hard-coded GPU name strings	Alex Frid	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Removed hard-coded GPU name strings. Instead retrieved GPU name via device name access interfaces. Change-Id: Iefb41cc610e92e870d4664951c3599df2bb83020 Signed-off-by: Alex Frid <afrid@nvidia.com> Reviewed-on: http://git-master/r/682671 GVS: Gerrit_Virtual_Submit Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
*	gpu: nvgpu: gpu pm control update for cde ioctls	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For cde ioctl "NVGPU_GPU_IOCTL_MARK_COMPRESSIBLE_WRITE", gpu hw not engaged. So remove this call from gpu pm control. Bug 1592636 Change-Id: I9b700e469bf365f2d02549cd9cd9babc68fbb049 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/680294 (cherry picked from commit cae24ee5e9630cc891fb7fcf98d234a42126f464) Reviewed-on: http://git-master/r/681622 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add a new CDE parameter	Jussi Rasanen	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Add TYPE_PARAM_GOBS_PER_COMPTAGLINE_PER_SLICE. Change-Id: I7cbf7b6db6642a61629ba06f7887bd58af3dc28f Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/673152 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add open channel ioctl to ctrl node	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the ioctl to open a new gpu channel to also the control node for improved process startup performance, in addition to the current open ioctl in the channel node. The new channel fd creation is refactored to a separate function which is called from both ctrl and channel ioctls. Bug 1604952 Change-Id: I3357ceec694c0e6d7a85807183884324cb725d3a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/679516 Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use busy looping on memory ops	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Use busy looping on L2 and TLB maintenance operations. This speeds them up by an order of magnitude. Add also trace points to measure performance for memory ops and interrupt processing. Change-Id: Ic4a8525d3d946b2b8f57b4b8ddcfc61605619399 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/681640
*	gpu: nvgpu: use vm param to vm_map/unmap_buffer	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pass vm instead of as share to the userspace buffer mapping functions, since they need to be called also from other places than just the AS device ioctls, and as share is specific to them. Bug 1573150 Change-Id: I994872f23ea7b1582361f3f4fabbd64b4786419c Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/674020 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: use gk20a_free_inst_block in remove_vm	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \|	Use the common instance block freeing function when removing vm. Change-Id: I1dfaaceb57e01d0a1359ce5742ed55d81dff10ed Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/672033 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Set compression page per SoC	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Compression page size varies depending on architecture. Make it 129kB on gk20a and gm20b. Also export some common functions from gm20b. Bug 1592495 Change-Id: Ifb1c5b15d25fa961dab097021080055fc385fecd Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/673790
*	gpu:nvgpu: add bar2 aperture support	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \|	Bug 1587825 Change-Id: I884c6b268aabb04b4990713395ebedf92410e02a Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/659239 Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: unify instance block initialization	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Create gk20a_init_inst_block() to reduce reg write clutter when initializing instance blocks, which is done in several places. Change-Id: Idcb8b604851a849e0bb6abce5743c9f4cbf98033 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/672434 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: update clock gating defines	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed sparse warnings related to clock gating list and function definitions. Bug 200067946 Change-Id: I9844771b6713c56dbe5dcf85f746a0ebd6c48f9c Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/677878 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Change va_node free behavior	Alex Waterman	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Decrement the ref count on all mapped_buffers belonging to a va_node when a va_node is freed. This prevents userspace from leaking some mapped_buffers in some cases. This does prevent userspace from keeping a buffer around after freeing a space allocation if the buffer in question is not otherwise ref counted. Not sure if this is a bad thing for userspace or not. Bug 1600686 Change-Id: I659ccbda5935d44086fd367bd2110f7d0f066194 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/676629 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: Optimize gpfifo entry copy	Janne Hellsten	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use memcpy for copying gpfifo inputs into the gpfifo ring buffer. This speeds up one command buffer heavy benchmark from 82 FPS to 86 FPS. Speed up is due to a) faster memory move and b) zero tracing overhead when PB tracing is disabled. Bug 1550886 Change-Id: If95ebff53745bbf59edeac32ad4f32f10f1ea7ee Signed-off-by: Janne Hellsten <jhellsten@nvidia.com> Reviewed-on: http://git-master/r/676967 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: vgpu: handle fifo and gr exceptions	Aingara Paramakuru	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Handle the gr and fifo exceptions delivered from the server and update the channel state as needed. Bug 1551865 Change-Id: Ie19626c6e8a72f92ffd134983fe6d84e5c6c8736 Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/670329 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: Add a gpfifo wait trace point	Janne Hellsten	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a couple of trace points for tracking when we need to wait for space in the gpfifo ring buffer. This wait can introduce significant latencies to rendering with large gpfifo entry inputs so it's good to be able to measure how often this path is taken. Bug 1592391 Bug 1550886 Change-Id: I7f362e9c307eeffeeecaaba268ef2e3613e54597 Signed-off-by: Janne Hellsten <jhellsten@nvidia.com> Reviewed-on: http://git-master/r/674021 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add SMMU bit only if SMMU enabled	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \|	If SMMU is disabled, we should not add the SMMU bit to addresses. Change-Id: I6dd82e18b63474fb487d21f421ef06467551595b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/673250 Reviewed-by: Adeel Raza <araza@nvidia.com> Tested-by: Adeel Raza <araza@nvidia.com>
*	gpu: nvgpu: Update gk20a and gm20b headers	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update gk20a and gm20b headers with bar2 register block registers. Also updated gm20b ctxsw headers with latest tool output. Bug 1587825 Change-Id: I9d1c459e03051278e7e79806803aaf71655f0dc5 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/672124 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Do not panic if PMU/regops not supported	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Fix panics when using regops when PMU is disabled, or when whitelists have not been defined. Bug 1592505 Change-Id: I316c98147c54be7b1114ad23049ce3a634d4805e Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/671841
*	gpu: nvgpu: Make locally used function static	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \|	Make gr_gm20b_alloc_gr_ctx static. It is used only in the same file. Bug 200067946 Change-Id: I484ff84ebe9a356f251db5a14ca0e60db64578bf Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/673267
*	gpu: nvgpu: make larger address space work	Alex Waterman	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement several fixes for allowing the GVA address space to grow to larger than 32GB and increase the address space to 128GB. o Implement dynamic allocation of PDE backing pages. The memory to store the PDE entries was hard coded to 1 page. Now the number of pages necessary is computed dynamically based on the size of the address space and the size of large pages. o Fix an arithmetic problem in the gm20b sparse texture code that caused large address spaces to be truncated when sparse PDEs/PTEs were being filled in. This caused a kernel panic when freeing the address space since a lot of the backing PTE memory was not allocated. o Change the address space split for large and small pages. Small pages now occupy the bottom 16GB of the address space. Large pages are used for the rest of the address space. Now, with a 128GB address space, there are 112GB of large page GVA available. This patch exists to allow large (16GB) sparse textures to be allocated without running into lack of memory issues and kernel panics. Bug 1574267 Change-Id: I7c59ee54bd573dfc53b58c346156df37a85dfc22 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/671204 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: enable ce2 interrupts	Sam Payne	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	enables non-blocking interrupts in ce2 all other ce2 interrupts are cleared and not handled. bug 200036089 Change-Id: I9f47b06c677c72ac523019e6a3f70fedd07830a2 Signed-off-by: Sam Payne <spayne@nvidia.com> Reviewed-on: http://git-master/r/671783 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gm20b: update clock gating lists	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Bug 1584688 Change-Id: I9c0f3dcd3287ec8ced3520847b44a6a6a4c55cec Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/658550 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix struct file memleak in alloc_as	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Free also newly allocated struct file in error conditions with fput, and pair it by not trying to release the resulting null as_share on release. Bug 1597056 Change-Id: Ifad5c3a829b2c459ed6a738ecdc1ac2ac7e1678a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/671527 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: correct channel open sequence	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Corrected sequence to bind and enable channel only afer channel gpfifo alloction done. Bug 1591647 Change-Id: I539458d1b666c0403cca1abcf8271b9c8c09f52c Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/671208 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: Set lockboost size for compute	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	For compute channel on gk20a, set lockboost size to zero. Bug 1573856 Change-Id: I369cebf72241e4017e7d380c82caff6014e42984 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/594843 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: gm20b: Enable CTA preemption	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	CTA preemption needs to be enabled by setting a value in context. Set it for gm20b. Bug 200063473 Bug 1517461 Change-Id: I080cd71b348d08f834fd23ebbe7443dba79224db Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/661299
*	gpu: nvgpu: unify instance block creation	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	Reduce copypaste code in instance block allocation and deletion with functions purposed for that. Change-Id: I2c8ae6a317ac89e2c857dde4296cb4316b8aaafe Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/668698 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Pre-Population of zbc entries	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The default zbc entries were never populated in zbc HW table because the conditional flag "gr->sw_ready" was always set thus avoided the zbc default loading function call. Now zbc default loading would happen only during boot time in sw structure.Hw zbc regs would be loaded from that structure every time a railgate exit happens. Bug 1580210 Change-Id: Ie3e40738cbc84cf724c3f3871f15b17a5c84025a Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/662306 Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Tested-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>