nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: support preprocessing of SM exceptions	Deepak Nibade	2016-01-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support preprocessing of SM exceptions if API pointer pre_process_sm_exception() is defined Also, expose some common APIs Bug 200156699 Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/925883 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add high priority channel interleave	Peter Pipkorn	2016-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interleave all high priority channels between all other channels. This reduces the latency for high priority work when there are a lot of lower priority work present, imposing an upper bound on the latency. Change the default high priority timeslice from 5.2ms to 3.0 in the process, to prevent long running high priority apps from hogging the GPU too much. Introduce a new debugfs node to enable/disable high priority channel interleaving. It is currently enabled by default. Adds new runlist length max register, used for allocating suitable sized runlist. Limit the number of interleaved channels to 32. This change reduces the maximum time a lower priority job is running (one timeslice) before we check that high priority jobs are running. Tested with gles2_context_priority (still passes) Basic sanity testing is done with graphics_submit (one app is high priority) Also more functional testing using lots of parallel runs with: NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 –finish plus multiple: NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw –drawsperframe 20000 –triangles 50 –runtime 30 -finish Previous to this change, the relative performance between high priority work and normal priority work comes down to timeslice value. This means that when there are many low priority channels, the high priority work will still drop quite a lot. But with this change, the high priority work will roughly get about half the entire GPU time, meaning that after the initial lower performance, it is less likely to get lower in performance due to more apps running on the system. This change makes a large step towards real priority levels. It is not perfect and there are no guarantees on anything, but it is a step forwards without any additional CPU overhead or other complications. It will also serve as a baseline to judge other algorithms against. Support for priorities with TSG is future work. Support for interleave mid + high priority channels, instead of just high, is also future work. Bug 1419900 Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98 Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com> Reviewed-on: http://git-master/r/805961 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: abstract set sm debug mode	Richard Zhao	2016-01-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new operation g->ops.gr.set_sm_debug_mode and move native implementation to gr_gk20a.c It's preparing for adding vgpu set sm debug mode hook. JIRA VFND-1006 Bug 1594604 Change-Id: Ia5ca06a86085a690e70bfa9c62f57ec3830ea933 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/923232 (cherry picked from commit 032552b54c570952d1e36c08191e9f70b9c59447) Reviewed-on: http://git-master/r/835614 Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: Control comptagline assignment from kernel	Terje Bergstrom	2016-01-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Maxwell comptaglines are assigned per 128k, but preferred big page size for graphics is 64k. Bit 16 of GPU VA is used for determining which half of comptagline is used. This creates problems if user space wants to map a page multiple times and to arbitrary GPU VA. In one mapping the page might be mapped to lower half of 128k comptagline, and in another mapping the page might be mapped to upper half. Turn on mode where MSB of comptagline in PTE is used instead of bit 16 for determining the comptagline lower/upper half selection. Bug 1704834 Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/924322 Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: Make access map chip specific	Terje Bergstrom	2015-12-07
\| \| \| \| \| \| \| \|	Bug 1692373 Change-Id: Ie3fc3e02fa7b0636da464d6ee1c28da7a4543ec2 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/812353
*	gpu: nvgpu: Wait for pause for SMs	sujeet baranwal	2015-12-04
\| \| \| \| \| \| \| \| \| \| \| \|	SM locking & register reads Order has been changed. Also, functions have been implemented based on gk20a and gm20b. Change-Id: Iaf720d088130f84c4b2ca318d9860194c07966e1 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Signed-off-by: ashutosh jain <ashutoshj@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/837236
*	gpu: nvgpu: IOCTL to disable watchdog per-channel	Deepak Nibade	2015-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable watchdog per-channel Also, if watchdog is disabled, we currently schedule the worker with MAX timeout. Instead of this, do not schedule any worker if watchdog is disabled Bug 1683059 Bug 1700277 Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/837962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: abstract set mmu debug mode	Richard Zhao	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new operaton g->ops.mm.set_debug_mode and let other places that set debug mode call this callback. It's preparing for adding vgpu set mmu debug mode hook. JIRA VFND-1005 Bug 1594604 Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/833497 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: User-space managed address space support	Sami Kiminki	2015-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which enables creating userspace-managed GPU address spaces. When an address space is marked as userspace-managed, the following changes are in effect: - Only fixed-address mappings are allowed. - VA space allocation for fixed-address mappings is not required, except to mark space as sparse. - Maps and unmaps are always immediate. In particular, the mapping ref increments at kickoffs and decrements at job completion are skipped. Bug 1614735 Bug 1623949 Bug 1660392 Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/738558 Reviewed-on: http://git-master/r/833253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: ZBC update without idle	Terje Bergstrom	2015-11-17
\| \| \| \| \| \| \| \| \| \|	Do ZBC updates without forcing engine idle first. Bug 1698013 Change-Id: I99218c8cfd02be05dace2003b8d91921765f7ca9 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/829145
*	gpu: nvgpu: Do not use G_ELPG_FLUSH	Terje Bergstrom	2015-11-10
\| \| \| \| \| \| \| \| \| \| \| \| \|	G_ELPG_FLUSH is protected in some chips. Use L2 flush operations instead. Bug 1698618 Change-Id: I984a8ace8bcd0ad2d4a4e2d63af75a342bdeb75a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/828656 (cherry picked from commit ba9075fa43975112a221d37d246f0b8f5af40fab) Reviewed-on: http://git-master/r/829415
*	gpu: nvgpu: use platform data for ptimer source rate	Seshendra Gadagottu	2015-11-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of depending on clock frame-work, use platform data for ptimer source rate. Removed ptimerscaling10x platform data, and use ptimer source frequency to calculate ptimerscaling factor. Reviewed-on: http://git-master/r/819030 (cherry picked from commit dd291334d54dab80cab7eb1656dffc48a59610b4) Change-Id: I7638ce9875a6e440bbfc2ba2da0d0b094b2700ff Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/827300 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to disable timeouts	Deepak Nibade	2015-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_DBG_GPU_IOCTL_TIMEOUT to support disabling/re-enabling scheduler timeout from user space If user space application is closed without re-enabling the timeouts, kernel will restore the timeouts' state while releasing the debug session This is needed for debugging purpose Bug 1514061 Change-Id: I32efb47ad09d793f3e7fd8f0aaa9720c8bc91272 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/788176 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update thermal programming	Seshendra Gadagottu	2015-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add required fileds and values for thermal slow-down settings in thermal header file and implemented chip specific thermal register programming Reviewed-on: http://git-master/r/822199 (cherry picked from commit 9e8a745b8295af002b9780c83caa8dc7b22cc737) Change-Id: I016b18ed230fa6c104eada2e166ccd1a5f2ace36 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/823012 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Disable only channel at zcull bind	Terje Bergstrom	2015-10-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	At zcull bind we disable whole GR engine. This is unnecessary, so instead disable only the channel and make sure it's unloaded. Introduces also an API in fifo_gk20a.c to do the channel disable. gr_gk20a_ctx_zcull_setup() was always passed true as last parameter, so remove parameter. Change-Id: I7ae6e101ec7d1ab3f6ee4e9bcc442d23dbd21247 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/787570
*	gpu: nvgpu: add support to remove bar2 mm	Seshendra Gadagottu	2015-10-12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Adding support to remove bar2 mm on gpu module remove. Change-Id: Id5f680b1abf7056da9871d5460d9fbc40422673e Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/814571 (cherry picked from commit e7c6c87dd6b0893d26a9a3b4568121a691e1eb3c) Reviewed-on: http://git-master/r/815429 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: set correct timeslice value	Kirill Artamonov	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scale timeslice register value based on platform specific ptimer scale koefficient. Expose timeslice values through debugfs to simplify performance tuning. bug 1605552 bug 1603226 Change-Id: I49f86f22d58d26a366ee1b5f5a9ab9d7f896ad25 Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com> Reviewed-on: http://git-master/r/800007 (cherry picked from commit 00c85ef24cf28ffaa81eb53fff7edef1c699220a) Reviewed-on: http://git-master/r/808251 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: scale ptimer based timeouts	Vijayakumar	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1603226 host based timeouts use ptimer for detecting timeouts. on gk20a and gm20b ptimer runs 2.6x slower. scale the fifo_eng_timeout to account for this Change-Id: Ie44718382953e36436ea47d6e89b9a225d5c2070 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/799983 (cherry picked from commit d1d837fd09ff0f035feff1757c67488404c23cc6) Reviewed-on: http://git-master/r/808250 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update slcg xbar prod settings	Seshendra Gadagottu	2015-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 1689806 Change-Id: I368ad8fb64e49b21ba61c519def1f86e1ca6e492 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/806116 (cherry picked from commit 1a3bbe989a795d379703e7f4b915f6e1bb38c2c3) Reviewed-on: http://git-master/r/805480 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: ELPG init & statistics update	Mahantesh Kumbar	2015-09-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Required init param to start elpg - change in statistics dump Bug 1684939 Change-Id: I26dca52079f08b8962e9cb758831910207610220 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/802456 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/806179 Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add CDE bits in FECS header	sujeet baranwal	2015-09-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case of CDE channel, T1 (Tex) unit needs to be promoted to 128B aligned, otherwise causes a HW deadlock. Gpu driver makes changes in FECS header which FECS uses to configure the T1 promotions to aligned 128B accesses. Bug 200096226 Change-Id: I8a8deaf6fb91f4bbceacd491db7eb6f7bca5001b Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/804625 Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add support for CDE scatter buffers	Jussi Rasanen	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for CDE scatter buffers. When the bus addresses for surfaces are not contiguous as seen by the GPU (e.g., when SMMU is bypassed), CDE swizzling needs additional per-page information. This information is populated in a scatter buffer when required. Bug 1604102 Change-Id: I3384e2cfb5d5f628ed0f21375bdac8e36b77ae4f Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/789436 Reviewed-on: http://git-master/r/791243 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: implement per-channel watchdog	Deepak Nibade	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement per-channel watchdog/timer as per below rules : - start the timer while submitting first job on channel or if no timer is already running - cancel the timer when job completes - re-start the timer if there is any incomplete job left in the channel's queue - trigger appropriate recovery method as part of timeout handling mechanism Handle the timeout as per below : - get timed out channel, and job data - disable activity on all engines - check if fence is really pending - get information on failing engine - if no engine is failing, just abort the channel - if engine is failing, trigger the recovery Also, add flag "ch_wdt_enabled" to enable/disable channel watchdog mechanism. Watchdog can also be disabled using global flag "timeouts_enabled" Set the watchdog time to be 5s using macro NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS Bug 200133289 Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/797072 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: HAL to write DMATRFBASE	Mahantesh Kumbar	2015-09-15
\| \| \| \| \| \| \| \| \| \|	Bug 200137618 Change-Id: I18b980876e93c3f7287082701e1d2b998cd33114 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/798777 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cyclestats snapshot permissions rework	Leonid Moiseichuk	2015-09-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cyclestats snapshot feature is expected for new devices. The detection code was isolated in separate function and run-time check added to validate/allow ioctl calls on the current GPU. Bug 1674079 Change-Id: Icc2f1e5cc50d39b395d31d5292c314f99d67f3eb Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/781697 (cherry picked from commit bdd23136b182c933841f91dd2829061e278a46d4) Reviewed-on: http://git-master/r/793630 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Inject function addresses	Yogesh	2015-08-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Inject function addresses of gk20a_do_idle and gk20a_do_unidle once the nvgpu module loads. Bug 1476801 Change-Id: I67a8ae7fb654524616c2c2c710013cbc097a3f32 Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com> Reviewed-on: http://git-master/r/785047 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Fix NS boot transcfg	Supriya	2015-08-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Bug 1667322 Accommodate for transcfg address change Change-Id: I7054202b8ce3be1a3fbfe0465e662be6f9740eb3 Signed-off-by: Supriya <ssharatkumar@nvidia.com> Reviewed-on: http://git-master/r/780326 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Inject function address from nvgpu	Yogesh	2015-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch inserts the function address of gk20a_debug_dump_device into host data struct once the nvgpu module loads and removes it during unload. Bug 1476801 Change-Id: If49262208325b2aa0807705c26086e6d7c81632c Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com> Reviewed-on: http://git-master/r/779397 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Prepare for per-GPU CDE program numbers	Sami Kiminki	2015-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add gpu_ops for CDE, and add get_program_numbers function pointer for determining horizontal and vertical CDE swizzler programs. This allows different GPUs to have their own specific requirements for choosing the CDE firmware programs. Bug 1604102 Change-Id: Ib37c13abb017c8eb1c32adc8cbc6b5984488222e Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/784899 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement priv pages	Terje Bergstrom	2015-07-03
\| \| \| \| \| \| \| \|	Implement support for privileged pages. Use them for kernel allocated buffers. Change-Id: I720fc441008077b8e2ed218a7a685b8aab2258f0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/761919
*	gpu: nvgpu: Move clk bypass div code to clk init	Terje Bergstrom	2015-07-03
\| \| \| \| \| \| \| \| \| \| \| \| \|	Clock bypass divider was changed just before resetting priv ring. Move the code to a new clk op instead so that it is executed only on gk20a. Change-Id: Ic8084a4a5fac23770f50b50f910ced2543ba0f28 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/764970 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Aleksandr Frid <afrid@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
*	gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation	Sami Kiminki	2015-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add batch support for mapping and unmapping. Batching essentially helps transform some per-map/unmap overhead to per-batch overhead, namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB invalidates. Batching with size 64 has been measured to yield >20x speed-up in low-level fixed-address mapping microbenchmarks. Bug 1614735 Bug 1623949 Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/733231 (cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91) Reviewed-on: http://git-master/r/763812 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: load secure gpccs using dma	Vijayakumar	2015-06-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 200080684 use new cmd defined in ucode for loading GR falcons. flip PRIV load flag in lsb header to indicate using dma. use pmu msg as cmd completion for new cmd instead of polling fecs mailbox. also move check for using dma in non secure boot path to hal. Change-Id: I22582a705bd1ae0603f858e1fe200d72e6794a81 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/761625 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add per-channel refcounting	Konsta Holtta	2015-06-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add reference counting for channels, and wait for reference count to get to 0 in gk20a_channel_free() before actually freeing the channel. Also, change free channel tracking a bit by employing a list of free channels, which simplifies the procedure of finding available channels with reference counting. Each use of a channel must have a reference taken before use or held by the caller. Taking a reference of a wild channel pointer may fail, if the channel is either not opened or in a process of being closed. Also, add safeguards for protecting accidental use of closed channels, specifically, by setting ch->g = NULL in channel free. This will make it obvious if freed channel is attempted to be used. The last user of a channel might be the deferred interrupt handler, so wait for deferred interrupts to be processed twice in the channel free procedure: once for providing last notifications to the channel and once to make sure there are no stale pointers left after referencing to the channel has been denied. Finally, fix some races in channel and TSG force reset IOCTL path, by pausing the channel scheduler in gk20a_fifo_recover_ch() and gk20a_fifo_recover_tsg(), while the affected engines have been identified, the appropriate MMU faults triggered, and the MMU faults handled. In this case, make sure that the MMU fault does not attempt to query the hardware about the failing channel or TSG ids. This should make channel recovery more safe also in the regular (i.e., not in the interrupt handler) context. Bug 1530226 Bug 1597493 Bug 1625901 Bug 200076344 Bug 200071810 Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/448463 (cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353) Reviewed-on: http://git-master/r/755147 Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: add zbc support to vgpu	Richard Zhao	2015-06-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For both adding and querying zbc entry, added callbacks in gr ops. Native gpu driver (gk20a) and vgpu will both hook there. For vgpu, it will add or query zbc entry from RM server. Bug 1558561 Change-Id: If8a4850ecfbff41d8592664f5f93ad8c25f6fbce Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/732775 (cherry picked from commit a3787cf971128904c2712338087685b02673065d) Reviewed-on: http://git-master/r/737880 (cherry picked from commit fca2a0457c968656dc29455608f35acab094d816) Reviewed-on: http://git-master/r/753278 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu:nvgpu: update channel_setup_ramfc interface	Seshendra Gadagottu	2015-06-02
\| \| \| \| \| \| \| \| \| \| \| \| \|	Pass flags parameter to channel_setup_ramfc for indicating nvgpu_alloc_gpfifo_args characteristics. Bug 1645628 Change-Id: Ia40b37c5c7b208d459aa84f1b022036dd5e1b599 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/744526 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Use HAL for waiting for GR quiet	Terje Bergstrom	2015-06-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a HAL for waiting for GR to become quiet. Use it forall cases where we require GR to be quiet, but where it does not need to be idle. Bug 1640378 Change-Id: Ic0222d595a2d049e0fa8864b069ab94a97fac143 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/745640 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com>
*	gpu: nvgpu: add secure gpccs boot support	Vijayakumar	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 200080684 keeping it disabled by default also trimming the code by removing redundant variable to check recovery. pmu quick wait now checks only for irqs which are serviced by kernel. requests pmu to bit bang gpccs ucode. Change-Id: I12ef23d6d59b507e86a129b69eab65b21d0438c6 Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/729622 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Dynamic betacb size	Terje Bergstrom	2015-05-18
\| \| \| \| \| \| \| \| \| \| \|	Allow querying and setting default betacb size via debugfs. For global buffers the value takes effect upon first boot of GPU, and has no effect after that. Bug 1628352 Change-Id: Ib63f4299249c41eab1b36cc501b525cc54211195 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/733328
*	gpu: nvgpu: updated gpmu interface data struct.	Mahantesh Kumbar	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- pmu version 19494277 is from CL 19495746 - updated gpmu interface data struct with respect to latest pmu ucode interface headers. gpmuifpg.h - 19199047 gpmuifperfmon.h - 18238819 gpmuifpmu.h - 19199047 gpmuifacr.h - 19343196 gpmuifcmn.h - 19264862 rmflcnbl.h - 19317152 Bug 200085428 Change-Id: I7db56dcf5a3038b40da37a69e8723a2e9a652e4b Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/728461 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: SMMU bypass	Terje Bergstrom	2015-05-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improve GMMU mapping code to cope with discontiguous buffers. Add debugfs entry that allows bypassing SMMU and disabling big pages. Bug 1605769 Change-Id: I14d32c62293a16ff8c7195377c75a85fa8061083 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/717503 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/737533 Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com> Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Reconfigure instance block with syncpt	Terje Bergstrom	2015-05-05
\| \| \| \| \| \| \| \| \| \| \| \|	Resetup RAMFC once sync point id is allocated for a channel. Change-Id: Idbac406bea1c94c89ef587dda08fddc740c1fadb Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/711302 Reviewed-on: http://git-master/r/737526 Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com> Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User
*	gpu: nvgpu: Per-SoC compressible page size	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Define smallest compressible page size per SoC, and use that for determining if a compressible kind should be downgraded to uncompressed. Bug 1605769 Change-Id: I7c9991ba0ae82fe533641f045e506c0b01a10d8b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/724492
*	gpu: nvgpu: add platform specific get_iova_addr()	Deepak Nibade	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add platform specific API pointer (*get_iova_addr)() which can be used to get iova/physical address from given scatterlist and flags Use this API with g->ops.mm.get_iova_addr() instead of calling API gk20a_mm_iova_addr() which makes it platform specific Bug 1605653 Change-Id: I798763db1501bd0b16e84daab68f6093a83caac2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/713089 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	tegra: gpu: disable touch boost for gpu	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Gpu boosting with input events is causing more gpu power consumption than required. To avoid this, touch boot for gpu is disabled by not registering gpu device for cfboost frame work. Current rail gate entry/exit latencies are fast enough to give smooth user experience. Bug 200087243 Change-Id: I18673d9c95a44ce9bee87e860b4edb29212dc766 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/721989 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: GM20B extended buffer definition	Sandarbh Jain	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update extended buffer definition for Maxwell. On GM20B only PERF_CONTROL0 and PERF_CONTROL5 registers are restored in extended buffer. They are needed for stopping the counters as late as possible during ctx save and start them as early as possible during context restore. On Maxwell, these registers contain the enable/disable bit. Bug 200086767 Change-Id: I59125a2f04bd0975be8a1ccecf993c9370f20337 Signed-off-by: Sandarbh Jain <sanjain@nvidia.com> Reviewed-on: http://git-master/r/717421 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Gpu characterstics enhancement	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	New members are added in nvgpu_gpu_characterstics to export more information required specially from CUDA tools. Change-Id: I907f3bcbd272405a13f47ef6236bc2cff01c6c80 Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/679202 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Removal of regops from CUDA driver	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current CUDA drivers have been using the regops to directly accessing the GPU registers from user space through the dbg node. This is a security hole and needs to be avoided. The patch alternatively implements the similar functionality in the kernel and provide an ioctl for it. Bug 200083334 Change-Id: Ic5ff5a215cbabe7a46837bc4e15efcceb0df0367 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/711758 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add DT support for gpu power-domain	Sumit Singh	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	First, defining a new structure to support gk20a power domain. Then making necessary modifications to add so as to add DT support for gpu power-domain. bug 200070810 Change-Id: I29e1c24b181e14743d3969103abfd1882d171f07 Signed-off-by: Sumit Singh <sumsingh@nvidia.com> Reviewed-on: http://git-master/r/668973 Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: support for dumping vpr/wpr info	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added support for dumping vpr/wpr info for gm20b. This dump info called when ever gk20a_mm_fb_flush is timed-out. Bug 200082817 Change-Id: I21b0372d0e3f976a189c9c428c015165b715bf88 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/711439 (cherry picked from commit b69897d71c8f6119b49ceb8d3273cdb354178cc5) Reviewed-on: http://git-master/r/712675 GVS: Gerrit_Virtual_Submit Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>