summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/nvgpu/gk20a/cde_gk20a.c
Commit message (Collapse)AuthorAge
...
* gpu: nvgpu: Prepare for per-GPU CDE program numbersSami Kiminki2015-08-19
| | | | | | | | | | | | | | | Add gpu_ops for CDE, and add get_program_numbers function pointer for determining horizontal and vertical CDE swizzler programs. This allows different GPUs to have their own specific requirements for choosing the CDE firmware programs. Bug 1604102 Change-Id: Ib37c13abb017c8eb1c32adc8cbc6b5984488222e Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/784899 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: fix channel close sequenceDeepak Nibade2015-07-16
| | | | | | | | | | | | | | | | | | | | | | | | | | In gk20a_cde_remove_ctx(), current sequence is as below - gk20a_channel_close() - gk20a_deinit_cde_img() - gk20a_free_obj_ctx() But gk20a_free_obj_ctx() needs reference to channel and hence below crash is seen : [ 3901.466223] Unable to handle kernel paging request at virtual address 00001624 ... [ 3901.535218] PC is at gk20a_free_obj_ctx+0x14/0xb0 [ 3901.539910] LR is at gk20a_deinit_cde_img+0xd8/0x12c Fix this by closing the channel after gk20a_deinit_cde_img() Bug 1625901 Change-Id: Ic2dc5af933b6d6ef8982c2b9f0caa28df204051f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/770322 GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
* gpu: nvgpu: Implement priv pagesTerje Bergstrom2015-07-03
| | | | | | | | Implement support for privileged pages. Use them for kernel allocated buffers. Change-Id: I720fc441008077b8e2ed218a7a685b8aab2258f0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/761919
* gpu: nvgpu: Initial MAP_BUFFER_BATCH implementationSami Kiminki2015-06-30
| | | | | | | | | | | | | | | | | | | Add batch support for mapping and unmapping. Batching essentially helps transform some per-map/unmap overhead to per-batch overhead, namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB invalidates. Batching with size 64 has been measured to yield >20x speed-up in low-level fixed-address mapping microbenchmarks. Bug 1614735 Bug 1623949 Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/733231 (cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91) Reviewed-on: http://git-master/r/763812 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: add per-channel refcountingKonsta Holtta2015-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add reference counting for channels, and wait for reference count to get to 0 in gk20a_channel_free() before actually freeing the channel. Also, change free channel tracking a bit by employing a list of free channels, which simplifies the procedure of finding available channels with reference counting. Each use of a channel must have a reference taken before use or held by the caller. Taking a reference of a wild channel pointer may fail, if the channel is either not opened or in a process of being closed. Also, add safeguards for protecting accidental use of closed channels, specifically, by setting ch->g = NULL in channel free. This will make it obvious if freed channel is attempted to be used. The last user of a channel might be the deferred interrupt handler, so wait for deferred interrupts to be processed twice in the channel free procedure: once for providing last notifications to the channel and once to make sure there are no stale pointers left after referencing to the channel has been denied. Finally, fix some races in channel and TSG force reset IOCTL path, by pausing the channel scheduler in gk20a_fifo_recover_ch() and gk20a_fifo_recover_tsg(), while the affected engines have been identified, the appropriate MMU faults triggered, and the MMU faults handled. In this case, make sure that the MMU fault does not attempt to query the hardware about the failing channel or TSG ids. This should make channel recovery more safe also in the regular (i.e., not in the interrupt handler) context. Bug 1530226 Bug 1597493 Bug 1625901 Bug 200076344 Bug 200071810 Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/448463 (cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353) Reviewed-on: http://git-master/r/755147 Reviewed-by: Automatic_Commit_Validation_User
* gpu: nvgpu: Use common allocator for compbit storeTerje Bergstrom2015-04-04
| | | | | | | | | | | | Reduce amount of duplicate code around memory allocation by using common helpers, and common data structure for storing results of allocations. Bug 1605769 Change-Id: I7c1662b669ed8c86465254f6001e536141051ee5 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/720435
* gpu: nvgpu: Implement common allocator and mem_descTerje Bergstrom2015-04-04
| | | | | | | | | | Introduce mem_desc, which holds all information needed for a buffer. Implement helper functions for allocation and freeing that use this data type. Change-Id: I82c88595d058d4fb8c5c5fbf19d13269e48e422f Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/712699
* Revert "gpu: nvgpu: cache cde compbits buf mappings"Konsta Holtta2015-04-04
| | | | | | | | | | This reverts commit 9968badd26490a9d399f526fc57a9defd161dd6c. The commit accidentally introduced some memory leaks. Change-Id: I00d8d4452a152a8a2fe2d90fb949cdfee0de4c69 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/714288 Reviewed-by: Juha Tukkinen <jtukkinen@nvidia.com>
* gpu: nvgpu: remove support for CDE firmware v0Jussi Rasanen2015-04-04
| | | | | | | | | | | | | | -CDE firmware v0 is not used anymore so we can remove support for it. -Bump the threshold for a large surface warning to 8k. Bug 1566740 Change-Id: Ia0434a04cdd453a10a8de08d259e92e6b9a3e964 Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/709452 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cache cde compbits buf mappingsKonsta Holtta2015-04-04
| | | | | | | | | | | | | | | | don't unmap compbits_buf explicitly from system vm early but store it in the dmabuf's private data, and unmap it later when all user mappings to that buffer have been disappeared. Bug 1546619 Change-Id: I333235a0ea74c48503608afac31f5e9f1eb4b99b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/661949 (cherry picked from commit ed2177e25d9e5facfb38786b818330798a14b9bb) Reviewed-on: http://git-master/r/661835 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: add a new CDE parameterJussi Rasanen2015-04-04
| | | | | | | | | | | Add TYPE_PARAM_GOBS_PER_COMPTAGLINE_PER_SLICE. Change-Id: I7cbf7b6db6642a61629ba06f7887bd58af3dc28f Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/673152 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: gk20a: Set lockboost size for computeTerje Bergstrom2015-04-04
| | | | | | | | | | | For compute channel on gk20a, set lockboost size to zero. Bug 1573856 Change-Id: I369cebf72241e4017e7d380c82caff6014e42984 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/594843 GVS: Gerrit_Virtual_Submit
* gpu: nvgpu: cde: ignore spurious context releasesKonsta Holtta2015-04-04
| | | | | | | | | | | | | | | | Gpu channels may get spurious updates from at least nonstalling semaphore wait interrupts. Protect data structure sanity by ignoring releases on already released (= not in use) cde contexts. Bug 200062826 Change-Id: I5940a7557e902bcfcff1a7e8e4593472d9ac306c Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/666235 (cherry picked from commit 47dc2f41eb8054b099b6eb9a4a7d82c97295d415) Reviewed-on: http://git-master/r/666657 GVS: Gerrit_Virtual_Submit Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
* gpu: nvgpu: cde: allow duplicate finish signalsKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | Channel update callback for a channel that has no more cde jobs signals that a cde context is free. Spurious channel updates may still happen from at least nonstalling semaphore wait interrupts. Instead of scary WARNs, use only gk20a_dbg_info() for info prints in these harmless situations, and double check that only the first update starts a deleter work for temporary contexts. Change-Id: I68de8f35e2c366206c6efac3ee97025239e8bba2 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> (cherry-picked from commit f56a941b4962c5479291cae48e2abca6067e3f13) Reviewed-on: http://git-master/r/660849 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: remove unused obj_idsKonsta Holtta2015-03-18
| | | | | | | | | | | | | obj_id from gk20a_alloc_obj_ctx is not used and calling free_obj_ctx is effectively a no-op, since the corresponding channel is also freed. Bug 200059216 Change-Id: Icbe2cf5dc21d50cb007bf73829705451ada106ac Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/655368 Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: fix off-by-1 in buf allocationKonsta Holtta2015-03-18
| | | | | | | | | | Bug 200046882 Change-Id: I515e972f84cb7e1b17eef42ade6a4eaf0f8d71f8 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/559332 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: wait for ctx deletion before getKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | | | | | Wait for possible temp context deletion to finish properly before passing contexts around later, to prevent situations where the context deleter scheduling would have been completed, but running it would not, and a new one could have been scheduled again. When finished, schedule the deleter before freeing the context back to use to prevent races. Warn in impossible situations when these double deletions would happen. Bug 200054186 Bug 200052943 Change-Id: I23ca0d1081eea77d0e453b9038adc914909b5f48 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/603439 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: combine init and convert passesKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | CDE context needs to be initialized in the first run using a separate initialization gpfifo before the actual conversion. To prevent a race condition, include both of them in a single gpfifo whenever the initialization is performed. Bug 200052943 Change-Id: I7eb09a906c0374825df71eba969e4596b94e5ff2 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/602888 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: add trace events for ctx allocsKonsta Holtta2015-03-18
| | | | | | | | | | | | Trace cde context allocation and deallocation with ftrace. Bug 200052943 Change-Id: Ieeb625166662971fb3eb3fb29c986fdb6809c10b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/602886 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: warn on double finish and releaseKonsta Holtta2015-03-18
| | | | | | | | | | | | | | Add WARN to conditions that should never happen, to help debugging any context issues. Bug 200052943 Change-Id: Ibe2a9507f3a62bb7b2e263ff3ff21a24a092a971 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/602885 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: restrict context countKonsta Holtta2015-03-18
| | | | | | | | | | | | | Add an upper limit for cde contexts, and wait for a while if a new context is queried and the limit has been exceeded. This happens only under very high load. If the timeout is exceeded, report -EAGAIN. Change-Id: I1fa47ad6cddf620eae00cea16ecea36cf4151cab Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/601719 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: report use counts to debugfsKonsta Holtta2015-03-18
| | | | | | | | | | | Create debugfs nodes for ctx_count, ctx_usecount and ctx_cont_top. Change-Id: I1360853b2650d37a96c8adf76368d48d9b457909 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/602860 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: do not rearm deleter on failureKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | | | | | Rescheduling the temp context deleter when it is not immediately possible is not necessary, and complicates things. Don't do it. The context would anyway be used later when its time comes in the free list, and the deletion would then be retried. This simplifies canceling the works when shutting down or going into suspend, since re-canceling the possibly rescheduled work is not needed. Releasing the app mutex is still necessary when deleting the whole cde. Bug 200052943 Change-Id: I06afe1766097a78d7bcb93f3140855799ac903ca Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/601035 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: cancel ctx deleter when using itKonsta Holtta2015-03-18
| | | | | | | | | | | | Cancel the temporary context deleter work when acquiring a channel for use, to prevent re-scheduling when the same context would be quickly re-used and finished twice or more in a row before deletion. Change-Id: Iadd8230d9462adc451e506152a24c50a920a59e3 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/600273 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: fix err in oom conditionKonsta Holtta2015-03-18
| | | | | | | | | | | | | | use a correct, negative error sign in ENOMEM when gk20a_gmmu_map runs out of memory. Change-Id: I4fa8a2cf359a5c98cebdf64d4e3fcc96f478f779 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/594397 Reviewed-by: Jussi Rasanen <jrasanen@nvidia.com> Tested-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: Combine H and V passesJussi Rasanen2015-03-18
| | | | | | | | | | | | | | | | When using CDE firmware v1, combine H and V swizzling passes into one pushbuffer submission. This removes one GPU context switch, almost halving the time taken for swizzling. Map only the compbit part of the destination surface. Bug 1546619 Change-Id: I95ed4e4c2eefd6d24a58854d31929cdb91ff556b Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/553234 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvpgu: cde: fix timeout mgmt, use two listsKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | | | | | | | | If a channel timeout occurs, reload only the particular context/channel where the timeout occurred, instead of destroying whole cde. Reloading happens by allocating a replacement context and marking the offending channel as soon-to-be-deleted. Clean up the code by using two separate lists for free and used contexts. Rename channel deallocation/allocation functions to better describe what they do, and annotate the functions that need locking. Also do not wait for channel idle before submitting, since the acquired context has a ready channel already. Bug 200046882 Change-Id: I4155a85ea0ed79e284309eb2ad0042df3938f1e2 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/591235 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: fix sparse warningsDeepak Nibade2015-03-18
| | | | | | | | | | | | | | | | | | | Fix below sparse warnings : warning: Using plain integer as NULL pointer warning: symbol <variable/funcion> was not declared. Should it be static? warning: Initializer entry defined twice Also, remove dead functions Bug 1573254 Change-Id: I29d71ecc01c841233cf6b26c9088ca8874773469 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/593363 Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sachin Nikam <snikam@nvidia.com>
* gpu: nvgpu: cde: cancel delayed_work during suspendSeshendra Gadagottu2015-03-18
| | | | | | | | | | | | | | | | | During gpu suspend, cancel all pending delayed cde work to avoid issues of scheduling this delayed work during suspend/resume when gpu is not ready. Bug 1574000 Change-Id: I2b6bfa489435a781dc576a077f9af01b1e1628ce Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/593557 Reviewed-by: Shridhar Rasal <srasal@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com> Tested-by: Prashant Gaikwad <pgaikwad@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
* gpu: nvgpu: cde: list for contexts, defer deletionKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | Instead of current preallocated array plus dynamically allocated temporary contexts, use a linked list in LRU fashion, always storing free contexts at the beginning of the list. Initialize the preallocated contexts to the list and store dynamically allocated temporaries there too for quick reuse as needed, with a delayed scheduled work for deleting temporaries when the high load has diminished. Bug 200040211 Change-Id: Ibc75a0150109ec9c44b2eeb74607450990584b18 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/562856 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* Revert "gpu: nvgpu: GR and LTC HAL to use const structs"Sam Payne2015-03-18
| | | | | | | | | | | This reverts commit 41b82e97164138f45fbdaef6ab6939d82ca9419e. Change-Id: Iabd01fcb124e0d22cd9be62151a6552cbb27fc94 Signed-off-by: Sam Payne <spayne@nvidia.com> Reviewed-on: http://git-master/r/592221 Tested-by: Hoang Pham <hopham@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Mitch Luban <mluban@nvidia.com>
* gpu: nvgpu: cde: move GK20A_CDE to platform dataKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | | CONFIG_GK20A_CDE has not even been used for enabling CDE, just for initializing it at boot time, and it was disabled; initialization has been done late when the engine is first used. Remove the config setting and add information about CDE support in gk20a platform data, forcing the initialization at boot time. Boot time init removes rare race conditions when CDE would be initialized by first user. Bug 200046882 Change-Id: I85d5fb73dc27acbbe203138d25f6e342de030d93 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/562855 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: GR and LTC HAL to use const structsTerje Bergstrom2015-03-18
| | | | | | | | | | | Convert GR and LTC HALs to use const structs, and initialize them with macros. Bug 1567274 Change-Id: Ia3f24a5eccb27578d9cba69755f636818d11275c Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/590371
* gpu: nvgpu: create cde context dynamically as neededKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | When the preallocated buffer of cde contexts is full, allocate a new temporary one dynamically. This needs to create a totally new command buffer but fixes possible but rare lockups in case of circular dependencies. The temporary is deleted after the channel job has finished. Bug 200040211 Change-Id: Ic18d1441e8574a3e562a22f9b9dfec1acdf72b71 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/552036 GVS: Gerrit_Virtual_Submit Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
* gpu: nvgpu: find unused cde context instead of lruKonsta Holtta2015-03-18
| | | | | | | | | | | | | | | | When preparing a new job, loop initially through the small number of preallocated contexts and try to find one that is already finished, instead of blindly getting the next slot in lru order. If all have work to do, select next in lru order. This reduces the possibility of a deadlock between cde tasks. Bug 200040211 Change-Id: Ib695c0a8e1bcec095d50ec4f2522f3aad39ce97b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/552035 GVS: Gerrit_Virtual_Submit Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
* gpu: nvgpu: cde: set err properly in oom conditionKonsta Holtta2015-03-18
| | | | | | | | | | | | When gk20a_gmmu_map runs out of memory, set the error code before returning early, so that caller knows about that cde load didn't succeed and wouldn't use the bad context. Change-Id: I1e166c78e39f07df941a29fc4e392a853d97a5c6 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/557273 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: CDE optimizationsJussi Rasanen2015-03-18
| | | | | | | | | | | | | | | | | -Change cde_buf to use writecombined cpu mapping. -Since reading writecombined cpu data is still slow, avoid reads in gk20a_replace_data by checking whether a patch overwrites a whole word. -Remove unused distinction between src and dst buffers in cde_convert. -Remove cde debug dump code as it causes a perf hit. Bug 1546619 Change-Id: Ibd45d9c3a3dd3936184c2a2a0ba29e919569b328 Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/553233 Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Tested-by: Arto Merilainen <amerilainen@nvidia.com>
* gpu: nvgpu: rename gpu ioctls and structs to nvgpuKonsta Holtta2015-03-18
| | | | | | | | | | | | | | To help remove the nvhost dependency from nvgpu, rename ioctl defines and structures used by nvgpu such that nvhost is replaced by nvgpu. Duplicate some structures as needed. Update header guards and such accordingly. Change-Id: Ifc3a867713072bae70256502735583ab38381877 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/542620 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: check return values from busyKonsta Holtta2015-03-18
| | | | | | | | | | | check gk20a_busy return value in cde converter code paths. Bug 200040921 Change-Id: Ibad36df5877e325636a0a6ccc30c0d3d076ca941 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/540006
* gpu: nvgpu: cde: CDE swizzling optimizationsJussi Rasanen2015-03-18
| | | | | | | | | | | | | | | | | Change CDE swizzling shader kernel size to 8x8 to avoid waste with relatively small surfaces. Map compbit backing store and destination surface as cacheable. Clean up kernel size calculation. Bug 1546619 Change-Id: Ie97c019b4137d2f2230da6ba3034387b1ab1468a Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/501158 Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Tested-by: Arto Merilainen <amerilainen@nvidia.com>
* gpu: nvgpu: Support ZBC color trackingLauri Peltonen2015-03-18
| | | | | | | | | | | | | | | | | | The compression state tracking user space API already accepts and returns the ZBC color used for the surface. Actually store the color in kernel so that the feature works. Bug 1536227 Bug 1524301 Change-Id: I264e1eeb90f0c4d40fe35fc2479b0ce83e19a7d7 Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-on: http://git-master/r/497476 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Tested-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: Defer CDE app initializationLauri Peltonen2015-03-18
| | | | | | | | | | | | | | | | | | | | Defer CDE app initialization to the point where we actually need to launch the app. This allows us to use the compression state API also on T124 where we never use the CDE app. Also return the error code correctly from gk20a_prepare_compressible_read. Bug 1524301 Change-Id: If79fbe161e8dc9353b9f5fa0dfcd7f30b00d29b4 Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-on: http://git-master/r/497351 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Jussi Rasanen <jrasanen@nvidia.com> Tested-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: Map backing store as read-onlyLauri Peltonen2015-03-18
| | | | | | | | | The cde shader will only read data from the global compbit backing store. Map it as read-only to enforce this. Change-Id: If5be44b8daedd5e7fdee650a6e76befa7bdecfd6 Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-on: http://git-master/r/486679
* gpu: nvgpu: cde: Re-use contextArto Merilainen2015-03-18
| | | | | | | | | | | | | | | | Currently cde reinitialises the context each time before submitting work to the channel. This was done to ensure that we are able to get clean context for the shader during development phase. However, as the shader has been tested to work w/o reinitialising the context, we can remove the reinitialisation to gain better performance. Change-Id: If0b0e03133058528da943faaeb72ca500d3ddb14 Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/486673 Reviewed-by: Jussi Rasanen <jrasanen@nvidia.com> Tested-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com>
* gpu: nvgpu: cde: Allow large surfacesArto Merilainen2015-03-18
| | | | | | | | | | | | | Currently cde swizzling application forces upper limit to surface size. The limitation is artificial (i.e. nothing prevents shader handling larger surfaces). Therefore, make the error condition a warning. Change-Id: I8f0cda7f2e9e9ecc90589e5a4b4091abcb513482 Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/454591 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: gk20a: cde: Add base_post_divide paramArto Merilainen2015-03-18
| | | | | | | | | | | This patch adds a parameter to communicate the compression bit backing store address we write to the hardware. Change-Id: Ibc0e3d8304e893ddf15b4e03b405c7d85a73e95b Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/454510 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
* gpu: nvgpu: cde: Allow passing shader parametersArto Merilainen2015-03-18
| | | | | | | | | | | | | | This patch adds support to pass shader parameters through debugfs. These parameters are required to change the shader behaviour without reloading the firmware image. Change-Id: Ib0ff773d9425aa9fcc58655717cccafcfbaf7bfd Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/453462 Reviewed-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Jussi Rasanen <jrasanen@nvidia.com>
* gpu: nvgpu: Add compression state IOCTLsLauri Peltonen2015-03-18
| | | | | | | Bug 1409151 Change-Id: I29a325d7c2b481764fc82d945795d50bcb841961 Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
* gpu: nvgpu: CDE supportArto Merilainen2015-03-18
This patch adds support for executing a precompiled GPU program to allow exporting GPU buffers to other graphics units that have color decompression engine (CDE) support. Bug 1409151 Change-Id: Id0c930923f2449b85a6555de71d7ec93eed238ae Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Reviewed-on: http://git-master/r/360418 Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>