nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: fix engine reset in FECS trace	Thomas Fleury	2016-05-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In virtualization case, VM server is the only one allowed to write to ctxsw ring buffer. It will also generate an event in case of engine reset. Only generate a tracepoint on Guest OS side. EVLR-314 Change-Id: I2cb09780a9b41237fe196ef1f3515923f36a24a4 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1130743 (cherry picked from commit 4bbf9538e2a3375eb86b2feea6c605c3eec2ca40) Reviewed-on: http://git-master/r/1133614 (cherry picked from commit 2076d944db41b37143c27795b3cffd88e99e0b00) Reviewed-on: http://git-master/r/1150046 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add ctxsw channel reset event	Thomas Fleury	2016-05-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generate a ctxsw channel reset when engine needs to be reset. This event is generated by the driver, while other events are generated by FECS. JIRA ELVR-314 Change-Id: I7791cf3e538782464c37c442c871acb177484566 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1129029 (cherry picked from commit 114038a1a5d9e8941bc53f3e95115b01dd1f8c6e) Reviewed-on: http://git-master/r/1134379 (cherry picked from commit 15fa2a7b48a0937dfd449ca0c4ed5ad3a863d6bf) Reviewed-on: http://git-master/r/1123916 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix comment on binding channels to debug session	Deepak Nibade	2016-05-12
\| \| \| \| \| \| \| \| \| \| \|	Bug 1646259 Change-Id: Ieb1ca36c8e883c475174d7bbdb3a215f77f9fb47 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1145715 Reviewed-by: Cory Perry <cperry@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add supported preemptions to gpu characteristics	Deepak Nibade	2016-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below flag fields to gpu characteristics to indicate supported and default preemption modes on platform for graphics and compute __u32 graphics_preemption_mode_flags; __u32 compute_preemption_mode_flags; __u32 default_graphics_preempt_mode; __u32 default_compute_preempt_mode; Add struct nvgpu_preemption_modes_rec to struct gr_gk20a to store these values locally Use platform specific get_preemption_mode_flags() to get the flags and define gk20a/gm20b specific get_preemption_mode_flags() API Bug 1646259 Change-Id: I80193c0d988dc93bd96585f9aa631fd817f4dfa3 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1133595 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: separate IOCTL to set preemption mode	Deepak Nibade	2016-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE to allow setting preemption modes from UMD Define preemption modes in nvgpu.h and use them everywhere Remove mode definitions from mm_gk20a.h Also, we support setting only one preemption mode in a channel But it is possible to have multiple preemption modes (one from graphics and one from compute) set simultaneously Hence, update struct gr_ctx_desc to include two separate preemption modes (graphics_preempt_mode and compute_preempt_mode) API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports setting two separate preemption modes i.e. one for graphics and one for compute Make necessary changes in code to support two preemption modes Bug 1646259 Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1131805 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add TSG timeslice support	Aingara Paramakuru	2016-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for changing a TSG's timeslice, within reasonable limits imposed by the kernel driver. JIRA VFND-1494 Bug 1749744 Change-Id: Ifca1b63a00da7a5872483bb56692da70a5f18bdf Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1129837 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to suspend/resume context	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below IOCTL to suspend/resume a context NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS: Suspend sequence : - disable ctxsw - loop through list of channels - if channel is ctx resident, suspend all SMs - otherwise, disable channel/TSG - enable ctxsw Resume sequence : - disable ctxsw - loop through list of channels - if channel is ctx resident, resume all SMs - otherwise, enable channel/TSG - enable ctxsw Bug 200156699 Change-Id: Iacf1bf7877b67ddf87cc6891c37c758a4644b014 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120332 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support binding multiple channels to a debug session	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently bind only one channel to a debug session But some use cases might need multiple channels bound to same debug session Add this support by adding a list of channels to debug session. List structure is implemented as struct dbg_session_channel_data List node dbg_s_list_node is currently defined in struct dbg_session_gk20a. But this is inefficient when we need to add debug session to multiple channels Hence add new reference structure dbg_session_data to store dbg_session pointer and list entry For each NVGPU_DBG_GPU_IOCTL_BIND_CHANNEL call, create two reference structure dbg_session_channel_data for channel and dbg_session_data for debug session and bind them together Define API nvgpu_dbg_gpu_get_session_channel() which will get first channel in the list of debug session Use this API wherever we refer to channel bound to debug session Remove dbg_sessions define in struct gk20a since it is not being used anywhere Add new API NVGPU_DBG_GPU_IOCTL_UNBIND_CHANNEL to support unbinding of channel from debug sesssion Bug 200156699 Change-Id: I3bfa6f9cd5b90e7254a75c7e64ac893739776b7f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120331 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu; nvgpu: IOCTL to write/clear SM error states	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below IOCTLs to write/clear SM error states NVGPU_DBG_GPU_IOCTL_CLEAR_SINGLE_SM_ERROR_STATE NVGPU_DBG_GPU_IOCTL_WRITE_SINGLE_SM_ERROR_STATE Bug 200156699 Change-Id: I89e3ec51c33b8e131a67d28807d5acf57b3a48fd Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120330 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support storing/reading single SM error state	Deepak Nibade	2016-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support to store error state of single SM before preprocessing SM exception Error state is stored as : struct nvgpu_dbg_gpu_sm_error_state_record { u32 hww_global_esr; u32 hww_warp_esr; u64 hww_warp_esr_pc; u32 hww_global_esr_report_mask; u32 hww_warp_esr_report_mask; } Note that we can safely append new fields to above structure in the future if required Also, add IOCTL NVGPU_DBG_GPU_IOCTL_READ_SINGLE_SM_ERROR_STATE to support reading SM's error state by user space Bug 200156699 Change-Id: I9a62cb01e8a35c720b52d5d202986347706c7308 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120329 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add TSG interleave support	Aingara Paramakuru	2016-04-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for changing a TSG's runlist interleave level. JIRA VFND-1497 Bug 1749744 Change-Id: I3cf3ebc2334f83b1bfb6b3230fae2ca73c75c239 Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1122677 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement NVGPU_GPU_IOCTL_GET_GPU_TIME	Sami Kiminki	2016-04-15
\| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_GET_GPU_TIME for reading the GPU time. Bug 1395833 Change-Id: I7ddc7c28ff0c9a336cc0dcd820b15fb0fea714d0 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/1125630 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add support for t19x	Seshendra Gadagottu	2016-04-13
\| \| \| \| \| \| \| \| \| \| \| \|	Add build and gpu framework support for t19x. Bug 1735757 Change-Id: I4b7c6468871ca27412a6f9be20f744bc730b4142 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1122093 GVS: Gerrit_Virtual_Submit Reviewed-by: Ken Adams <kadams@nvidia.com>
*	gpu: nvgpu: support for hwpm context switching	Peter Daifuku	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \|	Add support for hwpm context switching Bug 1648200 Change-Id: I482899bf165cd2ef24bb8617be16df01218e462f Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1120450 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add TSG support to channel event id	Deepak Nibade	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add NVGPU_IOCTL_TSG_EVENT_ID_CTRL API for channel event id support to TSGs This API will accept an event_id (like BPT.INT or BPT.PAUSE), a command to enable the event, and return a file descriptor on which we can raise the event (if cmd=enable) Events generated for TSGs will reuse file operations "gk20a_event_id_ops" Bug 200089620 Change-Id: I2f563c6d3a0988eb670caac2d3c7c6795724792c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1030776 (cherry picked from commit 72b61fa266279038f013e582be80c21808e1038d) Reviewed-on: http://git-master/r/1120319 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: add channel event id support	Deepak Nibade	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, nvgpu can raise events to User space. But user space cannot distinguish between various types of events. To overcome this, we need finer-grained API to deliver various events to user space. Remove old API NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, and all the support for this API (we can remove this since User space has not started using this API at all) Add new API NVGPU_IOCTL_CHANNEL_EVENT_ID_CTRL which will accept an event_id (like BPT.INT or BPT.PAUSE), a command to enable the event, and return a file descriptor on which we can raise the event (if cmd=enable) Event is disabled when file descriptor is closed Add file operations "gk20a_event_id_ops" to support polling on event fd Also add API gk20a_channel_get_event_data_from_id() to get event_data of event from its id Bug 200089620 Change-Id: I5288f19f38ff49448c46338c33b2a927c9e02254 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1030775 (cherry picked from commit 5721ce2735950440bedc2b86f851db08ed593275) Reviewed-on: http://git-master/r/1120318 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Add support for FECS ctxsw tracing	Anton Vorontsov	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1648908 This commit adds support for FECS ctxsw tracing. Code is compiled conditionnaly under CONFIG_GK20_CTXSW_TRACE. This feature requires an updated FECS ucode that writes one record to a ring buffer on each context switch. On RM/Kernel side, the GPU driver reads records from the master ring buffer and generates trace entries into a user-facing VM ring buffer. For each record in the master ring buffer, RM/Kernel has to retrieve the vmid+pid of the user process that submitted related work. Features currently implemented: - master ring buffer allocation - debugfs to dump master ring buffer - FECS record per context switch (with both current and new contexts) - dedicated device for ctxsw tracing (access to VM ring buffer) - SOF generation (and access to PTIMER) - VM ring buffer allocation, and reconfiguration - enable/disable tracing at user level - event-based trace filtering - context_ptr to vmid+pid mapping - read system call for ctxsw dev - mmap system call for ctxsw dev (direct access to VM ring buffer) - poll system call for ctxsw dev - save/restore register on ELPG/CG6 - separate user ring from FECS ring handling Features requiring ucode changes: - enable/disable tracing at FECS level - actual busy time on engine (bug 1642354) - master ring buffer threshold interrupt (P1) - API for GPU to CPU timestamp conversion (P1) - vmid/pid/uid based filtering (P1) Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1022737 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add support to set channel timeslice	Aingara Paramakuru	2016-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As part of improving GPU scheduling, userspace can now set a channel's timeslice, within reasonable limits imposed by the kernel driver. JIRA VFND-1312 Bug 1729664 Change-Id: I4c3430c43437889b8685f12988d4b967bb7877bb Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1020917 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Provide cpu gpu time correlation via ioctl	Arul Sekar	2016-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1648908 Provides pairs of CPU and GPU timestamps that can be used for correlatiing the two timebases - IOCTL made available /dev/nvhost-ctrl-gpu Change-Id: I1458b9d33d794b1b02ec9fd29ed9426756b94bcd Signed-off-by: Arul Sekar <aruls@nvidia.com> Reviewed-on: http://git-master/r/1029732 Reviewed-by: Arun Gona <agona@nvidia.com> Tested-by: Arun Gona <agona@nvidia.com> Reviewed-on: http://git-master/r/1111715 GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: improve channel interleave support	Aingara Paramakuru	2016-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, only "high" priority bare channels were interleaved between all other bare channels and TSGs. This patch decouples priority from interleaving and introduces 3 levels for interleaving a bare channel or TSG: high, medium, and low. The levels define the number of times a channel or TSG will appear on a runlist (see nvgpu.h for details). By default, all bare channels and TSGs are set to interleave level low. Userspace can then request the interleave level to be increased via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will be added later). As timeslice settings will soon be coming from userspace, the default timeslice for "high" priority channels has been restored. JIRA VFND-1302 Bug 1729664 Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1014962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add characteristics flag NVGPU_GPU_FLAGS_SUPPORT_TSG	Richard Zhao	2016-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NVGPU_GPU_FLAGS_SUPPORT_TSG indicates both the kernel driver and device support time slice group (TSG). Bug 1617046 Bug 200155618 Change-Id: Ib3490a32b773222560c58f1fd6d32bffcb97d6cd Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1010173 Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: IOCTL to set stop_trigger type	Deepak Nibade	2016-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_DBG_GPU_IOCTL_SET_NEXT_STOP_TRIGGER_TYPE to set next stop_trigger type (either single SM or broadcast to all SMs) Also, expose below APIs to check and clear broadcast flag: gk20a_dbg_gpu_broadcast_stop_trigger() gk20a_dbg_gpu_clear_broadcast_stop_trigger() Bug 200156699 Change-Id: I5e6cd4b84e601889fb172e0cdbb6bd5a0d366eab Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/925882 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add max freq to gpu characteristics	Deepak Nibade	2016-02-02
\| \| \| \| \| \| \| \| \| \|	Bug 200097029 Change-Id: Id63dad1629b1d1919cbbfb20b0cb85d4855f526d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1000724 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add 3 functions to regops interface.	Ashutosh Jain	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds the following IOCTLS: - NVGPU_GPU_IOCTL_RESUME_FROM_PAUSE - NVGPU_GPU_IOCTL_TRIGGER_SUSPEND - NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS Bug 1619430 Change-Id: Iac37d515a753d8b799e631224eae2fa168b43e2c Signed-off-by: ashutosh jain <ashutoshj@nvidia.com> Reviewed-on: http://git-master/r/921378 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update last debug ioctl	Chris Dragan	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \|	Update last IOCTL number to point to GET_TIMEOUT to allow its use. Bug 1706457 Change-Id: I9c8cc4e972fc25e14ca5aff075eca72bc1807a0b Signed-off-by: Chris Dragan <kdragan@nvidia.com> (cherry-picked from commit f73444b0f92737919dafd1623a2dde60c467b25b) Reviewed-on: http://git-master/r/921453 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add API to extract GPU timeout mode	Chris Dragan	2015-12-09
\| \| \| \| \| \| \| \| \| \| \|	Bug 1706457 Change-Id: Iab76bcb7cabc55d99b5acd932716d30da6f01b46 Signed-off-by: Chris Dragan <kdragan@nvidia.com> Reviewed-on: http://git-master/r/835852 Reviewed-on: http://git-master/r/836454 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Wait for pause for SMs	sujeet baranwal	2015-12-04
\| \| \| \| \| \| \| \| \| \| \| \|	SM locking & register reads Order has been changed. Also, functions have been implemented based on gk20a and gm20b. Change-Id: Iaf720d088130f84c4b2ca318d9860194c07966e1 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Signed-off-by: ashutosh jain <ashutoshj@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/837236
*	gpu: nvgpu: IOCTL to disable watchdog per-channel	Deepak Nibade	2015-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable watchdog per-channel Also, if watchdog is disabled, we currently schedule the worker with MAX timeout. Instead of this, do not schedule any worker if watchdog is disabled Bug 1683059 Bug 1700277 Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/837962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support skipping buffer refcounting in submit	Deepak Nibade	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In job submission path, we always take refcount on all the mapped buffers to safeguard against case where user space releases the buffer early But in case user space itself is doing proper buffer management, kernel need not take refcounts on all the buffers - which is also a overhead in submit path Hence, provide a new submit flag NVGPU_SUBMIT_GPFIFO_FLAGS_SKIP_BUFFER_REFCOUNTING to optionally skip taking refcounts on all the buffers Also, if we do not take refcounts, then no need to drop any refcounts in gk20a_channel_update() as well Bug 1698667 Bug 200141116 Change-Id: I81bb7a03240300b691c70bcec04ea1badd5934f4 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/824718 (cherry picked from commit 8c8978fa303ec4e6db0233becdbdcbad4a248173) Reviewed-on: http://git-master/r/835801 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: User-space managed address space support	Sami Kiminki	2015-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which enables creating userspace-managed GPU address spaces. When an address space is marked as userspace-managed, the following changes are in effect: - Only fixed-address mappings are allowed. - VA space allocation for fixed-address mappings is not required, except to mark space as sparse. - Maps and unmaps are always immediate. In particular, the mapping ref increments at kickoffs and decrements at job completion are skipped. Bug 1614735 Bug 1623949 Bug 1660392 Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/738558 Reviewed-on: http://git-master/r/833253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO	Sami Kiminki	2015-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used to identify buffers and retrieve their sizes. This allows the userspace to be agnostic to the dmabuf implementation, as the generic dmabuf fd interface does not have a reliable way for buffer identification. Bug 1614735 Bug 1623949 Bug 1660392 Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/822845 Reviewed-on: http://git-master/r/833252 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to set TSG timeslice	Deepak Nibade	2015-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new IOCTL NVGPU_IOCTL_TSG_SET_PRIORITY to allow setting timeslice for entire TSG Return error from channel specific IOCTL_CHANNEL_SET_PRIORITY if the channel is part of TSG Separate out API gk20a_channel_get_timescale_from_timeslice() to get timeslice_timeout and scale from timeslice period Use this API to get timeslice_timeout and scale for TSG and store it in tsg_gk20a structure Then trigger runlist update so that new timeslice values will be re-written to runlist for TSG Bug 200146615 Change-Id: I555467d034f81b372b31372f0835d72b1c159508 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/824206 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to disable timeouts	Deepak Nibade	2015-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_DBG_GPU_IOCTL_TIMEOUT to support disabling/re-enabling scheduler timeout from user space If user space application is closed without re-enabling the timeouts, kernel will restore the timeouts' state while releasing the debug session This is needed for debugging purpose Bug 1514061 Change-Id: I32efb47ad09d793f3e7fd8f0aaa9720c8bc91272 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/788176 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Get rid of legacy gpfifo type	Janne Hellsten	2015-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Get rid of the duplicate gpfifo struct to emphasize the fact that nvgpu_gpfifo is the only memory layout for gpfifo entries that works. This is the same layout that HW uses. Also, add a local pointer to the gpfifo memory in gk20a_submit_channel_gpfifo to get rid of repeated typecasts. Bug 1592391 Bug 1550886 Change-Id: I5432859ef8e7c1aab5907e44098994d7bb807f50 Signed-off-by: Janne Hellsten <jhellsten@nvidia.com> Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/677341 (cherry picked from commit 724c8c6228af81dd440e825bddf545dd6b2b8bd7) Reviewed-on: http://git-master/r/822548 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add support for CDE scatter buffers	Jussi Rasanen	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for CDE scatter buffers. When the bus addresses for surfaces are not contiguous as seen by the GPU (e.g., when SMMU is bypassed), CDE swizzling needs additional per-page information. This information is populated in a scatter buffer when required. Bug 1604102 Change-Id: I3384e2cfb5d5f628ed0f21375bdac8e36b77ae4f Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/789436 Reviewed-on: http://git-master/r/791243 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation	Sami Kiminki	2015-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add batch support for mapping and unmapping. Batching essentially helps transform some per-map/unmap overhead to per-batch overhead, namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB invalidates. Batching with size 64 has been measured to yield >20x speed-up in low-level fixed-address mapping microbenchmarks. Bug 1614735 Bug 1623949 Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/733231 (cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91) Reviewed-on: http://git-master/r/763812 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update flush L2 ioctl definition	Sam Payne	2015-06-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	this ioctl can be called only by a ctrlfd created from the /dev/nvhost-ctrl-gpu node therefore NVGPU_GPU_IOCTL_MAGIC, not NVGPU_DBG_GPU_IOCTL_MAGIC should be used for this bug 200111987 Change-Id: I9fce7eae9f8203a15270ac1d25b575aebd9ccf88 Signed-off-by: Sam Payne <spayne@nvidia.com> Reviewed-on: http://git-master/r/755164 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> (cherry picked from commit 609d45ddd98c31ecd089d2e213ee1b6c560fc21e) Reviewed-on: http://git-master/r/760830
*	gpu:nvgpu: correct name for unmapped ptes flags	Seshendra Gadagottu	2015-06-10
\| \| \| \| \| \| \| \| \| \|	Bug 1587825 Change-Id: I66f2988b7f1884b53bb8f3cd09ad1ead1652ffda Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/751484 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cyclestats mode E snapshots support	Leonid Moiseichuk	2015-06-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	That is a kernel supporting code for cyclestats mode E. Cyclestats mode E implemented following Windows-design in user-space and required the following operations to be implemented: - attach a client for shared hardware buffer of device - detach client from shared hardware buffer - flush means copy of available data from hardware buffer to private client buffers according to perfmon IDs assigned for clients - perfmon IDs management for user-space clients - a NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability added Bug 1573150 Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/740653 (cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f) Reviewed-on: http://git-master/r/753274 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement compbits mapping	Sami Kiminki	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS, which enables mapping compbits to the GPU address space of said buffers. This, subsequently, enables moving comptag swizzling from GPU to CDEH/CDEV formats to userspace. Compbits mapping is conservative and it may map more than what is strictly needed. This is because two reasons: 1) mapping must be done on small page alignment (4kB), and 2) GPU comptags are swizzled all around the aggregate cache line, which means that the whole cache line must be visible even if only some comptag lines are required from it. Cache line size is not necessarily a multiple of the small page size. Bug 200077571 Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/719710 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement compbits padding for mapping	Sami Kiminki	2015-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_MAP_BUFFER_FLAGS_MAPPABLE_COMPBITS, which adds extra alignment to compbits allocation for safe compbits mapping. Bug 200077571 Change-Id: I3a74ebb81412e4e1e69501debeb9ef4e2056ef1a Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/730763 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/740693 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Gpu characterstics enhancement	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	New members are added in nvgpu_gpu_characterstics to export more information required specially from CUDA tools. Change-Id: I907f3bcbd272405a13f47ef6236bc2cff01c6c80 Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/679202 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Removal of regops from CUDA driver	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current CUDA drivers have been using the regops to directly accessing the GPU registers from user space through the dbg node. This is a security hole and needs to be avoided. The patch alternatively implements the similar functionality in the kernel and provide an ioctl for it. Bug 200083334 Change-Id: Ic5ff5a215cbabe7a46837bc4e15efcceb0df0367 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/711758 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu:nvgpu: add support for unmapped ptes	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Add support for unmapped ptes during gmmu map. Bug 1587825 Change-Id: I6e42ef58bae70ce29e5b82852f77057855ca9971 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/696507 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Allow enabling PC sampling	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Allow enabling of PC sampling hardware workaround. It is only applicable to gm20b. Bug 1517458 Bug 1573150 Change-Id: Iad6a3ae556489fb7ab9628637d291849d2cd98ea Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/710421
*	gpu: nvgpu: ioctl for flushing GPU L2	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	CUDA devtools need to be able to flush the GPU's cache in a sideband fashion and so cannot use methods. This change implements an nvgpu_gpu_ioctl to flush and optionally invalidate the GPU's L2 cache and flush fb. Change-Id: Ib06a0bc8d8880ffbfe4b056518cc3c3df0cc4988 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Signed-off-by: Mayank Kaushik <mkaushik@nvidia.com> Reviewed-on: http://git-master/r/671809 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add hw perfmon buffer mapping ioctls	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Map/unmap buffers for HWPM and deal with its instance block, the minimum work required to run the HWPM via regops from userspace. Bug 1517458 Bug 1573150 Change-Id: If14086a88b54bf434843d7c2fee8a9113023a3b0 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/673689 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add open channel ioctl to ctrl node	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the ioctl to open a new gpu channel to also the control node for improved process startup performance, in addition to the current open ioctl in the channel node. The new channel fd creation is refactored to a separate function which is called from both ctrl and channel ioctls. Bug 1604952 Change-Id: I3357ceec694c0e6d7a85807183884324cb725d3a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/679516 Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: Set lockboost size for compute	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	For compute channel on gk20a, set lockboost size to zero. Bug 1573856 Change-Id: I369cebf72241e4017e7d380c82caff6014e42984 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/594843 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Implement NVGPU_AS_IOCTL_GET_VA_REGIONS	Sami Kiminki	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_IOCTL_GET_VA_REGIONS which returns a list of GPU VA regions for different page sizes. This is required for the userspace for safe fixed-address address space allocation. Bug 1551752 Change-Id: I63ddde30935db2471bec498dae0caa870e89c1a5 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/590814 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>