nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: support for hwpm context switching	Peter Daifuku	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \|	Add support for hwpm context switching Bug 1648200 Change-Id: I482899bf165cd2ef24bb8617be16df01218e462f Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: http://git-master/r/1120450 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add TSG support to channel event id	Deepak Nibade	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add NVGPU_IOCTL_TSG_EVENT_ID_CTRL API for channel event id support to TSGs This API will accept an event_id (like BPT.INT or BPT.PAUSE), a command to enable the event, and return a file descriptor on which we can raise the event (if cmd=enable) Events generated for TSGs will reuse file operations "gk20a_event_id_ops" Bug 200089620 Change-Id: I2f563c6d3a0988eb670caac2d3c7c6795724792c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1030776 (cherry picked from commit 72b61fa266279038f013e582be80c21808e1038d) Reviewed-on: http://git-master/r/1120319 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: add channel event id support	Deepak Nibade	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, nvgpu can raise events to User space. But user space cannot distinguish between various types of events. To overcome this, we need finer-grained API to deliver various events to user space. Remove old API NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, and all the support for this API (we can remove this since User space has not started using this API at all) Add new API NVGPU_IOCTL_CHANNEL_EVENT_ID_CTRL which will accept an event_id (like BPT.INT or BPT.PAUSE), a command to enable the event, and return a file descriptor on which we can raise the event (if cmd=enable) Event is disabled when file descriptor is closed Add file operations "gk20a_event_id_ops" to support polling on event fd Also add API gk20a_channel_get_event_data_from_id() to get event_data of event from its id Bug 200089620 Change-Id: I5288f19f38ff49448c46338c33b2a927c9e02254 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1030775 (cherry picked from commit 5721ce2735950440bedc2b86f851db08ed593275) Reviewed-on: http://git-master/r/1120318 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Add support for FECS ctxsw tracing	Anton Vorontsov	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1648908 This commit adds support for FECS ctxsw tracing. Code is compiled conditionnaly under CONFIG_GK20_CTXSW_TRACE. This feature requires an updated FECS ucode that writes one record to a ring buffer on each context switch. On RM/Kernel side, the GPU driver reads records from the master ring buffer and generates trace entries into a user-facing VM ring buffer. For each record in the master ring buffer, RM/Kernel has to retrieve the vmid+pid of the user process that submitted related work. Features currently implemented: - master ring buffer allocation - debugfs to dump master ring buffer - FECS record per context switch (with both current and new contexts) - dedicated device for ctxsw tracing (access to VM ring buffer) - SOF generation (and access to PTIMER) - VM ring buffer allocation, and reconfiguration - enable/disable tracing at user level - event-based trace filtering - context_ptr to vmid+pid mapping - read system call for ctxsw dev - mmap system call for ctxsw dev (direct access to VM ring buffer) - poll system call for ctxsw dev - save/restore register on ELPG/CG6 - separate user ring from FECS ring handling Features requiring ucode changes: - enable/disable tracing at FECS level - actual busy time on engine (bug 1642354) - master ring buffer threshold interrupt (P1) - API for GPU to CPU timestamp conversion (P1) - vmid/pid/uid based filtering (P1) Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: http://git-master/r/1022737 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add support to set channel timeslice	Aingara Paramakuru	2016-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As part of improving GPU scheduling, userspace can now set a channel's timeslice, within reasonable limits imposed by the kernel driver. JIRA VFND-1312 Bug 1729664 Change-Id: I4c3430c43437889b8685f12988d4b967bb7877bb Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1020917 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Provide cpu gpu time correlation via ioctl	Arul Sekar	2016-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bug 1648908 Provides pairs of CPU and GPU timestamps that can be used for correlatiing the two timebases - IOCTL made available /dev/nvhost-ctrl-gpu Change-Id: I1458b9d33d794b1b02ec9fd29ed9426756b94bcd Signed-off-by: Arul Sekar <aruls@nvidia.com> Reviewed-on: http://git-master/r/1029732 Reviewed-by: Arun Gona <agona@nvidia.com> Tested-by: Arun Gona <agona@nvidia.com> Reviewed-on: http://git-master/r/1111715 GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: improve channel interleave support	Aingara Paramakuru	2016-03-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, only "high" priority bare channels were interleaved between all other bare channels and TSGs. This patch decouples priority from interleaving and introduces 3 levels for interleaving a bare channel or TSG: high, medium, and low. The levels define the number of times a channel or TSG will appear on a runlist (see nvgpu.h for details). By default, all bare channels and TSGs are set to interleave level low. Userspace can then request the interleave level to be increased via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will be added later). As timeslice settings will soon be coming from userspace, the default timeslice for "high" priority channels has been restored. JIRA VFND-1302 Bug 1729664 Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-on: http://git-master/r/1014962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add characteristics flag NVGPU_GPU_FLAGS_SUPPORT_TSG	Richard Zhao	2016-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NVGPU_GPU_FLAGS_SUPPORT_TSG indicates both the kernel driver and device support time slice group (TSG). Bug 1617046 Bug 200155618 Change-Id: Ib3490a32b773222560c58f1fd6d32bffcb97d6cd Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1010173 Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
*	gpu: nvgpu: IOCTL to set stop_trigger type	Deepak Nibade	2016-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_DBG_GPU_IOCTL_SET_NEXT_STOP_TRIGGER_TYPE to set next stop_trigger type (either single SM or broadcast to all SMs) Also, expose below APIs to check and clear broadcast flag: gk20a_dbg_gpu_broadcast_stop_trigger() gk20a_dbg_gpu_clear_broadcast_stop_trigger() Bug 200156699 Change-Id: I5e6cd4b84e601889fb172e0cdbb6bd5a0d366eab Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/925882 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add max freq to gpu characteristics	Deepak Nibade	2016-02-02
\| \| \| \| \| \| \| \| \| \|	Bug 200097029 Change-Id: Id63dad1629b1d1919cbbfb20b0cb85d4855f526d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1000724 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add 3 functions to regops interface.	Ashutosh Jain	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds the following IOCTLS: - NVGPU_GPU_IOCTL_RESUME_FROM_PAUSE - NVGPU_GPU_IOCTL_TRIGGER_SUSPEND - NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS Bug 1619430 Change-Id: Iac37d515a753d8b799e631224eae2fa168b43e2c Signed-off-by: ashutosh jain <ashutoshj@nvidia.com> Reviewed-on: http://git-master/r/921378 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update last debug ioctl	Chris Dragan	2015-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \|	Update last IOCTL number to point to GET_TIMEOUT to allow its use. Bug 1706457 Change-Id: I9c8cc4e972fc25e14ca5aff075eca72bc1807a0b Signed-off-by: Chris Dragan <kdragan@nvidia.com> (cherry-picked from commit f73444b0f92737919dafd1623a2dde60c467b25b) Reviewed-on: http://git-master/r/921453 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add API to extract GPU timeout mode	Chris Dragan	2015-12-09
\| \| \| \| \| \| \| \| \| \| \|	Bug 1706457 Change-Id: Iab76bcb7cabc55d99b5acd932716d30da6f01b46 Signed-off-by: Chris Dragan <kdragan@nvidia.com> Reviewed-on: http://git-master/r/835852 Reviewed-on: http://git-master/r/836454 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Wait for pause for SMs	sujeet baranwal	2015-12-04
\| \| \| \| \| \| \| \| \| \| \| \|	SM locking & register reads Order has been changed. Also, functions have been implemented based on gk20a and gm20b. Change-Id: Iaf720d088130f84c4b2ca318d9860194c07966e1 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Signed-off-by: ashutosh jain <ashutoshj@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/837236
*	gpu: nvgpu: IOCTL to disable watchdog per-channel	Deepak Nibade	2015-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable watchdog per-channel Also, if watchdog is disabled, we currently schedule the worker with MAX timeout. Instead of this, do not schedule any worker if watchdog is disabled Bug 1683059 Bug 1700277 Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/837962 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support skipping buffer refcounting in submit	Deepak Nibade	2015-11-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In job submission path, we always take refcount on all the mapped buffers to safeguard against case where user space releases the buffer early But in case user space itself is doing proper buffer management, kernel need not take refcounts on all the buffers - which is also a overhead in submit path Hence, provide a new submit flag NVGPU_SUBMIT_GPFIFO_FLAGS_SKIP_BUFFER_REFCOUNTING to optionally skip taking refcounts on all the buffers Also, if we do not take refcounts, then no need to drop any refcounts in gk20a_channel_update() as well Bug 1698667 Bug 200141116 Change-Id: I81bb7a03240300b691c70bcec04ea1badd5934f4 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/824718 (cherry picked from commit 8c8978fa303ec4e6db0233becdbdcbad4a248173) Reviewed-on: http://git-master/r/835801 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: User-space managed address space support	Sami Kiminki	2015-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which enables creating userspace-managed GPU address spaces. When an address space is marked as userspace-managed, the following changes are in effect: - Only fixed-address mappings are allowed. - VA space allocation for fixed-address mappings is not required, except to mark space as sparse. - Maps and unmaps are always immediate. In particular, the mapping ref increments at kickoffs and decrements at job completion are skipped. Bug 1614735 Bug 1623949 Bug 1660392 Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/738558 Reviewed-on: http://git-master/r/833253 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO	Sami Kiminki	2015-11-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used to identify buffers and retrieve their sizes. This allows the userspace to be agnostic to the dmabuf implementation, as the generic dmabuf fd interface does not have a reliable way for buffer identification. Bug 1614735 Bug 1623949 Bug 1660392 Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/822845 Reviewed-on: http://git-master/r/833252 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to set TSG timeslice	Deepak Nibade	2015-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new IOCTL NVGPU_IOCTL_TSG_SET_PRIORITY to allow setting timeslice for entire TSG Return error from channel specific IOCTL_CHANNEL_SET_PRIORITY if the channel is part of TSG Separate out API gk20a_channel_get_timescale_from_timeslice() to get timeslice_timeout and scale from timeslice period Use this API to get timeslice_timeout and scale for TSG and store it in tsg_gk20a structure Then trigger runlist update so that new timeslice values will be re-written to runlist for TSG Bug 200146615 Change-Id: I555467d034f81b372b31372f0835d72b1c159508 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/824206 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: IOCTL to disable timeouts	Deepak Nibade	2015-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add IOCTL NVGPU_DBG_GPU_IOCTL_TIMEOUT to support disabling/re-enabling scheduler timeout from user space If user space application is closed without re-enabling the timeouts, kernel will restore the timeouts' state while releasing the debug session This is needed for debugging purpose Bug 1514061 Change-Id: I32efb47ad09d793f3e7fd8f0aaa9720c8bc91272 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/788176 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Get rid of legacy gpfifo type	Janne Hellsten	2015-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Get rid of the duplicate gpfifo struct to emphasize the fact that nvgpu_gpfifo is the only memory layout for gpfifo entries that works. This is the same layout that HW uses. Also, add a local pointer to the gpfifo memory in gk20a_submit_channel_gpfifo to get rid of repeated typecasts. Bug 1592391 Bug 1550886 Change-Id: I5432859ef8e7c1aab5907e44098994d7bb807f50 Signed-off-by: Janne Hellsten <jhellsten@nvidia.com> Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/677341 (cherry picked from commit 724c8c6228af81dd440e825bddf545dd6b2b8bd7) Reviewed-on: http://git-master/r/822548 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add support for CDE scatter buffers	Jussi Rasanen	2015-09-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for CDE scatter buffers. When the bus addresses for surfaces are not contiguous as seen by the GPU (e.g., when SMMU is bypassed), CDE swizzling needs additional per-page information. This information is populated in a scatter buffer when required. Bug 1604102 Change-Id: I3384e2cfb5d5f628ed0f21375bdac8e36b77ae4f Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com> Reviewed-on: http://git-master/r/789436 Reviewed-on: http://git-master/r/791243 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation	Sami Kiminki	2015-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add batch support for mapping and unmapping. Batching essentially helps transform some per-map/unmap overhead to per-batch overhead, namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB invalidates. Batching with size 64 has been measured to yield >20x speed-up in low-level fixed-address mapping microbenchmarks. Bug 1614735 Bug 1623949 Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/733231 (cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91) Reviewed-on: http://git-master/r/763812 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: update flush L2 ioctl definition	Sam Payne	2015-06-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	this ioctl can be called only by a ctrlfd created from the /dev/nvhost-ctrl-gpu node therefore NVGPU_GPU_IOCTL_MAGIC, not NVGPU_DBG_GPU_IOCTL_MAGIC should be used for this bug 200111987 Change-Id: I9fce7eae9f8203a15270ac1d25b575aebd9ccf88 Signed-off-by: Sam Payne <spayne@nvidia.com> Reviewed-on: http://git-master/r/755164 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> (cherry picked from commit 609d45ddd98c31ecd089d2e213ee1b6c560fc21e) Reviewed-on: http://git-master/r/760830
*	gpu:nvgpu: correct name for unmapped ptes flags	Seshendra Gadagottu	2015-06-10
\| \| \| \| \| \| \| \| \| \|	Bug 1587825 Change-Id: I66f2988b7f1884b53bb8f3cd09ad1ead1652ffda Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/751484 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: cyclestats mode E snapshots support	Leonid Moiseichuk	2015-06-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	That is a kernel supporting code for cyclestats mode E. Cyclestats mode E implemented following Windows-design in user-space and required the following operations to be implemented: - attach a client for shared hardware buffer of device - detach client from shared hardware buffer - flush means copy of available data from hardware buffer to private client buffers according to perfmon IDs assigned for clients - perfmon IDs management for user-space clients - a NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability added Bug 1573150 Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com> Reviewed-on: http://git-master/r/740653 (cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f) Reviewed-on: http://git-master/r/753274 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement compbits mapping	Sami Kiminki	2015-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS, which enables mapping compbits to the GPU address space of said buffers. This, subsequently, enables moving comptag swizzling from GPU to CDEH/CDEV formats to userspace. Compbits mapping is conservative and it may map more than what is strictly needed. This is because two reasons: 1) mapping must be done on small page alignment (4kB), and 2) GPU comptags are swizzled all around the aggregate cache line, which means that the whole cache line must be visible even if only some comptag lines are required from it. Cache line size is not necessarily a multiple of the small page size. Bug 200077571 Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/719710 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement compbits padding for mapping	Sami Kiminki	2015-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_MAP_BUFFER_FLAGS_MAPPABLE_COMPBITS, which adds extra alignment to compbits allocation for safe compbits mapping. Bug 200077571 Change-Id: I3a74ebb81412e4e1e69501debeb9ef4e2056ef1a Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/730763 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/740693 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Gpu characterstics enhancement	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	New members are added in nvgpu_gpu_characterstics to export more information required specially from CUDA tools. Change-Id: I907f3bcbd272405a13f47ef6236bc2cff01c6c80 Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/679202 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Removal of regops from CUDA driver	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current CUDA drivers have been using the regops to directly accessing the GPU registers from user space through the dbg node. This is a security hole and needs to be avoided. The patch alternatively implements the similar functionality in the kernel and provide an ioctl for it. Bug 200083334 Change-Id: Ic5ff5a215cbabe7a46837bc4e15efcceb0df0367 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Reviewed-on: http://git-master/r/711758 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu:nvgpu: add support for unmapped ptes	Seshendra Gadagottu	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Add support for unmapped ptes during gmmu map. Bug 1587825 Change-Id: I6e42ef58bae70ce29e5b82852f77057855ca9971 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/696507 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Allow enabling PC sampling	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \|	Allow enabling of PC sampling hardware workaround. It is only applicable to gm20b. Bug 1517458 Bug 1573150 Change-Id: Iad6a3ae556489fb7ab9628637d291849d2cd98ea Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/710421
*	gpu: nvgpu: ioctl for flushing GPU L2	sujeet baranwal	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	CUDA devtools need to be able to flush the GPU's cache in a sideband fashion and so cannot use methods. This change implements an nvgpu_gpu_ioctl to flush and optionally invalidate the GPU's L2 cache and flush fb. Change-Id: Ib06a0bc8d8880ffbfe4b056518cc3c3df0cc4988 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Signed-off-by: Mayank Kaushik <mkaushik@nvidia.com> Reviewed-on: http://git-master/r/671809 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add hw perfmon buffer mapping ioctls	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Map/unmap buffers for HWPM and deal with its instance block, the minimum work required to run the HWPM via regops from userspace. Bug 1517458 Bug 1573150 Change-Id: If14086a88b54bf434843d7c2fee8a9113023a3b0 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/673689 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add open channel ioctl to ctrl node	Konsta Holtta	2015-04-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the ioctl to open a new gpu channel to also the control node for improved process startup performance, in addition to the current open ioctl in the channel node. The new channel fd creation is refactored to a separate function which is called from both ctrl and channel ioctls. Bug 1604952 Change-Id: I3357ceec694c0e6d7a85807183884324cb725d3a Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/679516 Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gk20a: Set lockboost size for compute	Terje Bergstrom	2015-04-04
\| \| \| \| \| \| \| \| \| \| \|	For compute channel on gk20a, set lockboost size to zero. Bug 1573856 Change-Id: I369cebf72241e4017e7d380c82caff6014e42984 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/594843 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Implement NVGPU_AS_IOCTL_GET_VA_REGIONS	Sami Kiminki	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Implement NVGPU_AS_IOCTL_GET_VA_REGIONS which returns a list of GPU VA regions for different page sizes. This is required for the userspace for safe fixed-address address space allocation. Bug 1551752 Change-Id: I63ddde30935db2471bec498dae0caa870e89c1a5 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/590814 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: GPU characteristics additions	Sami Kiminki	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the following info into GPU characteristics: available big page sizes, support indicators for sync fence fds and cycle stats, gpc mask, SM version, SM SPA version and warp count, and IOCTL interface levels. Also, add new IOCTL to fetch TPC masks. Bug 1551769 Bug 1558186 Change-Id: I8a47d882645f29c7bf0c8f74334ebf47240e41de Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/562904 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add class numbers to characteristics	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \| \| \| \| \|	Some kernel APIs rely on user space knowing class numbers. Allow querying the numbers from kernel. Bug 1567274 Change-Id: Idec2fe8ee983ee74bcbf9dfc98f71bbcc1492cfb Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/594402
*	gpu: kernel support for suspending/resuming SMs	sujeet baranwal	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Kernel support for allowing a GPU debugger to suspend and resume SMs. Invocation of "suspend" on a given channel will suspend all SMs if the channel is resident, else remove the channel form the runlist. Similarly, "resume" will either resume all SMs if the channel was resident, or re-enable the channel in the runlist. Change-Id: I3b4ae21dc1b91c1059c828ec6db8125f8a0ce194 Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com> Signed-off-by: Mayank Kaushik <mkaushik@nvidia.com> Reviewed-on: http://git-master/r/552115 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: T18x support	Kenneth Adams	2015-03-18
\| \| \| \| \| \| \| \| \| \|	nvgpu framework and build for T18x Bug 1567274 Change-Id: I77835302a1110573008869d1106eface512bb9b1 Signed-off-by: Ken Adams <kadams@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add ioctl to create new TSG	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \| \| \|	Add ioctl to nvhost-ctrl to create a new TSG. Bug 200042993 Change-Id: Icdd0edb1d9e374740ace6da9eb3a10c57c62617a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Implement 64k large page support	Terje Bergstrom	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement support for 64kB large page size. Add an API to create an address space via IOCTL so that we can accept flags, and assign one flag for enabling 64kB large page size. Also adds APIs to set per-context large page size. This is possible only on Maxwell, so return error if caller tries to set large page size on Kepler. Default large page size is still 128kB. Change-Id: I20b51c8f6d4a984acae8411ace3de9000c78e82f Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	include: uapi: fix nvgpu.h comments and bits	Konsta Holtta	2015-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Correct some old comments and remove uses of the BIT macro to make it easier to sync this file to userspace. Change-Id: Ie897fc73e28b8194e0c5357eef7ae233395e9ba3 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/552916 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: create new nvgpu ioctl header	Konsta Holtta	2015-03-18
	Move nvgpu ioctls from the many user space interface headers to a new single nvgpu.h header under include/uapi. No new code or replaced names are introduced; this change only moves the definitions and changes include directives accordingly. Bug 1434573 Change-Id: I4d02415148e437a4e3edad221e08785fac377e91 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/542651 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>