aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu.h
Commit message (Collapse)AuthorAge
* Merge tag 'drm-for-v4.15-amd-dc' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2017-11-17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull amdgpu DC display code for Vega from Dave Airlie: "This is the pull request for the AMD DC (display code) layer which is a requirement to program the display engines on the new Vega and Raven based GPUs. It also contains support for all amdgpu supported GPUs (CIK, VI, Polaris), which has to be enabled. It is also a kms atomic modesetting compatible driver (unlike the current in-tree display code). I've kept it separate from drm-next because it may have some things that cause you to reject it. Background story: AMD have an internal team creating a shared OS codebase for display at hw bring up time using information from their hardware teams. This process doesn't lead to the most Linux friendly/looking code but we have worked together on cleaning a lot of it up and dealing with sparse/smatch/checkpatch, and having their team internally adhere to Linux coding standards. This tree is a complete history rebased since they started opening it, we decided not to squash it down as the history may have some value. Some of the commits therefore might not reach kernel standards, and we are steadily training people in AMD to better write commit msgs. There is a major bunch of generated bandwidth calculation and verification code that comes from their hardware team. On Vega and before this is float calculations, on Raven (DCN10) this is double based. They do the required things to do FP in the kernel, and I could understand this might raise some issues. Rewriting the bandwidth would be a major undertaken in reverification, it's non-trivial to work out if a display can handle the complete set of mode information thrown at it. Future story: There is a TODO list with this, and it address most of the remaining things that would be nice to refine/remove. The DCN10 code is still under development internally and they push out a lot of patches quite regularly and are supporting this code base with their display team. I think we've reached the point where keeping it out of tree is going to motivate distributions to start carrying the code, so I'd prefer we get it in tree. I think this code is slightly better than STAGING quality but not massively so, I'd really like to see that float/double magic gone and fixed point used, but AMD don't seem to think the accuracy and revalidation of the code is worth the effort" * tag 'drm-for-v4.15-amd-dc' of git://people.freedesktop.org/~airlied/linux: (1110 commits) drm/amd/display: fix MST link training fail division by 0 drm/amd/display: Fix formatting for null pointer dereference fix drm/amd/display: Remove dangling planes on dc commit state drm/amd/display: add flip_immediate to commit update for stream drm/amd/display: Miss register MST encoder cbs drm/amd/display: Fix warnings on S3 resume drm/amd/display: use num_timing_generator instead of pipe_count drm/amd/display: use configurable FBC option in dm drm/amd/display: fix AZ clock not enabled before program AZ endpoint amdgpu/dm: Don't use DRM_ERROR in amdgpu_dm_atomic_check amd/display: Fix potential null dereference in dce_calcs.c amdgpu/dm: Remove unused forward declaration drm/amdgpu: Remove unused dc_stream from amdgpu_crtc amdgpu/dc: Fix double unlock in amdgpu_dm_commit_planes amdgpu/dc: Fix missing null checks in amdgpu_dm.c amdgpu/dc: Fix potential null dereferences in amdgpu_dm.c amdgpu/dc: fix more indentation warnings amdgpu/dc: handle allocation failures in dc_commit_planes_to_stream. amdgpu/dc: fix indentation warning from smatch. amdgpu/dc: fix non-ansi function decls. ...
| * Merge branch 'drm-next-4.15-dc' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie2017-10-08
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into drm-next Initial pull request for DC support. We've completed a substantial amount of the cleanup and restructuring in our TODO. There are a few additional cleanups that we are continuing to work on, but I don't think there are any showstoppers remaining. We've tried to maintain most of the history for bisect purposes. Harry made sure all the commits build. We've enabled DC for vega10 and Raven. Pre-vega10 parts can be enabled via module parameter (amdgpu.dc=1), but are not enabled by default at this point until we get further testing upstream. This code provides atomic modesetting support for DCE8 (CIK), DCE10 (Tonga, Fiji), DCE11 (CZ, ST, Polaris), DCE12 (vega10), and DCN1 (RV) including HDMI and DP audio, DP MST, and many other advanced display features. + Latest cleanups for DC from you and Harry. Note that there is some flickering on some older asics with this branch due to a regression in powerplay that has already been fixed and will be included in my next non-DC pull request next week. * 'drm-next-4.15-dc' of git://people.freedesktop.org/~agd5f/linux: (897 commits) amdgpu/dc: use kref for dc_state. amdgpu/dc: convert dc_sink to kref. amdgpu/dc: convert dc_stream_state to kref. amdgpu/dc: use kref for dc_plane_state. amdgpu/dc: convert dc_gamma to kref reference counting. amdgpu/dc: convert dc_transfer to use a kref. amdgpu/dc: kill a bunch of dead code. amdgpu/dc: set a bunch of functions to static. amdgpu/dc: kill some deadcode in dc core. amdgpu/dc: fix indentation on a couple of returns. amdgpu/dm: don't use after free. amdgpu/dc: kfree already checks for NULL. amdgpu/dc: fix a bunch of misc whitespace. amdgpu/dc: drop hw_sequencer_types.h amdgpu/dc: drop dce110_types.h amdgpu/dc: use kernel ilog2 for log_2. amdgpu/dc: don't memset after kzalloc. amdgpu/dc: inline dal grph object id functions. amdgpu/dc: inline dml_round_to_multiple amdgpu/dc: rename bios get_image symbol to something more searchable. ...
| | * drm/amdgpu: Add dc_log module parameterHarry Wentland2017-09-26
| | | | | | | | | | | | | | | | | | | | | | | | We want to make DC less chatty but still allow bug reporters to provide more detailed logs. Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| | * drm/amd/dc: Add dc display driver (v2)Harry Wentland2017-09-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Supported DCE versions: 8.0, 10.0, 11.0, 11.2 v2: rebase against 4.11 Signed-off-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdgpu: busywait KIQ register accessing (v4)pding2017-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Register accessing is performed when IRQ is disabled. Never sleep in this function. Known issue: dead sleep in many use cases of index/data registers. v2: - wrap polling fence functions. - don't trigger IRQ for polling in case of wrongly fence signal. v3: - handle wrap round gracefully. - add comments for polling function v4: - don't return negative timeout confused with error code Signed-off-by: pding <Pixel.Ding@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdgpu:reduce wb to 512 slotMonk Liu2017-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with current WB usage we only use 57 slots, so 512 is extreamly sufficient, and reduce to 512 can make WB fit into one page. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdgpu: move the VRAM lost counter per contextChristian König2017-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of per device track the VRAM lost per context and return ECANCELED instead of ENODEV. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdgpu: keep copy of VRAM lost counter in jobChristian König2017-10-19
| | | | | | | | | | | | | | | | | | | | | | | | Instead of reading the current counter from fpriv. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdgpu: Move old fence waiting before reservation lock is aquired v2Andrey Grodzovsky2017-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Helps avoiding deadlock during GPU reset. Added mutex to amdgpu_ctx to preserve order of fences on a ring. v2: Put waiting logic in a function in a seperate function in amdgpu_ctx.c Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdgpu: add plumbing for ctx priority changes v2Andres Rodriguez2017-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce amdgpu_ctx_priority_override(). A mechanism to override a context's priority. An override can be terminated by setting the override to AMD_SCHED_PRIORITY_UNSET. v2: change refcounted interface for a direct set Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdgpu: implement ring set_priority for gfx_v8 compute v9Andres Rodriguez2017-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Programming CP_HQD_QUEUE_PRIORITY enables a queue to take priority over other queues on the same pipe. Multiple queues on a pipe are timesliced so this gives us full precedence over other queues. Programming CP_HQD_PIPE_PRIORITY changes the SPI_ARB_PRIORITY of the wave as follows: 0x2: CS_H 0x1: CS_M 0x0: CS_L The SPI block will then dispatch work according to the policy set by SPI_ARB_PRIORITY. In the current policy CS_H is higher priority than gfx. In order to prevent getting stuck in loops of resources bouncing between GFX and high priority compute and introducing further latency, we statically reserve a portion of the pipe. v2: fix srbm_select to ring->queue and use ring->funcs->type v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_* v4: switch int to enum amd_sched_priority v5: corresponding changes for srbm_lock v6: change CU reservation to PIPE_PERCENT allocation v7: use kiq instead of MMIO v8: back to MMIO, and make the implementation sleep safe. v9: corresponding changes for splitting HIGH into _HW/_SW Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdgpu: Reserve shared memory on VRAM for SR-IOVHorace Chen2017-10-09
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SR-IOV need to reserve a piece of shared VRAM at the exact place to exchange data betweem PF and VF. The start address and size of the shared mem are passed to guest through VBIOS structure VRAM_UsageByFirmware. VRAM_UsageByFirmware is a general feature in VBIOS, it indicates that VBIOS need to reserve a piece of memory on the VRAM. Because the mem address is specified. Reserve it early in amdgpu_ttm_init to make sure that it can monoplize the space. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/amdgpu: add FENCE_TO_HANDLE ioctl that returns syncobj or sync_fileMarek Olšák2017-10-06
| | | | | | | | | | | | | | | | | | for being able to convert an amdgpu fence into one of the handles. Mesa will use this. Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/amdgpu: delete pp_enable in adevRex Zhu2017-09-28
| | | | | | | | | | | | | | | | | | amdgpu not care powerplay or dpm is enabled. just check ip functions and pp functions Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/amdgpu: add option for force enable multipipe policy for computeAndres Rodriguez2017-09-28
|/ | | | | | | | Useful for testing the effects of multipipe compute without recompiling. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Add gem_prime_mmap supportSamuel Li2017-09-26
| | | | | | | | | | v2: drop hdp invalidate/flush. v3: honor pgoff during prime mmap. Add a barrier after cpu access. v4: drop begin/end_cpu_access() for now, revisit later. Signed-off-by: Samuel Li <Samuel.Li@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Add copy_pte_num_dw member in amdgpu_vm_pte_funcsYong Zhao2017-09-26
| | | | | | | | Use it to replace the hard coded value in amdgpu_vm_bo_update_mapping(). Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Fix a bug in amdgpu_fill_buffer()Yong Zhao2017-09-26
| | | | | | | | | | | | | | | | | | | | When max_bytes is not 8 bytes aligned and bo size is larger than max_bytes, the last 8 bytes in a ttm node may be left unchanged. For example, on pre SDMA 4.0, max_bytes = 0x1fffff, and the bo size is 0x200000, the problem will happen. In order to fix the problem, we separately store the max nums of PTEs/PDEs a single operation can set in amdgpu_vm_pte_funcs structure, rather than inferring it from bytes limit of SDMA constant fill, i.e. fill_max_bytes. Together with the fix, we replace the hard code value "10" in amdgpu_vm_bo_update_mapping() with the corresponding values from structure amdgpu_vm_pte_funcs. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/sriov:fix memory leak after gpu resetMonk Liu2017-09-26
| | | | | | | | | | | | | GPU reset will require all hw doing hw_init thus ucode_init_bo will be invoked again, which lead to memory leak skip the fw_buf allocation during sriov gpu reset to avoid memory leak. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu:make ctx_add_fence interruptible(v2)Monk Liu2017-09-26
| | | | | | | | | | | | | | otherwise a gpu hang will make application couldn't be killed under timedout=0 mode v2: Fix memoryleak job/job->s_fence issue unlock mn remove the ERROR msg after waiting being interrupted Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/sriov:move in_reset to adev and renameMonk Liu2017-09-26
| | | | | | | | | | | currently in_reset is only used in sriov gpu reset, and it will be used for other non-gfx hw component later, like PSP, so move it from gfx to adev and rename to in_sriov_reset make more sense. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: fix checkpatch.pl warning to amdgpu_drv.cRex Zhu2017-09-26
| | | | | | | | | fix checkpatch.pl WARNING: Prefer 'unsigned int' to bare use of 'unsigned' Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Add prescreening stage in IH processing (v2)Felix Kuehling2017-09-26
| | | | | | | | | | To filter out high-frequency interrupts that can be safely ignored. v2: squash in trivial typo fix for si (Alex) Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: move MMU notifier related defines to amdgpu_mn.hChristian König2017-09-12
| | | | | | | | | Just some cleanup. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: move amdgpu_ttm_tt_* declarations into amdgpu_ttm.hChristian König2017-09-12
| | | | | | | | | Just some cleanup. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: keep the MMU lock until the update ends v4Christian König2017-09-12
| | | | | | | | | | | | | This is quite controversial because it adds another lock which is held during page table updates, but I don't see much other option. v2: allow multiple updates to be in flight at the same time v3: simplify the patch, take the read side only once v4: correctly fix rebase conflict Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: move amdgpu_cs_sysvm_access_required into find_mappingChristian König2017-09-12
| | | | | | | | When we need to find the mapping we need sysvm access anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: stop reserving the BO in the MMU callback v3Christian König2017-09-12
| | | | | | | | | | | | | Instead take the callback lock during the final parts of CS. This should solve the last remaining locking order problems with BO reservations. v2: rebase, make dummy functions static inline v3: add one more missing inline and comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: move userptr BOs to CPU domain during CS v2Christian König2017-09-12
| | | | | | | | | | Instead of moving them in the MMU notifier move them during CS. v2: still mark pages as accessed/dirty Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: stop using BO status for user pagesChristian König2017-09-12
| | | | | | | | Instead use a counter to figure out if we need to set new pages or not. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: fix userptr put_page handlingChristian König2017-09-12
| | | | | | | | | Move calling put_page into the unpopulate callback. Otherwise we mess up the pages reference count when it is unbound multiple times. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add IOCTL interface for per VM BOs v3Christian König2017-08-31
| | | | | | | | | | | | | | Add the IOCTL interface so that applications can allocate per VM BOs. Still WIP since not all corner cases are tested yet, but this reduces average CS overhead for 10K BOs from 21ms down to 48us. v2: add some extra checks, remove the WIP tag v3: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add automatic per asic settings for gart_sizeAlex Deucher2017-08-29
| | | | | | | | | | We need a larger gart for asics that do not support GPUVM on all engines (e.g., MM) to make sure we have enough space for all gtt buffers in physical mode. Change the default size based on the asic type. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/amdgpu: expose fragment size as module parameter (v2)Roger He2017-08-17
| | | | | | | | | | Allow overrides on the command line. v2: agd: sqaush in spelling fix and bogus default value warning Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Roger He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: cleanup static CSA handlingChristian König2017-08-17
| | | | | | | | Move the CSA bo_va from the VM to the fpriv structure. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: move vram usage tracking into the vram manager v2Christian König2017-08-17
| | | | | | | | | | Looks like a better place for this. v2: use atomic64_t members instead Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: move gtt usage tracking into the gtt manager v2Christian König2017-08-17
| | | | | | | | | | It doesn't make much sense to count those numbers twice. v2: use and atomic64_t instead Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Fix stolen typoKent Russell2017-08-15
| | | | | | | | Change "stollen" to "stolen" Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: use 256 bit buffers for all wb allocations (v2)Alex Deucher2017-08-15
| | | | | | | | | | May waste a bit of memory, but simplifies the interface significantly. v2: convert internal accounting to use 256bit slots Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/sdma4: drop allocation of poll_mem_offsAlex Deucher2017-08-15
| | | | | | | | | We already allocate this as part of the ring structure, use that instead. Cc: Frank Min <Frank.Min@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: make wb 256bit function names consistentAlex Deucher2017-08-15
| | | | | | | Use a lower case b to be consistent with the other wb functions. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: move some defines aroundChristian König2017-08-15
| | | | | | | | | | Move amdgpu_bo and related structures into amdgpu_object.h. Move amdgpu_bo_list structures to the amdgpu_bo_list functions. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: cleanup kptr handlingChristian König2017-08-15
| | | | | | | | Don't keep around the same pointer twice. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/sdma4: Enable sdma poll mem addr on vega10 for SRIOVFrank Min2017-08-15
| | | | | | | | | | | While doing flr on VFs, there is possibility to lost the doorbell writing for sdma, so enable poll mem for sdma, then sdma fw would check the pollmem holding wptr. Signed-off-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: According hardware design revert vce and uvd doorbell assignmentFrank Min2017-08-15
| | | | | | | | | Now uvd doorbell is from 0xf8-0xfb and vce doorbell is from 0xfc-0xff Signed-off-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu:fix gfx fence allocate sizeMonk Liu2017-07-25
| | | | | | | | | | | | | 1, for sriov, we need 8dw for the gfx fence due to CP behaviour 2, cleanup wrong logic in wptr/rptr wb alloc and free Change-Id: Ifbfed17a4621dae57244942ffac7de1743de0294 Signed-off-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Make SDMA phase quantum configurableFelix Kuehling2017-07-14
| | | | | | | | | | | | | | Set a configurable SDMA phase quantum when enabling SDMA context switching. The default value significantly reduces SDMA latency in page table updates when user-mode SDMA queues have concurrent activity, compared to the initial HW setting. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Andres Rodriguez <andres.rodriguez@amd.com> Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com> Acked-by: Chunming Zhou <david1.zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Throttle visible VRAM moves separatelyJohn Brooks2017-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BO move throttling code is designed to allow VRAM to fill quickly if it is relatively empty. However, this does not take into account situations where the visible VRAM is smaller than total VRAM, and total VRAM may not be close to full but the visible VRAM segment is under pressure. In such situations, visible VRAM would experience unrestricted swapping and performance would drop. Add a separate counter specifically for moves involving visible VRAM, and check it before moving BOs there. v2: Only perform calculations for separate counter if visible VRAM is smaller than total VRAM. (Michel Dänzer) v3: [Michel Dänzer] * Use BO's location rather than the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag to determine whether to account a move for visible VRAM in most cases. * Use a single if (adev->mc.visible_vram_size < adev->mc.real_vram_size) { block in amdgpu_cs_get_threshold_for_moves. Fixes: 95844d20ae02 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2)) Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Add vis_vramlimit module parameterJohn Brooks2017-07-14
| | | | | | | | | | | | | Allow specifying a limit on visible VRAM via a module parameter. This is helpful for testing performance under visible VRAM pressure. v2: Add cast to 64-bit (Christian König) Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: change gartsize default to 256MBChristian König2017-07-14
| | | | | | | | Limit the default GART size and save a lot of VRAM. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>