nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message (Collapse)	Author	Age
*	gpu: nvgpu: correct handling of pbdma rc	Debarshi Dutta	2019-09-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current code reads the pbdma_status info after clearing the interrupt. Other interrupts/sleep can cause enough delay between clearing the interrupt and pbdma switching the channel leading to invalid channel/tsg ID. Correct that by reading the pbdma_status info register before clearing of the pbdma interrupt to correctly read the context information before the pbdma can switch out the context. Bug 200533450 Change-Id: Ic2f0682526e00d14ad58f0411472f34388183f2b Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2165047 (cherry-picked from 0ef96e4b1a7979d2bae0e52924e976515cb87400 in dev-main) Reviewed-on: https://git-master.nvidia.com/r/2188861 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: protect recovery with engines_reset_mutex	Debarshi Dutta	2019-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename gr_reset_mutex to engines_reset_mutex and acquire it before initiating recovery. Recovery running in parallel with engine reset is not recommended. On hitting engine reset, h/w drops the ctxsw_status to INVALID in fifo_engine_status register. Also while the engine is held in reset h/w passes busy/idle straight through. fifo_engine_status registers are correct in that there is no context switch outstanding as the CTXSW is aborted when reset is asserted. Use deferred_reset_mutex to protect deferred_reset_pending variable If deferred_reset_pending is true then acquire engines_reset_mutex and call gk20a_fifo_deferred_reset. gk20a_fifo_deferred_reset would also check the value of deferred_reset_pending before initiating reset process Bug 2092051 Bug 2429295 Bug 2484211 Bug 1890287 Change-Id: I47de669a6203e0b2e9a8237ec4e4747339b9837c Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2022373 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry-picked from cb91bf1e13740023903282d1c2271d9154e940ba in dev-main) Reviewed-on: https://git-master.nvidia.com/r/2024901 GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: disable elpg before ctxsw_disable	Debarshi Dutta	2019-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	if fecs is sent stop_ctxsw method, elpg entry/exit cannot happen and may timeout. It could manifest as different error signatures depending on when stop_ctxsw fecs method gets sent with respect to pmu elpg sequence. It could come as pmu halt or abort or maybe ext error too. If ctxsw failed to disable, do not read engine info and just abort tsg. Bug 2092051 Bug 2429295 Bug 2484211 Bug 1890287 Change-Id: I5f3ba07663bcafd3f0083d44c603420b0ccf6945 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2014914 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2018156 GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add cg and pg function	Debarshi Dutta	2019-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new power/clock gating functions that can be called by other units. New clock_gating functions will reside in cg.c under common/power_features/cg unit. New power gating functions will reside in pg.c under common/power_features/pg unit. Use nvgpu_pg_elpg_disable and nvgpu_pg_elpg_enable to disable/enable elpg and also in gr_gk20a_elpg_protected macro to access gr registers. Add cg_pg_lock to make elpg_enabled, elcg_enabled, blcg_enabled and slcg_enabled thread safe. JIRA NVGPU-2014 Change-Id: I00d124c2ee16242c9a3ef82e7620fbb7f1297aff Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2025493 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry-picked from c90585856567a547173a8b207365b3a4a3ccdd57 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2108406 GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: change err to info print if failing eng id is -1	Seema Khowala	2019-05-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For handle_sched_error, change err to info print for failing eng id returned as -1 i.e. FIFO_INVAL_ENGINE_ID as no engine is found busy doing ctxsw. May be ctxsw already finished for the context for which ctxsw timeout intr was triggered. Possible Causes: a) On hitting engine reset, h/w drops the ctxsw_status to INVALID in fifo_engine_status register. Also while the engine is held in reset h/w passes busy/idle straight through. fifo_engine_status registers are correct in that there is no context switch outstanding as the CTXSW is aborted when reset is asserted. This is just a side effect of how gv100 and earlier versions of ctxsw_timeout behave. With gv10b and later, h/w snaps the context at the point of error so that s/w can see the tsg_id which caused the HW timeout. b) If engines are not busy and ctxsw state is valid then intr occurred in the past and if the ctxsw state has moved on to VALID from LOAD or SAVE, it means that whatever timed out eventually finished anyways. The problem with this is that s/w cannot conclude which context caused the problem as maybe more switches occurred before intr is handled. Bug 2092051 Bug 2429295 Bug 2484211 Bug 1890287 Change-Id: Ia79bee6e860fb179ee39024c963671d4f8245227 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2030866 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry-picked from d27f875d2c7839d3b1ec7db80d83594509ff2ea8 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2076126 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: do not do timeout_debug_dump for non fifo_error_idle_timeout	Seema Khowala	2019-05-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Any recovery that goes through gk20a_fifo_recover path e.g. gr error, mmu fault or any recovery that involves engine recovery as well, will still dump the full debug dump. This change will just avoid dumping debug dump for force reset channels and pbdma intr if they do not involve engine recovery. For FIFO_ERROR_IDLE_TIMEOUT error notifiers that involves tsg recovery only, debug_dump will happen only if timeout_debug_dump is set. timeout_debug_dump by default is set to true but can be changed using NVGPU_IOCTL_CHANNEL_SET_TIMEOUT_EX. Bug 2092051 Change-Id: Ibbf3cd2c44c586d9deb9e61ffbf37945b8d9e428 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2033068 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 5222d0ff4f8d31b02267eb8926b5d00835f39508 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2076117 GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add hal to mask/unmask intr during teardown	Seema Khowala	2019-05-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ctxsw timeout error prevents recovery as it can get triggered periodically. Disable ctxsw timeout interrupt to allow recovery. Bug 2092051 Bug 2429295 Bug 2484211 Bug 1890287 Change-Id: I47470e13968d8b26cdaf519b62fd510bc7ea05d9 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2019645 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 68c13e2f0447118d7391807c9b9269749d09a4ec in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2024899 GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: tsg: ensure unbound channel is disabled	Peter Daifuku	2019-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Multiple threads could be unbinding different channels from the same tsg at the same time. At the point where we remove the channel from the tsg's channel list, call disable_channel again, in case another thread had re-enabled the channel after we had disabled it. Bug 200404549 Change-Id: I9abbc08dc11fe1f7a0abada88376c0ef96b56610 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2083337 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Satish Arora <satisha@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: remove gk20a_is_channel_marked_as_tsg	Seema Khowala	2019-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use tsg_gk20a_from_ch to get tsg pointer for tsgid of a channel. For invalid tsgid, tsg pointer will be NULL Bug 2092051 Bug 2429295 Bug 2484211 Change-Id: I82cd6a2dc5fab4acb147202af667ca97a2842a73 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2006722 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 13f37f9c70b9ae2e0d179830cded93a0a6f86494 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2025507 GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: remove code for ch not bound to tsg	Seema Khowala	2019-02-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Remove handling for channels that are no more bound to tsg as channel could be referenceable but no more part of a tsg - Use tsg_gk20a_from_ch to get pointer to tsg for a given channel - Clear unhandled gr interrupts Bug 2429295 JIRA NVGPU-1580 Change-Id: I9da43a2bc9a0282c793b9f301eaf8e8604f91d70 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1972492 (cherry picked from commit 013ca60edd97e7719e389b3048fed9b165277251 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2018262 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Tested-by: Debarshi Dutta <ddutta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: rename has_timedout and make it thread safe	Seema Khowala	2019-02-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently has_timedout variable is protected by wmb at places where it is being set and there is no correspoding rmb whenever has_timedout variable is read. This is prone to errors for concurrent execution. This change is supposed to fix this issue. Rename has_timedout variable of channel struct to ch_timedout. Also to avoid rmb every time ch_timedout is read, ch_timedout_spinlock is added to protect ch_timedout variable for taking care of concurrent execution. Bug 2404865 Bug 2092051 Change-Id: I0bee9f50af0a48720aa8b54cbc3af97ef9f6df00 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1930935 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 1f54ea09e3445d9ca3cf7a69b4967849cc9defc8 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2016975 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: replace input param chid with pointer to channel	Debarshi Dutta	2019-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	preempt_channel needs to use the channel to pass it to other public functions, get access to a tsg etc. This qualifies it to take a pointer to a channel as an input parameter instead of a chid. Increment the channel ref counter using the function gk20a_channel_from_id in functions where we get the chid from the h/w registers directly. Once the prempt_channel function call is done, use a gk20a_channel_put on the referenced channel. Jira NVGPU-1461 Change-Id: I6c87c8104cfcb418d468c8c590087fd4aeabf4bd Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1963200 (cherry picked from commit 9abe9fe062367902ede7721cff55396859f8e4e8 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2013728 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: replace input param chid with pointer to channel	Debarshi Dutta	2019-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gk20a_fifo_recover_channel takes a reference to the channel via its chid before passing the channel pointer to other public functions such as gk20a_channel_abort and gk20a_fifo_error_ch. This qualifies the gk20a_fifo_recover_channel to take a pointer to a channel instead of only chid. Jira NVGPU-1461 Change-Id: I338a12a05e5ccee785a202fea7848db5201a3a39 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1963199 (cherry picked from commit 99acb8011a8627a2433d31e6e0c8ab833ab3317d in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2013727 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: replace input parameter tsgid with pointer to struct tsg_gk20a	Debarshi Dutta	2019-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function gk20a_fifo_recover_tsg has to pass a valid struct tsg to other functions from within. This qualifies it to have a pointer to struct tsg_gk20a as an input parameter. Tsg specific parts of the gk20a_fifo_preempt_timeout_rc are now moved into another function gk20a_fifo_preempt_timeout_rc_tsg that takes a tsg as an input and passes it to gk20a_fifo_recover_tsg. The pointer to a tsg is also used to enumerate channels from within. The function gk20a_fifo_preempt_timeout_rc now contains only channel specific code. Jira NVGPU-1461 Change-Id: Ice0a9921567841fb5586a7e4e010c442ca6cf172 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1961675 (cherry picked from commit e19cea7ab3ef688186222dec940c2396536408ce in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2013726 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: replace input parameter tsgid with pointer to struct tsg_gk20a	Debarshi Dutta	2019-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gv11b_fifo_preempt_tsg needs to access the runlist_id of the tsg as well as pass the tsg pointer to other public functions such as gk20a_fifo_disable_tsg_sched. This qualifies the preempt_tsg to use a pointer to a struct tsg_gk20a instead of just using the tsgid. Jira NVGPU-1461 Change-Id: I01fbd2370b5746c2a597a0351e0301b0f7d25175 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1959068 (cherry picked from commit 1e78d47f15ff050edbb10a88550012178d353288 in rel-32) Reviewed-on: https://git-master.nvidia.com/r/2013725 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: replace tsgid input variable with pointer to a struct tsg_gk20a	Debarshi Dutta	2019-02-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	replace tsgid with a pointer to a struct tsg_gk20a in the function gk20a_fifo_tsg_abort(). gk20a_fifo_tsg_abort needs to enumerate through all the channels within the tsg as well as pass the tsg pointer to other functions, qualifying the need to use a pointer instead as an input parameter. Jira NVGPU-1461 Change-Id: I59cec05d5d778f733d0c3e9ffadf46e74e249080 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1956567 (cherry picked from commit e5bebd880f28fe719c5e01e165fb189e7cafee01 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2013724 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: delete raw chid lookup	Konsta Holtta	2019-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This (dangerous) array lookup with no channel references is now unused. Jira NVGPU-1460 Change-Id: Ic6bdbcf19fc8996bc6ff02a40afe3224bdd5bc27 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1955402 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 4a53854a92e7c841fa3cb58da062fa756ae7b5c7 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2008517 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add safe channel id lookup	Konsta Holtta	2019-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add gk20a_channel_from_id() to retrieve a channel, given a raw channel ID, with a reference taken (or NULL if the channel was dead). This makes it harder to mistakenly use a channel that's dead and thus uncovers bugs sooner. Convert code to use the new lookup when applicable; work remains to convert complex uses where a ref should have been taken but hasn't. The channel ID is also validated against FIFO_INVAL_CHANNEL_ID; NULL is returned for such IDs. This is often useful and does not hurt when unnecessary. However, this does not prevent the case where a channel would be closed and reopened again when someone would hold a stale channel number. In all such conditions the caller should hold a reference already. The only conditions where a channel can be safely looked up by an id and used without taking a ref are when initializing or deinitializing the list of channels. Jira NVGPU-1460 Change-Id: I0a30968d17c1e0784d315a676bbe69c03a73481c Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1955400 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 7df3d587502c2de997dfbe8ea8ddc114d0a0481e in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2008515 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: channel: make chid u32	Philip Elcan	2019-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The chid member of the channel_gk20a struct was being used as a unsigned value. By being declared as an int, it was causing MISRA 10.3 violations for implicit assignment of different types. JIRA NVGPU-647 Change-Id: I7477fad6f0c837cf7ede1dba803158b1dda717af Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1918470 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 1c7bb9b538200a11aa3ef31d72038d8ba820dfca in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2008514 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: make tsgid a consistent type	Philip Elcan	2019-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Different units were declaring tsgid as int or u32. This makes everyone use u32. This change resolves MISRA 10.3 violations for implicit assingment to different types. JIRA NVGPU-647 Change-Id: I78660e737acb0dad76dd538e5dd37f4527cf5acd Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1918469 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit f5cac144a04a3ef83762ecb2e3f405196beffd68 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2008513 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix double handling in timeout	Konsta Holtta	2019-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The context switch timeout works by triggering a hardware timeout at 10 Hz. When handling these, we check whether a channel has actually timed out. Currently the timeout limit can be shorter than the 10 Hz interval which always causes us to recover a channel but would also cause detection of progress if there was any in the interval. Handling both situations at the same time would reuse the channel pointer local to the function after a loop has finished and would cause memory corruption. Fix this by making the two branches mutually exclusive, and move the recover case to happen first because that's how our tests assume things to work. Jira NVGPU-967 Bug 2502074 Change-Id: I26aa0fa7fd80ab42a9a1a93a6cca2cd29c9d3f3f Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1932449 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 8ac9a53d816a3d012a6948a9a96ac6db699c662di in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/1997597 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Bibek Basu <bbasu@nvidia.com> Tested-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: clear all handled fifo interrupts	Seema Khowala	2018-12-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue is that local variable clear_intr is reset if fifo intr handler happens to handle interrupts handled by fifo_error_isr. This fix is to take care of clearing all handled fifo interrupts. Bug 2361571 Change-Id: Ic8fe2294cfb25c58925942750a81c104ec9747de Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1960330 (cherry picked from commit 1195239d1c624e600ec4152374c493e887a90774 in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/1979744 GVS: Gerrit_Virtual_Submit Reviewed-by: Bitan Biswas <bbiswas@nvidia.com> Tested-by: Bitan Biswas <bbiswas@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add MC APIs for reset masks	Terje Bergstrom	2018-09-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add API for querying reset mask corresponding to a unit. The reset masks need to be read from MC HW header, and we do not want all units to access Mc HW headers themselves. JIRA NVGPU-954 Change-Id: I49ebbd891569de634bfc71afcecc8cd2358805c0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1823384 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: move channel header to common	Konsta Holtta	2018-09-05
\| \| \| \| \| \| \| \| \| \| \| \| \|	channel_gk20a is clear from chip specifics and from most dependencies, so move it under the common directory. Jira NVGPU-967 Change-Id: I41f2160b96d4ec84064288ecc22bb360e82352df Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1810578 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Fix mutex MISRA 17.7 violations	Nicolas Benech	2018-09-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MISRA Rule-17.7 requires the return value of all functions to be used. Fix is either to use the return value or change the function to return void. This patch contains fix for calls to nvgpu_mutex_init and improves related error handling. JIRA NVGPU-677 Change-Id: I609fa138520cc7ccfdd5aa0e7fd28c8ca0b3a21c Signed-off-by: Nicolas Benech <nbenech@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1805598 Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gk20a: Fix MISRA 15.6 violations	Srirangan	2018-09-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MISRA Rule-15.6 requires that all if-else blocks be enclosed in braces, including single statement blocks. Fix errors due to single statement if blocks without braces by introducing the braces. JIRA NVGPU-671 Change-Id: Iedac7d50aa2ebd409434eea5fda902b16d9c6fea Signed-off-by: Srirangan <smadhavan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1797695 Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix various MISRA 10.1 bool violations	Scott Long	2018-09-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch corrects a handful of MISRA 10.1 violations involving illegal arithmetic operations (e.g. bitwise OR) on boolean values: * fix to status handling in regops validation code * fix to debugger event handling in gr code * fix to entries_left tracking in runlist construct code * fix to verbose channel dumping and reset tracking in fifo code JIRA NVGPU-650 Change-Id: I3c3d9123b5a0e08fc936d0e63d51de99fc310ade Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1810957 Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: switch gk20a nonstall ops to #defines	Scott Long	2018-08-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix MISRA rule 10.1 violations involving gk20a_nonstall_ops enums by replacing them with with corresponding #defines. Because these values can be used in expressions that require unsigned values (e.g. bitwise OR) we cannot use enums. The g->ce2.isr_nonstall() function was previously returning an int that was a combination of gk20a_nonstall_ops enum bits which led to the violations. JIRA NVGPU-650 Change-Id: I6210aacec8829b3c8d339c5fe3db2f3069c67406 Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1796242 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: MISRA 10.3-Conversions to/from an enum	Amulya	2018-08-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix violations where the conversion is from a non-enum type to enum type or vice-versa. JIRA NVGPU-659 Change-Id: I45f43c907b810cc86b2a4480809d0c6757ed3486 Signed-off-by: Amulya <Amurthyreddy@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1802322 GVS: Gerrit_Virtual_Submit Tested-by: Amulya Murthyreddy <amurthyreddy@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Adeel Raza <araza@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: remove utils.h from gk20a.h	Vinod G	2018-08-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Removed the utils.h include from gk20a.h utils.h is included in those files which make use of the macros in utils.h JIRA NVGPU-1005 Change-Id: Ifb41da58db6ff8682fa6b5dfdd8eda11a751fcac Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1785952 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gk20a: Fix MISRA 15.6 violations	Srirangan	2018-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes errors due to single statement loop bodies without braces, which is part of Rule 15.6 of MISRA. This patch covers in gpu/nvgpu/gk20a/ JIRA NVGPU-989 Change-Id: I2f422e9bc2b03229f4d2c3198613169ce5e7f3ee Signed-off-by: Srirangan <smadhavan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1791019 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gk20a: nvgpu: Remove io.h dependency from gk20a.h	Debarshi Dutta	2018-07-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the current code, gk20a.h includes io.h which gets directly included in a lot of other files. io.h contains methods which uses a struct gk20a as a parameter leading to a circular dependency between io.h and gk20a.h. This can be mitigated by removing io.h from gk20a.h as part of larger effort to moving gk20a.h to nvgpu/gk20a.h JIRA NVGPU-597 Change-Id: I93e504fa9371b88152737b342a75580c65e8f712 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1787316 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: set preempt timeout	Seema Khowala	2018-07-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-For Si platforms, gk20a_get_gr_idle_timeout returns 3000 ms i.e. 3 sec. Currently this time is used for preempt polling and this conflicts with channel timeout if polling times out. Use fifo_eng_timeout_us converted to ms for preempt polling. -In case of preempt timeout, do not issue recovery for si platform. ctxsw timeout will trigger recovery if needed. For non si platforms, issue preempt timeout rc if preempt times out. Bug 2113657 Bug 2064553 Bug 2038366 Bug 2028993 Bug 200426402 Change-Id: I8d9f58be9ac634e94defa92a20fb737bf256d841 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1762076 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: acquire/release runlist_lock during teardown/mmu_fault	Seema Khowala	2018-07-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-Recovery can be called for various types of faults. Acquire runlist_lock for all runlists so that current teardown is done before proceeding to next one. -For legacy chips teardown is done by triggering mmu fault so make sure runlist_locks are acquired during teardown and also during handling mmu fault. -gk20a_fifo_handle_mmu_fault is renamed as gk20a_fifo_handle_mmu_fault_locked -gk20a_fifo_handle_mmu_fault called from gk20a_fifo_teardown_ch_tsg is replaced with gk20a_fifo_handle_mmu_fault_locked -gk20a_fifo_handle_mmu_fault acquires/release runlist_lock for all runlists and calls gk20a_fifo_handle_mmu_fault_locked Bug 2113657 Bug 2064553 Bug 2038366 Bug 2028993 Change-Id: I973d7ddb6924b50bae2d095152867e99c87e780a Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1761197 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Rearrange the static inline code	Vinod G	2018-07-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to avoid the circular dependencies, rearrange the static inline functions from gk20a.h file. Moved gk20a_gr_flush_channel_tlb function to gr_gk20a.c and removed the #include gr_gk20a.h from gk20a.h Added a helper function utils.h to move all generic static inline functions which have no reference to gpu related structures. ptimer related functions are moved to ptimer.h Implementations for as and pmu are moved to corresponding files. JIRA NVGPU-624 Change-Id: I4e956326e773ba037bf3a1696cc4c462085dbbe5 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1781941 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	Revert "Revert: GV11B runlist preemption patches"	Seema Khowala	2018-07-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 0b02c8589dcc507865a8fd398431c45fbda2ba9c. Originally change was reverted as it was making ap_compute test on embedded-qnx-hv e3550-t194 fail. With fixes related to replacing tsg preempt with runlist preempt during teardown, preempt timeout set to 100 ms (earlier this was set to 1000ms for t194 and 3000ms for legacy chips) and not issuing preempt timeout recovery if preempt fails, helped resolve the issue. Bug 200426402 Change-Id: If9a68d028a155075444cc1bdf411057e3388d48e Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1762563 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add error check for init_runlist	Nitin Kumbhar	2018-07-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allocations in init_runlist can fail. Check for such a failure during fifo setup is being done. Bug 1987855 Change-Id: I1771a15ebeac81ab2e3ebc9a75363445a0b6f20d Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1770801 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	Revert: GV11B runlist preemption patches	Alex Waterman	2018-06-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2d397e34a5aafb5feed406a13f3db536eadae5bb. This reverts commit cd6e821cf66837a2c3479e928414007064b9c496. This reverts commit 5cf1eb145fef763f7153e449be60f1a7602e2c81. This reverts commit a8d6f31bde3ccef22ee77023eaff4a62f6f88199. This reverts commit 067ddbc4e4df3f1f756f03e7865c369a46f420aa. This reverts commit 3eede64de058fcb1e39d723dd146bcd5d06c6f43. This reverts commit 1407133b7e1b27a92ee8c116009541904d2ff691. This reverts commit 797dde3e32647df3b616cea67f4defae59d38b3f. Looks like this makes the ap_compute test on embedded-qnx-hv e3550-t194 quite bad. Might also affect ap_resmgr. Signed-off-by: Alex Waterman <alexw@nvidia.com> Change-Id: Ib9f06514d554d1a67993f0f2bd3d180147385e0a Reviewed-on: https://git-master.nvidia.com/r/1761864 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: gv11b: add runlist abort & remove bare channel	Seema Khowala	2018-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-Add support for aborting runlist/s. Aborting runlist/s, will abort all active tsgs and associated active channels within these active tsgs -Bare channels are no longer supported. Remove recovery support for bare channels. In case there are bare channels, recovery will trigger runlist abort Bug 2125776 Bug 2108544 Bug 2105322 Bug 2092051 Bug 2048824 Bug 2043838 Bug 2039587 Bug 2028993 Bug 2029245 Bug 2065990 Bug 1945121 Bug 200401707 Bug 200393631 Bug 200327596 Change-Id: I6bec8a0004508cf65ea128bf641a26bf4c2f236d Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1640567 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: remove timeout_rc_type i/p param	Seema Khowala	2018-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-is_preempt_pending hal does not need timeout_rc_type input param as for volta, reset_eng_bitmask is saved if preempt times out. For legacy chips, recovery triggers mmu fault and mmu fault handler takes care of resetting engines. -For volta, no special input param needed to differentiate between preempt polling during normal scenario and preempt polling during recovery. Recovery path uses preempt_ch_tsg hal to issue preempt. This hal does not issue recovery if preempt times out. Bug 2125776 Bug 2108544 Bug 2105322 Bug 2092051 Bug 2048824 Bug 2043838 Bug 2039587 Bug 2028993 Bug 2029245 Bug 2065990 Bug 1945121 Bug 200401707 Bug 200393631 Bug 200327596 Change-Id: Ie76a18ae0be880cfbeee615859a08179fb974fa8 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1709799 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Mask an unused HCE_ILLEGAL_OP Interrupt	Vinod G	2018-06-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	HCE interrupt is not being used in nvgpu platform now, masking the bit from the interrupt register. bug 2082123 Change-Id: I1d53584afebe57b9621c8f4ec395cd1dcd6c7611 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1746850 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add g->fifo_eng_timeout_us	Thomas Fleury	2018-06-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add g->fifo_eng_timeout_us to define engine timeout in microseconds. It is initialized with GRFIFO_TIMEOUT_CHECK_PERIOD_US. In RM server case, it can be overriden with value defined in device tree. Jira EVLR-2674 Change-Id: I69ac2ce779fe575566c8ba48e8cd2d0e6b2d93cf Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1728391 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: release runlist_lock before issuing recovery	Seema Khowala	2018-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Release runlist_lock before issuing runlist update timeout recovery. Bug 2115080 Change-Id: I22cd0dd8ab6828412fcc98f587e4a5cdce907651 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1722308 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add timeouts_disabled_refcount for enabling timeout	Seema Khowala	2018-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-timeouts will be enabled only when timeouts_disabled_refcount will reach 0 -timeouts_enabled debugfs will change from u32 type to file type to avoid race enabling/disabling timeout from debugfs and ioctl -unify setting timeouts_enabled from debugfs and ioctl Bug 1982434 Change-Id: I54bab778f1ae533872146dfb8d80deafd2a685c7 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1588690 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Code updates for MISRA violations	Vinod G	2018-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Code related to MC module is updated for handling MISRA violations Rule 10.1: Operands shalln't be an inappropriate essential type. Rule 10.3: Value of expression shalln't be assigned to an object with a narrow essential type. Rule 10.4: Both operands in an operator shall have the same essential type. Rule 14.4: Controlling if statement shall have essentially Boolean type. Rule 15.6: Enclose if() sequences with braces. JIRA NVGPU-646 JIRA NVGPU-659 JIRA NVGPU-671 Change-Id: Ia7ada40068eab5c164b8bad99bf8103b37a2fbc9 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1720926 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add NVGPU_IOCTL_CHANNEL_RESCHEDULE_RUNLIST	David Li	2018-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add NVGPU_IOCTL_CHANNEL_RESCHEDULE_RUNLIST ioctl to reschedule runlist, and optionally check host and FECS status to preempt pending load of context not belonging to the calling channel on GR engine during context switch. This should be called immediately after a submit to decrease worst case submit to start latency for high interleave channel. There is less than 0.002% chance that the ioctl blocks up to couple miliseconds due to race condition of FECS status changing while being read. For GV11B it will always preempt pending load of unwanted context since there is no chance that ioctl blocks due to race condition. Also fix bug with host reschedule for multiple runlists which needs to write both runlist registers. Bug 1987640 Bug 1924808 Change-Id: I0b7e2f91bd18b0b20928e5a3311b9426b1bf1848 Signed-off-by: David Li <davli@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1549050 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: runlist_lock released before preempt timeout recovery	Seema Khowala	2018-05-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Release runlist_lock and then initiate recovery if preempt timed out. Also do not issue preempt if ch, tsg or runlist id is invalid. tsgid could be invalid for below call trace gk20a_prepare_poweroff->gk20a_channel_suspend-> _fifo_preempt_channel->_fifo_preempt_tsg Bug 2065990 Bug 2043838 Change-Id: Ia1e3c134f06743e1258254a4a6f7256831706185 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1662656 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add HAL to insert semaphore commands	Deepak Nibade	2018-05-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add below new HALs gops.fifo.add_sema_cmd() to insert HOST semaphore acquire/release methods gops.fifo.get_sema_wait_cmd_size() to get size of acquire command buffer gops.fifo.get_sema_incr_cmd_size() to get size of release command buffer Separate out new API gk20a_fifo_add_sema_cmd() to implement semaphore acquire/ release sequence and set it to gops.fifo.add_sema_cmd() Add gk20a_fifo_get_sema_wait_cmd_size() and gk20a_fifo_get_sema_incr_cmd_size() to return respective command buffer sizes Jira NVGPUT-16 Change-Id: Ia81a50921a6a56ebc237f2f90b137268aaa2d749 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1704490 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: remove access to mc_enable_pb_r()	Deepak Nibade	2018-05-10
\| \| \| \| \| \| \| \| \| \| \| \| \|	We don't need to configure mc_enable_pb_r() register in any of the supported chips, so remove access to this register Jira NVGPUT-52 Change-Id: I8a7a524367ce7953f926143242c6d63bc8fd5ed1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1711245 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Remove gk20a_dbg* functions	Terje Bergstrom	2018-05-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Switch all logging to nvgpu_log(). gk20a_dbg macros are intentionally left there because of use from other repositories. Because the new functions do not work without a pointer to struct gk20a, and piping it just for logging is excessive, some log messages are deleted. Change-Id: I00e22e75fe4596a330bb0282ab4774b3639ee31e Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1704148 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>