path: root/drivers/gpu/nvgpu/gk20a/fifo_gk20a.h
* gpu: nvgpu: protect recovery with engines_reset_mutex (Debarshi Dutta, 2019-05-09)

  Rename gr_reset_mutex to engines_reset_mutex and acquire it before
  initiating recovery. Recovery running in parallel with engine reset is
  not recommended. On hitting engine reset, h/w drops the ctxsw_status to
  INVALID in the fifo_engine_status register. Also, while the engine is
  held in reset, h/w passes busy/idle straight through. The
  fifo_engine_status registers are correct in that there is no context
  switch outstanding, as the CTXSW is aborted when reset is asserted.

  Use deferred_reset_mutex to protect the deferred_reset_pending variable.
  If deferred_reset_pending is true, acquire engines_reset_mutex and call
  gk20a_fifo_deferred_reset. gk20a_fifo_deferred_reset also checks the
  value of deferred_reset_pending before initiating the reset process.

  Bug 2092051
  Bug 2429295
  Bug 2484211
  Bug 1890287

  Change-Id: I47de669a6203e0b2e9a8237ec4e4747339b9837c
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/2022373
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  (cherry-picked from cb91bf1e13740023903282d1c2271d9154e940ba in dev-main)
  Reviewed-on: https://git-master.nvidia.com/r/2024901
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Bibek Basu <bbasu@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

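  A rough sketch of the locking order described above: check
  deferred_reset_pending under deferred_reset_mutex, and only run the
  deferred reset with engines_reset_mutex held so it cannot overlap with
  recovery. The field placement and the gk20a_fifo_deferred_reset()
  signature are assumptions for illustration, not taken from the nvgpu
  source.

  /* Illustrative only; real struct layout and signatures may differ. */
  static void fifo_run_deferred_reset_if_pending(struct gk20a *g,
                                                 struct channel_gk20a *ch)
  {
      struct fifo_gk20a *f = &g->fifo;
      bool pending;

      nvgpu_mutex_acquire(&f->deferred_reset_mutex);
      pending = f->deferred_reset_pending;
      nvgpu_mutex_release(&f->deferred_reset_mutex);

      if (!pending)
          return;

      /* engine reset must never run concurrently with recovery */
      nvgpu_mutex_acquire(&f->engines_reset_mutex);
      /* re-checks deferred_reset_pending internally before resetting */
      gk20a_fifo_deferred_reset(g, ch);
      nvgpu_mutex_release(&f->engines_reset_mutex);
  }
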
* gpu: nvgpu: add hal to mask/unmask intr during teardown (Seema Khowala, 2019-05-02)

  ctxsw timeout error prevents recovery as it can get triggered
  periodically. Disable ctxsw timeout interrupt to allow recovery.

  Bug 2092051
  Bug 2429295
  Bug 2484211
  Bug 1890287

  Change-Id: I47470e13968d8b26cdaf519b62fd510bc7ea05d9
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/2019645
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  (cherry picked from commit 68c13e2f0447118d7391807c9b9269749d09a4ec in dev-kernel)
  Reviewed-on: https://git-master.nvidia.com/r/2024899
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Bibek Basu <bbasu@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: replace input param chid with pointer to channel (Debarshi Dutta, 2019-02-11)

  preempt_channel needs to use the channel to pass it to other public
  functions, get access to a tsg, etc. This qualifies it to take a pointer
  to a channel as an input parameter instead of a chid.

  Increment the channel ref counter using the function
  gk20a_channel_from_id in functions where we get the chid from the h/w
  registers directly. Once the preempt_channel function call is done, use
  gk20a_channel_put on the referenced channel.

  Jira NVGPU-1461

  Change-Id: I6c87c8104cfcb418d468c8c590087fd4aeabf4bd
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1963200
  (cherry picked from commit 9abe9fe062367902ede7721cff55396859f8e4e8 in dev-kernel)
  Reviewed-on: https://git-master.nvidia.com/r/2013728
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Bibek Basu <bbasu@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

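  The lookup/use/put pattern this commit describes can be sketched as
  below. gk20a_channel_from_id() and gk20a_channel_put() are named in the
  commit itself; the preempt HAL signature is assumed for illustration.

  /* Translate a chid read from h/w into a referenced channel pointer. */
  static int preempt_chid_sketch(struct gk20a *g, u32 chid)
  {
      struct channel_gk20a *ch;
      int err;

      ch = gk20a_channel_from_id(g, chid);    /* takes a channel reference */
      if (ch == NULL)
          return 0;    /* channel already gone, nothing to preempt */

      err = g->ops.fifo.preempt_channel(g, ch);

      gk20a_channel_put(ch);                  /* drop the reference */
      return err;
  }
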
* gpu: nvgpu: replace input param chid with pointer to channel (Debarshi Dutta, 2019-02-11)

  gk20a_fifo_recover_channel takes a reference to the channel via its chid
  before passing the channel pointer to other public functions such as
  gk20a_channel_abort and gk20a_fifo_error_ch. This qualifies
  gk20a_fifo_recover_channel to take a pointer to a channel instead of
  only a chid.

  Jira NVGPU-1461

  Change-Id: I338a12a05e5ccee785a202fea7848db5201a3a39
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1963199
  (cherry picked from commit 99acb8011a8627a2433d31e6e0c8ab833ab3317d in dev-kernel)
  Reviewed-on: https://git-master.nvidia.com/r/2013727
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Bibek Basu <bbasu@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: replace input parameter tsgid with pointer to struct tsg_gk20a (Debarshi Dutta, 2019-02-11)

  The function gk20a_fifo_recover_tsg has to pass a valid struct tsg to
  other functions from within. This qualifies it to have a pointer to
  struct tsg_gk20a as an input parameter.

  Tsg specific parts of the gk20a_fifo_preempt_timeout_rc are now moved
  into another function gk20a_fifo_preempt_timeout_rc_tsg that takes a tsg
  as an input and passes it to gk20a_fifo_recover_tsg. The pointer to a
  tsg is also used to enumerate channels from within. The function
  gk20a_fifo_preempt_timeout_rc now contains only channel specific code.

  Jira NVGPU-1461

  Change-Id: Ice0a9921567841fb5586a7e4e010c442ca6cf172
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1961675
  (cherry picked from commit e19cea7ab3ef688186222dec940c2396536408ce in dev-kernel)
  Reviewed-on: https://git-master.nvidia.com/r/2013726
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Bibek Basu <bbasu@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: replace input parameter tsgid with pointer to struct tsg_gk20a (Debarshi Dutta, 2019-02-11)

  gv11b_fifo_preempt_tsg needs to access the runlist_id of the tsg as well
  as pass the tsg pointer to other public functions such as
  gk20a_fifo_disable_tsg_sched. This qualifies preempt_tsg to use a
  pointer to a struct tsg_gk20a instead of just using the tsgid.

  Jira NVGPU-1461

  Change-Id: I01fbd2370b5746c2a597a0351e0301b0f7d25175
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1959068
  (cherry picked from commit 1e78d47f15ff050edbb10a88550012178d353288 in rel-32)
  Reviewed-on: https://git-master.nvidia.com/r/2013725
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Bibek Basu <bbasu@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: replace tsgid input variable with pointer to a struct tsg_gk20a (Debarshi Dutta, 2019-02-11)

  Replace tsgid with a pointer to a struct tsg_gk20a in the function
  gk20a_fifo_tsg_abort(). gk20a_fifo_tsg_abort needs to enumerate through
  all the channels within the tsg as well as pass the tsg pointer to other
  functions, qualifying the need to use a pointer as an input parameter
  instead.

  Jira NVGPU-1461

  Change-Id: I59cec05d5d778f733d0c3e9ffadf46e74e249080
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1956567
  (cherry picked from commit e5bebd880f28fe719c5e01e165fb189e7cafee01 in dev-kernel)
  Reviewed-on: https://git-master.nvidia.com/r/2013724
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Bibek Basu <bbasu@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: delete raw chid lookup (Konsta Holtta, 2019-02-05)

  This (dangerous) array lookup with no channel references is now unused.

  Jira NVGPU-1460

  Change-Id: Ic6bdbcf19fc8996bc6ff02a40afe3224bdd5bc27
  Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1955402
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  (cherry picked from commit 4a53854a92e7c841fa3cb58da062fa756ae7b5c7 in dev-kernel)
  Reviewed-on: https://git-master.nvidia.com/r/2008517
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: move channel header to common (Konsta Holtta, 2018-09-05)

  channel_gk20a is clear from chip specifics and from most dependencies,
  so move it under the common directory.

  Jira NVGPU-967

  Change-Id: I41f2160b96d4ec84064288ecc22bb360e82352df
  Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1810578
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: clean up channel header includes (Konsta Holtta, 2018-08-24)

  Remove a few unnecessary includes from channel_gk20a.h and add them to
  c files where needed.

  Jira NVGPU-967

  Change-Id: Ic38132c776a56b6966424806faab7871575b6c10
  Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1804609
  Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: switch gk20a nonstall ops to #defines (Scott Long, 2018-08-22)

  Fix MISRA rule 10.1 violations involving gk20a_nonstall_ops enums by
  replacing them with corresponding #defines. Because these values can be
  used in expressions that require unsigned values (e.g. bitwise OR) we
  cannot use enums.

  The g->ce2.isr_nonstall() function was previously returning an int that
  was a combination of gk20a_nonstall_ops enum bits, which led to the
  violations.

  JIRA NVGPU-650

  Change-Id: I6210aacec8829b3c8d339c5fe3db2f3069c67406
  Signed-off-by: Scott Long <scottl@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1796242
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

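  A standalone illustration of the point: OR-combining enumerators
  produces a value that is not a member of the enum, which MISRA 10.1
  objects to, while plain unsigned macros compose cleanly. The macro names
  and bit values below are illustrative, not copied from the nvgpu header.

  #include <stdio.h>

  #define NONSTALL_OPS_WAKEUP_SEMAPHORE (1U << 0)
  #define NONSTALL_OPS_POST_EVENTS      (1U << 1)

  /* returns a plain unsigned mask of requested non-stall operations */
  static unsigned int isr_nonstall_sketch(void)
  {
      return NONSTALL_OPS_WAKEUP_SEMAPHORE | NONSTALL_OPS_POST_EVENTS;
  }

  int main(void)
  {
      printf("ops mask = 0x%x\n", isr_nonstall_sketch());
      return 0;
  }
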
* gpu: nvgpu: MISRA 10.3 - Conversions to/from an enum (Amulya, 2018-08-21)

  Fix violations where the conversion is from a non-enum type to an enum
  type or vice-versa.

  JIRA NVGPU-659

  Change-Id: I45f43c907b810cc86b2a4480809d0c6757ed3486
  Signed-off-by: Amulya <Amurthyreddy@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1802322
  GVS: Gerrit_Virtual_Submit
  Tested-by: Amulya Murthyreddy <amurthyreddy@nvidia.com>
  Reviewed-by: Alex Waterman <alexw@nvidia.com>
  Reviewed-by: Adeel Raza <araza@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* Revert "Revert: GV11B runlist preemption patches" (Seema Khowala, 2018-07-19)

  This reverts commit 0b02c8589dcc507865a8fd398431c45fbda2ba9c.

  The change was originally reverted because it was making the ap_compute
  test on embedded-qnx-hv e3550-t194 fail. Fixes related to replacing tsg
  preempt with runlist preempt during teardown, setting the preempt
  timeout to 100 ms (earlier this was set to 1000 ms for t194 and 3000 ms
  for legacy chips), and not issuing preempt timeout recovery if preempt
  fails helped resolve the issue.

  Bug 200426402

  Change-Id: If9a68d028a155075444cc1bdf411057e3388d48e
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1762563
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* Revert: GV11B runlist preemption patches (Alex Waterman, 2018-06-26)

  This reverts commit 2d397e34a5aafb5feed406a13f3db536eadae5bb.
  This reverts commit cd6e821cf66837a2c3479e928414007064b9c496.
  This reverts commit 5cf1eb145fef763f7153e449be60f1a7602e2c81.
  This reverts commit a8d6f31bde3ccef22ee77023eaff4a62f6f88199.
  This reverts commit 067ddbc4e4df3f1f756f03e7865c369a46f420aa.
  This reverts commit 3eede64de058fcb1e39d723dd146bcd5d06c6f43.
  This reverts commit 1407133b7e1b27a92ee8c116009541904d2ff691.
  This reverts commit 797dde3e32647df3b616cea67f4defae59d38b3f.

  Looks like this makes the ap_compute test on embedded-qnx-hv e3550-t194
  quite bad. Might also affect ap_resmgr.

  Signed-off-by: Alex Waterman <alexw@nvidia.com>
  Change-Id: Ib9f06514d554d1a67993f0f2bd3d180147385e0a
  Reviewed-on: https://git-master.nvidia.com/r/1761864
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit

* gpu: nvgpu: gv11b: add runlist abort & remove bare channel (Seema Khowala, 2018-06-24)

  - Add support for aborting runlist(s). Aborting runlists will abort all
    active tsgs and the associated active channels within these active
    tsgs.
  - Bare channels are no longer supported. Remove recovery support for
    bare channels. In case there are bare channels, recovery will trigger
    a runlist abort.

  Bug 2125776
  Bug 2108544
  Bug 2105322
  Bug 2092051
  Bug 2048824
  Bug 2043838
  Bug 2039587
  Bug 2028993
  Bug 2029245
  Bug 2065990
  Bug 1945121
  Bug 200401707
  Bug 200393631
  Bug 200327596

  Change-Id: I6bec8a0004508cf65ea128bf641a26bf4c2f236d
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1640567
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: remove timeout_rc_type i/p param (Seema Khowala, 2018-06-24)

  - The is_preempt_pending hal does not need the timeout_rc_type input
    param, as for volta the reset_eng_bitmask is saved if preempt times
    out. For legacy chips, recovery triggers an mmu fault and the mmu
    fault handler takes care of resetting engines.
  - For volta, no special input param is needed to differentiate between
    preempt polling during a normal scenario and preempt polling during
    recovery. The recovery path uses the preempt_ch_tsg hal to issue
    preempt; this hal does not issue recovery if preempt times out.

  Bug 2125776
  Bug 2108544
  Bug 2105322
  Bug 2092051
  Bug 2048824
  Bug 2043838
  Bug 2039587
  Bug 2028993
  Bug 2029245
  Bug 2065990
  Bug 1945121
  Bug 200401707
  Bug 200393631
  Bug 200327596

  Change-Id: Ie76a18ae0be880cfbeee615859a08179fb974fa8
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1709799
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: abstract submit profiling (Konsta Holtta, 2018-05-25)

  Add gk20a_fifo_profile_snapshot() to store the submit time in a
  profiling entry that was acquired from gk20a_fifo_profile_acquire().
  Also get rid of ifdef CONFIG_DEBUG_FS by stubbing the acquire and free
  functions when debugfs is not enabled. This reduces some cyclomatic
  complexity in the submit path.

  Jira NVGPU-708

  Change-Id: I39829a6475cfe3aa582620219e420bde62228e52
  Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1729545
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: Code updates for MISRA violations (Vinod G, 2018-05-18)

  Code related to the MC module is updated to handle MISRA violations:

  Rule 10.1: Operands shall not be of an inappropriate essential type.
  Rule 10.3: The value of an expression shall not be assigned to an
             object with a narrower essential type.
  Rule 10.4: Both operands of an operator shall have the same essential
             type.
  Rule 14.4: The controlling expression of an if statement shall have
             essentially Boolean type.
  Rule 15.6: Enclose if() sequences with braces.

  JIRA NVGPU-646
  JIRA NVGPU-659
  JIRA NVGPU-671

  Change-Id: Ia7ada40068eab5c164b8bad99bf8103b37a2fbc9
  Signed-off-by: Vinod G <vinodg@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1720926
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

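  For reference, the kind of mechanical change rules 14.4 and 15.6 force
  looks like the snippet below; it is a generic illustration, not the
  actual MC code.

  #include <stdbool.h>

  /*
   * Before (violates 14.4 and 15.6):
   *     if (mc_intr & engine_mask)
   *         pending = 1;
   * After: essentially-Boolean controlling expression, braces added.
   */
  static bool mc_intr_is_pending(unsigned int mc_intr,
                                 unsigned int engine_mask)
  {
      bool pending = false;

      if ((mc_intr & engine_mask) != 0U) {
          pending = true;
      }
      return pending;
  }
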
* gpu: nvgpu: add NVGPU_IOCTL_CHANNEL_RESCHEDULE_RUNLIST (David Li, 2018-05-18)

  Add the NVGPU_IOCTL_CHANNEL_RESCHEDULE_RUNLIST ioctl to reschedule the
  runlist, and optionally check host and FECS status to preempt a pending
  load of a context not belonging to the calling channel on the GR engine
  during context switch. This should be called immediately after a submit
  to decrease worst-case submit-to-start latency for a high-interleave
  channel.

  There is less than a 0.002% chance that the ioctl blocks for up to a
  couple of milliseconds due to a race condition where the FECS status
  changes while being read. For GV11B it will always preempt a pending
  load of an unwanted context, since there is no chance that the ioctl
  blocks due to the race condition.

  Also fix a bug with host reschedule for multiple runlists, which needs
  to write both runlist registers.

  Bug 1987640
  Bug 1924808

  Change-Id: I0b7e2f91bd18b0b20928e5a3311b9426b1bf1848
  Signed-off-by: David Li <davli@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1549050
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: runlist_lock released before preempt timeout recovery (Seema Khowala, 2018-05-17)

  Release the runlist_lock and then initiate recovery if preempt timed
  out. Also do not issue preempt if the ch, tsg or runlist id is invalid.
  tsgid could be invalid for the below call trace:

    gk20a_prepare_poweroff -> gk20a_channel_suspend ->
    *_fifo_preempt_channel -> *_fifo_preempt_tsg

  Bug 2065990
  Bug 2043838

  Change-Id: Ia1e3c134f06743e1258254a4a6f7256831706185
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1662656
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: add HAL to insert semaphore commands (Deepak Nibade, 2018-05-16)

  Add the below new HALs:
  - gops.fifo.add_sema_cmd() to insert HOST semaphore acquire/release
    methods
  - gops.fifo.get_sema_wait_cmd_size() to get the size of the acquire
    command buffer
  - gops.fifo.get_sema_incr_cmd_size() to get the size of the release
    command buffer

  Separate out a new API gk20a_fifo_add_sema_cmd() to implement the
  semaphore acquire/release sequence and set it to gops.fifo.add_sema_cmd().

  Add gk20a_fifo_get_sema_wait_cmd_size() and
  gk20a_fifo_get_sema_incr_cmd_size() to return the respective command
  buffer sizes.

  Jira NVGPUT-16

  Change-Id: Ia81a50921a6a56ebc237f2f90b137268aaa2d749
  Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1704490
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

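  In a chip's HAL init code the three new ops would be wired up roughly as
  below; the enclosing function is illustrative, while the op and function
  names come from the commit message.

  static void sketch_init_fifo_sema_ops(struct gpu_ops *gops)
  {
      gops->fifo.add_sema_cmd = gk20a_fifo_add_sema_cmd;
      gops->fifo.get_sema_wait_cmd_size = gk20a_fifo_get_sema_wait_cmd_size;
      gops->fifo.get_sema_incr_cmd_size = gk20a_fifo_get_sema_incr_cmd_size;
  }
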
* gpu: nvgpu: add HALs to mmu fault descriptors (Vinod G, 2018-05-04)

  The mmu fault information for client and gpc differs across chips. Add
  a separate table for each chip based on that change, and add hal
  functions to access those descriptors.

  Bug 2050564

  Change-Id: If15a4757762569d60d4ce1a6a47b8c9a93c11cb0
  Signed-off-by: Vinod G <vinodg@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1704105
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: add rc_type i/p param to gk20a_fifo_recover (Seema Khowala, 2018-05-04)

  Add below rc_types to be passed to gk20a_fifo_recover:
    MMU_FAULT
    PBDMA_FAULT
    GR_FAULT
    PREEMPT_TIMEOUT
    CTXSW_TIMEOUT
    RUNLIST_UPDATE_TIMEOUT
    FORCE_RESET
    SCHED_ERR

  This is nice to have to know what triggered recovery.

  Bug 2065990

  Change-Id: I202268c5f237be2180b438e8ba027fce684967b6
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1662619
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: rename mutex to runlist_lock (Seema Khowala, 2018-05-04)

  Rename mutex to runlist_lock in fifo_runlist_info_gk20a struct. This is
  good to have for code readability.

  Bug 2065990
  Bug 2043838

  Change-Id: I716685e3fad538458181d2a9fe592410401862b9
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1662587
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: add HALs to submit and wait for runlist (Deepak Nibade, 2018-04-24)

  Add the below two new HALs:
  - gops.fifo.runlist_hw_submit() to submit a new runlist to hardware
  - gops.fifo.runlist_wait_pending() to wait until the runlist write is
    successful

  Set the existing API gk20a_fifo_runlist_wait_pending() to the
  gops.fifo.runlist_wait_pending HAL.

  Add a new API gk20a_fifo_runlist_hw_submit() which submits the runlist
  to h/w, and set it to the gops.fifo.runlist_hw_submit HAL.

  Jira NVGPUT-20

  Change-Id: Ic23f7d947e30883aca0b536de818e79e14733195
  Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1700548
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

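  The split between the two new ops can be pictured as "kick the new
  runlist buffer, then poll for completion". The wrapper below is a
  sketch; the argument lists are assumptions, only the op names are taken
  from the commit message.

  static int sketch_submit_runlist(struct gk20a *g, u32 runlist_id,
                                   u32 count, u32 buffer_index)
  {
      /* write the new runlist base/length to h/w */
      g->ops.fifo.runlist_hw_submit(g, runlist_id, count, buffer_index);

      /* poll until h/w reports the runlist update has landed */
      return g->ops.fifo.runlist_wait_pending(g, runlist_id);
  }
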
* gpu: nvgpu: add gops.fifo.setup_sw (Sourab Gupta, 2018-03-29)

  The bar1/userd setup is different for the RM server. Created a common
  function gk20a_init_fifo_setup_sw_common.

  Jira VQRM-3058

  Change-Id: I655b54e21ed5f15dcb8e7b01bd9cd129b35ae7a3
  Signed-off-by: Richard Zhao <rizhao@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1665691
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: make fifo/ch functions called by RM Server global (Sourab Gupta, 2018-03-29)

  The patch declares globally a few channel/fifo HAL functions required
  for QNX code compilation (as they are being referred to elsewhere in QNX
  code). This is required as a part of bringing the nvgpu channel/FIFO HAL
  into QNX.

  Jira VQRM-3058

  Change-Id: Ia176535b64de981d2f7ddb20f62015a0da74fd2a
  Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1662411
  GVS: Gerrit_Virtual_Submit
  Tested-by: Richard Zhao <rizhao@nvidia.com>
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: hal for syncpt_incr_per_release (seshendra Gadagottu, 2018-03-12)

  Create a hal to indicate syncpt increments per release. Legacy chips use
  2 syncpt increments per release, and gv1xx onwards uses 1 syncpt
  increment per release.

  Bug 2066025

  Change-Id: I5d6d0a5368ef561f8150fbb7120181f49f6e338b
  Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1669817
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: Fold T19x code back to main code paths (Terje Bergstrom, 2018-01-23)

  Lots of code paths were split to T19x specific code paths and structs
  due to the split repository. Now that the repositories are merged, fold
  all of them back to the main code paths and structs, and remove the T19x
  specific Kconfig flag.

  Change-Id: Id0d17a5f0610fc0b49f51ab6664e716dc8b222b6
  Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1640606
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: runlist info mutex not needed for runlist_state (Seema Khowala, 2018-01-11)

  The runlist_info mutex does not need to be acquired for the runlist
  being enabled or disabled in fifo_sched_disable_r.

  Bug 2043838

  Change-Id: Ia9839ab7effbe7daf353c3a54f25a2b4914af5e8
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1630345
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: Remove bare channel scheduling (Terje Bergstrom, 2018-01-02)

  Remove scheduling IOCTL implementations for bare channels. Also removes
  code that constructs bare channels in runlist.

  Bug 1842197

  Change-Id: I6e833b38e24a2f2c45c7993edf939d365eaf41f0
  Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1627326
  Reviewed-by: Automatic_Commit_Validation_User
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: deprecate TSG/CHANNEL_SET_PRIORITY IOCTLs (Deepak Nibade, 2017-11-15)

  The TSG/CHANNEL_SET_PRIORITY IOCTLs are deprecated, and user space
  should be using a combination of timeslice and interleave levels to
  decide the priority. Hence remove the IOCTLs and all corresponding APIs.

  Jira NVGPU-393

  Change-Id: I7cf0785689269536eca0c278c774b0e9e74f8c2f
  Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1598581
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: add rc input param to gk20a_fifo_handle_pbdma_intr (Seema Khowala, 2017-11-15)

  Add a new parameter rc to gk20a_fifo_handle_pbdma_intr so that it can be
  called to handle a pbdma intr without doing teardown. This is needed for
  t19x during polling of pbdma preempt.

  Bug 200277163
  Bug 1945121

  Change-Id: Ide0d3b6ed8c0862cb5332d112926b6933abd0815
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1584734
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: get intr mask for an active_engine_id (Seema Khowala, 2017-11-15)

  This is needed for t19x during engine preempt-done polling. For example,
  a copy engine (CE) stall interrupt should not prevent GR from finishing
  preemption. In order to check if the current stall interrupt is valid
  for the engine being polled for preemption completion, a function that
  provides the engine intr mask is needed. With this, polling code can
  make sure there are no stall interrupts pending for the engine being
  polled for preemption done. If stall interrupts are pending for an
  engine, preemption will never finish.

  Bug 200277163
  Bug 1945121

  Change-Id: Ie1ccac52c3e8d453a49084e195f2e7eaafb8f057
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1584065
  Reviewed-by: Automatic_Commit_Validation_User
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

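  The check the polling code needs reduces to "is any pending stall
  interrupt covered by this engine's interrupt mask". A minimal sketch,
  assuming the caller has already read the pending stall interrupts and
  fetched the per-engine mask via the new HAL:

  /* true if preempt-done polling for this engine cannot succeed */
  static bool preempt_blocked_by_stall_intr(u32 pending_stall_intr,
                                            u32 eng_intr_mask)
  {
      return (pending_stall_intr & eng_intr_mask) != 0U;
  }
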
* gpu: nvgpu: define runlist level in common code (Deepak Nibade, 2017-11-08)

  All the runlist levels NVGPU_RUNLIST_INTERLEAVE_LEVEL_* are declared in
  a linux specific uapi header and used in common code. But since common
  code should be linux-independent, move these uses out of common code.

  Define new runlist levels NVGPU_FIFO_RUNLIST_INTERLEAVE_LEVEL_* in
  common code and use them wherever required.

  Add a new API nvgpu_get_common_runlist_level() to get the common runlist
  level of the form NVGPU_FIFO_RUNLIST_INTERLEAVE_LEVEL_* from the linux
  specific runlist level of the form NVGPU_RUNLIST_INTERLEAVE_LEVEL_*.

  Jira NVGPU-259

  Change-Id: Ic19239f0f8275683d5d1b981df530acd90e6dfbb
  Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1594327
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

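  nvgpu_get_common_runlist_level() is essentially a uapi-to-common-code
  translation. A sketch of such a mapping is shown below; the uapi names
  exist in the ioctl header, while the common-code values here are assumed
  for illustration.

  #define NVGPU_FIFO_RUNLIST_INTERLEAVE_LEVEL_LOW    0U
  #define NVGPU_FIFO_RUNLIST_INTERLEAVE_LEVEL_MEDIUM 1U
  #define NVGPU_FIFO_RUNLIST_INTERLEAVE_LEVEL_HIGH   2U

  static u32 sketch_get_common_runlist_level(u32 uapi_level)
  {
      switch (uapi_level) {
      case NVGPU_RUNLIST_INTERLEAVE_LEVEL_MEDIUM:
          return NVGPU_FIFO_RUNLIST_INTERLEAVE_LEVEL_MEDIUM;
      case NVGPU_RUNLIST_INTERLEAVE_LEVEL_HIGH:
          return NVGPU_FIFO_RUNLIST_INTERLEAVE_LEVEL_HIGH;
      case NVGPU_RUNLIST_INTERLEAVE_LEVEL_LOW:
      default:
          return NVGPU_FIFO_RUNLIST_INTERLEAVE_LEVEL_LOW;
      }
  }
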
* gpu: nvgpu: Cleanup generic MM code in gk20a/mm_gk20a.c (Alex Waterman, 2017-10-24)

  Move much of the remaining generic MM code to a new common location:
  common/mm/mm.c. Also add a corresponding <nvgpu/mm.h> header. This
  mostly consists of init and cleanup code to handle the common MM data
  structures like the VIDMEM code, address spaces for various engines,
  etc.

  A few more in-depth changes were made as well.

  1. alloc_inst_block() has been added to the MM HAL. This used to be
     defined directly in the gk20a code but it used a register. As a
     result, if this register hypothetically changes in the future, it
     would need to become a HAL anyway. This patch preempts that and for
     now just defines all HALs to use the gk20a version.

  2. Rename as much as possible: global functions are, for the most part,
     prepended with nvgpu (there are a few exceptions which I have yet to
     decide what to do with). Functions that are static are renamed to be
     as consistent with their functionality as possible since in some
     cases function effect and function name have diverged.

  JIRA NVGPU-30

  Change-Id: Ic948f1ecc2f7976eba4bb7169a44b7226bb7c0b5
  Signed-off-by: Alex Waterman <alexw@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1574499
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: Move function to query rl interleave (Terje Bergstrom, 2017-10-23)

  The function to query the interleave name depends on an IOCTL flag
  definition. Move that code to fifo_gk20a.c to remove the Linux
  dependency in the header.

  Change-Id: I6d6a80e550bf30973b2be09febc2347890b77d25
  Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1577249
  Reviewed-by: Automatic_Commit_Validation_User
  Reviewed-by: Alex Waterman <alexw@nvidia.com>
  GVS: Gerrit_Virtual_Submit

* gpu: nvgpu: verify channel status while closing per-platform (Deepak Nibade, 2017-10-04)

  We currently call gk20a_fifo_tsg_unbind_channel_verify_status() to
  verify channel status while unbinding a channel from a TSG during close.
  Add support to do this verification per-platform, and keep it disabled
  for vgpu platforms.

  Bug 200327095

  Change-Id: I19fab41c74d10d528d22bd9b3982a4ed73c3b4ca
  Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1572368
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: Change license for common files to MIT (Terje Bergstrom, 2017-09-26)

  Change license of OS independent source code files to MIT.

  JIRA NVGPU-218

  Change-Id: I1474065f4b552112786974a16cdf076c5179540e
  Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1565880
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: fix channel unbind sequence from TSG (Deepak Nibade, 2017-09-15)

  We currently remove a channel from the TSG list and disable all the
  channels in the TSG while removing a channel from the TSG. With this
  sequence, if any one channel in the TSG is closed, the rest of the
  channels are set as timed out and cannot be used anymore.

  Fix this sequence as below to allow removing a channel from an active
  TSG so that the rest of the channels can still be used:
  - disable all channels of the TSG
  - preempt the TSG
  - check if CTX_RELOAD is set, if support is available; if CTX_RELOAD is
    set on the channel, it should be moved to some other channel
  - check if FAULTED is set, if support is available
  - if NEXT is set on the channel then it means the channel is still
    active; print out an error in this case for the time being until it
    is properly handled
  - remove the channel from the runlist
  - remove the channel from the TSG list
  - re-enable the rest of the channels in the TSG
  - clean up the channel (same as for regular channels)

  Add below fifo operations to support checking channel status:
    g->ops.fifo.tsg_verify_status_ctx_reload
    g->ops.fifo.tsg_verify_status_faulted

  Define the ops.fifo.tsg_verify_status_ctx_reload operation for
  gm20b/gp10b/gp106 as gm20b_fifo_tsg_verify_status_ctx_reload(). This API
  checks if the channel to be released has CTX_RELOAD set; if yes,
  CTX_RELOAD needs to be moved to some other channel in the TSG.

  Remove static from channel_gk20a_update_runlist() and export it.

  Bug 200327095

  Change-Id: I0dd4be7c7e0b9b759389ec12c5a148a4b919d3e2
  Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1560637
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>

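  A condensed sketch of the unbind flow listed above. Only the two
  tsg_verify_status_* ops and channel_gk20a_update_runlist() are named in
  the commit; the other helpers and all signatures are placeholders, and
  locking, error handling and list manipulation are omitted.

  static void sketch_disable_all_tsg_channels(struct tsg_gk20a *tsg);
  static void sketch_enable_all_tsg_channels(struct tsg_gk20a *tsg);
  static void sketch_preempt_tsg(struct gk20a *g, struct tsg_gk20a *tsg);
  static void sketch_remove_from_tsg_list(struct tsg_gk20a *tsg,
                                          struct channel_gk20a *ch);

  static int sketch_tsg_unbind_channel(struct tsg_gk20a *tsg,
                                       struct channel_gk20a *ch)
  {
      struct gk20a *g = ch->g;

      sketch_disable_all_tsg_channels(tsg);        /* step 1 */
      sketch_preempt_tsg(g, tsg);                  /* step 2 */

      /* steps 3-4: move CTX_RELOAD / check FAULTED where supported */
      if (g->ops.fifo.tsg_verify_status_ctx_reload != NULL)
          g->ops.fifo.tsg_verify_status_ctx_reload(ch);
      if (g->ops.fifo.tsg_verify_status_faulted != NULL)
          g->ops.fifo.tsg_verify_status_faulted(ch);

      channel_gk20a_update_runlist(ch, false);     /* drop from runlist */
      sketch_remove_from_tsg_list(tsg, ch);        /* drop from TSG list */
      sketch_enable_all_tsg_channels(tsg);         /* re-enable the rest */
      return 0;
  }
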
* gpu: nvgpu: fix TSG enable sequence (Deepak Nibade, 2017-09-15)

  Due to a h/w bug in Maxwell and Pascal, we first need to enable all
  channels with NEXT and CTX_RELOAD set in a TSG, and then the rest of the
  channels should be enabled. Add this sequence to gk20a_tsg_enable().

  Add new APIs to enable/disable scheduling of a TSG runlist:
    gk20a_fifo_enable_tsg_sched()
    gk20a_fifo_disable_tsg_sched()

  Add new APIs to check if a channel has NEXT or CTX_RELOAD set:
    gk20a_fifo_channel_status_is_next()
    gk20a_fifo_channel_status_is_ctx_reload()

  Bug 1739362

  Change-Id: I4891cbd7f22ebc1e0bf32c52801002cdc259dbe1
  Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1560636
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>

* gpu: nvgpu: add NVGPU_SUBMIT_GPFIFO_FLAGS_RESCHEDULE_RUNLIST (David Li, 2017-08-31)

  NVGPU_SUBMIT_GPFIFO_FLAGS_RESCHEDULE_RUNLIST causes host to expire the
  current timeslice and reschedule from the front of the runlist. This can
  be used with NVGPU_RUNLIST_INTERLEAVE_LEVEL_HIGH to make a channel start
  sooner after submit, rather than waiting for natural timeslice
  expiration or block/finish of the currently running channel.

  Bug 1968813

  Change-Id: I632e87c5f583a09ec8bf521dc73f595150abebb0
  Signed-off-by: David Li <davli@nvidia.com>
  Reviewed-on: http://git-master/r/#/c/1537198
  Reviewed-on: https://git-master.nvidia.com/r/1537198
  Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>

* gpu: nvgpu: Replace kref for refcounting in nvgpu (Debarshi Dutta, 2017-08-24)

  - added wrapper struct nvgpu_ref over nvgpu_atomic_t
  - added nvgpu_ref_* APIs to access the above struct

  JIRA NVGPU-140

  Change-Id: Id47f897995dd4721751f7610b6d4d4fbfe4d6b9a
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1540899
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
  Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>

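  The wrapper follows the usual kref pattern: an atomic count plus
  get/put, with a release callback invoked when the last reference is
  dropped. A standalone illustration of that pattern over C11 atomics (not
  the actual nvgpu definitions):

  #include <stdatomic.h>
  #include <stddef.h>

  struct ref_sketch {
      atomic_int refcount;
  };

  static inline void ref_sketch_init(struct ref_sketch *r)
  {
      atomic_store(&r->refcount, 1);
  }

  static inline void ref_sketch_get(struct ref_sketch *r)
  {
      atomic_fetch_add(&r->refcount, 1);
  }

  /* calls release() when the last reference is dropped */
  static inline void ref_sketch_put(struct ref_sketch *r,
                                    void (*release)(struct ref_sketch *r))
  {
      if (atomic_fetch_sub(&r->refcount, 1) == 1 && release != NULL)
          release(r);
  }
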
* gpu: nvgpu: Add wrapper over atomic_t and atomic64_t (Debarshi Dutta, 2017-08-17)

  - added wrapper structs nvgpu_atomic_t and nvgpu_atomic64_t over
    atomic_t and atomic64_t
  - added nvgpu_atomic_* and nvgpu_atomic64_* APIs to access the above
    wrappers

  JIRA NVGPU-121

  Change-Id: I61667bb0a84c2fc475365abb79bffb42b8b4786a
  Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
  Reviewed-on: https://git-master.nvidia.com/r/1533044
  Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
  Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
  GVS: Gerrit_Virtual_Submit

* gpu: nvgpu: Remove gk20a support (Terje Bergstrom, 2017-06-30)

  Remove gk20a support. Leave only gk20a code which is reused by other
  GPUs.

  JIRA NVGPU-38

  Change-Id: I3d5f2bc9f71cd9f161e64436561a5eadd5786a3b
  Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
  Reviewed-on: https://git-master/r/1507927
  GVS: Gerrit_Virtual_Submit

* gpu: nvgpu: add fifo_t19x and channel_base (Richard Zhao, 2017-06-30)

  - add fifo_t19x to fifo_gk20a
  - add channel_base to fifo_gk20a

  This makes the kick-off code re-usable by vgpu.

  Jira VFND-3796

  Change-Id: I7f7607d3682f2eaad903e1690d83c1d78eba8052
  Signed-off-by: Richard Zhao <rizhao@nvidia.com>
  Reviewed-on: https://git-master/r/1510456
  Reviewed-by: Automatic_Commit_Validation_User
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
  Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>

* gpu: nvgpu: rename hw_chid to chid (Richard Zhao, 2017-06-30)

  hw_chid is a relative id for vgpu. For native it's the same as the hw
  id. Rename it to chid to avoid confusion.

  Jira VFND-3796

  Change-Id: I1c7924da1757330ace715a7c52ac61ec9dc7065c
  Signed-off-by: Richard Zhao <rizhao@nvidia.com>
  Reviewed-on: https://git-master/r/1509530
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: add fifo ops for handling pbdma_intr_1 (Seema Khowala, 2017-06-27)

  This is needed to handle the new pbdma intr_1 in t19x.

  JIRA GPUT19X-47

  Change-Id: If75de0b57f3f18420aff07ee99feaad67ac63752
  Signed-off-by: Seema Khowala <seemaj@nvidia.com>
  Reviewed-on: https://git-master/r/1329373
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

* gpu: nvgpu: hold power ref for deterministic channels (Konsta Holtta, 2017-06-14)

  To support deterministic channels even on platforms where railgating is
  supported, have each deterministic-marked channel hold a power reference
  during its lifetime, and skip taking power refs for jobs in the submit
  path for those. Previously, railgating blocked deterministic submits in
  general because the gk20a_busy()/gk20a_idle() calls in the submit path
  could take time, and more significantly because the gpu may need turning
  on, which takes a nondeterministic and long amount of time.

  As an exception, gk20a_do_idle() can still block deterministic submits
  until gk20a_do_unidle() is called. Add a rwsem to guard this. VPR resize
  needs do_idle, which conflicts with the deterministic channels'
  requirement to keep the GPU on. This is documented in the ioctl header
  now.

  Make NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_NO_JOBTRACKING always
  set in the gpu characteristics now that it's supported. The only thing
  left now blocking NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_FULL is
  the sync framework.

  Make the channel debug dump show which channels are deterministic.

  Bug 200291300
  Jira NVGPU-70

  Change-Id: I47b6f3a8517cd6e4255f6ca2855e3dd912e4f5f3
  Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
  Reviewed-on: http://git-master/r/1483038
  Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
  Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>

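  The scheme can be pictured as below: a deterministic channel takes one
  long-lived power ref so its submit path can skip gk20a_busy() and
  gk20a_idle(), and gk20a_do_idle() still blocks such submits by taking
  the same rwsem for writing. gk20a_busy() is named in the commit; the
  field names and the nvgpu_rwsem helpers used here are assumptions.

  static int sketch_mark_deterministic(struct channel_gk20a *ch)
  {
      /* power ref held for the lifetime of the channel */
      int err = gk20a_busy(ch->g);

      if (err == 0)
          ch->deterministic = true;
      return err;
  }

  static void sketch_deterministic_submit_begin(struct gk20a *g)
  {
      /* excludes gk20a_do_idle(), which takes this rwsem for writing */
      nvgpu_rwsem_down_read(&g->deterministic_busy);
  }

  static void sketch_deterministic_submit_end(struct gk20a *g)
  {
      nvgpu_rwsem_up_read(&g->deterministic_busy);
  }
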
* gpu: nvgpu: move debugfs code to linux module (Deepak Nibade, 2017-06-02)

  Since all debugfs code is Linux specific, remove it from common code and
  move it to the Linux module.

  Debugfs code is now divided into the below module specific files:
    common/linux/debug.c
    common/linux/debug_cde.c
    common/linux/debug_ce.c
    common/linux/debug_fifo.c
    common/linux/debug_gr.c
    common/linux/debug_mm.c
    common/linux/debug_allocator.c
    common/linux/debug_kmem.c
    common/linux/debug_pmu.c
    common/linux/debug_sched.c

  Add corresponding header files for the above modules too, and compile
  all of the above files only if CONFIG_DEBUG_FS is set.

  Some more details of the changes made:
  - Move and rename gk20a/debug_gk20a.c to common/linux/debug.c
  - Move and rename gk20a/debug_gk20a.h to include/nvgpu/debug.h
  - Remove gm20b/debug_gm20b.c and gm20b/debug_gm20b.h and call
    gk20a_init_debug_ops() directly from gm20b_init_hal()
  - Update all debug APIs to receive struct gk20a as parameter instead of
    receiving a struct device pointer
  - Update API gk20a_dmabuf_get_state() to receive a struct gk20a pointer
    instead of struct device
  - Include <nvgpu/debug.h> explicitly in all files where debug operations
    are used
  - Remove "gk20a/platform_gk20a.h" include from HAL files which no longer
    need this include
  - Add new API gk20a_debug_deinit() to deinitialize debugfs and call it
    from gk20a_remove()
  - Move API gk20a_debug_dump_all_channel_status_ramfc() to
    gk20a/fifo_gk20a.c

  Jira NVGPU-62

  Change-Id: I076975d3d7f669bdbe9212fa33d98529377feeb6
  Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
  Reviewed-on: http://git-master/r/1488902
  Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
  GVS: Gerrit_Virtual_Submit
  Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>