<feed xmlns='http://www.w3.org/2005/Atom'>
<title>nvdebug.git, branch master</title>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/'/>
<entry>
<title>Fix a race condition in nvdebug_{readl,readq,writel,writeq}</title>
<updated>2025-04-04T14:29:54+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>jbakita@cs.unc.edu</email>
</author>
<published>2025-04-04T14:29:54+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=494df296bf4abe9b2b484bde1a4fad28c989afec'/>
<id>494df296bf4abe9b2b484bde1a4fad28c989afec</id>
<content type='text'>
When the GPU is powered off, attempts to read any of its registers
(such as via nvdebug_readl()) result in a fatal interrupt. The
pm_runtime_get() call included in nvdebug sent a request to nvgpu
to turn the GPU back on. **However,** this call did not wait for
the power-on command to take effect. This resulted in a race between
nvdebug and the power management logic, meaning that the GPU might not
have powered on by the time that nvdebug attempted to read its
registers.

Use pm_runtime_get_sync() instead, which explicitly waits for the
power-on command to complete (or fail) before returning. This
eliminates the race condition.

Thank you to Diego Alejandro Parra Guzman
&lt;diego.guzman@tttech-auto.com&gt;, who brought this issue to my
attention.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the GPU is powered off, attempts to read any of its registers
(such as via nvdebug_readl()) result in a fatal interrupt. The
pm_runtime_get() call included in nvdebug sent a request to nvgpu
to turn the GPU back on. **However,** this call did not wait for
the power-on command to take effect. This resulted in a race between
nvdebug and the power management logic, meaning that the GPU might not
have powered on by the time that nvdebug attempted to read its
registers.

Use pm_runtime_get_sync() instead, which explicitly waits for the
power-on command to complete (or fail) before returning. This
eliminates the race condition.

Thank you to Diego Alejandro Parra Guzman
&lt;diego.guzman@tttech-auto.com&gt;, who brought this issue to my
attention.
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix a critical regression in 71be6bb5 causing multiple API failures</title>
<updated>2024-11-04T16:28:21+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>jbakita@cs.unc.edu</email>
</author>
<published>2024-11-04T16:28:21+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=6143114460e5125621747cde2f712fed445b9a15'/>
<id>6143114460e5125621747cde2f712fed445b9a15</id>
<content type='text'>
Instead of printing the value read in `nvdebug_reg32_read()`, a second
read was being performed, using the first read's value as the register
offset! This resulted from a mistaken, incomplete removal of the old
pre-error-handling logic in 71be6bb5.

This caused every API using this function to fail, either by returning
bizarre or incorrect values, or by crashing the system on Jetson boards.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of printing the value read in `nvdebug_reg32_read()`, a second
read was being performed, using the first read's value as the register
offset! This resulted from a mistaken, incomplete removal of the old
pre-error-handling logic in 71be6bb5.

This caused every API using this function to fail, either by returning
bizarre or incorrect values, or by crashing the system on Jetson boards.
</pre>
</div>
</content>
</entry>
<entry>
<title>Delete no-longer-needed nvgpu headers</title>
<updated>2024-09-25T20:09:09+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2024-09-25T20:09:09+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=f347fde22f1297e4f022600d201780d5ead78114'/>
<id>f347fde22f1297e4f022600d201780d5ead78114</id>
<content type='text'>
The dependency on these was removed in commit 8340d234.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The dependency on these was removed in commit 8340d234.
</pre>
</div>
</content>
</entry>
<entry>
<title>Remove dependency on Jetson (nvgpu) driver internals</title>
<updated>2024-09-25T19:58:37+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2024-09-25T19:58:37+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=8340d234d78a7d0f46c11a584de538148b78b7cb'/>
<id>8340d234d78a7d0f46c11a584de538148b78b7cb</id>
<content type='text'>
For integrated (Jetson) GPUs:
- Directly retrieve and map GPU register region 0
- Directly check GPU power-on state before a register read/write
- Resume the GPU as needed for a register read/write

Most nvgpu APIs can now be called on TX2+ integrated GPUs without
first having to start some task on the GPU to make it non-suspended.

Tested on Jetson TX1, TX2, Xavier, and Orin.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For integrated (Jetson) GPUs:
- Directly retrieve and map GPU register region 0
- Directly check GPU power-on state before a register read/write
- Resume the GPU as needed for a register read/write

Most nvgpu APIs can now be called on TX2+ integrated GPUs without
first having to start some task on the GPU to make it non-suspended.

Tested on Jetson TX1, TX2, Xavier, and Orin.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add a README</title>
<updated>2024-09-25T17:28:56+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2024-09-25T17:28:56+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=e2fe4cb56e6252b9cf0b43c6180efbb20a168ce0'/>
<id>e2fe4cb56e6252b9cf0b43c6180efbb20a168ce0</id>
<content type='text'>
See also the RTAS'23 and RTAS'24 papers.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
See also the RTAS'23 and RTAS'24 papers.
</pre>
</div>
</content>
</entry>
<entry>
<title>Correct an off-by-one error in addr_to_pramin_mut()</title>
<updated>2024-09-25T16:52:42+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2024-09-25T16:44:16+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=2104d15eb12b03ed4cfa8eb4dc95ad13cee43227'/>
<id>2104d15eb12b03ed4cfa8eb4dc95ad13cee43227</id>
<content type='text'>
Not known to cause any current bugs, but could cause the returned
address to be inaccessible.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Not known to cause any current bugs, but could cause the returned
address to be inaccessible.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add IDs and names of new Hopper+ engines</title>
<updated>2024-09-25T15:05:54+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2024-09-25T15:05:54+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=3ceb1066a751140f28444d2021f2841dd85fd482'/>
<id>3ceb1066a751140f28444d2021f2841dd85fd482</id>
<content type='text'>
Draws from NVIDIA's open-gpu-kernel-modules project.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Draws from NVIDIA's open-gpu-kernel-modules project.
</pre>
</div>
</content>
</entry>
<entry>
<title>Return an error, rather than a flag value, from `nvdebug_reg32_read()`</title>
<updated>2024-09-19T19:40:33+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2024-09-19T19:40:33+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=71be6bb5203e32caa2aaf90b77b7bdbc8abac8b8'/>
<id>71be6bb5203e32caa2aaf90b77b7bdbc8abac8b8</id>
<content type='text'>
This is used to back APIs like `num_gpcs`. Better to return an error
to the caller, rather than -1 (which may be confused for an actual
result).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is used to back APIs like `num_gpcs`. Better to return an error
to the caller, rather than -1 (which may be confused for an actual
result).
</pre>
</div>
</content>
</entry>
<entry>
<title>Correctly check for read errors in the nvdebug_read* functions</title>
<updated>2024-09-19T19:38:53+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2024-09-19T19:38:53+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=3653aee74ae8338b9da1f0304b0eaa1171dd640f'/>
<id>3653aee74ae8338b9da1f0304b0eaa1171dd640f</id>
<content type='text'>
Follows how NVIDIA's open-source GPU driver checks for bad reads.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Follows how NVIDIA's open-source GPU driver checks for bad reads.
</pre>
</div>
</content>
</entry>
<entry>
<title>Ampere: disable/enable_channel, preempt/switch_to_tsg, and resubmit_runlist</title>
<updated>2024-09-19T17:59:56+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2024-09-19T16:50:02+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/nvdebug.git/commit/?id=48f9e45b9d9ebfca7d3c673597f7fbed9427a5af'/>
<id>48f9e45b9d9ebfca7d3c673597f7fbed9427a5af</id>
<content type='text'>
**Modifies the user API. For example, to switch to TSG 1 on runlist 0
of GPU X on a pre-Ampere GPU, `echo 1 &gt; /proc/gpuX/switch_to_tsg`
becomes `echo 1 &gt; /proc/gpuX/runlist0/switch_to_tsg`.**

Feature changes:
- switch_to_tsg only makes sense on a per-runlist level. Before, this
  always operated on runlist0; this commit allows operating on any
  runlist by moving the API to the per-runlist paths.
- On Ampere+, channel and TSG IDs are per-runlist, and no longer
  GPU-global. Consequently, the disable/enable_channel and
  preempt_tsg APIs have been moved from GPU-global to per-runlist
  paths on Ampere+.

Bug fixes:
- `preempt_runlist()` is now supported on Maxwell and Pascal.
- `resubmit_runlist()` detects too-old GPUs.
- MAX_CHID corrected from 512 to 511 and documented.
- switch_to_tsg now includes a runlist resubmit, which appears to be
  necessary on Turing+ GPUs.

Tested on GK104 (Quadro K5000), GM204 (GTX 970), GP106 (GTX 1060 3GB),
GP104 (GTX 1080 Ti), GP10B (Jetson TX2), GV11B (Jetson Xavier), GV100
(Titan V), TU102 (RTX 2080 Ti), and AD102 (RTX 6000 Ada).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
**Modifies the user API. For example, to switch to TSG 1 on runlist 0
of GPU X on a pre-Ampere GPU, `echo 1 &gt; /proc/gpuX/switch_to_tsg`
becomes `echo 1 &gt; /proc/gpuX/runlist0/switch_to_tsg`.**

Feature changes:
- switch_to_tsg only makes sense on a per-runlist level. Before, this
  always operated on runlist0; this commit allows operating on any
  runlist by moving the API to the per-runlist paths.
- On Ampere+, channel and TSG IDs are per-runlist, and no longer
  GPU-global. Consequently, the disable/enable_channel and
  preempt_tsg APIs have been moved from GPU-global to per-runlist
  paths on Ampere+.

Bug fixes:
- `preempt_runlist()` is now supported on Maxwell and Pascal.
- `resubmit_runlist()` detects too-old GPUs.
- MAX_CHID corrected from 512 to 511 and documented.
- switch_to_tsg now includes a runlist resubmit, which appears to be
  necessary on Turing+ GPUs.

Tested on GK104 (Quadro K5000), GM204 (GTX 970), GP106 (GTX 1060 3GB),
GP104 (GTX 1080 Ti), GP10B (Jetson TX2), GV11B (Jetson Xavier), GV100
(Titan V), TU102 (RTX 2080 Ti), and AD102 (RTX 6000 Ada).
</pre>
</div>
</content>
</entry>
</feed>
