| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
**Modifes the user API from `echo 1 > /proc/gpuX/switch_to_tsg` to
`echo 1 > /proc/gpuX/runlist0/switch_to_tsg` to switch to TSG 1 on
runlist 0 on GPU X for pre-Ampere GPUs (for example).**
Feature changes:
- switch_to_tsg only makes sense on a per-runlist level. Before, this
always operated on runlist0; this commit allows operating on any
runlist by moving the API to the per-runlist paths.
- On Ampere+, channel and TSG IDs are per-runlist, and no longer
GPU-global. Consequently, the disable/enable_channel and
preempt_tsg APIs have been moved from GPU-global to per-runlist
paths on Ampere+.
Bug fixes:
- `preempt_runlist()` is now supported on Maxwell and Pascal.
- `resubmit_runlist()` detects too-old GPUs.
- MAX_CHID corrected from 512 to 511 and documented.
- switch_to_tsg now includes a runlist resubmit, which appears to be
necessary on Turing+ GPUs.
Tested on GK104 (Quadro K5000), GM204 (GTX 970), GP106 (GTX 1060 3GB),
GP104 (GTX 1080 Ti), GP10B (Jetson TX2), GV11B (Jetson Xavier), GV100
(Titan V), TU102 (RTX 2080 Ti), and AD102 (RTX 6000 Ada).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
**Modifes the user API from `cat /proc/gpuX/runlist0` to
`cat /proc/gpuX/runlist0/runlist` to support runlist-scoped
registers**
- Count number of runlists via Ampere-style PTOP parsing.
- Create a ProcFS directory for each runlist, and create the runlist
printing file in this directory.
- Document the newly-added/-formatted Runlist RAM and Channel RAM
registers.
- Add a helper function `get_runlist_ram()` to obtain the location
of each runlist's registers.
- Support printing Ampere-style Channel RAM entries.
Tested on Jetson Orin (ga10b), A100, H100, and AD102 (RTX 6000 Ada)
|
|
|
|
|
|
|
| |
- Document topology registers (PTOP) on Ampere+
- Document graphics copy engine configuration registers
- Move resubmit_runlist range checks into runlist.c
- Miscellaneous spacing, typo, and minor documentation fixes
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resubmits the runlist in an identical configuration. Causes the
runlist scheduler to:
1. Reload and cache timeslice and scale values from TSGs.
2. Restart scheduling from the head of the runlist [may cause a
preempt to be scheduled for the currently-running task (?)].
3. Address (?) an errata on Turing where re-enabled channels are
not always detected.
Above behavior tested on GV100 and partially tested on TU102.
|
|
|
|
|
| |
- Move Linux-specific functions to nvdebug_linux.h and .c
- Workaround PDE_DATA() being pde_data() on Linux 5.17+
|
|
|
|
|
| |
Also update how Instance Pointers are aligned in the runlist output
to make them more easily distinguishable from other fields.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Support differently-formatted runlist registers on Turing
- Support different runlist register offsets on Turing
- Fix incorrect indenting when printing the runlist
- Fix `preempt_tsg` and `switch_to_tsg` API implementations to
correctly interface with the hardware (previously, they would try
to disable scheduling for the last-updated runlist pointer, which
was nonsense, and just an artifact of my early misunderstandings
of how the NV_PFIFO_RUNLIST* registers worked).
- Remove misused NV_PFIFO_RUNLIST and NV_PFIFO_RUNLIST_BASE registers
- Refactor `runlist.c` to use the APIs from `bus.c`
|
|
|
|
|
| |
Sometimes such "malformed" runlists appear on the TX2, yet they
seem to work fine, so support printing them in full.
|
|
|
|
|
|
|
|
|
| |
- Including missing dereference to finish getting the address of
the control register range
- Add zero-initialization to the proc_ops structure in copat_ops to
insure that all intentionally unset fields remain unset
- Set .llseek in all the file_operations structures, as recent
kernels require this to be explictly set
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Works around change in parameters to proc initialization functions
via a hacky function which rewrites the layout. This also required
making all the struct file_operations writable.
Also start reducing dependency on nvgpu headers.
Known issues:
- Incorrect message printed in log after module is loaded. Unclear
if this is because the register detection logic is broken, or if
the layout of the data at NV_MC_BOOT_0 has changed.
- Not tested
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Adds:
- /proc/preempt_tsg which takes a TSG ID
- /proc/disable_channel which takes a channel ID
- /proc/enable_channel which takes a channel ID
- /proc/switch_to_tsg which takes a TSG ID
Also significantly expands documentation and structs available in
nvdebug.h.
|
|
|
|
|
|
|
|
|
|
| |
- The sequence file infrastructure prior to kernel version 4.19
has a bug in the retry code when the write buffer overflows that
causes our private iterator state to be corrupted. Work around
this by tracking some info out-of-band.
- Now supports including detailed channel status information from
channel RAM when printing the runlist.
- Adds helper function to probe for and return struct gk20a*.
|
|
`cat /proc/runlist` to print the current runlist.
Also break nvdebug.c into nvdebug_entry.c, runlist.c, and
runlist_procfs.c.
|