aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Improve debugging messages for reverse page translationJoshua Bakita2024-04-09
|
* Fix an off-by-one error in V2 reverse page table lookupsJoshua Bakita2024-04-09
| | | | | | | | This would occationally manifest as an inability to find the runlist page in BAR2, as only part of the page table was being traversed. Also includes non-functional changes to documentation, scoping, and structure layout.
* Correctly handle startup errors and fix gpc*_mask APIsJoshua Bakita2024-04-09
| | | | | | | | | | - Do not create gpc*_mask files on pre-Maxwell GPUs (tested unavailable on the K5000s) - Use correct register offsets for gpc*_mask files on Ampere+ GPUs - Document GPC and TPC count and fuse registers. - Correctly handle errors for creation of all ProcFS files - Remove unecessary error-handling temp variables in nvdebug_entry - Misc naming, comment, and layout cleanup
* Return const pointers to string constants.Joshua Bakita2024-04-09
| | | | | Also update how Instance Pointers are aligned in the runlist output to make them more easily distinguishable from other fields.
* Correctly use Volta-based runlist layout on the GV100 GPUJoshua Bakita2024-04-08
| | | | | Fixes a bug that caused the runlist output to be garbled on the GV100 GPU (the Titan V).
* Heavily refactor runlist code for correctness and Turing supportJoshua Bakita2024-04-08
| | | | | | | | | | | | | - Support differently-formatted runlist registers on Turing - Support different runlist register offsets on Turing - Fix incorrect indenting when printing the runlist - Fix `preempt_tsg` and `switch_to_tsg` API implementations to correctly interface with the hardware (previously, they would try to disable scheduling for the last-updated runlist pointer, which was nonsense, and just an artifact of my early misunderstandings of how the NV_PFIFO_RUNLIST* registers worked). - Remove misused NV_PFIFO_RUNLIST and NV_PFIFO_RUNLIST_BASE registers - Refactor `runlist.c` to use the APIs from `bus.c`
* Add Jetson TX1 supportJoshua Bakita2024-04-08
|
* Put PRAMIN-pointer and BAR2-page-table-PRAMIN-pointer logic into bus.cJoshua Bakita2024-04-08
| | | | | Derived from logic in `runlist.c` and `mmu.c`. The new functions are not directly used in this commit.
* Rework LCE<->PCE and GRCE->LCE configuration printing APIarchive/saman63-wipJoshua Bakita2024-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than up to dozens of individual files exposing part of each copy engine's configuration, have one file which exposes a unified view of the full topology. Example new output on RTX 2080 Ti: $ cat /proc/gpu0/copy_topology GRCE0 -> LCE04 GRCE1 -> LCE03 LCE02 -> PCE02 LCE03 -> PCE03 LCE04 -> PCE01 Old output: $ tail -n 1 /proc/gpu0/lce_for_pce* ==> /proc/gpu0/lce_for_pce0 <== 0xf ==> /proc/gpu0/lce_for_pce1 <== 0x4 ==> /proc/gpu0/lce_for_pce2 <== 0x2 ==> /proc/gpu0/lce_for_pce3 <== 0x3 $ tail -n 1 /proc/gpu1/shared_lce_for_grce* ==> /proc/gpu0/shared_lce_for_grce0 <== 0x4 ==> /proc/gpu0/shared_lce_for_grce1 <== 0x3 Specifically: - Add `copy_topology` API - Remove `shared_lce_for_grce#` and `lce_for_pce#` APIs - Move logic from `nvdebug_entry.c` to `copy_topology_procfs.c` - Do not print PCE or Shared LCE configuration if flagged absent - Refer to LCE0 and LCE1 as GRCE0 and GRCE1 - Print by LCE ID, which is move helpful when attempting to trace how a given copy runlist maps to a physical copy engine. - Document two errata with CE registers Tested working on Pascal Integrated, Pascal, Volta Integrated Volta, Turing, and Ampere Integrated on Linux 4.9 through 5.10.
* Expand support for printing LCE<->PCE and GRCE->LCE configurationrtas24-aeJoshua J Bakita2023-11-08
| | | | | | | | Tested working on Pascal, Volta, Volta Integrated, Turing, Ampere, and Ada. Also clean up minor spacing issues, an errantly added file (nvdebug.mod), and fix some inconsistencies with upstream.
* Created new read function in device_info for GRCE mappings and Pascal LCE ↵Saman Sahebi2023-10-29
| | | | mappings
* patched issues with GPU compatability for CE_MAPSaman Sahebi2023-10-29
|
* implemented GRCE to PCE mappingSaman Sahebi2023-10-29
|
* Updated for loop formatting and fixed a casting errorSaman Sahebi2023-10-29
|
* added offsets for lce mapping in nvdebug.h and code to read lce for each pce ↵Saman Sahebi2023-10-29
| | | | in nvdebug_entry.c
* Add nvdebug.mod to .gitignoreJoshua Bakita2023-10-29
|
* Support printing device info on Ampere+ GPUs. By Benjamin Hadad IVJoshua Bakita2023-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c3d6f2c852eb046e9d4f4f1e6527b52c746b2693 Author: Joshua Bakita <bakitajoshua@gmail.com> Date: Sun Oct 29 14:37:51 2023 -0400 Print Ampere+ device_info fields with correct offsets/widths Everything now has been checked against how nvgpu handles it commit b70849d1ce67a58f9f69b37dc62122f789f4cdf7 Author: Joshua Bakita <jbakita@cs.unc.edu> Date: Wed Sep 20 14:27:38 2023 -0400 Rearrange, fix an off-by-one error, and remove an unused define The code in nvdebug.h has been rearranged to enable an easier merge against the jbakita-wip branch. commit 51f808e092846a60ea6c88ea3a1d2e349c92977b Author: Joshua Bakita <jbakita@cs.unc.edu> Date: Wed Sep 20 13:09:17 2023 -0400 Bug fixes and cleanup for new device_info logic - Update comments to match new structure - Make show() function idempotent - Skip empty table entries without aborting - Include names for new engine types - Add warning log messages for skipped table entries - Remove non-functional runlist file creation logic for Ampere+ commit 1d7adc3be1aef5ac9c144bb24008fd8cc5d688a5 Author: Benjamin Hadad IV <bh4@unc.edu> Date: Sat Aug 19 12:47:18 2023 -0400 Debugging changes made to restore functionality following refactoring. - Debugged data display errors. - Debugged crash bugs. - Debugged memory issue. commit 9e6cc03cdf736fbd817ed53fa9a7f506bc91a244 Author: Benjamin Hadad IV <bh4@unc.edu> Date: Wed Aug 16 22:00:20 2023 -0400 A variety of changes have been made as part of the code review. - Functions have been consolidated. - Code was clarified and tidied up overall. - Unnecessary elements were removed. commit 845960fc1b15995fdbd6d61c384567652a150bc4 Author: Benjamin Hadad IV <bh4@unc.edu> Date: Fri Jul 28 11:39:28 2023 -0400 Refactored various systems and debugged minor issues - Added device_info_iter - Merged functions in device_info_procfs.c - Separated device_info data structs by version in nvdebug.h - Fixed issue with device_info runlist ID data commit 8a57aaeba41c43233c323d7e0fc8bf1a81ebc65e Author: Benjamin Hadad IV <bh4@unc.edu> Date: Fri Jul 21 11:32:51 2023 -0400 I have updated the ptop_device_info_t comment in nvdebug.h. commit 33c915f08f5dc63674b158ecc18897494256a6d0 Author: Benjamin Hadad IV <bh4@unc.edu> Date: Wed Jul 19 13:02:52 2023 -0400 Debugged device_info functionality - Fixed device_info crash bugs - Made further edits to display functionality - Refactored code to enhance readability commit bfb4dcf0e78954c0163f3a06a5a088c4d1b437a8 Author: Benjamin Hadad IV <bh4@unc.edu> Date: Thu Jul 13 12:13:17 2023 -0400 This commit is to update the repo for display during a meeting. - Added an Ampere version of the device info data. - Added Ampere versions of auxillary functions. - Modified display functions to accommodate Ampere data. - Made other various small modifications. commit 068e7f4e7208d6c9250ad72208e0b36fd9fdf2f6 Merge: 3725b15 073e897 Author: Benjamin Hadad IV <bh4@unc.edu> Date: Mon Jul 10 12:39:12 2023 -0400 Merge branch 'jbakita-wip' of ssh://rtsrv.cs.unc.edu/public/nvdebug into wip I am merging Mr. Bakita's changes (046d7d2) into this repository. commit 3725b15d5da3e06ef202045d710aa5f15eb72fcc Author: Benjamin Hadad IV <bh4@unc.edu> Date: Mon Jul 3 04:30:54 2023 -0400 I modified nvdebug.h for Ampere.
* Rather than abort, print placeholders for missing runlist channelsJoshua Bakita2023-10-29
| | | | | Sometimes such "malformed" runlists appear on the TX2, yet they seem to work fine, so support printing them in full.
* Support PRAMIN-based runlist access fallback (optional; on by default)Joshua Bakita2023-10-29
| | | | | | | | | | | | | Using this may be hazardous---we don't know if some of the GPU drivers use this after initial bring-up. If they do, and we race with them in setting it, or we unexpectedly change it under them, arbitrary state corruption could occur. This is only entirely safe to use if you don't trust the GPU state after the first use of this fallback. In limited experiments vs the `nvgpu` (Tegra) and `nvidia` (closed-source discrete) drivers, no ill side effects have yet been observed, but still please use with caution.
* Zero-out unused/invalid fields in g_nvdebug_device on Tegra devicesJoshua Bakita2023-10-29
| | | | | | Previously, unloading the module could cause a segfault on Tegra, as pcid would be unitilized and possibly non-zero, causing us to attempt PCIe-device-style deinitialization on a non-PCEe device.
* Include <linux/seq_file.h> in nvdebug.h and sort includesJoshua Bakita2023-10-29
| | | | | Fixes the build for some kernel versions where this is no longer transatively included.
* Update includes to L4T r32.7.4 and drop nvgpu/gk20a.h dependencyJoshua Bakita2023-10-29
| | | | | Also add instructions for updating `include/`. These files are now only needed to build on Linux 4.9-based Tegra platforms.
* Only log interrupts if explictly enabledJoshua Bakita2023-10-05
|
* Add architecture decode for Ada, Hopper, and BlackwellJoshua Bakita2023-09-01
|
* Improve copy engine register documentation in nvdebug.h + cleanupJoshua Bakita2023-07-20
|
* Fail reads which return the flag value for a non-existent registerJoshua Bakita2023-07-18
| | | | Also check for read success in get_runlist_iter().
* Fix addressing, zero-init, and compatibility bugs in a3fe378Joshua Bakita2023-07-03
| | | | | | | | | - Including missing dereference to finish getting the address of the control register range - Add zero-initialization to the proc_ops structure in copat_ops to insure that all intentionally unset fields remain unset - Set .llseek in all the file_operations structures, as recent kernels require this to be explictly set
* Hacky support for Linux 5.6+ and the Jetson AGX OrinJoshua Bakita2023-06-29
| | | | | | | | | | | | | | Works around change in parameters to proc initialization functions via a hacky function which rewrites the layout. This also required making all the struct file_operations writable. Also start reducing dependency on nvgpu headers. Known issues: - Incorrect message printed in log after module is loaded. Unclear if this is because the register detection logic is broken, or if the layout of the data at NV_MC_BOOT_0 has changed. - Not tested
* Fix deinitialization on non-PCIe devicesJoshua Bakita2023-06-28
|
* Include nvgpu headersJoshua Bakita2023-06-28
| | | | | | These are needed to build on NVIDIA's Jetson boards for the time being. Only a couple structs are required, so it should be fairly easy to remove this dependency at some point in the future.
* Quick dump of current state for Ben to review.Joshua Bakita2023-06-22
|
* Comment fixup and abort if runlist is stored in VRAMJoshua Bakita2021-10-03
|
* Add APIs to enable/disable a channel and switch to or preempt a specific TSGJoshua Bakita2021-09-23
| | | | | | | | | | | Adds: - /proc/preempt_tsg which takes a TSG ID - /proc/disable_channel which takes a channel ID - /proc/enable_channel which takes a channel ID - /proc/switch_to_tsg which takes a TSG ID Also significantly expands documentation and structs available in nvdebug.h.
* Include git hash as version number and print startup messageJoshua Bakita2021-09-22
|
* Fix a pre-4.19 bug in seq procfs files and add detailed channel printJoshua Bakita2021-09-22
| | | | | | | | | | - The sequence file infrastructure prior to kernel version 4.19 has a bug in the retry code when the write buffer overflows that causes our private iterator state to be corrupted. Work around this by tracking some info out-of-band. - Now supports including detailed channel status information from channel RAM when printing the runlist. - Adds helper function to probe for and return struct gk20a*.
* Use procfs instead of dmesg to print runlistJoshua Bakita2021-08-26
| | | | | | | `cat /proc/runlist` to print the current runlist. Also break nvdebug.c into nvdebug_entry.c, runlist.c, and runlist_procfs.c.
* Add initial implementationJoshua Bakita2021-08-26
Supports accessing and printing the runlist on the Jetson Xavier to dmesg. May work on other Jetson boards. Currently requires the nvgpu headers from NVIDIA's Linux4Tegra (L4T) source tree.