path: root/nvdebug.h
Commit message (Author, Age)
* Add /proc/gpu#/local_memory API for getting VRAM size (Joshua Bakita, 2024-04-13)
|
* Linux 5.17+ support and allow including nvdebug.h independently (Joshua Bakita, 2024-04-11)
|   - Move Linux-specific functions to nvdebug_linux.h and .c
|   - Work around PDE_DATA() being pde_data() on Linux 5.17+
* Support page directories outside PRAMIN or in SYS_MEM (Joshua Bakita, 2024-04-11)
|   - Re-read PRAMIN configuration after update to verify change applies
|   - Return a page_dir_config_t rather than just an address and page
|     table version from `get_bar2_pdb()`.
|   - Less verbose logging for MMU-related functions by default.
|   - Perform all conversion from SYS_MEM/VID_MEM addresses to kernel
|     addresses inside the translation functions, via the new function
|     `pd_deref()`.
|   - Support use of an I/O MMU, page tables/directories outside the
|     current PRAMIN window, and page tables/directories arbitrarily
|     located in SYS_MEM or VID_MEM on different levels of the same tree.
|   - Heavily improve documentation and add references for Version 1 and
|     Version 0 page tables.
|   - Improve logging in `runlist.c` to include runlist and chip IDs.
|   - Update all users of search_page_directory* to use the new API.
|   - Remove now-unused supporting functions from `mmu.c`.
|
|   Tested on GTX 970, GTX 1060 3GB, Jetson TX2, Titan V, Jetson Xavier,
|   and RTX 2080 Ti.
* Correctly check return code from vram2PRAMIN() (Joshua Bakita, 2024-04-09)
|   Blindly using an invalid return address was resulting in undefined
|   behavior due to traversal of non-page-table addresses as though they
|   were part of the page table.
* Improve debugging messages for reverse page translation (Joshua Bakita, 2024-04-09)
|
* Fix an off-by-one error in V2 reverse page table lookups (Joshua Bakita, 2024-04-09)
|   This would occasionally manifest as an inability to find the runlist
|   page in BAR2, as only part of the page table was being traversed.
|
|   Also includes non-functional changes to documentation, scoping, and
|   structure layout.
* Correctly handle startup errors and fix gpc*_mask APIs (Joshua Bakita, 2024-04-09)
|   - Do not create gpc*_mask files on pre-Maxwell GPUs (tested
|     unavailable on the K5000s)
|   - Use correct register offsets for gpc*_mask files on Ampere+ GPUs
|   - Document GPC and TPC count and fuse registers.
|   - Correctly handle errors for creation of all ProcFS files
|   - Remove unnecessary error-handling temp variables in nvdebug_entry
|   - Misc naming, comment, and layout cleanup
* Return const pointers to string constants. (Joshua Bakita, 2024-04-09)
|   Also update how Instance Pointers are aligned in the runlist output to
|   make them more easily distinguishable from other fields.
* Correctly use Volta-based runlist layout on the GV100 GPU (Joshua Bakita, 2024-04-08)
|   Fixes a bug that caused the runlist output to be garbled on the GV100
|   GPU (the Titan V).
* Heavily refactor runlist code for correctness and Turing support (Joshua Bakita, 2024-04-08)
|   - Support differently-formatted runlist registers on Turing
|   - Support different runlist register offsets on Turing
|   - Fix incorrect indenting when printing the runlist
|   - Fix `preempt_tsg` and `switch_to_tsg` API implementations to
|     correctly interface with the hardware (previously, they would try
|     to disable scheduling for the last-updated runlist pointer, which
|     was nonsense, and just an artifact of my early misunderstandings
|     of how the NV_PFIFO_RUNLIST* registers worked).
|   - Remove misused NV_PFIFO_RUNLIST and NV_PFIFO_RUNLIST_BASE registers
|   - Refactor `runlist.c` to use the APIs from `bus.c`
* Put PRAMIN-pointer and BAR2-page-table-PRAMIN-pointer logic into bus.c (Joshua Bakita, 2024-04-08)
|   Derived from logic in `runlist.c` and `mmu.c`. The new functions are
|   not directly used in this commit.
* Rework LCE<->PCE and GRCE->LCE configuration printing API [archive/saman63-wip] (Joshua Bakita, 2024-04-08)
|   Rather than up to dozens of individual files exposing part of each
|   copy engine's configuration, have one file which exposes a unified
|   view of the full topology.
|
|   Example new output on RTX 2080 Ti:
|     $ cat /proc/gpu0/copy_topology
|     GRCE0 -> LCE04
|     GRCE1 -> LCE03
|     LCE02 -> PCE02
|     LCE03 -> PCE03
|     LCE04 -> PCE01
|
|   Old output:
|     $ tail -n 1 /proc/gpu0/lce_for_pce*
|     ==> /proc/gpu0/lce_for_pce0 <==
|     0xf
|     ==> /proc/gpu0/lce_for_pce1 <==
|     0x4
|     ==> /proc/gpu0/lce_for_pce2 <==
|     0x2
|     ==> /proc/gpu0/lce_for_pce3 <==
|     0x3
|     $ tail -n 1 /proc/gpu1/shared_lce_for_grce*
|     ==> /proc/gpu0/shared_lce_for_grce0 <==
|     0x4
|     ==> /proc/gpu0/shared_lce_for_grce1 <==
|     0x3
|
|   Specifically:
|   - Add `copy_topology` API
|   - Remove `shared_lce_for_grce#` and `lce_for_pce#` APIs
|   - Move logic from `nvdebug_entry.c` to `copy_topology_procfs.c`
|   - Do not print PCE or Shared LCE configuration if flagged absent
|   - Refer to LCE0 and LCE1 as GRCE0 and GRCE1
|   - Print by LCE ID, which is more helpful when attempting to trace how
|     a given copy runlist maps to a physical copy engine.
|   - Document two errata with CE registers
|
|   Tested working on Pascal Integrated, Pascal, Volta Integrated, Volta,
|   Turing, and Ampere Integrated on Linux 4.9 through 5.10.
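The example `copy_topology` output in the commit above can be traced programmatically. The following is a minimal userspace sketch, not part of nvdebug: it assumes every line has the form `<KIND><id> -> <KIND><id>` as in the example output, and the names `ce_edge`, `parse_ce_edge()`, and `pce_for_grce()` are hypothetical helpers introduced only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* One edge of the topology, e.g. "GRCE0 -> LCE04" or "LCE02 -> PCE02".
 * Hypothetical structure; not taken from nvdebug. */
struct ce_edge {
    char src_kind[8]; /* "GRCE" or "LCE" */
    int  src_id;
    char dst_kind[8]; /* "LCE" or "PCE" */
    int  dst_id;
};

/* Parse one line of output; returns 1 on success, 0 if the line does not
 * match the assumed "<KIND><id> -> <KIND><id>" form. */
static int parse_ce_edge(const char *line, struct ce_edge *e)
{
    int n = sscanf(line, " %7[A-Z]%d -> %7[A-Z]%d",
                   e->src_kind, &e->src_id, e->dst_kind, &e->dst_id);
    return n == 4;
}

/* Follow GRCE<grce_id> -> LCE -> PCE through the parsed edges.
 * Returns the PCE ID, or -1 if either hop is missing. */
static int pce_for_grce(const struct ce_edge *edges, int n, int grce_id)
{
    int lce = -1;
    for (int i = 0; i < n; i++)
        if (!strcmp(edges[i].src_kind, "GRCE") && edges[i].src_id == grce_id)
            lce = edges[i].dst_id;
    if (lce < 0)
        return -1;
    for (int i = 0; i < n; i++)
        if (!strcmp(edges[i].src_kind, "LCE") && edges[i].src_id == lce)
            return edges[i].dst_id;
    return -1;
}
```

Fed the five example lines from the RTX 2080 Ti output, `pce_for_grce()` resolves GRCE0 through LCE04 to PCE 1. Whether a shared LCE always has a PCE line of its own is an assumption of this sketch, which is why a missing hop returns -1.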
* Expand support for printing LCE<->PCE and GRCE->LCE configuration [rtas24-ae] (Joshua J Bakita, 2023-11-08)
|   Tested working on Pascal, Volta, Volta Integrated, Turing, Ampere, and
|   Ada.
|
|   Also clean up minor spacing issues, an errantly added file
|   (nvdebug.mod), and fix some inconsistencies with upstream.
* Created new read function in device_info for GRCE mappings and Pascal LCE mappings (Saman Sahebi, 2023-10-29)
* patched issues with GPU compatibility for CE_MAP (Saman Sahebi, 2023-10-29)
|
* implemented GRCE to PCE mapping (Saman Sahebi, 2023-10-29)
|
* added offsets for lce mapping in nvdebug.h and code to read lce for each pce in nvdebug_entry.c (Saman Sahebi, 2023-10-29)
* Support printing device info on Ampere+ GPUs. By Benjamin Hadad IV (Joshua Bakita, 2023-10-29)
|   commit c3d6f2c852eb046e9d4f4f1e6527b52c746b2693
|   Author: Joshua Bakita <bakitajoshua@gmail.com>
|   Date:   Sun Oct 29 14:37:51 2023 -0400
|
|       Print Ampere+ device_info fields with correct offsets/widths
|
|       Everything now has been checked against how nvgpu handles it
|
|   commit b70849d1ce67a58f9f69b37dc62122f789f4cdf7
|   Author: Joshua Bakita <jbakita@cs.unc.edu>
|   Date:   Wed Sep 20 14:27:38 2023 -0400
|
|       Rearrange, fix an off-by-one error, and remove an unused define
|
|       The code in nvdebug.h has been rearranged to enable an easier
|       merge against the jbakita-wip branch.
|
|   commit 51f808e092846a60ea6c88ea3a1d2e349c92977b
|   Author: Joshua Bakita <jbakita@cs.unc.edu>
|   Date:   Wed Sep 20 13:09:17 2023 -0400
|
|       Bug fixes and cleanup for new device_info logic
|
|       - Update comments to match new structure
|       - Make show() function idempotent
|       - Skip empty table entries without aborting
|       - Include names for new engine types
|       - Add warning log messages for skipped table entries
|       - Remove non-functional runlist file creation logic for Ampere+
|
|   commit 1d7adc3be1aef5ac9c144bb24008fd8cc5d688a5
|   Author: Benjamin Hadad IV <bh4@unc.edu>
|   Date:   Sat Aug 19 12:47:18 2023 -0400
|
|       Debugging changes made to restore functionality following
|       refactoring.
|
|       - Debugged data display errors.
|       - Debugged crash bugs.
|       - Debugged memory issue.
|
|   commit 9e6cc03cdf736fbd817ed53fa9a7f506bc91a244
|   Author: Benjamin Hadad IV <bh4@unc.edu>
|   Date:   Wed Aug 16 22:00:20 2023 -0400
|
|       A variety of changes have been made as part of the code review.
|
|       - Functions have been consolidated.
|       - Code was clarified and tidied up overall.
|       - Unnecessary elements were removed.
|
|   commit 845960fc1b15995fdbd6d61c384567652a150bc4
|   Author: Benjamin Hadad IV <bh4@unc.edu>
|   Date:   Fri Jul 28 11:39:28 2023 -0400
|
|       Refactored various systems and debugged minor issues
|
|       - Added device_info_iter
|       - Merged functions in device_info_procfs.c
|       - Separated device_info data structs by version in nvdebug.h
|       - Fixed issue with device_info runlist ID data
|
|   commit 8a57aaeba41c43233c323d7e0fc8bf1a81ebc65e
|   Author: Benjamin Hadad IV <bh4@unc.edu>
|   Date:   Fri Jul 21 11:32:51 2023 -0400
|
|       I have updated the ptop_device_info_t comment in nvdebug.h.
|
|   commit 33c915f08f5dc63674b158ecc18897494256a6d0
|   Author: Benjamin Hadad IV <bh4@unc.edu>
|   Date:   Wed Jul 19 13:02:52 2023 -0400
|
|       Debugged device_info functionality
|
|       - Fixed device_info crash bugs
|       - Made further edits to display functionality
|       - Refactored code to enhance readability
|
|   commit bfb4dcf0e78954c0163f3a06a5a088c4d1b437a8
|   Author: Benjamin Hadad IV <bh4@unc.edu>
|   Date:   Thu Jul 13 12:13:17 2023 -0400
|
|       This commit is to update the repo for display during a meeting.
|
|       - Added an Ampere version of the device info data.
|       - Added Ampere versions of auxiliary functions.
|       - Modified display functions to accommodate Ampere data.
|       - Made other various small modifications.
|
|   commit 068e7f4e7208d6c9250ad72208e0b36fd9fdf2f6
|   Merge: 3725b15 073e897
|   Author: Benjamin Hadad IV <bh4@unc.edu>
|   Date:   Mon Jul 10 12:39:12 2023 -0400
|
|       Merge branch 'jbakita-wip' of ssh://rtsrv.cs.unc.edu/public/nvdebug
|       into wip
|
|       I am merging Mr. Bakita's changes (046d7d2) into this repository.
|
|   commit 3725b15d5da3e06ef202045d710aa5f15eb72fcc
|   Author: Benjamin Hadad IV <bh4@unc.edu>
|   Date:   Mon Jul 3 04:30:54 2023 -0400
|
|       I modified nvdebug.h for Ampere.
* Include <linux/seq_file.h> in nvdebug.h and sort includes (Joshua Bakita, 2023-10-29)
|   Fixes the build for some kernel versions where this is no longer
|   transitively included.
* Update includes to L4T r32.7.4 and drop nvgpu/gk20a.h dependency (Joshua Bakita, 2023-10-29)
|   Also add instructions for updating `include/`. These files are now
|   only needed to build on Linux 4.9-based Tegra platforms.
* Add architecture decode for Ada, Hopper, and Blackwell (Joshua Bakita, 2023-09-01)
|
* Improve copy engine register documentation in nvdebug.h + cleanup (Joshua Bakita, 2023-07-20)
|
* Fail reads which return the flag value for a non-existent register (Joshua Bakita, 2023-07-18)
|   Also check for read success in get_runlist_iter().
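The idea of failing a read that returns a "non-existent register" flag can be sketched as below. This is only an illustration under stated assumptions: the log does not say which flag value nvdebug tests for, so the `0xbadfXXXX` pattern used here is an assumption, and `is_bad_read()` and `checked_read32()` are hypothetical names rather than nvdebug functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: a read of a register that does not exist returns a value whose
 * upper 16 bits are 0xbadf. The commit log does not state the pattern. */
#define BAD_READ_MASK 0xffff0000u
#define BAD_READ_FLAG 0xbadf0000u

/* True if `val` looks like the flag value rather than real register data. */
static bool is_bad_read(uint32_t val)
{
    return (val & BAD_READ_MASK) == BAD_READ_FLAG;
}

/* Hypothetical checked read over a memory-mapped register window.
 * Returns 0 and stores the value in *out on success, or -1 if the value
 * matches the flag pattern; *out is left untouched in that case. */
static int checked_read32(const volatile uint32_t *regs, uint32_t byte_off,
                          uint32_t *out)
{
    uint32_t val = regs[byte_off / sizeof(uint32_t)];
    if (is_bad_read(val))
        return -1;
    *out = val;
    return 0;
}
```

Returning an explicit error instead of the raw value lets callers, such as an iterator setup routine, stop before interpreting the flag value as data.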
* Fix addressing, zero-init, and compatibility bugs in a3fe378 (Joshua Bakita, 2023-07-03)
|   - Include missing dereference to finish getting the address of the
|     control register range
|   - Add zero-initialization to the proc_ops structure in copat_ops to
|     ensure that all intentionally unset fields remain unset
|   - Set .llseek in all the file_operations structures, as recent
|     kernels require this to be explicitly set
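The zero-initialization point in the commit above can be illustrated in userspace. The sketch below uses mock structures that only loosely mirror the kernel's `struct file_operations` and `struct proc_ops`. Everything prefixed `mock_` or `demo_` is hypothetical, and copying into a separate structure is a simplification, not nvdebug's actual conversion code.

```c
#include <stddef.h>

/* Stand-in for a handler function pointer; real kernel handlers have
 * different signatures. */
typedef int (*mock_fn)(void);

/* Mock of the older-style operations table. */
struct mock_file_operations {
    mock_fn read;
    mock_fn write;
    mock_fn llseek;
};

/* Mock of the newer-style table, which has additional fields that a
 * converter may never assign. */
struct mock_proc_ops {
    unsigned int proc_flags;
    mock_fn proc_open;
    mock_fn proc_read;
    mock_fn proc_write;
    mock_fn proc_lseek;
    mock_fn proc_release;
};

/* Example handlers so the table has something to point at. */
static int demo_read(void)   { return 0; }
static int demo_llseek(void) { return 0; }

/* Convert one table to the other. Zeroing the destination first means any
 * field not copied below is guaranteed to be 0/NULL, not stale memory. */
static void mock_compat_ops(const struct mock_file_operations *fops,
                            struct mock_proc_ops *pops)
{
    *pops = (struct mock_proc_ops){0};
    pops->proc_read  = fops->read;
    pops->proc_write = fops->write;
    pops->proc_lseek = fops->llseek;
}
```

Without the zeroing step, a caller that checks `proc_open != NULL` before calling it could jump through whatever bytes happened to be in the destination.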
* Hacky support for Linux 5.6+ and the Jetson AGX Orin (Joshua Bakita, 2023-06-29)
|   Works around change in parameters to proc initialization functions
|   via a hacky function which rewrites the layout. This also required
|   making all the struct file_operations writable.
|
|   Also start reducing dependency on nvgpu headers.
|
|   Known issues:
|   - Incorrect message printed in log after module is loaded. Unclear
|     if this is because the register detection logic is broken, or if
|     the layout of the data at NV_MC_BOOT_0 has changed.
|   - Not tested
* Quick dump of current state for Ben to review. (Joshua Bakita, 2023-06-22)
|
* Add APIs to enable/disable a channel and switch to or preempt a specific TSG (Joshua Bakita, 2021-09-23)
|   Adds:
|   - /proc/preempt_tsg which takes a TSG ID
|   - /proc/disable_channel which takes a channel ID
|   - /proc/enable_channel which takes a channel ID
|   - /proc/switch_to_tsg which takes a TSG ID
|
|   Also significantly expands documentation and structs available in
|   nvdebug.h.
* Fix a pre-4.19 bug in seq procfs files and add detailed channel print (Joshua Bakita, 2021-09-22)
|   - The sequence file infrastructure prior to kernel version 4.19 has a
|     bug in the retry code when the write buffer overflows that causes
|     our private iterator state to be corrupted. Work around this by
|     tracking some info out-of-band.
|   - Now supports including detailed channel status information from
|     channel RAM when printing the runlist.
|   - Adds helper function to probe for and return struct gk20a*.
* Use procfs instead of dmesg to print runlist (Joshua Bakita, 2021-08-26)
|   `cat /proc/runlist` to print the current runlist.
|
|   Also break nvdebug.c into nvdebug_entry.c, runlist.c, and
|   runlist_procfs.c.
* Add initial implementation (Joshua Bakita, 2021-08-26)
    Supports accessing and printing the runlist on the Jetson Xavier to
    dmesg. May work on other Jetson boards. Currently requires the nvgpu
    headers from NVIDIA's Linux4Tegra (L4T) source tree.