| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Add an "all" build target
- Fix build if libcuda.so is not on linker search path
- Do not assume that nvcc is available on $PATH
- Allow specifying CFLAGS and LDFLAGS when running make
- Allow passing non-standard CUDA build locations to make
Suggested usage if CUDA is installed in a non-standard location,
say, /playpen/jbakita/CUDA/cuda-archive/cuda-12.2:
make CUDA=/playpen/jbakita/CUDA/cuda-archive/cuda-12.2
|
|
|
|
|
|
|
| |
Make automatically provides CXX and CC, and these manual
definitions were being ignored.
Also fix a missing space in one of the messages from the tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use a different callback to intercept the TMD/QMD later in the
launch pipeline.
Major improvements:
- Fix bug with next mask not overriding stream mask on CUDA 11.0+
- Add CUDA 6.5-10.2 support for next- and global-granularity
partitioning masks on x86_64 and aarch64 Jetson
- Remove libdl dependency
- Partially support TMD/QMD Version 4 (Hopper)
Minor improvements:
- Check for sufficient CUDA version before before attempting to
apply a next-granularity partitioning mask
- Only check for sufficient CUDA version on the first call to
`libsmctrl_set_next_mask()` or `libsmctrl_set_global_mask()`,
rather than checking every time (lowers overheads)
- Check that TMD version is sufficient before modifying it
- Improve documentation
Issues:
- Partitioning mask bits have a different meaning in TMD/QMD
Version 4 and require floorsweeping and remapping information to
properly construct. This information will be forthcoming in
future releases of libsmctrl and nvdebug.
|
|
|
|
|
| |
Also allow building with an alternate version of g++ for backwards
compatibility.
|
|
|
|
|
|
|
|
|
|
|
| |
Stream-level masks should always override globally-set masks.
Next-kernel masks should always override both stream-level masks
and globally-set masks.
Tests reveal an issue with the next-kernel mask not overriding the
stream mask on CUDA 11.0+. CUDA appears to apply the per-stream
mask to the QMD/TMD after `launchCallback()` is triggered, making
it impossible to override as currently implemented.
|
|
|
|
| |
Also rewrite the global masking test to be much more thorough.
|
|
|
|
|
|
| |
nvcc links against a stub version of libcuda.so by default which is
missing a required symbol starting around CUDA 11.8. Use libdl to
resolve the symbol at runtime instead.
|
|
|
|
|
| |
Also use static linking for tests, to avoid a need to set
LD_LIBRARY_PATH to include the libsmctrl directory.
|
|
- Tested working with cuda_scheduling_examiner
- Supports everything described in the accepted RTAS'23 paper
- Can be used as either a shared or staticly-linked library
- Documented in libsmctrl.h
|