libsmctrl.git/Makefile, branch master

Makefile improvements

2025-03-20T20:46:21+00:00

- Add an "all" build target
- Fix build if libcuda.so is not on linker search path
- Do not assume that nvcc is available on $PATH
- Allow specifying CFLAGS and LDFLAGS when running make
- Allow passing non-standard CUDA build locations to make

Suggested usage if CUDA is installed in a non-standard location,
say, /playpen/jbakita/CUDA/cuda-archive/cuda-12.2:

make CUDA=/playpen/jbakita/CUDA/cuda-archive/cuda-12.2

Remove unused variables from Makefile

2024-12-19T20:04:26+00:00

Make automatically provides CXX and CC, and these manual
definitions were being ignored.

Also fix a missing space in one of the messages from the tests.

Bugfix stream-mask override, support old CUDA, and start Hopper support

2024-12-19T19:48:21+00:00

Use a different callback to intercept the TMD/QMD later in the
launch pipeline.

Major improvements:
- Fix bug with next mask not overriding stream mask on CUDA 11.0+
- Add CUDA 6.5-10.2 support for next- and global-granularity
  partitioning masks on x86_64 and aarch64 Jetson
- Remove libdl dependency
- Partially support TMD/QMD Version 4 (Hopper)

Minor improvements:
- Check for sufficient CUDA version before attempting to
  apply a next-granularity partitioning mask
- Only check for sufficient CUDA version on the first call to
  `libsmctrl_set_next_mask()` or `libsmctrl_set_global_mask()`,
  rather than checking every time (lowers overheads)
- Check that TMD version is sufficient before modifying it
- Improve documentation
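
As an illustration of the "check only on the first call" item above, here is
a minimal sketch in C. It is not the libsmctrl implementation: the names
`fake_query_cuda_version()`, `version_is_sufficient()`, `set_mask_checked()`,
and the `MIN_SUPPORTED_VERSION` threshold are hypothetical, and the version
query is a stand-in that only counts how often it is called.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a real CUDA version query. It counts its
 * invocations so that the effect of caching can be observed. */
static int query_count = 0;
static int fake_query_cuda_version(void) {
    query_count++;
    return 12020; /* CUDA encodes version 12.2 as 1000*12 + 10*2 */
}

/* Assumed threshold for this sketch only (CUDA 6.5). */
#define MIN_SUPPORTED_VERSION 6050

/* Returns true if the CUDA version is sufficient. The version is queried
 * on the first call only; later calls reuse the cached result.
 * Note: this simple cache is not thread-safe. */
static bool version_is_sufficient(void) {
    static int cached = -1; /* -1: unchecked, 0: insufficient, 1: sufficient */
    if (cached < 0)
        cached = fake_query_cuda_version() >= MIN_SUPPORTED_VERSION;
    return cached == 1;
}

/* Hypothetical mask setter that guards on the cached version check. */
static uint64_t current_mask = 0;
static int set_mask_checked(uint64_t mask) {
    if (!version_is_sufficient())
        return -1; /* refuse to apply a mask on unsupported versions */
    current_mask = mask;
    return 0;
}
```

Calling `set_mask_checked()` repeatedly triggers only one version query, which
is where the lower per-call overhead would come from.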

Issues:
- Partitioning mask bits have a different meaning in TMD/QMD
  Version 4 and require floorsweeping and remapping information to
  properly construct. This information will be forthcoming in
  future releases of libsmctrl and nvdebug.

Add build and use instructions to the README

2024-02-19T20:37:46+00:00

Also allow building with an alternate version of g++ for backwards
compatibility.

Add test that higher-granularity masks override lower-granularity ones

2024-02-14T20:36:25+00:00

Stream-level masks should always override globally-set masks.
Next-kernel masks should always override both stream-level masks
and globally-set masks.
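
The precedence rule above can be sketched as a small C function. This is an
illustration only, not code from libsmctrl: the `optional_mask` type and
`effective_mask()` are hypothetical names, and the sketch assumes the global
mask always has a value while stream and next-kernel masks may be unset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical representation of a mask that may or may not be set. */
typedef struct {
    bool is_set;
    uint64_t mask;
} optional_mask;

/* Pick the mask that should apply to a kernel launch:
 * next-kernel mask first, then stream mask, then the global mask. */
static uint64_t effective_mask(uint64_t global_mask,
                               optional_mask stream_mask,
                               optional_mask next_mask) {
    if (next_mask.is_set)
        return next_mask.mask;
    if (stream_mask.is_set)
        return stream_mask.mask;
    return global_mask;
}
```

A test for this rule sets masks at two granularities at once and checks that
the launch observes the higher-granularity one.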

Tests reveal an issue with the next-kernel mask not overriding the
stream mask on CUDA 11.0+. CUDA appears to apply the per-stream
mask to the QMD/TMD after `launchCallback()` is triggered, making
it impossible to override as currently implemented.

Add a README and tests for stream masking and next masking

2023-11-29T23:24:25+00:00

Also rewrite the global masking test to be much more thorough.

Build on CUDA 11.8+; Adds libdl dependency

2023-11-29T22:40:45+00:00

By default, nvcc links against a stub version of libcuda.so, which is
missing a required symbol starting around CUDA 11.8. Use libdl to
resolve the symbol at runtime instead.
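
For readers unfamiliar with the technique, runtime symbol resolution with
libdl looks roughly like the following. This is a generic sketch, not the
code from this commit: `resolve_symbol()` is a hypothetical helper, and the
library and symbol names to pass in are left to the caller. On older glibc
versions, the program must be linked with `-ldl`.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Look up `symbol` in the shared library `lib` at runtime.
 * Passing NULL for `lib` searches the running program and the libraries
 * it already has loaded. Returns NULL if the library or symbol is missing. */
static void *resolve_symbol(const char *lib, const char *symbol) {
    void *handle = dlopen(lib, RTLD_LAZY);
    if (handle == NULL)
        return NULL;
    /* The handle is intentionally left open so that the returned
     * pointer stays valid for the lifetime of the process. */
    return dlsym(handle, symbol);
}
```

Because the lookup happens when the program runs, the symbol is taken from the
libcuda.so actually loaded on the system rather than from the stub seen at
link time.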

Add test for libsmctrl_set_global_mask()

2023-10-17T19:32:51+00:00

Also use static linking for tests, to avoid a need to set
LD_LIBRARY_PATH to include the libsmctrl directory.

Initial reimplementation of libsmctrl as a library

2023-03-03T03:14:22+00:00

- Tested working with cuda_scheduling_examiner
- Supports everything described in the accepted RTAS'23 paper
- Can be used as either a shared or statically linked library
- Documented in libsmctrl.h