author Joshua Bakita <jbakita@cs.unc.edu> 2024-12-19 14:20:38 -0500
committer Joshua Bakita <jbakita@cs.unc.edu> 2024-12-19 14:48:21 -0500
commit d052c2df34ab41ba285f70965663e5a0832f6ac9
tree 0a761be3f62910275da8a2cad546a8902073b1e9
parent aa63a02efa5fc8701f0c3418704bbbc2051c1042
Bugfix stream-mask override, support old CUDA, and start Hopper support
Use a different callback to intercept the TMD/QMD later in the launch pipeline.

Major improvements:
- Fix bug with next mask not overriding stream mask on CUDA 11.0+
- Add CUDA 6.5-10.2 support for next- and global-granularity partitioning masks on x86_64 and aarch64 Jetson
- Remove libdl dependency
- Partially support TMD/QMD Version 4 (Hopper)

Minor improvements:
- Check for sufficient CUDA version before attempting to apply a next-granularity partitioning mask
- Only check for sufficient CUDA version on the first call to `libsmctrl_set_next_mask()` or `libsmctrl_set_global_mask()`, rather than checking every time (lowers overheads)
- Check that TMD version is sufficient before modifying it
- Improve documentation

Issues:
- Partitioning mask bits have a different meaning in TMD/QMD Version 4 and require floorsweeping and remapping information to properly construct. This information will be forthcoming in future releases of libsmctrl and nvdebug.
Diffstat (limited to 'Makefile')
-rw-r--r-- Makefile 2
1 files changed, 1 insertions, 1 deletions
diff --git a/Makefile b/Makefile
index 0e9ee3a..0d9b9f6 100644
--- a/Makefile
+++ b/Makefile
@@ -3,7 +3,7 @@ CXX = g++
 NVCC ?= nvcc
 # -fPIC is needed in all cases, as we may be linked into another shared library
 CFLAGS = -fPIC
-LDFLAGS = -lcuda -I/usr/local/cuda/include -ldl
+LDFLAGS = -lcuda -I/usr/local/cuda/include
 
 .PHONY: clean tests
 