<feed xmlns='http://www.w3.org/2005/Atom'>
<title>libsmctrl.git/Makefile, branch master</title>
<subtitle>Library to enable intra-context SM/TPC partitioning on NVIDIA GPUs</subtitle>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/'/>
<entry>
<title>Makefile improvements</title>
<updated>2025-03-20T20:46:21+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>jbakita@cs.unc.edu</email>
</author>
<published>2025-03-20T20:28:52+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=39c57bca3cbb42b1939a28377d8ef6cfab872450'/>
<id>39c57bca3cbb42b1939a28377d8ef6cfab872450</id>
<content type='text'>
- Add an "all" build target
- Fix build if libcuda.so is not on linker search path
- Do not assume that nvcc is available on $PATH
- Allow specifying CFLAGS and LDFLAGS when running make
- Allow passing non-standard CUDA build locations to make

Suggested usage if CUDA is installed in a non-standard location,
say, /playpen/jbakita/CUDA/cuda-archive/cuda-12.2:

make CUDA=/playpen/jbakita/CUDA/cuda-archive/cuda-12.2
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Add an "all" build target
- Fix build if libcuda.so is not on linker search path
- Do not assume that nvcc is available on $PATH
- Allow specifying CFLAGS and LDFLAGS when running make
- Allow passing non-standard CUDA build locations to make

Suggested usage if CUDA is installed in a non-standard location,
say, /playpen/jbakita/CUDA/cuda-archive/cuda-12.2:

make CUDA=/playpen/jbakita/CUDA/cuda-archive/cuda-12.2
</pre>
</div>
</content>
</entry>
<entry>
<title>Remove unused variables from Makefile</title>
<updated>2024-12-19T20:04:26+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>jbakita@cs.unc.edu</email>
</author>
<published>2024-12-19T19:59:15+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=32f89528085f3cdb183e50461ffcd109d1f9a58d'/>
<id>32f89528085f3cdb183e50461ffcd109d1f9a58d</id>
<content type='text'>
Make automatically provides CXX and CC, and these manual
definitions were being ignored.

Also fix a missing space in one of the messages from the tests.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make automatically provides CXX and CC, and these manual
definitions were being ignored.

Also fix a missing space in one of the messages from the tests.
</pre>
</div>
</content>
</entry>
<entry>
<title>Bugfix stream-mask override, support old CUDA, and start Hopper support</title>
<updated>2024-12-19T19:48:21+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>jbakita@cs.unc.edu</email>
</author>
<published>2024-12-19T19:20:38+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=d052c2df34ab41ba285f70965663e5a0832f6ac9'/>
<id>d052c2df34ab41ba285f70965663e5a0832f6ac9</id>
<content type='text'>
Use a different callback to intercept the TMD/QMD later in the
launch pipeline.

Major improvements:
- Fix bug with next mask not overriding stream mask on CUDA 11.0+
- Add CUDA 6.5-10.2 support for next- and global-granularity
  partitioning masks on x86_64 and aarch64 Jetson
- Remove libdl dependency
- Partially support TMD/QMD Version 4 (Hopper)

Minor improvements:
- Check for sufficient CUDA version before attempting to
  apply a next-granularity partitioning mask
- Only check for sufficient CUDA version on the first call to
  `libsmctrl_set_next_mask()` or `libsmctrl_set_global_mask()`,
  rather than checking every time (lowers overheads)
- Check that TMD version is sufficient before modifying it
- Improve documentation

Issues:
- Partitioning mask bits have a different meaning in TMD/QMD
  Version 4 and require floorsweeping and remapping information to
  properly construct. This information will be forthcoming in
  future releases of libsmctrl and nvdebug.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use a different callback to intercept the TMD/QMD later in the
launch pipeline.

Major improvements:
- Fix bug with next mask not overriding stream mask on CUDA 11.0+
- Add CUDA 6.5-10.2 support for next- and global-granularity
  partitioning masks on x86_64 and aarch64 Jetson
- Remove libdl dependency
- Partially support TMD/QMD Version 4 (Hopper)

Minor improvements:
- Check for sufficient CUDA version before attempting to
  apply a next-granularity partitioning mask
- Only check for sufficient CUDA version on the first call to
  `libsmctrl_set_next_mask()` or `libsmctrl_set_global_mask()`,
  rather than checking every time (lowers overheads)
- Check that TMD version is sufficient before modifying it
- Improve documentation

Issues:
- Partitioning mask bits have a different meaning in TMD/QMD
  Version 4 and require floorsweeping and remapping information to
  properly construct. This information will be forthcoming in
  future releases of libsmctrl and nvdebug.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add build and use instructions to the README</title>
<updated>2024-02-19T20:37:46+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>jbakita@cs.unc.edu</email>
</author>
<published>2024-02-19T20:37:10+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=5029151978a20831558480ae052c98b7e528af95'/>
<id>5029151978a20831558480ae052c98b7e528af95</id>
<content type='text'>
Also allow building with an alternate version of g++ for backwards
compatibility.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also allow building with an alternate version of g++ for backwards
compatibility.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add test that higher-granularity masks override lower-granularity ones</title>
<updated>2024-02-14T20:36:25+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>jbakita@cs.unc.edu</email>
</author>
<published>2024-02-14T20:36:25+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=b5281f5fc01fc925898c9323edab41b817df8661'/>
<id>b5281f5fc01fc925898c9323edab41b817df8661</id>
<content type='text'>
Stream-level masks should always override globally-set masks.
Next-kernel masks should always override both stream-level masks
and globally-set masks.

Tests reveal an issue with the next-kernel mask not overriding the
stream mask on CUDA 11.0+. CUDA appears to apply the per-stream
mask to the QMD/TMD after `launchCallback()` is triggered, making
it impossible to override as currently implemented.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Stream-level masks should always override globally-set masks.
Next-kernel masks should always override both stream-level masks
and globally-set masks.

Tests reveal an issue with the next-kernel mask not overriding the
stream mask on CUDA 11.0+. CUDA appears to apply the per-stream
mask to the QMD/TMD after `launchCallback()` is triggered, making
it impossible to override as currently implemented.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add a README and tests for stream masking and next masking</title>
<updated>2023-11-29T23:24:25+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2023-11-29T23:00:31+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=8062646a185baa6d3934d1e19743ac671e943fa8'/>
<id>8062646a185baa6d3934d1e19743ac671e943fa8</id>
<content type='text'>
Also rewrite the global masking test to be much more thorough.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also rewrite the global masking test to be much more thorough.
</pre>
</div>
</content>
</entry>
<entry>
<title>Build on CUDA 11.8+; Adds libdl dependency</title>
<updated>2023-11-29T22:40:45+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2023-11-29T22:40:45+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=3ee974590403730f2fea911a2574d335cedc4fab'/>
<id>3ee974590403730f2fea911a2574d335cedc4fab</id>
<content type='text'>
nvcc links against a stub version of libcuda.so by default, which is
missing a required symbol starting around CUDA 11.8. Use libdl to
resolve the symbol at runtime instead.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
nvcc links against a stub version of libcuda.so by default, which is
missing a required symbol starting around CUDA 11.8. Use libdl to
resolve the symbol at runtime instead.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add test for libsmctrl_set_global_mask()</title>
<updated>2023-10-17T19:32:51+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>jbakita@cs.unc.edu</email>
</author>
<published>2023-10-17T19:32:51+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=aba56610404c90143f7837aadfd19b769caf5460'/>
<id>aba56610404c90143f7837aadfd19b769caf5460</id>
<content type='text'>
Also use static linking for tests, to avoid a need to set
LD_LIBRARY_PATH to include the libsmctrl directory.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also use static linking for tests, to avoid a need to set
LD_LIBRARY_PATH to include the libsmctrl directory.
</pre>
</div>
</content>
</entry>
<entry>
<title>Initial reimplementation of libsmctrl as a library</title>
<updated>2023-03-03T03:14:22+00:00</updated>
<author>
<name>Joshua Bakita</name>
<email>bakitajoshua@gmail.com</email>
</author>
<published>2023-03-03T03:14:22+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/libsmctrl.git/commit/?id=7db0d3088a6e25c7c64999a20267f55751571dee'/>
<id>7db0d3088a6e25c7c64999a20267f55751571dee</id>
<content type='text'>
- Tested working with cuda_scheduling_examiner
- Supports everything described in the accepted RTAS'23 paper
- Can be used as either a shared or statically linked library
- Documented in libsmctrl.h
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Tested working with cuda_scheduling_examiner
- Supports everything described in the accepted RTAS'23 paper
- Can be used as either a shared or statically linked library
- Documented in libsmctrl.h
</pre>
</div>
</content>
</entry>
</feed>
