If you don’t care about the technical background and want to use Nix-packaged CUDA applications on a non-NixOS system, scroll down to Solutions.
Dynamic linking outside Nix
Suppose that we have a CUDA application like llama.cpp outside Nix. How does it get its library dependencies, such as the required CUDA libraries, on Linux? ELF binaries contain a dynamic section with information for the dynamic linker. It encodes, among other things, the required dynamic libraries. For instance, we can use patchelf or readelf to get the CUDA libraries that are used:
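A minimal sketch of that inspection, assuming binutils' readelf (or patchelf) is installed and that the binary sits at the hypothetical path ./llama-cli. The helper name needed_from_readelf is made up for illustration; it only filters the NEEDED entries out of readelf's dynamic-section dump.

```shell
# Hypothetical helper: reads `readelf -d` output on stdin and prints
# one required library (DT_NEEDED entry) per line.
needed_from_readelf() {
  sed -n 's/.*(NEEDED).*\[\([^]]*\)].*/\1/p'
}

# Usage (./llama-cli is an assumed path):
#   readelf -d ./llama-cli | needed_from_readelf
#   patchelf --print-needed ./llama-cli   # patchelf prints the list directly
```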
So the llama.cpp CLI uses the CUDA runtime (libcudart.so.12), cuBLAS (libcublas.so.12), and the CUDA driver library (libcuda.so.1). The CUDA driver library differs from the other libraries in that it is tightly coupled to the NVIDIA driver: it does not come with CUDA itself, but with the NVIDIA driver.
Dynamic library dependencies are resolved by the dynamic linker. The dynamic linker uses a cache of known libraries. The directories that are cached can be configured using /etc/ld.so.conf. In addition to that, an ELF binary can specify additional library paths, the so-called runtime path or rpath. However, rpath is not used in our llama-cli binary:
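One way to run that check, again assuming the hypothetical path ./llama-cli. The helper has_rpath is illustrative only: it scans readelf's dynamic-section dump for an RPATH or RUNPATH entry.

```shell
# Hypothetical helper: succeeds if `readelf -d` output on stdin contains
# an RPATH or RUNPATH entry.
has_rpath() {
  grep -Eq '\((RPATH|RUNPATH)\)'
}

# Usage (./llama-cli is an assumed path):
#   readelf -d ./llama-cli | has_rpath || echo "no rpath set"
#   patchelf --print-rpath ./llama-cli   # no path printed means no rpath
```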
So every library is loaded from directories configured in ld.so.conf. If you want more details on dynamic linking and how libraries are looked up, you can use the LD_DEBUG environment variable to get the dynamic linker to display library search paths:
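A sketch of such a trace, assuming glibc's dynamic linker (which understands LD_DEBUG=libs and writes the trace to stderr) and the hypothetical path ./llama-cli. The filter ld_debug_lookups is an illustrative helper that keeps only the lookup-related lines.

```shell
# Hypothetical helper: keep only the lines of LD_DEBUG=libs output that
# show which library is looked up, where, and which file is tried.
ld_debug_lookups() {
  grep -E 'find library=|search (cache|path)=|trying file='
}

# Usage (./llama-cli is an assumed path):
#   LD_DEBUG=libs ./llama-cli --version 2>&1 | ld_debug_lookups
```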
Dynamic linking in Nix
The standard dynamic linking approach is not compatible with the objectives of Nix. Nix aims for full reproducibility, which is not possible with a global dynamic linker cache.
Suppose that we have two applications that are both built against OpenBLAS (same library, same version), but with different OpenBLAS build configurations. With a global dynamic linker cache, we cannot distinguish between the two builds and ensure that each application is dynamically linked against the correct one. So, we cannot fully reproduce the intended configurations.
To resolve this issue, Nix avoids using a global cache for dynamic linking. Instead, it embeds the paths of the library dependencies in the binary’s runtime path (rpath). We can observe this by e.g. building the llama.cpp derivation from the nixpkgs repository and inspecting the required libraries and the rpath:
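A hedged sketch of that build-and-inspect step. It assumes a flake-enabled Nix, that the derivation is exposed as the llama-cpp attribute, and that the build output contains bin/llama-cli; whether CUDA is enabled depends on your nixpkgs configuration. The helper split_rpath is made up for illustration and just prints one rpath entry per line.

```shell
# Hypothetical helper: print each entry of a colon-separated rpath on its
# own line, which makes the /nix/store paths easier to read.
split_rpath() {
  printf '%s\n' "$1" | tr ':' '\n'
}

# Usage (attribute name and output layout are assumptions):
#   nix build nixpkgs#llama-cpp
#   patchelf --print-needed result/bin/llama-cli
#   split_rpath "$(patchelf --print-rpath result/bin/llama-cli)"
```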
As you can see, rather than just storing the required dynamic libraries and letting the dynamic linker resolve their full paths from its cache, a binary compiled with Nix embeds the full paths of its library dependencies in the Nix Store (/nix/store).
This solves the reproducibility issue, since each binary/library can fully specify the version it uses, and e.g. different build configurations of a binary will lead to different hashes in the output paths (/nix/store/<hash>-<name>-<version>-<output>).
The glitch in the matrix: the CUDA driver library
There is one glitch/impurity that creeps in. Remember that the CUDA driver library (libcuda.so.1) is tightly coupled to the NVIDIA driver? So, in the case of this particular library, we cannot dynamically link against arbitrary versions. A binary needs to link against the CUDA driver library that corresponds to the system’s NVIDIA driver.
NixOS solves this by allowing an impurity in the form of global state for this particular case. As can be seen in the rpath above, there is an entry /run/opengl-driver/lib. If the NVIDIA driver is configured on a NixOS system, NixOS guarantees that libcuda.so.1 is symlinked into this location. In this way, a binary will always use a CUDA driver library that is consistent with the system’s NVIDIA driver version.
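To see whether a given binary relies on this impure entry, one can test its rpath for /run/opengl-driver/lib. This is a minimal sketch; the helper rpath_has_driver_dir and the path result/bin/llama-cli are assumptions for illustration.

```shell
# Hypothetical helper: succeeds if the colon-separated rpath in $1 has
# /run/opengl-driver/lib as one of its entries.
rpath_has_driver_dir() {
  case ":$1:" in
    *:/run/opengl-driver/lib:*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (result/bin/llama-cli is an assumed path):
#   rpath_has_driver_dir "$(patchelf --print-rpath result/bin/llama-cli)" &&
#     ls -l /run/opengl-driver/lib/libcuda.so.1
```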
Sadly, this doesn’t work on non-NixOS systems, because they don’t have the /run/opengl-driver/lib directory. This brings us to some hacks to resolve this issue…
Solutions
Make /run/opengl-driver/lib and symlink the driver library
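A minimal sketch of this approach. The helper link_driver_lib is hypothetical, and the location of the system's libcuda.so.1 is an assumption that varies by distribution (for example /usr/lib/x86_64-linux-gnu on Debian and Ubuntu).

```shell
# Hypothetical helper: create the target directory and symlink the driver
# library into it. $1 = path to the system's libcuda.so.1, $2 = target dir.
link_driver_lib() {
  mkdir -p "$2" && ln -sf "$1" "$2/libcuda.so.1"
}

# Usage (run as root; the source path is an assumption):
#   link_driver_lib /usr/lib/x86_64-linux-gnu/libcuda.so.1 /run/opengl-driver/lib
```

On many distributions /run is a tmpfs, so the directory and link may have to be recreated after a reboot.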
Preload the driver library
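A sketch of the preload approach. The helper find_libcuda is hypothetical, and the candidate directories in the usage line are assumptions about common distribution layouts, not an exhaustive list.

```shell
# Hypothetical helper: print the first libcuda.so.1 found in the given
# directories; fails if none of them contains it.
find_libcuda() {
  for dir in "$@"; do
    if [ -e "$dir/libcuda.so.1" ]; then
      printf '%s\n' "$dir/libcuda.so.1"
      return 0
    fi
  done
  return 1
}

# Usage (directories and binary path are assumptions):
#   lib=$(find_libcuda /usr/lib/x86_64-linux-gnu /usr/lib64 /usr/lib) &&
#     LD_PRELOAD="$lib" result/bin/llama-cli --version
```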
Warning
LD_PRELOAD does not cover all cases. When a program/library uses runtime compilation (e.g. Triton), the Nix derivation will typically burn the path /run/opengl-driver/lib into the package as a linker path (i.e. -L/run/opengl-driver/lib). LD_PRELOAD does not override this and will fail in such cases.
Warning
Avoid using LD_LIBRARY_PATH unless the CUDA driver library is in a directory by itself. Using LD_LIBRARY_PATH with a path with multiple libraries can also override other libraries. In the best case, this breaks reproducibility. In the worst case it breaks the application.
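If you do go the LD_LIBRARY_PATH route, a small guard can check that the directory holds nothing but the driver library. This is only a sketch: the helper dir_has_only_libcuda and the directory $HOME/cuda-driver-lib are hypothetical, and hidden files are not inspected.

```shell
# Hypothetical helper: succeeds if directory $1 is non-empty and every
# (non-hidden) entry in it is named libcuda.so*.
dir_has_only_libcuda() {
  [ -d "$1" ] || return 1
  for entry in "$1"/*; do
    # An unmatched glob (empty directory) leaves a non-existent path.
    [ -e "$entry" ] || [ -L "$entry" ] || return 1
    case "${entry##*/}" in
      libcuda.so*) ;;
      *) return 1 ;;
    esac
  done
  return 0
}

# Usage (directory and binary path are assumptions):
#   dir_has_only_libcuda "$HOME/cuda-driver-lib" &&
#     LD_LIBRARY_PATH="$HOME/cuda-driver-lib" result/bin/llama-cli --version
```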
nixGL
nixGL can wrap a program to resolve the CUDA driver library.