Installing NVIDIA CUDA on Leap

2024-10-02 00:00:00 +0000 | Egbert Eich | No License

In a blog post last year Simplify GPU Application Development with HMM on Leap we've described how to install CUDA using the NVIDIA open kernel driver built by SUSE. For this, we utilized driver packages from the graphics driver repository for openSUSE Leap hosted by NVIDIA. This happend to work at the time, as at the time the kernel driver version for the graphics driver happened to be the same as the one used by CUDA. This, however, is not usually the case and at present, therefore this method fails.

NVIDIA CUDA Runtime Installation on openSUSE Leap 15.6 Bare Metal

To recap, CUDA will work with the 'legacy' proprietary kernel driver that has existed for a long time as well as the open driver KMP provided with CUDA. There is also a pre-built driver KMP avaiable on openSUSE Leap (starting with version 15.5). The latter provides the additional perks:

since it is fully pre-build it does not require an additional tool chain to complete the build during installation - which the CUDA provided drivers will install as dependencies.
This helps to keep an installation foot print small which is desirable for HPC compute nodes or AI clusters.
It is singed with the same key as the kernel and its components and thus will integrate seamlessly into a secure boot environment without enrolling an additional MOK - which usually requires access to the system console.

The NVIDIA open driver supports all recent GPUs, in fact, it is required for cutting edge platforms such as Grace Hopper or Blackwell, and recommended for all Turing, Ampere, Ada Lovelace, or Hopper architectures. The the proprietary driver is only still required for older GPU architectures.

The versions of the driver stacks for CUDA and for graphics frequently differ. CUDA's higher level libraries will work with any version equal or later to the one these have been released with. This means for instance, that CUDA 12.5 will run with a driver stack version greater or equal to 555.42.06. The components of the lower level stack - ie. those packages with the driver generation in their name (at the time of writing G06) - need to match the kernel driver version.
The component stack for graphics is always tightly version coupled. To accomodate versions difference for CUDA and graphics, openSUSE Leap is now providing two sets of NVIDIA Open Driver KMPs: the version targetting CUDA has the string -cuda prepended before -kmp in the package name. With this, installing CUDA using the open driver becomes quite simple.

CUDA comes with a number of meta-packages which provide an easy way to install certain aspects while still obeying required dependencies. A more detailed list of these meta packages and their purposes will follow in a future blog. To run applications built with CUDA the only components required are the runtime. If we plan to set up a minimal system - such as an HPC compute node - these are the only components to be installed. They consist of the low level driver stack as well as the CUDA libraries. In a containerized system, the libraries would reside inside the application container.

To install the runtime stack for CUDA 12.9, we would simply run:

# zypper ar https://developer.download.nvidia.com/compute/cuda/repos/opensuse15/x86_64/cuda-opensuse15.repo
# zypper --gpg-auto-import-keys refresh
# zypper -n in -y nv-prefer-signed-open-driver
# version=$(rpm -qa --queryformat '%{VERSION}\n' nv-prefer-signed-open-driver | cut -d "_" -f1 | sort -u | tail -n 1)
# zypper -n in -y --auto-agree-with-licenses \
    nvidia-compute-utils-G06 = ${version} cuda-libraries-12-9

The command above will install the SUSE-built and signed driver. Further changes to the module configuration as described in the previous blog are no longer required.

Now we can either reboot or run

# modprobe nvidia

to make sure, the driver is loaded. Once it's loaded, we ought to check if the installation has been successful by running:

nvidia-smi -q

If the driver has loaded successfully, we should see information about the GPU, the driver and CUDA version installed:

==============NVSMI LOG==============

Timestamp                                 : Mon Aug 12 09:31:57 2024
Driver Version                            : 555.42.06
CUDA Version                              : 12.5

Attached GPUs                             : 1
GPU 00000000:81:00.0
    Product Name                          : NVIDIA L4
    Product Brand                         : NVIDIA
    Product Architecture                  : Ada Lovelace
    Display Mode                          : Enabled
    Display Active                        : Disabled
    Persistence Mode                      : Disabled
    Addressing Mode                       : HMM
...

NVIDIA CUDA Runtime Installation for containerized Workloads

Containerized workloads using CUDA have their highlevel CUDA runtime libraries installed inside the container. The low level libraries are made available when the container is run. This allows running containerized workloads relatively independent of the version of the driver stack as long as the CUDA version is not newer than the version of driver stack used. Depending on what container environment we plan to use, there are different ways to achieve this. We've learned in a previous blog, how to do this using the NVIDIA GPU Operator and a driver container. This does not require to install any NVIDIA components on the container host. The kernel drivers will be loaded from within the driver container, they do not need to be installed on the container host. However, there are other container systems and different ways to install CUDA for K8s.

Set up Container Host for `podman`

For podman, we require a different approach. For this, kernel driver as well as the nvidia-container-toolkit need to be installed on the container host. We need to use the container toolkit to configure podman to set up the GPU device files and the low level driver stack (libraries and tools) inside a container.

# zypper ar https://developer.download.nvidia.com/compute/cuda/repos/opensuse15/x86_64/cuda-opensuse15.repo
# zypper --gpg-auto-import-keys refresh
# zypper -n in -y nv-prefer-signed-open-driver
# version=$(rpm -qa --queryformat '%{VERSION}\n' nv-prefer-signed-open-driver | cut -d "_" -f1 | sort -u | tail -n 1)
# zypper -n in -y --auto-agree-with-licenses \
    nvidia-compute-utils-G06 = ${version} \
    nvidia-container-toolkit

Now, we can configure podman for the devices found in our system:

# nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

This command should show INFO as well as some WARN messages but should not return any lines marked ERROR. To run containers which use NVIDIA GPUs, we need to specify the device using the argument: --device nvidia.com/gpu=<device>. Here, <device> may be on individual GPU or - on GPUs that support Multi-Instance GPU (MIG) - an individual MIG device: ie GPU_ID or GPU_ID:MIG_ID) or all. We are also able to specify multiple devices:

$ podman run --device nvidia.com/gpu=0 \
    --device nvidia.com/gpu=1:0 \
	...

This uses GPU 0 and MIG device 1 on GPU 1.
We may find the available devices by running:

$ nvidia-ctk cdi list

Set up Container Host for K8s (RKE2)

There are different ways to set up the K8s to let containers use NVIDIA devices. One has been described in a separate blog), however the driver stack can also be installed on the container host itself while still using the GPU Operator. The disadvantage of this approach is that the driver stack needs to be installed on each container host.

# zypper ar https://developer.download.nvidia.com/compute/cuda/repos/opensuse15/x86_64/cuda-opensuse15.repo
# zypper --gpg-auto-import-keys refresh
# zypper -n in -y nv-prefer-signed-open-driver
# version=$(rpm -qa --queryformat '%{VERSION}\n' nv-prefer-signed-open-driver | cut -d "_" -f1 | sort -u | tail -n 1)
# zypper -n in -y --auto-agree-with-licenses \
    nvidia-compute-utils-G06 = ${version}

We need to perform above steps on each node (ie, server, agents) which have an NVIDIA GPU installed. How to continue depends whether we perfer to use the NVIDIA GPU Operator (without a driver container) or just use the NVIDIA Device Plugin container.

with the NVIDIA GPU Operator

Once the previous steps have been completed, we can start the GPU Operator:

OPERATOR_VERSION="v24.6.1"
DRIVER_VERSION="555.42.06"
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install \
  -n gpu-operator --generate-name \
  --wait \
  --create-namespace \
  --version=${OPERATOR_VERSION} \
  nvidia/gpu-operator \
  --set driver.enabled=false \
  --set driver.version=${DRIVER_VERSION} \
  --set operator.defaultRuntime=containerd \
  --set toolkit.env[0].name=CONTAINERD_CONFIG \
  --set toolkit.env[0].value=/var/lib/rancher/rke2/agent/etc/containerd/config.toml \
  --set toolkit.env[1].name=CONTAINERD_SOCKET \
  --set toolkit.env[1].value=/run/k3s/containerd/containerd.sock \
  --set toolkit.env[2].name=CONTAINERD_RUNTIME_CLASS \
  --set toolkit.env[2].value=nvidia \
  --set toolkit.env[3].name=CONTAINERD_SET_AS_DEFAULT \
  --set-string toolkit.env[3].value=true

We need to set OPERATOR_VERSION to the GPU Operator version we whish like to use and DRIVER_VERSION to the driver version we are running. Now, we are able to see the pods starting:

kubectl get pods -n gpu-operator

The final state should either be Running or Completed. Now, we should check the logs of the nvidia-operator-validator whether the driver has been set up successfully

# kubectl logs -n gpu-operator -l app=nvidia-operator-validator

and the nvidia-cuda-validator if CUDA workloads can be run successfully.

# kubectl logs -n gpu-operator -l app=nvidia-cuda-validator

Both commands should return that the validations have been successful.

without the NVIDIA GPU Operator

Alternatively, we can set up NVIDIA support without the help of the GPU Operator. For this, we will need to create a configuration - much like for podman - and use the NVIDIA Device Plugin. If not done already, we need to install the nvidia-container-toolkit package, for this:

# zypper -n in -y nvidia-container-toolkit

and restart RKE2:

# systemctl restart rke2-server

on the K8s server or

# systemctl restart rke2-agent

on each agent with NVIDIA hardware installed. This will make sure, the container runtime is configure appropriately.

Once the server is again up and running, we should find this entry in /var/lib/rancher/rke2/agent/etc/containerd/config.toml:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia"]
  runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia".options]
  BinaryName = "/usr/bin/nvidia-container-runtime"
  SystemdCgroup = true

Which is required to use the correct runtime for workloads requiring NVIDA harware. Next, we need to configure the NVIDIA RuntimeClass as an additional K8s runtime so that any user whose pods need access to the GPU can request them by using the runtime class nvidia:

# kubectl apply -f - <<EOF
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
EOF

Since Device Plugin version 0.15 we need to set a hardware feature label telling the DaemonSet that there is a GPU on the node. This is normally set by the Node or GPU Feature Discovery daemons both deployed as part of the GPU Operator chart. Since we don't use this, we will have to set this manually for each node with a GPU:

kubectl label nodes <node_list> nvidia.com/gpu.present="true"

and install the NVIDIA Device Plugin:

# helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
# helm repo update
# helm upgrade -i nvdp nvdp/nvidia-device-plugin \
  --namespace nvidia-device-plugin --create-namespace \
  --set runtimeClassName=nvidia

The pod should start up, complete the detection and tag the nodes with the number of GPUs available. To verify, we run:

# kubectl get pods -n nvidia-device-plugin
NAME                              READY   STATUS    RESTARTS   AGE
nvdp-nvidia-device-plugin-tjfcg   1/1     Running   0          7m23s
# kubectl get node `cat /etc/HOSTNAME` -o json | jq .status.capacity
{
  "cpu": "32",
  "ephemeral-storage": "32647900Ki",
  "hugepages-1Gi": "0",
  "hugepages-2Mi": "0",
  "memory": "65295804Ki",
  "nvidia.com/gpu": "1",
  "pods": "110"
}