Quick Install: Kubernetes Cluster (RKE2) with Warewulf

2025-08-26 00:00:00 +0000 | Egbert Eich | No License

In a previous blog post we've learned how we can leverage Warewulf - a node deployment tool used in High Performance Computing (HPC) - to deploy a K8s cluster using RKE2. The deployment was described step by step, explaining the rationale behind each step.
This post supplements the previous one by providing a quick install guide that takes advantage of pre-built containers. With these, we are able to perform the deployment of an RKE2 cluster with a few simple commands.
Warewulf uses PXE boot to bring up the machines, so your cluster nodes need to be configured for this, and the network needs to be configured so that the nodes will PXE-boot from the Warewulf deployment server. How to do this and how to make nodes known to Warewulf is covered in a post about the basic Warewulf setup.

Prerequisite: Prepare Configuration Overlay Template for RKE2

K8s agents and servers use a shared secret for authentication - the connection token. Moreover, agents need to know the host name of the server to contact.
This information is provided in a config file: /etc/rancher/rke2/config.yaml. We let Warewulf provide this config file as a configuration overlay. A suitable overlay template can be installed from the package warewulf4-overlay-rke2:

 zypper -n install warewulf4-overlay-rke2

Set Up the Environment

First we set a few environment variables which we will use later.

cat > /tmp/rke2_environment.sh <<"EOF"
server=<add the server node here>
agents="<list of agents>"
container_url=docker://registry.opensuse.org/network/cluster/containers/containers-15.6
server_container=$container_url/warewulf-rke2-server:v1.32.7_rke2r1
agent_container=$container_url/warewulf-rke2-agent:v1.32.7_rke2r1
token="$(for n in {1..41}; do printf %2.2x $(($RANDOM & 0xff));done)"
EOF

Here we assume we have one server node and multiple agent nodes whose host names we will have to substitute in the above script. Lists of nodes can be specified as a comma-separated list, but also as a range of nodes using square brackets (for example, k8s-agent[00-15] would refer to k8s-agent00 to k8s-agent15), or as lists and ranges combined.
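The range syntax maps to the same host names that plain bash brace expansion generates, which makes it easy to check what a range covers. A small sketch (the node names below are made up for illustration):

```shell
# Warewulf's k8s-agent[00-03] refers to the same hosts that bash brace
# expansion produces (node names here are assumptions for illustration):
echo k8s-agent{00..03}
# → k8s-agent00 k8s-agent01 k8s-agent02 k8s-agent03
```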

Also, the script generates a connection token for the entire cluster. This token is used to allow agents (and secondary servers) to connect to the primary server [1]. If we set up the server persistently outside of Warewulf, we either need to add this token to the config file /etc/rancher/rke2/config.yaml on the server before we start the rke2-server service for the first time, or grab it from the server once it has booted.
In this example, we are using a 'short token' which does not contain the fingerprint of the root CA of the cluster. For production environments you are encouraged to create certificates for your K8s cluster beforehand and calculate the fingerprint of the root CA for use in the agent token. Refer to the appendix below for how to do so.
In case your server is already running, you will have to grab the token from it instead and set the variable token in the script above accordingly. It can be found in the file /var/lib/rancher/rke2/server/node-token.
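A minimal sketch of grabbing and classifying the token; the ssh step is an assumption about your setup and is commented out, and the sample value is not a real token:

```shell
# Hypothetical: fetch the token from an already running server
# (requires ssh access to ${server}; uncomment on a live setup):
#   token="$(ssh "${server}" cat /var/lib/rancher/rke2/server/node-token)"
# A full token embeds the cluster CA fingerprint and starts with "K10";
# a short token, like the one generated above, does not.
token="K10d2fb66::server:9a41c5d2"   # shortened sample value for illustration
case "$token" in
  K10*) echo "full token (includes CA fingerprint)" ;;
  *)    echo "short token" ;;
esac
```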

Finally, we set the environment:

source /tmp/rke2_environment.sh

Obtain the Node Images

The openSUSE registry has pre-built images for K8s agents. Taking advantage of these removes the tedious task of preparing the base images. We pull the image and build the container:

wwctl container import ${agent_container} leap15.6-RKE2-agent
wwctl container build leap15.6-RKE2-agent

In the previous blog post we've talked about the implications of deploying K8s servers using Warewulf. If we plan to deploy the server using Warewulf as well, we also pull the server container from the registry and build it:

wwctl container import ${server_container} leap15.6-RKE2-server
wwctl container build leap15.6-RKE2-server

Set up Profiles

We utilize Warewulf profiles to configure different aspects of the setup and assign these profiles to the different types of nodes as needed.

Set up rootfs

The settings in this profile will ensure that a rootfs type is set so that a call to pivot_root() performed by the container runtime will succeed.

wwctl profile add container-host
wwctl profile set --root tmpfs -A "crashkernel=no net.ifnames=1 rootfstype=ramfs" container-host

Set up Container Storage

Since the deployed nodes are ephemeral - that is, they run out of a RAM disk - we may want to store container images as well as administrative data on disk. Luckily, Warewulf is capable of configuring disks on the deployed nodes. For this example, we assume that the container storage should reside on the disk /dev/sdb, that we are using the entire disk, and that we want to wipe the data at every boot. Of course, other setups are possible; check the upstream documentation for details.

wwctl profile add container-storage
wwctl profile set --diskname /dev/sdb --diskwipe "true" \
      --partnumber 1 --partcreate --partname container_storage \
      --fsname container_storage --fswipe --fsformat ext4 \
      --fspath /var/lib/rancher \
      container-storage

Note: When booting the node from Warewulf, the entire disk (/dev/sdb in this example) will be wiped. Make sure it does not contain any valuable data!

Set up the Connection Key

When we've set up our environment, we generated a unique connection key which we will use throughout this K8s cluster to allow agents (and secondary servers) to connect to the primary server.

wwctl profile add rke2-config-key
wwctl profile set --tagadd="connectiontoken=${token}" \
              -O rke2-config rke2-config-key

Set up the Server Profile

Here, we set up a profile which is used to point agents - and secondary servers - to the primary server:

wwctl profile add rke2-config-first-server
wwctl profile set --tagadd="server=${server}" -O rke2-config rke2-config-first-server

Set up and Start the Nodes

Now, we are ready to deploy our nodes.

Set up and deploy the Server Node

If applicable, set up the server node:

wwctl node set \
      -P default,container-host,container-storage,rke2-config-key \
      -C leap15.6-RKE2-server ${server}
wwctl overlay build ${server}

Now, we are ready to PXE-boot the first server.

Set up and deploy the Agent Nodes

We now configure the agent nodes and build the overlays:

wwctl node set \
      -P default,container-host,container-storage,rke2-config-key,rke2-config-first-server \
      -C leap15.6-RKE2-agent ${agents}
wwctl overlay build ${agents}

Once the first server node is up and running we can now start PXE-booting the agents. They should connect to the server automatically.

Check Node Status

To check the status of the nodes, we connect to the server through ssh and run

kubectl get nodes

to list the available nodes and their current state.
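While agents are still joining, some entries may be missing or show NotReady. A quick way to count ready nodes is to filter the STATUS column; the sample output below is fabricated for illustration (real node names and versions will differ), and on a live cluster you would pipe the real kubectl command instead:

```shell
# Hypothetical sample output of `kubectl get nodes`:
nodes="NAME          STATUS     ROLES                       AGE   VERSION
k8s-server    Ready      control-plane,etcd,master   12m   v1.32.7+rke2r1
k8s-agent00   Ready      <none>                      4m    v1.32.7+rke2r1
k8s-agent01   NotReady   <none>                      1m    v1.32.7+rke2r1"
# Count nodes whose STATUS column reads exactly "Ready" (skip the header):
ready=$(echo "$nodes" | awk 'NR > 1 && $2 == "Ready"' | wc -l)
echo "$ready"
```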

Appendix: Set Up CA Certificates and Calculate the Node Access Token

Rancher provides a script (generate-custom-ca-certs.sh) to generate the different certificates required for a K8s cluster. First, download this script to the current directory:

curl -sOL --output-dir . \
     https://github.com/k3s-io/k3s/raw/master/contrib/util/generate-custom-ca-certs.sh

and run the following commands:

export DATA_DIR=$(mktemp -d $(pwd)/tmp-XXXXXXXX)
chmod go-rwx $DATA_DIR
./generate-custom-ca-certs.sh
wwctl overlay rm -f rke2-ca
wwctl overlay create rke2-ca
files="server-ca.crt server-ca.key client-ca.crt client-ca.key \
       request-header-ca.crt request-header-ca.key \
       etcd/peer-ca.crt etcd/peer-ca.key etcd/server-ca.crt etcd/server-ca.key"
for d in $files; do
    d=${DATA_DIR}/server/tls/$d
    wwctl overlay import --parents rke2-ca $d /var/lib/rancher/rke2/${d#${DATA_DIR}/}.ww
    wwctl overlay chown rke2-ca /var/lib/rancher/rke2/${d#${DATA_DIR}/}.ww 0
    wwctl overlay chmod rke2-ca /var/lib/rancher/rke2/${d#${DATA_DIR}/}.ww $(stat -c %a $d)
done
ca_hash=$(openssl x509 -in $DATA_DIR/server/tls/root-ca.crt -outform DER \
    | sha256sum)
token="$( printf "K10${ca_hash%  *}::server:"; \
          for n in {1..41}; do printf %2.2x $(($RANDOM & 0xff)); done )"
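The shell parameter expansions used above are easy to misread, so here they are in isolation (the DATA_DIR value below is a made-up example):

```shell
# sha256sum appends "  -" when reading stdin; ${var%  *} strips that
# trailing field, leaving only the 64-character hex digest:
ca_hash="$(printf 'example' | sha256sum)"
digest="${ca_hash%  *}"
echo "${#digest}"    # → 64

# ${d#${DATA_DIR}/} strips the temporary directory prefix, so each file
# lands under /var/lib/rancher/rke2/ in the overlay:
DATA_DIR=/tmp/tmp-ab12cd34   # example value, not a real directory
d=${DATA_DIR}/server/tls/etcd/peer-ca.crt
echo "/var/lib/rancher/rke2/${d#${DATA_DIR}/}.ww"
# → /var/lib/rancher/rke2/server/tls/etcd/peer-ca.crt.ww
```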

The certificates will be passed to the K8s nodes in the system overlay. If you are setting up multiple K8s clusters, you may want to create separate certificates for each cluster and place the overlay name into the cluster's namespace instead of using rke2-ca.
Before you delete $DATA_DIR, you may want to save the root and intermediate certificates and keys ($DATA_DIR/server/tls/root-ca.*, $DATA_DIR/server/tls/intermediate-ca.*) to a safe location. If you have a root and intermediate CA you want to reuse, you may copy these into the directory $DATA_DIR/server/tls before running generate-custom-ca-certs.sh. Use $token as the agent token instead.

  1. Secondary servers need to find the primary server endpoint just like agent nodes do. Therefore, they receive the access token in the same way.