Storage and Workload Separation in Kubernetes with Longhorn
Context
Let’s say you have a k8s or k3s cluster composed of 6 worker nodes, and for some reason you want to separate compute from storage: some worker nodes run only Longhorn pods and take care of storage (volumes, replicas), while the rest of the worker nodes run any other pods and consume the volumes hosted on those storage nodes.
For example, you want the worker nodes node-1 and node-2 to run only Longhorn pods and provide volumes ready to be consumed, while the worker nodes node-3, node-4, node-5, and node-6 keep no volumes on their disks, run only workload pods, and consume the volumes from node-1 and node-2. How do you do that?
How
You can do this on an existing Longhorn installation or on a freshly installed one.
Let’s start:
We’ll call the worker nodes node-1 and node-2 the storage nodes, and the worker nodes node-3, node-4, node-5, and node-6 the compute nodes.
Configure the storage nodes
-
Label the storage nodes with the label node.longhorn.io/create-default-disk=true:
kubectl label nodes node-1 node-2 node.longhorn.io/create-default-disk=true
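An optional sanity check to confirm the label landed where you expect:

```shell
# List only the nodes carrying the Longhorn default-disk label;
# node-1 and node-2 should show up here.
kubectl get nodes -l node.longhorn.io/create-default-disk=true
```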
-
Install Longhorn with the setting “Create Default Disk on Labeled Nodes” set to true. If you’re using Helm to install Longhorn, you can set that value using:
defaultSettings:
  # -- Setting that allows Longhorn to automatically create a default disk only on nodes with the label "node.longhorn.io/create-default-disk=true" (if no other disks exist). When this setting is disabled, Longhorn creates a default disk on each node that is added to the cluster.
  createDefaultDiskLabeledNodes: true
If you’re using another method, have a look at how to configure this setting: https://longhorn.io/docs/archives/1.2.2/references/settings/#create-default-disk-on-labeled-nodes
Now, if you go to the Longhorn dashboard, you’ll see that the compute nodes are disabled and have no disk.
📝📌 PS: If you’ve already installed Longhorn and want to apply this setting to a running installation, you need to remove the compute nodes from the cluster and bring them back again for the change to take effect.
Preventing non-Longhorn pods from being scheduled on the storage nodes
The changes we’ve made so far only tell Longhorn to stop using the compute nodes for storage; they don’t stop pods other than Longhorn’s from being scheduled on the storage nodes. Taints and tolerations to the rescue!
Longhorn components can be configured to tolerate taints. So the idea is to taint the storage nodes and make only the Longhorn pods tolerate these taints. As a result, only Longhorn pods can be scheduled on the storage nodes, and every other pod is rejected.
-
Apply the taint on the storage nodes
kubectl taint nodes node-1 node-2 node=storage:NoSchedule
-
Make the Longhorn pods tolerate the taint
To keep this article short: the steps differ depending on whether you’ve already installed Longhorn or are planning to install it, and on the install method. So follow the Longhorn docs for your case: https://longhorn.io/docs/archives/1.2.2/advanced-resources/deploy/taint-toleration/#setting-up-taints-and-tolerations-after-longhorn-has-been-installed
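For the Helm route, a minimal sketch of the values matching the taint applied above — assuming the chart’s defaultSettings.taintToleration key, which covers Longhorn’s system-managed components (see the linked doc for the user-deployed components):

```yaml
defaultSettings:
  # Toleration for the "node=storage:NoSchedule" taint, in the
  # "key=value:Effect" format Longhorn expects.
  taintToleration: "node=storage:NoSchedule"
```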
Tips: If you’re trying this on an already-installed Longhorn, one of the steps in the link I provided asks you to detach all volumes. An easy way is to run these commands for every namespace you have:
-
This will scale down all deployments to 0 in the namespace my-namespace:
kubectl scale deployment -n my-namespace --replicas=0 --all
-
This will scale down all stateful sets to 0 in the namespace my-namespace:
kubectl scale statefulset -n my-namespace --replicas=0 --all
-
Once the setting is configured, you can bring them back with (if the replica count was 1, of course):
kubectl scale deployment -n my-namespace --replicas=1 --all
kubectl scale statefulset -n my-namespace --replicas=1 --all
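If you have many namespaces, the per-namespace scale-down above can be sketched as a loop (assumes your kubectl context points at the right cluster; note it doesn’t record the original replica counts, so write those down first if they differ from 1):

```shell
# Scale every deployment and statefulset in every namespace down to 0
# so that all Longhorn volumes can detach.
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
  kubectl scale deployment  -n "$ns" --replicas=0 --all
  kubectl scale statefulset -n "$ns" --replicas=0 --all
done
```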
Now, you have node-1 and node-2 responsible only for storing volumes and replicas and nothing else, and node-3, node-4, node-5, and node-6 responsible only for your workload pods.