- Set up a cluster of 3 worker nodes.
- Install Longhorn and set `Default Replica Count = 2` (because we will turn off one node).
- Create a StatefulSet with 2 pods using the command: `kubectl create -f https://raw.githubusercontent.com/longhorn/longhorn/master/examples/statefulset.yaml`
- Create a volume + PV + PVC named `vol1`, then create a Deployment (1 pod) named `shell` from the default `ubuntu` image that uses PVC `vol1` mounted under `/mnt/vol1`.
- Find a node that hosts one pod of the StatefulSet/Deployment and power off that node.
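
The last two setup steps could be scripted roughly as follows. This is a minimal sketch, not part of the original test: it assumes the PVC `vol1` already exists in the current namespace, and the `app: shell` label, the `sleep infinity` command, and both function names are illustrative choices.

```shell
# Create the single-pod `shell` Deployment that mounts PVC `vol1` at /mnt/vol1.
# Assumption: the `app: shell` label and `sleep infinity` are illustrative only.
create_shell_deployment() {
  kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shell
spec:
  replicas: 1
  selector:
    matchLabels:
      app: shell
  template:
    metadata:
      labels:
        app: shell
    spec:
      containers:
      - name: shell
        image: ubuntu
        command: ["sleep", "infinity"]
        volumeMounts:
        - name: vol1
          mountPath: /mnt/vol1
      volumes:
      - name: vol1
        persistentVolumeClaim:
          claimName: vol1
EOF
}

# Print the node that hosts the given pod, i.e. the node to power off.
pod_node() {
  kubectl get pod "$1" -o jsonpath='{.spec.nodeName}'
}
```

Calling `create_shell_deployment` and then `pod_node <pod-name>` gives the name of the node to power off for the Deployment case.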
StatefulSet
if `NodeDownPodDeletionPolicy` is set to `do-nothing` | `delete-deployment-pod`
- Wait until the `pod.deletionTimestamp` has passed.
- Verify that no replacement pod is created and the pod is stuck in `Terminating` forever.
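
One way to check that the `pod.deletionTimestamp` has passed is sketched below. It is an illustration only: the function name is hypothetical, and it relies on the timestamp being reported in UTC in the `YYYY-MM-DDTHH:MM:SSZ` form, so that a plain string comparison matches chronological order.

```shell
# Succeed (exit 0) only if the pod has a deletionTimestamp that lies in the past.
deletion_timestamp_passed() {
  pod="$1"
  ts=$(kubectl get pod "$pod" -o jsonpath='{.metadata.deletionTimestamp}')
  [ -n "$ts" ] || return 1            # pod is not being deleted at all
  now=$(date -u +%Y-%m-%dT%H:%M:%SZ)  # current time in the same UTC format
  # Equal-length UTC timestamps in this format sort chronologically as strings.
  [ "$ts" \< "$now" ]
}
```

Once `deletion_timestamp_passed <pod-name>` succeeds, `kubectl get pods -o wide` can be used to confirm that the pod is still listed as `Terminating` and that no replacement exists.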
if `NodeDownPodDeletionPolicy` is set to `delete-statefulset-pod` | `delete-both-statefulset-and-deployment-pod`
- Wait until the pod's status becomes `Terminating` and the `pod.deletionTimestamp` has passed (around 7 minutes).
- Verify that the pod is deleted and there is a new running replacement pod.
- Verify that you can access/read/write the volume on the new pod.
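
The read/write check on the replacement pod could look like the sketch below. The function name and the marker-file approach are assumptions made for illustration; the mount directory is passed in as an argument and should be taken from the StatefulSet manifest.

```shell
# Write a marker file through the pod and read it back; succeeds only if both work.
# $1 = pod name, $2 = directory where the Longhorn volume is mounted in the pod.
check_volume_rw() {
  pod="$1"
  dir="$2"
  marker="rw-check-$$"
  out=$(kubectl exec "$pod" -- sh -c "echo $marker > $dir/$marker && cat $dir/$marker && rm $dir/$marker")
  [ "$out" = "$marker" ]
}
```

For example, `check_volume_rw <new-pod-name> <mount-path>` returns success only when the volume is both writable and readable from the new pod.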
Deployment
if `NodeDownPodDeletionPolicy` is set to `do-nothing` | `delete-statefulset-pod`
- Wait until the `pod.deletionTimestamp` has passed.
- The replacement pod will be stuck in `Pending` state forever.
- Force delete the terminating pod.
- Wait until the replacement pod is running.
- Verify that you can access `vol1` via the `shell` replacement pod under `/mnt/vol1` once it is in the running state.
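
The manual force delete and the wait for the replacement pod can be sketched with standard `kubectl` commands. The function names and the 5 minute timeout are illustrative assumptions, not values from the test plan.

```shell
# Force delete a pod stuck in Terminating on the powered-off node.
force_delete_pod() {
  kubectl delete pod "$1" --grace-period=0 --force
}

# Block until the `shell` Deployment reports its replacement pod as available.
# Assumption: the 5 minute timeout is an arbitrary illustrative choice.
wait_for_shell() {
  kubectl rollout status deployment/shell --timeout=5m
}
```

Running `force_delete_pod <terminating-pod-name>` followed by `wait_for_shell` covers the two middle steps; the access check under `/mnt/vol1` is then done on the new pod.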
if `NodeDownPodDeletionPolicy` is set to `delete-deployment-pod` | `delete-both-statefulset-and-deployment-pod`
- Wait until the `pod.deletionTimestamp` has passed.
- Verify that the pod is deleted and there is a new running replacement pod.
- Verify that you can access `vol1` via the `shell` replacement pod under `/mnt/vol1`.
Other kinds
- Verify that Longhorn never deletes any other pod on the downed node.
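
To see which pods remain on the downed node, something like the following could be used. It is a sketch under assumptions: the function name and the column layout are illustrative, and the owner kind is read from the first `ownerReferences` entry only.

```shell
# List every pod scheduled on the given node with its owner kind and deletionTimestamp.
# Pods of a Deployment show ReplicaSet as owner, because the ReplicaSet owns them directly.
pods_on_node() {
  kubectl get pods --all-namespaces \
    --field-selector "spec.nodeName=$1" \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,OWNER:.metadata.ownerReferences[0].kind,DELETION:.metadata.deletionTimestamp'
}
```

Comparing the output of `pods_on_node node-x` before and after the policy has acted should show that pods owned by other kinds (for example a DaemonSet) are still listed.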
Test example
A typical scenario in which the enhancement works as intended is shown below. It describes what happens when a node (say `node-x`) goes down, assuming Kubernetes' default settings and that the user allows Longhorn to force delete pods:
| Time | Event |
|---|---|
| 0m:00s | `node-x` goes down and stops sending heartbeats to Kubernetes Node controller |
| 0m:40s | Kubernetes Node controller reports node-x is NotReady. |
| 5m:40s | Kubernetes Node controller starts evicting pods from node-x using graceful termination (set DeletionTimestamp and deletionGracePeriodSeconds = 10s/30s) |
| 5m:50s/6m:10s | Longhorn force deletes the pod of the StatefulSet/Deployment that uses a Longhorn volume |
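
The times in the table add up as follows. This small sketch only reproduces the arithmetic; the variable and function names are illustrative, and the 40 second and 5 minute values are read off the table rather than queried from a cluster.

```shell
# Format a number of seconds the way the table does, e.g. 340 -> 5m:40s.
fmt_time() {
  printf '%dm:%02ds\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# Values read off the table above.
not_ready_after=40      # seconds until the node is reported NotReady
eviction_delay=300      # 5 minutes between NotReady and the start of eviction
evict_at=$(( not_ready_after + eviction_delay ))

fmt_time "$evict_at"            # eviction starts: 5m:40s
fmt_time $(( evict_at + 10 ))   # force delete after a 10s grace period: 5m:50s
fmt_time $(( evict_at + 30 ))   # force delete after a 30s grace period: 6m:10s
```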