1. Scheduling

Manual Test

Test name: EKS across zone scheduling

Prerequisite:
* EKS cluster with 3 nodes across two AWS zones (zone#1, zone#2)

Steps:
1. Create a volume with 2 replicas and attach it to a node.
2. Delete a replica scheduled in each zone; repeat this a few times.
3. Scale the volume to 3 replicas.
4. Scale the volume to 4 replicas.

Expectations:
* Volume replicas should be scheduled one per AWS zone.
* Deleting a replica in a zone should trigger a replica rebuild.
* The new rebuilding replica should be scheduled to the same zone as the deleted replica.
* Scaling the volume to 3 replicas should distribute the replicas across all nodes.
* Scaling the volume to 4 replicas is governed by the soft anti-affinity rule, so there is no guarantee which node the new replica is scheduled to. (A scripted version of the scaling and zone check follows below.)
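The replica scaling and zone-spread checks above can also be driven against the Longhorn CRs instead of the UI. This is only a minimal sketch, assuming Longhorn's v1beta2 API in the longhorn-system namespace, a volume named vol-1, the longhornvolume replica label, and a recent Python kubernetes client (all assumptions to verify against your install):

```python
from collections import Counter
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
v1 = client.CoreV1Api()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Steps 3-4: scale the replica count by patching the Volume CR.
co.patch_namespaced_custom_object(
    GROUP, VER, NS, "volumes", "vol-1", {"spec": {"numberOfReplicas": 3}})

# Check the zone spread: map each replica's node to its zone label.
replicas = co.list_namespaced_custom_object(
    GROUP, VER, NS, "replicas", label_selector="longhornvolume=vol-1")
zones = Counter()
for r in replicas["items"]:
    node_id = r["spec"].get("nodeID")
    if node_id:
        labels = v1.read_node(node_id).metadata.labels or {}
        zones[labels.get("topology.kubernetes.io/zone", "unknown")] += 1
print(dict(zones))  # compare against the zone-spread expectations above
```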

Anti-affinity test

Test case 1: Replica scheduling (soft anti-affinity enabled)

Prerequisite:
* The Replica Soft Anti-Affinity setting is enabled.

Steps:
1. Create a volume.
2. Attach the volume to a node.
3. Increase the replica count to exceed the number of Longhorn nodes. (See the sketch after this test case.)

Expectations:
* The new replicas will be scheduled to nodes.
* Volume status will be Healthy, with the limited node redundancy hint icon.
  (Limited node redundancy: at least one healthy replica is running on the same node as another.)

Automation test case: test_soft_anti_affinity_scheduling
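For reference, the prerequisite and step 3 can be applied through the Longhorn Setting and Volume CRs. A sketch assuming the v1beta2 CRDs, the longhorn-system namespace, and a volume named vol-1 (assumptions):

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Prerequisite: enable the Replica Soft Anti-Affinity setting
# (Longhorn setting name: replica-soft-anti-affinity, string value).
co.patch_namespaced_custom_object(
    GROUP, VER, NS, "settings", "replica-soft-anti-affinity", {"value": "true"})

# Step 3: raise the replica count above the number of Longhorn nodes.
nodes = co.list_namespaced_custom_object(GROUP, VER, NS, "nodes")
co.patch_namespaced_custom_object(
    GROUP, VER, NS, "volumes", "vol-1",
    {"spec": {"numberOfReplicas": len(nodes["items"]) + 1}})
```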
Test case 2: Replica scheduling (soft anti-affinity disabled)

Prerequisite:
* The Replica Soft Anti-Affinity setting is enabled.

Steps:
1. Create a volume.
2. Attach the volume to a node.
3. Increase the replica count to exceed the number of Longhorn nodes.
4. Disable the Replica Soft Anti-Affinity setting.
5. Delete a replica. (See the sketch after this test case.)
6. Re-enable the Replica Soft Anti-Affinity setting.

Expectations:
* Replicas are not removed after disabling Replica Soft Anti-Affinity.
* While the Replica Soft Anti-Affinity setting is disabled, new replicas will not be scheduled to nodes.
* When the Replica Soft Anti-Affinity setting is re-enabled, new replicas can be scheduled to nodes.

Automation test case: test_hard_anti_affinity_scheduling
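Steps 4 to 6 can be scripted the same way. The sketch below toggles the setting and deletes one replica CR, again assuming a volume named vol-1, the v1beta2 API, and the longhornvolume label:

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

def set_soft_anti_affinity(enabled):
    # Longhorn settings store their value as a string.
    co.patch_namespaced_custom_object(
        GROUP, VER, NS, "settings", "replica-soft-anti-affinity",
        {"value": "true" if enabled else "false"})

set_soft_anti_affinity(False)                                           # step 4
replicas = co.list_namespaced_custom_object(
    GROUP, VER, NS, "replicas", label_selector="longhornvolume=vol-1")
victim = replicas["items"][0]["metadata"]["name"]
co.delete_namespaced_custom_object(
    GROUP, VER, NS, "replicas", victim, body=client.V1DeleteOptions())  # step 5
set_soft_anti_affinity(True)                                            # step 6
```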

Additional Tests

Scenario 1: Add disk disk1, disable scheduling for the default disk (part 1)

Steps:
1. By default a node has one default disk at the path /var/lib/longhorn/.
2. Add disk1 on the node. (See the sketch after this scenario.)
3. Disable scheduling for the default disk.
4. Create a volume in Longhorn.
5. Verify the replicas are scheduled on disk1.
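Steps 2 and 3 correspond to editing the Longhorn Node CR. A sketch assuming a node named worker-1 and disk1 mounted at /mnt/disk1 (both hypothetical), with disk field names taken from the v1beta2 schema (verify against your version):

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"
NODE = "worker-1"  # hypothetical node name

node = co.get_namespaced_custom_object(GROUP, VER, NS, "nodes", NODE)
disks = node["spec"].get("disks", {})

# Step 3: disable scheduling on the default disk, matched by its path.
for disk in disks.values():
    if disk.get("path") == "/var/lib/longhorn/":
        disk["allowScheduling"] = False

# Step 2: add disk1; the mount point must already exist on the node.
disks["disk1"] = {"path": "/mnt/disk1", "allowScheduling": True,
                  "storageReserved": 0, "tags": []}

co.patch_namespaced_custom_object(
    GROUP, VER, NS, "nodes", NODE, {"spec": {"disks": disks}})
```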
Scenario 2: Add disk disk1, disable scheduling for the default disk (part 2)

Cluster spec: 3 worker nodes

Steps:
1. Create a volume with 3 replicas on the default disk (/var/lib/longhorn/).
2. Add disk1 at /mnt/vol2 on node 1.
3. Disable scheduling for the default disk.
4. Enable scheduling for disk1.
5. Update the replica count to 4.
6. Say a replica is built on node 2.
7. Delete the replica on node 1.
8. A new replica is rebuilt on node 1.
9. Verify the replica is now available in /mnt/vol2. (See the sketch after this scenario.)

Expected result: The replica rebuilt on node 1 should be available on disk1 (/mnt/vol2).
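The disk-placement check in step 9 can be done by reading the replica CRs rather than the UI. A sketch assuming a volume named vol-1 and a node named node-1 (both hypothetical) and the diskPath field from the v1beta2 schema:

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Step 9: the replica rebuilt on node 1 should report disk1's path /mnt/vol2.
replicas = co.list_namespaced_custom_object(
    GROUP, VER, NS, "replicas", label_selector="longhornvolume=vol-1")
for r in replicas["items"]:
    spec = r["spec"]
    if spec.get("nodeID") == "node-1":  # hypothetical node name
        print(r["metadata"]["name"], "->", spec.get("diskPath"))
        assert spec.get("diskPath") == "/mnt/vol2"
```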
The following scenarios (3-10) cover the Disable Scheduling On Cordoned Node setting.
Scenario 3: Disable Scheduling On Cordoned Node: True (new volume)

Steps:
1. There are 4 worker nodes (custom cluster).
2. Cordon a node, W1. (See the sketch after this scenario.)
3. Create a new volume, vol-1, with 4 replicas.
4. Verify vol-1 is in the detached state with the error Scheduling Failure (Replica Scheduling Failure) and the 4th replica in N/A state.
5. Add a new worker node, W5, to the cluster.
6. vol-1 should become healthy.
7. Attach it to a workload and verify data can be written to the volume.

Expected results:
1. vol-1 should be in the detached state with the error Scheduling Failure (Replica Scheduling Failure).
2. vol-1 should become healthy and should be usable by a workload to write data into the volume.
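Step 2 and the scenario's setting can be applied as follows. patch_node and the disable-scheduling-on-cordoned-node setting name are standard; the node name W1 comes from the steps above:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Step 2: cordon worker node W1 (mark it unschedulable).
v1.patch_node("W1", {"spec": {"unschedulable": True}})

# Scenario setting: make Longhorn honour the cordon when scheduling
# replicas (applies to scenarios 3, 4, 7 and 9).
co.patch_namespaced_custom_object(
    GROUP, VER, NS, "settings", "disable-scheduling-on-cordoned-node",
    {"value": "true"})
```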
Scenario 4: Disable Scheduling On Cordoned Node: True (existing volume)

Steps:
1. There are 4 worker nodes (custom cluster).
2. Create a new volume, vol-1, with 4 replicas.
3. vol-1 should be in a healthy detached state.
4. Attach it to a workload and verify data can be written to the volume.
5. Cordon a worker node.
6. Use the volume with a workload.
7. All the replicas will be in the running, healthy state.
8. Delete the replica on the cordoned worker node.
9. Verify vol-1 is in the degraded state with the error Scheduling Failure (Replica Scheduling Failure) and the 4th replica in N/A state.
10. Add a new worker node, W5, to the cluster.
11. Verify the failed replica is now in the rebuilding state.
12. vol-1 should become healthy. (See the sketch after this scenario.)
13. Verify the data is consistent.
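Steps 9 to 12 can be verified by watching the volume's robustness field. A sketch assuming vol-1 and the v1beta2 status schema:

```python
import time
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# After deleting the replica on the cordoned node the volume should report
# "degraded"; once W5 joins, it should rebuild and return to "healthy".
deadline = time.time() + 600
while time.time() < deadline:
    vol = co.get_namespaced_custom_object(GROUP, VER, NS, "volumes", "vol-1")
    robustness = vol.get("status", {}).get("robustness")
    print("robustness:", robustness)
    if robustness == "healthy":
        break
    time.sleep(10)
else:
    raise TimeoutError("vol-1 did not return to healthy within 10 minutes")
```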
Scenario 5: Disable Scheduling On Cordoned Node: False (new volume)

Steps:
1. There are 4 worker nodes (custom cluster).
2. Cordon a node, W1.
3. Create a new volume, vol-1, with 4 replicas.
4. vol-1 should be healthy.
5. Verify a replica is created on the cordoned worker node. (See the sketch after this scenario.)
6. Attach it to a workload and verify data can be written to the volume.
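Step 5's placement check, scripted against the replica CRs (same vol-1 and longhornvolume label assumptions as the earlier sketches):

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Step 5: with the setting disabled, one of vol-1's replicas should still
# land on the cordoned node W1.
replicas = co.list_namespaced_custom_object(
    GROUP, VER, NS, "replicas", label_selector="longhornvolume=vol-1")
nodes = {r["spec"].get("nodeID") for r in replicas["items"]}
assert "W1" in nodes, f"no replica scheduled on the cordoned node: {nodes}"
```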
Scenario 6: Disable Scheduling On Cordoned Node: False (existing volume)
Scenario 7: Disable Scheduling On Cordoned Node: True (backup restore)

Steps:
1. There are 4 worker nodes (custom cluster).
2. Cordon a node, W1.
3. Create a restore volume from an existing backup. (See the sketch after this scenario.)
4. Set the number of replicas to 4 and the volume name to vol-2.
5. Verify vol-2 is in the detached state with the error Scheduling Failure (Replica Scheduling Failure) and the 4th replica in N/A state.
6. Verify no restoring happens on the replicas.
7. Add a new worker node, W5, to the cluster.
8. vol-2 should start restoring now.
9. vol-2 should be in the detached, healthy state.
10. Attach it to a workload and verify the checksum of the data against the original.
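Steps 3 and 4 are normally done from the Longhorn UI (Backup page, Restore). The sketch below creates the restore volume as a Volume CR directly; the fromBackup URL is a placeholder and the field set is an assumption to adjust for your backup:

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Steps 3-4: restore vol-2 with 4 replicas from an existing backup.
co.create_namespaced_custom_object(GROUP, VER, NS, "volumes", {
    "apiVersion": "longhorn.io/v1beta2",
    "kind": "Volume",
    "metadata": {"name": "vol-2", "namespace": NS},
    "spec": {
        "numberOfReplicas": 4,
        # Placeholder: copy the real URL from the Backup CR or the UI.
        "fromBackup": "s3://backupbucket@us-east-1/?backup=backup-xyz&volume=vol-1",
        "size": "10737418240",  # match the size recorded in the backup
        "frontend": "blockdev",
    },
})
```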
Scenario 8: Disable Scheduling On Cordoned Node: False (backup restore)

Steps:
1. There are 4 worker nodes (custom cluster).
2. Cordon a node, W1.
3. Create a restore volume from an existing backup.
4. Set the number of replicas to 4 and the volume name to vol-2.
5. Verify the volume is in the attached state and that the replicas are restoring.
6. vol-2 should be in the detached, healthy state after the restoration is complete.
7. Attach it to a workload and verify the checksum of the data against the original.
Scenario 9: Disable Scheduling On Cordoned Node: True (create DR volume)

Steps:
1. There are 4 worker nodes (custom cluster).
2. Cordon a node, W1.
3. Create a DR volume from an existing backup.
4. Set the number of replicas to 4.
5. Verify the DR volume is in the detached state with the error Scheduling Failure (Replica Scheduling Failure) and the 4th replica in N/A state.
6. Verify no restoring happens on the replicas.
7. Add a new worker node, W5, to the cluster.
8. The DR volume should start restoring now.
9. The DR volume should be in the healthy state.
10. Activate the DR volume and verify it is in the detached state. (See the sketch after this scenario.)
11. Attach it to a workload and verify the checksum of the data against the original.
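Steps 3 and 10 are normally driven from the UI. At the CR level a DR volume is a restore volume created with standby set to true, and activation corresponds to clearing standby and setting a frontend. The sketch below encodes that understanding; names, the fromBackup URL, and the activation patch are assumptions to verify against your Longhorn version:

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Step 3: create the DR (standby) volume from an existing backup.
co.create_namespaced_custom_object(GROUP, VER, NS, "volumes", {
    "apiVersion": "longhorn.io/v1beta2",
    "kind": "Volume",
    "metadata": {"name": "vol-dr", "namespace": NS},
    "spec": {
        "numberOfReplicas": 4,
        # Placeholder backup URL; use the real one from the Backup CR.
        "fromBackup": "s3://backupbucket@us-east-1/?backup=backup-xyz&volume=vol-1",
        "standby": True,
        "size": "10737418240",  # match the backup's volume size
    },
})

# Step 10: activate the DR volume (assumed to map to this patch).
co.patch_namespaced_custom_object(
    GROUP, VER, NS, "volumes", "vol-dr",
    {"spec": {"standby": False, "frontend": "blockdev"}})
```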
Scenario 10: Disable Scheduling On Cordoned Node: False (create DR volume)

Steps:
1. There are 4 worker nodes (custom cluster).
2. Cordon a node, W1.
3. Create a DR volume from an existing backup.
4. Set the number of replicas to 4.
5. The DR volume should start restoring now.
6. The DR volume should be in the healthy state.
7. Activate the DR volume and verify it is in the detached state.
8. Attach it to a workload and verify the checksum of the data against the original.
Scenario 11: Replica node level soft anti-affinity: False (new volume)

Steps:
1. There are 3 worker nodes (custom cluster).
2. Create a volume with 4 replicas.
3. The volume should be in the detached state with the error Scheduling Failure (Replica Scheduling Failure) and the 4th replica in N/A state. (See the sketch after this scenario.)
4. Add a worker node.
5. The volume should be in the healthy state.
6. The user should be able to use the volume with a workload.
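Step 3's scheduling failure is also visible on the Volume CR's Scheduled condition. A sketch assuming a volume named vol-1 and a list-style conditions field (v1beta2):

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Step 3: with 4 replicas, 3 nodes and hard anti-affinity, the 4th replica
# cannot be placed; the volume's Scheduled condition should report it.
vol = co.get_namespaced_custom_object(GROUP, VER, NS, "volumes", "vol-1")
for cond in vol.get("status", {}).get("conditions", []):
    if cond.get("type") == "Scheduled":
        print(cond.get("status"), cond.get("reason"), cond.get("message"))
```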
Scenario 12: Replica node level soft anti-affinity: True (new volume)

Steps:
1. There are 3 worker nodes (custom cluster).
2. Create a volume with 4 replicas.
3. The volume should be in the healthy state, with two replicas on the same host. (See the sketch after this scenario.)
4. The user should be able to use the volume with a workload.
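Step 3's co-location check can be scripted by counting replicas per node (same vol-1 and longhornvolume label assumptions as the earlier sketches):

```python
from collections import Counter
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()
GROUP, VER, NS = "longhorn.io", "v1beta2", "longhorn-system"

# Step 3: 4 replicas on 3 nodes means one node must host two of them.
replicas = co.list_namespaced_custom_object(
    GROUP, VER, NS, "replicas", label_selector="longhornvolume=vol-1")
per_node = Counter(r["spec"].get("nodeID") for r in replicas["items"])
print(per_node)
assert max(per_node.values()) >= 2, "expected two replicas on one node"
```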