Module tests.test_basic
Functions
def backup_failed_cleanup(client, core_api, volume_name, volume_size, failed_backup_ttl='3')
def backup_failed_cleanup(client, core_api, volume_name, volume_size,  # NOQA
                          failed_backup_ttl="3"):  # NOQA
    """
    Setup the failed backup cleanup
    """
    bt_poll_interval = "60"

    # set backupstore poll interval to 60 sec
    set_backupstore_poll_interval(client, bt_poll_interval)

    # set failed backup time to live, default: 3 min, disabled: 0
    failed_backup_ttl_setting = client.by_id_setting(SETTING_FAILED_BACKUP_TTL)
    new_failed_backup_ttl_setting = client.update(
        failed_backup_ttl_setting, value=failed_backup_ttl)
    assert new_failed_backup_ttl_setting.value == failed_backup_ttl

    backupstore_cleanup(client)

    # create a volume
    vol = create_and_check_volume(client, volume_name,
                                  num_of_replicas=2,
                                  size=str(volume_size))

    # attach the volume
    vol.attach(hostId=get_self_host_id())
    vol = wait_for_volume_healthy(client, volume_name)

    # create a snapshot and a successful backup
    # to make sure the backup volume is created
    snap = create_snapshot(client, volume_name)
    vol.snapshotBackup(name=snap.name)

    # write some data to the volume without overlapping
    volume_endpoint = get_volume_endpoint(vol)
    data_size = volume_size/Mi
    write_volume_dev_random_mb_data(
        volume_endpoint, 0, data_size)

    # create a snapshot and a backup
    snap = create_snapshot(client, volume_name)
    vol.snapshotBackup(name=snap.name)

    # check backup status is in an InProgress state
    _, backup = find_backup(client, volume_name, snap.name)
    backup_id = backup.id
    backup_name = backup.name

    def backup_inprogress_predicate(b):
        return b.id == backup_id and "InProgress" in b.state

    common.wait_for_backup_state(client, volume_name,
                                 backup_inprogress_predicate)

    # crash all replicas of the volume
    try:
        crash_replica_processes(client, core_api, volume_name)
    except AssertionError:
        crash_replica_processes(client, core_api, volume_name)

    # backup status should be in an Error state with an error message
    def backup_failure_predicate(b):
        return b.id == backup_id and "Error" in b.state and b.error != ""

    common.wait_for_backup_state(client, volume_name,
                                 backup_failure_predicate)

    return backup_name
Setup the failed backup cleanup
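The returned backup name is what the TTL tests below poll on. A minimal usage sketch, mirroring `test_backup_failed_enable_auto_cleanup` further down this page:

```python
# Trigger a failed backup on a 2Gi volume; with the default TTL ("3" minutes)
# Longhorn is expected to clean the failed backup up automatically.
backup_name = backup_failed_cleanup(client, core_api, volume_name, 2048*Mi)
wait_for_backup_delete(client, volume_name, backup_name)
```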
def backup_labels_test(client, random_labels, volume_name, size='33554432', backing_image='')
def backup_labels_test(client, random_labels, volume_name,
                       size=SIZE, backing_image=""):  # NOQA
    host_id = get_self_host_id()

    volume = create_and_check_volume(client, volume_name,
                                     num_of_replicas=2,
                                     size=size,
                                     backing_image=backing_image)
    volume.attach(hostId=host_id)
    volume = common.wait_for_volume_healthy(client, volume_name)

    bv, b, _, _ = create_backup(client, volume_name, labels=random_labels)
    # If we're running the test with a BackingImage,
    # check field `volumeBackingImageName` is set properly.
    backup = bv.backupGet(name=b.name)

    # Longhorn will automatically add a label `longhorn.io/volume-access-mode`
    # to a newly created backup
    assert len(backup.labels) == len(random_labels) + 1
    assert random_labels["key"] == backup.labels["key"]
    assert "longhorn.io/volume-access-mode" in backup.labels.keys()

    wait_for_backup_volume(client, bv.name, backing_image)
def backup_status_for_unavailable_replicas_test(client, volume_name, size, backing_image='')
def backup_status_for_unavailable_replicas_test(client, volume_name,  # NOQA
                                                size, backing_image=""):  # NOQA
    volume = create_and_check_volume(client, volume_name,
                                     num_of_replicas=2,
                                     size=str(size),
                                     backing_image=backing_image)

    lht_hostId = get_self_host_id()
    volume = volume.attach(hostId=lht_hostId)
    volume = common.wait_for_volume_healthy(client, volume_name)

    # write data to the volume without overlapping
    volume_endpoint = get_volume_endpoint(volume)
    data_size = size/Mi
    write_volume_dev_random_mb_data(
        volume_endpoint, 0, data_size)

    # create a snapshot and backup
    snap = create_snapshot(client, volume_name)
    volume.snapshotBackup(name=snap.name)
    bv, b = find_backup(client, volume_name, snap.name)
    backup_id = b.id

    # find the replica for this backup
    replica_name = find_replica_for_backup(client, volume_name, backup_id)

    # disable scheduling on that node
    volume = client.by_id_volume(volume_name)
    for r in volume.replicas:
        if r.name == replica_name:
            node = client.by_id_node(r.hostId)
            node = set_node_scheduling(client, node, allowScheduling=False)
            common.wait_for_node_update(client, node.id,
                                        "allowScheduling", False)
    assert node

    # remove the replica with the backup
    volume.replicaRemove(name=replica_name)
    volume = common.wait_for_volume_degraded(client, volume_name)

    # now the backup status should be in an Error state
    # with an error message
    def backup_failure_predicate(b):
        return b.id == backup_id and "Error" in b.state and b.error != ""

    volume = common.wait_for_backup_state(client, volume_name,
                                          backup_failure_predicate)

    # re-enable scheduling on the previously disabled node
    node = client.by_id_node(node.id)
    node = set_node_scheduling(client, node, allowScheduling=True)
    common.wait_for_node_update(client, node.id,
                                "allowScheduling", True)

    # delete the old backup
    delete_backup(client, bv, b.name)
    volume = wait_for_volume_status(client, volume_name,
                                    "lastBackup", "")
    assert volume.lastBackupAt == ""

    # check that we can create another successful backup
    bv, b, _, _ = create_backup(client, volume_name)

    # delete the new backup
    delete_backup(client, bv, b.name)
    volume = wait_for_volume_status(client, volume_name, "lastBackup", "")
    assert volume.lastBackupAt == ""
def backup_test(client, volume_name, size, backing_image='', compression_method='lz4')
def backup_test(client, volume_name, size, backing_image="",  # NOQA
                compression_method=DEFAULT_BACKUP_COMPRESSION_METHOD):  # NOQA
    volume = create_and_check_volume(client, volume_name,
                                     num_of_replicas=2,
                                     size=size,
                                     backing_image=backing_image)

    lht_hostId = get_self_host_id()
    volume = volume.attach(hostId=lht_hostId)
    volume = common.wait_for_volume_healthy(client, volume_name)
    assert volume.backupCompressionMethod == compression_method

    backupstore_test(client, lht_hostId, volume_name, size,
                     compression_method)
def backupstore_test(client, host_id, volname, size, compression_method)
def backupstore_test(client, host_id, volname, size, compression_method):  # NOQA
    bv, b, snap2, data = create_backup(client, volname)

    # test restore
    restore_name = generate_volume_name()
    volume = client.create_volume(name=restore_name, size=size,
                                  numberOfReplicas=2,
                                  fromBackup=b.url,
                                  dataEngine=DATA_ENGINE)

    volume = common.wait_for_volume_restoration_completed(client,
                                                          restore_name)
    volume = common.wait_for_volume_detached(client, restore_name)
    assert volume.name == restore_name
    assert volume.size == size
    assert volume.numberOfReplicas == 2
    assert volume.state == "detached"
    assert volume.restoreRequired is False
    assert volume.backupCompressionMethod == compression_method

    volume = volume.attach(hostId=host_id)
    volume = common.wait_for_volume_healthy(client, restore_name)

    check_volume_data(volume, data)

    volume = volume.detach()
    volume = common.wait_for_volume_detached(client, restore_name)

    delete_backup(client, bv, b.name)

    volume = wait_for_volume_status(client, volume.name,
                                    "lastBackup", "")
    assert volume.lastBackupAt == ""

    client.delete(volume)
    volume = wait_for_volume_delete(client, restore_name)
def check_volume_and_snapshot_after_corrupting_volume_metadata_file(client, core_api, volume_name, pod, test_pod_name, data_path1, data_md5sum1, data_path2, snap)
def check_volume_and_snapshot_after_corrupting_volume_metadata_file(
        client, core_api, volume_name, pod, test_pod_name, data_path1,
        data_md5sum1, data_path2, snap):  # NOQA
    """
    Test volume I/O and take/delete a snapshot
    """
    # recreate the pod and check if the volume will become healthy
    create_and_wait_pod(core_api, pod)
    volume = wait_for_volume_healthy(client, volume_name)

    # check if the data is not corrupted
    res_data_md5sum1 = get_pod_data_md5sum(core_api, test_pod_name,
                                           data_path1)
    assert data_md5sum1 == res_data_md5sum1

    # test that r/w is ok after recovering the volume metadata file
    write_pod_volume_random_data(core_api, test_pod_name, data_path2,
                                 DATA_SIZE_IN_MB_1)
    get_pod_data_md5sum(core_api, test_pod_name, data_path2)

    # test that making/deleting a snapshot is ok
    # after recovering the volume metadata file
    create_snapshot(client, volume_name)
    # 1 snap1 + 1 snap2 + 1 volume-head
    wait_for_snapshot_count(volume, 3)

    volume.snapshotDelete(name=snap.name)
    # 1 snap2 + 1 volume-head
    wait_for_snapshot_count(volume, 2)
Test volume I/O and take/delete a snapshot
def prepare_data_volume_metafile(client, core_api, volume_name, csi_pv, pvc, pod, pod_make, data_path, test_writing_data=False, writing_data_path='/data/writing_data_file')
def prepare_data_volume_metafile(client, core_api, volume_name, csi_pv,  # NOQA
                                 pvc, pod, pod_make, data_path,  # NOQA
                                 test_writing_data=False,  # NOQA
                                 writing_data_path="/data/writing_data_file"):  # NOQA
    """
    Prepare volume and snapshot for volume metafile testing

    Setup:

    1. Create a pod using Longhorn volume
    2. Write some data to the volume then get the md5sum
    3. Create a snapshot
    4. Delete the pod and wait for the volume to be detached
    5. Pick a replica on this host and get the replica data path
    """
    # create the pod with pv, pvc and write some data
    test_pod_name, _, test_pvc_name, data_md5sum1 = \
        prepare_pod_with_data_in_mb(
            client, core_api, csi_pv, pvc, pod_make, volume_name,
            data_path=data_path)

    pod['metadata']['name'] = test_pod_name
    pod['spec']['volumes'] = [{
        'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'],
        'persistentVolumeClaim': {
            'claimName': test_pvc_name,
        },
    }]

    snap1 = create_snapshot(client, volume_name)

    # choose the replica that is on this host
    volume = client.by_id_volume(volume_name)
    host_id = get_self_host_id()
    replica = None
    for r in volume.replicas:
        if r.hostId == host_id:
            replica = r
            break
    assert replica is not None, \
        f'hostID: {host_id}, volume replicas: {volume.replicas}'

    # get the volume metadata file path
    replica_data_path = replica.dataPath
    volume_meta_file = replica_data_path + "/volume.meta"

    if test_writing_data:
        # write random data to the pod in the background
        command = 'dd if=/dev/urandom of=' + writing_data_path + ' bs=1M' +\
                  ' count=' + str(common.DATA_SIZE_IN_MB_4) + ' &'
        exec_command_in_pod(core_api, command, test_pod_name, 'default')
    else:
        # make the volume detached
        delete_and_wait_pod(core_api, test_pod_name)
        wait_for_volume_detached(client, volume_name)

    return snap1, volume_meta_file, test_pod_name, data_md5sum1
Prepare volume and snapshot for volume metafile testing
Setup:
- Create a pod using Longhorn volume
- Write some data to the volume then get the md5sum
- Create a snapshot
- Delete the pod and wait for the volume to be detached
- Pick a replica on this host and get the replica data path
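A hedged sketch of consuming the helper's return values in a metafile test (the `data_path` literal here is illustrative):

```python
# Returns the first snapshot, the replica's volume.meta path on this host,
# the pod name, and the md5sum of the data written in step 2.
snap1, volume_meta_file, test_pod_name, data_md5sum1 = \
    prepare_data_volume_metafile(
        client, core_api, volume_name, csi_pv, pvc, pod, pod_make,
        "/data/test_file")
```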
def prepare_space_usage_for_rebuilding_only_volume(client)
def prepare_space_usage_for_rebuilding_only_volume(client):  # NOQA
    """
    1. Create a 7Gi volume and attach it to the node.
    2. Make a filesystem, then mount this volume.
    3. Add this volume as a disk of the node, and disable the scheduling
       for the default disk.
    """
    disk_volname = "vol-disk"
    lht_hostId = get_self_host_id()
    node = client.by_id_node(lht_hostId)
    extra_disk_path = create_host_disk(client, disk_volname,
                                       str(7 * Gi), lht_hostId)
    extra_disk = {"path": extra_disk_path, "allowScheduling": True}
    update_disks = get_update_disks(node.disks)
    update_disks["extra-disk"] = extra_disk
    node = update_node_disks(client, node.name, disks=update_disks,
                             retry=True)
    node = common.wait_for_disk_update(client, lht_hostId,
                                       len(update_disks))

    for fsid, disk in iter(node.disks.items()):
        if disk.path != extra_disk_path:
            disk.allowScheduling = False
            update_disks = get_update_disks(node.disks)
            update_node_disks(client, node.name, disks=update_disks,
                              retry=True)
            node = wait_for_disk_status(client, node.name, fsid,
                                        "allowScheduling", False)
            break
- Create a 7Gi volume and attach it to the node.
- Make a filesystem, then mount this volume.
- Add this volume as a disk of the node, and disable the scheduling for the default disk.
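The disk-manipulation pattern used above, shown in isolation (a sketch; `node` and `node_id` stand for the current host's node object and ID):

```python
# Turn a Longhorn volume into an extra schedulable disk on the node.
extra_disk_path = create_host_disk(client, "vol-disk", str(7 * Gi), node_id)
update_disks = get_update_disks(node.disks)
update_disks["extra-disk"] = {"path": extra_disk_path,
                              "allowScheduling": True}
update_node_disks(client, node.name, disks=update_disks, retry=True)
```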
def restore_inc_test(client, core_api, volume_name, pod)
def restore_inc_test(client, core_api, volume_name, pod):  # NOQA
    std_volume = create_and_check_volume(client, volume_name,
                                         num_of_replicas=2,
                                         size=SIZE)
    lht_host_id = get_self_host_id()
    std_volume.attach(hostId=lht_host_id)
    std_volume = common.wait_for_volume_healthy(client, volume_name)

    with pytest.raises(Exception) as e:
        std_volume.activate(frontend=VOLUME_FRONTEND_BLOCKDEV)
    assert "already in active mode" in str(e.value)

    data0 = {'len': 2 * BACKUP_BLOCK_SIZE, 'pos': 0,
             'content': common.generate_random_data(2 * BACKUP_BLOCK_SIZE)}
    _, backup0, _, data0 = create_backup(
        client, volume_name, data0)

    sb_volume0_name = "sb-0-" + volume_name
    sb_volume1_name = "sb-1-" + volume_name
    sb_volume2_name = "sb-2-" + volume_name
    client.create_volume(name=sb_volume0_name, size=SIZE,
                         numberOfReplicas=2, fromBackup=backup0.url,
                         frontend="", standby=True, dataEngine=DATA_ENGINE)
    client.create_volume(name=sb_volume1_name, size=SIZE,
                         numberOfReplicas=2, fromBackup=backup0.url,
                         frontend="", standby=True, dataEngine=DATA_ENGINE)
    client.create_volume(name=sb_volume2_name, size=SIZE,
                         numberOfReplicas=2, fromBackup=backup0.url,
                         frontend="", standby=True, dataEngine=DATA_ENGINE)
    wait_for_backup_restore_completed(client, sb_volume0_name, backup0.name)
    wait_for_backup_restore_completed(client, sb_volume1_name, backup0.name)
    wait_for_backup_restore_completed(client, sb_volume2_name, backup0.name)

    sb_volume0 = common.wait_for_volume_healthy_no_frontend(client,
                                                            sb_volume0_name)
    sb_volume1 = common.wait_for_volume_healthy_no_frontend(client,
                                                            sb_volume1_name)
    sb_volume2 = common.wait_for_volume_healthy_no_frontend(client,
                                                            sb_volume2_name)

    for _ in range(RETRY_COUNTS):
        client.list_backupVolume()
        sb_volume0 = client.by_id_volume(sb_volume0_name)
        sb_volume1 = client.by_id_volume(sb_volume1_name)
        sb_volume2 = client.by_id_volume(sb_volume2_name)
        sb_engine0 = get_volume_engine(sb_volume0)
        sb_engine1 = get_volume_engine(sb_volume1)
        sb_engine2 = get_volume_engine(sb_volume2)
        if sb_volume0.restoreRequired is False or \
                sb_volume1.restoreRequired is False or \
                sb_volume2.restoreRequired is False or \
                not sb_engine0.lastRestoredBackup or \
                not sb_engine1.lastRestoredBackup or \
                not sb_engine2.lastRestoredBackup:
            time.sleep(RETRY_INTERVAL)
        else:
            break

    assert sb_volume0.standby is True
    assert sb_volume0.lastBackup == backup0.name
    assert sb_volume0.frontend == ""
    assert sb_volume0.restoreRequired is True
    sb_engine0 = get_volume_engine(sb_volume0)
    assert sb_engine0.lastRestoredBackup == backup0.name
    assert sb_engine0.requestedBackupRestore == backup0.name

    assert sb_volume1.standby is True
    assert sb_volume1.lastBackup == backup0.name
    assert sb_volume1.frontend == ""
    assert sb_volume1.restoreRequired is True
    sb_engine1 = get_volume_engine(sb_volume1)
    assert sb_engine1.lastRestoredBackup == backup0.name
    assert sb_engine1.requestedBackupRestore == backup0.name

    assert sb_volume2.standby is True
    assert sb_volume2.lastBackup == backup0.name
    assert sb_volume2.frontend == ""
    assert sb_volume2.restoreRequired is True
    sb_engine2 = get_volume_engine(sb_volume2)
    assert sb_engine2.lastRestoredBackup == backup0.name
    assert sb_engine2.requestedBackupRestore == backup0.name

    sb0_snaps = sb_volume0.snapshotList()
    assert len(sb0_snaps) == 2
    for s in sb0_snaps:
        if s.name != "volume-head":
            sb0_snap = s
    assert sb0_snaps
    with pytest.raises(Exception) as e:
        sb_volume0.snapshotCreate()
    assert "cannot create snapshot for standby volume" in str(e.value)
    with pytest.raises(Exception) as e:
        sb_volume0.snapshotRevert(name=sb0_snap.name)
    assert "cannot revert snapshot for standby volume" in str(e.value)
    with pytest.raises(Exception) as e:
        sb_volume0.snapshotDelete(name=sb0_snap.name)
    assert "cannot delete snapshot for standby volume" in str(e.value)
    with pytest.raises(Exception) as e:
        sb_volume0.snapshotBackup(name=sb0_snap.name)
    assert "cannot create backup for standby volume" in str(e.value)
    with pytest.raises(Exception) as e:
        sb_volume0.pvCreate(pvName=sb_volume0_name)
    assert "cannot create PV for standby volume" in str(e.value)
    with pytest.raises(Exception) as e:
        sb_volume0.pvcCreate(pvcName=sb_volume0_name)
    assert "cannot create PVC for standby volume" in str(e.value)

    setting_flag = False
    try:
        setting = client.by_id_setting(SETTING_BACKUP_TARGET)
        setting_flag = True
    except Exception as e:
        if SETTING_BACKUP_TARGET_NOT_SUPPORTED not in e.error.message:
            raise e
    with pytest.raises(Exception) as e:
        if setting_flag:
            client.update(setting, value="random.backup.target")
        else:
            set_backupstore_url(client, "random.backup.target")
    assert "cannot modify BackupTarget " \
           "since there are existing standby volumes" in str(e.value)

    with pytest.raises(Exception) as e:
        sb_volume0.activate(frontend="wrong_frontend")
    assert "invalid frontend" in str(e.value)

    activate_standby_volume(client, sb_volume0_name)
    sb_volume0 = client.by_id_volume(sb_volume0_name)
    sb_volume0.attach(hostId=lht_host_id)
    sb_volume0 = common.wait_for_volume_healthy(client, sb_volume0_name)
    check_volume_data(sb_volume0, data0, False)

    zero_string = b'\x00'.decode('utf-8')
    _, backup1, _, data1 = create_backup(
        client, volume_name,
        {'len': BACKUP_BLOCK_SIZE, 'pos': 0,
         'content': zero_string * BACKUP_BLOCK_SIZE})
    check_volume_last_backup(client, sb_volume1_name, backup1.name)
    activate_standby_volume(client, sb_volume1_name)
    sb_volume1 = client.by_id_volume(sb_volume1_name)
    sb_volume1.attach(hostId=lht_host_id)
    sb_volume1 = common.wait_for_volume_healthy(client, sb_volume1_name)
    data0_modified1 = {
        'len': data0['len'],
        'pos': 0,
        'content': data1['content'] + data0['content'][data1['len']:],
    }
    check_volume_data(sb_volume1, data0_modified1, False)

    data2_len = int(BACKUP_BLOCK_SIZE/2)
    data2 = {'len': data2_len, 'pos': 0,
             'content': common.generate_random_data(data2_len)}
    _, backup2, _, data2 = create_backup(client, volume_name, data2)
    check_volume_last_backup(client, sb_volume2_name, backup2.name)
    activate_standby_volume(client, sb_volume2_name)
    sb_volume2 = client.by_id_volume(sb_volume2_name)
    sb_volume2.attach(hostId=lht_host_id)
    sb_volume2 = common.wait_for_volume_healthy(client, sb_volume2_name)
    data0_modified2 = {
        'len': data0['len'],
        'pos': 0,
        'content': data2['content'] +
                   data1['content'][data2['len']:] +
                   data0['content'][data1['len']:],
    }
    check_volume_data(sb_volume2, data0_modified2, False)

    # allocate this active volume to a pod
    sb_volume2.detach()
    sb_volume2 = common.wait_for_volume_detached(client, sb_volume2_name)

    create_pv_for_volume(client, core_api, sb_volume2, sb_volume2_name)
    create_pvc_for_volume(client, core_api, sb_volume2, sb_volume2_name)

    sb_volume2_pod_name = "pod-" + sb_volume2_name
    pod['metadata']['name'] = sb_volume2_pod_name
    pod['spec']['volumes'] = [{
        'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'],
        'persistentVolumeClaim': {
            'claimName': sb_volume2_name,
        },
    }]
    create_and_wait_pod(core_api, pod)

    sb_volume2 = client.by_id_volume(sb_volume2_name)
    k_status = sb_volume2.kubernetesStatus
    workloads = k_status.workloadsStatus
    assert k_status.pvName == sb_volume2_name
    assert k_status.pvStatus == 'Bound'
    assert len(workloads) == 1
    for i in range(RETRY_COUNTS):
        if workloads[0].podStatus == 'Running':
            break
        time.sleep(RETRY_INTERVAL)
        sb_volume2 = client.by_id_volume(sb_volume2_name)
        k_status = sb_volume2.kubernetesStatus
        workloads = k_status.workloadsStatus
        assert len(workloads) == 1

    assert workloads[0].podName == sb_volume2_pod_name
    assert workloads[0].podStatus == 'Running'
    assert not workloads[0].workloadName
    assert not workloads[0].workloadType
    assert k_status.namespace == 'default'
    assert k_status.pvcName == sb_volume2_name
    assert not k_status.lastPVCRefAt
    assert not k_status.lastPodRefAt

    delete_and_wait_pod(core_api, sb_volume2_pod_name)
    delete_and_wait_pvc(core_api, sb_volume2_name)
    delete_and_wait_pv(core_api, sb_volume2_name)
def snapshot_prune_and_coalesce_simultaneously(client, volume_name, backing_image)
def snapshot_prune_and_coalesce_simultaneously(client, volume_name, backing_image):  # NOQA
    snap_data_size_in_mb = 2

    volume = create_and_check_volume(client, volume_name,
                                     backing_image=backing_image)
    lht_hostId = get_self_host_id()
    volume.attach(hostId=lht_hostId)
    volume = common.wait_for_volume_healthy(client, volume_name)
    volume_endpoint = get_volume_endpoint(volume)

    # Take snapshots without overlapping
    write_volume_dev_random_mb_data(
        volume_endpoint, 0*snap_data_size_in_mb, snap_data_size_in_mb)
    snap1 = create_snapshot(client, volume_name)

    write_volume_dev_random_mb_data(
        volume_endpoint, 1*snap_data_size_in_mb, snap_data_size_in_mb)
    snap2 = create_snapshot(client, volume_name)

    write_volume_dev_random_mb_data(
        volume_endpoint, 2*snap_data_size_in_mb, snap_data_size_in_mb)
    snap3 = create_snapshot(client, volume_name)

    write_volume_dev_random_mb_data(
        volume_endpoint, 3*snap_data_size_in_mb, snap_data_size_in_mb)
    snap4 = create_snapshot(client, volume_name)

    # Overwrite the existing data in the volume head
    write_volume_dev_random_mb_data(
        volume_endpoint, 0*snap_data_size_in_mb, 4*snap_data_size_in_mb)
    cksum_before = get_device_checksum(volume_endpoint)

    volume_head_before = volume.snapshotGet(name=VOLUME_HEAD_NAME)

    # Simultaneous snapshot coalescing & pruning
    volume.snapshotDelete(name=snap1.name)
    volume.snapshotDelete(name=snap2.name)
    volume.snapshotDelete(name=snap3.name)
    volume.snapshotDelete(name=snap4.name)
    volume.snapshotPurge()
    wait_for_snapshot_purge(client, volume_name, snap1.name)
    wait_for_snapshot_purge(client, volume_name, snap2.name)
    wait_for_snapshot_purge(client, volume_name, snap3.name)
    wait_for_snapshot_purge(client, volume_name, snap4.name)

    # List and validate snap info
    volume = client.by_id_volume(volume_name)
    snaps = volume.snapshotList(volume=volume_name)
    assert len(snaps) == 2
    for snap in snaps:
        if snap.name == VOLUME_HEAD_NAME:
            assert snap.size == volume_head_before.size
            assert snap.parent == snap4.name
        else:
            assert snap.name == snap4.name
            assert int(snap.size) == 0
            assert snap.parent == ""
            assert VOLUME_HEAD_NAME in snap.children.keys()

    # Verify the data
    cksum_after = get_device_checksum(volume_endpoint)
    assert cksum_after == cksum_before
def snapshot_prune_test(client, volume_name, backing_image)
def snapshot_prune_test(client, volume_name, backing_image):  # NOQA
    # For this test, the fiemap size should not be greater than 1Mi
    fiemap_max_size = 4 * Ki
    snap_data_size_in_mb = 4

    volume = create_and_check_volume(client, volume_name,
                                     backing_image=backing_image)
    lht_hostId = get_self_host_id()
    volume.attach(hostId=lht_hostId)
    volume = common.wait_for_volume_healthy(client, volume_name)
    volume_endpoint = get_volume_endpoint(volume)

    # Prepare and write snap1_data
    snap1_offset = 1
    write_volume_dev_random_mb_data(volume_endpoint,
                                    snap1_offset, snap_data_size_in_mb)
    snap1_before = create_snapshot(client, volume_name)

    # Prepare and write snap2_data,
    # which would completely overwrite snap1 content.
    snap2_offset = 1
    write_volume_dev_random_mb_data(volume_endpoint,
                                    snap2_offset, snap_data_size_in_mb)
    cksum_before = get_device_checksum(volume_endpoint)

    # All data in snap1 should be pruned.
    volume.snapshotDelete(name=snap1_before.name)
    volume.snapshotPurge()
    volume = wait_for_snapshot_purge(client, volume_name, snap1_before.name)
    snap1_after = volume.snapshotGet(name=snap1_before.name)
    volume_head1 = volume.snapshotGet(name=VOLUME_HEAD_NAME)
    snap1_after_size = int(snap1_after.size)
    cksum_after = get_device_checksum(volume_endpoint)
    assert snap1_after.removed
    assert snap1_after_size == 0
    assert cksum_before == cksum_after

    # Expansion will implicitly create a system snapshot `snap2`
    offline_expand_attached_volume(client, volume_name)
    volume = client.by_id_volume(volume_name)
    for snap in volume.snapshotList(volume=volume_name):
        # In the future, there may be an automatic purge operation
        # after expansion
        if snap.name == snap1_before.name:
            assert snap.name == snap1_before.name
            assert snap.removed
            assert VOLUME_HEAD_NAME not in snap.children.keys()
            continue
        if snap.name != VOLUME_HEAD_NAME:
            snap2_before = snap
            assert not snap.usercreated
            assert snap.size == volume_head1.size
            assert snap.parent == snap1_before.name
            assert VOLUME_HEAD_NAME in snap.children.keys()
            continue
    assert snap2_before
    snap2_before_size = int(snap2_before.size)

    # Prepare and write snap3_data,
    # which would partially overwrite snap2 content, plus one extra data
    # chunk in the expanded part.
    snap3_offset1 = random.randrange(snap2_offset,
                                     snap2_offset + snap_data_size_in_mb, 1)
    write_volume_dev_random_mb_data(volume_endpoint,
                                    snap3_offset1, snap_data_size_in_mb)
    snap3_offset2 = random.randrange(
        int(int(SIZE)/Mi + snap_data_size_in_mb),
        int(int(EXPAND_SIZE)/Mi - snap_data_size_in_mb), 1)
    write_volume_dev_random_mb_data(volume_endpoint,
                                    snap3_offset2, snap_data_size_in_mb)
    cksum_before = get_device_checksum(volume_endpoint)

    # Pruning snap2 as well as coalescing snap1 with snap2
    volume.snapshotDelete(name=snap2_before.name)
    volume.snapshotPurge()
    volume = wait_for_snapshot_purge(client, volume_name, snap2_before.name)
    snap2_after = volume.snapshotGet(name=snap2_before.name)
    snap2_after_size = int(snap2_after.size)
    cksum_after = get_device_checksum(volume_endpoint)
    assert snap2_after.removed
    assert snap2_before_size >= snap2_after_size - snap1_after_size - \
        fiemap_max_size + \
        (snap2_offset + snap_data_size_in_mb - snap3_offset1) * Mi
    assert cksum_before == cksum_after

    snap3_before = create_snapshot(client, volume_name)
    snap3_before_size = int(snap3_before.size)

    # Prepare and write snap4_data which has no overlapping part with snap3.
    snap4_offset = random.randrange(
        snap3_offset1 + snap_data_size_in_mb,
        int(int(SIZE)/Mi + snap_data_size_in_mb), 1)
    write_volume_dev_random_mb_data(volume_endpoint,
                                    snap4_offset, snap_data_size_in_mb)
    cksum_before = get_device_checksum(volume_endpoint)

    # Pruning snap3 then coalescing snap2 with snap3
    volume.snapshotDelete(name=snap3_before.name)
    volume.snapshotPurge()
    volume = wait_for_snapshot_purge(client, volume_name, snap3_before.name)
    snap3_after = volume.snapshotGet(name=snap3_before.name)
    snap3_after_size = int(snap3_after.size)
    cksum_after = get_device_checksum(volume_endpoint)
    assert snap3_after.removed
    assert snap3_before_size >= snap3_after_size - snap2_after_size - \
        fiemap_max_size
    assert cksum_before == cksum_after

    snap4 = create_snapshot(client, volume_name)

    # We don't care about the exact content of snap5
    snap5_offset = random.randrange(
        0, int(int(EXPAND_SIZE)/Mi - snap_data_size_in_mb), 1)
    write_volume_dev_random_mb_data(volume_endpoint,
                                    snap5_offset, snap_data_size_in_mb)
    create_snapshot(client, volume_name)

    # Prepare to do revert
    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
    volume.attach(hostId=lht_hostId, disableFrontend=True)
    volume = common.wait_for_volume_healthy_no_frontend(client, volume_name)
    assert volume.disableFrontend is True
    assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV
    check_volume_endpoint(volume)

    # Reverting to a removed snapshot should fail
    with pytest.raises(Exception):
        volume.snapshotRevert(name=snap3_after.name)

    # Reverting to snap4 should succeed
    volume.snapshotRevert(name=snap4.name)

    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
    volume.attach(hostId=lht_hostId, disableFrontend=False)
    common.wait_for_volume_healthy(client, volume_name)

    volume = client.by_id_volume(volume_name)
    assert volume.disableFrontend is False
    assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV

    cksum_revert = get_device_checksum(volume_endpoint)
    assert cksum_before == cksum_revert

    cleanup_volume(client, volume)
def snapshot_test(client, volume_name, backing_image)
def snapshot_test(client, volume_name, backing_image):  # NOQA
    volume = create_and_check_volume(client, volume_name,
                                     backing_image=backing_image)

    lht_hostId = get_self_host_id()
    volume = volume.attach(hostId=lht_hostId)
    volume = common.wait_for_volume_healthy(client, volume_name)

    volume = client.by_id_volume(volume_name)
    positions = {}

    snap1 = create_snapshot(client, volume_name)

    snap2_data = write_volume_random_data(volume, positions)
    snap2 = create_snapshot(client, volume_name)

    snap3_data = write_volume_random_data(volume, positions)
    snap3 = create_snapshot(client, volume_name)

    snapshots = volume.snapshotList()
    snapMap = {}
    for snap in snapshots:
        snapMap[snap.name] = snap

    assert snapMap[snap1.name].name == snap1.name
    assert snapMap[snap1.name].removed is False
    assert snapMap[snap2.name].name == snap2.name
    assert snapMap[snap2.name].parent == snap1.name
    assert snapMap[snap2.name].removed is False
    assert snapMap[snap3.name].name == snap3.name
    assert snapMap[snap3.name].parent == snap2.name
    assert snapMap[snap3.name].removed is False

    volume.snapshotDelete(name=snap3.name)
    check_volume_data(volume, snap3_data)

    snapshots = volume.snapshotList(volume=volume_name)
    snapMap = {}
    for snap in snapshots:
        snapMap[snap.name] = snap

    assert snapMap[snap1.name].name == snap1.name
    assert snapMap[snap1.name].removed is False
    assert snapMap[snap2.name].name == snap2.name
    assert snapMap[snap2.name].parent == snap1.name
    assert snapMap[snap2.name].removed is False
    assert snapMap[snap3.name].name == snap3.name
    assert snapMap[snap3.name].parent == snap2.name
    assert len(snapMap[snap3.name].children) == 1
    assert "volume-head" in snapMap[snap3.name].children.keys()
    assert snapMap[snap3.name].removed is True

    snap = volume.snapshotGet(name=snap3.name)
    assert snap.name == snap3.name
    assert snap.parent == snap3.parent
    assert len(snap3.children) == 1
    assert len(snap.children) == 1
    assert "volume-head" in snap3.children.keys()
    assert "volume-head" in snap.children.keys()
    assert snap.removed is True

    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
    volume.attach(hostId=lht_hostId, disableFrontend=True)
    common.wait_for_volume_healthy_no_frontend(client, volume_name)

    volume = client.by_id_volume(volume_name)
    assert volume.disableFrontend is True
    assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV
    check_volume_endpoint(volume)

    volume.snapshotRevert(name=snap2.name)

    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
    volume.attach(hostId=lht_hostId, disableFrontend=False)
    common.wait_for_volume_healthy(client, volume_name)

    volume = client.by_id_volume(volume_name)
    assert volume.disableFrontend is False
    assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV

    check_volume_data(volume, snap2_data)

    snapshots = volume.snapshotList(volume=volume_name)
    snapMap = {}
    for snap in snapshots:
        snapMap[snap.name] = snap

    assert snapMap[snap1.name].name == snap1.name
    assert snapMap[snap1.name].removed is False
    assert snapMap[snap2.name].name == snap2.name
    assert snapMap[snap2.name].parent == snap1.name
    assert "volume-head" in snapMap[snap2.name].children.keys()
    assert snap3.name in snapMap[snap2.name].children.keys()
    assert snapMap[snap2.name].removed is False
    assert snapMap[snap3.name].name == snap3.name
    assert snapMap[snap3.name].parent == snap2.name
    assert len(snapMap[snap3.name].children) == 0
    assert snapMap[snap3.name].removed is True

    volume.snapshotDelete(name=snap1.name)
    volume.snapshotDelete(name=snap2.name)
    volume.snapshotPurge()
    volume = wait_for_snapshot_purge(client, volume_name,
                                     snap1.name, snap3.name)

    snapshots = volume.snapshotList(volume=volume_name)
    snapMap = {}
    for snap in snapshots:
        snapMap[snap.name] = snap

    assert snap1.name not in snapMap
    assert snap3.name not in snapMap

    # it's the parent of volume-head, so it cannot be purged at this time
    assert snapMap[snap2.name].name == snap2.name
    assert snapMap[snap2.name].parent == ""
    assert "volume-head" in snapMap[snap2.name].children.keys()
    assert snapMap[snap2.name].removed is True

    check_volume_data(volume, snap2_data)

    cleanup_volume(client, volume)
def test_allow_volume_creation_with_degraded_availability(client, volume_name)
@pytest.mark.v2_volume_test  # NOQA
@pytest.mark.coretest  # NOQA
def test_allow_volume_creation_with_degraded_availability(client, volume_name):  # NOQA
    """
    Test Allow Volume Creation with Degraded Availability (API)

    Requirement:
    1. Set `allow-volume-creation-with-degraded-availability` to true.
    2. Set `node-level-soft-anti-affinity` to false.

    Steps:
    (degraded availability)
    1. Disable scheduling for node 2 and 3.
    2. Create a volume with three replicas.
        1. Volume should be `ready` after creation and `Scheduled` is true.
        2. One replica is scheduled successfully; the other two
           fail to schedule.
    3. Enable the scheduling of node 2.
        1. One additional replica of the volume becomes scheduled.
        2. The other replica still fails to schedule.
        3. The scheduled condition is still true.
    4. Attach the volume.
        1. After the volume is attached, the scheduled condition
           becomes false.
    5. Write data to the volume.
    6. Detach the volume.
        1. The scheduled condition should become true.
    7. Reattach the volume to verify the data.
        1. The scheduled condition should become false.
    8. Enable the scheduling for node 3.
    9. Wait for the scheduling condition to become true.
    10. Detach and reattach the volume to verify the data.
    """
    # enable volume creation with degraded availability
    degraded_availability_setting = \
        client.by_id_setting(common.SETTING_DEGRADED_AVAILABILITY)
    client.update(degraded_availability_setting, value="true")

    # disable node level soft anti-affinity
    replica_soft_anti_affinity_setting = \
        client.by_id_setting(SETTING_REPLICA_NODE_SOFT_ANTI_AFFINITY)
    client.update(replica_soft_anti_affinity_setting, value="false")

    nodes = client.list_node()
    node1 = nodes[0]
    node2 = nodes[1]
    node3 = nodes[2]

    # disable node 2 and 3 so replicas can only schedule to node 1
    client.update(node2, allowScheduling=False)
    client.update(node3, allowScheduling=False)

    # create volume
    volume = create_and_check_volume(client, volume_name, num_of_replicas=3)
    assert volume.ready
    assert volume.conditions[VOLUME_CONDITION_SCHEDULED]['status'] == "True"

    # check only 1 replica scheduled successfully
    common.wait_for_replica_scheduled(client, volume_name,
                                      to_nodes=[node1.name],
                                      expect_success=1,
                                      expect_fail=2,
                                      chk_vol_healthy=False,
                                      chk_replica_running=False)

    # enable node 2 so replicas can schedule to node 1 and 2
    client.update(node2, allowScheduling=True)

    # check 2 replicas scheduled successfully
    common.wait_for_replica_scheduled(client, volume_name,
                                      to_nodes=[node1.name, node2.name],
                                      expect_success=2,
                                      expect_fail=1,
                                      chk_vol_healthy=False,
                                      chk_replica_running=False)
    volume = client.by_id_volume(volume_name)
    assert volume.conditions[VOLUME_CONDITION_SCHEDULED]['status'] == "True"

    # attach volume
    self_host = get_self_host_id()
    volume.attach(hostId=self_host)
    volume = common.wait_for_volume_degraded(client, volume_name)
    assert volume.conditions[VOLUME_CONDITION_SCHEDULED]['status'] == "False"
    data = write_volume_random_data(volume, {})

    # detach volume
    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
    # The volume condition cannot change in the same reconcile loop as the
    # volume state changes to detached. We need to wait for the condition
    # change instead of just checking it once directly.
    common.wait_for_volume_condition_scheduled(client, volume.name, "status",
                                               CONDITION_STATUS_TRUE)

    # re-attach volume to verify the data
    volume.attach(hostId=self_host)
    volume = common.wait_for_volume_degraded(client, volume_name)
    check_volume_data(volume, data)
    assert volume.conditions[VOLUME_CONDITION_SCHEDULED]['status'] == "False"

    # enable node 3 so replicas can schedule to node 1, 2 and 3
    client.update(node3, allowScheduling=True)
    common.wait_for_volume_condition_scheduled(client, volume_name, "status",
                                               "True")

    # detach and re-attach the volume to verify the data
    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
    volume.attach(hostId=self_host)
    volume = common.wait_for_volume_healthy(client, volume_name)
    check_volume_data(volume, data)
Test Allow Volume Creation with Degraded Availability (API)

Requirement:
- Set `allow-volume-creation-with-degraded-availability` to true.
- Set `node-level-soft-anti-affinity` to false.

Steps: (degraded availability)
- Disable scheduling for node 2 and 3.
- Create a volume with three replicas.
  - Volume should be `ready` after creation and `Scheduled` is true.
  - One replica is scheduled successfully; the other two fail to schedule.
- Enable the scheduling of node 2.
  - One additional replica of the volume becomes scheduled.
  - The other replica still fails to schedule.
  - The scheduled condition is still true.
- Attach the volume.
  - After the volume is attached, the scheduled condition becomes false.
- Write data to the volume.
- Detach the volume.
  - The scheduled condition should become true.
- Reattach the volume to verify the data.
  - The scheduled condition should become false.
- Enable the scheduling for node 3.
- Wait for the scheduling condition to become true.
- Detach and reattach the volume to verify the data.

def test_allow_volume_creation_with_degraded_availability_dr(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod, pod_make)
def test_allow_volume_creation_with_degraded_availability_dr(
        set_random_backupstore, client, core_api, volume_name, csi_pv,
        pvc, pod, pod_make):  # NOQA
    """
    Test Allow Volume Creation with Degraded Availability (Restore)

    Requirement:
    1. Set `allow-volume-creation-with-degraded-availability` to true.
    2. Set `node-level-soft-anti-affinity` to false.
    3. Create a backup of 800MB.

    Steps:
    (DR volume)
    1. Disable scheduling for node 2 and 3.
    2. Create a DR volume from the backup with 3 replicas.
        1. The scheduled condition is false.
        2. Only the node 1 replica becomes scheduled.
    3. Enable scheduling for node 2 and 3.
        1. Replicas schedule to node 1, 2 and 3 successfully.
        2. Wait for the restore progress to complete.
        3. The scheduled condition becomes true.
    4. Activate and attach the volume, then verify the data.
    """
    # enable volume creation with degraded availability
    degraded_availability_setting = \
        client.by_id_setting(common.SETTING_DEGRADED_AVAILABILITY)
    client.update(degraded_availability_setting, value="true")

    # disable node level soft anti-affinity
    replica_soft_anti_affinity_setting = \
        client.by_id_setting(SETTING_REPLICA_NODE_SOFT_ANTI_AFFINITY)
    client.update(replica_soft_anti_affinity_setting, value="false")

    # create a backup
    backupstore_cleanup(client)
    data_path = "/data/test"
    src_vol_name = generate_volume_name()
    _, _, _, src_md5sum = \
        prepare_pod_with_data_in_mb(
            client, core_api, csi_pv, pvc, pod_make, src_vol_name,
            data_path=data_path, data_size_in_mb=common.DATA_SIZE_IN_MB_4)

    src_vol = client.by_id_volume(src_vol_name)
    src_snap = create_snapshot(client, src_vol_name)
    src_vol.snapshotBackup(name=src_snap.name)
    wait_for_backup_completion(client, src_vol_name, src_snap.name,
                               retry_count=600)
    _, backup = find_backup(client, src_vol_name, src_snap.name)

    nodes = client.list_node()
    node1 = nodes[0]
    node2 = nodes[1]
    node3 = nodes[2]

    # disable node 2 and 3 so replicas can only schedule to node 1
    client.update(node2, allowScheduling=False)
    client.update(node3, allowScheduling=False)

    # create DR volume
    dst_vol_name = generate_volume_name()
    dst_vol = client.create_volume(name=dst_vol_name,
                                   size=str(1*Gi),
                                   numberOfReplicas=3,
                                   fromBackup=backup.url,
                                   frontend="",
                                   standby=True,
                                   dataEngine=DATA_ENGINE)
    common.wait_for_volume_replica_count(client, dst_vol_name, 3)
    wait_for_volume_restoration_start(client, dst_vol_name, backup.name)
    wait_for_volume_condition_scheduled(client, dst_vol_name,
                                        "status", "False")

    # check only 1 replica scheduled successfully
    common.wait_for_replica_scheduled(client, dst_vol_name,
                                      to_nodes=[node1.name],
                                      expect_success=1,
                                      expect_fail=2,
                                      chk_vol_healthy=False,
                                      chk_replica_running=False)

    # enable node 2 and 3 so replicas can schedule to node 1, 2 and 3
    client.update(node2, allowScheduling=True)
    client.update(node3, allowScheduling=True)
    common.wait_for_replica_scheduled(client, dst_vol_name,
                                      to_nodes=[node1.name,
                                                node2.name,
                                                node3.name],
                                      expect_success=3,
                                      chk_vol_healthy=False,
                                      chk_replica_running=False)
    common.monitor_restore_progress(client, dst_vol_name)
    wait_for_volume_condition_scheduled(client, dst_vol_name,
                                        "status", "True")

    # activate the volume
    activate_standby_volume(client, dst_vol_name)

    # attach the volume
    dst_vol = client.by_id_volume(dst_vol_name)
    create_pv_for_volume(client, core_api, dst_vol, dst_vol_name)
    create_pvc_for_volume(client, core_api, dst_vol, dst_vol_name)
    dst_pod_name = dst_vol_name + "-pod"
    pod['metadata']['name'] = dst_vol_name + "-pod"
    pod['spec']['volumes'] = [{
        'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'],
        'persistentVolumeClaim': {
            'claimName': dst_vol_name,
        },
    }]
    create_and_wait_pod(core_api, pod)

    # verify the data
    dst_md5sum = get_pod_data_md5sum(core_api, dst_pod_name, data_path)
    assert src_md5sum == dst_md5sum
Test Allow Volume Creation with Degraded Availability (Restore)

Requirement:
- Set `allow-volume-creation-with-degraded-availability` to true.
- Set `node-level-soft-anti-affinity` to false.
- Create a backup of 800MB.

Steps: (DR volume)
- Disable scheduling for node 2 and 3.
- Create a DR volume from the backup with 3 replicas.
  - The scheduled condition is false.
  - Only the node 1 replica becomes scheduled.
- Enable scheduling for node 2 and 3.
  - Replicas schedule to node 1, 2 and 3 successfully.
  - Wait for the restore progress to complete.
  - The scheduled condition becomes true.
- Activate and attach the volume, then verify the data.
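The key API call in the test above is creating the volume as a standby (DR) volume directly from a backup URL; a condensed sketch drawn from the source:

```python
# frontend="" and standby=True make this a DR volume that keeps
# incrementally restoring from the backup target.
dst_vol = client.create_volume(name=dst_vol_name, size=str(1*Gi),
                               numberOfReplicas=3, fromBackup=backup.url,
                               frontend="", standby=True,
                               dataEngine=DATA_ENGINE)
activate_standby_volume(client, dst_vol_name)  # promote once restore is done
```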
def test_allow_volume_creation_with_degraded_availability_error(client, volume_name)
@pytest.mark.v2_volume_test  # NOQA
@pytest.mark.coretest  # NOQA
def test_allow_volume_creation_with_degraded_availability_error(client, volume_name):  # NOQA
    """
    Test Allow Volume Creation with Degraded Availability (API)

    Requirement:
    1. Set `allow-volume-creation-with-degraded-availability` to true.
    2. Set `node-level-soft-anti-affinity` to false.

    Steps:
    (no availability)
    1. Disable all nodes' scheduling.
    2. Create a volume with three replicas.
        1. Volume should be NotReady after creation.
        2. The scheduled condition should become false.
    3. Attaching the volume should result in an error.
    4. Enable one node's scheduling.
        1. Volume should become Ready soon.
        2. The scheduled condition should become true.
    5. Attach the volume. Write data. Detach and reattach to verify the data.
    """
    # enable volume creation with degraded availability
    degraded_availability_setting = \
        client.by_id_setting(common.SETTING_DEGRADED_AVAILABILITY)
    client.update(degraded_availability_setting, value="true")

    # disable node level soft anti-affinity
    replica_soft_anti_affinity_setting = \
        client.by_id_setting(SETTING_REPLICA_NODE_SOFT_ANTI_AFFINITY)
    client.update(replica_soft_anti_affinity_setting, value="false")

    nodes = client.list_node()
    node1 = nodes[0]
    node2 = nodes[1]
    node3 = nodes[2]

    # disable node 1, 2 and 3 so that no node is available
    client.update(node1, allowScheduling=False)
    client.update(node2, allowScheduling=False)
    client.update(node3, allowScheduling=False)

    # create volume
    volume = create_and_check_volume(client, volume_name, num_of_replicas=3)
    assert not volume.ready
    assert volume.conditions[VOLUME_CONDITION_SCHEDULED]['status'] == "False"

    # attach the volume
    self_host = get_self_host_id()
    with pytest.raises(Exception) as e:
        volume.attach(hostId=self_host)
    assert "unable to attach volume" in str(e.value)

    # enable node 1
    client.update(node1, allowScheduling=True)

    # check only 1 replica scheduled successfully
    common.wait_for_replica_scheduled(client, volume_name,
                                      to_nodes=[node1.name],
                                      expect_success=1,
                                      expect_fail=2,
                                      chk_vol_healthy=False,
                                      chk_replica_running=False)
    volume = common.wait_for_volume_status(client, volume_name,
                                           VOLUME_FIELD_READY, True)
    assert volume.conditions[VOLUME_CONDITION_SCHEDULED]['status'] == "True"

    # attach the volume and write some data
    volume.attach(hostId=self_host)
    volume = common.wait_for_volume_degraded(client, volume_name)
    data = write_volume_random_data(volume, {})

    # detach and re-attach the volume to verify the data
    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
    volume.attach(hostId=self_host)
    volume = common.wait_for_volume_degraded(client, volume_name)
    check_volume_data(volume, data)
Test Allow Volume Creation with Degraded Availability (API)

Requirement:
- Set `allow-volume-creation-with-degraded-availability` to true.
- Set `node-level-soft-anti-affinity` to false.

Steps: (no availability)
- Disable all nodes' scheduling.
- Create a volume with three replicas.
  - Volume should be NotReady after creation.
  - The scheduled condition should become false.
- Attaching the volume should result in an error.
- Enable one node's scheduling.
  - Volume should become Ready soon.
  - The scheduled condition should become true.
- Attach the volume. Write data. Detach and reattach to verify the data.
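The expected failure in step 3 is asserted with the usual pytest idiom, extracted from the code above:

```python
# Attaching an unschedulable volume must be rejected by the API.
with pytest.raises(Exception) as e:
    volume.attach(hostId=get_self_host_id())
assert "unable to attach volume" in str(e.value)
```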
def test_allow_volume_creation_with_degraded_availability_restore(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod, pod_make)
def test_allow_volume_creation_with_degraded_availability_restore(
        set_random_backupstore, client, core_api, volume_name, csi_pv,
        pvc, pod, pod_make):  # NOQA
    """
    Test Allow Volume Creation with Degraded Availability (Restore)

    Requirement:
    1. Set `allow-volume-creation-with-degraded-availability` to true.
    2. Set `node-level-soft-anti-affinity` to false.
    3. Set `replica-replenishment-wait-interval` to 0.
    4. Create a backup of 800MB.

    Steps:
    (restore)
    1. Disable scheduling for node 2 and 3.
    2. Restore a volume with 3 replicas.
        1. The scheduled condition is true.
        2. Only the node 1 replica becomes scheduled.
    3. Enable scheduling for node 2.
    4. Wait for the restore to complete and the volume to detach
       automatically. Then check the scheduled condition is still true.
    5. Attach and wait for the volume.
        1. Two replicas are scheduled successfully to node 1 and 2;
           one replica cannot be created because node 3 is unschedulable.
        2. The scheduled condition becomes false.
        3. Verify the data.
    """
    # enable volume creation with degraded availability
    degraded_availability_setting = \
        client.by_id_setting(common.SETTING_DEGRADED_AVAILABILITY)
    client.update(degraded_availability_setting, value="true")

    # disable node level soft anti-affinity
    replica_soft_anti_affinity_setting = \
        client.by_id_setting(SETTING_REPLICA_NODE_SOFT_ANTI_AFFINITY)
    client.update(replica_soft_anti_affinity_setting, value="false")

    replenish_wait_setting = \
        client.by_id_setting(SETTING_REPLICA_REPLENISHMENT_WAIT_INTERVAL)
    client.update(replenish_wait_setting, value="0")

    # create a backup
    backupstore_cleanup(client)
    data_path = "/data/test"
    src_vol_name = generate_volume_name()
    _, _, _, src_md5sum = \
        prepare_pod_with_data_in_mb(
            client, core_api, csi_pv, pvc, pod_make, src_vol_name,
            data_path=data_path, data_size_in_mb=common.DATA_SIZE_IN_MB_4)

    src_vol = client.by_id_volume(src_vol_name)
    src_snap = create_snapshot(client, src_vol_name)
    src_vol.snapshotBackup(name=src_snap.name)
    wait_for_backup_completion(client, src_vol_name, src_snap.name,
                               retry_count=600)
    _, backup = find_backup(client, src_vol_name, src_snap.name)

    nodes = client.list_node()
    node1 = nodes[0]
    node2 = nodes[1]
    node3 = nodes[2]

    # disable node 2 and 3 so replicas can only schedule to node 1
    client.update(node2, allowScheduling=False)
    client.update(node3, allowScheduling=False)

    # restore volume
    dst_vol_name = generate_volume_name()
    client.create_volume(name=dst_vol_name,
                         size=str(1*Gi),
                         numberOfReplicas=3,
                         fromBackup=backup.url,
                         dataEngine=DATA_ENGINE)
    common.wait_for_volume_replica_count(client, dst_vol_name, 3)
    common.wait_for_volume_restoration_start(client, dst_vol_name,
                                             backup.name)
    common.wait_for_volume_degraded(client, dst_vol_name)

    # check only 1 replica scheduled successfully
    common.wait_for_replica_scheduled(client, dst_vol_name,
                                      to_nodes=[node1.name],
                                      expect_success=1,
                                      expect_fail=2,
                                      chk_vol_healthy=False,
                                      chk_replica_running=False)

    # enable node 2 so replicas can schedule to node 1 and 2
    client.update(node2, allowScheduling=True)

    # wait for the restore to complete
    common.wait_for_volume_restoration_completed(client, dst_vol_name)
    dst_vol = common.wait_for_volume_detached(client, dst_vol_name)
    # The volume condition cannot change in the same reconcile loop as the
    # volume state changes to detached. We need to wait for the condition
    # change instead of just checking it once directly.
    common.wait_for_volume_condition_scheduled(client, dst_vol.name,
                                               "status",
                                               CONDITION_STATUS_TRUE)

    # attach the volume
    create_pv_for_volume(client, core_api, dst_vol, dst_vol_name)
    create_pvc_for_volume(client, core_api, dst_vol, dst_vol_name)
    dst_pod_name = dst_vol_name + "-pod"
    pod['metadata']['name'] = dst_vol_name + "-pod"
    pod['spec']['volumes'] = [{
        'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'],
        'persistentVolumeClaim': {
            'claimName': dst_vol_name,
        },
    }]
    create_and_wait_pod(core_api, pod)

    # check 2 replicas scheduled successfully
    dst_vol = common.wait_for_replica_scheduled(client, dst_vol_name,
                                                to_nodes=[node1.name,
                                                          node2.name],
                                                expect_success=2,
                                                expect_fail=-1,
                                                chk_vol_healthy=False,
                                                chk_replica_running=False)
    wait_scheduling_failure(client, dst_vol_name)

    # verify the data
    dst_md5sum = get_pod_data_md5sum(core_api, dst_pod_name, data_path)
    assert src_md5sum == dst_md5sum
Test Allow Volume Creation with Degraded Availability (Restore)

Requirement:
- Set `allow-volume-creation-with-degraded-availability` to true.
- Set `node-level-soft-anti-affinity` to false.
- Set `replica-replenishment-wait-interval` to 0.
- Create a backup of 800MB.

Steps: (restore)
- Disable scheduling for node 2 and 3.
- Restore a volume with 3 replicas.
  - The scheduled condition is true.
  - Only the node 1 replica becomes scheduled.
- Enable scheduling for node 2.
- Wait for the restore to complete and the volume to detach automatically. Then check the scheduled condition is still true.
- Attach and wait for the volume.
  - Two replicas are scheduled successfully to node 1 and 2; one replica cannot be created because node 3 is unschedulable.
  - The scheduled condition becomes false.
  - Verify the data.
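All three setting requirements boil down to updating Longhorn settings through the client; a condensed sketch of the pattern used at the top of this family of tests:

```python
# Longhorn setting values are passed as strings.
for name, value in [(common.SETTING_DEGRADED_AVAILABILITY, "true"),
                    (SETTING_REPLICA_NODE_SOFT_ANTI_AFFINITY, "false"),
                    (SETTING_REPLICA_REPLENISHMENT_WAIT_INTERVAL, "0")]:
    setting = client.by_id_setting(name)
    client.update(setting, value=value)
```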
def test_attach_without_frontend(client, volume_name)
@pytest.mark.v2_volume_test  # NOQA
@pytest.mark.coretest  # NOQA
def test_attach_without_frontend(client, volume_name):  # NOQA
    """
    Test attach in maintenance mode (without frontend)

    1. Create a volume and attach to the current node with enabled frontend
    2. Check volume has `blockdev`
    3. Write `snap1_data` into volume and create snapshot `snap1`
    4. Write more random data into volume and create another snapshot
    5. Detach the volume and reattach with disabled frontend
    6. Check volume still has `blockdev` as frontend but no endpoint
    7. Revert back to `snap1`
    8. Detach and reattach the volume with enabled frontend
    9. Check volume contains data `snap1_data`
    """
    volume = create_and_check_volume(client, volume_name)

    lht_hostId = get_self_host_id()
    volume.attach(hostId=lht_hostId, disableFrontend=False)
    common.wait_for_volume_healthy(client, volume_name)

    volume = client.by_id_volume(volume_name)
    assert volume.disableFrontend is False
    assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV

    snap1_data = write_volume_random_data(volume)
    snap1 = create_snapshot(client, volume_name)

    write_volume_random_data(volume)
    create_snapshot(client, volume_name)

    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)

    volume.attach(hostId=lht_hostId, disableFrontend=True)
    common.wait_for_volume_healthy_no_frontend(client, volume_name)

    volume = client.by_id_volume(volume_name)
    assert volume.disableFrontend is True
    assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV
    check_volume_endpoint(volume)

    volume.snapshotRevert(name=snap1.name)

    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)

    volume.attach(hostId=lht_hostId, disableFrontend=False)
    common.wait_for_volume_healthy(client, volume_name)

    volume = client.by_id_volume(volume_name)
    assert volume.disableFrontend is False
    assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV

    check_volume_data(volume, snap1_data)

    client.delete(volume)
    wait_for_volume_delete(client, volume_name)
Test attach in maintenance mode (without frontend)

- Create a volume and attach to the current node with enabled frontend
- Check volume has `blockdev`
- Write `snap1_data` into volume and create snapshot `snap1`
- Write more random data into volume and create another snapshot
- Detach the volume and reattach with disabled frontend
- Check volume still has `blockdev` as frontend but no endpoint
- Revert back to `snap1`
- Detach and reattach the volume with enabled frontend
- Check volume contains data `snap1_data`
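The maintenance-mode round trip at the heart of this test, condensed from the code above:

```python
# Reattach with the frontend disabled: the volume still reports `blockdev`
# as its frontend but exposes no endpoint while in maintenance mode.
volume.attach(hostId=get_self_host_id(), disableFrontend=True)
common.wait_for_volume_healthy_no_frontend(client, volume_name)
volume = client.by_id_volume(volume_name)
assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV
check_volume_endpoint(volume)  # endpoint check for the frontend-disabled state
volume.snapshotRevert(name=snap1.name)  # revert requires maintenance mode
```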
def test_aws_iam_role_arn(client, core_api)
@pytest.mark.v2_volume_test  # NOQA
@pytest.mark.skipif('s3' not in BACKUPSTORE, reason='This test is only applicable for s3')  # NOQA
def test_aws_iam_role_arn(client, core_api):  # NOQA
    """
    Test AWS IAM Role ARN

    1. Set backup target to S3
    2. Check longhorn manager and aio instance manager Pods without
       'iam.amazonaws.com/role' annotation
    3. Add AWS_IAM_ROLE_ARN to the secret
    4. Check longhorn manager and aio instance manager Pods have the
       'iam.amazonaws.com/role' annotation matching AWS_IAM_ROLE_ARN
       in the secret
    5. Update AWS_IAM_ROLE_ARN in the secret
    6. Check longhorn manager and aio instance manager Pods have the
       'iam.amazonaws.com/role' annotation matching AWS_IAM_ROLE_ARN
       in the secret
    7. Remove AWS_IAM_ROLE_ARN from the secret
    8. Check longhorn manager and aio instance manager Pods without
       'iam.amazonaws.com/role' annotation
    """
    set_backupstore_s3(client)

    lh_label = 'app=longhorn-manager'
    im_label = 'longhorn.io/instance-manager-type=aio'
    anno_key = 'iam.amazonaws.com/role'
    secret_name = backupstore_get_secret(client)

    common.wait_for_pod_annotation(
        core_api, lh_label, anno_key, None)
    common.wait_for_pod_annotation(
        core_api, im_label, anno_key, None)

    # Add secret key AWS_IAM_ROLE_ARN with value test-aws-iam-role-arn
    secret = core_api.read_namespaced_secret(
        name=secret_name, namespace='longhorn-system')
    secret.data['AWS_IAM_ROLE_ARN'] = 'dGVzdC1hd3MtaWFtLXJvbGUtYXJu'
    core_api.patch_namespaced_secret(
        name=secret_name, namespace='longhorn-system', body=secret)

    common.wait_for_pod_annotation(
        core_api, lh_label, anno_key, 'test-aws-iam-role-arn')
    common.wait_for_pod_annotation(
        core_api, im_label, anno_key, 'test-aws-iam-role-arn')

    # Update secret key AWS_IAM_ROLE_ARN to value test-aws-iam-role-arn-2
    secret = core_api.read_namespaced_secret(
        name=secret_name, namespace='longhorn-system')
    secret.data['AWS_IAM_ROLE_ARN'] = 'dGVzdC1hd3MtaWFtLXJvbGUtYXJuLTI='
    core_api.patch_namespaced_secret(
        name=secret_name, namespace='longhorn-system', body=secret)

    common.wait_for_pod_annotation(
        core_api, lh_label, anno_key, 'test-aws-iam-role-arn-2')
    common.wait_for_pod_annotation(
        core_api, im_label, anno_key, 'test-aws-iam-role-arn-2')

    # Remove secret key AWS_IAM_ROLE_ARN
    body = [{"op": "remove", "path": "/data/AWS_IAM_ROLE_ARN"}]
    core_api.patch_namespaced_secret(
        name=secret_name, namespace='longhorn-system', body=body)

    common.wait_for_pod_annotation(
        core_api, lh_label, anno_key, None)
    common.wait_for_pod_annotation(
        core_api, im_label, anno_key, None)
Test AWS IAM Role ARN

- Set backup target to S3
- Check longhorn manager and aio instance manager Pods without 'iam.amazonaws.com/role' annotation
- Add AWS_IAM_ROLE_ARN to the secret
- Check longhorn manager and aio instance manager Pods have the 'iam.amazonaws.com/role' annotation matching AWS_IAM_ROLE_ARN in the secret
- Update AWS_IAM_ROLE_ARN in the secret
- Check longhorn manager and aio instance manager Pods have the 'iam.amazonaws.com/role' annotation matching AWS_IAM_ROLE_ARN in the secret
- Remove AWS_IAM_ROLE_ARN from the secret
- Check longhorn manager and aio instance manager Pods without 'iam.amazonaws.com/role' annotation
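Kubernetes stores Secret data values base64-encoded, which is why the test patches opaque-looking strings; a quick check of the encoding used above:

```python
import base64

# 'dGVzdC1hd3MtaWFtLXJvbGUtYXJu' is simply the encoded annotation value.
assert base64.b64encode(b"test-aws-iam-role-arn").decode() == \
    "dGVzdC1hd3MtaWFtLXJvbGUtYXJu"
```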
def test_backup(set_random_backupstore, client, volume_name)
@pytest.mark.v2_volume_test  # NOQA
@pytest.mark.coretest  # NOQA
def test_backup(set_random_backupstore, client, volume_name):  # NOQA
    """
    Test basic backup

    Setup:

    1. Create a volume and attach to the current node
    2. Run the test for all the available backupstores.

    Steps:

    1. Create a backup of the volume
    2. Restore the backup to a new volume
    3. Attach the new volume and make sure the data is the same as
       the old one
    4. Detach the volume and delete the backup.
    5. Wait for the restored volume's `lastBackup` to be cleaned
       (due to the backup removal)
    6. Delete the volume
    """
    backup_test(client, volume_name, SIZE)
Test basic backup

Setup:
- Create a volume and attach to the current node
- Run the test for all the available backupstores.

Steps:
- Create a backup of the volume
- Restore the backup to a new volume
- Attach the new volume and make sure the data is the same as the old one
- Detach the volume and delete the backup.
- Wait for the restored volume's `lastBackup` to be cleaned (due to the backup removal)
- Delete the volume
def test_backup_block_deletion(set_random_backupstore, client, core_api, volume_name)
@pytest.mark.v2_volume_test  # NOQA
def test_backup_block_deletion(set_random_backupstore, client, core_api, volume_name):  # NOQA
    """
    Test backup block deletion

    Context:

    We want to make sure that we only delete unreferenced backup blocks,
    and we don't want to delete blocks while there are other backups in
    progress. The reason for this is that we don't yet know which blocks
    are required by the in-progress backup, so block deletion could lead
    to a faulty backup.

    Setup:

    1. Set up minio as the S3 backupstore

    Steps:

    1.  Create a volume and attach to the current node
    2.  Write 4 MB to the beginning of the volume (2 x 2MB backup blocks)
    3.  Create backup(1) of the volume
    4.  Overwrite the first of the backup blocks of data on the volume
    5.  Create backup(2) of the volume
    6.  Overwrite the first of the backup blocks of data on the volume
    7.  Create backup(3) of the volume
    8.  Verify backup block count == 4
        assert volume["DataStored"] == str(BLOCK_SIZE * expected_count)
        assert count of *.blk files for that volume == expected_count
    9.  Create an artificial in-progress backup.cfg file
        json.dumps({"Name": name, "VolumeName": volume, "CreatedTime": ""})
    10. Delete backup(2)
    11. Verify backup block count == 4 (because of the in-progress backup)
    12. Delete the artificial in-progress backup.cfg file
    13. Delete backup(1)
    14. Verify backup block count == 2
    15. Delete backup(3)
    16. Verify backup block count == 0
    17. Delete the backup volume
    18. Clean up the volume
    """
    backup_store_type = set_random_backupstore
    if backup_store_type not in ["nfs", "s3"]:
        pytest.skip("Skip test case because the backup store type is not supported")  # NOQA

    backupstore_cleanup(client)

    volume = create_and_check_volume(client, volume_name)
    host_id = get_self_host_id()
    volume = volume.attach(hostId=host_id)
    volume = common.wait_for_volume_healthy(client, volume_name)

    data0 = {'pos': 0, 'len': 2 * BACKUP_BLOCK_SIZE,
             'content': common.generate_random_data(2 * BACKUP_BLOCK_SIZE)}
    _, backup0, _, _ = create_backup(client, volume_name, data0)

    data1 = {'pos': 0, 'len': BACKUP_BLOCK_SIZE,
             'content': common.generate_random_data(BACKUP_BLOCK_SIZE)}
    _, backup1, _, _ = create_backup(client, volume_name, data1)

    data2 = {'pos': 0, 'len': BACKUP_BLOCK_SIZE,
             'content': common.generate_random_data(BACKUP_BLOCK_SIZE)}
    backup_volume, backup2, _, _ = create_backup(client, volume_name, data2)

    backup_blocks_count = backupstore_count_backup_block_files(client,
                                                               core_api,
                                                               volume_name)
    assert backup_blocks_count == 4

    bvs = client.list_backupVolume()
    for bv in bvs:
        if bv['name'] == backup_volume.name:
            assert bv['dataStored'] == \
                str(backup_blocks_count * BACKUP_BLOCK_SIZE)

    backupstore_create_dummy_in_progress_backup(client, core_api,
                                                volume_name)
    delete_backup(client, backup_volume, backup1.name)
    assert backupstore_count_backup_block_files(client,
                                                core_api,
                                                volume_name) == 4

    backupstore_delete_dummy_in_progress_backup(client, core_api,
                                                volume_name)

    delete_backup(client, backup_volume, backup0.name)
    assert backupstore_count_backup_block_files(client,
                                                core_api,
                                                volume_name) == 2

    delete_backup(client, backup_volume, backup2.name)
    assert backupstore_count_backup_block_files(client,
                                                core_api,
                                                volume_name) == 0

    delete_backup_volume(client, backup_volume.name)
Test backup block deletion
Context:
We want to make sure that we only delete unreferenced backup blocks, and we don't want to delete blocks while there are other backups in progress. The reason is that we don't yet know which blocks are required by the in-progress backup, so block deletion could lead to a faulty backup.
Setup:
- Setup minio as S3 backupstore
Steps:
- Create a volume and attach to the current node
- Write 4 MB to the beginning of the volume (2 x 2MB backup blocks)
- Create backup(1) of the volume
- Overwrite the first backup block of data on the volume
- Create backup(2) of the volume
- Overwrite the first backup block of data on the volume
- Create backup(3) of the volume
- Verify backup block count == 4: assert volume["DataStored"] == str(BLOCK_SIZE * expected_count) and assert the count of *.blk files for that volume == expected_count
- Create an artificial in-progress backup.cfg file: json.dumps({"Name": name, "VolumeName": volume, "CreatedTime": ""})
- Delete backup(2)
- Verify backup block count == 4 (because of the in progress backup)
- Delete the artificial in progress backup.cfg file
- Delete backup(1)
- Verify backup block count == 2
- Delete backup(3)
- Verify backup block count == 0
- Delete the backup volume
- Cleanup the volume
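Steps 9-14 are the heart of the test: an artificial in-progress backup must freeze block garbage collection. The relevant span of the collapsed source above, reflowed with comments (backup_volume, backup0 and backup1 come from the earlier steps):

backupstore_create_dummy_in_progress_backup(client, core_api, volume_name)

# Deleting backup(1) reclaims nothing while a backup appears to be running.
delete_backup(client, backup_volume, backup1.name)
assert backupstore_count_backup_block_files(client, core_api,
                                            volume_name) == 4

# Removing the dummy cfg re-enables cleanup; the next deletion reclaims
# the blocks that only backup(0) and backup(1) referenced.
backupstore_delete_dummy_in_progress_backup(client, core_api, volume_name)
delete_backup(client, backup_volume, backup0.name)
assert backupstore_count_backup_block_files(client, core_api,
                                            volume_name) == 2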
def test_backup_failed_disable_auto_cleanup(set_random_backupstore, client, core_api, volume_name)
-
Expand source code
@pytest.mark.coretest # NOQA def test_backup_failed_disable_auto_cleanup(set_random_backupstore, # NOQA client, core_api, volume_name): # NOQA """ Test the failed backup would be automatically deleted. 1. Set the default setting `backupstore-poll-interval` to 60 (seconds) 2. Set the default setting `failed-backup-ttl` to 0 3. Create a volume and attach to the current node 4. Create a empty backup for creating the backup volume 5. Write some data to the volume 6. Create a backup of the volume 7. Crash all replicas 8. Wait and check if the backup failed 9. Wait and check if the backup was not deleted. 10. Cleanup """ backup_name = backup_failed_cleanup(client, core_api, volume_name, 2048*Mi, failed_backup_ttl="0") # wait for 5 minutes to check if the failed backup exists try: wait_for_backup_delete(client, volume_name, backup_name) assert False, "backup " + backup_name + " for volume " \ + volume_name + " is deleted" except AssertionError: pass
Test that a failed backup is not automatically deleted when the auto-cleanup is disabled.
- Set the default setting backupstore-poll-interval to 60 (seconds)
- Set the default setting failed-backup-ttl to 0
- Create a volume and attach to the current node
- Create an empty backup for creating the backup volume
- Write some data to the volume
- Create a backup of the volume
- Crash all replicas
- Wait and check if the backup failed
- Wait and check that the backup was not deleted.
- Cleanup
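The "was not deleted" assertion is the subtle part: the collapsed source above waits for a deletion that must never happen and treats the wait's timeout as success. A minimal sketch of the same check, assuming (as the source's try/except implies) that wait_for_backup_delete raises AssertionError when it times out:

backup_name = backup_failed_cleanup(client, core_api, volume_name,
                                    2048 * Mi, failed_backup_ttl="0")
# With failed-backup-ttl at 0 the auto-cleanup is disabled, so the wait
# for the backup to disappear is expected to time out (and raise).
with pytest.raises(AssertionError):
    wait_for_backup_delete(client, volume_name, backup_name)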
def test_backup_failed_enable_auto_cleanup(set_random_backupstore, client, core_api, volume_name)
-
Expand source code
@pytest.mark.coretest # NOQA def test_backup_failed_enable_auto_cleanup(set_random_backupstore, # NOQA client, core_api, volume_name): # NOQA """ Test the failed backup would be automatically deleted. 1. Set the default setting `backupstore-poll-interval` to 60 (seconds) 2. Set the default setting `failed-backup-ttl` to 3 (minutes) 3. Create a volume and attach to the current node 4. Create a empty backup for creating the backup volume 5. Write some data to the volume 6. Create a backup of the volume 7. Crash all replicas 8. Wait and check if the backup failed 9. Wait and check if the backup was deleted automatically 10. Cleanup """ backup_name = backup_failed_cleanup(client, core_api, volume_name, 2048*Mi) # wait in 5 minutes for automatic failed backup cleanup wait_for_backup_delete(client, volume_name, backup_name)
Test that a failed backup is automatically deleted.
- Set the default setting backupstore-poll-interval to 60 (seconds)
- Set the default setting failed-backup-ttl to 3 (minutes)
- Create a volume and attach to the current node
- Create an empty backup for creating the backup volume
- Write some data to the volume
- Create a backup of the volume
- Crash all replicas
- Wait and check if the backup failed
- Wait and check if the backup was deleted automatically
- Cleanup
def test_backup_labels(set_random_backupstore, client, random_labels, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_backup_labels(set_random_backupstore, client, random_labels, volume_name): # NOQA """ Test that the proper Labels are applied when creating a Backup manually. 1. Create a volume 2. Run the following steps on all backupstores 3. Create a backup with some random labels 4. Get backup from backupstore, verify the labels are set on the backups """ backup_labels_test(client, random_labels, volume_name)
Test that the proper Labels are applied when creating a Backup manually.
- Create a volume
- Run the following steps on all backupstores
- Create a backup with some random labels
- Get backup from backupstore, verify the labels are set on the backups
def test_backup_lock_creation_during_deletion(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod_make)
-
Expand source code
@pytest.mark.skip(reason="In the current Longhorn implementation, \ it's almost impossible to catch the moment when the lock is held") def test_backup_lock_creation_during_deletion(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod_make): # NOQA """ Test backup locks Context: To test the locking mechanism that utilizes the backupstore, to prevent the following case of concurrent operations. - prevent backup creation during backup deletion steps: 1. Create a volume, then create the corresponding PV, PVC and Pod. 2. Wait for the pod running and the volume healthy. 3. Write data (DATA_SIZE_IN_MB_4) to the pod volume and get the md5sum. 4. Take a backup. 5. Wait for the backup to be completed. 6. Delete the backup. 7. Create another backup of the same volume. 8. The newly created backup should failed because there is a deletion lock. 9. Wait for the first backup to be Deleted 10. Create another backup of the same volume. 11. Wait for the backup to be completed. """ backupstore_cleanup(client) std_volume_name = volume_name + "-std" std_pod_name, _, _, std_md5sum1 = \ prepare_pod_with_data_in_mb( client, core_api, csi_pv, pvc, pod_make, std_volume_name, data_size_in_mb=common.DATA_SIZE_IN_MB_4) std_volume = client.by_id_volume(std_volume_name) # create first snapshot and backup snap1 = create_snapshot(client, std_volume_name) std_volume.snapshotBackup(name=snap1.name) wait_for_backup_completion(client, std_volume_name, snap1.name) bv, b1 = common.find_backup(client, std_volume_name, snap1.name) # create second snapshot snap2 = create_snapshot(client, std_volume_name) # delete first backup backup_volume = client.by_id_backupVolume(bv.name) backup_volume.backupDelete(name=b1.name) # create second backup immediately std_volume.snapshotBackup(name=snap2.name) # the second backup should be failed because of the lock wait_for_backup_failed(client, std_volume_name, snap2.name) # wait for first backup to be deleted wait_for_backup_delete(client, std_volume_name, b1.name) # create third snapshot and do the backup snap3 = create_snapshot(client, std_volume_name) std_volume.snapshotBackup(name=snap3.name) # the third backup should be completed wait_for_backup_completion(client, std_volume_name, snap3.name)
Test backup locks
Context:
To test the locking mechanism that utilizes the backupstore, to prevent the following case of concurrent operations.
- prevent backup creation during backup deletion
Steps:
- Create a volume, then create the corresponding PV, PVC and Pod.
- Wait for the pod running and the volume healthy.
- Write data (DATA_SIZE_IN_MB_4) to the pod volume and get the md5sum.
- Take a backup.
- Wait for the backup to be completed.
- Delete the backup.
- Create another backup of the same volume.
- The newly created backup should fail because there is a deletion lock.
- Wait for the first backup to be deleted.
- Create another backup of the same volume.
- Wait for the backup to be completed.
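The lock interaction in steps 6-9, reflowed from the collapsed source above with comments (std_volume, bv, b1 and snap2 come from the earlier steps):

# Delete the first backup, then immediately request a second one.
backup_volume = client.by_id_backupVolume(bv.name)
backup_volume.backupDelete(name=b1.name)
std_volume.snapshotBackup(name=snap2.name)

# The deletion lock is still held, so the new backup must fail.
wait_for_backup_failed(client, std_volume_name, snap2.name)

# Once the deletion finishes, the lock is released and a fresh backup
# goes through.
wait_for_backup_delete(client, std_volume_name, b1.name)
snap3 = create_snapshot(client, std_volume_name)
std_volume.snapshotBackup(name=snap3.name)
wait_for_backup_completion(client, std_volume_name, snap3.name)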
def test_backup_lock_deletion_during_backup(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod_make)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_backup_lock_deletion_during_backup(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod_make): # NOQA """ Test backup locks Context: To test the locking mechanism that utilizes the backupstore, to prevent the following case of concurrent operations. - prevent backup deletion while a backup is in progress steps: 1. Create a volume, then create the corresponding PV, PVC and Pod. 2. Wait for the pod running and the volume healthy. 3. Write data to the pod volume and get the md5sum. 4. Take a backup. 5. Wait for the backup to be completed. 6. Write more data into the volume and compute md5sum. 7. Take another backup of the volume. 8. While backup is in progress, delete the older backup up. 9. Wait for the backup creation in progress to be completed. 10. Check the backup store, there should be 1 backup. 11. Restore the latest backup. 12. Wait for the restoration to be completed. Assert md5sum from step 6. """ common.update_setting(client, common.SETTING_DEGRADED_AVAILABILITY, "false") backupstore_cleanup(client) std_volume_name = volume_name + "-std" restore_volume_name_1 = volume_name + "-restore-1" std_pod_name, _, _, std_md5sum1 = \ prepare_pod_with_data_in_mb( client, core_api, csi_pv, pvc, pod_make, std_volume_name) std_volume = client.by_id_volume(std_volume_name) snap1 = create_snapshot(client, std_volume_name) std_volume.snapshotBackup(name=snap1.name) wait_for_backup_completion(client, std_volume_name, snap1.name) bv, b1 = find_backup(client, std_volume_name, snap1.name) write_pod_volume_random_data(core_api, std_pod_name, "/data/test", common.DATA_SIZE_IN_MB_4) std_md5sum2 = get_pod_data_md5sum(core_api, std_pod_name, "/data/test") snap2 = create_snapshot(client, std_volume_name) std_volume.snapshotBackup(name=snap2.name) wait_for_backup_to_start(client, std_volume_name, snapshot_name=snap2.name) backup_volume = client.by_id_backupVolume(bv.name) backup_volume.backupDelete(name=b1.name) wait_for_backup_completion(client, std_volume_name, snap2.name, retry_count=600) wait_for_backup_delete(client, std_volume_name, b1.name) _, b2 = find_backup(client, std_volume_name, snap2.name) assert b2 is not None try: _, b1 = find_backup(client, std_volume_name, snap1.name) except AssertionError: b1 = None assert b1 is None client.create_volume(name=restore_volume_name_1, fromBackup=b2.url, numberOfReplicas=3, dataEngine=DATA_ENGINE) wait_for_volume_restoration_completed(client, restore_volume_name_1) restore_volume_1 = wait_for_volume_detached(client, restore_volume_name_1) assert len(restore_volume_1.replicas) == 3 restore_pod_name_1 = restore_volume_name_1 + "-pod" restore_pv_name_1 = restore_volume_name_1 + "-pv" restore_pvc_name_1 = restore_volume_name_1 + "-pvc" restore_pod_1 = pod_make(name=restore_pod_name_1) create_pv_for_volume(client, core_api, restore_volume_1, restore_pv_name_1) create_pvc_for_volume(client, core_api, restore_volume_1, restore_pvc_name_1) restore_pod_1['spec']['volumes'] = [create_pvc_spec(restore_pvc_name_1)] create_and_wait_pod(core_api, restore_pod_1) md5sum2 = get_pod_data_md5sum(core_api, restore_pod_name_1, "/data/test") assert std_md5sum2 == md5sum2
Test backup locks
Context:
To test the locking mechanism that utilizes the backupstore, to prevent the following case of concurrent operations.
- prevent backup deletion while a backup is in progress
Steps:
- Create a volume, then create the corresponding PV, PVC and Pod.
- Wait for the pod running and the volume healthy.
- Write data to the pod volume and get the md5sum.
- Take a backup.
- Wait for the backup to be completed.
- Write more data into the volume and compute the md5sum.
- Take another backup of the volume.
- While the backup is in progress, delete the older backup.
- Wait for the in-progress backup to be completed.
- Check the backup store; there should be 1 backup.
- Restore the latest backup.
- Wait for the restoration to be completed. Assert the md5sum from step 6.
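Steps 7-10, reflowed from the collapsed source above (std_volume, bv, b1 and snap2 come from the earlier steps):

# Start the second backup and wait only until it is in progress ...
std_volume.snapshotBackup(name=snap2.name)
wait_for_backup_to_start(client, std_volume_name, snapshot_name=snap2.name)

# ... then delete the first backup while the second is still running.
# The lock defers the deletion instead of disturbing the running backup.
backup_volume = client.by_id_backupVolume(bv.name)
backup_volume.backupDelete(name=b1.name)

wait_for_backup_completion(client, std_volume_name, snap2.name,
                           retry_count=600)
wait_for_backup_delete(client, std_volume_name, b1.name)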
def test_backup_lock_deletion_during_restoration(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod_make)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_backup_lock_deletion_during_restoration(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod_make): # NOQA """ Test backup locks Context: To test the locking mechanism that utilizes the backupstore, to prevent the following case of concurrent operations. - prevent backup deletion during backup restoration steps: 1. Create a volume, then create the corresponding PV, PVC and Pod. 2. Wait for the pod running and the volume healthy. 3. Write data to the pod volume and get the md5sum. 4. Take a backup. 5. Wait for the backup to be completed. 6. Start backup restoration for the backup creation. 7. Wait for restoration to be in progress. 8. Delete the backup from the backup store. 9. Wait for the restoration to be completed. 10. Assert the data from the restored volume with md5sum. 11. Assert the backup count in the backup store with 0. """ common.update_setting(client, common.SETTING_DEGRADED_AVAILABILITY, "false") backupstore_cleanup(client) std_volume_name = volume_name + "-std" restore_volume_name = volume_name + "-restore" _, _, _, std_md5sum = \ prepare_pod_with_data_in_mb( client, core_api, csi_pv, pvc, pod_make, std_volume_name, data_size_in_mb=DATA_SIZE_IN_MB_2) std_volume = client.by_id_volume(std_volume_name) snap1 = create_snapshot(client, std_volume_name) std_volume.snapshotBackup(name=snap1.name) wait_for_backup_completion(client, std_volume_name, snap1.name) bv, b = find_backup(client, std_volume_name, snap1.name) client.create_volume(name=restore_volume_name, fromBackup=b.url, numberOfReplicas=3, dataEngine=DATA_ENGINE) wait_for_volume_restoration_start(client, restore_volume_name, b.name) backup_volume = client.by_id_backupVolume(bv.name) backup_volume.backupDelete(name=b.name) wait_for_volume_restoration_completed(client, restore_volume_name) wait_for_backup_delete(client, std_volume_name, b.name) restore_volume = wait_for_volume_detached(client, restore_volume_name) assert len(restore_volume.replicas) == 3 restore_pod_name = restore_volume_name + "-pod" restore_pv_name = restore_volume_name + "-pv" restore_pvc_name = restore_volume_name + "-pvc" restore_pod = pod_make(name=restore_pod_name) create_pv_for_volume(client, core_api, restore_volume, restore_pv_name) create_pvc_for_volume(client, core_api, restore_volume, restore_pvc_name) restore_pod['spec']['volumes'] = [create_pvc_spec(restore_pvc_name)] create_and_wait_pod(core_api, restore_pod) restore_volume = client.by_id_volume(restore_volume_name) assert restore_volume[VOLUME_FIELD_ROBUSTNESS] == VOLUME_ROBUSTNESS_HEALTHY md5sum = get_pod_data_md5sum(core_api, restore_pod_name, "/data/test") assert std_md5sum == md5sum try: _, b = common.find_backup(client, std_volume_name, snap1.name) except AssertionError: b = None assert b is None
Test backup locks
Context:
To test the locking mechanism that utilizes the backupstore, to prevent the following case of concurrent operations.
- prevent backup deletion during backup restoration
Steps:
- Create a volume, then create the corresponding PV, PVC and Pod.
- Wait for the pod running and the volume healthy.
- Write data to the pod volume and get the md5sum.
- Take a backup.
- Wait for the backup to be completed.
- Start the restoration of the backup.
- Wait for the restoration to be in progress.
- Delete the backup from the backup store.
- Wait for the restoration to be completed.
- Assert the data from the restored volume with the md5sum.
- Assert the backup count in the backup store is 0.
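Steps 6-9, reflowed from the collapsed source above (bv, b and restore_volume_name come from the earlier steps):

# Kick off a restore from the backup and wait until it has started ...
client.create_volume(name=restore_volume_name, fromBackup=b.url,
                     numberOfReplicas=3, dataEngine=DATA_ENGINE)
wait_for_volume_restoration_start(client, restore_volume_name, b.name)

# ... then delete the backup mid-restore. The restoration lock defers
# the deletion until the restore completes, so the restore still succeeds.
backup_volume = client.by_id_backupVolume(bv.name)
backup_volume.backupDelete(name=b.name)

wait_for_volume_restoration_completed(client, restore_volume_name)
wait_for_backup_delete(client, std_volume_name, b.name)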
def test_backup_lock_restoration_during_deletion(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod_make)
-
Expand source code
@pytest.mark.skip(reason="This test takes more than 20 mins to run") # NOQA def test_backup_lock_restoration_during_deletion(set_random_backupstore, client, core_api, volume_name, csi_pv, pvc, pod_make): # NOQA """ Test backup locks Context: To test the locking mechanism that utilizes the backupstore, to prevent the following case of concurrent operations. - prevent backup restoration during backup deletion steps: 1. Create a volume, then create the corresponding PV, PVC and Pod. 2. Wait for the pod running and the volume healthy. 3. Write data to the pod volume and get the md5sum. 4. Take a backup. 5. Wait for the backup to be completed. 6. Write more data (1.5 Gi) to the volume and take another backup. 7. Wait for the 2nd backup to be completed. 8. Delete the 2nd backup. 9. Without waiting for the backup deletion completion, restore the 1st backup from the backup store. 10. Verify the restored volume become faulted. 11. Wait for the 2nd backup deletion and assert the count of the backups with 1 in the backup store. """ backupstore_cleanup(client) std_volume_name = volume_name + "-std" restore_volume_name = volume_name + "-restore" std_pod_name, _, _, std_md5sum1 = \ prepare_pod_with_data_in_mb( client, core_api, csi_pv, pvc, pod_make, std_volume_name, volume_size=str(3*Gi), data_size_in_mb=DATA_SIZE_IN_MB_1) std_volume = client.by_id_volume(std_volume_name) snap1 = create_snapshot(client, std_volume_name) std_volume.snapshotBackup(name=snap1.name) wait_for_backup_completion(client, std_volume_name, snap1.name) std_volume.snapshotBackup(name=snap1.name) backup_volume, b1 = find_backup(client, std_volume_name, snap1.name) write_pod_volume_random_data(core_api, std_pod_name, "/data/test2", 1500) snap2 = create_snapshot(client, std_volume_name) std_volume.snapshotBackup(name=snap2.name) wait_for_backup_completion(client, std_volume_name, snap2.name, retry_count=1200) _, b2 = find_backup(client, std_volume_name, snap2.name) backup_volume.backupDelete(name=b2.name) client.create_volume(name=restore_volume_name, fromBackup=b1.url, dataEngine=DATA_ENGINE) wait_for_volume_detached(client, restore_volume_name) restore_volume = client.by_id_volume(restore_volume_name) assert restore_volume[VOLUME_FIELD_ROBUSTNESS] == VOLUME_ROBUSTNESS_FAULTED wait_for_backup_delete(client, volume_name, b2.name) _, b1 = find_backup(client, std_volume_name, snap1.name) assert b1 is not None try: _, b2 = find_backup(client, std_volume_name, snap2.name) except AssertionError: b2 = None assert b2 is None
Test backup locks
Context:
To test the locking mechanism that utilizes the backupstore, to prevent the following case of concurrent operations.
- prevent backup restoration during backup deletion
Steps:
- Create a volume, then create the corresponding PV, PVC and Pod.
- Wait for the pod running and the volume healthy.
- Write data to the pod volume and get the md5sum.
- Take a backup.
- Wait for the backup to be completed.
- Write more data (1.5 Gi) to the volume and take another backup.
- Wait for the 2nd backup to be completed.
- Delete the 2nd backup.
- Without waiting for the deletion to complete, restore the 1st backup from the backup store.
- Verify the restored volume becomes faulted.
- Wait for the 2nd backup deletion and assert that the backup store contains exactly 1 backup.
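Steps 8-10, reflowed from the collapsed source above (backup_volume, b1, b2 and restore_volume_name come from the earlier steps):

# Delete the 2nd backup, then restore the 1st one without waiting for
# the deletion to finish. The deletion lock makes the restore fail, so
# the restored volume ends up detached and faulted.
backup_volume.backupDelete(name=b2.name)
client.create_volume(name=restore_volume_name, fromBackup=b1.url,
                     dataEngine=DATA_ENGINE)
wait_for_volume_detached(client, restore_volume_name)

restore_volume = client.by_id_volume(restore_volume_name)
assert restore_volume[VOLUME_FIELD_ROBUSTNESS] == VOLUME_ROBUSTNESS_FAULTED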
def test_backup_metadata_deletion(set_random_backupstore, client, core_api, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_backup_metadata_deletion(set_random_backupstore, client, core_api, volume_name): # NOQA """ Test backup metadata deletion Context: We want to be able to delete the metadata (.cfg) files, even if they are corrupt or in a bad state (missing volume.cfg). Setup: 1. Setup minio as S3 backupstore 2. Cleanup backupstore Steps: 1. Create volume(1,2) and attach to the current node 2. write some data to volume(1,2) 3. Create backup(1,2) of volume(1,2) 4. request a backup list 5. verify backup list contains no error messages for volume(1,2) 6. verify backup list contains backup(1,2) information for volume(1,2) 7. delete backup(1) of volume(1,2) 8. request a backup list 9. verify backup list contains no error messages for volume(1,2) 10. verify backup list only contains backup(2) information for volume(1,2) 11. delete volume.cfg of volume(2) 12. request backup volume deletion for volume(2) 13. verify that volume(2) has been deleted in the backupstore. 14. request a backup list 15. verify backup list only contains volume(1) and no errors 16. verify backup list only contains backup(2) information for volume(1) 17. delete backup volume(1) 18. verify that volume(1) has been deleted in the backupstore. 19. cleanup """ backup_store_type = set_random_backupstore if backup_store_type not in ["nfs", "s3"]: pytest.skip("Skip test case because the backup store type is not supported") # NOQA backupstore_cleanup(client) volume1_name = volume_name + "-1" volume2_name = volume_name + "-2" host_id = get_self_host_id() volume1 = create_and_check_volume(client, volume1_name) volume2 = create_and_check_volume(client, volume2_name) volume1.attach(hostId=host_id) volume2.attach(hostId=host_id) volume1 = wait_for_volume_healthy(client, volume1_name) volume2 = wait_for_volume_healthy(client, volume2_name) v1bv, v1b1, _, _ = create_backup(client, volume1_name) v2bv, v2b1, _, _ = create_backup(client, volume2_name) _, v1b2, _, _ = create_backup(client, volume1_name) _, v2b2, _, _ = create_backup(client, volume2_name) for i in range(RETRY_COUNTS): found1 = found2 = found3 = found4 = False bvs = client.list_backupVolume() for bv in bvs: backups = bv.backupList() for b in backups: if b.name == v1b1.name: found1 = True elif b.name == v1b2.name: found2 = True elif b.name == v2b1.name: found3 = True elif b.name == v2b2.name: found4 = True if found1 & found2 & found3 & found4: break time.sleep(RETRY_INTERVAL) assert found1 & found2 & found3 & found4 v1b2_new = v1bv.backupGet(name=v1b2.name) assert_backup_state(v1b2, v1b2_new) v2b1_new = v2bv.backupGet(name=v2b1.name) assert_backup_state(v2b1, v2b1_new) v2b2_new = v2bv.backupGet(name=v2b2.name) assert_backup_state(v2b2, v2b2_new) delete_backup(client, v1bv, v1b1.name) delete_backup(client, v2bv, v2b1.name) for i in range(RETRY_COUNTS): found1 = found2 = found3 = found4 = False bvs = client.list_backupVolume() for bv in bvs: backups = bv.backupList() for b in backups: if b.name == v1b1.name: found1 = True elif b.name == v1b2.name: found2 = True elif b.name == v2b1.name: found3 = True elif b.name == v2b2.name: found4 = True if (not found1) & found2 & (not found3) & found4: break time.sleep(RETRY_INTERVAL) assert (not found1) & found2 & (not found3) & found4 assert len(v1bv.backupList()) == 1 assert len(v2bv.backupList()) == 1 assert v1bv.backupList()[0].name == v1b2.name assert v2bv.backupList()[0].name == v2b2.name backupstore_delete_volume_cfg_file(client, core_api, volume2_name) delete_backup(client, v2bv, v2b2.name) assert 
len(v2bv.backupList()) == 0 delete_backup_volume(client, v2bv.name) for i in range(RETRY_COUNTS): if backupstore_count_backup_block_files(client, core_api, volume2_name) == 0: break time.sleep(RETRY_INTERVAL) assert backupstore_count_backup_block_files(client, core_api, volume2_name) == 0 for i in range(RETRY_COUNTS): bvs = client.list_backupVolume() found1 = found2 = found3 = found4 = False for bv in bvs: backups = bv.backupList() for b in backups: if b.name == v1b1.name: found1 = True elif b.name == v1b2.name: found2 = True elif b.name == v2b1.name: found3 = True elif b.name == v2b2.name: found4 = True if (not found1) & found2 & (not found3) & (not found4): break time.sleep(RETRY_INTERVAL) assert (not found1) & found2 & (not found3) & (not found4) v1b2_new = v1bv.backupGet(name=v1b2.name) assert_backup_state(v1b2, v1b2_new) assert v1b2_new.messages == v1b2.messages is None delete_backup(client, v1bv, v1b2.name) for i in range(RETRY_COUNTS): if backupstore_count_backup_block_files(client, core_api, volume1_name) == 0: break assert backupstore_count_backup_block_files(client, core_api, volume1_name) == 0 for i in range(RETRY_COUNTS): found1 = found2 = found3 = found4 = False bvs = client.list_backupVolume() for bv in bvs: backups = bv.backupList() for b in backups: if b.name == v1b1.name: found1 = True elif b.name == v1b2.name: found2 = True elif b.name == v2b1.name: found3 = True elif b.name == v2b2.name: found4 = True if (not found1) & (not found2) & (not found3) & (not found4): break time.sleep(RETRY_INTERVAL) assert (not found1) & (not found2) & (not found3) & (not found4)
Test backup metadata deletion
Context:
We want to be able to delete the metadata (.cfg) files, even if they are corrupt or in a bad state (missing volume.cfg).
Setup:
- Setup minio as S3 backupstore
- Cleanup backupstore
Steps:
- Create volume(1,2) and attach to the current node
- write some data to volume(1,2)
- Create backup(1,2) of volume(1,2)
- request a backup list
- verify backup list contains no error messages for volume(1,2)
- verify backup list contains backup(1,2) information for volume(1,2)
- delete backup(1) of volume(1,2)
- request a backup list
- verify backup list contains no error messages for volume(1,2)
- verify backup list only contains backup(2) information for volume(1,2)
- delete volume.cfg of volume(2)
- request backup volume deletion for volume(2)
- verify that volume(2) has been deleted in the backupstore.
- request a backup list
- verify backup list only contains volume(1) and no errors
- verify backup list only contains backup(2) information for volume(1)
- delete backup volume(1)
- verify that volume(1) has been deleted in the backupstore.
- cleanup
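Steps 11-13 cover the corrupt-metadata path, reflowed from the collapsed source above (v2bv and volume2_name come from the earlier steps):

# Remove volume(2)'s volume.cfg, then delete its remaining backup and
# the backup volume itself: both must succeed despite the missing cfg.
backupstore_delete_volume_cfg_file(client, core_api, volume2_name)
delete_backup(client, v2bv, v2b2.name)
assert len(v2bv.backupList()) == 0
delete_backup_volume(client, v2bv.name)

# Afterwards no backup blocks may remain for volume(2).
for _ in range(RETRY_COUNTS):
    if backupstore_count_backup_block_files(client, core_api,
                                            volume2_name) == 0:
        break
    time.sleep(RETRY_INTERVAL)
assert backupstore_count_backup_block_files(client, core_api,
                                            volume2_name) == 0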
def test_backup_status_for_unavailable_replicas(set_random_backupstore, client, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_backup_status_for_unavailable_replicas(set_random_backupstore, client, volume_name): # NOQA """ Test backup status for unavailable replicas Context: We want to make sure that during the backup creation, once the responsible replica gone, the backup should in Error state and with the error message. Setup: 1. Create a volume and attach to the current node 2. Run the test for all the available backupstores Steps: 1. Create a backup of volume 2. Find the replica for that backup 3. Disable scheduling on the node of that replica 4. Delete the replica 5. Verify backup status with Error state and with an error message 6. Create a new backup 7. Verify new backup was successful 8. Cleanup (delete backups, delete volume) """ backup_status_for_unavailable_replicas_test( client, volume_name, size=3 * Gi)
Test backup status for unavailable replicas
Context:
We want to make sure that during backup creation, once the responsible replica is gone, the backup goes into an Error state with an error message.
Setup:
- Create a volume and attach to the current node
- Run the test for all the available backupstores
Steps:
- Create a backup of volume
- Find the replica for that backup
- Disable scheduling on the node of that replica
- Delete the replica
- Verify backup status with Error state and with an error message
- Create a new backup
- Verify new backup was successful
- Cleanup (delete backups, delete volume)
def test_backup_volume_list(set_random_backupstore, client, core_api)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_backup_volume_list(set_random_backupstore, client, core_api): # NOQA """ Test backup volume list Context: We want to make sure that an error when listing a single backup volume does not stop us from listing all the other backup volumes. Otherwise a single faulty backup can block the retrieval of all known backup volumes. Setup: 1. Setup minio as S3 backupstore Steps: 1. Create a volume(1,2) and attach to the current node 2. write some data to volume(1,2) 3. Create a backup(1) of volume(1,2) 4. request a backup list 5. verify backup list contains no error messages for volume(1,2) 6. verify backup list contains backup(1) for volume(1,2) 7. place a file named "backup_1234@failure.cfg" into the backups folder of volume(1) 8. request a backup list 9. verify backup list contains no error messages for volume(1,2) 10. verify backup list contains backup(1) for volume(1,2) 11. delete backup volumes(1 & 2) 12. cleanup """ backup_store_type = set_random_backupstore if backup_store_type not in ["nfs", "s3"]: pytest.skip("Skip test case because the backup store type is not supported") # NOQA backupstore_cleanup(client) # create 2 volumes. volume1_name, volume2_name = generate_volume_name(), generate_volume_name() volume1 = create_and_check_volume(client, volume1_name) volume2 = create_and_check_volume(client, volume2_name) host_id = get_self_host_id() volume1 = volume1.attach(hostId=host_id) volume1 = common.wait_for_volume_healthy(client, volume1_name) volume2 = volume2.attach(hostId=host_id) volume2 = common.wait_for_volume_healthy(client, volume2_name) bv1, backup1, snap1, _ = create_backup(client, volume1_name) bv2, backup2, snap2, _ = create_backup(client, volume2_name) def verify_no_err(): ''' request a backup list verify backup list contains no error messages for volume(1,2) verify backup list contains backup(1) for volume(1,2) ''' for _ in range(RETRY_COUNTS): verified_bvs = set() backup_volume_list = client.list_backupVolume() for bv in backup_volume_list: if bv.name in (bv1.name, bv2.name): assert not bv['messages'] for b in bv.backupList().data: if bv.name == bv1.name \ and b.name == backup1.name \ or bv.name == bv2.name \ and b.name == backup2.name: verified_bvs.add(bv.name) if len(verified_bvs) == 2: break time.sleep(RETRY_INTERVAL) assert len(verified_bvs) == 2 verify_no_err() # place a bad named file into the backups folder of volume(1) prefix = \ backupstore_get_backup_volume_prefix(client, volume1_name) + "/backups" backupstore_create_file(client, core_api, prefix + "/backup_1234@failure.cfg") verify_no_err() backupstore_delete_file(client, core_api, prefix + "/backup_1234@failure.cfg") backupstore_cleanup(client)
Test backup volume list
Context:
We want to make sure that an error when listing a single backup volume does not stop us from listing all the other backup volumes. Otherwise a single faulty backup can block the retrieval of all known backup volumes.
Setup:
- Setup minio as S3 backupstore
Steps:
- Create volume(1,2) and attach to the current node
- write some data to volume(1,2)
- Create a backup(1) of volume(1,2)
- request a backup list
- verify backup list contains no error messages for volume(1,2)
- verify backup list contains backup(1) for volume(1,2)
- place a file named "backup_1234@failure.cfg" into the backups folder of volume(1)
- request a backup list
- verify backup list contains no error messages for volume(1,2)
- verify backup list contains backup(1) for volume(1,2)
- delete backup volumes(1 & 2)
- cleanup
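The fault injection in steps 7-10, reflowed from the collapsed source above (volume1_name and the verify_no_err list/assert helper are defined earlier in the test):

# Drop a malformed backup cfg into volume(1)'s backups folder. Listing
# must still succeed for every backup volume, with no error messages.
prefix = backupstore_get_backup_volume_prefix(client,
                                              volume1_name) + "/backups"
backupstore_create_file(client, core_api,
                        prefix + "/backup_1234@failure.cfg")
verify_no_err()
backupstore_delete_file(client, core_api,
                        prefix + "/backup_1234@failure.cfg")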
def test_backup_volume_restore_with_access_mode(core_api,
set_random_backupstore,
client,
access_mode,
overridden_restored_access_mode)-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.parametrize( "access_mode,overridden_restored_access_mode", [ pytest.param("rwx", "rwo"), pytest.param("rwo", "rwx") ], ) def test_backup_volume_restore_with_access_mode(core_api, # NOQA set_random_backupstore, # NOQA client, # NOQA access_mode, # NOQA overridden_restored_access_mode): # NOQA """ Test the backup w/ the volume access mode, then restore a volume w/ the original access mode or being overridden. 1. Prepare a healthy volume 2. Create a backup for the volume 3. Restore a volume from the backup w/o specifying the access mode => Validate the access mode should be the same the volume 4. Restore a volume from the backup w/ specifying the access mode => Validate the access mode should be the same as the specified """ # Step 1 test_volume_name = generate_volume_name() client.create_volume(name=test_volume_name, size=str(DEFAULT_VOLUME_SIZE * Gi), numberOfReplicas=2, accessMode=access_mode, migratable=True if access_mode == "rwx" else False, dataEngine=DATA_ENGINE) wait_for_volume_creation(client, test_volume_name) volume = wait_for_volume_detached(client, test_volume_name) volume.attach(hostId=common.get_self_host_id()) volume = common.wait_for_volume_healthy(client, test_volume_name) # Step 2 _, b, _, _ = create_backup(client, test_volume_name) # Step 3 volume_name_ori_access_mode = test_volume_name + '-default-access' client.create_volume(name=volume_name_ori_access_mode, size=str(DEFAULT_VOLUME_SIZE * Gi), numberOfReplicas=2, fromBackup=b.url, dataEngine=DATA_ENGINE) volume_ori_access_mode = client.by_id_volume(volume_name_ori_access_mode) assert volume_ori_access_mode.accessMode == access_mode # Step 4 volume_name_sp_access_mode = test_volume_name + '-specified-access' client.create_volume(name=volume_name_sp_access_mode, size=str(DEFAULT_VOLUME_SIZE * Gi), numberOfReplicas=2, accessMode=overridden_restored_access_mode, fromBackup=b.url, dataEngine=DATA_ENGINE) volume_sp_access_mode = client.by_id_volume(volume_name_sp_access_mode) assert volume_sp_access_mode.accessMode == overridden_restored_access_mode
Test backup with the volume access mode, then restore a volume with the original access mode or an overridden one.
- Prepare a healthy volume
- Create a backup for the volume
- Restore a volume from the backup w/o specifying the access mode => Validate the access mode is the same as the original volume's
- Restore a volume from the backup w/ specifying the access mode => Validate the access mode is the one specified
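Steps 3-4, reflowed from the collapsed source above (b is the backup created in step 2; the volume names are the source's own suffixes):

# Restore without specifying an access mode: it is inherited from the
# backed-up volume.
volume_name_ori_access_mode = test_volume_name + '-default-access'
client.create_volume(name=volume_name_ori_access_mode,
                     size=str(DEFAULT_VOLUME_SIZE * Gi),
                     numberOfReplicas=2, fromBackup=b.url,
                     dataEngine=DATA_ENGINE)
assert client.by_id_volume(volume_name_ori_access_mode).accessMode \
    == access_mode

# Restore again, this time overriding the access mode explicitly.
volume_name_sp_access_mode = test_volume_name + '-specified-access'
client.create_volume(name=volume_name_sp_access_mode,
                     size=str(DEFAULT_VOLUME_SIZE * Gi),
                     numberOfReplicas=2,
                     accessMode=overridden_restored_access_mode,
                     fromBackup=b.url, dataEngine=DATA_ENGINE)
assert client.by_id_volume(volume_name_sp_access_mode).accessMode \
    == overridden_restored_access_mode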
def test_backuptarget_available_during_engine_image_not_ready(client, apps_api)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_backuptarget_available_during_engine_image_not_ready(client, apps_api): # NOQA """ Test backup target available during engine image not ready 1. Set backup target URL to S3 and NFS respectively 2. Set poll interval to 0 and 300 respectively 3. Scale down the engine image DaemonSet 4. Check engine image in deploying state 5. Configures backup target during engine image in not ready state 6. Check backup target status.available=false 7. Scale up the engine image DaemonSet 8. Check backup target status.available=true 9. Reset backup target setting 10. Check backup target status.available=false """ backupstores = common.get_backupstore_url() for backupstore in backupstores: url = "" cred_secret = "" if common.is_backupTarget_s3(backupstore): backupsettings = backupstore.split("$") url = backupsettings[0] cred_secret = backupsettings[1] elif common.is_backupTarget_nfs(backupstore): url = backupstore cred_secret = "" else: pytest.skip("Skip test case because the backup store type is not supported") # NOQA poll_intervals = ["0", "300"] for poll_interval in poll_intervals: set_backupstore_poll_interval(client, poll_interval) default_img = common.get_default_engine_image(client) ds_name = "engine-image-" + default_img.name # Scale down the engine image DaemonSet daemonset = apps_api.read_namespaced_daemon_set( name=ds_name, namespace='longhorn-system') daemonset.spec.template.spec.node_selector = {'foo': 'bar'} apps_api.patch_namespaced_daemon_set( name=ds_name, namespace='longhorn-system', body=daemonset) # Check engine image in deploying state common.wait_for_engine_image_state( client, default_img.name, "deploying") deploying = False for _ in range(RETRY_COUNTS): default_img = client.by_id_engine_image(default_img.name) if not any(default_img.nodeDeploymentMap.values()): deploying = True break time.sleep(RETRY_INTERVAL) assert deploying is True # Set valid backup target during # the engine image in not ready state set_backupstore_url(client, url) set_backupstore_credential_secret(client, cred_secret) common.wait_for_backup_target_available(client, False) # Scale up the engine image DaemonSet scale_up_engine_image_daemonset(client) common.wait_for_backup_target_available(client, True) # Sleep 1 second to prevent the same time # of BackupTarget CR spec.SyncRequestedAt time.sleep(1) # Reset backup store setting reset_backupstore_setting(client) common.wait_for_backup_target_available(client, False)
Test backup target available during engine image not ready
- Set backup target URL to S3 and NFS respectively
- Set poll interval to 0 and 300 respectively
- Scale down the engine image DaemonSet
- Check engine image in deploying state
- Configure the backup target while the engine image is in a not-ready state
- Check backup target status.available=false
- Scale up the engine image DaemonSet
- Check backup target status.available=true
- Reset backup target setting
- Check backup target status.available=false
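The scale-down trick and the availability checks, reflowed from the collapsed source above (url and cred_secret come from the backupstore parsing at the top of the test):

# "Scale down" the engine image DaemonSet by patching in a node
# selector that matches no node, so its pods terminate everywhere.
default_img = common.get_default_engine_image(client)
ds_name = "engine-image-" + default_img.name
daemonset = apps_api.read_namespaced_daemon_set(
    name=ds_name, namespace='longhorn-system')
daemonset.spec.template.spec.node_selector = {'foo': 'bar'}
apps_api.patch_namespaced_daemon_set(
    name=ds_name, namespace='longhorn-system', body=daemonset)

# While the engine image is not ready, a freshly configured backup
# target must report available=false; once the DaemonSet is scaled
# back up it becomes available.
set_backupstore_url(client, url)
set_backupstore_credential_secret(client, cred_secret)
common.wait_for_backup_target_available(client, False)
scale_up_engine_image_daemonset(client)
common.wait_for_backup_target_available(client, True)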
def test_backuptarget_invalid(apps_api,
client,
core_api,
backupstore_invalid,
make_deployment_with_pvc,
pvc_name,
request,
volume_name)-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_backuptarget_invalid(apps_api, # NOQA client, # NOQA core_api, # NOQA backupstore_invalid, # NOQA make_deployment_with_pvc, # NOQA pvc_name, # NOQA request, # NOQA volume_name): # NOQA """ Related issue : https://github.com/longhorn/longhorn/issues/1249 This test case does not cover the UI test mentioned in the related issue's test steps." Setup - Give an incorrect value to Backup target. Given - Create a volume, attach it to a workload, write data into the volume. When - Create a backup by a manifest yaml file Then - Backup will be failed and the backup state is Error. - Backup target will be unavailable with an explanatory condition. """ volume = create_and_check_volume(client, volume_name) pvc_name = volume_name + "-pvc" create_pv_for_volume(client, core_api, volume, volume_name) create_pvc_for_volume(client, core_api, volume, pvc_name) deployment_name = volume_name + "-dep" deployment = make_deployment_with_pvc(deployment_name, pvc_name) create_and_wait_deployment(apps_api, deployment) pod_names = common.get_deployment_pod_names(core_api, deployment) write_pod_volume_random_data(core_api, pod_names[0], "/data/test", DATA_SIZE_IN_MB_1) volume = client.by_id_volume(volume_name) snap = create_snapshot(client, volume_name) try: volume.snapshotBackup(name=snap.name) except longhorn.ApiError as e: assert "backup target default is not available" in e.error.message # In https://github.com/longhorn/longhorn/issues/8210, the backup target # could not properly reconcile when the URL or secret were "broken", and # the condition/message was not updated. Verify the condition/message is # correct under these conditions. backup_targets = client.list_backup_target() assert not backup_targets[0]["available"] # If the condition/message was not updated, it is likely still this one. assert backup_targets[0]["message"] != BACKUP_TARGET_MESSAGE_EMPTY_URL # Depending on the exact nature of the failure, we expect one of these # strings to be in the condition/message. assert any(message in backup_targets[0]["message"] for message in BACKUP_TARGET_MESSAGES_INVALID)
Related issue: https://github.com/longhorn/longhorn/issues/1249
This test case does not cover the UI test mentioned in the related issue's test steps.
Setup - Give an incorrect value to Backup target.
Given - Create a volume, attach it to a workload, write data into the volume.
When - Create a backup by a manifest yaml file
Then - The backup will fail and the backup state is Error. - The backup target will be unavailable with an explanatory condition.
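The Then-clause checks, reflowed from the collapsed source above (volume and BACKUP_TARGET_MESSAGES_INVALID are defined earlier in the test and module):

# With a broken backup target, the backup request is rejected outright.
snap = create_snapshot(client, volume_name)
try:
    volume.snapshotBackup(name=snap.name)
except longhorn.ApiError as e:
    assert "backup target default is not available" in e.error.message

# The target must be unavailable, and its condition message must name
# one of the known invalid-target causes.
backup_targets = client.list_backup_target()
assert not backup_targets[0]["available"]
assert any(message in backup_targets[0]["message"]
           for message in BACKUP_TARGET_MESSAGES_INVALID)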
def test_cleanup_system_generated_snapshots(client, core_api, volume_name, csi_pv, pvc, pod_make)
-
Expand source code
def test_cleanup_system_generated_snapshots(client, core_api, volume_name, csi_pv, pvc, pod_make): # NOQA """ Test Cleanup System Generated Snapshots 1. Enabled 'Auto Cleanup System Generated Snapshot'. 2. Create a volume and attach it to a node. 3. Write some data to the volume and get the checksum of the data. 4. Delete a random replica to trigger a system generated snapshot. 5. Repeat Step 3 for 3 times, and make sure only one snapshot is left. 6. Check the data with the saved checksum. """ pod_name, _, _, md5sum1 = \ prepare_pod_with_data_in_mb( client, core_api, csi_pv, pvc, pod_make, volume_name) volume = client.by_id_volume(volume_name) for i in range(3): replica_name = volume["replicas"][i]["name"] volume.replicaRemove(name=replica_name) wait_for_volume_degraded(client, volume_name) wait_for_volume_healthy(client, volume_name) volume = client.by_id_volume(volume_name) # For the below assertion, the number of snapshots is compared with 2 # as the list of snapshot have the volume-head too. assert len(volume.snapshotList()) == 2 read_md5sum1 = get_pod_data_md5sum(core_api, pod_name, "/data/test") assert md5sum1 == read_md5sum1
Test Cleanup System Generated Snapshots
- Enable 'Auto Cleanup System Generated Snapshot'.
- Create a volume and attach it to a node.
- Write some data to the volume and get the checksum of the data.
- Delete a random replica to trigger a system generated snapshot.
- Repeat the previous step 3 times, and make sure only one snapshot is left.
- Check the data with the saved checksum.
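The deletion loop and the snapshot-count check, reflowed from the collapsed source above:

# Each replica removal triggers a rebuild, and each rebuild creates a
# system-generated snapshot; with auto-cleanup enabled only the newest
# one may survive.
volume = client.by_id_volume(volume_name)
for i in range(3):
    replica_name = volume["replicas"][i]["name"]
    volume.replicaRemove(name=replica_name)
    wait_for_volume_degraded(client, volume_name)
    wait_for_volume_healthy(client, volume_name)
    volume = client.by_id_volume(volume_name)

# snapshotList() includes volume-head, hence 2 rather than 1.
assert len(volume.snapshotList()) == 2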
def test_default_storage_class_syncup(core_api, request)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_default_storage_class_syncup(core_api, request): # NOQA """ Steps: 1. Record the current Longhorn-StorageClass-related ConfigMap `longhorn-storageclass`. 2. Modify the default Longhorn StorageClass `longhorn`. e.g., update `reclaimPolicy` from `Delete` to `Retain`. 3. Verify that the change is reverted immediately and the manifest is the same as the record in ConfigMap `longhorn-storageclass`. 4. Delete the default Longhorn StorageClass `longhorn`. 5. Verify that the StorageClass is recreated immediately with the manifest the same as the record in ConfigMap `longhorn-storageclass`. 6. Modify the content of ConfigMap `longhorn-storageclass`. 7. Verify that the modifications will be applied to the default Longhorn StorageClass `longhorn` immediately. 8. Revert the modifications of the ConfigMaps. Then wait for the StorageClass sync-up. """ def edit_configmap_allow_vol_exp(allow_exp): """ allow_exp : bool, set allowVolumeExpansion """ config_map = core_api.read_namespaced_config_map( "longhorn-storageclass", LONGHORN_NAMESPACE) config_map_data = yaml.safe_load(config_map.data["storageclass.yaml"]) config_map_data["allowVolumeExpansion"] = allow_exp config_map.data["storageclass.yaml"] = yaml.dump(config_map_data) core_api.patch_namespaced_config_map("longhorn-storageclass", LONGHORN_NAMESPACE, config_map) for i in range(RETRY_COMMAND_COUNT): try: longhorn_storage_class = storage_api.read_storage_class( "longhorn") assert longhorn_storage_class.allow_volume_expansion is allow_exp # NOQA break except Exception as e: print(e) finally: time.sleep(RETRY_INTERVAL) assert longhorn_storage_class.allow_volume_expansion is allow_exp def finalizer(): edit_configmap_allow_vol_exp(True) request.addfinalizer(finalizer) # step 1 storage_api = common.get_storage_api_client() config_map = core_api.read_namespaced_config_map("longhorn-storageclass", LONGHORN_NAMESPACE) config_map_data = yaml.safe_load(config_map.data["storageclass.yaml"]) # step 2 longhorn_storage_class = storage_api.read_storage_class("longhorn") storage_class_data = \ yaml.safe_load(longhorn_storage_class. metadata. annotations["longhorn.io/last-applied-configmap"]) storage_class_data["reclaimPolicy"] = "Retain" longhorn_storage_class.\ metadata.annotations["longhorn.io/last-applied-configmap"] =\ yaml.dump(storage_class_data) storage_api.patch_storage_class("longhorn", longhorn_storage_class) # step 3 for i in range(RETRY_COMMAND_COUNT): try: longhorn_storage_class = storage_api.read_storage_class("longhorn") storage_class_data = \ yaml.safe_load(longhorn_storage_class. metadata. annotations["longhorn.io/last-applied-configmap"]) # NOQA if storage_class_data["reclaimPolicy"] == \ config_map_data["reclaimPolicy"]: break except Exception as e: print(e) finally: time.sleep(RETRY_INTERVAL) assert storage_class_data["reclaimPolicy"] == \ config_map_data["reclaimPolicy"] # step 4 for i in range(RETRY_COMMAND_COUNT): try: storage_api.delete_storage_class("longhorn") for item in storage_api.list_storage_class().items: assert item.metadata.name != "longhorn" break except Exception as e: print(e) finally: time.sleep(RETRY_INTERVAL) # step 5 storage_class_recreated = False for i in range(RETRY_COMMAND_COUNT): for item in storage_api.list_storage_class().items: if item.metadata.name == "longhorn": storage_class_recreated = True break time.sleep(RETRY_INTERVAL) assert storage_class_recreated is True # step 6, 7 edit_configmap_allow_vol_exp(False) # step 8 in finalizer
Steps:
- Record the current Longhorn-StorageClass-related ConfigMap longhorn-storageclass.
- Modify the default Longhorn StorageClass longhorn, e.g., update reclaimPolicy from Delete to Retain.
- Verify that the change is reverted immediately and the manifest is the same as the record in ConfigMap longhorn-storageclass.
- Delete the default Longhorn StorageClass longhorn.
- Verify that the StorageClass is recreated immediately with the manifest the same as the record in ConfigMap longhorn-storageclass.
- Modify the content of ConfigMap longhorn-storageclass.
- Verify that the modifications are applied to the default Longhorn StorageClass longhorn immediately.
- Revert the modifications of the ConfigMap, then wait for the StorageClass sync-up.
def test_delete_backup_during_restoring_volume(set_random_backupstore, client)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_delete_backup_during_restoring_volume(set_random_backupstore, client): # NOQA """ Test delete backup during restoring volume Context: The volume robustness should be faulted if the backup was deleted during restoring the volume. 1. Given create volume v1 and attach to a node And write data 150M to volume v1 2. When create a backup of volume v1 And wait for that backup is completed And restore a volume v2 from volume v1 backup And delete the backup immediately 3. Then volume v2 "robustness" should be "faulted" And "status" of volume restore condition should be "False", And "reason" of volume restore condition should be "RestoreFailure" """ # Step 1 vol_v1_name = "vol-v1" vol_v1 = create_and_check_volume(client, vol_v1_name, size=str(512 * Mi)) vol_v1.attach(hostId=get_self_host_id()) vol_v1 = wait_for_volume_healthy(client, vol_v1_name) vol_v1_endpoint = get_volume_endpoint(vol_v1) write_volume_dev_random_mb_data(vol_v1_endpoint, 1, 150) # Step 2 bv, b, snap2, data = create_backup(client, vol_v1_name) vol_v2_name = "vol-v2" client.create_volume(name=vol_v2_name, size=str(512 * Mi), numberOfReplicas=3, fromBackup=b.url, dataEngine=DATA_ENGINE) delete_backup(client, bv, b.name) volume = wait_for_volume_status(client, vol_v1_name, "lastBackup", "") assert volume.lastBackupAt == "" # Step 3 wait_for_volume_faulted(client, vol_v2_name) wait_for_volume_condition_restore(client, vol_v2_name, "status", "False") wait_for_volume_condition_restore(client, vol_v2_name, "reason", "RestoreFailure")
Test delete backup during restoring volume
Context:
The volume robustness should be faulted if the backup was deleted during restoring the volume.
- Given volume v1 is created and attached to a node, and 150 MB of data is written to volume v1
- When a backup of volume v1 is created and completed, volume v2 is restored from that backup, and the backup is deleted immediately
- Then volume v2 "robustness" should be "faulted", the "status" of the volume restore condition should be "False", and the "reason" of the volume restore condition should be "RestoreFailure"
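The When/Then core, reflowed from the collapsed source above (vol_v1_name and vol_v2_name are the source's own names):

# Restore v2 from the backup and delete the backup immediately, while
# the restore is still running.
bv, b, snap2, data = create_backup(client, vol_v1_name)
client.create_volume(name=vol_v2_name, size=str(512 * Mi),
                     numberOfReplicas=3, fromBackup=b.url,
                     dataEngine=DATA_ENGINE)
delete_backup(client, bv, b.name)

# v2 must fault, with the restore condition explaining the failure.
wait_for_volume_faulted(client, vol_v2_name)
wait_for_volume_condition_restore(client, vol_v2_name, "status", "False")
wait_for_volume_condition_restore(client, vol_v2_name,
                                  "reason", "RestoreFailure")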
def test_deleting_backup_volume(set_random_backupstore, client, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_deleting_backup_volume(set_random_backupstore, client, volume_name): # NOQA """ Test deleting backup volumes 1. Create volume and create backup 2. Delete the backup and make sure it's gone in the backupstore """ lht_host_id = get_self_host_id() volume = create_and_check_volume(client, volume_name) volume.attach(hostId=lht_host_id) volume = common.wait_for_volume_healthy(client, volume_name) create_backup(client, volume_name) bv, _, _, _ = create_backup(client, volume_name) delete_backup_volume(client, bv.name) cleanup_volume(client, volume)
Test deleting backup volumes
- Create volume and create backup
- Delete the backup and make sure it's gone in the backupstore
def test_dr_volume_activated_with_failed_replica(set_random_backupstore, client, core_api, volume_name)
-
Expand source code
def test_dr_volume_activated_with_failed_replica(set_random_backupstore, client, core_api, volume_name): # NOQA """ Test DR volume activated with a failed replica Context: Make sure that DR volume could be activated as long as there is a ready replica. Steps: 1. Create a volume and attach to a node. 2. Create a backup of the volume with writing some data. 3. Create a DR volume from the backup. 4. Disable the replica rebuilding. 5. Enable the setting `allow-volume-creation-with-degraded-availability` 6. Make a replica failed. 7. Activate the DR volume. 8. Enable the replica rebuilding. 9. Attach the volume to a node. 10. Check if data is correct. """ backupstore_cleanup(client) host_id = get_self_host_id() vol = create_and_check_volume(client, volume_name, num_of_replicas=2, size=SIZE) vol.attach(hostId=host_id) vol = common.wait_for_volume_healthy(client, volume_name) data = {'pos': 0, 'len': 2 * BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(2 * BACKUP_BLOCK_SIZE)} _, backup, _, data = create_backup(client, volume_name, data) dr_vol_name = "dr-" + volume_name dr_vol = client.create_volume(name=dr_vol_name, size=SIZE, numberOfReplicas=2, fromBackup=backup.url, frontend="", standby=True, dataEngine=DATA_ENGINE) check_volume_last_backup(client, dr_vol_name, backup.name) wait_for_backup_restore_completed(client, dr_vol_name, backup.name) common.update_setting( client, common.SETTING_CONCURRENT_REPLICA_REBUILD_PER_NODE_LIMIT, "0" ) common.update_setting(client, common.SETTING_DEGRADED_AVAILABILITY, "true") dr_vol = client.by_id_volume(dr_vol_name) failed_replica = dr_vol.replicas[0] crash_replica_processes(client, core_api, dr_vol_name, replicas=[failed_replica]) dr_vol = wait_for_volume_degraded(client, dr_vol_name) activate_standby_volume(client, dr_vol_name) common.update_setting( client, common.SETTING_CONCURRENT_REPLICA_REBUILD_PER_NODE_LIMIT, "5" ) dr_vol = client.by_id_volume(dr_vol_name) dr_vol.attach(hostId=host_id) dr_vol = common.wait_for_volume_healthy(client, dr_vol_name) check_volume_data(dr_vol, data, False)
Test DR volume activated with a failed replica
Context:
Make sure that a DR volume can be activated as long as there is a ready replica.
Steps:
- Create a volume and attach to a node.
- Create a backup of the volume with writing some data.
- Create a DR volume from the backup.
- Disable the replica rebuilding.
- Enable the setting allow-volume-creation-with-degraded-availability
- Make a replica failed.
- Activate the DR volume.
- Enable the replica rebuilding.
- Attach the volume to a node.
- Check if data is correct.
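Steps 4-8, reflowed from the collapsed source above (dr_vol_name comes from the earlier steps):

# Disable rebuilding, fail one replica, then activate: activation must
# work as long as one ready replica remains.
common.update_setting(
    client, common.SETTING_CONCURRENT_REPLICA_REBUILD_PER_NODE_LIMIT, "0")
common.update_setting(client, common.SETTING_DEGRADED_AVAILABILITY, "true")

dr_vol = client.by_id_volume(dr_vol_name)
crash_replica_processes(client, core_api, dr_vol_name,
                        replicas=[dr_vol.replicas[0]])
wait_for_volume_degraded(client, dr_vol_name)
activate_standby_volume(client, dr_vol_name)

# Re-enable rebuilding before attaching and verifying the data.
common.update_setting(
    client, common.SETTING_CONCURRENT_REPLICA_REBUILD_PER_NODE_LIMIT, "5")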
def test_dr_volume_with_backup_and_backup_volume_deleted(set_random_backupstore, client, core_api, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_dr_volume_with_backup_and_backup_volume_deleted(set_random_backupstore, client, core_api, volume_name): # NOQA """ Test DR volume can be activated after delete all backups. Context: We want to make sure that DR volume can activate after deleting some/all backups or the backup volume. Steps: 1. Create a volume and attach to the current node. 2. Write 4 MB to the beginning of the volume (2 x 2MB backup blocks). 3. Create backup(0) then backup(1) for the volume. 6. Verify backup block count == 4. 7. Create DR volume(1) and DR volume(2) from backup(1). 8. Verify DR volumes last backup is backup(1). 9. Delete backup(1). 10. Verify backup block count == 2. 11. Verify DR volumes last backup becomes backup(0). 12. Activate and verify DR volume(1) data is data(0). 13. Delete backup(0). 14. Verify backup block count == 0. 15. Verify DR volume last backup is empty. 16. Delete the backup volume. 17. Activate and verify DR volume data is data(0). """ backupstore_cleanup(client) host_id = get_self_host_id() vol = create_and_check_volume(client, volume_name, num_of_replicas=2, size=SIZE) vol.attach(hostId=host_id) vol = common.wait_for_volume_healthy(client, volume_name) data0 = {'pos': 0, 'len': 2 * BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(2 * BACKUP_BLOCK_SIZE)} _, backup0, _, data0 = create_backup( client, volume_name, data0) data1 = {'pos': 0, 'len': 2 * BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(2 * BACKUP_BLOCK_SIZE)} bv, backup1, _, data1 = create_backup( client, volume_name, data1) backup_blocks_count = backupstore_count_backup_block_files(client, core_api, volume_name) assert backup_blocks_count == 4 dr_vol_name1 = "dr-" + volume_name + "1" dr_vol_name2 = "dr-" + volume_name + "2" client.create_volume(name=dr_vol_name1, size=SIZE, numberOfReplicas=2, fromBackup=backup1.url, frontend="", standby=True, dataEngine=DATA_ENGINE) client.create_volume(name=dr_vol_name2, size=SIZE, numberOfReplicas=2, fromBackup=backup1.url, frontend="", standby=True, dataEngine=DATA_ENGINE) check_volume_last_backup(client, dr_vol_name1, backup1.name) wait_for_backup_restore_completed(client, dr_vol_name1, backup1.name) check_volume_last_backup(client, dr_vol_name2, backup1.name) wait_for_backup_restore_completed(client, dr_vol_name2, backup1.name) delete_backup(client, bv, backup1.name) assert backupstore_count_backup_block_files(client, core_api, volume_name) == 2 check_volume_last_backup(client, dr_vol_name1, backup0.name) wait_for_backup_restore_completed(client, dr_vol_name1, backup0.name) check_volume_last_backup(client, dr_vol_name2, backup0.name) wait_for_backup_restore_completed(client, dr_vol_name2, backup0.name) activate_standby_volume(client, dr_vol_name1) dr_vol1 = client.by_id_volume(dr_vol_name1) dr_vol1.attach(hostId=host_id) dr_vol1 = common.wait_for_volume_healthy(client, dr_vol_name1) check_volume_data(dr_vol1, data0, False) delete_backup(client, bv, backup0.name) assert backupstore_count_backup_block_files(client, core_api, volume_name) == 0 check_volume_last_backup(client, dr_vol_name2, "") delete_backup_volume(client, bv.name) activate_standby_volume(client, dr_vol_name2) dr_vol2 = client.by_id_volume(dr_vol_name2) dr_vol2.attach(hostId=host_id) dr_vol2 = common.wait_for_volume_healthy(client, dr_vol_name2) check_volume_data(dr_vol2, data0, False)
Test DR volume can be activated after deleting all backups.
Context:
We want to make sure that a DR volume can be activated after deleting some/all backups or the backup volume.
Steps:
- Create a volume and attach to the current node.
- Write 4 MB to the beginning of the volume (2 x 2MB backup blocks).
- Create backup(0) then backup(1) for the volume.
- Verify backup block count == 4.
- Create DR volume(1) and DR volume(2) from backup(1).
- Verify DR volumes last backup is backup(1).
- Delete backup(1).
- Verify backup block count == 2.
- Verify DR volumes last backup becomes backup(0).
- Activate and verify DR volume(1) data is data(0).
- Delete backup(0).
- Verify backup block count == 0.
- Verify DR volume last backup is empty.
- Delete the backup volume.
- Activate and verify DR volume data is data(0).
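The lastBackup bookkeeping, reflowed from the collapsed source above (bv, backup0, backup1 and the DR volume names come from the earlier steps):

# Deleting the newest backup rolls each DR volume's lastBackup back to
# the newest remaining one; deleting the last backup clears it.
delete_backup(client, bv, backup1.name)
check_volume_last_backup(client, dr_vol_name1, backup0.name)
check_volume_last_backup(client, dr_vol_name2, backup0.name)

delete_backup(client, bv, backup0.name)
check_volume_last_backup(client, dr_vol_name2, "")

# Even with the whole backup volume gone, the DR volume still activates
# with the data it has already restored.
delete_backup_volume(client, bv.name)
activate_standby_volume(client, dr_vol_name2)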
def test_dr_volume_with_backup_block_deletion(set_random_backupstore, client, core_api, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_dr_volume_with_backup_block_deletion(set_random_backupstore, client, core_api, volume_name): # NOQA """ Test DR volume last backup after block deletion. Context: We want to make sure that when the block is delete, the DR volume picks up the correct last backup. Steps: 1. Create a volume and attach to the current node. 2. Write 4 MB to the beginning of the volume (2 x 2MB backup blocks). 3. Create backup(0) of the volume. 4. Overwrite backup(0) 1st blocks of data on the volume. (Since backup(0) contains 2 blocks of data, the updated data is data1["content"] + data0["content"][BACKUP_BLOCK_SIZE:]) 5. Create backup(1) of the volume. 6. Verify backup block count == 3. 7. Create DR volume from backup(1). 8. Verify DR volume last backup is backup(1). 9. Delete backup(1). 10. Verify backup block count == 2. 11. Verify DR volume last backup is backup(0). 12. Overwrite backup(0) 1st blocks of data on the volume. (Since backup(0) contains 2 blocks of data, the updated data is data2["content"] + data0["content"][BACKUP_BLOCK_SIZE:]) 13. Create backup(2) of the volume. 14. Verify DR volume last backup is backup(2). 15. Activate and verify DR volume data is data2["content"] + data0["content"][BACKUP_BLOCK_SIZE:]. """ backupstore_cleanup(client) host_id = get_self_host_id() vol = create_and_check_volume(client, volume_name, num_of_replicas=2, size=SIZE) vol.attach(hostId=host_id) vol = common.wait_for_volume_healthy(client, volume_name) data0 = {'pos': 0, 'len': 2 * BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(2 * BACKUP_BLOCK_SIZE)} _, backup0, _, data0 = create_backup( client, volume_name, data0) data1 = {'pos': 0, 'len': BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(BACKUP_BLOCK_SIZE)} backup_volume, backup1, _, data1 = create_backup( client, volume_name, data1) backup_blocks_count = backupstore_count_backup_block_files(client, core_api, volume_name) assert backup_blocks_count == 3 dr_vol_name = "dr-" + volume_name client.create_volume(name=dr_vol_name, size=SIZE, numberOfReplicas=2, fromBackup=backup1.url, frontend="", standby=True, dataEngine=DATA_ENGINE) check_volume_last_backup(client, dr_vol_name, backup1.name) wait_for_backup_restore_completed(client, dr_vol_name, backup1.name) delete_backup(client, backup_volume, backup1.name) assert backupstore_count_backup_block_files(client, core_api, volume_name) == 2 check_volume_last_backup(client, dr_vol_name, backup0.name) wait_for_backup_restore_completed(client, dr_vol_name, backup0.name) data2 = {'pos': 0, 'len': BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(BACKUP_BLOCK_SIZE)} _, backup2, _, _ = create_backup(client, volume_name, data2) check_volume_last_backup(client, dr_vol_name, backup2.name) wait_for_volume_restoration_start(client, dr_vol_name, backup2.name) wait_for_backup_restore_completed(client, dr_vol_name, backup2.name) activate_standby_volume(client, dr_vol_name) dr_vol = client.by_id_volume(dr_vol_name) dr_vol.attach(hostId=host_id) dr_vol = common.wait_for_volume_healthy(client, dr_vol_name) final_data = { 'pos': 0, 'len': 2 * BACKUP_BLOCK_SIZE, 'content': data2['content'] + data0['content'][BACKUP_BLOCK_SIZE:], } check_volume_data(dr_vol, final_data, False)
Test DR volume last backup after block deletion.
Context:
We want to make sure that when a backup block is deleted, the DR volume picks up the correct last backup.
Steps:
- Create a volume and attach it to the current node.
- Write 4 MB to the beginning of the volume (2 x 2 MiB backup blocks).
- Create backup(0) of the volume.
- Overwrite the first block of backup(0)'s data on the volume. (Since backup(0) contains 2 blocks of data, the updated data is data1["content"] + data0["content"][BACKUP_BLOCK_SIZE:].)
- Create backup(1) of the volume.
- Verify backup block count == 3.
- Create a DR volume from backup(1).
- Verify the DR volume's last backup is backup(1).
- Delete backup(1).
- Verify backup block count == 2.
- Verify the DR volume's last backup is backup(0).
- Overwrite the first block of backup(0)'s data on the volume again. (The updated data is data2["content"] + data0["content"][BACKUP_BLOCK_SIZE:].)
- Create backup(2) of the volume.
- Verify the DR volume's last backup is backup(2).
- Activate and verify the DR volume data is data2["content"] + data0["content"][BACKUP_BLOCK_SIZE:], as illustrated in the sketch below.
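A minimal, runnable sketch of the block arithmetic behind the overwrite and verification steps above. BACKUP_BLOCK_SIZE matches the 2 MiB backup block size the test assumes; the random payloads stand in for the test's data0/data2 fixtures:

    import os

    BACKUP_BLOCK_SIZE = 2 * 1024 * 1024  # Longhorn backup blocks are 2 MiB

    data0_content = os.urandom(2 * BACKUP_BLOCK_SIZE)  # backup(0): two full blocks
    data2_content = os.urandom(BACKUP_BLOCK_SIZE)      # overwrite of the first block only

    # Overwriting the first block leaves backup(0)'s second block intact, so the
    # activated DR volume must contain the new first block followed by the old
    # second block.
    expected = data2_content + data0_content[BACKUP_BLOCK_SIZE:]
    assert len(expected) == 2 * BACKUP_BLOCK_SIZE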
def test_dr_volume_with_backup_block_deletion_abort_during_backup_in_progress(set_random_backupstore, client, core_api, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_dr_volume_with_backup_block_deletion_abort_during_backup_in_progress(set_random_backupstore, client, core_api, volume_name): # NOQA """ Test DR volume last backup after block deletion aborted. This will set the last backup to be empty. Context: We want to make sure that when the block deletion for the last backup is aborted by operations such as backups in progress, the DR volume will still pick up the correct last backup. Steps: 1. Create a volume and attach to the current node. 2. Write 4 MB to the beginning of the volume (2 x 2MB backup blocks). 3. Create backup(0) of the volume. 4. Overwrite backup(0) 1st blocks of data on the volume. (Since backup(0) contains 2 blocks of data, the updated data is data1["content"] + data0["content"][BACKUP_BLOCK_SIZE:]) 5. Create backup(1) of the volume. 6. Verify backup block count == 3. 7. Create DR volume from backup(1). 8. Verify DR volume last backup is backup(1). 9. Create an artificial in progress backup.cfg file. This cfg file will convince the longhorn manager that there is a backup being created. Then all subsequent backup block cleanup will be skipped. 10. Delete backup(1). 11. Verify backup block count == 3 (because of the in progress backup). 12. Verify DR volume last backup is empty. 13. Delete the artificial in progress backup.cfg file. 14. Overwrite backup(0) 1st blocks of data on the volume. (Since backup(0) contains 2 blocks of data, the updated data is data2["content"] + data0["content"][BACKUP_BLOCK_SIZE:]) 15. Create backup(2) of the volume. 16. Verify DR volume last backup is backup(2). 17. Activate and verify DR volume data is data2["content"] + data0["content"][BACKUP_BLOCK_SIZE:]. """ backupstore_cleanup(client) host_id = get_self_host_id() vol = create_and_check_volume(client, volume_name, num_of_replicas=2, size=SIZE) vol.attach(hostId=host_id) vol = common.wait_for_volume_healthy(client, volume_name) data0 = {'pos': 0, 'len': 2 * BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(2 * BACKUP_BLOCK_SIZE)} create_backup(client, volume_name, data0) data1 = {'pos': 0, 'len': BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(BACKUP_BLOCK_SIZE)} bv, backup1, _, data1 = create_backup( client, volume_name, data1) backup_blocks_count = backupstore_count_backup_block_files(client, core_api, volume_name) assert backup_blocks_count == 3 dr_vol_name = "dr-" + volume_name client.create_volume(name=dr_vol_name, size=SIZE, numberOfReplicas=2, fromBackup=backup1.url, frontend="", standby=True, dataEngine=DATA_ENGINE) check_volume_last_backup(client, dr_vol_name, backup1.name) wait_for_backup_restore_completed(client, dr_vol_name, backup1.name) backupstore_create_dummy_in_progress_backup(client, core_api, volume_name) delete_backup(client, bv, backup1.name) assert backupstore_count_backup_block_files(client, core_api, volume_name) == 3 check_volume_last_backup(client, dr_vol_name, "") backupstore_delete_dummy_in_progress_backup(client, core_api, volume_name) data2 = {'pos': 0, 'len': BACKUP_BLOCK_SIZE, 'content': common.generate_random_data(BACKUP_BLOCK_SIZE)} _, backup2, _, _ = create_backup(client, volume_name, data2) check_volume_last_backup(client, dr_vol_name, backup2.name) wait_for_volume_restoration_start(client, dr_vol_name, backup2.name) wait_for_backup_restore_completed(client, dr_vol_name, backup2.name) activate_standby_volume(client, dr_vol_name) dr_vol = client.by_id_volume(dr_vol_name) dr_vol.attach(hostId=host_id) dr_vol = common.wait_for_volume_healthy(client, 
dr_vol_name) final_data = { 'pos': 0, 'len': 2 * BACKUP_BLOCK_SIZE, 'content': data2['content'] + data0['content'][BACKUP_BLOCK_SIZE:], } check_volume_data(dr_vol, final_data, False)
Test DR volume last backup after block deletion is aborted. This sets the last backup to empty.
Context:
We want to make sure that when block deletion for the last backup is aborted by an operation such as an in-progress backup, the DR volume still picks up the correct last backup.
Steps:
- Create a volume and attach it to the current node.
- Write 4 MB to the beginning of the volume (2 x 2 MiB backup blocks).
- Create backup(0) of the volume.
- Overwrite the first block of backup(0)'s data on the volume. (Since backup(0) contains 2 blocks of data, the updated data is data1["content"] + data0["content"][BACKUP_BLOCK_SIZE:].)
- Create backup(1) of the volume.
- Verify backup block count == 3.
- Create a DR volume from backup(1).
- Verify the DR volume's last backup is backup(1).
- Create an artificial in-progress backup.cfg file. This cfg file convinces the Longhorn manager that a backup is being created, so all subsequent backup block cleanup is skipped (see the sketch after this list).
- Delete backup(1).
- Verify backup block count == 3 (because of the in-progress backup).
- Verify the DR volume's last backup is empty.
- Delete the artificial in-progress backup.cfg file.
- Overwrite the first block of backup(0)'s data on the volume again. (The updated data is data2["content"] + data0["content"][BACKUP_BLOCK_SIZE:].)
- Create backup(2) of the volume.
- Verify the DR volume's last backup is backup(2).
- Activate and verify the DR volume data is data2["content"] + data0["content"][BACKUP_BLOCK_SIZE:].
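A hedged sketch of the cleanup guard that the artificial in-progress backup exploits: if any backup in the volume's backup directory is still in progress, block cleanup is skipped. The directory layout and helper names here are illustrative assumptions, not the Longhorn manager's actual code:

    import os

    def blocks_safe_to_delete(backup_dir, is_in_progress):
        """Return False if any backup cfg in backup_dir is still in progress."""
        for name in os.listdir(backup_dir):
            if name.startswith("backup_") and name.endswith(".cfg"):
                if is_in_progress(os.path.join(backup_dir, name)):
                    # An unfinished backup may still reference shared blocks,
                    # so no block may be deleted while it exists.
                    return False
        return True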
def test_engine_image_daemonset_restart(client, apps_api, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_engine_image_daemonset_restart(client, apps_api, volume_name): # NOQA """ Test restarting engine image daemonset 1. Get the default engine image 2. Create a volume and attach to the current node 3. Write random data to the volume and create a snapshot 4. Delete the engine image daemonset 5. Engine image daemonset should be recreated 6. In the meantime, validate the volume data to prove it's still functional 7. Wait for the engine image to become `ready` again 8. Check the volume data again. 9. Write some data and create a new snapshot. 1. Since create snapshot will use engine image binary. 10. Check the volume data again """ default_img = common.get_default_engine_image(client) ds_name = "engine-image-" + default_img.name volume = create_and_check_volume(client, volume_name) lht_hostId = get_self_host_id() volume.attach(hostId=lht_hostId, disableFrontend=False) volume = common.wait_for_volume_healthy(client, volume_name) snap1_data = write_volume_random_data(volume) create_snapshot(client, volume_name) # The engine image DaemonSet will be recreated/restarted automatically apps_api.delete_namespaced_daemon_set(ds_name, common.LONGHORN_NAMESPACE) # Let DaemonSet really restarted common.wait_for_engine_image_condition(client, default_img.name, "False") # The Longhorn volume is still available # during the engine image DaemonSet restarting check_volume_data(volume, snap1_data) # Wait for the restart complete common.wait_for_engine_image_condition(client, default_img.name, "True") # Longhorn is still able to use the corresponding engine binary to # operate snapshot check_volume_data(volume, snap1_data) snap2_data = write_volume_random_data(volume) create_snapshot(client, volume_name) check_volume_data(volume, snap2_data)
Test restarting the engine image daemonset (see the sketch after this list).
- Get the default engine image.
- Create a volume and attach it to the current node.
- Write random data to the volume and create a snapshot.
- Delete the engine image daemonset.
- The engine image daemonset should be recreated.
- In the meantime, validate the volume data to prove the volume is still functional.
- Wait for the engine image to become `ready` again.
- Check the volume data again.
- Write some data and create a new snapshot (creating a snapshot uses the engine image binary).
- Check the volume data again.
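The fault injection step, shown with the Kubernetes Python client directly; the namespace and the engine image name are assumptions matching a typical Longhorn install:

    from kubernetes import client as k8sclient, config

    config.load_kube_config()
    apps_api = k8sclient.AppsV1Api()

    # The DaemonSet name is derived from the default engine image name,
    # e.g. "engine-image-ei-abcdef12" (hypothetical).
    ds_name = "engine-image-ei-abcdef12"
    apps_api.delete_namespaced_daemon_set(ds_name, "longhorn-system")
    # Longhorn recreates the DaemonSet automatically; the attached volume
    # keeps serving I/O while the engine image is temporarily not ready.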
def test_expand_pvc_with_size_round_up(client, core_api, volume_name)
-
Expand source code
def test_expand_pvc_with_size_round_up(client, core_api, volume_name): # NOQA """ test expand longhorn volume with pvc 1. Create LHV,PV,PVC with size '1Gi' 2. Attach, write data, and detach 3. Expand volume size to '2000000000/2G' and check if size round up '2000683008/1908Mi' 4. Attach, write data, and detach 5. Expand volume size to '2Gi' and check if size is '2147483648' 6. Attach, write data, and detach """ static_sc_name = "longhorn" setting = client.by_id_setting(SETTING_DEFAULT_LONGHORN_STATIC_SC) setting = client.update(setting, value=static_sc_name) assert setting.value == static_sc_name volume = create_and_check_volume(client, volume_name, num_of_replicas=2, size=str(1 * Gi)) create_pv_for_volume(client, core_api, volume, volume_name) create_pvc_for_volume(client, core_api, volume, volume_name) self_hostId = get_self_host_id() volume.attach(hostId=self_hostId, disableFrontend=False) volume = wait_for_volume_healthy(client, volume_name) test_data = write_volume_random_data(volume) volume.detach() volume = wait_for_volume_detached(client, volume_name) volume.expand(size="2000000000") wait_for_volume_expansion(client, volume_name) wait_for_volume_detached(client, volume_name) for i in range(DEFAULT_POD_TIMEOUT): claim = core_api.read_namespaced_persistent_volume_claim( name=volume_name, namespace='default') if claim.spec.resources.requests['storage'] == "2000683008" and \ claim.status.capacity['storage'] == "1908Mi": break time.sleep(DEFAULT_POD_INTERVAL) assert claim.spec.resources.requests['storage'] == "2000683008" assert claim.status.capacity['storage'] == "1908Mi" volume = client.by_id_volume(volume_name) assert volume.size == "2000683008" volume.detach() volume = wait_for_volume_detached(client, volume_name) self_hostId = get_self_host_id() volume.attach(hostId=self_hostId, disableFrontend=False) volume = wait_for_volume_healthy(client, volume_name) check_volume_data(volume, test_data, False) test_data = write_volume_random_data(volume) volume.detach() volume = wait_for_volume_detached(client, volume_name) volume.expand(size=str(2 * Gi)) wait_for_volume_expansion(client, volume_name) wait_for_volume_detached(client, volume_name) for i in range(DEFAULT_POD_TIMEOUT): claim = core_api.read_namespaced_persistent_volume_claim( name=volume_name, namespace='default') if claim.spec.resources.requests['storage'] == "2147483648" and \ claim.status.capacity['storage'] == "2Gi": break time.sleep(DEFAULT_POD_INTERVAL) assert claim.spec.resources.requests['storage'] == "2147483648" assert claim.status.capacity['storage'] == "2Gi" volume = client.by_id_volume(volume_name) assert volume.size == "2147483648" volume.detach() volume = wait_for_volume_detached(client, volume_name) self_hostId = get_self_host_id() volume.attach(hostId=self_hostId, disableFrontend=False) volume = wait_for_volume_healthy(client, volume_name) check_volume_data(volume, test_data, False) volume.detach() volume = wait_for_volume_detached(client, volume_name) client.delete(volume) wait_for_volume_delete(client, volume_name)
Test expanding a Longhorn volume through its PVC.
- Create the Longhorn volume, PV, and PVC with size '1Gi'.
- Attach, write data, and detach.
- Expand the volume size to '2000000000' (~2 G) and check that the size rounds up to '2000683008' ('1908Mi'); the rounding arithmetic is sketched below.
- Attach, write data, and detach.
- Expand the volume size to '2Gi' and check that the size is '2147483648'.
- Attach, write data, and detach.
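The '2000000000 → 2000683008' expectation is consistent with rounding the requested size up to the next MiB boundary; a runnable check of that arithmetic:

    Mi = 1024 * 1024

    requested = 2000000000
    rounded = -(-requested // Mi) * Mi  # ceiling division to a MiB multiple
    assert rounded == 2000683008 == 1908 * Mi  # matches the asserted '1908Mi'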
def test_expansion_basic(client, volume_name)
-
Expand source code
@pytest.mark.coretest # NOQA def test_expansion_basic(client, volume_name): # NOQA """ Test volume expansion using Longhorn API 1. Create volume and attach to the current node 2. Generate data `snap1_data` and write it to the volume 3. Create snapshot `snap1` 4. Online expand the volume 5. Verify the volume has been expanded 6. Generate data `snap2_data` and write it to the volume 7. Create snapshot `snap2` 8. Generate data `snap3_data` and write it after the original size 9. Create snapshot `snap3` and verify the `snap3_data` with location 10. Detach and reattach the volume. 11. Verify the volume is still expanded, and `snap3_data` remain valid 12. Detach the volume. 13. Reattach the volume in maintenance mode 14. Revert to `snap2` and detach. 15. Attach the volume and check data `snap2_data` 16. Generate `snap4_data` and write it after the original size 17. Create snapshot `snap4` and verify `snap4_data`. 18. Detach the volume and revert to `snap1` 19. Validate `snap1_data` TODO: Add offline expansion """ volume = create_and_check_volume(client, volume_name) lht_hostId = get_self_host_id() volume.attach(hostId=lht_hostId, disableFrontend=False) common.wait_for_volume_healthy(client, volume_name) volume = client.by_id_volume(volume_name) assert volume.disableFrontend is False assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV snap1_data = write_volume_random_data(volume) snap1 = create_snapshot(client, volume_name) volume.expand(size=EXPAND_SIZE) wait_for_volume_expansion(client, volume_name) volume = client.by_id_volume(volume_name) assert volume.size == EXPAND_SIZE check_block_device_size(volume, int(EXPAND_SIZE)) snap2_data = write_volume_random_data(volume) snap2 = create_snapshot(client, volume_name) snap3_data = { 'pos': int(SIZE), 'content': generate_random_data(VOLUME_RWTEST_SIZE), } snap3_data = write_volume_data(volume, snap3_data) create_snapshot(client, volume_name) check_volume_data(volume, snap3_data) volume.detach() volume = common.wait_for_volume_detached(client, volume_name) volume.attach(hostId=lht_hostId, disableFrontend=False) common.wait_for_volume_healthy(client, volume_name) volume = client.by_id_volume(volume_name) check_block_device_size(volume, int(EXPAND_SIZE)) check_volume_data(volume, snap3_data) volume.detach() volume = common.wait_for_volume_detached(client, volume_name) volume.attach(hostId=lht_hostId, disableFrontend=True) volume = common.wait_for_volume_healthy_no_frontend(client, volume_name) assert volume.disableFrontend is True assert volume.frontend == VOLUME_FRONTEND_BLOCKDEV check_volume_endpoint(volume) volume.snapshotRevert(name=snap2.name) volume.detach() volume = common.wait_for_volume_detached(client, volume_name) volume.attach(hostId=lht_hostId, disableFrontend=False) common.wait_for_volume_healthy(client, volume_name) volume = client.by_id_volume(volume_name) check_volume_data(volume, snap2_data, False) snap4_data = { 'pos': int(SIZE), 'content': generate_random_data(VOLUME_RWTEST_SIZE), } snap4_data = write_volume_data(volume, snap4_data) create_snapshot(client, volume_name) check_volume_data(volume, snap4_data) volume.detach() volume = common.wait_for_volume_detached(client, volume_name) volume.attach(hostId=lht_hostId, disableFrontend=True) volume = common.wait_for_volume_healthy_no_frontend(client, volume_name) volume.snapshotRevert(name=snap1.name) volume.detach() volume = common.wait_for_volume_detached(client, volume_name) volume.attach(hostId=lht_hostId, disableFrontend=False) common.wait_for_volume_healthy(client, volume_name) volume 
= client.by_id_volume(volume_name) check_volume_data(volume, snap1_data, False) client.delete(volume) wait_for_volume_delete(client, volume_name)
Test volume expansion using the Longhorn API
- Create a volume and attach it to the current node.
- Generate data `snap1_data` and write it to the volume.
- Create snapshot `snap1`.
- Online expand the volume.
- Verify the volume has been expanded.
- Generate data `snap2_data` and write it to the volume.
- Create snapshot `snap2`.
- Generate data `snap3_data` and write it after the original size (see the sketch after this list).
- Create snapshot `snap3` and verify the `snap3_data` with location.
- Detach and reattach the volume.
- Verify the volume is still expanded, and `snap3_data` remains valid.
- Detach the volume.
- Reattach the volume in maintenance mode.
- Revert to `snap2` and detach.
- Attach the volume and check data `snap2_data`.
- Generate `snap4_data` and write it after the original size.
- Create snapshot `snap4` and verify `snap4_data`.
- Detach the volume and revert to `snap1`.
- Validate `snap1_data`.
TODO: Add offline expansion
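The writes after the original size double as the proof of expansion: writing at offset SIZE, the original end of the device, only succeeds once the block device has grown. A runnable model of that bounds check, with example sizes standing in for the test's constants:

    SIZE = 16 * 1024 * 1024          # original volume size (example value)
    EXPAND_SIZE = 32 * 1024 * 1024   # expanded size (example value)

    def can_write(offset, length, device_size):
        """A write is valid only if it fits entirely inside the device."""
        return offset + length <= device_size

    assert not can_write(SIZE, 4096, SIZE)     # off the end before expansion
    assert can_write(SIZE, 4096, EXPAND_SIZE)  # valid after expansion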
def test_expansion_canceling(client, core_api, volume_name, pod, pvc, storage_class)
-
Expand source code
@pytest.mark.coretest # NOQA def test_expansion_canceling(client, core_api, volume_name, pod, pvc, storage_class): # NOQA """ Test expansion canceling 1. Create a volume, then create the corresponding PV, PVC and Pod. 2. Generate `test_data` and write to the pod 3. Create an empty directory with expansion snapshot tmp meta file path so that the following offline expansion will fail 4. Delete the pod and wait for volume detachment 5. Try offline expansion via Longhorn API 6. Wait for expansion failure then use Longhorn API to cancel it 7. Create a new pod and validate the volume content 8. Create an empty directory with expansion snapshot tmp meta file path so that the following online expansion will fail 9. Try online expansion via Longhorn API 10. Wait for expansion failure then use Longhorn API to cancel it 11. Validate the volume content again, then re-write random data to the pod 12. Retry online expansion, then verify the expansion done via Longhorn API 13. Validate the volume content, then check if data writing looks fine 14. Clean up pod, PVC, and PV """ storage_class['parameters']['numberOfReplicas'] = "2" create_storage_class(storage_class) pod_name = 'expand-canceling-test' expansion_pvc_name = pod_name + "-pvc" pvc['metadata']['name'] = expansion_pvc_name pvc['spec']['storageClassName'] = storage_class['metadata']['name'] pvc['spec']['resources']['requests']['storage'] = SIZE common.create_pvc(pvc) pod['metadata']['name'] = pod_name pod['spec']['volumes'] = [{ 'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'], 'persistentVolumeClaim': {'claimName': expansion_pvc_name}, }] create_and_wait_pod(core_api, pod) pv = common.wait_and_get_pv_for_pvc(core_api, expansion_pvc_name) assert pv.status.phase == "Bound" expansion_pv_name = pv.metadata.name volume_name = pv.spec.csi.volume_handle # Step 3: Prepare to fail offline expansion for one replica volume = client.by_id_volume(volume_name) replicas = volume.replicas fail_replica_expansion(client, core_api, volume_name, EXPAND_SIZE, replicas) test_data1 = generate_random_data(VOLUME_RWTEST_SIZE) write_pod_volume_data(core_api, pod_name, test_data1, "test_file1") delete_and_wait_pod(core_api, pod_name) volume = wait_for_volume_detached(client, volume_name) # Step 5 & 6: Try offline expansion, wait for the failure, then cancel it volume.expand(size=EXPAND_SIZE) wait_for_expansion_failure(client, volume_name) volume = client.by_id_volume(volume_name) volume.cancelExpansion() wait_for_volume_expansion(client, volume_name) wait_for_expansion_error_clear(client, volume_name) wait_for_volume_detached(client, volume_name) volume = client.by_id_volume(volume_name) assert volume.state == "detached" assert volume.size == SIZE fix_replica_expansion_failure(client, core_api, volume_name, EXPAND_SIZE, replicas) # Step 7: Verify the data after expansion cancellation create_and_wait_pod(core_api, pod) resp = read_volume_data(core_api, pod_name, "test_file1") assert resp == test_data1 test_data2 = generate_random_data(VOLUME_RWTEST_SIZE) write_pod_volume_data(core_api, pod_name, test_data2, "test_file2") # Step 8: Prepare to fail online expansion for one replica volume = client.by_id_volume(volume_name) replicas = volume.replicas fail_replica_expansion(client, core_api, volume_name, EXPAND_SIZE, replicas) # Step 9 & 10: Try online expansion, wait for the failure, then cancel it volume.expand(size=EXPAND_SIZE) wait_for_expansion_failure(client, volume_name) volume = client.by_id_volume(volume_name) volume.cancelExpansion() 
wait_for_volume_expansion(client, volume_name) wait_for_expansion_error_clear(client, volume_name) volume = client.by_id_volume(volume_name) assert volume.size == SIZE fix_replica_expansion_failure(client, core_api, volume_name, EXPAND_SIZE, replicas) # Step 11: Validate the data content again resp = read_volume_data(core_api, pod_name, "test_file1") assert resp == test_data1 resp = read_volume_data(core_api, pod_name, "test_file2") assert resp == test_data2 # Step 12: Retry online expansion, should succeed volume.expand(size=EXPAND_SIZE) wait_for_volume_expansion(client, volume_name) volume = client.by_id_volume(volume_name) engine = get_volume_engine(volume) assert volume.size == EXPAND_SIZE assert volume.size == engine.size # Step 13: Write more data then re-validate the data content test_data3 = generate_random_data(VOLUME_RWTEST_SIZE) write_pod_volume_data(core_api, pod_name, test_data3, "test_file3") resp = read_volume_data(core_api, pod_name, "test_file1") assert resp == test_data1 resp = read_volume_data(core_api, pod_name, "test_file2") assert resp == test_data2 resp = read_volume_data(core_api, pod_name, "test_file3") assert resp == test_data3 delete_and_wait_pod(core_api, pod_name) delete_and_wait_pvc(core_api, expansion_pvc_name) delete_and_wait_pv(core_api, expansion_pv_name)
Test expansion canceling
- Create a volume, then create the corresponding PV, PVC and Pod.
- Generate `test_data` and write it to the pod.
- Create an empty directory at the expansion snapshot tmp meta file path so that the following offline expansion will fail.
- Delete the pod and wait for volume detachment.
- Try offline expansion via the Longhorn API.
- Wait for the expansion failure, then use the Longhorn API to cancel it.
- Create a new pod and validate the volume content.
- Create an empty directory at the expansion snapshot tmp meta file path so that the following online expansion will fail.
- Try online expansion via the Longhorn API.
- Wait for the expansion failure, then use the Longhorn API to cancel it (see the sketch after this list).
- Validate the volume content again, then re-write random data to the pod.
- Retry the online expansion, then verify via the Longhorn API that the expansion is done.
- Validate the volume content, then check that data writing works fine.
- Clean up the pod, PVC, and PV.
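The fail-then-cancel round trip referenced above, condensed from the test's own calls; this is a fixture-dependent sketch (client, volume_name, and the wait helpers come from the suite), not standalone code:

    volume.expand(size=EXPAND_SIZE)
    wait_for_expansion_failure(client, volume_name)    # injected failure surfaces

    volume = client.by_id_volume(volume_name)
    volume.cancelExpansion()                           # roll back the expansion
    wait_for_volume_expansion(client, volume_name)     # expansion state settles
    wait_for_expansion_error_clear(client, volume_name)
    assert client.by_id_volume(volume_name).size == SIZE  # size is unchanged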
def test_expansion_with_scheduling_failure(client, core_api, volume_name, pod, pvc, storage_class)
-
Expand source code
@pytest.mark.coretest # NOQA def test_expansion_with_scheduling_failure( client, core_api, volume_name, pod, pvc, storage_class): # NOQA """ Test if the running volume with scheduling failure can be expanded after the detachment. Prerequisite: Setting "soft anti-affinity" is false. 1. Create a volume, then create the corresponding PV, PVC and Pod. 2. Wait for the pod running and the volume healthy. 3. Write data to the pod volume and get the md5sum. 4. Disable the scheduling for a node contains a running replica. 5. Crash the replica on the scheduling disabled node for the volume. Then delete the failed replica so that it won't be reused. 6. Wait for the scheduling failure. 7. Verify: 7.1. `volume.ready == True`. 7.2. `volume.conditions[scheduled].status == False`. 7.3. the volume is Degraded. 7.4. the new replica cannot be created. 8. Write more data to the volume and get the md5sum 9. Delete the pod and wait for the volume detached. 10. Verify: 10.1. `volume.ready == True`. 10.2. `volume.conditions[scheduled].status == True` 11. Expand the volume and wait for the expansion succeeds. 12. Verify there is no rebuild replica after the expansion. 13. Recreate a new pod for the volume and wait for the pod running. 14. Validate the volume content. 15. Verify the expanded part can be read/written correctly. 16. Enable the node scheduling. 17. Wait for the volume rebuild succeeds. 18. Verify the data written in the expanded part. 19. Clean up pod, PVC, and PV. Notice that the step 1 to step 10 is identical with those of the case test_running_volume_with_scheduling_failure(). """ replica_node_soft_anti_affinity_setting = \ client.by_id_setting(SETTING_REPLICA_NODE_SOFT_ANTI_AFFINITY) client.update(replica_node_soft_anti_affinity_setting, value="false") data_path1 = "/data/test1" pvc['spec']['resources']['requests']['storage'] = str(300 * Mi) create_storage_class(storage_class) test_pod_name = 'expand-scheduling-failed-test' test_pvc_name = test_pod_name + "-pvc" pvc['metadata']['name'] = test_pvc_name pvc['spec']['storageClassName'] = storage_class['metadata']['name'] common.create_pvc(pvc) pod['metadata']['name'] = test_pod_name pod['spec']['volumes'] = [{ 'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'], 'persistentVolumeClaim': {'claimName': test_pvc_name}, }] create_and_wait_pod(core_api, pod) pv = common.wait_and_get_pv_for_pvc(core_api, test_pvc_name) assert pv.status.phase == "Bound" test_pv_name = pv.metadata.name volume_name = pv.spec.csi.volume_handle wait_for_volume_healthy(client, volume_name) write_pod_volume_random_data(core_api, test_pod_name, data_path1, DATA_SIZE_IN_MB_1) original_md5sum1 = get_pod_data_md5sum(core_api, test_pod_name, data_path1) volume = client.by_id_volume(volume_name) old_replicas = {} for r in volume.replicas: old_replicas[r.name] = r failed_replica = volume.replicas[0] node = client.by_id_node(failed_replica.hostId) node = set_node_scheduling(client, node, allowScheduling=False) common.wait_for_node_update(client, node.id, "allowScheduling", False) crash_replica_processes(client, core_api, volume_name, replicas=[failed_replica], wait_to_fail=False) # Remove the failed replica so that it won't be reused later volume = wait_for_volume_degraded(client, volume_name) volume.replicaRemove(name=failed_replica.name) # Wait for scheduling failure. # It means the new replica is created but fails to be scheduled. 
wait_for_volume_condition_scheduled(client, volume_name, "status", CONDITION_STATUS_FALSE) wait_for_volume_condition_scheduled(client, volume_name, "reason", CONDITION_REASON_SCHEDULING_FAILURE) volume = wait_for_volume_degraded(client, volume_name) try: common.wait_for_volume_replica_count(client, volume_name, 3) raise Exception("No new replica should be created") except AssertionError: pass data_path2 = "/data/test2" write_pod_volume_random_data(core_api, test_pod_name, data_path2, DATA_SIZE_IN_MB_1) original_md5sum2 = get_pod_data_md5sum(core_api, test_pod_name, data_path2) delete_and_wait_pod(core_api, test_pod_name) wait_for_volume_detached(client, volume_name) volume = wait_for_volume_condition_scheduled(client, volume_name, "status", CONDITION_STATUS_TRUE) assert volume.ready assert len(volume.replicas) == 2 expanded_size = str(400 * Mi) volume.expand(size=expanded_size) wait_for_volume_expansion(client, volume_name, expanded_size) volume = wait_for_volume_detached(client, volume_name) assert volume.size == expanded_size assert len(volume.replicas) == 2 for r in volume.replicas: assert r.name in old_replicas create_and_wait_pod(core_api, pod) wait_for_volume_degraded(client, volume_name) md5sum1 = get_pod_data_md5sum(core_api, test_pod_name, data_path1) assert md5sum1 == original_md5sum1 md5sum2 = get_pod_data_md5sum(core_api, test_pod_name, data_path2) assert md5sum2 == original_md5sum2 # The data writing is fine data_path3 = "/data/test3" write_pod_volume_random_data(core_api, test_pod_name, data_path3, DATA_SIZE_IN_MB_1) original_md5sum3 = get_pod_data_md5sum(core_api, test_pod_name, data_path3) node = client.by_id_node(failed_replica.hostId) set_node_scheduling(client, node, allowScheduling=True) wait_for_volume_healthy(client, volume_name) md5sum3 = get_pod_data_md5sum(core_api, test_pod_name, data_path3) assert md5sum3 == original_md5sum3 delete_and_wait_pod(core_api, test_pod_name) delete_and_wait_pvc(core_api, test_pvc_name) delete_and_wait_pv(core_api, test_pv_name)
Test if a running volume with a scheduling failure can be expanded after detachment.
Prerequisite: Setting "soft anti-affinity" is false.
- Create a volume, then create the corresponding PV, PVC, and Pod.
- Wait for the pod to be running and the volume to be healthy.
- Write data to the pod volume and get the md5sum.
- Disable scheduling for a node that contains a running replica (see the sketch after this list).
- Crash the replica on the scheduling-disabled node for the volume. Then delete the failed replica so that it won't be reused.
- Wait for the scheduling failure.
- Verify:
  7.1. `volume.ready == True`.
  7.2. `volume.conditions[scheduled].status == False`.
  7.3. The volume is Degraded.
  7.4. The new replica cannot be created.
- Write more data to the volume and get the md5sum.
- Delete the pod and wait for the volume to detach.
- Verify:
  10.1. `volume.ready == True`.
  10.2. `volume.conditions[scheduled].status == True`.
- Expand the volume and wait for the expansion to succeed.
- Verify there is no replica rebuild after the expansion.
- Recreate a new pod for the volume and wait for the pod to be running.
- Validate the volume content.
- Verify the expanded part can be read/written correctly.
- Enable the node scheduling.
- Wait for the volume rebuild to succeed.
- Verify the data written in the expanded part.
- Clean up the pod, PVC, and PV.
Notice that steps 1 to 10 are identical to those of test_running_volume_with_scheduling_failure().
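The fault setup from the scheduling-failure steps, condensed from the test's own helper calls (fixture-dependent sketch, not standalone code):

    volume = client.by_id_volume(volume_name)
    failed_replica = volume.replicas[0]

    # Stop new replicas from landing on this node, then kill the replica there.
    node = client.by_id_node(failed_replica.hostId)
    node = set_node_scheduling(client, node, allowScheduling=False)
    common.wait_for_node_update(client, node.id, "allowScheduling", False)

    crash_replica_processes(client, core_api, volume_name,
                            replicas=[failed_replica], wait_to_fail=False)
    volume = wait_for_volume_degraded(client, volume_name)
    volume.replicaRemove(name=failed_replica.name)  # prevent reuse of the replica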
def test_expansion_with_size_round_up(client, core_api, volume_name)
-
Expand source code
def test_expansion_with_size_round_up(client, core_api, volume_name): # NOQA """ test expand longhorn volume 1. Create and attach longhorn volume with size '1Gi'. 2. Write data, and offline expand volume size to '2000000000/2G'. 3. Check if size round up '2000683008' and the written data. 4. Write data, and online expand volume size to '2Gi'. 5. Check if size round up '2147483648' and the written data. """ volume = create_and_check_volume(client, volume_name, num_of_replicas=2, size=str(1 * Gi)) self_hostId = get_self_host_id() volume.attach(hostId=self_hostId, disableFrontend=False) volume = wait_for_volume_healthy(client, volume_name) test_data = write_volume_random_data(volume) # Step 2: Offline expansion volume.detach() volume = wait_for_volume_detached(client, volume_name) volume.expand(size="2000000000") wait_for_volume_expansion(client, volume_name) volume = wait_for_volume_detached(client, volume_name) assert volume.size == "2000683008" self_hostId = get_self_host_id() volume.attach(hostId=self_hostId, disableFrontend=False) volume = wait_for_volume_healthy(client, volume_name) check_volume_data(volume, test_data, False) # Step 4: Online expansion test_data = write_volume_random_data(volume) volume.expand(size=str(2 * Gi)) wait_for_volume_expansion(client, volume_name) volume = client.by_id_volume(volume_name) assert volume.size == "2147483648" check_volume_data(volume, test_data, False) client.delete(volume) wait_for_volume_delete(client, volume_name)
Test expanding a Longhorn volume.
- Create and attach a Longhorn volume with size '1Gi'.
- Write data, and offline expand the volume size to '2000000000' (~2 G); see the sketch after this list.
- Check that the size rounds up to '2000683008' and verify the written data.
- Write data, and online expand the volume size to '2Gi'.
- Check that the size is '2147483648' and verify the written data.
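The offline/online distinction above, condensed from the test's own calls (fixture-dependent sketch): offline expansion requires detaching first, while online expansion runs against the attached, healthy volume:

    # Offline: detach, expand, then reattach.
    volume.detach()
    volume = wait_for_volume_detached(client, volume_name)
    volume.expand(size="2000000000")
    wait_for_volume_expansion(client, volume_name)

    # Online: expand while attached and healthy, no detach needed.
    volume.attach(hostId=get_self_host_id(), disableFrontend=False)
    volume = wait_for_volume_healthy(client, volume_name)
    volume.expand(size=str(2 * Gi))
    wait_for_volume_expansion(client, volume_name)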
def test_filesystem_trim(client, fs_type)
-
Expand source code
@pytest.mark.flaky(reruns=5) @pytest.mark.parametrize("fs_type", [FS_TYPE_EXT4, FS_TYPE_XFS]) # NOQA def test_filesystem_trim(client, fs_type): # NOQA """ Test the filesystem in the volume can be trimmed correctly. 1. Create a volume with option `unmapMarkSnapChainRemoved` enabled, then attach to the current node. 2. Make a filesystem and write file0 into the fs, calculate the checksum, then take snap0. 3. Write file21 and calculate the checksum. Then take snap21. 4. Unmount then reattach the volume without frontend. Revert the volume to snap0. 5. Reattach and mount the volume. 6. Write file11. Then take snap11. 7. Write file12. Then take snap12. 8. Write file13. Then remove file0, file11, file12, and file13. Verify the snapshots and volume head size are not shrunk. 9. Do filesystem trim (via Longhorn API or cmdline). Verify that: 1. snap11 and snap12 are marked as removed. 2. snap11, snap12, and volume head size are shrunk. 10. Disable option `unmapMarkSnapChainRemoved` for the volume. 11. Write file14. Then take snap14. 12. Write file15. Then remove file14 and file15. Verify that: 1. snap14 is not marked as removed and its size is not changed. 2. volume head size is shrunk. 13. Unmount and reattach the volume. Then revert to snap21. 14. Reattach and mount the volume. Verify the file0 and file21. 15. Cleanup. """ test_volume_name = generate_volume_name() client.create_volume(name=test_volume_name, size=str(DEFAULT_VOLUME_SIZE * Gi), numberOfReplicas=2, unmapMarkSnapChainRemoved="enabled", dataEngine=DATA_ENGINE) wait_for_volume_creation(client, test_volume_name) volume = wait_for_volume_detached(client, test_volume_name) host_id = get_self_host_id() volume.attach(hostId=host_id) common.wait_for_volume_healthy(client, test_volume_name) dev_path = os.path.join("/dev/longhorn", test_volume_name) mnt_path = os.path.join("/tmp", "mnt-"+test_volume_name) mount_cmd = ["mount", dev_path, mnt_path] umount_cmd = ["umount", mnt_path] findmnt_cmd = ["findmnt", dev_path] trim_cmd = ["fstrim", mnt_path] if fs_type == FS_TYPE_EXT4: subprocess.check_call(["mkfs.ext4", dev_path]) elif fs_type == FS_TYPE_XFS: subprocess.check_call(["mkfs.xfs", dev_path]) else: raise Exception("unexpected filesystem type") subprocess.check_call(["mkdir", "-p", mnt_path]) subprocess.check_call(mount_cmd) subprocess.check_call(findmnt_cmd) subprocess.check_call(["sync"]) volume = client.by_id_volume(test_volume_name) file0_path = os.path.join(mnt_path, "file0") write_volume_dev_random_mb_data(file0_path, 0, 100) subprocess.check_call(["sync"]) file0_cksum_origin = get_volume_dev_mb_data_md5sum(file0_path, 0, 100) snap0_origin = create_snapshot(client, test_volume_name) file21_path = os.path.join(mnt_path, "file21") write_volume_dev_random_mb_data(file21_path, 0, 100) subprocess.check_call(["sync"]) file21_cksum_origin = get_volume_dev_mb_data_md5sum(file21_path, 0, 100) snap21_origin = create_snapshot(client, test_volume_name) subprocess.check_call(umount_cmd) volume.detach() volume = wait_for_volume_detached(client, test_volume_name) volume.attach(hostId=host_id, disableFrontend=True) volume = common.wait_for_volume_healthy_no_frontend( client, test_volume_name) volume.snapshotRevert(name=snap0_origin.name) volume.detach() volume = wait_for_volume_detached(client, test_volume_name) volume.attach(hostId=host_id) common.wait_for_volume_healthy(client, test_volume_name) volume = wait_for_volume_option_trim_auto_removing_snapshots( client, test_volume_name, True) subprocess.check_call(mount_cmd) 
subprocess.check_call(findmnt_cmd) subprocess.check_call(["sync"]) file11_path = os.path.join(mnt_path, "file11") write_volume_dev_random_mb_data(file11_path, 0, 100) subprocess.check_call(["sync"]) snap11_origin = create_snapshot(client, test_volume_name) file12_path = os.path.join(mnt_path, "file12") write_volume_dev_random_mb_data(file12_path, 0, 100) subprocess.check_call(["sync"]) snap12_origin = create_snapshot(client, test_volume_name) file13_path = os.path.join(mnt_path, "file13") write_volume_dev_random_mb_data(file13_path, 0, 100) subprocess.check_call(["sync"]) # Step 8. # Remove files. Then verify the snapshots before the filesystem trim. subprocess.check_call(["rm", file0_path, file11_path, file12_path, file13_path]) subprocess.check_call(["sync"]) snapshots = volume.snapshotList() snapMap = {} for snap in snapshots: snapMap[snap.name] = snap assert snapMap[snap0_origin.name].name == snap0_origin.name assert snapMap[snap0_origin.name].removed is False assert snapMap[snap0_origin.name].size == snap0_origin.size assert snapMap[snap0_origin.name].parent == "" assert snapMap[snap21_origin.name].name == snap21_origin.name assert snapMap[snap21_origin.name].removed is False assert snapMap[snap21_origin.name].size == snap21_origin.size assert snapMap[snap21_origin.name].parent == snap0_origin.name assert snapMap[snap11_origin.name].name == snap11_origin.name assert snapMap[snap11_origin.name].removed is False assert snapMap[snap11_origin.name].size == snap11_origin.size assert snapMap[snap11_origin.name].parent == snap0_origin.name assert snapMap[snap12_origin.name].name == snap12_origin.name assert snapMap[snap12_origin.name].removed is False assert snapMap[snap12_origin.name].size == snap12_origin.size assert snapMap[snap12_origin.name].parent == snap11_origin.name assert snapMap[VOLUME_HEAD_NAME].name == VOLUME_HEAD_NAME assert int(snapMap[VOLUME_HEAD_NAME].size) >= 100 * Mi assert snapMap[VOLUME_HEAD_NAME].parent == snap12_origin.name volume_head1 = snapMap[VOLUME_HEAD_NAME] # Sometimes the current mounting will lose track of the removed file # mapping info. It's better to remount the filesystem before trim. subprocess.check_call(umount_cmd) subprocess.check_call(mount_cmd) subprocess.check_call(findmnt_cmd) # Step 9. Verify the snapshots after the filesystem trim. subprocess.check_call(trim_cmd) volume = client.by_id_volume(test_volume_name) snapshots = volume.snapshotList() snapMap = {} for snap in snapshots: snapMap[snap.name] = snap assert snapMap[snap0_origin.name].name == snap0_origin.name assert snapMap[snap0_origin.name].removed is False assert snapMap[snap0_origin.name].size == snap0_origin.size assert snapMap[snap0_origin.name].parent == "" assert snapMap[snap21_origin.name].name == snap21_origin.name assert snapMap[snap21_origin.name].removed is False assert snapMap[snap21_origin.name].size == snap21_origin.size assert snapMap[snap21_origin.name].parent == snap0_origin.name assert snapMap[snap11_origin.name].name == snap11_origin.name assert snapMap[snap11_origin.name].removed is True assert int(snapMap[snap11_origin.name].size) <= \ int(snap11_origin.size) - 100 * Mi assert snapMap[snap11_origin.name].parent == snap0_origin.name assert snapMap[snap12_origin.name].name == snap12_origin.name assert snapMap[snap12_origin.name].removed is True assert int(snapMap[snap12_origin.name].size) <= \ int(snap12_origin.size) - 100 * Mi assert snapMap[snap12_origin.name].parent == snap11_origin.name # Volume head may record some extra meta info after the file removal. 
# Hence the size it reduced may be smaller than 100Mi assert snapMap[VOLUME_HEAD_NAME].name == VOLUME_HEAD_NAME assert int(snapMap[VOLUME_HEAD_NAME].size) <= \ int(volume_head1.size) - 50 * Mi assert snapMap[VOLUME_HEAD_NAME].parent == snap12_origin.name volume = client.by_id_volume(test_volume_name) volume.updateUnmapMarkSnapChainRemoved( unmapMarkSnapChainRemoved="disabled") wait_for_volume_option_trim_auto_removing_snapshots( client, test_volume_name, False) file14_path = os.path.join(mnt_path, "file14") write_volume_dev_random_mb_data(file14_path, 0, 100) subprocess.check_call(["sync"]) snap14_origin = create_snapshot(client, test_volume_name) file15_path = os.path.join(mnt_path, "file15") write_volume_dev_random_mb_data(file15_path, 0, 100) subprocess.check_call(["sync"]) subprocess.check_call(["rm", file14_path, file15_path]) subprocess.check_call(["sync"]) volume = client.by_id_volume(test_volume_name) snapshots = volume.snapshotList() snap_map = {} for snap in snapshots: snap_map[snap.name] = snap assert snap_map[snap14_origin.name].name == snap14_origin.name assert snap_map[snap14_origin.name].removed is False assert snap_map[snap14_origin.name].size == snap14_origin.size assert snap_map[snap14_origin.name].parent == snap12_origin.name assert snap_map[VOLUME_HEAD_NAME].name == VOLUME_HEAD_NAME assert int(snap_map[VOLUME_HEAD_NAME].size) >= 100 * Mi assert snap_map[VOLUME_HEAD_NAME].parent == snap14_origin.name volume_head2 = snap_map[VOLUME_HEAD_NAME] # Sometimes the current mounting will lose track of the removed file # mapping info. It's better to remount the filesystem before trim. subprocess.check_call(umount_cmd) subprocess.check_call(mount_cmd) subprocess.check_call(findmnt_cmd) # Step 12. Verify the snapshots after the filesystem trim. subprocess.check_call(trim_cmd) volume = client.by_id_volume(test_volume_name) snapshots = volume.snapshotList() snap_map = {} for snap in snapshots: snap_map[snap.name] = snap assert snap_map[snap14_origin.name].name == snap14_origin.name assert snap_map[snap14_origin.name].removed is False assert int(snap_map[snap14_origin.name].size) <= \ int(snap14_origin.size) - 100 * Mi assert snap_map[snap14_origin.name].parent == snap12_origin.name # Volume head may record some extra meta info after the file removal. # Hence the size it reduced may be smaller than 100Mi assert snap_map[VOLUME_HEAD_NAME].name == VOLUME_HEAD_NAME assert int(snap_map[VOLUME_HEAD_NAME].size) <= \ int(volume_head2.size) - 50 * Mi assert snap_map[VOLUME_HEAD_NAME].parent == snap14_origin.name subprocess.check_call(umount_cmd) volume.detach() volume = wait_for_volume_detached(client, test_volume_name) volume.attach(hostId=host_id, disableFrontend=True) volume = common.wait_for_volume_healthy_no_frontend( client, test_volume_name) volume.snapshotRevert(name=snap21_origin.name) volume.detach() volume = wait_for_volume_detached(client, test_volume_name) volume.attach(hostId=host_id) common.wait_for_volume_healthy(client, test_volume_name) subprocess.check_call(mount_cmd) subprocess.check_call(findmnt_cmd) subprocess.check_call(["sync"]) file0_cksum = get_volume_dev_mb_data_md5sum(file0_path, 0, 100) assert file0_cksum_origin == file0_cksum file21_cksum = get_volume_dev_mb_data_md5sum(file21_path, 0, 100) assert file21_cksum_origin == file21_cksum subprocess.check_call(umount_cmd) subprocess.check_call(["rm", "-rf", mnt_path]) client.delete(volume) wait_for_volume_delete(client, test_volume_name)
Test that the filesystem in the volume can be trimmed correctly (see the sketch after this list).
- Create a volume with option `unmapMarkSnapChainRemoved` enabled, then attach it to the current node.
- Make a filesystem and write file0 into the fs, calculate the checksum, then take snap0.
- Write file21 and calculate the checksum. Then take snap21.
- Unmount, then reattach the volume without frontend. Revert the volume to snap0.
- Reattach and mount the volume.
- Write file11. Then take snap11.
- Write file12. Then take snap12.
- Write file13. Then remove file0, file11, file12, and file13. Verify the snapshots and volume head size are not shrunk.
- Do a filesystem trim (via Longhorn API or cmdline).
  Verify that:
  - snap11 and snap12 are marked as removed.
  - snap11, snap12, and the volume head size are shrunk.
- Disable option `unmapMarkSnapChainRemoved` for the volume.
- Write file14. Then take snap14.
- Write file15. Then remove file14 and file15.
  Verify that:
  - snap14 is not marked as removed and its size is not changed.
  - the volume head size is shrunk.
- Unmount and reattach the volume. Then revert to snap21.
- Reattach and mount the volume. Verify file0 and file21.
- Cleanup.
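The mount/trim mechanics from the steps above, reduced to the raw commands the test shells out to. The device and mount paths are examples; this needs root and a real Longhorn block device:

    import subprocess

    dev_path = "/dev/longhorn/test-vol"   # example device path
    mnt_path = "/tmp/mnt-test-vol"        # example mount point

    subprocess.check_call(["mkfs.ext4", dev_path])
    subprocess.check_call(["mkdir", "-p", mnt_path])
    subprocess.check_call(["mount", dev_path, mnt_path])

    # ... write files, take snapshots, delete the files, then sync ...

    # The test remounts before trimming because the current mount can lose
    # track of removed-file mappings.
    subprocess.check_call(["umount", mnt_path])
    subprocess.check_call(["mount", dev_path, mnt_path])
    subprocess.check_call(["fstrim", mnt_path])  # sends discards down to Longhorn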
def test_hosts(client)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_hosts(client): # NOQA """ Check node name and IP """ hosts = client.list_node() for host in hosts: assert host.name is not None assert host.address is not None host_id = [] for i in range(0, len(hosts)): host_id.append(hosts.data[i].name) host0_from_i = {} for i in range(0, len(hosts)): if len(host0_from_i) == 0: host0_from_i = client.by_id_node(host_id[0]) else: assert host0_from_i.name == \ client.by_id_node(host_id[0]).name assert host0_from_i.address == \ client.by_id_node(host_id[0]).address
Check node name and IP
def test_listing_backup_volume(client, backing_image='')
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA @pytest.mark.skipif('nfs' not in BACKUPSTORE, reason='This test is only applicable for nfs') # NOQA def test_listing_backup_volume(client, backing_image=""): # NOQA """ Test listing backup volumes 1. Create three volumes: `volume1/2/3` 2. Setup NFS backupstore since we can manipulate the content easily 3. Create multiple snapshots for all three volumes 4. Rename `volume1`'s `volume.cfg` to `volume.cfg.tmp` in backupstore 5. List backup volumes. Make sure `volume1` errors out but found other two 6. Restore `volume1`'s `volume.cfg`. 7. Make sure now backup volume `volume1` can be found 8. Delete backups for `volume1/2`, make sure they cannot be found later 9. Corrupt a backup.cfg on volume3 11. Check that the backup is listed with the other backups of volume3 12. Verify that the corrupted backup has Messages of type error 13. Check that backup inspection for the previously corrupted backup fails 14. Delete backups for `volume3`, make sure they cannot be found later """ lht_hostId = get_self_host_id() # create 3 volumes. volume1_name = generate_volume_name() volume2_name = generate_volume_name() volume3_name = generate_volume_name() volume1 = create_and_check_volume(client, volume1_name) volume2 = create_and_check_volume(client, volume2_name) volume3 = create_and_check_volume(client, volume3_name) volume1.attach(hostId=lht_hostId) volume1 = common.wait_for_volume_healthy(client, volume1_name) volume2.attach(hostId=lht_hostId) volume2 = common.wait_for_volume_healthy(client, volume2_name) volume3.attach(hostId=lht_hostId) volume3 = common.wait_for_volume_healthy(client, volume3_name) # we only test NFS here. # Since it is difficult to directly remove volume.cfg from s3 buckets setting_flag = False try: setting = client.by_id_setting(SETTING_BACKUP_TARGET) setting_flag = True except Exception as e: if SETTING_BACKUP_TARGET_NOT_SUPPORTED not in e.error.message: raise e backupstores = common.get_backupstore_url() for backupstore in backupstores: if common.is_backupTarget_nfs(backupstore): updated = False bt_url = "" for _ in range(RETRY_COMMAND_COUNT): nfs_url = backupstore.strip("nfs://") if setting_flag: setting = client.update(setting, value=backupstore) assert setting.value == backupstore setting = client.by_id_setting(SETTING_BACKUP_TARGET) bt_url = setting.value else: set_backupstore_url(client, backupstore) bt_url = backupstore_get_backup_target(client) if "nfs" in bt_url: updated = True break time.sleep(RETRY_INTERVAL) assert updated _, _, snap1, _ = create_backup(client, volume1_name) _, _, snap2, _ = create_backup(client, volume2_name) _, _, snap3, _ = create_backup(client, volume3_name) subprocess.check_output(["sync"]) _, _, snap4, _ = create_backup(client, volume3_name) subprocess.check_output(["sync"]) _, _, snap5, _ = create_backup(client, volume3_name) subprocess.check_output(["sync"]) # invalidate backup volume 1 by renaming volume.cfg to volume.cfg.tmp cmd = ["mkdir", "-p", "/mnt/nfs"] subprocess.check_output(cmd) cmd = ["mount", "-t", "nfs4", nfs_url, "/mnt/nfs"] subprocess.check_output(cmd) cmd = ["find", "/mnt/nfs", "-type", "d", "-name", volume1_name] volume1_backup_volume_path = \ subprocess.check_output(cmd).strip().decode('utf-8') cmd = ["find", volume1_backup_volume_path, "-name", "volume.cfg"] volume1_backup_volume_cfg_path = \ subprocess.check_output(cmd).strip().decode('utf-8') cmd = ["mv", volume1_backup_volume_cfg_path, volume1_backup_volume_cfg_path + ".tmp"] subprocess.check_output(cmd) 
subprocess.check_output(["sync"]) found1 = True found2 = found3 = False for i in range(RETRY_COUNTS): bvs = client.list_backupVolume() if volume1_name not in bvs: found1 = False if volume2_name in bvs: found2 = True if volume3_name in bvs: found3 = True if not found1 & found2 & found3: break time.sleep(RETRY_INTERVAL) assert not found1 & found2 & found3 cmd = ["mv", volume1_backup_volume_cfg_path + ".tmp", volume1_backup_volume_cfg_path] subprocess.check_output(cmd) subprocess.check_output(["sync"]) bv1, b1 = find_backup(client, volume1_name, snap1.name) delete_backup(client, bv1, b1.name) bv2, b2 = find_backup(client, volume2_name, snap2.name) delete_backup(client, bv2, b2.name) # corrupt backup for snap4 bv4, b4 = find_backup(client, volume3_name, snap4.name) b4_cfg_name = "backup_" + b4["name"] + ".cfg" cmd = ["find", "/mnt/nfs", "-type", "d", "-name", volume3_name] v3_backup_path = subprocess.check_output(cmd).strip().decode('utf-8') b4_cfg_path = os.path.join(v3_backup_path, "backups", b4_cfg_name) assert os.path.exists(b4_cfg_path) b4_tmp_cfg_path = os.path.join(v3_backup_path, b4_cfg_name) os.rename(b4_cfg_path, b4_tmp_cfg_path) assert os.path.exists(b4_tmp_cfg_path) corrupt_backup = open(b4_cfg_path, "w") assert corrupt_backup assert corrupt_backup.write("{corrupt: definitely") > 0 corrupt_backup.close() subprocess.check_output(["sync"]) # a corrupt backup cannot provide information about the snapshot found = True for i in range(RETRY_COMMAND_COUNT): if b4["name"] not in bv4.backupList(): found = False break assert not found # cleanup b4 os.remove(b4_cfg_path) os.rename(b4_tmp_cfg_path, b4_cfg_path) subprocess.check_output(["sync"]) bv3, b3 = find_backup(client, volume3_name, snap3.name) delete_backup(client, bv3, b3.name) bv4, b4 = find_backup(client, volume3_name, snap4.name) delete_backup(client, bv3, b4.name) bv5, b5 = find_backup(client, volume3_name, snap5.name) delete_backup(client, bv3, b5.name) delete_backup_volume(client, bv3.name) common.wait_for_backup_volume_delete(client, bv3.name) volume1.detach() volume1 = common.wait_for_volume_detached(client, volume1_name) client.delete(volume1) wait_for_volume_delete(client, volume1_name) volume2.detach() volume2 = common.wait_for_volume_detached(client, volume2_name) client.delete(volume2) wait_for_volume_delete(client, volume2_name) volume3.detach() volume3 = common.wait_for_volume_detached(client, volume3_name) client.delete(volume3) wait_for_volume_delete(client, volume3_name) volumes = client.list_volume() assert len(volumes) == 0
Test listing backup volumes
- Create three volumes: `volume1/2/3`.
- Set up an NFS backupstore, since we can manipulate its content easily.
- Create multiple snapshots for all three volumes.
- Rename `volume1`'s `volume.cfg` to `volume.cfg.tmp` in the backupstore (see the sketch after this list).
- List backup volumes. Make sure `volume1` errors out but the other two are found.
- Restore `volume1`'s `volume.cfg`.
- Make sure backup volume `volume1` can now be found.
- Delete the backups for `volume1/2`; make sure they cannot be found later.
- Corrupt a backup.cfg on volume3.
- Check that the backup is listed with the other backups of volume3.
- Verify that the corrupted backup has Messages of type error.
- Check that backup inspection for the previously corrupted backup fails.
- Delete the backups for `volume3`; make sure they cannot be found later.
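The backupstore manipulation from the rename step, reduced to the raw commands the test runs against the NFS mount. The export URL is an example; volume1_name stands for the generated volume name:

    import subprocess

    nfs_url = "longhorn-test-nfs-svc.default:/opt/backupstore"  # example export
    volume1_name = "vol-example"                                # example name

    subprocess.check_output(["mkdir", "-p", "/mnt/nfs"])
    subprocess.check_output(["mount", "-t", "nfs4", nfs_url, "/mnt/nfs"])

    # Find this volume's volume.cfg and move it aside; listing the backup
    # volume now errors out until the file is moved back.
    path = subprocess.check_output(
        ["find", "/mnt/nfs", "-type", "d", "-name", volume1_name]
    ).strip().decode("utf-8")
    cfg = subprocess.check_output(
        ["find", path, "-name", "volume.cfg"]
    ).strip().decode("utf-8")
    subprocess.check_output(["mv", cfg, cfg + ".tmp"])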
def test_multiple_volumes_creation_with_degraded_availability(set_random_backupstore, client, core_api, apps_api, storage_class, statefulset)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_multiple_volumes_creation_with_degraded_availability(set_random_backupstore, client, core_api, apps_api, storage_class, statefulset): # NOQA """ Scenario: verify multiple volumes with degraded availability can be created, attached, detached, and deleted at nearly the same time. Given new StorageClass created with `numberOfReplicas=5`. When set `allow-volume-creation-with-degraded-availability` to `True`. And deploy this StatefulSet: https://github.com/longhorn/longhorn/issues/2073#issuecomment-742948726 Then all 10 volumes are healthy in 1 minute. When delete the StatefulSet. then all 10 volumes are detached in 1 minute. When find and delete the PVC of the 10 volumes. Then all 10 volumes are deleted in 1 minute. """ storage_class['parameters']['numberOfReplicas'] = "5" create_storage_class(storage_class) common.update_setting(client, common.SETTING_DEGRADED_AVAILABILITY, "true") sts_spec = statefulset['spec'] sts_spec['podManagementPolicy'] = "Parallel" sts_spec['replicas'] = 10 sts_spec['volumeClaimTemplates'][0]['spec']['storageClassName'] = \ storage_class['metadata']['name'] statefulset['spec'] = sts_spec common.create_and_wait_statefulset(statefulset) pod_list = common.get_statefulset_pod_info(core_api, statefulset) retry_counts = int(60 / RETRY_INTERVAL) common.wait_for_pods_volume_state( client, pod_list, common.VOLUME_FIELD_ROBUSTNESS, common.VOLUME_ROBUSTNESS_HEALTHY, retry_counts=retry_counts ) apps_api.delete_namespaced_stateful_set( name=statefulset['metadata']['name'], namespace=statefulset['metadata']['namespace'], body=k8sclient.V1DeleteOptions() ) common.wait_for_pods_volume_state( client, pod_list, common.VOLUME_FIELD_STATE, common.VOLUME_STATE_DETACHED, retry_counts=retry_counts ) for p in pod_list: common.delete_and_wait_pvc(core_api, p['pvc_name'], retry_counts=retry_counts) common.wait_for_pods_volume_delete(client, pod_list, retry_counts=retry_counts)
Scenario: verify multiple volumes with degraded availability can be created, attached, detached, and deleted at nearly the same time.
Given a new StorageClass created with `numberOfReplicas=5`.
When `allow-volume-creation-with-degraded-availability` is set to `True`,
and this StatefulSet is deployed: https://github.com/longhorn/longhorn/issues/2073#issuecomment-742948726 (the spec tweaks are sketched below),
Then all 10 volumes are healthy within 1 minute.
When the StatefulSet is deleted,
Then all 10 volumes are detached within 1 minute.
When the PVCs of the 10 volumes are found and deleted,
Then all 10 volumes are deleted within 1 minute.
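The two knobs this scenario turns, condensed from the test body (fixture-dependent sketch): parallel pod management so all 10 pods start at once, and the degraded-availability setting so volumes attach before all 5 replicas are scheduled:

    statefulset['spec']['podManagementPolicy'] = "Parallel"  # start all pods at once
    statefulset['spec']['replicas'] = 10
    statefulset['spec']['volumeClaimTemplates'][0]['spec']['storageClassName'] = \
        storage_class['metadata']['name']

    common.update_setting(client, common.SETTING_DEGRADED_AVAILABILITY, "true")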
def test_pvc_storage_class_name_from_backup_volume(set_random_backupstore, core_api, client, volume_name, pvc_name, pvc, pod_make, storage_class)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_pvc_storage_class_name_from_backup_volume(set_random_backupstore, # NOQA core_api, client, volume_name, # NOQA pvc_name, pvc, pod_make, # NOQA storage_class): # NOQA """ Test the storageClasName of the restored volume's PV/PVC should be from the backup volume Given - Create a new StorageClass ``` kind: StorageClass apiVersion: storage.k8s.io/v1 metadata: name: longhorn-test provisioner: driver.longhorn.io allowVolumeExpansion: true reclaimPolicy: Delete volumeBindingMode: Immediate parameters: numberOfReplicas: "3" ``` - Create a PVC to use this SC ``` apiVersion: v1 kind: PersistentVolumeClaim metadata: name: test-pvc spec: accessModes: - ReadWriteOnce storageClassName: longhorn-test resources: requests: storage: 300Mi ``` - Attach the Volume and write some data When - Backup the Volume Then - the backupvolume's status.storageClassName should be longhorn-test When - Restore the backup to a new volume - Create PV/PVC from the new volume with create new PVC option Then - The new PVC's storageClassName should still be longhorn-test - Verify the restored data is the same as original one """ volume_size = str(300 * Mi) create_storage_class(storage_class) pod_name = "pod-" + pvc_name pvc['metadata']['name'] = pvc_name pvc['spec']['storageClassName'] = storage_class['metadata']['name'] pvc['spec']['resources']['requests']['storage'] = volume_size common.create_pvc(pvc) pv = common.wait_and_get_pv_for_pvc(core_api, pvc_name) assert pv.status.phase == "Bound" test_pod = pod_make(pod_name) test_pod['metadata']['name'] = pod_name test_pod['spec']['volumes'] = [{ 'name': test_pod['spec']['containers'][0]['volumeMounts'][0]['name'], 'persistentVolumeClaim': {'claimName': pvc_name}, }] create_and_wait_pod(core_api, test_pod) test_data = generate_random_data(VOLUME_RWTEST_SIZE) write_pod_volume_data(core_api, pod_name, test_data) volume_name = pv.spec.csi.volume_handle volume_id = client.by_id_volume(volume_name) snapshot = volume_id.snapshotCRCreate() volume_id.snapshotBackup(name=snapshot.name) wait_for_backup_completion(client, volume_name, snapshot.name) # in nfs backupstore, bv.storageClassName sometimes were empty # due to timing issue for i in range(RETRY_COMMAND_COUNT): bv, b = find_backup(client, volume_name, snapshot.name) if bv.storageClassName != "": break time.sleep(RETRY_INTERVAL) assert bv.storageClassName == storage_class['metadata']['name'] restore_name = generate_volume_name() volume = client.create_volume(name=restore_name, size=volume_size, numberOfReplicas=3, fromBackup=b.url, dataEngine=DATA_ENGINE) volume = common.wait_for_volume_restoration_completed(client, restore_name) volume = common.wait_for_volume_detached(client, restore_name) assert volume.name == restore_name assert volume.size == volume_size assert volume.numberOfReplicas == 3 assert volume.state == "detached" create_pv_for_volume(client, core_api, volume, restore_name) create_pvc_for_volume(client, core_api, volume, restore_name) claim = core_api.\ read_namespaced_persistent_volume_claim(name=restore_name, namespace='default') assert claim.spec.storage_class_name == storage_class['metadata']['name'] backup_pod = pod_make(name="backup-pod") restore_volume_pod_name = "pod-" + restore_name backup_pod['metadata']['name'] = restore_volume_pod_name backup_pod['spec']['volumes'] = [{ 'name': backup_pod['spec']['containers'][0]['volumeMounts'][0]['name'], # NOQA 'persistentVolumeClaim': { 'claimName': restore_name, }, }] create_and_wait_pod(core_api, backup_pod) resp = 
read_volume_data(core_api, restore_volume_pod_name) assert resp == test_data
Test that the storageClassName of the restored volume's PV/PVC comes from the backup volume
Given
- Create a new StorageClass
```
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-test
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
```
- Create a PVC to use this SC
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-test
  resources:
    requests:
      storage: 300Mi
```
- Attach the Volume and write some data
When
- Back up the Volume
Then
- The backup volume's status.storageClassName should be longhorn-test
When
- Restore the backup to a new volume
- Create PV/PVC from the new volume with the "create new PVC" option
Then
- The new PVC's storageClassName should still be longhorn-test
- Verify the restored data is the same as the original
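On NFS backupstores the test tolerates a timing quirk: `bv.storageClassName` can briefly be empty right after the backup completes, so the assertion is preceded by a poll. A minimal sketch of that retry pattern, lifted from the source above (all names come from the test context):
```
import time

# Poll until the backup volume reports a storageClassName; on NFS
# backupstores the field can be empty for a short while.
for _ in range(RETRY_COMMAND_COUNT):
    bv, b = find_backup(client, volume_name, snapshot.name)
    if bv.storageClassName != "":
        break
    time.sleep(RETRY_INTERVAL)
assert bv.storageClassName == storage_class['metadata']['name']
```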
def test_restore_basic(set_random_backupstore, client, core_api, volume_name, pod)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_restore_basic(set_random_backupstore, client, core_api, volume_name, pod): # NOQA """ Steps: 1. Create a volume and attach to a pod. 2. Write some data into the volume and compute the checksum m1. 3. Create a backup say b1. 4. Write some more data into the volume and compute the checksum m2. 5. Create a backup say b2. 6. Delete all the data from the volume. 7. Write some more data into the volume and compute the checksum m3. 8. Create a backup say b3. 9. Restore backup b1 and verify the data with m1. 10. Restore backup b2 and verify the data with m1 and m2. 11. Restore backup b3 and verify the data with m3. 12. Delete the backup b2. 13. restore the backup b3 and verify the data with m3. """ test_pv_name = "pv-" + volume_name test_pvc_name = "pvc-" + volume_name test_pod_name = "pod-" + volume_name volume = create_and_check_volume(client, volume_name, size=str(1 * Gi)) create_pv_for_volume(client, core_api, volume, test_pv_name) create_pvc_for_volume(client, core_api, volume, test_pvc_name) pod['metadata']['name'] = test_pod_name pod['spec']['volumes'] = [{ 'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'], 'persistentVolumeClaim': { 'claimName': test_pvc_name, }, }] create_and_wait_pod(core_api, pod) wait_for_volume_healthy(client, volume_name) # Write 1st data and take backup backup_volume, backup1, data_checksum_1 = \ create_backup_from_volume_attached_to_pod(client, core_api, volume_name, test_pod_name, data_path='/data/test1') # Write 2nd data and take backup backup_volume, backup2, data_checksum_2 = \ create_backup_from_volume_attached_to_pod(client, core_api, volume_name, test_pod_name, data_path='/data/test2') # Remove test1 and test2 files command = 'rm /data/test1 /data/test2' exec_command_in_pod(core_api, command, test_pod_name, 'default') # Write 3rd data and take backup backup_volume, backup3, data_checksum_3 = \ create_backup_from_volume_attached_to_pod(client, core_api, volume_name, test_pod_name, data_path='/data/test3') delete_and_wait_pod(core_api, test_pod_name) # restore 1st backup and assert with data_checksum_1 restored_data_checksum1, output, restore_pod_name = \ restore_backup_and_get_data_checksum(client, core_api, backup1, pod, file_name='test1') assert data_checksum_1 == restored_data_checksum1['test1'] delete_and_wait_pod(core_api, restore_pod_name) # restore 2nd backup and assert with data_checksum_1 and data_checksum_2 restored_data_checksum2, output, restore_pod_name = \ restore_backup_and_get_data_checksum(client, core_api, backup2, pod) assert data_checksum_1 == restored_data_checksum2['test1'] assert data_checksum_2 == restored_data_checksum2['test2'] delete_and_wait_pod(core_api, restore_pod_name) # restore 3rd backup and assert with data_checksum_3 restored_data_checksum3, output, restore_pod_name = \ restore_backup_and_get_data_checksum(client, core_api, backup3, pod, file_name='test3', command=r"ls /data | grep 'test1\|test2'") # NOQA assert data_checksum_3 == restored_data_checksum3['test3'] assert output == '' delete_and_wait_pod(core_api, restore_pod_name) # Delete the 2nd backup delete_backup(client, backup_volume, backup2.name) # restore 3rd backup again restored_data_checksum3, output, restore_pod_name = \ restore_backup_and_get_data_checksum(client, core_api, backup3, pod, file_name='test3', command=r"ls /data | grep 'test1\|test2'") # NOQA assert data_checksum_3 == restored_data_checksum3['test3'] assert output == ''
Steps:
1. Create a volume and attach it to a pod.
2. Write some data into the volume and compute the checksum m1.
3. Create a backup b1.
4. Write some more data into the volume and compute the checksum m2.
5. Create a backup b2.
6. Delete all the data from the volume.
7. Write some more data into the volume and compute the checksum m3.
8. Create a backup b3.
9. Restore backup b1 and verify the data against m1.
10. Restore backup b2 and verify the data against m1 and m2.
11. Restore backup b3 and verify the data against m3.
12. Delete the backup b2.
13. Restore the backup b3 and verify the data against m3.
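A sketch of the restore-and-verify step (step 10), using the `restore_backup_and_get_data_checksum` helper exactly as the source above does; `m1` and `m2` stand for the checksums captured in steps 2 and 4:
```
# Restore backup b2 into a fresh pod and compare per-file checksums
# against the values recorded when the data was written.
checksums, _, restore_pod_name = restore_backup_and_get_data_checksum(
    client, core_api, backup2, pod)
assert checksums['test1'] == m1
assert checksums['test2'] == m2
delete_and_wait_pod(core_api, restore_pod_name)
```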
def test_restore_inc(set_random_backupstore, client, core_api, volume_name, pod)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_restore_inc(set_random_backupstore, client, core_api, volume_name, pod): # NOQA """ Test restore from disaster recovery volume (incremental restore) Run test against all the backupstores 1. Create a volume and attach to the current node 2. Generate `data0`, write to the volume, make a backup `backup0` 3. Create three DR(standby) volumes from the backup: `sb_volume0/1/2` 4. Wait for all three DR volumes to start the initial restoration 5. Verify DR volumes's `lastBackup` is `backup0` 6. Verify snapshot/pv/pvc/change backup target are not allowed as long as the DR volume exists 7. Activate standby `sb_volume0` and attach it to check the volume data 8. Generate `data1` and write to the original volume and create `backup1` 9. Make sure `sb_volume1`'s `lastBackup` field has been updated to `backup1` 10. Wait for `sb_volume1` to finish incremental restoration then activate 11. Attach and check `sb_volume1`'s data 12. Generate `data2` and write to the original volume and create `backup2` 13. Make sure `sb_volume2`'s `lastBackup` field has been updated to `backup1` 14. Wait for `sb_volume2` to finish incremental restoration then activate 15. Attach and check `sb_volume2`'s data 16. Create PV, PVC and Pod to use `sb_volume2`, check PV/PVC/POD are good FIXME: Step 16 works because the disk will be treated as a unformatted disk """ restore_inc_test(client, core_api, volume_name, pod)
Test restore from a disaster recovery volume (incremental restore)
Run the test against all the backupstores
1. Create a volume and attach it to the current node
2. Generate `data0`, write it to the volume, and make a backup `backup0`
3. Create three DR (standby) volumes from the backup: `sb_volume0/1/2`
4. Wait for all three DR volumes to start the initial restoration
5. Verify the DR volumes' `lastBackup` is `backup0`
6. Verify snapshot/PV/PVC creation and backup target changes are not allowed as long as the DR volumes exist
7. Activate standby `sb_volume0` and attach it to check the volume data
8. Generate `data1`, write it to the original volume, and create `backup1`
9. Make sure `sb_volume1`'s `lastBackup` field has been updated to `backup1`
10. Wait for `sb_volume1` to finish the incremental restoration, then activate it
11. Attach and check `sb_volume1`'s data
12. Generate `data2`, write it to the original volume, and create `backup2`
13. Make sure `sb_volume2`'s `lastBackup` field has been updated to `backup2`
14. Wait for `sb_volume2` to finish the incremental restoration, then activate it
15. Attach and check `sb_volume2`'s data
16. Create PV, PVC, and Pod to use `sb_volume2`; check PV/PVC/Pod are good
FIXME: Step 16 works because the disk will be treated as an unformatted disk
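For reference, creating a DR (standby) volume from a backup looks like the call below, taken from the offline-expansion test further down; the volume name here is illustrative:
```
# standby=True plus an empty frontend keeps the volume in
# restore-only mode until it is explicitly activated.
client.create_volume(name="sb-volume-0", size=SIZE, numberOfReplicas=2,
                     fromBackup=backup0.url, frontend="", standby=True,
                     dataEngine=DATA_ENGINE)
wait_for_backup_restore_completed(client, "sb-volume-0", backup0.name)
activate_standby_volume(client, "sb-volume-0")
```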
def test_restore_inc_with_offline_expansion(set_random_backupstore, client, core_api, volume_name, pod)
-
Expand source code
@pytest.mark.coretest # NOQA def test_restore_inc_with_offline_expansion(set_random_backupstore, client, core_api, volume_name, pod): # NOQA """ Test restore from disaster recovery volume with volume offline expansion Run test against a random backupstores 1. Create a volume and attach to the current node 2. Generate `data0`, write to the volume, make a backup `backup0` 3. Create three DR(standby) volumes from the backup: `dr_volume0/1/2` 4. Wait for all three DR volumes to start the initial restoration 5. Verify DR volumes's `lastBackup` is `backup0` 6. Verify snapshot/pv/pvc/change backup target are not allowed as long as the DR volume exists 7. Activate standby `dr_volume0` and attach it to check the volume data 8. Expand the original volume. Make sure the expansion is successful. 8. Generate `data1` and write to the original volume and create `backup1` 9. Make sure `dr_volume1`'s `lastBackup` field has been updated to `backup1` 10. Activate `dr_volume1` and check data `data0` and `data1` 11. Generate `data2` and write to the original volume after original SIZE 12. Create `backup2` 13. Wait for `dr_volume2` to finish expansion, show `backup2` as latest 14. Activate `dr_volume2` and verify `data2` 15. Detach `dr_volume2` 16. Create PV, PVC and Pod to use `sb_volume2`, check PV/PVC/POD are good FIXME: Step 16 works because the disk will be treated as a unformatted disk """ lht_host_id = get_self_host_id() std_volume = create_and_check_volume(client, volume_name, num_of_replicas=2, size=SIZE) std_volume.attach(hostId=lht_host_id) std_volume = common.wait_for_volume_healthy(client, volume_name) with pytest.raises(Exception) as e: std_volume.activate(frontend=VOLUME_FRONTEND_BLOCKDEV) assert "already in active mode" in str(e.value) data0 = {'pos': 0, 'len': VOLUME_RWTEST_SIZE, 'content': common.generate_random_data(VOLUME_RWTEST_SIZE)} bv, backup0, _, data0 = create_backup( client, volume_name, data0) dr_volume0_name = "dr-expand-0-" + volume_name dr_volume1_name = "dr-expand-1-" + volume_name dr_volume2_name = "dr-expand-2-" + volume_name client.create_volume(name=dr_volume0_name, size=SIZE, numberOfReplicas=2, fromBackup=backup0.url, frontend="", standby=True, dataEngine=DATA_ENGINE) client.create_volume(name=dr_volume1_name, size=SIZE, numberOfReplicas=2, fromBackup=backup0.url, frontend="", standby=True, dataEngine=DATA_ENGINE) client.create_volume(name=dr_volume2_name, size=SIZE, numberOfReplicas=2, fromBackup=backup0.url, frontend="", standby=True, dataEngine=DATA_ENGINE) wait_for_backup_restore_completed(client, dr_volume0_name, backup0.name) wait_for_backup_restore_completed(client, dr_volume1_name, backup0.name) wait_for_backup_restore_completed(client, dr_volume2_name, backup0.name) dr_volume0 = common.wait_for_volume_healthy_no_frontend(client, dr_volume0_name) dr_volume1 = common.wait_for_volume_healthy_no_frontend(client, dr_volume1_name) dr_volume2 = common.wait_for_volume_healthy_no_frontend(client, dr_volume2_name) for i in range(RETRY_COUNTS): client.list_backupVolume() dr_volume0 = client.by_id_volume(dr_volume0_name) dr_volume1 = client.by_id_volume(dr_volume1_name) dr_volume2 = client.by_id_volume(dr_volume2_name) dr_engine0 = get_volume_engine(dr_volume0) dr_engine1 = get_volume_engine(dr_volume1) dr_engine2 = get_volume_engine(dr_volume2) if dr_volume0.restoreRequired is False or \ dr_volume1.restoreRequired is False or \ dr_volume2.restoreRequired is False or \ not dr_engine0.lastRestoredBackup or \ not dr_engine1.lastRestoredBackup or \ not 
dr_engine2.lastRestoredBackup: time.sleep(RETRY_INTERVAL) else: break assert dr_volume0.standby is True assert dr_volume0.lastBackup == backup0.name assert dr_volume0.frontend == "" assert dr_volume0.restoreRequired is True dr_engine0 = get_volume_engine(dr_volume0) assert dr_engine0.lastRestoredBackup == backup0.name assert dr_engine0.requestedBackupRestore == backup0.name assert dr_volume1.standby is True assert dr_volume1.lastBackup == backup0.name assert dr_volume1.frontend == "" assert dr_volume1.restoreRequired is True dr_engine1 = get_volume_engine(dr_volume1) assert dr_engine1.lastRestoredBackup == backup0.name assert dr_engine1.requestedBackupRestore == backup0.name assert dr_volume2.standby is True assert dr_volume2.lastBackup == backup0.name assert dr_volume2.frontend == "" assert dr_volume2.restoreRequired is True dr_engine2 = get_volume_engine(dr_volume2) assert dr_engine2.lastRestoredBackup == backup0.name assert dr_engine2.requestedBackupRestore == backup0.name dr0_snaps = dr_volume0.snapshotList() assert len(dr0_snaps) == 2 activate_standby_volume(client, dr_volume0_name) dr_volume0 = client.by_id_volume(dr_volume0_name) dr_volume0.attach(hostId=lht_host_id) dr_volume0 = common.wait_for_volume_healthy(client, dr_volume0_name) check_volume_data(dr_volume0, data0, False) offline_expand_attached_volume(client, volume_name) std_volume = client.by_id_volume(volume_name) check_block_device_size(std_volume, int(EXPAND_SIZE)) data1 = {'pos': VOLUME_RWTEST_SIZE, 'len': VOLUME_RWTEST_SIZE, 'content': common.generate_random_data(VOLUME_RWTEST_SIZE)} bv, backup1, _, data1 = create_backup( client, volume_name, data1) check_volume_last_backup(client, dr_volume1_name, backup1.name) activate_standby_volume(client, dr_volume1_name) dr_volume1 = client.by_id_volume(dr_volume1_name) dr_volume1.attach(hostId=lht_host_id) dr_volume1 = common.wait_for_volume_healthy(client, dr_volume1_name) check_volume_data(dr_volume1, data0, False) check_volume_data(dr_volume1, data1, False) data2 = {'pos': int(SIZE), 'len': VOLUME_RWTEST_SIZE, 'content': common.generate_random_data(VOLUME_RWTEST_SIZE)} bv, backup2, _, data2 = create_backup( client, volume_name, data2) assert backup2.volumeSize == EXPAND_SIZE wait_for_dr_volume_expansion(client, dr_volume2_name, EXPAND_SIZE) check_volume_last_backup(client, dr_volume2_name, backup2.name) activate_standby_volume(client, dr_volume2_name) dr_volume2 = client.by_id_volume(dr_volume2_name) dr_volume2.attach(hostId=lht_host_id) dr_volume2 = common.wait_for_volume_healthy(client, dr_volume2_name) check_volume_data(dr_volume2, data2) # allocated this active volume to a pod dr_volume2.detach() dr_volume2 = common.wait_for_volume_detached(client, dr_volume2_name) create_pv_for_volume(client, core_api, dr_volume2, dr_volume2_name) create_pvc_for_volume(client, core_api, dr_volume2, dr_volume2_name) dr_volume2_pod_name = "pod-" + dr_volume2_name pod['metadata']['name'] = dr_volume2_pod_name pod['spec']['volumes'] = [{ 'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'], 'persistentVolumeClaim': { 'claimName': dr_volume2_name, }, }] create_and_wait_pod(core_api, pod) dr_volume2 = client.by_id_volume(dr_volume2_name) k_status = dr_volume2.kubernetesStatus workloads = k_status.workloadsStatus assert k_status.pvName == dr_volume2_name assert k_status.pvStatus == 'Bound' assert len(workloads) == 1 for i in range(RETRY_COUNTS): if workloads[0].podStatus == 'Running': break time.sleep(RETRY_INTERVAL) dr_volume2 = client.by_id_volume(dr_volume2_name) k_status = 
dr_volume2.kubernetesStatus workloads = k_status.workloadsStatus assert len(workloads) == 1 assert workloads[0].podName == dr_volume2_pod_name assert workloads[0].podStatus == 'Running' assert not workloads[0].workloadName assert not workloads[0].workloadType assert k_status.namespace == 'default' assert k_status.pvcName == dr_volume2_name assert not k_status.lastPVCRefAt assert not k_status.lastPodRefAt delete_and_wait_pod(core_api, dr_volume2_pod_name) delete_and_wait_pvc(core_api, dr_volume2_name) delete_and_wait_pv(core_api, dr_volume2_name) # cleanup std_volume.detach() dr_volume0.detach() dr_volume1.detach() std_volume = common.wait_for_volume_detached(client, volume_name) dr_volume0 = common.wait_for_volume_detached(client, dr_volume0_name) dr_volume1 = common.wait_for_volume_detached(client, dr_volume1_name) dr_volume2 = common.wait_for_volume_detached(client, dr_volume2_name) backupstore_cleanup(client) client.delete(std_volume) client.delete(dr_volume0) client.delete(dr_volume1) client.delete(dr_volume2) wait_for_volume_delete(client, volume_name) wait_for_volume_delete(client, dr_volume0_name) wait_for_volume_delete(client, dr_volume1_name) wait_for_volume_delete(client, dr_volume2_name) volumes = client.list_volume().data assert len(volumes) == 0
Test restore from a disaster recovery volume with volume offline expansion
Run the test against a random backupstore
1. Create a volume and attach it to the current node
2. Generate `data0`, write it to the volume, and make a backup `backup0`
3. Create three DR (standby) volumes from the backup: `dr_volume0/1/2`
4. Wait for all three DR volumes to start the initial restoration
5. Verify the DR volumes' `lastBackup` is `backup0`
6. Verify snapshot/PV/PVC creation and backup target changes are not allowed as long as the DR volumes exist
7. Activate standby `dr_volume0` and attach it to check the volume data
8. Expand the original volume and make sure the expansion succeeds
9. Generate `data1`, write it to the original volume, and create `backup1`
10. Make sure `dr_volume1`'s `lastBackup` field has been updated to `backup1`
11. Activate `dr_volume1` and check data `data0` and `data1`
12. Generate `data2` and write it to the original volume past the original SIZE
13. Create `backup2`
14. Wait for `dr_volume2` to finish the expansion and show `backup2` as the latest backup
15. Activate `dr_volume2` and verify `data2`
16. Detach `dr_volume2`
17. Create PV, PVC, and Pod to use `dr_volume2`; check PV/PVC/Pod are good
FIXME: Step 17 works because the disk will be treated as an unformatted disk
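The expansion-related steps (13 through 15) boil down to this sequence from the source above: back up data written past the original size, wait for the DR volume to grow, then activate it:
```
# data2 sits beyond the original SIZE, so backup2 records the
# expanded volume size and the DR volume must expand to restore it.
bv, backup2, _, data2 = create_backup(client, volume_name, data2)
assert backup2.volumeSize == EXPAND_SIZE
wait_for_dr_volume_expansion(client, dr_volume2_name, EXPAND_SIZE)
check_volume_last_backup(client, dr_volume2_name, backup2.name)
activate_standby_volume(client, dr_volume2_name)
```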
def test_running_volume_with_scheduling_failure(client, core_api, volume_name, pod)
-
Expand source code
@pytest.mark.coretest # NOQA def test_running_volume_with_scheduling_failure( client, core_api, volume_name, pod): # NOQA """ Test if the running volume still work fine when there is a scheduling failed replica Prerequisite: Setting "soft anti-affinity" is false. Setting "replica-replenishment-wait-interval" is 0 1. Create a volume, then create the corresponding PV, PVC and Pod. 2. Wait for the pod running and the volume healthy. 3. Write data to the pod volume and get the md5sum. 4. Disable the scheduling for a node contains a running replica. 5. Crash the replica on the scheduling disabled node for the volume. 6. Wait for the scheduling failure. 7. Verify: 7.1. `volume.ready == True`. 7.2. `volume.conditions[scheduled].status == False`. 7.3. the volume is Degraded. 7.4. the new replica cannot be created. 8. Write more data to the volume and get the md5sum 9. Delete the pod and wait for the volume detached. 10. Verify: 10.1. `volume.ready == True`. 10.2. `volume.conditions[scheduled].status == True` 11. Recreate a new pod for the volume and wait for the pod running. 12. Validate the volume content, then check if data writing looks fine. 13. Clean up pod, PVC, and PV. """ replica_node_soft_anti_affinity_setting = \ client.by_id_setting(SETTING_REPLICA_NODE_SOFT_ANTI_AFFINITY) client.update(replica_node_soft_anti_affinity_setting, value="false") replenish_wait_setting = \ client.by_id_setting(SETTING_REPLICA_REPLENISHMENT_WAIT_INTERVAL) client.update(replenish_wait_setting, value="0") data_path1 = "/data/test1" test_pv_name = "pv-" + volume_name test_pvc_name = "pvc-" + volume_name test_pod_name = "pod-" + volume_name volume = create_and_check_volume(client, volume_name, size=str(1 * Gi)) create_pv_for_volume(client, core_api, volume, test_pv_name) create_pvc_for_volume(client, core_api, volume, test_pvc_name) pod['metadata']['name'] = test_pod_name pod['spec']['volumes'] = [{ 'name': pod['spec']['containers'][0]['volumeMounts'][0]['name'], 'persistentVolumeClaim': { 'claimName': test_pvc_name, }, }] create_and_wait_pod(core_api, pod) wait_for_volume_healthy(client, volume_name) write_pod_volume_random_data(core_api, test_pod_name, data_path1, DATA_SIZE_IN_MB_1) original_md5sum1 = get_pod_data_md5sum(core_api, test_pod_name, data_path1) volume = client.by_id_volume(volume_name) existing_replicas = {} for r in volume.replicas: existing_replicas[r.name] = r node = client.by_id_node(volume.replicas[0].hostId) node = set_node_scheduling(client, node, allowScheduling=False) common.wait_for_node_update(client, node.id, "allowScheduling", False) crash_replica_processes(client, core_api, volume_name, replicas=[volume.replicas[0]], wait_to_fail=False) # Wait for scheduling failure. # It means the new replica is created but fails to be scheduled. 
wait_for_volume_condition_scheduled(client, volume_name, "status", CONDITION_STATUS_FALSE) wait_for_volume_condition_scheduled(client, volume_name, "reason", CONDITION_REASON_SCHEDULING_FAILURE) volume = wait_for_volume_degraded(client, volume_name) try: common.wait_for_volume_replica_count(client, volume_name, 4) raise Exception("No new replica should be created") except AssertionError: pass data_path2 = "/data/test2" write_pod_volume_random_data(core_api, test_pod_name, data_path2, DATA_SIZE_IN_MB_1) original_md5sum2 = get_pod_data_md5sum(core_api, test_pod_name, data_path2) delete_and_wait_pod(core_api, test_pod_name) wait_for_volume_detached(client, volume_name) volume = wait_for_volume_condition_scheduled(client, volume_name, "status", CONDITION_STATUS_TRUE) assert volume.ready assert len(volume.replicas) == 3 create_and_wait_pod(core_api, pod) wait_for_volume_degraded(client, volume_name) md5sum1 = get_pod_data_md5sum(core_api, test_pod_name, data_path1) assert md5sum1 == original_md5sum1 md5sum2 = get_pod_data_md5sum(core_api, test_pod_name, data_path2) assert md5sum2 == original_md5sum2 # The data writing is fine data_path3 = "/data/test3" write_pod_volume_random_data(core_api, test_pod_name, data_path3, DATA_SIZE_IN_MB_1) get_pod_data_md5sum(core_api, test_pod_name, data_path3) delete_and_wait_pod(core_api, test_pod_name) delete_and_wait_pvc(core_api, test_pvc_name) delete_and_wait_pv(core_api, test_pv_name)
Test that a running volume still works fine when there is a replica that failed scheduling
Prerequisites: the "soft anti-affinity" setting is false; the "replica-replenishment-wait-interval" setting is 0
1. Create a volume, then create the corresponding PV, PVC, and Pod.
2. Wait for the pod to be running and the volume to be healthy.
3. Write data to the pod volume and get the md5sum.
4. Disable scheduling for a node that contains a running replica.
5. Crash the volume's replica on the scheduling-disabled node.
6. Wait for the scheduling failure.
7. Verify:
7.1. `volume.ready == True`.
7.2. `volume.conditions[scheduled].status == False`.
7.3. The volume is Degraded.
7.4. The new replica cannot be created.
8. Write more data to the volume and get the md5sum.
9. Delete the pod and wait for the volume to detach.
10. Verify:
10.1. `volume.ready == True`.
10.2. `volume.conditions[scheduled].status == True`.
11. Recreate a new pod for the volume and wait for the pod to be running.
12. Validate the volume content, then check that data writing works fine.
13. Clean up the pod, PVC, and PV.
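Step 7.4 is checked with an inverted wait: the helper timing out with an AssertionError is the expected outcome, as in the source above:
```
# Expect the replica count to stay at 3; if a 4th replica ever
# appears, the wait succeeds and the test fails loudly.
try:
    common.wait_for_volume_replica_count(client, volume_name, 4)
    raise Exception("No new replica should be created")
except AssertionError:
    pass
```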
def test_setting_default_replica_count(client, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_setting_default_replica_count(client, volume_name): # NOQA """ Test `Default Replica Count` setting 1. Set default replica count in the global settings to 5 2. Create a volume without specify the replica count 3. The volume should have 5 replicas (instead of the previous default 3) """ setting = client.by_id_setting(common.SETTING_DEFAULT_REPLICA_COUNT) old_value = setting.value setting = client.update(setting, value="5") volume = client.create_volume(name=volume_name, size=SIZE, dataEngine=DATA_ENGINE) volume = common.wait_for_volume_detached(client, volume_name) assert len(volume.replicas) == 5 client.delete(volume) wait_for_volume_delete(client, volume_name) setting = client.update(setting, value=old_value)
Test the `Default Replica Count` setting
- Set the default replica count in the global settings to 5
- Create a volume without specifying the replica count
- The volume should have 5 replicas (instead of the previous default of 3)
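The pattern in the source above is: read the setting, change it, assert the effect, and put the old value back. The `try/finally` below is an editorial addition to make the restore unconditional:
```
setting = client.by_id_setting(common.SETTING_DEFAULT_REPLICA_COUNT)
old_value = setting.value
setting = client.update(setting, value="5")
try:
    # A volume created without numberOfReplicas picks up the default.
    volume = client.create_volume(name=volume_name, size=SIZE,
                                  dataEngine=DATA_ENGINE)
    volume = common.wait_for_volume_detached(client, volume_name)
    assert len(volume.replicas) == 5
finally:
    client.update(setting, value=old_value)
```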
def test_settings(client)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_settings(client): # NOQA """ Check input for settings """ setting_names = [common.SETTING_STORAGE_OVER_PROVISIONING_PERCENTAGE, common.SETTING_STORAGE_MINIMAL_AVAILABLE_PERCENTAGE, common.SETTING_DEFAULT_REPLICA_COUNT] settings = client.list_setting() settingMap = {} for setting in settings: settingMap[setting.name] = setting for name in setting_names: assert settingMap[name] is not None assert settingMap[name].definition.description is not None for name in setting_names: setting = client.by_id_setting(name) assert settingMap[name].value == setting.value old_value = setting.value if name == common.SETTING_STORAGE_OVER_PROVISIONING_PERCENTAGE: with pytest.raises(Exception) as e: client.update(setting, value="-100") assert name+" with invalid value " in \ str(e.value) with pytest.raises(Exception) as e: client.update(setting, value="testvalue") assert name+" with invalid value " in \ str(e.value) setting = client.update(setting, value="200") assert setting.value == "200" setting = client.by_id_setting(name) assert setting.value == "200" elif name == common.SETTING_STORAGE_MINIMAL_AVAILABLE_PERCENTAGE: with pytest.raises(Exception) as e: client.update(setting, value="300") assert name+" with invalid value " in \ str(e.value) with pytest.raises(Exception) as e: client.update(setting, value="-30") assert name+" with invalid value " in \ str(e.value) with pytest.raises(Exception) as e: client.update(setting, value="testvalue") assert name+" with invalid value " in \ str(e.value) setting = client.update(setting, value="30") assert setting.value == "30" setting = client.by_id_setting(name) assert setting.value == "30" elif name == common.SETTING_DEFAULT_REPLICA_COUNT: with pytest.raises(Exception) as e: client.update(setting, value="-1") assert name+" with invalid value " in \ str(e.value) with pytest.raises(Exception) as e: client.update(setting, value="testvalue") assert name+" with invalid value " in \ str(e.value) with pytest.raises(Exception) as e: client.update(setting, value="21") assert name+" with invalid value " in \ str(e.value) setting = client.update(setting, value="2") assert setting.value == '{"v1":"2","v2":"2"}' setting = client.by_id_setting(name) assert setting.value == '{"v1":"2","v2":"2"}' setting = client.update(setting, value=old_value) assert setting.value == old_value
Check input for settings
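Each setting is probed with invalid values, which the server rejects with an error that embeds the setting name. A minimal sketch of one such probe from the source above:
```
import pytest

setting = client.by_id_setting(
    common.SETTING_STORAGE_OVER_PROVISIONING_PERCENTAGE)
# Negative percentages are rejected with a descriptive error.
with pytest.raises(Exception) as e:
    client.update(setting, value="-100")
assert setting.name + " with invalid value " in str(e.value)
```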
def test_snapshot(client, volume_name, backing_image='')
-
Expand source code
@pytest.mark.coretest # NOQA def test_snapshot(client, volume_name, backing_image=""): # NOQA """ Test snapshot operations 1. Create a volume and attach to the node 2. Create the empty snapshot `snap1` 3. Generate and write data `snap2_data`, then create `snap2` 4. Generate and write data `snap3_data`, then create `snap3` 5. List snapshot. Validate the snapshot chain relationship 6. Mark `snap3` as removed. Make sure volume's data didn't change 7. List snapshot. Make sure `snap3` is marked as removed 8. Detach and reattach the volume in maintenance mode. 9. Make sure the volume frontend is still `blockdev` but disabled 10. Revert to `snap2` 11. Detach and reattach the volume with frontend enabled 12. Make sure volume's data is `snap2_data` 13. List snapshot. Make sure `volume-head` is now `snap2`'s child 14. Delete `snap1` and `snap2` 15. Purge the snapshot. 16. List the snapshot, make sure `snap1` and `snap3` are gone. `snap2` is marked as removed. 17. Check volume data, make sure it's still `snap2_data`. """ snapshot_test(client, volume_name, backing_image)
Test snapshot operations
- Create a volume and attach it to the node
- Create the empty snapshot `snap1`
- Generate and write data `snap2_data`, then create `snap2`
- Generate and write data `snap3_data`, then create `snap3`
- List the snapshots. Validate the snapshot chain relationship
- Mark `snap3` as removed. Make sure the volume's data didn't change
- List the snapshots. Make sure `snap3` is marked as removed
- Detach and reattach the volume in maintenance mode
- Make sure the volume frontend is still `blockdev` but disabled
- Revert to `snap2`
- Detach and reattach the volume with the frontend enabled
- Make sure the volume's data is `snap2_data`
- List the snapshots. Make sure `volume-head` is now `snap2`'s child
- Delete `snap1` and `snap2`
- Purge the snapshots
- List the snapshots. Make sure `snap1` and `snap3` are gone and `snap2` is marked as removed
- Check the volume data. Make sure it's still `snap2_data`
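The "mark as removed" and "purge" operations rely on the two-phase snapshot deletion used throughout this module: `snapshotDelete` only marks the snapshot as removed, and `snapshotPurge` reclaims it. A sketch built from calls visible in other tests here:
```
snap2 = create_snapshot(client, volume_name)
volume.snapshotDelete(name=snap2.name)   # mark as removed only
volume.snapshotPurge()                   # coalesce and reclaim
wait_for_snapshot_purge(client, volume_name, snap2.name)
```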
def test_snapshot_prune(client, volume_name, backing_image='')
-
Expand source code
@pytest.mark.coretest # NOQA def test_snapshot_prune(client, volume_name, backing_image=""): # NOQA """ Test removing the snapshot directly behinds the volume head would trigger snapshot prune. Snapshot pruning means removing the overlapping part from the snapshot based on the volume head content. 1. Create a volume and attach to the node 2. Generate and write data `snap1_data`, then create `snap1` 3. Generate and write data `snap2_data` with the same offset. 4. Mark `snap1` as removed. Make sure volume's data didn't change. But all data of the `snap1` will be pruned. 5. Detach and expand the volume, then wait for the expansion done. This will implicitly create a new snapshot `snap2`. 6. Attach the volume. Make sure there is a system snapshot with the old size. 7. Generate and write data `snap3_data` which is partially overlapped with `snap2_data`, plus one extra data chunk in the expanded part. 8. Mark `snap2` as removed then do snapshot purge. Make sure volume's data didn't change. But the overlapping part of `snap2` will be pruned. 9. Create `snap3`. 10. Do snapshot purge for the volume. Make sure `snap2` will be removed. 11. Generate and write data `snap4_data` which has no overlapping with `snap3_data`. 12. Mark `snap3` as removed. Make sure volume's data didn't change. But there is no change for `snap3`. 13. Create `snap4`. 14. Generate and write data `snap5_data`, then create `snap5`. 15. Detach and reattach the volume in maintenance mode. 16. Make sure the volume frontend is still `blockdev` but disabled 17. Revert to `snap4` 18. Detach and reattach the volume with frontend enabled 19. Make sure volume's data is correct. 20. List snapshot. Make sure `volume-head` is now `snap4`'s child """ snapshot_prune_test(client, volume_name, backing_image)
Test that removing the snapshot directly behind the volume head triggers snapshot pruning. Snapshot pruning means removing the part of the snapshot that overlaps with the volume head content.
- Create a volume and attach it to the node
- Generate and write data `snap1_data`, then create `snap1`
- Generate and write data `snap2_data` at the same offset
- Mark `snap1` as removed. Make sure the volume's data didn't change, but all data of `snap1` will be pruned
- Detach and expand the volume, then wait for the expansion to finish. This implicitly creates a new snapshot `snap2`
- Attach the volume. Make sure there is a system snapshot with the old size
- Generate and write data `snap3_data`, which partially overlaps with `snap2_data`, plus one extra data chunk in the expanded part
- Mark `snap2` as removed, then do a snapshot purge. Make sure the volume's data didn't change, but the overlapping part of `snap2` will be pruned
- Create `snap3`
- Do a snapshot purge for the volume. Make sure `snap2` gets removed
- Generate and write data `snap4_data`, which has no overlap with `snap3_data`
- Mark `snap3` as removed. Make sure the volume's data didn't change, and `snap3` stays unchanged
- Create `snap4`
- Generate and write data `snap5_data`, then create `snap5`
- Detach and reattach the volume in maintenance mode
- Make sure the volume frontend is still `blockdev` but disabled
- Revert to `snap4`
- Detach and reattach the volume with the frontend enabled
- Make sure the volume's data is correct
- List the snapshots. Make sure `volume-head` is now `snap4`'s child
def test_snapshot_prune_and_coalesce_simultaneously(client, volume_name, backing_image='')
-
Expand source code
@pytest.mark.coretest # NOQA def test_snapshot_prune_and_coalesce_simultaneously(client, volume_name, backing_image=""): # NOQA """ Test the prune for the snapshot directly behinds the volume head would be handled after all snapshot coalescing done. 1. Create a volume and attach to the node 2. Generate and write 1st data chunk `snap1_data`, then create `snap1` 3. Generate and write 2nd data chunk `snap2_data`, then create `snap2` 4. Generate and write 3rd data chunk `snap3_data`, then create `snap3` 5. Generate and write 4th data chunk `snap4_data`, then create `snap4` 6. Overwrite all existing data chunks in the volume head. 7. Mark all snapshots as `Removed`, then start snapshot purge and wait for complete. 8. List snapshot. Make sure there are only 2 snapshots left: `volume-head` and `snap4`. And `snap4` is an empty snapshot. 9. Make sure volume's data is correct. """ snapshot_prune_and_coalesce_simultaneously( client, volume_name, backing_image)
Test that pruning the snapshot directly behind the volume head is handled only after all snapshot coalescing is done.
- Create a volume and attach it to the node
- Generate and write the 1st data chunk `snap1_data`, then create `snap1`
- Generate and write the 2nd data chunk `snap2_data`, then create `snap2`
- Generate and write the 3rd data chunk `snap3_data`, then create `snap3`
- Generate and write the 4th data chunk `snap4_data`, then create `snap4`
- Overwrite all existing data chunks in the volume head
- Mark all snapshots as `Removed`, then start the snapshot purge and wait for it to complete
- List the snapshots. Make sure only 2 snapshots are left: `volume-head` and `snap4`, and `snap4` is an empty snapshot
- Make sure the volume's data is correct
def test_space_usage_for_rebuilding_only_volume(client, volume_name, request)
-
Expand source code
def test_space_usage_for_rebuilding_only_volume(client, volume_name, request): # NOQA """ Test case: the normal scenario 1. Prepare a 7Gi volume as a node disk. 2. Create a new volume with 3Gi spec size. 3. Write 3Gi data (using `dd`) to the volume. 4. Take a snapshot then mark this snapshot as Removed. (this snapshot won't be deleted immediately.) 5. Write 3Gi data (using `dd`) to the volume again. 6. Delete a random replica to trigger the rebuilding. 7. Wait for the rebuilding complete. And verify the volume actual size won't be greater than 2x of the volume spec size. 8. Delete the volume. """ prepare_space_usage_for_rebuilding_only_volume(client) lht_hostId = get_self_host_id() volume = create_and_check_volume(client, volume_name, size=str(3 * Gi)) volume.attach(hostId=lht_hostId) volume = common.wait_for_volume_healthy(client, volume_name) snap_offset = 1 volume_endpoint = get_volume_endpoint(volume) write_volume_dev_random_mb_data(volume_endpoint, snap_offset, 3000, 10) snap2 = create_snapshot(client, volume_name) volume.snapshotDelete(name=snap2.name) volume.snapshotPurge() wait_for_snapshot_purge(client, volume_name, snap2.name) write_volume_dev_random_mb_data(volume_endpoint, snap_offset, 3000, 10) for r in volume.replicas: if r.hostId != lht_hostId: volume.replicaRemove(name=r.name) break wait_for_volume_degraded(client, volume_name) wait_for_rebuild_start(client, volume_name) wait_for_rebuild_complete(client, volume_name, RETRY_BACKUP_COUNTS) volume = client.by_id_volume(volume_name) actual_size = int(volume.controllers[0].actualSize) spec_size = int(volume.size) assert actual_size/spec_size <= 2
Test case: the normal scenario
1. Prepare a 7Gi volume as a node disk.
2. Create a new volume with a 3Gi spec size.
3. Write 3Gi of data (using `dd`) to the volume.
4. Take a snapshot, then mark this snapshot as Removed (the snapshot won't be deleted immediately).
5. Write 3Gi of data (using `dd`) to the volume again.
6. Delete a random replica to trigger rebuilding.
7. Wait for the rebuilding to complete, and verify the volume's actual size won't be greater than 2x the volume spec size.
8. Delete the volume.
def test_space_usage_for_rebuilding_only_volume_worst_scenario(client, volume_name, request)
-
Expand source code
def test_space_usage_for_rebuilding_only_volume_worst_scenario(client, volume_name, request): # NOQA """ Test case: worst scenario 1. Prepare a 7Gi volume as a node disk. 2. Create a new volume with 2Gi spec size. 3. Write 2Gi data (using `dd`) to the volume. 4. Take a snapshot then mark this snapshot as Removed. (this snapshot won't be deleted immediately.) 5. Write 2Gi data (using `dd`) to the volume again. 6. Delete a random replica to trigger the rebuilding. 7. Write 2Gi data once the rebuilding is trigger (new replica is created). 8. Wait for the rebuilding complete. And verify the volume actual size won't be greater than 3x of the volume spec size. 9. Delete the volume. """ prepare_space_usage_for_rebuilding_only_volume(client) lht_hostId = get_self_host_id() volume = create_and_check_volume(client, volume_name, size=str(2 * Gi)) volume.attach(hostId=lht_hostId) volume = common.wait_for_volume_healthy(client, volume_name) snap_offset = 1 volume_endpoint = get_volume_endpoint(volume) write_volume_dev_random_mb_data(volume_endpoint, snap_offset, 2000, 10) snap1 = create_snapshot(client, volume_name) volume.snapshotDelete(name=snap1.name) volume.snapshotPurge() wait_for_snapshot_purge(client, volume_name, snap1.name) write_volume_dev_random_mb_data(volume_endpoint, snap_offset, 2000, 10) for r in volume.replicas: if r.hostId != lht_hostId: volume.replicaRemove(name=r.name) break wait_for_volume_degraded(client, volume_name) wait_for_rebuild_start(client, volume_name) write_volume_dev_random_mb_data(volume_endpoint, snap_offset, 2000, 10) wait_for_rebuild_complete(client, volume_name) volume = client.by_id_volume(volume_name) actual_size = int(volume.controllers[0].actualSize) spec_size = int(volume.size) assert actual_size/spec_size <= 3
Test case: the worst scenario
1. Prepare a 7Gi volume as a node disk.
2. Create a new volume with a 2Gi spec size.
3. Write 2Gi of data (using `dd`) to the volume.
4. Take a snapshot, then mark this snapshot as Removed (the snapshot won't be deleted immediately).
5. Write 2Gi of data (using `dd`) to the volume again.
6. Delete a random replica to trigger rebuilding.
7. Write 2Gi of data once the rebuilding is triggered (the new replica is created).
8. Wait for the rebuilding to complete, and verify the volume's actual size won't be greater than 3x the volume spec size.
9. Delete the volume.
def test_storage_class_from_backup(set_random_backupstore, volume_name, pvc_name, storage_class, client, core_api, pod_make)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_storage_class_from_backup(set_random_backupstore, volume_name, pvc_name, storage_class, client, core_api, pod_make): # NOQA """ Test restore backup using StorageClass 1. Create volume and PV/PVC/POD 2. Write `test_data` into pod 3. Create a snapshot and back it up. Get the backup URL 4. Create a new StorageClass `longhorn-from-backup` and set backup URL. 5. Use `longhorn-from-backup` to create a new PVC 6. Wait for the volume to be created and complete the restoration. 7. Create the pod using the PVC. Verify the data """ VOLUME_SIZE = str(DEFAULT_VOLUME_SIZE * Gi) pv_name = pvc_name volume = create_and_check_volume(client, volume_name, size=VOLUME_SIZE) wait_for_volume_detached(client, volume_name) create_pv_for_volume(client, core_api, volume, pv_name) create_pvc_for_volume(client, core_api, volume, pvc_name) pod_manifest = pod_make() pod_manifest['spec']['volumes'] = [create_pvc_spec(pvc_name)] pod_name = pod_manifest['metadata']['name'] create_and_wait_pod(core_api, pod_manifest) test_data = generate_random_data(VOLUME_RWTEST_SIZE) write_pod_volume_data(core_api, pod_name, test_data) volume_id = client.by_id_volume(volume_name) snapshot = volume_id.snapshotCRCreate() volume_id.snapshotBackup(name=snapshot.name) wait_for_backup_completion(client, volume_name, snapshot.name) bv, b = find_backup(client, volume_name, snapshot.name) backup_url = b.url storage_class['metadata']['name'] = "longhorn-from-backup" storage_class['parameters']['fromBackup'] = backup_url create_storage_class(storage_class) backup_pvc_name = generate_volume_name() backup_pvc_spec = { "apiVersion": "v1", "kind": "PersistentVolumeClaim", "metadata": { "name": backup_pvc_name, }, "spec": { "accessModes": [ "ReadWriteOnce" ], "storageClassName": storage_class['metadata']['name'], "resources": { "requests": { "storage": VOLUME_SIZE } } } } volume_count = len(client.list_volume()) core_api.create_namespaced_persistent_volume_claim( 'default', backup_pvc_spec ) backup_volume_created = False for i in range(RETRY_COUNTS): if len(client.list_volume()) == volume_count + 1: backup_volume_created = True break time.sleep(RETRY_INTERVAL) assert backup_volume_created for i in range(RETRY_COUNTS): pvc_status = core_api.read_namespaced_persistent_volume_claim_status( name=backup_pvc_name, namespace='default' ) if pvc_status.status.phase == 'Bound': break time.sleep(RETRY_INTERVAL) found = False for i in range(RETRY_COUNTS): volumes = client.list_volume() for volume in volumes: if volume.kubernetesStatus.pvcName == backup_pvc_name: backup_volume_name = volume.name found = True break if found: break time.sleep(RETRY_INTERVAL) assert found wait_for_volume_restoration_completed(client, backup_volume_name) wait_for_volume_detached(client, backup_volume_name) backup_pod_manifest = pod_make(name="backup-pod") backup_pod_manifest['spec']['volumes'] = \ [create_pvc_spec(backup_pvc_name)] backup_pod_name = backup_pod_manifest['metadata']['name'] create_and_wait_pod(core_api, backup_pod_manifest) restored_data = read_volume_data(core_api, backup_pod_name) assert test_data == restored_data
Test restoring a backup using a StorageClass
- Create a volume and PV/PVC/Pod
- Write `test_data` into the pod
- Create a snapshot and back it up. Get the backup URL
- Create a new StorageClass `longhorn-from-backup` and set the backup URL
- Use `longhorn-from-backup` to create a new PVC
- Wait for the volume to be created and complete the restoration
- Create a pod using the PVC. Verify the data
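The key to this test is the `fromBackup` StorageClass parameter: when a PVC binds to such a class, the provisioned volume is restored from the given backup URL. From the source above:
```
# Any PVC using this class gets a volume restored from backup_url.
storage_class['metadata']['name'] = "longhorn-from-backup"
storage_class['parameters']['fromBackup'] = backup_url
create_storage_class(storage_class)
```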
def test_volume_backup_and_restore_with_gzip_compression_method(client, set_random_backupstore, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.volume_backup_restore # NOQA def test_volume_backup_and_restore_with_gzip_compression_method(client, set_random_backupstore, volume_name): # NOQA """ Scenario: test volume backup and restore with different compression methods Issue: https://github.com/longhorn/longhorn/issues/5189 Given setup Backup Compression Method is "gzip" And setup backup concurrent limit is "4" And setup restore concurrent limit is "4" When create a volume and attach to the current node And get the volume's details Then verify the volume's compression method is "gzip" Then Create a backup of volume And Write volume random data Then restore the backup to a new volume And Attach the new volume and verify the data integrity And Detach the volume and delete the backup And Wait for the restored volume's `lastBackup` to be cleaned (due to remove the backup) And Delete the volume """ common.update_setting(client, common.SETTING_BACKUP_COMPRESSION_METHOD, BACKUP_COMPRESSION_METHOD_GZIP) common.update_setting(client, common.SETTING_BACKUP_CONCURRENT_LIMIT, "4") common.update_setting(client, common.SETTING_RESTORE_CONCURRENT_LIMIT, "4") backup_test(client, volume_name, SIZE, compression_method=BACKUP_COMPRESSION_METHOD_GZIP)
Scenario: test volume backup and restore with different compression methods
Issue: https://github.com/longhorn/longhorn/issues/5189
Given the Backup Compression Method setting is "gzip"
And the backup concurrent limit setting is "4"
And the restore concurrent limit setting is "4"
When creating a volume and attaching it to the current node
And getting the volume's details
Then verify the volume's compression method is "gzip"
Then create a backup of the volume
And write random data to the volume
Then restore the backup to a new volume
And attach the new volume and verify the data integrity
And detach the volume and delete the backup
And wait for the restored volume's `lastBackup` to be cleaned (due to removing the backup)
And delete the volume
def test_volume_backup_and_restore_with_lz4_compression_method(client, set_random_backupstore, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.volume_backup_restore # NOQA def test_volume_backup_and_restore_with_lz4_compression_method(client, set_random_backupstore, volume_name): # NOQA """ Scenario: test volume backup and restore with different compression methods Issue: https://github.com/longhorn/longhorn/issues/5189 Given setup Backup Compression Method is "lz4" And setup backup concurrent limit is "4" And setup restore concurrent limit is "4" When create a volume and attach to the current node And get the volume's details Then verify the volume's compression method is "lz4" Then Create a backup of volume And Write volume random data Then restore the backup to a new volume And Attach the new volume and verify the data integrity Then Detach the volume and delete the backup And Wait for the restored volume's `lastBackup` to be cleaned (due to remove the backup) And Delete the volume """ common.update_setting(client, common.SETTING_BACKUP_COMPRESSION_METHOD, BACKUP_COMPRESSION_METHOD_LZ4) common.update_setting(client, common.SETTING_BACKUP_CONCURRENT_LIMIT, "4") common.update_setting(client, common.SETTING_RESTORE_CONCURRENT_LIMIT, "4") backup_test(client, volume_name, SIZE, compression_method=BACKUP_COMPRESSION_METHOD_LZ4)
Scenario: test volume backup and restore with different compression methods
Issue: https://github.com/longhorn/longhorn/issues/5189
Given the Backup Compression Method setting is "lz4"
And the backup concurrent limit setting is "4"
And the restore concurrent limit setting is "4"
When creating a volume and attaching it to the current node
And getting the volume's details
Then verify the volume's compression method is "lz4"
Then create a backup of the volume
And write random data to the volume
Then restore the backup to a new volume
And attach the new volume and verify the data integrity
Then detach the volume and delete the backup
And wait for the restored volume's `lastBackup` to be cleaned (due to removing the backup)
And delete the volume
def test_volume_backup_and_restore_with_none_compression_method(client, set_random_backupstore, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.volume_backup_restore # NOQA def test_volume_backup_and_restore_with_none_compression_method(client, set_random_backupstore, volume_name): # NOQA """ Scenario: test volume backup and restore with different compression methods Issue: https://github.com/longhorn/longhorn/issues/5189 Given setup Backup Compression Method is "none" And setup backup concurrent limit is "4" And setup restore concurrent limit is "4" When create a volume and attach to the current node And get the volume's details Then verify the volume's compression method is "none" Then Create a backup of volume And Write volume random data Then restore the backup to a new volume And Attach the new volume and verify the data integrity And Detach the volume and delete the backup And Wait for the restored volume's `lastBackup` to be cleaned (due to remove the backup) And Delete the volume """ common.update_setting(client, common.SETTING_BACKUP_COMPRESSION_METHOD, BACKUP_COMPRESSION_METHOD_NONE) common.update_setting(client, common.SETTING_BACKUP_CONCURRENT_LIMIT, "4") common.update_setting(client, common.SETTING_RESTORE_CONCURRENT_LIMIT, "4") backup_test(client, volume_name, SIZE, compression_method=BACKUP_COMPRESSION_METHOD_NONE)
Scenario: test volume backup and restore with different compression methods
Issue: https://github.com/longhorn/longhorn/issues/5189
Given the Backup Compression Method setting is "none"
And the backup concurrent limit setting is "4"
And the restore concurrent limit setting is "4"
When creating a volume and attaching it to the current node
And getting the volume's details
Then verify the volume's compression method is "none"
Then create a backup of the volume
And write random data to the volume
Then restore the backup to a new volume
And attach the new volume and verify the data integrity
And detach the volume and delete the backup
And wait for the restored volume's `lastBackup` to be cleaned (due to removing the backup)
And delete the volume
def test_volume_basic(client, volume_name, frontend)
-
Expand source code
@pytest.mark.parametrize("frontend", FRONTENDS) @pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_volume_basic(client, volume_name, frontend): # NOQA """ Test basic volume operations: 1. Check volume name and parameter 2. Create a volume and attach to the current node, then check volume states 3. Check soft anti-affinity rule 4. Write then read back to check volume data """ volume_basic_test(client, volume_name, backing_image="", frontend=frontend)
Test basic volume operations:
- Check volume name and parameter
- Create a volume and attach to the current node, then check volume states
- Check soft anti-affinity rule
- Write then read back to check volume data
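The write-then-read check runs against the raw block endpoint. A minimal sketch using helpers this module uses elsewhere (see test_volume_scheduling_failure below); the actual test delegates to `volume_basic_test`, whose source is not shown here:
```
volume = volume.attach(hostId=get_self_host_id())
volume = common.wait_for_volume_healthy(client, volume_name)
# Write a random pattern to the block device and read it back.
volume_rw_test(get_volume_endpoint(volume))
```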
def test_volume_iscsi_basic(client, volume_name)
-
Expand source code
def test_volume_iscsi_basic(client, volume_name): # NOQA """ Test basic volume operations with iscsi frontend 1. Create and attach a volume with iscsi frontend 2. Check the volume endpoint and connect it using the iscsi initiator on the node. 3. Write then read back volume data for validation """ volume_iscsi_basic_test(client, volume_name)
Test basic volume operations with iscsi frontend
- Create and attach a volume with iscsi frontend
- Check the volume endpoint and connect it using the iscsi initiator on the node.
- Write then read back volume data for validation
def test_volume_metafile_deleted(client, core_api, volume_name, csi_pv, pvc, pod, pod_make)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_volume_metafile_deleted(client, core_api, volume_name, csi_pv, pvc, pod, pod_make): # NOQA """ Scenario: Test volume should still work when the volume meta file is removed in the replica data path. Steps: 1. Delete volume meta file in this replica data path 2. Recreate the pod and wait for the volume attached 3. Check if the volume is Healthy after the volume attached 4. Check volume data 5. Check if the volume still works fine by r/w data and creating/removing snapshots """ data_path1 = "/data/file1" data_path2 = "/data/file2" snap1, volume_meta_file, test_pod_name, data_md5sum1 = \ prepare_data_volume_metafile(client, core_api, volume_name, csi_pv, pvc, pod, pod_make, data_path1) # delete the volume metadata file of the volume on this host command = ["rm", "-f", volume_meta_file] subprocess.check_call(command) # test volume functionality check_volume_and_snapshot_after_corrupting_volume_metadata_file( client, core_api, volume_name, pod, test_pod_name, data_path1, data_md5sum1, data_path2, snap1 )
Scenario:
The volume should still work when the volume meta file is removed from the replica data path.
Steps:
- Delete volume meta file in this replica data path
- Recreate the pod and wait for the volume attached
- Check if the volume is Healthy after the volume attached
- Check volume data
- Check if the volume still works fine by r/w data and creating/removing snapshots
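The fault injection itself is a plain host-side file removal, as in the source above; `volume_meta_file` is the path returned by the preparation helper:
```
import subprocess

# Delete the replica's volume meta file; Longhorn is expected to
# reconstruct it when the volume is attached again.
subprocess.check_call(["rm", "-f", volume_meta_file])
```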
def test_volume_metafile_deleted_when_writing_data(client, core_api, volume_name, csi_pv, pvc, pod, pod_make)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA def test_volume_metafile_deleted_when_writing_data(client, core_api, volume_name, csi_pv, pvc, pod, pod_make): # NOQA """ Scenario: While writing data, test volume should still work when the volume meta file is deleted in the replica data path. Steps: 1. Create a pod using Longhorn volume 2. Delete volume meta file in this replica data path 3. Recreate the pod and wait for the volume attached 4. Check if the volume is Healthy after the volume attached 5. Check volume data 6. Check if the volume still works fine by r/w data and creating/removing snapshots """ data_path1 = "/data/file1" data_path2 = "/data/file2" snap1, volume_meta_file, test_pod_name, data_md5sum1 = \ prepare_data_volume_metafile(client, core_api, volume_name, csi_pv, pvc, pod, pod_make, data_path1, test_writing_data=True) # delete the volume metadata file of the volume on this host command = ["rm", "-f", volume_meta_file] subprocess.check_call(command) # make the volume detached and metafile should be reconstructed. delete_and_wait_pod(core_api, test_pod_name) wait_for_volume_detached(client, volume_name) # test volume functionality check_volume_and_snapshot_after_corrupting_volume_metadata_file( client, core_api, volume_name, pod, test_pod_name, data_path1, data_md5sum1, data_path2, snap1 )
Scenario:
While writing data, the volume should still work when the volume meta file is deleted from the replica data path.
Steps:
- Create a pod using Longhorn volume
- Delete volume meta file in this replica data path
- Recreate the pod and wait for the volume attached
- Check if the volume is Healthy after the volume attached
- Check volume data
- Check if the volume still works fine by r/w data and creating/removing snapshots
def test_volume_metafile_empty(client, core_api, volume_name, csi_pv, pvc, pod, pod_make)
-
Expand source code
def test_volume_metafile_empty(client, core_api, volume_name, csi_pv, pvc, pod, pod_make): # NOQA """ Scenario: Test volume should still work when there is an invalid volume meta file in the replica data path. Steps: 1. Remove the content of the volume meta file in this replica data path 2. Recreate the pod and wait for the volume attached 3. Check if the volume is Healthy after the volume attached 4. Check volume data 5. Check if the volume still works fine by r/w data and creating/removing snapshots """ data_path1 = "/data/file1" data_path2 = "/data/file2" snap1, volume_meta_file, test_pod_name, data_md5sum1 = \ prepare_data_volume_metafile(client, core_api, volume_name, csi_pv, pvc, pod, pod_make, data_path1) # empty the volume metadata file of the volume on this host command = ["truncate", "--size", "0", volume_meta_file] subprocess.check_call(command) # test volume functionality check_volume_and_snapshot_after_corrupting_volume_metadata_file( client, core_api, volume_name, pod, test_pod_name, data_path1, data_md5sum1, data_path2, snap1 )
Scenario:
The volume should still work when there is an invalid volume meta file in the replica data path.
Steps:
- Remove the content of the volume meta file in this replica data path
- Recreate the pod and wait for the volume attached
- Check if the volume is Healthy after the volume attached
- Check volume data
- Check if the volume still works fine by r/w data and creating/removing snapshots
def test_volume_multinode(client, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_volume_multinode(client, volume_name): # NOQA """ Test the volume can be attached on multiple nodes 1. Create one volume 2. Attach it on every node once, verify the state, then detach it """ hosts = [node['name'] for node in client.list_node()] volume = client.create_volume(name=volume_name, size=SIZE, numberOfReplicas=2, dataEngine=DATA_ENGINE) volume = common.wait_for_volume_detached(client, volume_name) for host_id in hosts: volume = volume.attach(hostId=host_id) volume = common.wait_for_volume_healthy(client, volume_name) engine = get_volume_engine(volume) assert engine.hostId == host_id volume = volume.detach() volume = common.wait_for_volume_detached(client, volume_name) client.delete(volume) wait_for_volume_delete(client, volume_name) volumes = client.list_volume() assert len(volumes) == 0
Test that the volume can be attached on multiple nodes
- Create one volume
- Attach it on every node once, verify the state, then detach it
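The per-node check from the source above: after each attach, the volume engine must be running on the node the volume was attached to:
```
for host_id in [n['name'] for n in client.list_node()]:
    volume = volume.attach(hostId=host_id)
    volume = common.wait_for_volume_healthy(client, volume_name)
    # The engine follows the attachment.
    assert get_volume_engine(volume).hostId == host_id
    volume = volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
```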
def test_volume_scheduling_failure(client, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test # NOQA @pytest.mark.coretest # NOQA def test_volume_scheduling_failure(client, volume_name): # NOQA ''' Test fail to schedule by disable scheduling for all the nodes Also test cannot attach a scheduling failed volume 1. Disable `allowScheduling` for all nodes 2. Create a volume. 3. Verify the volume condition `Scheduled` is false 4. Verify the volume is not ready for workloads 5. Verify attaching the volume will result in error 6. Enable `allowScheduling` for all nodes 7. Volume should be automatically scheduled (condition become true) 8. Volume can be attached now ''' nodes = client.list_node() assert len(nodes) > 0 for node in nodes: node = set_node_scheduling(client, node, allowScheduling=False) node = common.wait_for_node_update(client, node.id, "allowScheduling", False) volume = client.create_volume(name=volume_name, size=SIZE, numberOfReplicas=3, dataEngine=DATA_ENGINE) volume = common.wait_for_volume_condition_scheduled(client, volume_name, "status", CONDITION_STATUS_FALSE) volume = common.wait_for_volume_detached(client, volume_name) assert not volume.ready self_node = get_self_host_id() with pytest.raises(Exception) as e: volume.attach(hostId=self_node) assert "unable to attach volume" in str(e.value) for node in nodes: node = set_node_scheduling(client, node, allowScheduling=True) node = common.wait_for_node_update(client, node.id, "allowScheduling", True) volume = common.wait_for_volume_condition_scheduled(client, volume_name, "status", CONDITION_STATUS_TRUE) volume = common.wait_for_volume_detached(client, volume_name) volume = volume.attach(hostId=self_node) volume = common.wait_for_volume_healthy(client, volume_name) endpoint = get_volume_endpoint(volume) volume_rw_test(endpoint) volume = volume.detach() volume = common.wait_for_volume_detached(client, volume_name) client.delete(volume) wait_for_volume_delete(client, volume_name)
Test failure to schedule by disabling scheduling for all the nodes.
Also test that a volume whose scheduling failed cannot be attached.
- Disable allowScheduling for all nodes (sketched below)
- Create a volume
- Verify the volume condition Scheduled is false
- Verify the volume is not ready for workloads
- Verify that attaching the volume results in an error
- Enable allowScheduling for all nodes
- Volume should be automatically scheduled (condition becomes true)
- Volume can be attached now
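A minimal sketch of the scheduling toggle used in the first and sixth steps, built from the helpers that appear in the source above:

    # Disable scheduling on every node and wait for each update to propagate
    for node in client.list_node():
        node = set_node_scheduling(client, node, allowScheduling=False)
        common.wait_for_node_update(client, node.id, "allowScheduling", False)

    # ... create the volume; its Scheduled condition should stay False ...

    # Re-enable scheduling so the volume can be scheduled and attached
    for node in client.list_node():
        node = set_node_scheduling(client, node, allowScheduling=True)
        common.wait_for_node_update(client, node.id, "allowScheduling", True)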
def test_volume_toomanysnapshots_condition(client, core_api, volume_name)
-
Expand source code
def test_volume_toomanysnapshots_condition(client, core_api, volume_name):  # NOQA
    """
    Test Volume TooManySnapshots Condition

    1. Create a volume and attach it to a node.
    2. Check the 'TooManySnapshots' condition is False.
    3. Writing data to this volume and meanwhile taking 101 snapshots.
    4. Check the 'TooManySnapshots' condition is True.
    5. Take one more snapshot to make sure snapshots works fine.
    6. Delete 2 snapshots, and check the 'TooManySnapshots' condition is
       False.
    """
    volume = create_and_check_volume(client, volume_name)

    self_hostId = get_self_host_id()
    volume = volume.attach(hostId=self_hostId)
    volume = common.wait_for_volume_healthy(client, volume_name)

    snap = {}
    max_count = 101
    for i in range(max_count):
        write_volume_random_data(volume, {})
        count = i + 1
        snap[count] = create_snapshot(client, volume_name)

        if count < max_count:
            volume = client.by_id_volume(volume_name)
            assert volume.conditions.TooManySnapshots.status == "False"
        else:
            expected_message = \
                f"Snapshots count is {count} over the warning threshold 100"
            wait_for_volume_condition_toomanysnapshots(client, volume_name,
                                                       "status", "True",
                                                       expected_message)

    snap[max_count + 1] = create_snapshot(client, volume_name)
    expected_message = \
        f"Snapshots count is {max_count + 1} over the warning threshold 100"
    wait_for_volume_condition_toomanysnapshots(client, volume_name,
                                               "status", "True",
                                               expected_message)

    volume = client.by_id_volume(volume_name)
    volume.snapshotDelete(name=snap[101].name)
    volume.snapshotDelete(name=snap[100].name)

    volume.snapshotPurge()
    volume = wait_for_snapshot_purge(client, volume_name,
                                     snap[101].name, snap[100].name)

    wait_for_volume_condition_toomanysnapshots(client, volume_name,
                                               "status", "False")
Test Volume TooManySnapshots Condition
- Create a volume and attach it to a node.
- Check the 'TooManySnapshots' condition is False.
- Write data to this volume while taking 101 snapshots.
- Check the 'TooManySnapshots' condition is True.
- Take one more snapshot to make sure snapshots still work.
- Delete 2 snapshots, and check that the 'TooManySnapshots' condition returns to False (a condition check is sketched below).
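A minimal sketch of the condition check; the attribute path follows the source above, and the volume name is illustrative:

    volume = client.by_id_volume("demo-vol")  # hypothetical volume name
    cond = volume.conditions.TooManySnapshots
    # "False" below the 100-snapshot warning threshold, "True" above it
    print(cond.status)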
def test_volume_update_replica_count(client, volume_name)
-
Expand source code
@pytest.mark.v2_volume_test  # NOQA
@pytest.mark.coretest  # NOQA
def test_volume_update_replica_count(client, volume_name):  # NOQA
    """
    Test updating volume's replica count

    1. Create a volume with 2 replicas
    2. Attach the volume
    3. Increase the replica to 3.
    4. Volume will become degraded and start rebuilding
    5. Wait for rebuilding to complete
    6. Update the replica count to 2. Volume should remain healthy
    7. Remove 1 replicas, so there will be 2 replicas in the volume
    8. Verify the volume is still healthy

    Volume should always be healthy even only with 2 replicas.
    """
    host_id = get_self_host_id()

    replica_count = 2
    volume = create_and_check_volume(client, volume_name,
                                     num_of_replicas=replica_count)
    volume.attach(hostId=host_id)
    volume = common.wait_for_volume_healthy(client, volume_name)

    replica_count = 3
    volume = volume.updateReplicaCount(replicaCount=replica_count)
    volume = common.wait_for_volume_degraded(client, volume_name)
    volume = common.wait_for_volume_healthy(client, volume_name)
    assert len(volume.replicas) == replica_count

    old_replica_count = replica_count
    replica_count = 2
    volume = volume.updateReplicaCount(replicaCount=replica_count)
    volume = common.wait_for_volume_healthy(client, volume_name)
    assert len(volume.replicas) == old_replica_count

    volume.replicaRemove(name=volume.replicas[0].name)
    volume = common.wait_for_volume_replica_count(client, volume_name,
                                                  replica_count)
    assert volume.robustness == "healthy"
    assert len(volume.replicas) == replica_count

    client.delete(volume)
    wait_for_volume_delete(client, volume_name)
Test updating volume's replica count
- Create a volume with 2 replicas
- Attach the volume
- Increase the replica count to 3.
- Volume will become degraded and start rebuilding
- Wait for rebuilding to complete
- Update the replica count to 2. Volume should remain healthy
- Remove 1 replica, so there will be 2 replicas in the volume
- Verify the volume is still healthy
The volume should always stay healthy even with only 2 replicas (see the scaling sketch below).
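A minimal sketch of the scale-up in steps 3 through 5, using the calls shown in the source above (volume name illustrative):

    volume = client.by_id_volume("demo-vol")             # hypothetical name
    volume = volume.updateReplicaCount(replicaCount=3)   # 2 -> 3 replicas
    volume = common.wait_for_volume_degraded(client, "demo-vol")  # rebuilding
    volume = common.wait_for_volume_healthy(client, "demo-vol")   # rebuilt
    assert len(volume.replicas) == 3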
def test_workload_with_fsgroup(core_api, statefulset, storage_class)
-
Expand source code
@pytest.mark.v2_volume_test  # NOQA
def test_workload_with_fsgroup(core_api, statefulset, storage_class):  # NOQA
    """
    1. Deploy a StatefulSet workload that uses Longhorn volume and has
       securityContext set:
       ```
       securityContext:
         runAsUser: 1000
         runAsGroup: 1000
         fsGroup: 1000
       ```
       See https://github.com/longhorn/longhorn/issues/2964#issuecomment-910117570
       for an example.
    2. Wait for the workload pod to be running
    3. Exec into the workload pod, cd into the mount point of the volume.
    4. Verify that the mount point has correct filesystem permission (e.g.,
       running `ls -l` on the mount point should return the permission in
       the format ****rw****
    5. Verify that we can read/write files.
    """
    statefulset_name = 'statefulset-non-root-access'
    pod_name = statefulset_name + '-0'

    create_storage_class(storage_class)

    statefulset['metadata']['name'] = \
        statefulset['spec']['selector']['matchLabels']['app'] = \
        statefulset['spec']['serviceName'] = \
        statefulset['spec']['template']['metadata']['labels']['app'] = \
        statefulset_name
    statefulset['spec']['replicas'] = 1
    statefulset['spec']['volumeClaimTemplates'][0]['spec']['storageClassName']\
        = storage_class['metadata']['name']
    statefulset['spec']['template']['spec']['securityContext'] = {
        'runAsUser': 1000,
        'runAsGroup': 1000,
        'fsGroup': 1000
    }

    create_and_wait_statefulset(statefulset)

    write_pod_volume_random_data(core_api, pod_name,
                                 "/data/test", DATA_SIZE_IN_MB_1)
    get_pod_data_md5sum(core_api, pod_name, "/data/test")
- Deploy a StatefulSet workload that uses a Longhorn volume and has securityContext set:
    securityContext:
      runAsUser: 1000
      runAsGroup: 1000
      fsGroup: 1000
  See https://github.com/longhorn/longhorn/issues/2964#issuecomment-910117570 for an example.
- Wait for the workload pod to be running
- Exec into the workload pod, cd into the mount point of the volume.
- Verify that the mount point has the correct filesystem permissions (e.g., running ls -l on the mount point should show group read/write permission, rw, as sketched below)
- Verify that we can read/write files.
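A minimal sketch of the permission check, assuming the kubernetes Python client's exec stream; the pod name matches the StatefulSet created in the source above, while the namespace is an assumption:

    from kubernetes.stream import stream

    # Run `ls -ld` on the volume mount point inside the workload pod
    output = stream(core_api.connect_get_namespaced_pod_exec,
                    'statefulset-non-root-access-0',  # pod created by the test
                    'default',                        # assumed namespace
                    command=['/bin/sh', '-c', 'ls -ld /data'],
                    stderr=True, stdin=False, stdout=True, tty=False)
    print(output)  # expect group 1000 with the rw bits applied by fsGroup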
def volume_basic_test(client, volume_name, backing_image='', frontend='blockdev')
-
Expand source code
def volume_basic_test(client, volume_name, backing_image="", frontend=VOLUME_FRONTEND_BLOCKDEV):  # NOQA
    num_hosts = len(client.list_node())
    num_replicas = 3

    with pytest.raises(Exception):
        volume = client.create_volume(name="wrong_volume-name-1.0",
                                      size=SIZE,
                                      numberOfReplicas=2,
                                      dataEngine=DATA_ENGINE)
        volume = client.create_volume(name="wrong_volume-name",
                                      size=SIZE,
                                      numberOfReplicas=2,
                                      dataEngine=DATA_ENGINE)
        volume = client.create_volume(name="wrong_volume-name",
                                      size=SIZE,
                                      numberOfReplicas=2,
                                      frontend="invalid_frontend",
                                      dataEngine=DATA_ENGINE)

    volume = create_and_check_volume(client, volume_name,
                                     num_of_replicas=num_replicas,
                                     size=SIZE,
                                     backing_image=backing_image,
                                     frontend=frontend)
    assert volume.restoreRequired is False

    def validate_volume_basic(expected, actual):
        assert actual.name == expected.name
        assert actual.size == expected.size
        assert actual.numberOfReplicas == expected.numberOfReplicas
        assert actual.frontend == expected.frontend
        assert actual.backingImage == backing_image
        assert actual.state == expected.state
        assert actual.created == expected.created

    volumes = client.list_volume().data
    assert len(volumes) == 1
    validate_volume_basic(volume, volumes[0])

    volumeByName = client.by_id_volume(volume_name)
    validate_volume_basic(volume, volumeByName)

    lht_hostId = get_self_host_id()
    volume.attach(hostId=lht_hostId)
    volume = common.wait_for_volume_healthy(client, volume_name)
    assert volume.restoreRequired is False

    volumeByName = client.by_id_volume(volume_name)
    validate_volume_basic(volume, volumeByName)
    check_volume_endpoint(volumeByName)

    # validate soft anti-affinity
    hosts = {}
    for replica in volume.replicas:
        id = replica.hostId
        assert id != ""
        hosts[id] = True
    if num_hosts >= num_replicas:
        assert len(hosts) == num_replicas
    else:
        assert len(hosts) == num_hosts

    volumes = client.list_volume().data
    assert len(volumes) == 1
    assert volumes[0].name == volume.name
    assert volumes[0].size == volume.size
    assert volumes[0].numberOfReplicas == volume.numberOfReplicas
    assert volumes[0].state == volume.state
    assert volumes[0].created == volume.created
    check_volume_endpoint(volumes[0])

    volume = client.by_id_volume(volume_name)
    if frontend == VOLUME_FRONTEND_NVMF:
        try:
            dev = nvmf_login(get_volume_endpoint(volume))
            volume_rw_test(dev)
        finally:
            nvmf_logout(get_volume_endpoint(volume))
    else:
        volume_rw_test(get_volume_endpoint(volume))

    volume.detach()
    volume = common.wait_for_volume_detached(client, volume_name)
    assert volume.restoreRequired is False

    client.delete(volume)
    wait_for_volume_delete(client, volume_name)

    volumes = client.list_volume().data
    assert len(volumes) == 0
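This function has no docstring; for orientation, a minimal sketch of the name validation it starts with (the invalid name comes from the source above; Longhorn volume names must be DNS-compliant, so names containing "_" or "." are rejected):

    with pytest.raises(Exception):
        client.create_volume(name="wrong_volume-name-1.0", size=SIZE,
                             numberOfReplicas=2, dataEngine=DATA_ENGINE)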
def volume_iscsi_basic_test(client, volume_name, backing_image='')
-
Expand source code
def volume_iscsi_basic_test(client, volume_name, backing_image=""):  # NOQA
    host_id = get_self_host_id()
    volume = create_and_check_volume(client, volume_name,
                                     num_of_replicas=3,
                                     size=SIZE,
                                     backing_image=backing_image,
                                     frontend=VOLUME_FRONTEND_ISCSI)
    volume.attach(hostId=host_id)
    volume = common.wait_for_volume_healthy(client, volume_name)

    volumes = client.list_volume().data
    assert len(volumes) == 1
    assert volumes[0].name == volume.name
    assert volumes[0].size == volume.size
    assert volumes[0].numberOfReplicas == volume.numberOfReplicas
    assert volumes[0].state == volume.state
    assert volumes[0].created == volume.created
    assert volumes[0].frontend == VOLUME_FRONTEND_ISCSI
    assert volumes[0].backingImage == volume.backingImage
    endpoint = get_volume_endpoint(volumes[0])

    try:
        dev = iscsi_login(endpoint)
        volume_rw_test(dev)
    finally:
        iscsi_logout(endpoint)

    cleanup_volume(client, volume)
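The core of this test is the iSCSI login/IO/logout cycle; a minimal sketch, assuming an attached iSCSI-frontend volume object (the endpoint format in the comment is typical but may vary by setup):

    endpoint = get_volume_endpoint(volume)  # e.g. iscsi://<ip>:3260/iqn...
    try:
        dev = iscsi_login(endpoint)   # log in to the target, get a /dev path
        volume_rw_test(dev)           # exercise reads and writes
    finally:
        iscsi_logout(endpoint)        # always log out, even on failure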
def volume_rw_test(dev)
-
Expand source code
def volume_rw_test(dev):
    assert volume_valid(dev)
    data = write_device_random_data(dev)
    check_device_data(dev, data)
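A minimal usage sketch: run the read/write check against the block-device endpoint of an attached volume (volume name illustrative):

    volume = client.by_id_volume("demo-vol")  # hypothetical volume name
    dev = get_volume_endpoint(volume)         # e.g. /dev/longhorn/demo-vol
    volume_rw_test(dev)                       # validate the device, then
                                              # write and verify random data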