Module `tests.test_engine_upgrade`

Functions

def check_replica_engine(volume, engineimage)

def engine_live_upgrade_rollback_test(client, core_api, volume_name, backing_image='')

def engine_live_upgrade_test(client, core_api, volume_name, backing_image='')

def engine_offline_upgrade_test(client, core_api, volume_name, backing_image='')

def prepare_auto_upgrade_engine_to_default_version(client)

def test_auto_upgrade_engine_to_default_version(client)

Steps:

Preparation:
1. Set up a backup store.
2. Deploy a compatible new engine image.

Test auto upgrade to default engine in attached / detached volume:
1. Create 2 volumes, each of 0.5 GB.
2. Attach one volume, vol-1, and write data to it.
3. Upgrade all volumes to the new engine image.
4. Wait until the upgrades are completed (volumes' engine image changed, replicas' mode changed to RW for attached volumes, reference count of the new engine image changed, all engine and replicas' engine image changed).
5. Set the concurrent-automatic-engine-upgrade-per-node-limit setting to 3.
6. Wait until the upgrades are completed (volumes' engine image changed, replicas' mode changed to RW for attached volumes, reference count of the new engine image changed, all engine and replicas' engine image changed, etc.).
7. Verify the volumes' data.
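The "wait until the upgrades are completed" steps amount to polling a condition with a bounded number of retries. The following is a minimal sketch of that pattern, not the helper the test suite actually uses: `wait_until`, the `volume` dictionary, and its `polls` counter are hypothetical stand-ins introduced only for illustration.

```python
import time


def wait_until(predicate, retries=10, interval=0.01):
    """Poll predicate() until it returns True or the retries run out."""
    for _ in range(retries):
        if predicate():
            return True
        time.sleep(interval)
    return False


# Hypothetical stand-in for a volume record; the field names are assumptions.
volume = {"engineImage": "old-image", "polls": 0}


def upgrade_done():
    # Simulate a controller that finishes the upgrade on the third poll.
    volume["polls"] += 1
    if volume["polls"] >= 3:
        volume["engineImage"] = "new-image"
    return volume["engineImage"] == "new-image"


upgraded = wait_until(upgrade_done)
```

A real check would combine several predicates (engine image, replica mode, reference count) in the same loop, so a single timeout covers all of them.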

def test_auto_upgrade_engine_to_default_version_degraded_volume(client)

Steps:

Preparation:
1. Set up a backup store.
2. Deploy a compatible new engine image.

Test auto upgrade engine to default version in degraded volume:
1. Set the concurrent-automatic-engine-upgrade-per-node-limit setting to 0.
2. Upgrade vol-1 (a healthy attached volume) to the new engine image.
3. Wait until the upgrade is completed (volume's engine image changed, replicas' mode changed to RW, reference count of the new engine image changed, engine and replicas' engine image changed).
4. Increase the replica count to 4 to make the volume degraded.
5. Set the concurrent-automatic-engine-upgrade-per-node-limit setting to 3.
6. In a 2-minute retry loop, verify that Longhorn doesn't automatically upgrade the engine image for vol-1.
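The final step is a negative check: the value must stay the same for the whole observation window, so the loop fails fast on any change and only succeeds once the window elapses. A minimal sketch of such a loop follows. `stays_unchanged` and `FakeClock` are hypothetical names introduced here, and the fake clock exists only so the example runs instantly instead of waiting two real minutes.

```python
import time


def stays_unchanged(read_value, expected, duration_s, interval_s,
                    clock=time.monotonic, sleep=time.sleep):
    """Return True only if read_value() equals expected for the whole window."""
    deadline = clock() + duration_s
    while clock() < deadline:
        if read_value() != expected:
            return False  # the value changed, e.g. an unwanted auto upgrade
        sleep(interval_s)
    return True


class FakeClock:
    """Deterministic clock: sleeping just advances the current time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


clock = FakeClock()
# Hypothetical reader that always reports the old engine image.
not_upgraded = stays_unchanged(lambda: "old-image", "old-image",
                               duration_s=120, interval_s=1,
                               clock=clock.time, sleep=clock.sleep)
```

In a real test, `read_value` would re-fetch the volume on every iteration, since a cached object would never show a change.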

def test_auto_upgrade_engine_to_default_version_dr_volume(client, set_random_backupstore)

Steps:

Preparation:
1. Set up a backup store.
2. Deploy a compatible new engine image.

Test auto upgrade engine to default version in DR volume:
1. Create a backup for vol-1. Create a DR volume from the backup.
2. Set the concurrent-automatic-engine-upgrade-per-node-limit setting to 3.
3. Try to upgrade the DR volume's engine image to the new engine image.
4. Verify that the Longhorn API returns an error and the upgrade fails.
5. Set the concurrent-automatic-engine-upgrade-per-node-limit setting to 0.
6. Try to upgrade the DR volume's engine image to the new engine image.
7. Wait until the upgrade is completed (volume's engine image changed, replicas' mode changed to RW, reference count of the new engine image changed, engine and replicas' engine image changed).
8. Wait for the DR volume to finish restoring.
9. Set the concurrent-automatic-engine-upgrade-per-node-limit setting to 3.
10. In a 2-minute retry loop, verify that Longhorn doesn't automatically upgrade the engine image for the DR volume.

def test_auto_upgrade_engine_to_default_version_expanding_volume(client)

Steps:

Preparation:
1. Set up a backup store.
2. Deploy a compatible new engine image.

Test auto upgrade engine to default version in expanding volume:
1. Set the concurrent-automatic-engine-upgrade-per-node-limit setting to 0.
2. Upgrade vol-1 to the new engine image.
3. Wait until the upgrade is completed (volume's engine image changed, replicas' mode changed to RW, reference count of the new engine image changed, engine and replicas' engine image changed).
4. Expand vol-0 from 1 GB to 5 GB.
5. Wait for vol-0 to start expanding.
6. Set the concurrent-automatic-engine-upgrade-per-node-limit setting to 3.
7. While vol-0 is expanding, verify that its engine is not upgraded to the default engine image.
8. Wait for the expansion to finish and for vol-0 to be detached.
9. Verify that Longhorn upgrades vol-0's engine to the default version.

def test_engine_crash_during_live_upgrade(client, core_api, make_deployment_with_pvc, volume_name)

Deploy an extra engine image.
Create and attach a volume to a workload, then write data into the volume.
Send a live upgrade request, then immediately delete the related engine manager pod/engine process (the new replicas are not active in this case).
Verify the workload will be restarted and the volume will be reattached automatically.
Verify the upgrade is done. (It actually becomes offline upgrade.)
Verify volume healthy and the data is correct.

def test_engine_image(client, core_api, volume_name)

Test Engine Image deployment

List Engine Images and validate basic properties.
Try deleting the default engine image; it should fail.
Try creating a duplicate of the default engine image; it should fail.
Get upgrade test image for the same versions
Test if the upgrade test image can be deployed and deleted correctly

def test_engine_image_incompatible(client, core_api, volume_name)

Test incompatible engine images

Deploy incompatible engine images
Make sure their state is incompatible once deployed.

def test_engine_live_upgrade(client, core_api, volume_name)

Test engine live upgrade

Deploy a compatible new engine image
Create a volume (with the old default engine image)
Attach the volume and write data to it
Upgrade the volume when it's attached, to the new engine image
Wait until the upgrade has completed, then verify the volume engine image changed
Wait for new replica mode update then check the engine status.
Verify the reference count of the new engine image changed
Verify all engine and replicas' engine image changed
Check volume data
Detach the volume. Check the engine's and replicas' engine image again.
Attach the volume.
Check engine/replica engine image. Check data after reattach.
Live upgrade to the original engine image.
Wait for new replica mode update then check the engine status.
Check old and new engine image reference count (new 0, old 1)
Verify all the engine and replica images should be the old image
Check volume data
Detach the volume. Make sure engine and replica images are old image

def test_engine_live_upgrade_rollback(client, core_api, volume_name)

Test engine live upgrade rollback

Deploy wrong_engine_upgrade_image, a compatible upgrade engine image.
It is not functional, but it is compatible per its metadata.
Create a volume with default engine image
Attach it and write data into it.
Live upgrade to the wrong_engine_upgrade_image
Try to wait for the engine upgrade to complete. Expect it to time out.
Rollback by upgrading to the original_engine_image
Make sure the rollback succeeds and the volume and engine images are rolled back
Wait for new replica mode update then check the engine status.
Check the volume data.
Live upgrade to the wrong_engine_upgrade_image again.
Live upgrade will still fail.
Detach the volume.
The engine image for the volume will now be upgraded (since the wrong image is still compatible)
Upgrade to the original_engine_image when detached
Attach the volume and check states and data.
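The rollback flow hinges on treating a timeout as the expected outcome and then issuing a second upgrade back to the original image. The sketch below illustrates that control flow against an in-memory stand-in. `UpgradeTimeout`, `request_upgrade`, `wait_for_current_image`, the `functional` flag, and the dictionary fields are all assumptions made for this example and do not reflect the real Longhorn client.

```python
class UpgradeTimeout(Exception):
    """Raised when the upgrade does not finish within the retry budget."""


def request_upgrade(volume, image, functional):
    # The desired image always changes; the running image only follows
    # when the target image is actually functional.
    volume["engineImage"] = image
    if functional:
        volume["currentImage"] = image


def wait_for_current_image(volume, image, retries=5):
    for _ in range(retries):
        if volume["currentImage"] == image:
            return
    raise UpgradeTimeout(f"volume never switched to {image}")


volume = {"engineImage": "original", "currentImage": "original"}

# Live upgrade to the non-functional image: the wait is expected to fail.
request_upgrade(volume, "wrong-image", functional=False)
try:
    wait_for_current_image(volume, "wrong-image")
    timed_out = False
except UpgradeTimeout:
    timed_out = True

# Roll back by upgrading to the original image, which does work.
request_upgrade(volume, "original", functional=True)
wait_for_current_image(volume, "original")
```

Catching the specific timeout exception, rather than any exception, keeps an unrelated failure from being mistaken for the expected timeout.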

def test_engine_live_upgrade_while_replica_concurrent_rebuild(client, volume_name)

Test that ConcurrentReplicaRebuildPerNodeLimit won't affect volume live upgrade:
1. Set ConcurrentReplicaRebuildPerNodeLimit to 1.
2. Create 2 volumes, then attach both volumes.
3. Write a large amount of data into both volumes, so that the rebuilding will take a while.
4. Deploy a compatible engine image and wait for it to be ready.
5. Make sure volume 1 and volume 2 are attached and healthy.
6. Delete one replica of volume 1 to trigger the rebuilding.
7. Do a live upgrade for volume 2. The upgrade should work fine even if the rebuilding in volume 1 is still in progress.

def test_engine_live_upgrade_with_intensive_data_writing(client, core_api, volume_name, pod_make)

Test engine live upgrade with intensive data writing

Deploy a compatible new engine image
Create a volume (with the old default engine image) with PV/PVC/Pod and wait for the pod to be deployed.
Write data to a tmp file in the pod and get the md5sum
Upgrade the volume to the new engine image without waiting.
Keep copying data from the tmp file to the volume during the live upgrade.
Wait until the upgrade has completed, then verify the volume engine image changed
Wait for new replica mode update then check the engine status.
Verify all engine and replicas' engine image changed
Verify the reference count of the new engine image changed
Check the existing data. Then write new data to the upgraded volume and get the md5sum.
Delete the pod and wait for the volume to be detached. Then check the engine's and replicas' engine image again.
Recreate the pod.
Check that the attached volume is in the healthy state rather than degraded.
Check the data.
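The data checks in this test rest on comparing MD5 checksums of the source file and of the copy on the volume. The real test computes them inside the pod. The sketch below shows the same comparison locally with Python's standard `hashlib`, using temporary files as hypothetical stand-ins for the tmp file and the volume mount.

```python
import hashlib
import os
import shutil
import tempfile


def md5sum(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    # Stand-ins for the tmp file in the pod and the file on the volume.
    src = os.path.join(workdir, "tmp_data")
    dst = os.path.join(workdir, "volume_data")

    payload = os.urandom(4096)
    with open(src, "wb") as f:
        f.write(payload)

    # Copy the data as the test does during the live upgrade.
    shutil.copyfile(src, dst)

    source_checksum = md5sum(src)
    checksums_match = source_checksum == md5sum(dst)
    expected_checksum = hashlib.md5(payload).hexdigest()
```

Reading in chunks keeps memory use flat, which matters when the file is large enough to keep the volume busy for the duration of an upgrade.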

def test_engine_offline_upgrade(client, core_api, volume_name)

Test engine offline upgrade

Get a compatible engine image with the default engine image, and deploy
Create a volume using the default engine image
Attach the volume and write data into it
Detach the volume and upgrade the volume engine to the new engine image
Make sure the new engine image reference count has increased to 1
Make sure we cannot delete the new engine image now (due to reference)
Attach the volume and verify it's using the new image
Verify the data. And verify engine and replicas' engine image changed
Detach the volume
Upgrade to the old engine image
Verify the volume's engine image has been upgraded
Attach the volume and verify the data
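The reference-count assertions above can be pictured with a toy in-memory model: each volume holds one reference to its engine image, and an image with a non-zero count cannot be deleted. This is only an illustration of the bookkeeping the test checks, not Longhorn's implementation. `ImageRegistry` and its methods are hypothetical.

```python
class ImageRegistry:
    """Toy model of engine images and their per-volume reference counts."""

    def __init__(self):
        self.ref_counts = {}

    def deploy(self, name):
        self.ref_counts[name] = 0

    def set_volume_image(self, volume, name):
        # Release the reference on the old image, take one on the new image.
        old = volume.get("engineImage")
        if old is not None:
            self.ref_counts[old] -= 1
        self.ref_counts[name] += 1
        volume["engineImage"] = name

    def delete(self, name):
        if self.ref_counts[name] > 0:
            raise RuntimeError(f"{name} is still referenced")
        del self.ref_counts[name]


registry = ImageRegistry()
registry.deploy("default-image")
registry.deploy("new-image")

volume = {}
registry.set_volume_image(volume, "default-image")

# Offline upgrade: the new image gains a reference and cannot be deleted.
registry.set_volume_image(volume, "new-image")
count_after_upgrade = registry.ref_counts["new-image"]
try:
    registry.delete("new-image")
    delete_blocked = False
except RuntimeError:
    delete_blocked = True

# Upgrading back to the old image releases the reference again.
registry.set_volume_image(volume, "default-image")
registry.delete("new-image")
```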