Related issue
https://github.com/longhorn/longhorn/issues/1895
Longhorn v1.1.1 handles errors that occur during a snapshot purge more gracefully and reports them to Longhorn manager.
Scenario-1
- Create a volume with 3 replicas and attach it to a pod.
- Write some data into the volume and take a snapshot.
- Delete a replica; this results in a system-generated snapshot being created.
- Wait for the replica to finish rebuilding, then take another snapshot.
- SSH into a node and resize the latest snapshot, e.g.
  dd if=/dev/urandom count=50 bs=1M of=<SNAPSHOT-NAME>.img
- Trigger a snapshot purge by deleting the oldest snapshot.
- Verify that the replica (on the node from step 5) shows the error "file sizes are not equal and the parent file is larger than the child file" and starts to rebuild.
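The resize in step 5 can be wrapped in a small helper so the size change is visible before the purge is triggered. This is only a sketch: the function name `inflate_snapshot` is hypothetical, and it assumes you have already located the latest snapshot's `.img` file in the replica's data directory on the node. It runs the same `dd` invocation as the step above, with the block count as a parameter.

```shell
# Sketch only: overwrite a snapshot image with random data so that its
# size no longer matches what the replica expects.
# Assumption: "$1" is the path to the latest snapshot's .img file, which
# you have located on the node yourself. inflate_snapshot is a
# hypothetical helper name, not part of Longhorn.
inflate_snapshot() {
    snap_img="$1"
    snap_count_mib="${2:-50}"   # 50 matches the dd example in step 5

    if [ ! -f "$snap_img" ]; then
        echo "snapshot image not found: $snap_img" >&2
        return 1
    fi

    snap_before=$(wc -c < "$snap_img" | tr -d ' ')
    # Same dd invocation as in the test step; dd truncates the file and
    # then writes snap_count_mib blocks of random data.
    dd if=/dev/urandom of="$snap_img" bs=1M count="$snap_count_mib" 2>/dev/null || return 1
    snap_after=$(wc -c < "$snap_img" | tr -d ' ')

    echo "size before: $snap_before bytes, after: $snap_after bytes"
}

# Usage on the node (path is a placeholder, as in the step above):
#   inflate_snapshot <SNAPSHOT-NAME>.img 50
```

Printing the size before and after makes it easy to confirm the file really changed before moving on to the purge step.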
Scenario-2
- Create a volume with 3 replicas and attach it to a pod.
- Write some data into the volume and take two snapshots.
- Delete a replica; this results in a system-generated snapshot being created.
- While the rebuilding is in progress, delete a snapshot to trigger SnapshotPurge.
- Verify that Longhorn manager reports an error like:
Failed to purge snapshots: REPLICA_ADDRESS: cannot purge snapshots because REPLICA_ADDRESS is rebuilding
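One way to check for this message without reading the logs by eye is to filter them with `grep`. The sketch below is not an official verification procedure: the helper name `has_purge_rebuild_error` is hypothetical, and the usage line assumes a default install where the manager pods run in the `longhorn-system` namespace with the label `app=longhorn-manager`, and that the error appears in those pods' logs.

```shell
# Sketch only: read log text on stdin, print any lines containing the
# "is rebuilding" purge error quoted above, and return 0 if at least one
# such line was found.
# has_purge_rebuild_error is a hypothetical helper name.
has_purge_rebuild_error() {
    grep -E 'cannot purge snapshots because .+ is rebuilding'
}

# Usage (namespace and label are assumptions about a default install):
#   kubectl -n longhorn-system logs -l app=longhorn-manager --tail=-1 \
#       | has_purge_rebuild_error
```

The pattern matches any replica address, since `REPLICA_ADDRESS` in the expected message is a placeholder for the actual address.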