Longhorn is a brilliant piece of software (especially for those who cannot use block volumes on their Kubernetes cluster), but volumes can (and do) become degraded, requiring recovery from a backup.
Usually I use one of two approaches (and their corresponding yaml files) to achieve recovery.
Case 1 - Manual recovery
Let’s assume that we have a volume (pvc my-volume) which is fully functional, but we want to recover some files from a previous backup. To do so, mount the old backup as a new volume and create a new pvc for it (my-volume-old in the job below).
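A minimal sketch of that pvc, assuming the backup has already been restored to a Longhorn volume named my-volume-old and is statically provisioned through the Longhorn CSI driver (the volume name, size, and filesystem are placeholders; adjust them to your setup):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-volume-old
spec:
  capacity:
    storage: 10Gi # match the size of the restored volume
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: longhorn
  csi:
    driver: driver.longhorn.io
    fsType: ext4
    volumeHandle: my-volume-old # name of the restored Longhorn volume
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-volume-old
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  volumeName: my-volume-old # bind directly to the PV above
  resources:
    requests:
      storage: 10Gi
```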
If you can tolerate a bit of downtime, the fastest solution is to delete the workload using my-volume so that the volume can be attached to another pod (unless your volumes support ReadWriteMany), then mount the old and new volumes through the following job:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: migration
spec:
  completions: 1
  parallelism: 1
  backoffLimit: 3
  template:
    metadata:
      name: volume-migration
      labels:
        name: volume-migration
    spec:
      restartPolicy: Never
      containers:
        - name: volume-migration
          image: alpine
          tty: true
          command: [ "/bin/sh" ]
          args: [ "-c", "sleep 10000" ]
          volumeMounts:
            - name: new-vol
              mountPath: /mnt/new
            - name: old-vol
              mountPath: /mnt/old
      volumes:
        - name: old-vol
          persistentVolumeClaim:
            claimName: my-volume-old # change to data source pvc
        - name: new-vol
          persistentVolumeClaim:
            claimName: my-volume # change to data target pvc
```
It will create a new pod to which you can attach a shell and copy over the files you need.
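For reference, attaching the shell looks roughly like this (the pod name carries a random suffix generated by the job controller, and the paths are hypothetical):

```shell
# List the pod spawned by the job (its name has a random suffix)
kubectl get pods -l name=volume-migration

# Attach a shell to it (substitute the actual pod name)
kubectl exec -it <migration-pod-name> -- /bin/sh

# Inside the pod: -a preserves ownership and permissions
cp -a /mnt/old/path/to/files /mnt/new/
```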
Once you have recovered the files you need, simply delete the job and recreate your preexisting workload.
Case 2 - Automatic recovery
Let’s assume that you have a volume which you want to roll back to a previous backup (for simplicity I am assuming that the target volume is empty; if not, you probably want to amend the code and prepend an rm -rf of the target mount as the first command).
As in the previous case, mount the backup as a volume and create a pvc for it. Then you can fire off the job below, which automatically copies the files from the old volume to the new one while preserving ownership and file permissions.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: volume-migration
spec:
  completions: 1
  parallelism: 1
  backoffLimit: 3
  template:
    metadata:
      name: volume-migration
      labels:
        name: volume-migration
    spec:
      restartPolicy: Never
      containers:
        - name: volume-migration
          image: alpine # apk is Alpine's package manager, so an Alpine image is required for the apk add below
          tty: true
          command: [ "/bin/sh" ]
          args: [ "-c", "apk add rsync && rsync -arv /mnt/old/ /mnt/new/" ]
          volumeMounts:
            - name: old-vol
              mountPath: /mnt/old
            - name: new-vol
              mountPath: /mnt/new
      volumes:
        - name: old-vol
          persistentVolumeClaim:
            claimName: qbittorrent-backup # change to data source pvc
        - name: new-vol
          persistentVolumeClaim:
            claimName: qbittorrent # change to data target pvc
```
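The job can be launched and monitored with kubectl; once rsync finishes, the job completes on its own (the manifest file name volume-migration.yaml is an assumption):

```shell
# Launch the copy job
kubectl apply -f volume-migration.yaml

# Follow the rsync output until the job completes
kubectl logs -f job/volume-migration

# Clean up once the copy is done
kubectl delete job volume-migration
```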
Manual mount of an existing volume
In case you need to mount an existing volume (e.g. in order to investigate its file structure), you can mount it via a job like the following:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: migration
spec:
  completions: 1
  parallelism: 1
  backoffLimit: 3
  template:
    metadata:
      name: volume-migration
      labels:
        name: volume-migration
    spec:
      restartPolicy: Never
      containers:
        - name: volume-migration
          image: alpine
          tty: true
          command: [ "/bin/sh" ]
          args: [ "-c", "sleep 10000" ]
          volumeMounts:
            - name: new-vol
              mountPath: /mnt/new
      volumes:
        - name: new-vol
          persistentVolumeClaim:
            claimName: home-assistant
```