Backing Up and Restoring Kubernetes Volume Contents

Working with a Kubernetes cluster often involves backing up the contents of one volume and restoring them into another. Suppose you have Postgres database data stored in a volume of a certain size, but you now need more capacity because the number of rows in your database keeps growing. In that case you might first copy the contents of the existing volume into a newly mounted volume.
This process involves two steps:
- Creating a PersistentVolume for the new volume.
- Copying the contents of the old volume into the new one.
This can be done by creating a simple Job resource in Kubernetes. We mount the source (old volume) and the destination (new volume) into a single container, then run a copy command to move the contents from one to the other. The Job shown later uses the eeacms/rsync image for this.
Let's start by creating a PersistentVolume, i.e. adding storage to the Kubernetes cluster. The definition below describes the new PersistentVolume. It is named target-pv because it is where we'll store the copy of the existing volume's content.
# PV for the new volume
# PersistentVolumes are cluster-scoped, so no namespace is set
apiVersion: v1
kind: PersistentVolume
metadata:
  name: target-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local10
  hostPath:
    path: "/home/app/copy"
The definition above creates a volume in the Kubernetes cluster:
- It is backed by the host path /home/app/copy.
- It has a capacity of 1Gi.
- Its access mode is ReadWriteOnce, i.e. it can be mounted read-write by a single node at a time.
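The volume above is the same size as the old one. If the point of the migration is extra capacity, as in the scenario from the introduction, the new volume could simply be declared larger. A minimal sketch, assuming 5Gi is the size you want (that figure is illustrative and not from the original setup):

```yaml
# Hypothetical larger variant of target-pv; 5Gi is an assumed size
apiVersion: v1
kind: PersistentVolume
metadata:
  name: target-pv
spec:
  capacity:
    storage: 5Gi          # assumed value; pick the size you need
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local10
  hostPath:
    path: "/home/app/copy"
```

A claim can bind to this volume as long as it requests no more than 5Gi; a claim asking for more than the volume's capacity will not bind to it.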
That covers the target (new) volume, where the files will be copied. Now let's look at the PersistentVolume and claim for the source (old) volume, from which the content will be copied.
# PV and PVC for the old volume
# PersistentVolumes are cluster-scoped, so only the claim has a namespace
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local1
  hostPath:
    path: "/home/app/postgres"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: dj-kubernetes
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: local1
The above is the definition of the existing Postgres database volume. postgres-pvc is the volume claim used by Postgres. The database content written through it ends up on the host path /home/app/postgres.
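The claim does not name its volume. Kubernetes binds it to a PersistentVolume whose storage class, access mode, and capacity satisfy the request. If you would rather pin the claim to one specific volume, the claim's spec.volumeName field can be set when the claim is created. A sketch of that alternative, reusing the names from the definition above:

```yaml
# Alternative postgres-pvc that names its PersistentVolume explicitly
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: dj-kubernetes
spec:
  volumeName: postgres-pv   # bind to this PV rather than any matching one
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: local1
```

This is optional here, since each storage class in the example has only one volume, but it removes any ambiguity once several volumes share a class.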
# PV and PVC for the new volume, plus the Job that copies the content
apiVersion: v1
kind: PersistentVolume
metadata:
  name: target-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local10
  hostPath:
    path: "/home/app/copy"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: target-pvc
  namespace: dj-kubernetes
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: local10
---
apiVersion: batch/v1
kind: Job
metadata:
  name: copy-postgres-data
  namespace: dj-kubernetes
spec:
  template:
    spec:
      restartPolicy: Never
      volumes:
        # Old volume, read from
        - name: source-volume
          persistentVolumeClaim:
            claimName: postgres-pvc
        # New volume, written to
        - name: target-volume
          persistentVolumeClaim:
            claimName: target-pvc
      containers:
        - name: copy-container
          image: eeacms/rsync
          command: ["sh", "-c", "rsync -av --progress --no-perms --ignore-existing /source/ /target/"]
          volumeMounts:
            - mountPath: /source
              name: source-volume
            - mountPath: /target
              name: target-volume
Notice the two volumes and their volumeMounts:
template:
  spec:
    restartPolicy: Never
    volumes:
      - name: source-volume
        persistentVolumeClaim:
          claimName: postgres-pvc
      - name: target-volume
        persistentVolumeClaim:
          claimName: target-pvc
    containers:
      - name: copy-container
        image: eeacms/rsync
        command: ["sh", "-c", "rsync -av --progress --no-perms --ignore-existing /source/ /target/"]
        volumeMounts:
          - mountPath: /source
            name: source-volume
          - mountPath: /target
            name: target-volume
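Here we use the eeacms/rsync Docker image to run the command
rsync -av --progress --no-perms --ignore-existing /source/ /target/
Here /source is the mount path for the source storage, i.e. our old volume, while /target is the mount path for the target, i.e. the new volume. The trailing slash on /source/ makes rsync copy the directory's contents rather than the directory itself. The -a flag copies recursively in archive mode, -v and --progress print what is being transferred, --no-perms turns off the permission preservation that -a would otherwise enable, and --ignore-existing skips files that already exist in the target. The two claim names, postgres-pvc and target-pvc, tell Kubernetes which volumes to make available to the pod. The volumeMounts then attach those volumes inside the container.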
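Since the Job only reads from the old volume, one optional safeguard is to mount the source read-only so the copy cannot modify the original data. The sketch below is a variation on the Job above, not part of the original setup: the name copy-postgres-data-ro is hypothetical, and backoffLimit: 0 is an assumed choice so that a failed copy is not retried automatically.

```yaml
# Hypothetical variant of the copy Job with a read-only source mount
apiVersion: batch/v1
kind: Job
metadata:
  name: copy-postgres-data-ro   # assumed name, not from the original
  namespace: dj-kubernetes
spec:
  backoffLimit: 0               # assumed: do not retry a failed copy
  template:
    spec:
      restartPolicy: Never
      volumes:
        - name: source-volume
          persistentVolumeClaim:
            claimName: postgres-pvc
        - name: target-volume
          persistentVolumeClaim:
            claimName: target-pvc
      containers:
        - name: copy-container
          image: eeacms/rsync
          command: ["sh", "-c", "rsync -av --progress --no-perms --ignore-existing /source/ /target/"]
          volumeMounts:
            - mountPath: /source
              name: source-volume
              readOnly: true    # the old volume is only read from
            - mountPath: /target
              name: target-volume
```

The rsync command is unchanged. Only the source mount and the retry behaviour differ from the Job shown earlier.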