Backup and Restore

In this tutorial, we will discuss various backup and restore methodologies for a Kubernetes cluster.

Let’s start by looking at what we should consider backing up in a Kubernetes cluster.

So far we have deployed a number of different applications on our Kubernetes cluster using deployment, POD, and service definition files.

We know that the ETCD cluster is where all cluster-related information is stored. And if our applications are configured with persistent storage, then that is another candidate for backups.

Imperative

With respect to the resources that we created in the cluster, at times we used the imperative way of creating an object by executing a command, such as while creating a namespace, secret, or configmap, or at times for exposing applications.

Declarative

And at times we used the declarative approach by first creating a definition file and then running the kubectl apply command on that file. This is the preferred approach if we want to save our configuration, because now we have all the objects required for a single application in the form of object definition files in a single folder.

These can easily be reused at a later time or shared with others. Of course, we must have a copy of these files saved at all times.

A good practice is to store these in a source code repository so that they can be maintained by a team. The source code repository should be configured with the right backup solution.

With managed/public source code repositories like GitHub, we don’t have to worry about this. With that, even when we lose our entire cluster, we can redeploy our applications on a new cluster by simply applying these configuration files to it.
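As a minimal sketch of that redeploy step, assuming the definition files have been checked out from the repository into a local folder: the function name redeploy and the folder name in the usage comment are hypothetical, while kubectl apply with -f and -R (recurse into subfolders) is the standard command.

```shell
#!/bin/sh
# redeploy DIR: apply every definition file found under DIR to the current cluster.
redeploy() {
    manifest_dir="$1"
    if [ ! -d "$manifest_dir" ]; then
        echo "no such directory: $manifest_dir" >&2
        return 1
    fi
    # -R makes kubectl walk subdirectories as well.
    kubectl apply -R -f "$manifest_dir"
}

# Example invocation (hypothetical folder name):
# redeploy ./k8s-manifests
```

Because kubectl apply is declarative, running this again against a cluster that already has the objects only updates what has changed.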

While the declarative approach is the preferred approach, there is no guarantee that all of our team members will stick to those standards. What if someone created an object the imperative way without documenting that information anywhere?

So a better approach to backing up resource configuration is to query the Kube API Server.

Backup – Resource Configs

Query the Kube API Server using kubectl, or by accessing the API server directly, and save a copy of the resource configurations of all objects created on the cluster.

For example, one of the commands that can be used in a backup script is to get all PODs, deployments and services in all namespaces using the following command:

$ kubectl get all --all-namespaces -o yaml > all-deploy-services.yaml

The above command uses the kubectl utility's get all command and writes the output in YAML format to a file that we then save. And that’s just for a few resource groups. There are many other resource groups that must be considered.

Of course, you don’t have to develop that solution yourself. There are tools like Velero (formerly called ARK, originally by Heptio) that can do this for you. It can help in taking backups of your Kubernetes cluster using the Kubernetes API.
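To cover more resource groups, a backup script could loop over a list of kinds and write one file per kind. The following is only a sketch: the function name backup_resources, the list of kinds, and the output folder are assumptions for illustration, and the list would need to be adjusted to what your cluster actually uses.

```shell
#!/bin/sh
# Hypothetical list of resource kinds to export; extend it for your cluster.
KINDS="all configmaps secrets persistentvolumeclaims ingresses"

# backup_resources DIR: write one YAML file per kind into DIR.
backup_resources() {
    out_dir="$1"
    mkdir -p "$out_dir" || return 1
    for kind in $KINDS; do
        # Each file holds every object of that kind across all namespaces.
        kubectl get "$kind" --all-namespaces -o yaml > "$out_dir/$kind.yaml" || return 1
    done
}

# Example invocation against a live cluster (hypothetical folder name):
# backup_resources "./cluster-backup-$(date +%Y%m%d)"
```

Note that the exported secrets file contains the secret values in base64-encoded form, so the backup folder should be protected accordingly.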

Backup – ETCD

Let us now move on to ETCD. The ETCD cluster stores information about the state of the cluster. So information about the cluster itself, the nodes, and every other resource created within the cluster is stored here.

So instead of backing up resources as before, you may choose to back up the ETCD server itself. As we have seen, the ETCD cluster is hosted on the master nodes.

While configuring ETCD, we specified a location where all the data would be stored: the data directory. That is the directory that can be configured to be backed up by your backup tool.

ETCD also comes with a built-in snapshot solution. You can take a snapshot of the ETCD database by using the etcdctl utility's snapshot save command.
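If you are not sure which data directory was configured, it can be read back from the file that starts ETCD. The sketch below assumes the --data-dir option appears in that file; the function name etcd_data_dir is hypothetical, and the file paths in the usage comments are only common locations, not something this tutorial has set up.

```shell
#!/bin/sh
# etcd_data_dir FILE: print the value of the --data-dir option found in FILE.
etcd_data_dir() {
    # Accepts both "--data-dir=/path" and "--data-dir /path"; prints the first match.
    sed -n 's/.*--data-dir[= ]\([^ ]*\).*/\1/p' "$1" | head -n 1
}

# Example invocations (assumed locations, adjust to your setup):
# etcd_data_dir /etc/systemd/system/etcd.service
# etcd_data_dir /etc/kubernetes/manifests/etcd.yaml
```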

$ etcdctl snapshot save snapshot.db

$ ls 
snapshot.db

Once you run the above command, a snapshot file named snapshot.db is created in the current directory.

If you want it to be created in another location, specify the full path. You can view the status of the backup using the snapshot status command.
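Both steps can be sketched together as follows. The function name take_snapshot and the paths in the usage comment are assumptions; the --endpoints, --cacert, --cert and --key options are the usual etcdctl connection flags and are typically needed when ETCD is secured with TLS. On newer etcd releases the status subcommand is also offered by the separate etcdutl tool.

```shell
#!/bin/sh
# take_snapshot FILE [etcdctl connection flags...]
# Saves a snapshot to FILE, then prints its status.
take_snapshot() {
    snapshot_file="$1"
    shift
    # ETCDCTL_API=3 selects the v3 API, which provides the snapshot subcommands.
    ETCDCTL_API=3 etcdctl "$@" snapshot save "$snapshot_file" || return 1
    # The status check reads the snapshot file itself, so no connection flags are passed.
    ETCDCTL_API=3 etcdctl snapshot status "$snapshot_file"
}

# Example invocation (hypothetical paths and endpoint):
# take_snapshot /opt/backups/snapshot.db \
#     --endpoints=https://127.0.0.1:2379 \
#     --cacert=/etc/etcd/ca.crt \
#     --cert=/etc/etcd/etcd-server.crt \
#     --key=/etc/etcd/etcd-server.key
```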

Restore – ETCD

To restore the cluster from this backup at a later point in time, first stop the kube-apiserver service, as the restore process will require you to restart the ETCD cluster and the kube-apiserver depends on it. Then run the etcdctl snapshot restore command with the path set to the path of the backup file, which is the snapshot.db file.

When ETCD restores from a backup, it initializes a new cluster configuration and configures the members of ETCD as new members of a new cluster. This is to prevent a new member from accidentally joining an existing cluster.

Say for example, you use this backup snapshot to provision a new etcd-cluster for testing purposes. You don’t want the members in the new test cluster to accidentally join the production cluster.

So during a restore you must specify a new cluster token and the same initial cluster configuration options specified in the original configuration file. On running the restore command, a new data directory is created.

We then update the ETCD configuration file to use the new cluster token and data directory. Then reload the service daemon and restart the etcd service.

Finally, start the kube-apiserver service. Your cluster should now be back in its original state.
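A restore command along these lines might look like the sketch below. Every value in it (member name, peer URLs, token, new data directory) is a hypothetical placeholder and must be replaced with the values from your original ETCD configuration, apart from the token and data directory, which should be new. The flags themselves are options of etcdctl snapshot restore in the v3 API.

```shell
#!/bin/sh
# Hypothetical values; take the name, peer URL and initial cluster from the
# original ETCD configuration, and pick a NEW token and a NEW data directory.
ETCD_NAME="master-1"
PEER_URL="https://192.168.5.11:2380"
INITIAL_CLUSTER="master-1=https://192.168.5.11:2380,master-2=https://192.168.5.12:2380"
NEW_CLUSTER_TOKEN="etcd-cluster-1"
NEW_DATA_DIR="/var/lib/etcd-from-backup"

# restore_snapshot FILE: restore the snapshot in FILE into NEW_DATA_DIR.
restore_snapshot() {
    ETCDCTL_API=3 etcdctl snapshot restore "$1" \
        --name "$ETCD_NAME" \
        --data-dir "$NEW_DATA_DIR" \
        --initial-cluster "$INITIAL_CLUSTER" \
        --initial-cluster-token "$NEW_CLUSTER_TOKEN" \
        --initial-advertise-peer-urls "$PEER_URL"
}

# Example invocation:
# restore_snapshot snapshot.db
```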
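The service steps around the restore can be sketched as two small helpers. This assumes ETCD and the kube-apiserver run as systemd services named etcd and kube-apiserver; the function names and the DRY_RUN switch are hypothetical additions for illustration. Setups where these components run as static PODs need a different procedure.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: stop the API server before restoring the snapshot.
stop_api_server() {
    run systemctl stop kube-apiserver
}

# Step 2 (after the restore and the ETCD configuration change):
# reload the service daemon, restart etcd, then start the API server.
start_after_restore() {
    run systemctl daemon-reload || return 1
    run systemctl restart etcd || return 1
    run systemctl start kube-apiserver
}

# Example order of operations:
# stop_api_server
# ...restore the snapshot and update the ETCD configuration file...
# start_after_restore
```

Running with DRY_RUN=1 first prints the commands without touching any service, which is a cheap way to check the order before doing it for real.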
