Operating System Upgrade
In this tutorial, we will discuss cluster maintenance tasks such as upgrading the operating system or applying patches, like security patches, to the nodes in your cluster.
So you have a cluster with a few nodes and PODs serving applications. What happens when one of these nodes goes down? Of course, the PODs on it are no longer accessible.
Now depending upon how you deployed these PODs, your users may be impacted.
Simple Example
For example, since you have multiple replicas of the blue POD, the users accessing the blue application are not impacted as they are being served through the other blue POD that’s online.
However, users accessing the green POD are impacted, as that was the only POD running the green application. Now what does Kubernetes do in this case?
If the node comes back online immediately, the kubelet process starts and the PODs come back online. However, if the node stays down for more than 5 minutes, the PODs are terminated from that node.
Well, Kubernetes considers them dead. If the PODs were part of a ReplicaSet, they are recreated on other nodes. The time Kubernetes waits before evicting the PODs is known as the POD eviction timeout, and it is set on the controller manager with a default value of 5 minutes.
So whenever a node goes offline, the master node waits for up to 5 minutes before considering the node dead.
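If you manage the control plane yourself, this value is visible on the kube-controller-manager. A minimal sketch of the relevant flag, shown with its default value, is below; note that newer Kubernetes versions handle this through taint-based evictions, so treat the flag as illustrative of where the setting lives rather than a required configuration:

$ kube-controller-manager --pod-eviction-timeout=5m0s ...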
When the node comes back online after the POD eviction timeout, it comes up blank, without any PODs scheduled on it.
Since the blue POD was part of a ReplicaSet, a new POD was created for it on another node. However, since the green POD was not part of a ReplicaSet, it is simply gone.
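That difference comes down to whether the POD is managed by a controller. For instance, two replicas of the blue POD could come from a Deployment like the minimal sketch below (the name, labels, and image are hypothetical placeholders, not taken from the cluster above); a standalone green POD created without such a controller has nothing to recreate it:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: blue-app            # hypothetical name for the blue application
spec:
  replicas: 2               # two replicas, so one surviving POD keeps serving users
  selector:
    matchLabels:
      app: blue
  template:
    metadata:
      labels:
        app: blue
    spec:
      containers:
      - name: blue
        image: nginx        # placeholder image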
Thus, if you have maintenance tasks to perform on a node, and you know that the workloads running on it have other replicas, that it's okay for them to go down for a short period of time, and that the node will come back online within 5 minutes, you can make a quick upgrade and reboot.
However, you can't know for sure that a node will be back online within 5 minutes. In fact, you can't be sure it will come back at all. So there is a safer way to do it.
You can purposefully drain the node of all its workloads, so that the workloads are moved to other nodes in the cluster.
Well, technically they are not moved. When you drain the node, the PODs are gracefully terminated on the node they're on and recreated on another.
$ kubectl drain node1
The node is also cordoned, or marked as unschedulable, meaning no PODs can be scheduled on it until you specifically remove the restriction.
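In practice, the node usually runs DaemonSet-managed PODs that drain cannot evict, and drain will refuse to delete standalone PODs such as the green one unless you pass --force (in which case they are deleted and not recreated). A typical invocation, followed by a check that the node is now cordoned, might look like this (node1 and the output are only illustrative):

$ kubectl drain node1 --ignore-daemonsets --force

$ kubectl get nodes
NAME    STATUS                     ROLES    AGE   VERSION
node1   Ready,SchedulingDisabled   <none>   10d   v1.27.0
node2   Ready                      <none>   10d   v1.27.0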
Now that the PODs are safe on the other nodes, you can reboot the first node. When it comes back online, it is still unschedulable.
You then need to uncordon it, so that PODs can be scheduled on it again.
$ kubectl uncordon node1
Now, remember that the PODs that were moved to the other nodes don't automatically fall back. If any of those PODs were deleted, or if new PODs were created in the cluster, they could then be scheduled on this node.
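You can verify that the restriction was lifted by checking the node's status again; the SchedulingDisabled marker should be gone (output illustrative):

$ kubectl get nodes
NAME    STATUS   ROLES    AGE   VERSION
node1   Ready    <none>   10d   v1.27.0
node2   Ready    <none>   10d   v1.27.0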
Apart from drain and uncordon, there is also another command called cordon.
$ kubectl cordon node1
Cordon simply marks a node as unschedulable. Unlike drain, it doesn't terminate or move the PODs already running on the node. It simply makes sure that no new PODs are scheduled on it.
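For example, cordoning a node and then listing PODs with their node assignments shows that existing PODs stay put while only new scheduling is blocked (POD and node names below are hypothetical, and the output is trimmed for readability):

$ kubectl cordon node1

$ kubectl get pods -o wide
NAME             READY   STATUS    NODE
blue-app-abc12   1/1     Running   node1
blue-app-def34   1/1     Running   node2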