Enable Rolling Updates in Kubernetes with Zero Downtime

Recently, a couple of my colleagues were deploying applications with Kubernetes and noticed a bit of downtime when they rolled out updates. They were confused about why this happened, since they thought rolling updates were enabled by default in Kubernetes. So I thought I'd write this article to help anyone wondering why this happens.

Users expect applications to be available all the time, and developers are expected to deploy new versions of them several times a day. In Kubernetes, this is done with rolling updates. Rolling updates allow a Deployment's update to take place with zero downtime by incrementally replacing Pod instances with new ones. The new Pods are scheduled on Nodes with available resources.

Let’s take a simple deployment manifest.
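Here is a minimal sketch of such a manifest. The name hello-dep comes from this post; the image (nginx:1.16) and the replica count are placeholders for whatever your application actually runs:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-dep
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-dep
  template:
    metadata:
      labels:
        app: hello-dep
    spec:
      containers:
      - name: hello-dep
        # Placeholder image; substitute your own application image.
        image: nginx:1.16
        ports:
        - containerPort: 80
```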

This should work fine: execute the following command, and a Deployment called hello-dep will be created.
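Assuming you saved the manifest above as hello-dep.yaml (the filename is just an example):

```sh
kubectl apply -f hello-dep.yaml
```

You can verify the result with kubectl get deployment hello-dep.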

But what if you want to update the image running in the above Deployment? Then you would either do a kubectl edit, or edit the YAML file as follows and re-deploy it using kubectl apply -f.

Let's see the edited YAML file.
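Continuing with the placeholder image from earlier, the only change is the image tag (bumped here from nginx:1.16 to nginx:1.17 for illustration):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-dep
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-dep
  template:
    metadata:
      labels:
        app: hello-dep
    spec:
      containers:
      - name: hello-dep
        # The only change: the image tag has been updated.
        image: nginx:1.17
        ports:
        - containerPort: 80
```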

Once you apply or edit this, notice that there may be a little downtime in your application, because the old Pods are being terminated while the new ones are being created. You can easily observe this downtime if you open a new tab in your terminal and run a curl command against your exposed Service every second. (Note that I have not covered how to create or expose a Service for a Deployment in this post.)
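As a quick sketch of that check (replace <service-address> with your Service's IP or hostname, which depends on how you exposed it):

```sh
# Poll the service once a second; any gap shows up as a curl error.
while true; do
  curl -sS --max-time 1 http://<service-address>/ || echo "request failed"
  sleep 1
done
```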

Okay, now for the fix, which is actually pretty easy. The downtime happens because Kubernetes doesn't know when your new Pod is ready to start accepting requests. As soon as the new Pod is created, the old Pod is terminated, without waiting to see whether all the necessary services and processes in the new Pod have started, which is what would enable it to receive requests.

To solve this, Kubernetes provides a configuration option in the Deployment called a readiness probe. A readiness probe makes sure the newly created Pods are ready to take on requests before the old Pods are terminated. To enable it, you first need a route in whatever application you want to run that returns a 200 on an HTTP GET request. (Note: you can use other HTTP request methods as well, but for this post I'm sticking with GET.)
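Here's a sketch of the container spec with a readiness probe added. I'm pointing httpGet at / on port 80 to match the placeholder nginx image; in your application, use whatever route returns a 200. The timing values are illustrative, not prescriptive:

```yaml
    spec:
      containers:
      - name: hello-dep
        image: nginx:1.17
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            # Any route in your application that answers GET with a 200.
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
```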

Here are the details of the fields I have added above.

  • initialDelaySeconds: Number of seconds after the container has started before readiness probes are initiated.
  • periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
  • timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
  • successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
  • failureThreshold: When a Pod starts and the probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in case of liveness probe means restarting the Pod. In case of readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.

Another thing we should add is the RollingUpdate strategy, which can be configured as follows.
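A sketch of how it sits under the Deployment's spec (the values shown are the defaults, included here just to make them explicit):

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
```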

The above specifies the strategy used to replace old Pods with new ones. The type can be "Recreate" or "RollingUpdate", and "RollingUpdate" is the default value. Hence my teammates' confusion about why rolling updates didn't seem to work by default: they did, but Kubernetes didn't know when the new Pods were ready, so there was downtime.

maxUnavailable is an optional field that specifies the maximum number of Pods that can be unavailable during the update process. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The absolute number is calculated from percentage by rounding down. The value cannot be 0 if maxSurge is 0. The default value is 25%.

maxSurge is an optional field that specifies the maximum number of Pods that can be created over the desired number of Pods. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The value cannot be 0 if maxUnavailable is 0. The absolute number is calculated from the percentage by rounding up. The default value is 25%.
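To make the rounding concrete: with 3 replicas, maxSurge: 25% rounds up to 1, so at most 4 Pods exist during the update, while maxUnavailable: 25% rounds down to 0, so a new Pod must become ready before an old one is terminated.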

With all the above configurations in place, let's take a look at the final deployment file.
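A minimal sketch, combining the placeholder values used throughout this post:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-dep
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  selector:
    matchLabels:
      app: hello-dep
  template:
    metadata:
      labels:
        app: hello-dep
    spec:
      containers:
      - name: hello-dep
        # Placeholder image; substitute your own application and tag.
        image: nginx:1.17
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
```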

There you have it. I hope this was somewhat informative. I'll be back with more cool posts on Kubernetes.