Canary Deployment with Analysis using Argo Rollouts
We hope you have gone through and enjoyed the first two parts of our progressive delivery with Argo Rollouts series, where we saw how to implement the blue-green and canary deployment strategies by deploying a sample application with the Argo Rollouts controller in a Kubernetes cluster.
In Part 3 of this series, we take a step further and explore the canary deployment strategy with automated analysis by deploying a sample app using Argo Rollouts. This will help us learn how to either fully promote or roll back the next upgrade of our microservices with ease, without impacting end users and, more importantly, without any human intervention.
What is Analysis? Why is it needed?
When we upgrade or deploy new versions of our microservices, we also want to be sure that the new changes are not breaking any functionality. To be sure of this, one needs to perform certain functional and sanity testing after every upgrade. Thanks to Argo Rollouts Analysis, this kind of testing can be performed before, during, or after the upgrade in an automated way, and based on the results of the analysis, we can either roll forward or roll back the new changes completely.
Now, you must be wondering how to implement this so-called Analysis, right?
For that, you create and apply an AnalysisTemplate object. When a Rollout object references this template, the Argo Rollouts controller creates another Kubernetes object called an AnalysisRun. The AnalysisRun runs the analysis of your choice and decides whether the upgrade is successful (roll forward) or unsuccessful (roll back).
For analysis, you can use metrics scraped from your canary services with the help of monitoring providers such as Prometheus, Datadog, or New Relic. You can also create your own Kubernetes Jobs to trigger a custom set of tests or, if needed, send an HTTP request to an external service and decide based on the response.
Analysis based on an AnalysisTemplate
A Rollout can use an AnalysisTemplate, such as the sample below, in several different ways.
Sample AnalysisTemplate
In the example below, we calculate the success rate of the new canary version using Istio-based Prometheus metrics: the query divides the rate of HTTP requests that did not return a 5xx error by the rate of all HTTP requests. Based on the successCondition defined below, if at least 95% of requests are successful, the analysis is considered Successful and the rollout is promoted.
# https://argoproj.github.io/argo-rollouts/features/analysis/#background-analysis
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  - name: prometheus-port
    value: 9090
  metrics:
  - name: success-rate
    successCondition: result[0] >= 0.95
    provider:
      prometheus:
        address: "http://prometheus.example.com:{{args.prometheus-port}}"
        query: |
          sum(irate(
            istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}",response_code!~"5.*"}[5m]
          )) /
          sum(irate(
            istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}"}[5m]
          ))
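To make the successCondition concrete, here is a small worked example of the same ratio. It is only a sketch: the request counts are hypothetical, and it uses plain counts over the window rather than the per-second rates that the PromQL query works with.

```shell
# Hypothetical request counts for the canary over the 5m window.
total=1000
errors_5xx=30

# Same ratio as the query: non-5xx requests divided by all requests.
rate=$(awk -v t="$total" -v e="$errors_5xx" 'BEGIN { printf "%.2f", (t - e) / t }')

# Mirror of successCondition: result[0] >= 0.95
verdict=$(awk -v r="$rate" 'BEGIN { if (r >= 0.95) print "Successful"; else print "Failed" }')

echo "success rate: $rate -> $verdict"
```

With these numbers, 970 of 1000 requests succeed, so the ratio clears the 0.95 threshold. Raising errors_5xx above 50 would push the ratio below the threshold and flip the verdict.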
Background Analysis
We can run our analysis in the background while our canary rollout is progressing through its rollout steps.
The sample below shows how to run analysis in the background using Rollouts. Look specifically at where the analysis section is defined:
# https://argoproj.github.io/argo-rollouts/features/analysis/#background-analysis
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  ...
  strategy:
    canary:
      analysis:
        templates:
        - templateName: success-rate
        startingStep: 2 # delay starting analysis run until setWeight: 40%
        args:
        - name: service-name
          value: guestbook-svc.default.svc.cluster.local
      steps:
      - setWeight: 20
      - pause: {duration: 10m}
      - setWeight: 40
      - pause: {duration: 10m}
      - setWeight: 60
Inline Analysis
We can also perform our analysis inline as part of the rollout steps. When the analysis is declared inline, it is triggered only when that step is reached, and it holds back the rest of the rollout until the analysis is complete. The success or failure of the analysis run decides whether the rollout proceeds to the next step or is aborted completely.
The sample below shows how to run inline analysis using Rollouts. Look specifically at where the analysis section is defined as part of, i.e., inlined into, the steps:
# https://argoproj.github.io/argo-rollouts/features/analysis/#inline-analysis
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  ...
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 5m}
      - analysis:
          templates:
          - templateName: success-rate
          args:
          - name: service-name
            value: guestbook-svc.default.svc.cluster.local
BlueGreen Pre Promotion Analysis
A Rollout using the BlueGreen strategy can launch an AnalysisRun before it switches traffic to the new version, using pre-promotion analysis. This can be used to block the Service selector switch until the AnalysisRun finishes successfully. The success or failure of the AnalysisRun decides whether the Rollout switches traffic or is aborted completely.
# https://argoproj.github.io/argo-rollouts/features/analysis/#bluegreen-pre-promotion-analysis
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  ...
  strategy:
    blueGreen:
      activeService: active-svc
      previewService: preview-svc
      prePromotionAnalysis:
        templates:
        - templateName: smoke-tests
        args:
        - name: service-name
          value: preview-svc.default.svc.cluster.local
BlueGreen Post Promotion Analysis
A Rollout using the BlueGreen strategy can launch an AnalysisRun after the traffic switch to the new version, using post-promotion analysis. If the post-promotion analysis fails, the Rollout enters an aborted state and switches traffic back to the previous stable ReplicaSet. When the post-promotion analysis is Successful, the Rollout is considered fully promoted, and the new ReplicaSet is marked as stable.
# https://argoproj.github.io/argo-rollouts/features/analysis/#bluegreen-post-promotion-analysis
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  ...
  strategy:
    blueGreen:
      activeService: active-svc
      previewService: preview-svc
      scaleDownDelaySeconds: 600 # 10 minutes
      postPromotionAnalysis:
        templates:
        - templateName: smoke-tests
        args:
        - name: service-name
          value: preview-svc.default.svc.cluster.local
By now, you should be familiar with the crux of how analysis plays its role in the overall rollout. So let’s get our hands dirty with a hands-on lab, since doing is learning.
Lab of Argo Rollouts with Canary Deployment and Analysis
Once the lab is triggered through the LAB SETUP button, a terminal and an IDE open with a Kubernetes cluster already running. You can check this by running the kubectl get nodes command.
- Clone the Argo Rollouts example GitHub repo or, preferably, fork it first:
git clone https://github.com/NiniiGit/argo-rollouts-example.git
Installation of Argo Rollouts controller
- Create the namespace for the Argo Rollouts controller and install Argo Rollouts with the commands below. More about the installation can be found in the official Argo Rollouts documentation.
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
You will see that the controller and other components have been deployed. Wait for the pods to be in the Running state.
kubectl get all -n argo-rollouts
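Rather than re-running the command above until every pod is ready, one option is to let kubectl block until the pods report Ready. This is a minimal sketch: the helper name wait_for_argo_rollouts and the 120s default timeout are assumptions made for illustration and are not part of the lab setup.

```shell
# Hypothetical helper: block until every pod in the argo-rollouts
# namespace reports the Ready condition, or until the timeout expires.
wait_for_argo_rollouts() {
  timeout="${1:-120s}"   # assumed default; raise it for slower clusters
  kubectl wait --for=condition=Ready pods --all -n argo-rollouts --timeout="$timeout"
}
```

Calling wait_for_argo_rollouts 180s should return once the controller pod is ready, or exit with a non-zero status if the timeout passes first.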
- Install the Argo Rollouts kubectl plugin with curl for easy interaction with the Rollouts controller and resources.
curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x ./kubectl-argo-rollouts-linux-amd64
sudo mv ./kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts
kubectl argo rollouts version
- Argo Rollouts also comes with its own GUI, which you can access with the command below:
kubectl argo rollouts dashboard
Now click the available argo-rollout-app URL on the right side under the LAB-URLs section. You will be presented with the Argo Rollouts UI (currently it won’t show anything, since we have yet to deploy any Argo Rollouts based app).
Now, let’s go ahead and deploy the sample app using the Canary Deployment strategy and analysis.
Canary Deployment And Analysis with Argo Rollouts
To experience how canary deployment with analysis works with Argo Rollouts, we will deploy the sample app, which consists of an AnalysisTemplate, a Rollout with the canary strategy, a Service, and an Ingress as Kubernetes objects.
analysis.yaml content:
kind: AnalysisTemplate
apiVersion: argoproj.io/v1alpha1
metadata:
  name: canary-check
spec:
  metrics:
  - name: test
    provider:
      job:
        spec:
          backoffLimit: 1
          template:
            spec:
              containers:
              - name: busybox
                image: busybox
                #args: [test] #--> for making analysis fail, uncomment
              restartPolicy: Never
rollout.yaml content:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 5
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 10}
      - analysis:
          templates:
          - templateName: canary-check
      - setWeight: 40
      - pause: {duration: 10}
      - setWeight: 60
      - pause: {duration: 10}
      - setWeight: 80
      - pause: {duration: 10}
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollouts-demo
  template:
    metadata:
      labels:
        app: rollouts-demo
    spec:
      containers:
      - name: rollouts-demo
        image: argoproj/rollouts-demo:blue
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: 32Mi
            cpu: 5m
service.yaml content:
apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo
ingress.yaml content:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rollout-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: rollouts-demo
            port:
              number: 80
- Now, let’s create all these objects in the default namespace. Execute the command below:
kubectl apply -f argo-rollouts-example/canary-deployment-withanalysis-example/
You will be able to see all the objects that have been created in the default namespace by running the command below:
kubectl get all
Now, you can access your sample app by clicking on the app-port-80 URL under the LAB-URLs section.
- You would be able to see the app as shown below:
- Now, visit the Argo Rollouts console again through the argo-rollout-app URL. This time, you can see the sample app deployed on the Argo Rollouts console, as below:
You can click on rollouts-demo in the console, and it will present you with its current status, as below:
Again, you can either use this GUI or (preferably) use the commands shown below to continue with this demo.
- You can see the current status of this rollout by running the below command as well
kubectl argo rollouts get rollout rollouts-demo
When Analysis is successful
- Now, let’s deploy the Yellow version of the app using the canary strategy via the command line
kubectl argo rollouts set image rollouts-demo rollouts-demo=argoproj/rollouts-demo:yellow
You will be able to see a new pod, based on the yellow version of our sample app, coming up.
kubectl get pods
Currently, only 20% of the pods, i.e., 1 out of 5, will come online with the yellow version. The rollout will then pause for 10 seconds before it initiates the analysis, as defined in the steps above. See line 11 in rollout.yaml.
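The pod counts for each setWeight step in rollout.yaml can be worked out with simple arithmetic. The sketch below assumes that, without traffic management, the canary pod count is the weight percentage of the replica count, rounded up; the rounding is an assumption here, and it does not matter for this lab's values because every step divides evenly.

```shell
# replicas and the weights are taken from rollout.yaml above.
replicas=5
counts=""
for weight in 20 40 60 80; do
  # Integer ceiling of replicas * weight / 100 (assumed rounding rule).
  canary=$(( (replicas * weight + 99) / 100 ))
  echo "setWeight $weight -> $canary of $replicas pods on the new version"
  counts="$counts $canary"
done
```

For this rollout the steps correspond to 1, 2, 3, and 4 yellow pods, which is why only a single yellow pod is running while the analysis step executes.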
- If the analysis (to be precise, the AnalysisRun created from the AnalysisTemplate) is successful, the Rollout knows that it can promote the new version; otherwise, it safely rolls back to the previous revision.
kubectl get AnalysisTemplate
Let’s confirm whether it has created an AnalysisRun:
kubectl get AnalysisRun
- This AnalysisRun will eventually create a Kubernetes Job, and based on the Job’s exit status, the Rollout will either roll forward or roll back.
kubectl get job -o wide
To show how the Rollout is promoted when the analysis is successful, we are just running a busybox container that will eventually exit with status code 0. This makes the Rollout believe that the analysis has completed successfully and that it is good to proceed.
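The link between a container's exit status and the analysis verdict can be tried locally, without a cluster. The sketch below is only a stand-in: it assumes the busybox image starts a shell by default, which exits cleanly when it has no input, and it runs test with no arguments, which is what the commented-out args: [test] line in analysis.yaml would run instead.

```shell
# Local stand-in for the Job's default container command (assumption:
# the busybox image starts `sh`, which exits 0 when it has nothing to read).
sh </dev/null
default_status=$?

# `test` with no expression evaluates to false, so it exits non-zero.
failing_status=0
test || failing_status=$?

echo "default command exit status: $default_status"
echo "test with no arguments exit status: $failing_status"
```

A zero status is what lets the Job, and therefore the AnalysisRun, count as successful; a non-zero status makes the Job fail.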
- Execute the command below and you will be able to see the analysis being executed as part of the rollout:
kubectl argo rollouts get rollout rollouts-demo
On the Argo Rollouts console, you will also be able to see a new revision of the app running with the changed image version, as below.
If you visit the app URL on app-port-80, you will still see mostly the blue version and only a small number of yellow initially, because the Rollout is waiting for the result of the analysis Job to decide whether to promote and proceed with the rest of the canary deployment steps or to roll back.
- Once the Analysis is successful, you would see more of the yellow version app running as below
- Now let’s delete this setup before we test how Argo Rollouts behaves when the analysis fails:
kubectl delete -f argo-rollouts-example/canary-deployment-withanalysis-example/
When Analysis is unsuccessful
To verify how Argo Rollouts automatically rolls back the new revision when the analysis is not successful, this time we will make the analysis fail on purpose. To do that, let’s open the cloned repo in the online editor. Click OPEN IDE, which will open a VS Code-like editor in another tab.
- Open analysis.yaml in the editor, uncomment line number 17, and save it.
This change makes the analysis fail and shows how the Rollout rolls back to the old revision on its own. Repeat the steps that we performed earlier, this time to watch the analysis fail.
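The earlier steps can be collected into one small helper for the second run. This is a sketch only: the function name rerun_canary_with_analysis is hypothetical, and it simply replays the commands used earlier in this post, with --watch added to follow the rollout live.

```shell
# Hypothetical helper that replays the earlier steps after editing analysis.yaml.
rerun_canary_with_analysis() {
  # Recreate the AnalysisTemplate, Rollout, Service, and Ingress.
  kubectl apply -f argo-rollouts-example/canary-deployment-withanalysis-example/
  # Trigger a new revision so that the canary steps and the analysis run.
  kubectl argo rollouts set image rollouts-demo rollouts-demo=argoproj/rollouts-demo:yellow
  # Follow the rollout live; --watch keeps refreshing until interrupted.
  kubectl argo rollouts get rollout rollouts-demo --watch
}
```

While the watch is running, kubectl get AnalysisRun and kubectl get job -o wide in a second terminal should show the analysis and its Job failing, after which the rollout should return to the blue revision.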
Conclusion
In this blog, we experienced how we can use the analysis feature provided by Argo Rollouts to achieve an automated canary deployment style of progressive delivery. Achieving canary deployment in this way with Argo Rollouts is simple and, importantly, provides much better automated control over rolling out a new version of your application than the default rolling update strategy of Kubernetes.
What Next?
We have now developed a better understanding of progressive delivery and created a canary deployment with analysis. Next, we will dive deeper in the last part of this series: canary deployment with traffic management using Argo Rollouts. Stay tuned for this post.
You can find all the parts of this Argo Rollouts Series below:
- Part 1: Progressive Delivery with Argo Rollouts: Blue Green Deployment
- Part 2: Progressive Delivery with Argo Rollouts: Canary Deployment
- Part 3: Progressive Delivery with Argo Rollouts: Canary Deployment with Analysis