Applications running inside a Kubernetes cluster need to be scaled according to the load they encounter. Scaling is essential for maintaining application performance. Kubernetes provides the Horizontal Pod Autoscaler (HPA) to scale applications with the help of resource metrics (CPU and memory utilization).
But scaling on resource metrics alone is not always enough; custom and external metrics are also required when the application is complex and works with other components. This is where Kubernetes Event-Driven Autoscaling (KEDA) comes in. More about HPA and KEDA can be found in our dedicated hands-on lab.
In this hands-on lab, we take one more step in learning how to scale applications in a Kubernetes cluster by exposing application metrics to Prometheus and scaling with the help of KEDA.
Lab Setup
You can start the lab setup by clicking the Lab Setup button on the right side of the screen. Please note that app-specific URLs are exposed specifically for this hands-on lab.
The lab comes pre-configured with all the necessary tools: a base OS (Ubuntu) and developer tools such as Git, Vim, and wget.
About KEDA And Prometheus
In this section, let’s quickly get acquainted with KEDA and Prometheus.
KEDA, the Kubernetes Event-Driven Autoscaler, can be installed in a Kubernetes cluster alongside HPA to scale applications based on the events it receives. It extends the functionality of HPA.
KEDA comes with 30+ built-in scalers (event sources), such as Prometheus and Redis, through which applications can be scaled easily. KEDA also helps optimize cost: it can scale resources from 0 to 1 and from 1 to 0, while scaling from 1 to n and back is handled by HPA.
Prometheus is open-source software for metrics-based monitoring and alerting, maintained under the Cloud Native Computing Foundation. It scrapes metrics from applications and stores them in a time-series database. It offers a query language, PromQL, which helps in querying the database and analyzing the performance of applications.
As the diagram above shows, the metrics adapter in KEDA fetches application metrics from the Prometheus scaler, and based on the configuration in the Prometheus ScaledObject, KEDA and HPA then scale the application accordingly.
Lab With KEDA and Prometheus
As we trigger the lab through the LAB SETUP button, a terminal and an IDE come up with a Kubernetes cluster already running. This can be verified by running the kubectl get nodes command.
KEDA Installation
There are many ways to deploy KEDA in a Kubernetes cluster; we will install it using Helm.
- First, install Helm:
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
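Once the script finishes, you can confirm that the Helm client is available:
helm version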
- Add KEDA's Helm repo and install KEDA inside the keda namespace.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda
Prometheus Installation
- Install Prometheus in the Kubernetes cluster through Helm by adding its repo.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/prometheus
- Verify the KEDA and Prometheus installations by checking their pod status:
kubectl get pods -n keda
kubectl get pods
- To access the Prometheus server through one of the app URLs under the Lab-URLs section, expose prometheus-server as a NodePort service.
kubectl get svc
kubectl edit svc prometheus-server
In the editor, change type: ClusterIP to type: NodePort and also add nodePort: 30000 under the ports attribute, as sketched below.
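For reference, the relevant part of the Service spec should look roughly like this after the edit (the port and targetPort values depend on the chart version, so treat them as assumptions):

spec:
  type: NodePort        # changed from ClusterIP
  ports:
    - port: 80
      targetPort: 9090
      nodePort: 30000   # added so the UI is reachable on a fixed port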
Check the ports of prometheus-server by executing the kubectl get svc command. Then access the Prometheus server through the prometheus-ui app URL under the Lab-URLs section.
Setting Up Application
- Deploy the frontend of the application by creating a deployment and exposing it through a service (a rough sketch of the manifest follows the command).
kubectl apply -f frontend.yaml
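The manifest is provided in the lab; as a rough sketch, frontend.yaml contains a Deployment named rsvp (the name the ScaledObject targets later) and a Service exposing it, along these lines. The image and ports here are illustrative assumptions and may differ from the lab's file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rsvp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rsvp
  template:
    metadata:
      labels:
        app: rsvp
    spec:
      containers:
        - name: rsvp-app
          image: teamcloudyuga/rsvpapp   # illustrative image
          ports:
            - containerPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: rsvp
spec:
  selector:
    app: rsvp
  ports:
    - port: 80
      targetPort: 5000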
- Create the backend of the app by creating its deployment and exposing it through a service.
kubectl apply -f backend.yaml
kubectl get pods,svc
- To access the application through the browser, deploy the Ingress for it (a sketch of the manifest follows the commands).
kubectl apply -f ingress.yaml
kubectl get ingress
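As with the frontend, ingress.yaml is provided in the lab; a minimal sketch that routes all traffic to the rsvp service could look like this (the object name and path are assumptions):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rsvp-ingress
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: rsvp
                port:
                  number: 80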
Now, access the app through the app-port-80 URL under the Lab-URLs section, and you will get an rsvp app like the one shown in the image below.

- To check the pod metrics, configure the metrics server. KEDA already has its own metrics adapter, but to see CPU utilization as an end user (for example, with kubectl top), the metrics server needs to be installed.
kubectl apply -f components.yaml
kubectl get pods -n kube-system
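If components.yaml is not already present in the lab environment, it can be fetched from the upstream metrics-server releases (assuming the lab machine has internet access):

wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml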
- Check the resource utilization of the pods by running the following command:
kubectl top pods
- Now, to increase the load on the app, install Locust through pip; Flask, a prerequisite for Locust, is installed along with it.
pip install locust
- Create a locustfile for load testing (a minimal sketch of locust_file.py is shown after this step) and start Locust:
locust -f locust_file.py --host <APP_URL> --users 500 --spawn-rate 20 --web-port=8089
Here, replace <APP_URL> with the rsvp app URL. Then access the Locust UI from the app-port-8089 URL under the Lab-URLs section, and you will see a Locust UI as shown in the image below.
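The lab provides locust_file.py; a minimal version that simply keeps requesting the app's home page to generate load could look like this (the endpoint and wait times are assumptions):

from locust import HttpUser, task, between

class RsvpUser(HttpUser):
    # wait 1-3 seconds between requests to mimic real users
    wait_time = between(1, 3)

    @task
    def load_home_page(self):
        # repeatedly hit the home page to drive up CPU usage
        self.client.get("/")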
Click the Start swarming button in the Locust UI to start the load on the rsvp app, and you will see an output like the one below.

- To enable scaling of pods with KEDA and Prometheus, create a Prometheus ScaledObject for the rsvp deployment (the one created from frontend.yaml). A sketch of the object follows.
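The lab supplies prometheus-scaler.yaml; based on the fields described below, it looks roughly like this (the object name and threshold value are assumptions; the lab file may differ):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
spec:
  scaleTargetRef:
    name: rsvp                      # the deployment to scale
  triggers:
    - type: prometheus
      metadata:
        serverAddress: <PROMETHEUS-UI URL>
        metricName: container_cpu_usage_seconds_total
        threshold: "1"              # assumed value
        query: sum(container_cpu_usage_seconds_total{pod=~"rsvp-.*"})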
Here, inside scaleTargetRef, the name of the resource to scale is specified (by default, its kind is Deployment), along with the type of trigger, in this case prometheus. The prometheus trigger has certain details such as:
- serverAddress - It holds the URL of the Prometheus server. Replace <PROMETHEUS-UI URL> with the prometheus-ui URL that you will find under the Lab-URLs section.
- metricName: It contains the name of the metric used for scaling. Here we are using container_cpu_usage_seconds_total to find the CPU usage of the rsvp deployment.
- threshold: It holds the CPU-usage value at which scaling is triggered; as soon as the metric reaches this threshold, the deployment is scaled up.
- query: It has the PromQL query that fetches the metric from Prometheus and drives the scaling. Currently, it is sum(container_cpu_usage_seconds_total{pod=~"rsvp-.*"}), which returns the sum of the CPU usage of all pods of the rsvp deployment.
kubectl apply -f prometheus-scaler.yaml
kubectl get scaledobjects
This ScaledObject will also create an HPA; check it with the following command:
kubectl get hpa
- As the load on the application increases, KEDA starts working and scales up the pods together with HPA.
kubectl top pods
kubectl get hpa
kubectl get pods
Conclusion
In this hands-on lab, we saw how to scale an application with KEDA and Prometheus by exposing its metrics to Prometheus and driving scaling from a PromQL query.