Autoscaling is one of the key features of Kubernetes, and the Horizontal Pod Autoscaler (HPA) is the usual choice when pods need to be scaled on the basis of CPU and memory consumption. You can find out more about autoscaling and HPA in our Autoscaling in Kubernetes hands-on lab.
HPA is a good option for scaling applications on CPU and memory metrics, but in some cases this is not enough, especially when the application is integrated with several other complex components.
In this hands-on lab, we will learn about the limitations of HPA and how KEDA helps address them.
Lab Setup
You can start the lab setup by clicking on the Lab Setup button on the right side of the screen. Please note that app-specific URLs are exposed for use in this hands-on lab.
The lab comes set up with the necessary tools: a base OS (Ubuntu) and developer tools such as Git, Vim, wget, and others.
Limitations of Horizontal Pod Autoscaler (HPA)
The limitations of HPA are:
- No built-in external metric support: Distributed and complex applications often depend on other components (event sources) such as Prometheus, Apache Kafka, cloud provider services, and others. You would want to scale applications on the metrics these sources expose, not only on CPU and memory utilization, but HPA on its own cannot read them without an additional metrics adapter.
- Scaling down to zero is not possible: By default, HPA cannot scale pod replicas to zero when the load on the application is zero. It scales pods between 1 and n replicas, so it can neither scale down to zero nor scale up from zero to one.
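To make the second limitation concrete, here is a minimal sketch of a plain HPA manifest, written to a file from the shell. The file name hpa-sketch.yaml, the object name rsvp-hpa, the 50% target, and the replica bounds are assumptions for illustration; the rsvp deployment name is borrowed from later in this lab. On a cluster with default settings, minReplicas cannot go below 1.

```shell
# Write an illustrative HPA manifest; nothing is applied to the cluster here.
cat > hpa-sketch.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: rsvp-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rsvp            # deployment used later in this lab
  minReplicas: 1          # a plain HPA cannot idle at 0 by default
  maxReplicas: 5          # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # assumed target, percent of CPU requests
EOF
```

This file is for comparison only. Applying it alongside a KEDA ScaledObject that targets the same deployment would leave the deployment with two competing autoscalers.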
To solve these problems, Kubernetes Event-Driven Autoscaler (KEDA) was introduced.
About Kubernetes Event-Driven Autoscaler (KEDA)
KEDA, the Kubernetes Event-Driven Autoscaler, is a lightweight component that can be added to your Kubernetes cluster to scale applications based on the number of events they need to process. It makes autoscaling simple and reduces cost by allowing resources to scale to zero.
KEDA works alongside the Horizontal Pod Autoscaler (HPA) in a Kubernetes cluster and extends its functionality. KEDA provides 30+ built-in event-driven scalers that connect to event sources and drive the scaling of applications. It also allows you to write your own custom scalers.
Because KEDA can scale resources from 0 to 1 and from 1 to 0, it helps optimize cost. Scaling from 1 to n and back is handled by the HPA.
KEDA’s Architecture
KEDA's architecture is simple to understand. It has three main components: the operator, the metrics adapter, and the scalers.
- Operator (Agent): When KEDA is installed, an operator, keda-operator, is created. It activates and deactivates deployments to scale them to and from zero on events, and it also creates HPA objects in the cluster.
- Metrics (Metrics Adapter): This presents event metrics data to the HPA for scaling.
- Scalers: These connect to an external event component such as Prometheus and fetch its metrics, which then drive the scaling of resources.

Custom Resources
When KEDA is installed, it creates four custom resources that map event sources, together with their authentication, to the workload resources and jobs that need to be scaled.
- ScaledObjects: These map an event source such as Prometheus to a workload resource that needs to be scaled, for example a Kubernetes Deployment.
- ScaledJobs: These map an event source to Kubernetes Jobs for scaling.
- TriggerAuthentication or ClusterTriggerAuthentication: These hold the authentication configuration for an event source and are referenced from the triggers of a ScaledObject/ScaledJob.
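As an illustration of the authentication resource, the following sketch writes a TriggerAuthentication manifest that reads one value from a Kubernetes Secret. The file name, the object name example-trigger-auth, the Secret name example-secret, and the parameter and key names are all hypothetical; the parameter names a scaler actually expects depend on the scaler in use.

```shell
# Write an illustrative TriggerAuthentication manifest; nothing is applied here.
cat > trigger-auth-sketch.yaml <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: example-trigger-auth   # hypothetical name
spec:
  secretTargetRef:
    - parameter: password      # scaler parameter to fill (assumed)
      name: example-secret     # hypothetical Secret in the same namespace
      key: password            # key inside that Secret (assumed)
EOF
```

A trigger in a ScaledObject or ScaledJob would then point at this object through its authenticationRef name field. The CPU trigger used later in this lab needs no authentication, so this resource is not required for the steps that follow.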
Lab With KEDA
Once the lab is started through the LAB SETUP button, a terminal and an IDE open with a Kubernetes cluster already running. You can verify this by running the kubectl get nodes command.
kubectl get nodes
KEDA Installation
There are many ways to deploy KEDA in a Kubernetes cluster, and we will install it using Helm.
- First, install Helm.
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
- Add KEDA’s Helm repo and install KEDA inside the keda namespace.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
kubectl get pods -n keda
Setting Up Application
- Deploy the frontend of the application by creating a deployment and exposing it through a service.
kubectl apply -f frontend.yaml
- Create the backend of the app by creating its deployment and exposing it through a service.
kubectl apply -f backend.yaml
kubectl get pods,svc
- To access the application through a browser, deploy the ingress for it.
kubectl apply -f ingress.yaml
kubectl get ingress
Now, access the app through the app-port-80 URL under the LAB-URL section. You will see an rsvp app like the one shown in the image below.

- To check the pod metrics, configure the metrics server. KEDA already has a metrics adapter, but to see CPU utilization as an end user, the metrics server needs to be installed.
kubectl apply -f components.yaml
kubectl get pods -n kube-system
- Check the resource utilization of the pods by running the following command.
kubectl top pods
- Now, to increase the load on the app, install Locust through pip. Flask is installed first as a prerequisite for Locust.
apt update && apt install python3-pip -y
pip install flask
pip install locust
- Create a locustfile (locust_file.py) for load testing and start Locust with it.
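The contents of the lab's locustfile are not shown in this text. As a stand-in, the sketch below writes a minimal one that repeatedly requests the app's main page. The file name locust_file_sketch.py, the class name RsvpUser, the "/" path, and the 1 to 2 second wait are assumptions, not the lab's original file.

```shell
# Write a minimal locustfile; the target host is supplied later via --host.
cat > locust_file_sketch.py <<'EOF'
from locust import HttpUser, task, between


class RsvpUser(HttpUser):
    """Simulated visitor; the target host comes from the --host flag."""

    # Each simulated user pauses 1 to 2 seconds between requests (assumed pacing).
    wait_time = between(1, 2)

    @task
    def load_home_page(self):
        # Assumes the rsvp frontend serves its main page at "/".
        self.client.get("/")
EOF
```

To use the sketch, either pass locust_file_sketch.py to the -f flag or copy it to locust_file.py if the lab does not already provide that file.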
locust -f locust_file.py --host <APP_URL> --users 100 --spawn-rate 20 --web-port=8089
Here, replace <APP_URL> with the rsvp app URL, then open the Locust UI from the app-port-8089 URL under the LAB-URL section. You will see the Locust UI as shown in the image below.
Click the Start swarming button in the Locust UI to put load on the rsvp app. You will see output like the one below.

- To enable scaling of pods with KEDA, create a CPU ScaledObject for the rsvp deployment (frontend.yaml).
Here, inside scaleTargetRef (line 11 of the manifest), the name of the resource to scale is given; its kind defaults to Deployment. The trigger type, CPU in this case, tells KEDA which metric to scale the application on.
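The lab's ScaledObject manifest is not reproduced in this text. As a stand-in, the sketch below writes a minimal CPU ScaledObject for the rsvp deployment. The file name scaler-sketch.yaml, the object name rsvp-cpu-scaler, the replica bounds, and the 50% target are assumptions. It also assumes a recent KEDA release where the trigger carries a metricType field (older releases put the type under metadata instead), and that the rsvp containers declare CPU requests, which the CPU trigger relies on.

```shell
# Write an illustrative CPU ScaledObject; nothing is applied to the cluster here.
cat > scaler-sketch.yaml <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rsvp-cpu-scaler      # hypothetical name
spec:
  scaleTargetRef:
    name: rsvp               # kind defaults to Deployment
  minReplicaCount: 1         # assumed; a CPU-only trigger is kept at 1 or more here
  maxReplicaCount: 5         # assumed upper bound
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "50"          # assumed target, percent of CPU requests
EOF
```

The sketch is written to its own file so that it does not overwrite the scaler.yaml used in the next command. Apply only one ScaledObject per deployment.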
kubectl apply -f scaler.yaml
kubectl get scaledobjects
This ScaledObject also creates an HPA. Check it with the following command.
kubectl get hpa
- As soon as the load on the application starts to increase, KEDA starts working and scales up the pods through the HPA.
kubectl top pods
kubectl get hpa
kubectl get pods
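To interpret the replica counts reported by kubectl get hpa, it helps to know the rule the HPA follows: the desired replica count is the current count multiplied by the ratio of the current metric value to the target, rounded up. The sketch below reproduces that arithmetic in the shell. The function name desired_replicas and the sample numbers are illustrative, and the sketch leaves out the HPA's tolerance band and the minimum and maximum replica bounds.

```shell
# desired = ceil(current_replicas * current_value / target_value)
# Arguments: current replicas, current average CPU utilization (%), target utilization (%).
desired_replicas() {
  # Integer ceiling division: add (target - 1) before dividing.
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# Example: 2 pods averaging 90% CPU against a 50% target.
desired_replicas 2 90 50   # prints 4
```

With a 50% target, two pods running at 90% average utilization lead to four pods; once the load drops, the same rule yields a smaller count and the HPA scales back down.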
What Next?
We have seen how to scale the application on the basis of CPU metrics with KEDA. In the next hands-on lab, we will scale the same application with Prometheus and KEDA.
Conclusion
In this hands-on lab, we saw the limitations of HPA, how KEDA addresses them, and how to set up KEDA in a Kubernetes cluster.