AutoScaling in Kubernetes with Horizontal Pod Autoscaler using custom Metrics from Prometheus

Dhruvin Soni
Published in Towards Dev
11 min read · Apr 23, 2024

What is HPA in K8s?

Horizontal Pod Autoscaler (HPA), in Kubernetes, is a system that automatically
adjusts the number of pods in a deployment or replica set based on observed CPU
utilization or other select metrics provided by the user. The main goal of HPA is to
scale out (add more pods) when the load increases and scale in (remove pods) when
the load decreases, ensuring that applications maintain performance without
wasting resources.
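The scaling rule described above can be sketched in a few lines of shell. The formula is the one documented for the Kubernetes HPA controller, desired = ceil(currentReplicas × currentMetric / targetMetric). The helper name `desired_replicas` and the integer-only arithmetic are assumptions made for illustration. The real controller also applies a tolerance band and min/max bounds, which this sketch ignores.

```shell
# Sketch of the HPA scaling rule: ceil(currentReplicas * currentMetric / targetMetric).
# desired_replicas is a hypothetical helper; it only handles positive integers.
desired_replicas() {
  current_replicas=$1
  current_metric=$2
  target_metric=$3
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (current_replicas * current_metric + target_metric - 1) / target_metric ))
}

# 2 pods averaging 150 against a target of 100: scale out to 3
desired_replicas 2 150 100
# 4 pods averaging 50 against a target of 100: scale in to 2
desired_replicas 4 50 100
```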

AWS metrics into Prometheus

Data in Prometheus can come from two sources:

1. Application-specific metrics

2. Data from AWS CloudWatch

In this article, we will deploy the YACE CloudWatch Exporter, which pulls data from
AWS CloudWatch and exposes it in Prometheus format. Once the data is in
Prometheus, we can use any available metric to scale our deployment.

Prerequisites: eksctl, helm, and awscli installed on the server.

Now, let’s start configuring the project.

Step 1: Create an AWS EKS cluster

Create an AWS EKS cluster using the below command.

eksctl create cluster --name <cluster-name> --region ap-south-1 --instance-type

Step 2: Turn on OpenID Connect (OIDC)

You will use IAM roles for service accounts to limit secret access to your pods.
By setting this up, the provider will retrieve the pod identity and exchange this
identity for an IAM role.

Run the following command to turn on OpenID Connect (OIDC).
eksctl utils associate-iam-oidc-provider --region=<region> --cluster=<cluster-n

Note: Make sure to update the region & cluster-name

Step 3: Set the variables

We will use the IAM Role for Service Account (IRSA) to allow the YACE
CloudWatch exporter to get data from AWS CloudWatch.

Set your AWS account ID to an environment variable with the following command.


account_id=$(aws sts get-caller-identity --query "Account" --output text)

Set your cluster’s OIDC identity provider to an environment variable with the
following command.

export AWS_REGION='ap-south-1' # Your EKS cluster region

oidc_provider=$(aws eks describe-cluster --name <cluster-name> --region $AWS_RE

Set variables for the namespace and name of the service account. We will deploy
YACE in the default namespace and configure it to create a service account
named yace-cw-exporter.

export namespace=default
export service_account=yace-cw-exporter
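Since the trust policy in the next step is rendered from these variables, it can help to confirm that none of them is empty before going further. The guard below is a minimal sketch; `require_vars` is a hypothetical helper name, not part of any AWS or Kubernetes tooling.

```shell
# Hypothetical guard: fail early if any variable needed for the trust policy is empty.
require_vars() {
  missing=0
  for var_name in "$@"; do
    # Indirect lookup of the variable named in $var_name (POSIX-compatible)
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
      echo "missing: $var_name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: stop before creating the role if something is unset
# require_vars account_id oidc_provider namespace service_account || exit 1
```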

Step 4: Create an IAM Role

Run the following command to create a trust policy file for the IAM role.
cat >trust-relationship.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::$account_id:oidc-provider/$oidc_provider"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "$oidc_provider:aud": "sts.amazonaws.com",
          "$oidc_provider:sub": "system:serviceaccount:$namespace:$service_account"
        }
      }
    }
  ]
}
EOF

Create the role using the following command.

aws iam create-role --role-name cloudwatch-exporter-role --assume-role-policy-d

Attach managed IAM policy to the role.

# Attach cloudwatch exporter read only policy

aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/CloudWatchReadO

# Attach Resource Groups and Tagging read-only policy

aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/ResourceGroupsa

Step 5: Install kube-prometheus-stack

We will use the kube-prometheus-stack Helm chart to install Prometheus.
It uses the Prometheus Operator to install Prometheus and ships CRDs that tell
Prometheus where to read data exported in Prometheus format.

Create monitoring namespace using the following command.

kubectl create ns monitoring

Add the prometheus-community repository.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

Get the default values file.

helm show values prometheus-community/kube-prometheus-stack > kube-prometheus-stack-values.yaml

Update the following settings in the kube-prometheus-stack-values.yaml file.

ruleSelectorNilUsesHelmValues: false
serviceMonitorSelectorNilUsesHelmValues: false
podMonitorSelectorNilUsesHelmValues: false
probeSelectorNilUsesHelmValues: false
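In the kube-prometheus-stack chart these four flags sit under `prometheus.prometheusSpec`. As an alternative to editing the full values dump, a small override file can carry just these settings. The sketch below assumes a hypothetical file name, `kube-prometheus-stack-overrides.yaml`.

```shell
# Write a minimal override file instead of editing the full default values dump.
# The file name is an assumption; the keys are nested under prometheus.prometheusSpec.
cat > kube-prometheus-stack-overrides.yaml <<'EOF'
prometheus:
  prometheusSpec:
    # Setting these to false lets Prometheus pick up rules and monitors
    # that were not created by this Helm release.
    ruleSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    probeSelectorNilUsesHelmValues: false
EOF
```

Such a file can be passed to helm install with an additional -f flag.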

Install Prometheus on Kubernetes

helm install -f kube-prometheus-stack-values.yaml prometheus prometheus-communi

Step 6: Create SQS Queue

We will scale our deployment based on the number of messages in the AWS SQS queue.

Run the following command to create the SQS queue.

aws sqs create-queue --queue-name <your_queue_name>

Send some sample messages to the SQS queue.

aws sqs send-message --queue-url <your_queue_url> --message-body "my test messa

aws sqs send-message --queue-url <your_queue_url> --message-body "my test messa
aws sqs send-message --queue-url <your_queue_url> --message-body "my test messa
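To confirm that the sample messages arrived, the queue depth can be read back with the AWS CLI. The sketch below wraps `aws sqs get-queue-attributes` in a hypothetical helper named `queue_depth`. The SQS attribute `ApproximateNumberOfMessages` is the queue-side counterpart of the CloudWatch metric `ApproximateNumberOfMessagesVisible` used later for scaling.

```shell
# Hypothetical helper: print the approximate number of visible messages in a queue.
# Uses the SQS attribute ApproximateNumberOfMessages via the AWS CLI.
queue_depth() {
  aws sqs get-queue-attributes \
    --queue-url "$1" \
    --attribute-names ApproximateNumberOfMessages \
    --query 'Attributes.ApproximateNumberOfMessages' \
    --output text
}

# Usage: queue_depth <your_queue_url>
```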

Step 7: Install KEDA

KEDA is a Kubernetes-based event-driven autoscaling framework that allows you
to scale workloads based on external events, such as messages from a message
queue or events from an event source.

KEDA provides a wide range of event sources and supports custom metrics,
making it a flexible option for autoscaling based on external metrics. KEDA can
work in conjunction with HPA, allowing you to use external metrics from event
sources to trigger scaling actions with HPA.

We will install KEDA using Helm.

Add kedacore helm repo using the below command.

helm repo add kedacore https://kedacore.github.io/charts

Update Helm repo

helm repo update

Install keda Helm chart

helm install keda kedacore/keda --namespace keda --create-namespace

Step 8: Install YACE CloudWatch Exporter

We will use YACE CloudWatch Exporter to get data from AWS CloudWatch and it
will export data into Prometheus format.

We will enable the ServiceMonitor custom resource so that Prometheus can read the
metrics exported by the application.

Add Helm Repo using the following command.

helm repo add nerdswords

Now we will get the default chart values and modify them to grant AWS CloudWatch
access using IRSA and to create the ServiceMonitor custom resource.

Get the default chart values using the following command.

helm show values nerdswords/yet-another-cloudwatch-exporter > yace-cw-exporter-values.yaml

Open yace-cw-exporter-values.yaml and make the following changes.

serviceAccount:
  # -- Specifies whether a service account should be created
  create: true
  # -- Labels to add to the service account
  labels: {}
  # -- Annotations to add to the service account
  annotations:
    # add annotation for IRSA
    eks.amazonaws.com/role-arn: <ARN of IAM role>
  # -- The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name: "yace-cw-exporter"

serviceMonitor:
  # When set true then use a ServiceMonitor to configure scraping
  enabled: true

config: |-
  apiVersion: v1alpha1
  sts-region: ap-south-1
  discovery:
    exportedTagsOnMetrics:
      AWS/SQS:
        - Name
    jobs:
      - type: AWS/SQS
        regions:
          - ap-south-1
        period: 60
        length: 60
        metrics:
          - name: NumberOfMessagesSent
            statistics: [Sum,Average]
          - name: NumberOfMessagesReceived
            statistics: [Sum,Average]
          - name: NumberOfMessagesDeleted
            statistics: [Sum,Average]
          - name: ApproximateAgeOfOldestMessage
            statistics: [Sum,Average]
          - name: NumberOfEmptyReceives
            statistics: [Sum,Average]
          - name: SentMessageSize
            statistics: [Sum,Average]
          - name: ApproximateNumberOfMessagesNotVisible
            statistics: [Sum,Average]
          - name: ApproximateNumberOfMessagesDelayed
            statistics: [Sum,Average]
          - name: ApproximateNumberOfMessagesVisible
            statistics: [Sum,Average]

Note: Make sure to replace the region

The above config is very important. The YACE CloudWatch Exporter will get the
metrics of the service defined in the config. So, you have to add all the required
AWS services and their metrics in the config.

All the configuration options and example configs are available in the YACE
project documentation.
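The series names that show up in Prometheus follow from the CloudWatch names in the config. For example, this article later scales on `aws_sqs_approximate_number_of_messages_visible_average`, which corresponds to `ApproximateNumberOfMessagesVisible` with the `Average` statistic. The helper below is only a rough approximation of that naming pattern, not YACE's actual implementation, and `yace_metric_name` is a hypothetical name.

```shell
# Approximate how a CloudWatch namespace, metric and statistic map to a Prometheus series name.
# This mirrors the pattern seen in this article; it is not YACE's real code.
yace_metric_name() {
  printf '%s_%s_%s\n' "$1" "$2" "$3" |
    sed -E 's#/#_#g; s/([a-z0-9])([A-Z])/\1_\2/g' |
    tr '[:upper:]' '[:lower:]'
}

yace_metric_name AWS/SQS ApproximateNumberOfMessagesVisible Average
```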

Deploy the YACE CloudWatch Exporter Helm Chart using the following command.

helm install -f yace-cw-exporter-values.yaml yace-cw-exporter nerdswords/yet-an

Verify YACE CW Exporter can get metrics.

kubectl port-forward <your_yace_pod_name> 5000:5000

# Open http://localhost:5000 in your browser and click on Metrics

Verify metrics are available in Prometheus

kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090 -n monitoring

# Open http://localhost:9090 in your browser and you should see the Prometheus UI
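The same check can be done from the terminal through the Prometheus HTTP API instead of the UI. The sketch below assumes the port-forward above is still running on localhost:9090; `prom_query` is a hypothetical helper name.

```shell
# Hypothetical helper: run an instant query against the Prometheus HTTP API.
# Assumes Prometheus is reachable on localhost:9090 through the port-forward.
prom_query() {
  curl -s --get 'http://localhost:9090/api/v1/query' \
    --data-urlencode "query=$1"
}

# Usage: prom_query 'aws_sqs_approximate_number_of_messages_visible_average'
```

A response whose "result" list is non-empty suggests the exporter's metrics are being scraped.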

Step 9: Create the application helm chart

Now that our AWS metrics are available in Prometheus, we can use KEDA to scale our
deployment based on ApproximateNumberOfMessagesVisible (the number of
messages waiting to be processed).

Create a folder for storing Helm-related files.

mkdir portfolio-app
cd portfolio-app
mkdir helmfile
cd helmfile

Create Chart.yaml file and add the following content

apiVersion: v2
name: portfolio-app
description: A Helm chart for portfolio-app
type: application
version: 1.0.0
appVersion: "1.0.0"

Create values.yaml file and add the below code to it.

# Default values for keda-sample.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

replicaCount: 1

image:
  repository: nginx
  pullPolicy: Always
  # Overrides the image tag whose default is the chart appVersion.
  tag: "latest"

service:
  type: ClusterIP
  port: 80

resources:
  limits:
    memory: 256Mi
  requests:
    cpu: 100m
    memory: 128Mi

keda:
  enabled: true
  pollingInterval: 30 # This is the interval to check each trigger on. By defau
  cooldownPeriod: 60 # The period to wait after the last trigger reported activ
  minReplicaCount: 1 # Minimum number of replicas KEDA will scale the resource
  maxReplicaCount: 6 # This setting is passed to the HPA definition that KEDA w
  fallback:
    failureThreshold: 3
    replicas: 4
  triggers:
    prometheus:
      serverAddress: "http://prometheus-kube-prometheus-prometheus.monitoring.s
      metricName: aws_sqs_approximate_number_of_messages_visible_average # This
      query: aws_sqs_approximate_number_of_messages_visible_average{dimension_Q
      threshold: "10.00"

In this values file we have added a few variables that we will use when creating
the ScaledObject custom resource provided by KEDA, which scales our deployment
when the threshold is crossed.

Create templates folder to store Kubernetes resource related files

mkdir templates
cd templates

Create file _helpers.tpl

{{/*
Expand the name of the chart.
*/}}
{{- define "portfolio-app.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "portfolio-app.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "portfolio-app.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "portfolio-app.labels" -}}
helm.sh/chart: {{ include "portfolio-app.chart" . }}
{{ include "portfolio-app.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "portfolio-app.selectorLabels" -}}
app.kubernetes.io/name: {{ include "portfolio-app.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "portfolio-app.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "portfolio-app.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}

Create file deployment.yaml and add the below code to it.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "portfolio-app.fullname" . }}
  labels:
    {{- include "portfolio-app.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "portfolio-app.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "portfolio-app.labels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.service.port }}
              protocol: TCP
          resources:
            {{- toYaml .Values.resources | nindent 12 }}

Create file service.yaml and add the below code to it.

apiVersion: v1
kind: Service
metadata:
  name: {{ include "portfolio-app.fullname" . }}
  labels:
    {{- include "portfolio-app.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "portfolio-app.selectorLabels" . | nindent 4 }}

Now the most important file, scaledobject.yaml:

{{- if .Values.keda.enabled -}}
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "portfolio-app.fullname" . }}
  labels:
    {{- include "portfolio-app.labels" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "portfolio-app.fullname" . }}
  pollingInterval: {{ .Values.keda.pollingInterval }}
  cooldownPeriod: {{ .Values.keda.cooldownPeriod }}
  minReplicaCount: {{ .Values.keda.minReplicaCount }}
  maxReplicaCount: {{ .Values.keda.maxReplicaCount }}
  {{- with .Values.keda.fallback }}
  fallback:
    {{- toYaml . | nindent 4 }}
  {{- end }}
  triggers:
    - type: prometheus
      metadata:
        serverAddress: {{ .Values.keda.triggers.prometheus.serverAddress }}
        metricName: {{ .Values.keda.triggers.prometheus.metricName }}
        query: {{ tpl .Values.keda.triggers.prometheus.query . }}
        threshold: {{ .Values.keda.triggers.prometheus.threshold | quote }}
{{- end -}}

spec.scaleTargetRef.name: Name of the deployment to scale

spec.triggers.metadata.serverAddress: Address of the Prometheus server

spec.triggers.metadata.query: Prometheus query (must return a vector or scalar with a single element)

See the KEDA documentation for more configuration options of ScaledObject and
of the Prometheus trigger.

Generate local helm template.

helm template <path_to_Chart.yaml>

Install keda-sample helm chart

helm install keda-sample <path_to_Chart.yaml>

Verify the resources are created.

# Get pod
kubectl get po

# Get service
kubectl get svc

# Get scaledobject
kubectl get scaledobject

# Get HPA
kubectl get hpa

Notice that KEDA internally creates a native Kubernetes HPA resource, which is
why an HPA appears in the output of the last command.
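By default, KEDA names the HPA it manages after the ScaledObject, using the prefix `keda-hpa-`. The small helper below spells that convention out so the HPA can be inspected directly; `keda_hpa_name` is a hypothetical name and the sketch assumes the default naming has not been overridden in the ScaledObject.

```shell
# KEDA's default HPA name is "keda-hpa-" followed by the ScaledObject name.
# keda_hpa_name is a hypothetical helper for illustration.
keda_hpa_name() {
  echo "keda-hpa-$1"
}

# Usage: kubectl describe hpa "$(keda_hpa_name <scaledobject-name>)"
```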

Verify that scaledobject can connect to the Prometheus server.

kubectl describe scaledobject portfolio-app

# If you see following events it means that scaledobject is able to get data fr
Message: ScaledObject is defined correctly and is ready for scaling
Reason: ScaledObjectReady
Status: True
Type: Ready
Message: Scaling is performed because triggers are active
Reason: ScalerActive
Status: True
Type: Active
Message: No fallbacks are active on this scaled object
Reason: NoFallbackFound
Status: False
Type: Fallback
Status: Unknown
Type: Paused

Note: Any changes related to HPA configuration should be done in “ScaledObject”


Step 10: Test the ScaledObject

Now that everything is ready it is time to test the scaledObject.

So, we will send some more sample messages in our SQS queue so that they
cross the threshold of “10.00” (in our example) and we should see a new pod.
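To get a feel for how many pods to expect, the arithmetic can be sketched as follows. The Prometheus trigger uses an average-value target by default, so the replica count is roughly ceil(metric / threshold), kept within minReplicaCount and maxReplicaCount. `estimate_replicas` is a hypothetical helper that uses integer arithmetic only. The figures of 18 and 95 visible messages are illustrative assumptions.

```shell
# Estimate replicas for an average-value target: ceil(metric / threshold), clamped to [min, max].
# estimate_replicas is a hypothetical helper using integer arithmetic only.
estimate_replicas() {
  metric=$1
  threshold=$2
  min=$3
  max=$4
  want=$(( (metric + threshold - 1) / threshold ))
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}

# 18 visible messages, threshold 10, between 1 and 6 replicas: prints 2
estimate_replicas 18 10 1 6
# 95 visible messages would need 10 pods, capped at maxReplicaCount: prints 6
estimate_replicas 95 10 1 6
```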

Send some test messages into the queue

for i in $(seq 1 15); do aws sqs send-message --queue-url <queue-url> --message

Once the data is available in Prometheus (the trigger source for KEDA), the HPA
kicks in and scales out the deployment.

You can describe the events to verify the scale-out.

You can also verify that new pods are created.

Let’s delete a few messages from the SQS queue.


(MSG=$(aws sqs receive-message --queue-url $QUEUE_URL) --max-number-of-messages

[ ! -z "$MSG" ] && echo "$MSG" | jq -r '.Messages[] | .ReceiptHandle' | \
(xargs -I {} aws sqs delete-message --queue-url $QUEUE_URL --receipt-handle {
echo "$MSG" | jq -r '.Messages[] | .Body')

Once the data is available in Prometheus (the trigger source for KEDA), you will
see that the HPA does not immediately scale in the deployment.

This is because the default value of stabilizationWindowSeconds is 300 seconds.

You can modify this setting in the ScaledObject configuration.
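In a ScaledObject, the HPA behavior is set under `spec.advanced.horizontalPodAutoscalerConfig.behavior`. The sketch below writes a fragment that shortens the scale-in window. The file name and the 60-second value are illustrative assumptions, and the fragment would need to be merged into the spec of scaledobject.yaml.

```shell
# Write a ScaledObject spec fragment that shortens the scale-in stabilization window.
# The file name and the 60-second value are illustrative assumptions.
cat > scaledobject-advanced-fragment.yaml <<'EOF'
spec:
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 60
EOF
```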

In this article, I walked you through how to scale your Kubernetes Deployment
based on an external metric from Prometheus using KEDA. You can modify the
configs as per your requirements and experiment with them. You can also use any
other service's metrics to scale your deployment.
