Unlock the benefits of Horizontal Autoscaling on Azure Kubernetes Service

Are you a developer or coder looking for ways to maximize efficiency when using Azure Kubernetes Service (AKS) for large-scale applications? If so, horizontal autoscaling is what you need! Horizontal autoscaling not only lets you maintain your existing architecture but also enables your system to continually adjust the number of pods in response to changing workloads. In this blog post, we’ll explore how horizontal autoscaling works and how it can help make AKS even more efficient. Ready? Let’s dive in!

What is Azure Kubernetes Service (AKS)?

Azure Kubernetes Service (AKS) is a managed container service from Microsoft Azure. It allows users to quickly build, deploy and manage containerized applications in the cloud. AKS removes the complexity of managing a Kubernetes cluster by taking care of critical tasks such as provisioning, scaling, and upgrading clusters. It also offers an easy-to-use user interface for creating and managing clusters. AKS is ideal for organizations looking to quickly launch applications in a cloud environment with minimal effort and cost.

Horizontal autoscaling on AKS

Horizontal autoscaling on Azure Kubernetes Service (AKS) lets you scale out your workloads to meet application demand in a cost-efficient way. You set upper and lower limits on the number of pod replicas, and you configure the rules that trigger scaling events, such as a target CPU utilization or a specific metric threshold. By automatically adding or removing pods based on need, horizontal autoscaling helps keep your applications available and performant even during peak load. Note that the Horizontal Pod Autoscaler changes the number of pods, not the number of nodes; node counts are handled separately by the AKS cluster autoscaler.

Benefits of horizontal autoscaling

Horizontal autoscaling in Azure Kubernetes Service (AKS) offers a number of benefits, including:

  • Scalability: AKS can automatically increase or decrease the number of pods in a deployment based on workload demands, so that capacity follows the load.
  • Improved performance: Horizontal autoscaling ensures that the cluster can handle sudden spikes in traffic and reduce latency for users.
  • Cost optimization: By scaling pods up or down as needed, you avoid reserving capacity for idle replicas, which helps optimize costs.
  • Improved availability: If a pod fails, Kubernetes replaces it, and the autoscaler keeps at least the configured minimum number of replicas running, improving the overall reliability and availability of your application.
  • Automation: The entire process of scaling is automated, reducing the need for manual intervention and minimizing the risk of errors.

How does the Horizontal Pod Autoscaler work?

The Horizontal Pod Autoscaler works as follows:

  • The Horizontal Pod Autoscaler (HPA) automatically adjusts the number of pods in a deployment, replication controller, replica set, or stateful set depending on the CPU utilization of those pods.
  • You set a target CPU usage percentage, and the HPA adds or removes pods to stay close to that target.
  • The HPA reads the CPU metrics for each pod from the Kubernetes Metrics Server.
  • The HPA does not need to be manually installed – it is available as an API resource in Kubernetes by default.
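
To make the "adds or removes pods to meet that goal" step concrete, here is a minimal sketch of the replica calculation described in the Kubernetes documentation: the desired count is the current count multiplied by the ratio of current to target utilization, rounded up. The function name and its parameters are hypothetical, the 10% tolerance reflects the Kubernetes default, and the real controller also handles details (missing metrics, pods that are not ready yet, stabilization windows) that this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA rule: ceil(current_replicas * current / target).

    Hypothetical helper for illustration only; not the controller's code.
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result never leaves the configured min/max range.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, 90, 50))  # 4 * 90/50 = 7.2, rounded up to 8
print(desired_replicas(4, 52, 50))  # ratio 1.04 is within tolerance, stays at 4
print(desired_replicas(4, 10, 50))  # 4 * 10/50 = 0.8, rounded up to 1
```

Rounding up rather than to the nearest integer means the autoscaler errs on the side of having slightly too much capacity.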

How to enable Horizontal Autoscaling on AKS

Kubernetes supports horizontal pod autoscaling, which adjusts the number of pods in a deployment based on CPU utilization and other select metrics. The Metrics Server provides resource usage data to Kubernetes and is automatically deployed in AKS clusters running version 1.10 and higher. To see your AKS cluster’s version, use the az aks show command, as in this example:

az aks show --resource-group myResourceGroup --name myAKSCluster --query kubernetesVersion --output table

If your AKS cluster runs a version earlier than 1.10, the Metrics Server is not automatically installed. Metrics Server installation manifests are available as a components.yaml asset on the Metrics Server releases page, which means you can install them via a URL:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml

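To use the autoscaler with a CPU utilization target, every container in the pods you want to scale must have a CPU request defined, because utilization is measured as a percentage of that request. Defining CPU limits as well is recommended.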
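
The reason CPU requests matter is that a utilization target is a percentage of the requested CPU, not of the node's capacity. The sketch below shows that arithmetic for a single container. Both function names are hypothetical, and the parser only understands plain core counts and the millicore suffix (for example 0.5 or 250m), which is an assumption made to keep the example short.

```python
def parse_cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m', '0.5' or '1' to millicores.

    Hypothetical helper: only plain cores and the 'm' suffix are handled.
    """
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def utilization_percent(usage: str, request: str) -> float:
    """CPU utilization as the HPA sees it: usage relative to the request."""
    return 100 * parse_cpu_millicores(usage) / parse_cpu_millicores(request)


# A container that requests 250m and currently uses 200m is at 80%,
# which is above a 50% target even though the node may be mostly idle.
print(utilization_percent("200m", "250m"))
```

Without a request there is no denominator for this calculation, so the autoscaler cannot evaluate a utilization target for that pod.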

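There are three pieces to setting up the autoscaler. Options 1 and 3 are alternative ways to create it, and option 2 defines the CPU requests and limits that both rely on: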

1. Create a manifest file to define Autoscaler behavior:

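# autoscaling/v2 is the stable API; use autoscaling/v2beta2 only on clusters older than Kubernetes 1.23
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50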

Now apply the manifest file with kubectl:

kubectl apply -f manifest.yaml
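
If you manage several deployments, writing each manifest by hand gets repetitive. As a rough sketch, the manifest can also be generated from a few parameters and written out as JSON, which kubectl apply -f accepts just like YAML. The function name build_hpa_manifest and its parameters are hypothetical, the sketch assumes the autoscaling/v2 API (Kubernetes 1.23 or newer), and the check that the minimum is at least 1 is a simplifying assumption.

```python
import json


def build_hpa_manifest(name, deployment, min_replicas, max_replicas,
                       cpu_target_percent, namespace="default"):
    """Return a HorizontalPodAutoscaler manifest as a plain dict.

    Hypothetical helper; assumes the autoscaling/v2 API is available.
    """
    if not 1 <= min_replicas <= max_replicas:
        raise ValueError("need 1 <= min_replicas <= max_replicas")
    if cpu_target_percent <= 0:
        raise ValueError("cpu_target_percent must be positive")
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": cpu_target_percent,
                    },
                },
            }],
        },
    }


# Same values as the manifest above; redirect the output to a file
# and pass that file to kubectl apply -f.
manifest = build_hpa_manifest("my-hpa", "my-deployment", 1, 10, 50)
print(json.dumps(manifest, indent=2))
```

Validating the replica bounds before anything reaches the cluster catches the most common mistake, a minimum that is larger than the maximum.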

2. Define CPU requests and limits in the container spec of your deployment YAML

  containers:
  - name: azure-vote-front
    image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 250m
      limits:
        cpu: 500m

3. Enable horizontal pod autoscaling via kubectl

This option is most suitable if you have not defined the autoscaler in a YAML manifest. The example below targets 50% average CPU utilization for the myapp deployment and keeps it between 3 and 10 replicas:

kubectl autoscale deployment myapp --cpu-percent=50 --min=3 --max=10

Frequently asked questions about horizontal autoscaling on AKS

  1. What triggers horizontal autoscaling in AKS?
  • Horizontal autoscaling is triggered when the observed utilization of a specified resource (such as CPU or memory) moves away from the target defined in the HorizontalPodAutoscaler object. Pods are added when utilization is above the target and removed when it is below.
  2. How does AKS determine the number of replicas to add or remove?
  • The HorizontalPodAutoscaler calculates the desired number of replicas from the ratio of the current utilization to the target utilization specified in the HPA object, then keeps the result within the configured minimum and maximum.
  3. Can I set different autoscaling rules for different resources?
  • Yes, you can specify several metrics (such as CPU and memory) in the same HPA object. The HPA evaluates each metric separately and uses the largest resulting replica count.
  4. How do I monitor horizontal autoscaling in AKS?
  • You can use the kubectl get hpa command to see the current status of the HPA, including current versus target utilization and the current number of replicas. The kubectl describe hpa command additionally shows the conditions and events behind recent scaling decisions.
  5. Can I manually scale the number of replicas in AKS?
  • Yes, you can manually scale the number of replicas in AKS using the kubectl scale command. However, while an HPA targets the deployment, it will adjust the replica count again at its next evaluation, so it’s recommended to let the HorizontalPodAutoscaler handle scaling to ensure optimal resource utilization and cost efficiency.
  6. Is horizontal autoscaling available in all AKS pricing tiers?
  • Yes, horizontal autoscaling is available in all AKS pricing tiers and is included in the standard AKS pricing.
  7. How does horizontal autoscaling affect my AKS costs?
  • Horizontal autoscaling can help optimize your AKS costs by automatically scaling the number of pods up or down as needed, so you do not reserve capacity for idle replicas. However, be aware that scaling out consumes more resources and, if more nodes are needed to host the extra pods, results in higher costs.
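
For monitoring, the same information that kubectl get hpa prints as a table is also available in machine-readable form with the -o json flag. The sketch below summarizes such an object. The SAMPLE document is made-up illustration data, not output from a real cluster, and the function name summarize_hpa is hypothetical; the field names follow the autoscaling/v2 API.

```python
import json

# Made-up example of what `kubectl get hpa my-hpa -o json` could return.
SAMPLE = """
{
  "spec": {
    "minReplicas": 1,
    "maxReplicas": 10,
    "metrics": [
      {"type": "Resource",
       "resource": {"name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": 50}}}
    ]
  },
  "status": {
    "currentReplicas": 4,
    "desiredReplicas": 8,
    "currentMetrics": [
      {"type": "Resource",
       "resource": {"name": "cpu",
                    "current": {"averageUtilization": 90}}}
    ],
    "conditions": [
      {"type": "AbleToScale", "status": "True", "reason": "SucceededRescale"},
      {"type": "ScalingActive", "status": "True", "reason": "ValidMetricFound"},
      {"type": "ScalingLimited", "status": "False", "reason": "DesiredWithinRange"}
    ]
  }
}
"""


def summarize_hpa(hpa):
    """Return a few human-readable lines describing an HPA object (a dict)."""
    spec = hpa.get("spec", {})
    status = hpa.get("status", {})
    targets = {
        m["resource"]["name"]: m["resource"]["target"].get("averageUtilization")
        for m in spec.get("metrics") or [] if m.get("type") == "Resource"
    }
    current = {
        m["resource"]["name"]: m["resource"]["current"].get("averageUtilization")
        for m in status.get("currentMetrics") or [] if m.get("type") == "Resource"
    }
    lines = [
        f"replicas: current={status.get('currentReplicas')} "
        f"desired={status.get('desiredReplicas')}"
    ]
    for name, target in targets.items():
        lines.append(f"{name}: {current.get(name)}% of request (target {target}%)")
    for cond in status.get("conditions") or []:
        lines.append(f"{cond['type']}={cond['status']} ({cond.get('reason', '')})")
    return lines


for line in summarize_hpa(json.loads(SAMPLE)):
    print(line)
```

In practice you would feed the function the output of the kubectl command instead of SAMPLE, for example by piping it into the script and reading it from standard input.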


In conclusion, horizontal autoscaling on Azure Kubernetes Service is a useful tool for optimizing resource usage and cost efficiency. By automatically scaling the number of pods up or down depending on demand, you can find a good balance between the resources you need and the money you spend. You can still scale manually if necessary.
