This article is part of a series in which I talk about Kubernetes concepts and the most common use case scenarios. In this quickstart, I show how to scale the pods that run your application and the Azure VM nodes that provide the cluster's capacity for hosting workloads. You learn how to:

  • Scale the Kubernetes nodes
  • Manually scale Kubernetes pods that run your application
  • Configure autoscaling for the pods that run the app

Assumption:

For this demo, I have created two deployments named blog-back and blog-front, which you can see with the kubectl get deployment command:

kubectl get deployment

The following example output shows one blog-back and one blog-front deployment with one pod each:

Output:

NAME          READY   UP-TO-DATE   AVAILABLE   AGE
blog-back     1/1     1            1           6s
blog-front    1/1     1            1           6s
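
For reference, blog-front was created from a deployment manifest similar to the following sketch (blog-back is defined the same way with its own container image). This is only an illustration: the labels are assumptions for this sketch, while the resources section matches the CPU requests and limits discussed later in the autoscaling section.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: blog-front
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blog-front        # label value is an assumption for this sketch
  template:
    metadata:
      labels:
        app: blog-front
    spec:
      containers:
      - name: blog-front
        image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 250m
          limits:
            cpu: 500m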

Manually scale pods

To see the number of pods, use the kubectl get pods command:

kubectl get pods

The following example output shows one blog-back pod and one blog-front pod:

Output:

NAME                          READY STATUS    RESTARTS   AGE
blog-back-2549686872-4d2r5    1/1   Running    0         31m
blog-front-848767080-tf34m    1/1   Running    0         31m

To manually change the number of pods in the blog-front deployment, use the kubectl scale command. The following example increases the number of blog-front pods to 5:

kubectl scale --replicas=5 deployment/blog-front
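
As an alternative to kubectl scale, you can make the same change with kubectl patch, which sends a partial update to the deployment object:

kubectl patch deployment blog-front -p '{"spec":{"replicas":5}}'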

Run kubectl get pods again to verify that AKS successfully creates the additional pods. After a minute or so, the pods are available in your cluster:

kubectl get pods

NAME                          READY     STATUS    RESTARTS   AGE
blog-back-2606967446-nmpcf    1/1       Running   0          15m
blog-front-3309479140-2hfh0   1/1       Running   0          3m
blog-front-3309479140-bzt05   1/1       Running   0          3m
blog-front-3309479140-fvcvm   1/1       Running   0          3m
blog-front-3309479140-hrbf2   1/1       Running   0          15m
blog-front-3309479140-qphz8   1/1       Running   0          3m

Autoscale pods

Kubernetes supports horizontal pod autoscaling to adjust the number of pods in a deployment based on CPU utilization or other selected metrics. The Metrics Server provides resource utilization data to Kubernetes and is deployed automatically in AKS clusters running Kubernetes 1.10 and higher. To see the version of your AKS cluster, use the az aks show command, as shown in the following example:

az aks show --resource-group myResourceGroup --name myAKSCluster --query kubernetesVersion --output table
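
If you want to confirm that the Metrics Server is running before you configure autoscaling, you can check for its deployment in the kube-system namespace:

kubectl get deployment metrics-server -n kube-system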

To use the autoscaler, all containers in your pods must have CPU requests and limits defined. In the blog-front deployment, the blog-front container already requests 0.25 CPU, with a limit of 0.5 CPU.

These resource requests and limits are defined for each container as shown in the following example snippet:

  containers:
  - name: blog-front
    image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 250m
      limits:
        cpu: 500m
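
To confirm that these values are applied to the running deployment, you can inspect its pod template with kubectl describe; the Requests and Limits values appear under the container definition:

kubectl describe deployment blog-front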

The following example uses the kubectl autoscale command to autoscale the number of pods in the blog-front deployment. If average CPU utilization across all pods exceeds 50% of their requested usage, the autoscaler increases the number of pods up to a maximum of 10 instances. A minimum of 3 instances is also defined for the deployment:

kubectl autoscale deployment blog-front --cpu-percent=50 --min=3 --max=10
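
For context, the horizontal pod autoscaler calculates the desired replica count as roughly desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization). For example, if 3 blog-front pods average 100% of their requested CPU against the 50% target, the autoscaler scales out to ceil(3 × 100 / 50) = 6 replicas, always staying within the --min and --max bounds.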

Alternatively, you can create a manifest file to define the autoscaler behavior and the replica limits. The following is an example of a manifest file named blog-hpa.yaml that defines an autoscaler for each deployment.

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: blog-back-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3  # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: blog-back
  targetCPUUtilizationPercentage: 50 # target CPU utilization

---

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: blog-front-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3  # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: blog-front
  targetCPUUtilizationPercentage: 50 # target CPU utilization

Use kubectl apply to apply the autoscalers defined in the blog-hpa.yaml manifest file.

kubectl apply -f blog-hpa.yaml

To see the status of the autoscalers, use the kubectl get hpa command as follows:

kubectl get hpa

NAME             REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
blog-back-hpa    Deployment/blog-back    0% / 50%   3         10        3          2m
blog-front-hpa   Deployment/blog-front   0% / 50%   3         10        3          2m
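
For more detail on individual scaling decisions and events, you can describe a specific autoscaler by the name shown in the kubectl get hpa output:

kubectl describe hpa blog-front-hpa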

Manually scale AKS nodes

You can adjust the number of nodes manually if you plan to run more or fewer container workloads on your cluster. The following example increases the number of nodes to three in the Kubernetes cluster named myAKSCluster. The command takes a couple of minutes to complete.

az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3
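
If your cluster has more than one node pool, you can add the --nodepool-name parameter to specify which pool to scale. The pool name nodepool1 below is only a placeholder; substitute the name of your own node pool:

az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3 --nodepool-name nodepool1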

When the cluster has successfully scaled, the output is similar to the following example:

"agentPoolProfiles": [
  {
    "count": 3,
    "dnsPrefix": null,
    "fqdn": null,
    "name": "myAKSCluster",
    "osDiskSizeGb": null,
    "osType": "Linux",
    "ports": null,
    "storageProfile": "ManagedDisks",
    "vmSize": "Standard_D2_v2",
    "vnetSubnetId": null
  }
]
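
You can also confirm the new node count from within the cluster using kubectl:

kubectl get nodes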

Next steps

In the next parts of this series, I continue to cover Kubernetes concepts and the most common use case scenarios.
