This article is part of a series in which I cover Kubernetes concepts and the most common use cases. In this quickstart, I show how to scale the pods that run your application and how to scale the number of Azure VM nodes to change the cluster’s capacity for hosting workloads. You learn how to:
- Scale the Kubernetes nodes
- Manually scale Kubernetes pods that run your application
- Configure autoscaling for the pods that run the app
Assumption:
For this demo, I have created two deployments named blog-back and blog-front, which can be listed with the kubectl get deployment command as follows:
kubectl get deployment
The following example output shows one blog-back and one blog-front deployment with one pod each:
Output:
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
blog-back    1/1     1            0           6s
blog-front   1/1     1            0           6s
Manually scale pods
To see the number and state of the pods in your cluster, use the kubectl get pods command as follows:
kubectl get pods
The following example output shows one blog-back pod and one blog-front pod:
Output:
NAME                         READY   STATUS    RESTARTS   AGE
blog-back-2549686872-4d2r5   1/1     Running   0          31m
blog-front-848767080-tf34m   1/1     Running   0          31m
To manually change the number of pods in the blog-front deployment, use the kubectl scale command. The following example increases the number of blog-front pods to 5:
kubectl scale --replicas=5 deployment/blog-front
Run kubectl get pods again to verify that AKS successfully creates the additional pods. After a minute or so, the pods are available in your cluster:
kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
blog-back-2606967446-nmpcf    1/1     Running   0          15m
blog-front-3309479140-2hfh0   1/1     Running   0          3m
blog-front-3309479140-bzt05   1/1     Running   0          3m
blog-front-3309479140-fvcvm   1/1     Running   0          3m
blog-front-3309479140-hrbf2   1/1     Running   0          15m
blog-front-3309479140-qphz8   1/1     Running   0          3m
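If you would rather check the count than read the list, the listing can be filtered with a small shell helper. The following is a minimal sketch, not a kubectl feature: the function name count_running_pods is hypothetical, and it assumes that pod names start with the deployment name followed by a hyphen, as in the output above. The sample listing is copied from that output so the block runs without a cluster.

```shell
# Hypothetical helper: counts pods whose name starts with "<prefix>-"
# and whose STATUS column (third field) is "Running".
# Reads `kubectl get pods` style output on standard input.
count_running_pods() {
  awk -v prefix="$1-" 'index($1, prefix) == 1 && $3 == "Running" { n++ } END { print n + 0 }'
}

# Sample listing copied from the example output above.
sample_pods='NAME READY STATUS RESTARTS AGE
blog-back-2606967446-nmpcf 1/1 Running 0 15m
blog-front-3309479140-2hfh0 1/1 Running 0 3m
blog-front-3309479140-bzt05 1/1 Running 0 3m
blog-front-3309479140-fvcvm 1/1 Running 0 3m
blog-front-3309479140-hrbf2 1/1 Running 0 15m
blog-front-3309479140-qphz8 1/1 Running 0 3m'

# Count the running blog-front pods in the sample listing.
printf '%s\n' "$sample_pods" | count_running_pods blog-front
```

Against a live cluster, the same helper can read from kubectl directly, for example: kubectl get pods | count_running_pods blog-front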
Autoscale pods
Kubernetes supports horizontal pod autoscaling to adjust the number of pods in a deployment depending on CPU utilization or other select metrics. The Metrics Server provides resource utilization data to Kubernetes, and is automatically deployed in AKS clusters running version 1.10 or later. To see the version of your AKS cluster, use the az aks show command, as shown in the following example:
az aks show --resource-group myResourceGroup --name myAKSCluster --query kubernetesVersion --output table
To use the autoscaler, every container in your pods must have CPU requests and limits defined. In the blog-front deployment, the blog-front container already requests 0.25 CPU, with a limit of 0.5 CPU.
These resource requests and limits are defined for each container as shown in the following example snippet:
containers:
- name: blog-front
  image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
  ports:
  - containerPort: 80
  resources:
    requests:
      cpu: 250m
    limits:
      cpu: 500m
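The two notations describe the same quantities: Kubernetes measures CPU in millicores, so 250m is 0.25 of a CPU and 500m is 0.5. As a quick illustration, the sketch below converts either notation to millicores. The function name cpu_to_millicores is hypothetical and not part of kubectl, and it only handles plain decimal values and the m suffix.

```shell
# Hypothetical helper: converts a Kubernetes CPU quantity to millicores.
# Accepts either the "m" suffix form (250m) or a plain decimal (0.25).
cpu_to_millicores() {
  case "$1" in
    # Already in millicores: strip the trailing "m".
    *m) printf '%s\n' "${1%m}" ;;
    # Whole or fractional CPUs: multiply by 1000 and round to the nearest integer.
    *)  awk -v cpu="$1" 'BEGIN { printf "%d\n", cpu * 1000 + 0.5 }' ;;
  esac
}

cpu_to_millicores 250m   # the request as written in the snippet
cpu_to_millicores 0.25   # the same request written as a decimal
cpu_to_millicores 0.5    # the limit written as a decimal
```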
The following example uses the kubectl autoscale command to autoscale the number of pods in the blog-front deployment. If average CPU utilization across all pods exceeds 50% of their requested usage, the autoscaler increases the pods up to a maximum of 10 instances. A minimum of 3 instances is then defined for the deployment:
kubectl autoscale deployment blog-front --cpu-percent=50 --min=3 --max=10
Alternatively, you can create a manifest file to define the autoscaler behavior. The following is an example of a manifest file named blog-hpa.yaml.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: blog-back-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3 # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: blog-back
  targetCPUUtilizationPercentage: 50 # target CPU utilization
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: blog-front-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3 # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: blog-front
  targetCPUUtilizationPercentage: 50 # target CPU utilization
Use kubectl apply to apply the autoscaler defined in the blog-hpa.yaml manifest file.
kubectl apply -f blog-hpa.yaml
To see the status of the autoscaler, use the kubectl get hpa command as follows:
kubectl get hpa
NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
blog-front   Deployment/blog-front   0% / 50%   3         10        3          2m
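To make the scaling rule concrete: the Kubernetes documentation gives the core calculation as desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), with the result kept between the minimum and maximum replica counts. The sketch below reproduces only that arithmetic. The function name desired_replicas is hypothetical, the utilization figures in the usage lines are made-up inputs, and the real autoscaler also applies a tolerance, a stabilization window, and special handling for pods that are not ready, none of which is modeled here.

```shell
# Hypothetical helper: core horizontal pod autoscaler arithmetic only.
# Arguments: current replicas, current average CPU utilization (%),
# target utilization (%), minimum replicas, maximum replicas.
# Note: plain POSIX sh, so the variables below are global.
desired_replicas() {
  current=$1; utilization=$2; target=$3; min=$4; max=$5
  # Integer ceiling of current * utilization / target.
  desired=$(( (current * utilization + target - 1) / target ))
  # Keep the result within the configured bounds.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 3 90 50 3 10    # 3 pods averaging 90% against a 50% target
desired_replicas 3 0 50 3 10     # idle pods, as in the status output above
desired_replicas 8 100 50 3 10   # load that would exceed the maximum
```

With the settings used in this article (target 50%, minimum 3, maximum 10), idle pods stay at the minimum of 3 replicas, which matches the REPLICAS column in the status output.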
Manually scale AKS nodes
You can adjust the number of nodes manually if you plan more or fewer container workloads on your cluster. The following example increases the number of nodes to three in the Kubernetes cluster named myAKSCluster. The command takes a couple of minutes to complete.
az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3
When the cluster has successfully scaled, the output is similar to the following example:
"agentPoolProfiles": [
  {
    "count": 3,
    "dnsPrefix": null,
    "fqdn": null,
    "name": "myAKSCluster",
    "osDiskSizeGb": null,
    "osType": "Linux",
    "ports": null,
    "storageProfile": "ManagedDisks",
    "vmSize": "Standard_D2_v2",
    "vnetSubnetId": null
  }
]