Kubernetes runs your workload by placing containers into Pods that run on nodes. A node may be a virtual or physical machine, depending on the cluster. In Azure Kubernetes Service (AKS), the nodes are usually virtual machines that belong to an underlying virtual machine scale set. Each node is managed by the control plane and contains the services necessary to run Pods.

In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools. These node pools contain the underlying VMs that run your applications. The initial number of nodes and their size (SKU) are defined when you create an AKS cluster, which creates a system node pool. To support applications that have different compute or storage demands, you can create additional user node pools, for example to provide GPUs for compute-intensive applications or access to high-performance SSD storage.

System node pools serve the primary purpose of hosting critical system pods such as CoreDNS and tunnelfront. User node pools serve the primary purpose of hosting your application pods. However, application pods can be scheduled on system node pools if you want only one pool in your AKS cluster.

System and user node pools

For a system node pool, AKS automatically assigns the label kubernetes.azure.com/mode: system to its nodes. This causes AKS to prefer scheduling system pods on node pools that contain this label. This label does not prevent you from scheduling application pods on system node pools. However, we recommend you isolate critical system pods from your application pods to prevent misconfigured or rogue application pods from accidentally killing system pods. You can enforce this behavior by creating a dedicated system node pool. Use the CriticalAddonsOnly=true:NoSchedule taint to prevent application pods from being scheduled on system node pools.
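The effect of the CriticalAddonsOnly=true:NoSchedule taint can be illustrated with a small model of how taints and tolerations interact. This is a simplified sketch in plain Python, not the Kubernetes scheduler or any AKS API: the class and function names (Taint, Toleration, tolerates, can_schedule) are hypothetical, and only the NoSchedule effect is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # an empty effect matches any effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this toleration matches the given taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.operator == "Exists":
        # An empty key with "Exists" matches every taint key.
        return toleration.key == "" or toleration.key == taint.key
    return toleration.key == taint.key and toleration.value == taint.value


def can_schedule(tolerations: list, node_taints: list) -> bool:
    """A pod is schedulable only if every NoSchedule taint is tolerated."""
    return all(
        any(tolerates(t, taint) for t in tolerations)
        for taint in node_taints
        if taint.effect == "NoSchedule"
    )


# The taint named in the text above, applied to a dedicated system node pool.
system_pool_taints = [Taint("CriticalAddonsOnly", "true", "NoSchedule")]

# An application pod with no tolerations (illustrative).
app_pod_tolerations = []

# A system pod that tolerates the taint (illustrative toleration).
system_pod_tolerations = [Toleration(key="CriticalAddonsOnly", operator="Exists")]

print(can_schedule(app_pod_tolerations, system_pool_taints))
print(can_schedule(system_pod_tolerations, system_pool_taints))
```

In this model the application pod is rejected from the tainted pool while the pod carrying a matching toleration is admitted, which is the isolation the dedicated system node pool is meant to provide.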

System and user node pools have the following restrictions:

  • The osType of a system node pool must be Linux.
  • The osType of a user node pool may be Linux or Windows.
  • System node pools must contain at least one node; user node pools may contain zero or more nodes.
  • System node pools require a VM SKU with at least 2 vCPUs and 4 GB of memory. Burstable VMs (B series) are not recommended.
  • A minimum of two nodes with 4 vCPUs is recommended (for example, Standard_DS4_v2), especially for large clusters (multiple CoreDNS Pod replicas, 3-4+ add-ons, and so on).
  • System node pools must support at least 30 pods as described by the minimum and maximum value formula for pods.
  • Spot node pools must be user node pools.
  • Adding an additional system node pool or changing which node pool is a system node pool will NOT automatically move system pods. System pods can continue to run on the same node pool even if you change it to a user node pool. If you delete or scale down a node pool running system pods that was previously a system node pool, those system pods are redeployed with preferred scheduling to the new system node pool.
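The hard requirements in the list above can be expressed as a simple validation routine. The following is a minimal sketch in plain Python, assuming a hypothetical NodePoolSpec record. It does not call any Azure API, and it checks only the mandatory rules stated above, not the recommendations about burstable VMs or node sizing.

```python
from dataclasses import dataclass


@dataclass
class NodePoolSpec:
    # Hypothetical record for illustration; field names are not an AKS schema.
    name: str
    mode: str         # "System" or "User"
    os_type: str      # "Linux" or "Windows"
    node_count: int
    vcpus: int        # vCPUs of the chosen VM SKU
    memory_gb: float  # memory of the chosen VM SKU
    max_pods: int
    spot: bool = False


def violations(pool: NodePoolSpec) -> list:
    """Return the list of rules from the text that this pool breaks."""
    problems = []
    if pool.os_type not in ("Linux", "Windows"):
        problems.append("osType must be Linux or Windows")
    if pool.node_count < 0:
        problems.append("node count cannot be negative")
    if pool.mode == "System":
        if pool.os_type != "Linux":
            problems.append("system node pools must use Linux")
        if pool.node_count < 1:
            problems.append("system node pools need at least one node")
        if pool.vcpus < 2 or pool.memory_gb < 4:
            problems.append("system node pools need at least 2 vCPUs and 4 GB memory")
        if pool.max_pods < 30:
            problems.append("system node pools must support at least 30 pods")
        if pool.spot:
            problems.append("spot node pools must be user node pools")
    return problems


# Illustrative pools: a valid system pool and a user pool scaled to zero.
system_pool = NodePoolSpec("systempool", "System", "Linux", 2, 4, 16, 30)
user_pool = NodePoolSpec("winapps", "User", "Windows", 0, 4, 16, 30, spot=True)

for pool in (system_pool, user_pool):
    print(pool.name, violations(pool))
```

Returning a list of violations rather than raising on the first failure makes it easier to report every problem with a proposed pool at once.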

You can do the following operations with node pools:

  • Create a dedicated system node pool, so that system pods are preferentially scheduled on node pools of mode: system.
  • Change a system node pool to be a user node pool, provided you have another system node pool to take its place in the AKS cluster.
  • Change a user node pool to be a system node pool.
  • Delete user node pools.
  • Delete system node pools, provided you have another system node pool to take its place in the AKS cluster.
  • Run multiple system node pools in one AKS cluster. Every cluster requires at least one system node pool.
  • Replace existing node pools with new ones to change settings that are immutable on an existing pool. For example, add a new node pool with a new maxPods setting and delete the old node pool.
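The rule that runs through these operations is that a cluster must always keep at least one system node pool. A minimal sketch of that check in plain Python follows. The function names and the dictionary of pool names to modes are hypothetical, and nothing here calls Azure. Converting a user pool to a system pool is treated as always permitted by this rule, although in practice the pool would also have to meet the system node pool restrictions.

```python
def system_pools(pools: dict) -> list:
    """Names of all pools whose mode is "System"."""
    return [name for name, mode in pools.items() if mode == "System"]


def can_delete(pools: dict, name: str) -> bool:
    """User pools can always be deleted; a system pool needs a replacement."""
    if pools[name] == "User":
        return True
    return len(system_pools(pools)) > 1


def can_change_mode(pools: dict, name: str, new_mode: str) -> bool:
    """Check whether a mode change keeps at least one system pool."""
    if pools[name] == new_mode or new_mode == "System":
        # No-op, or user -> system (other restrictions are not checked here).
        return True
    # system -> user: another system pool must remain.
    return len(system_pools(pools)) > 1


# Illustrative cluster with one system pool and one user pool.
cluster = {"systempool": "System", "apps": "User"}

print(can_delete(cluster, "apps"))
print(can_delete(cluster, "systempool"))
print(can_change_mode(cluster, "systempool", "User"))

# After adding a second system pool, the first one can be removed or converted.
cluster["systempool2"] = "System"
print(can_delete(cluster, "systempool"))
```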