In previous labs, you deployed a single Pod and a Deployment with a fixed number of replicas, and scaled the Deployment manually. In this lab, you will deploy a HorizontalPodAutoscaler to scale the Deployment automatically when a certain condition is met, such as elevated CPU or memory usage.
To begin, create a dedicated directory for this lab and switch into it:
cd ~
mkdir random-facts-app-autoscaling && cd random-facts-app-autoscaling
Create a new namespace called random-facts-app-autoscaling with the label lab=random-facts-app-autoscaling.
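If you prefer to create the namespace imperatively rather than with a manifest, the following commands are one way to do it:
# Create the namespace and add the lab label
kubectl create namespace random-facts-app-autoscaling
kubectl label namespace random-facts-app-autoscaling lab=random-facts-app-autoscaling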
Create a Deployment manifest for the application with the following contents:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: random-facts-app
  namespace: random-facts-app-autoscaling
  labels:
    lab: random-facts-app-autoscaling
spec:
  replicas: 1
  selector:
    matchLabels:
      lab: random-facts-app-autoscaling
  template:
    metadata:
      labels:
        lab: random-facts-app-autoscaling
    spec:
      containers:
      - name: random-facts-app
        image: us-central1-docker.pkg.dev/<YOUR_PROJECT_ID>/<YOUR_REGISTRY_NAME>/random-facts-app:1.0
        ports:
        - containerPort: 5000
        resources:
          requests:
            cpu: 0.1
            memory: 256M
Apply it to your cluster and validate the pods are running with the kubectl get pods command.
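Assuming you saved the manifest as deployment.yaml (the filename here is just an example), the commands would look like this:
# Apply the Deployment and check the pods in the lab namespace
kubectl apply -f deployment.yaml
kubectl get pods -n random-facts-app-autoscaling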
Create a Service of type LoadBalancer. We will need a public IP address to send traffic to the application and trigger a scale-up.
apiVersion: v1
kind: Service
metadata:
  name: random-facts-app-service
  namespace: random-facts-app-autoscaling
  labels:
    lab: random-facts-app-autoscaling
spec:
  selector:
    lab: random-facts-app-autoscaling
  ports:
  - name: http
    port: 5000
    protocol: TCP
    targetPort: 5000
  type: LoadBalancer
Apply your Service manifest and then use the kubectl get service command to retrieve the External IP.
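Assuming the manifest was saved as service.yaml (again, an example filename), the commands would look like this:
# Apply the Service and retrieve its external IP
kubectl apply -f service.yaml
kubectl get service -n random-facts-app-autoscaling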
Note: If it says <pending>, give it a few moments for GCP to assign a public IP.
NAME                       TYPE           CLUSTER-IP     EXTERNAL-IP          PORT(S)                AGE
random-facts-app-service   LoadBalancer   10.43.149.39   <YOUR_EXTERNAL_IP>   5000:<NODE_PORT>/TCP   18m
Open a new tab in your browser and navigate to http://<YOUR_EXTERNAL_IP>:5000.
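If you prefer to check from the command line instead, curl works as well:
curl http://<YOUR_EXTERNAL_IP>:5000/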
Configure Autoscaling #
Next, we will configure the Deployment to scale up/down to maintain an average CPU utilization of 15% across all pods.
Create the HorizontalPodAutoscaler manifest for the application with the following contents:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: random-facts-app-autoscaler
  namespace: random-facts-app-autoscaling
  labels:
    lab: random-facts-app-autoscaling
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: random-facts-app
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 15
Save your manifest and apply it to the cluster.
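Assuming the manifest was saved as hpa.yaml (an example filename), applying it looks like this:
kubectl apply -f hpa.yaml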
Use the kubectl get hpa command to view the status of the HorizontalPodAutoscaler. Initially, the pods will not have been running for long, so the Kubernetes metrics server (which collects utilization metrics for all workloads in the cluster) will not yet have any statistics for the app.
You will see something similar to the following:
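Scoped to the lab namespace, the command is:
kubectl get hpa -n random-facts-app-autoscaling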
NAME                          REFERENCE                     TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
random-facts-app-autoscaler   Deployment/random-facts-app   <unknown>/15%   1         3         0          7s
After a few minutes, <unknown> should be replaced with an actual percentage. The 15% shown in the TARGETS column is the target utilization we specified in the manifest, and the HPA controller will automatically adjust the number of replicas to maintain that target.
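If you would like more detail on the controller's decisions, kubectl describe hpa shows the current metric values and any scaling events:
kubectl describe hpa random-facts-app-autoscaler -n random-facts-app-autoscaling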
Generate Traffic and Observe Autoscaler Behaviour #
Next, we will use the Apache HTTP server benchmarking tool (ab) to generate traffic. This will push the app's CPU usage above the utilization target and ultimately cause the app to scale up.
In your current Cloud Shell terminal, run the following command to watch your autoscaler:
watch kubectl get hpa -n random-facts-app-autoscaling
In a new Cloud Shell terminal, run the following command to watch your pods:
watch kubectl get pods -n random-facts-app-autoscaling
Finally, in a third Cloud Shell terminal, run the following ab command to generate traffic.
ab -n 100000000 -c 10 http://<YOUR_EXTERNAL_IP>:5000/
The above command sends 100,000,000 GET requests to your app using 10 concurrent connections. Ensure that there is a trailing / at the end of the URL; if it is omitted, ab will report an invalid URL error.
Important: If the ab command is not found or is failing due to firewall restrictions, reinstall the ab tool and use kubectl port-forward as a workaround.
# Reinstall the ab tool
sudo apt update
sudo apt install apache2-utils -y
# Then, use port forwarding and a modified ab command
kubectl port-forward service/random-facts-app-service 5000:http --namespace random-facts-app-autoscaling
ab -n 100000000 -c 10 http://localhost:5000/
While the ab load generator is running, switch back and forth between the two terminals where you are watching the HorizontalPodAutoscaler and the Pods. Notice that CPU utilization in the TARGETS column spikes and that two new pods are deployed in an attempt to keep up with the traffic.
Scaling Down #
To scale the app down, we can simply stop ab from generating traffic, reduce the number of requests, or raise the target utilization. When traffic stops, the HorizontalPodAutoscaler will detect that CPU utilization is below 15% and will begin to terminate pods automatically. However, this happens after a delay, which reduces the chance of flapping the replica count (the number of replicas changing rapidly, potentially causing instability).
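The length of this delay is governed by the autoscaler's scale-down stabilization window, which defaults to five minutes. If you want to experiment with it, the optional behavior block below (not required for this lab) could be added under spec in the HorizontalPodAutoscaler manifest to shorten the window:
behavior:
  scaleDown:
    stabilizationWindowSeconds: 60   # default is 300 seconds (5 minutes)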
If the ab command is still running, stop it and close that Cloud Shell terminal.
In the Cloud Shell where you were watching the HorizontalPodAutoscaler, notice that the CPU utilization metric has decreased below the 15% threshold.
In the Cloud Shell where you were watching the Pods, notice that the pods start to terminate if they haven't already. Eventually, you will see just one pod running, which matches the minReplicas value set in the autoscaling configuration above.
NAME                                READY   STATUS    RESTARTS   AGE
random-facts-app-6f9984d959-dvrv7   1/1     Running   0          27m
Clean Up #
Before moving on to the next lab, run the following command to delete your Service:
kubectl delete service random-facts-app-service -n random-facts-app-autoscaling