How does Kubernetes compute CPU utilization for HPA?

Question

I want to understand how HPA computes CPU utilization across Pods.

According to this doc, it takes each pod's CPU utilization (averaged over the last 1 minute) divided by the CPU requested by that pod. Then it computes the arithmetic mean of that value across all the pods.

Unfortunately, the doc contains some outdated information. For example, it says that --horizontal-pod-autoscaler-sync-period defaults to 30 seconds, but in the official doc the default value is 15 seconds.

When I tested, I noticed that HPA scales up even before the average CPU reaches the threshold I set (which is 90%), which made me think that maybe it takes the maximum CPU across Pods and not the average.

My question is: where can I find up-to-date documentation that explains exactly how HPA works?

Answer 1

Note that I don't have a Kubernetes cluster at hand, so this is a theoretical answer based on the k8s source code.
See if this actually matches your experience.

Kubernetes is open source, and this seems to be the HPA code.

The functions GetResourceReplicas and calcPlainMetricReplicas (for metrics that are not utilization percentages) compute the number of replicas given the current metrics.
Both use the usageRatio returned by GetMetricUtilizationRatio. This value is multiplied by the number of currently ready pods in the Replica to get the new number of pods:

New_number_of_pods = Old_numbers_of_ready_pods * usageRatio

There is a tolerance check (i.e. if the usageRatio falls close enough to 1, nothing is done). Pending and unknown-state pods are ignored (considered to use 0% of the resource), while pods without metrics are considered to use 100% of the resource.
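
To make the tolerance check concrete, here is a minimal Python sketch (not the actual Go code). The function name desired_replicas is hypothetical, the 0.1 tolerance is assumed to be the default of --horizontal-pod-autoscaler-tolerance, and rounding the result up is an assumption based on the formula in the official docs:

```python
import math


def desired_replicas(ready_pods: int, usage_ratio: float, tolerance: float = 0.1) -> int:
    """Hypothetical helper: scale the ready pod count by usage_ratio.

    Assumption: a ratio within `tolerance` of 1.0 means "do nothing",
    and the scaled count is rounded up to a whole pod.
    """
    if abs(1.0 - usage_ratio) <= tolerance:
        # Close enough to the target: keep the current replica count.
        return ready_pods
    return math.ceil(ready_pods * usage_ratio)


if __name__ == "__main__":
    print(desired_replicas(4, 1.05))  # within tolerance, count unchanged
    print(desired_replicas(4, 1.5))   # above target, scales up
    print(desired_replicas(4, 0.5))   # below target, scales down
```

With these assumptions, a usageRatio between 0.9 and 1.1 leaves the replica count alone, so a scale-up only starts once usage is more than 10% above the target.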

The usageRatio is computed by GetResourceUtilizationRatio, which is passed the metrics and the resource requests of all the pods. It goes as follows:

utilization = Total_sum_resource_usage_all_pods / Total_sum_resource_requests_all_pods
usageRatio = utilization * 100 / targetUtilization

Where targetUtilization comes from the HPA spec.
The code is easier to read than this summary of mine. In this context the term request means "resource request" (that's an educated guess).
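
The two formulas above can be sketched in Python as follows. This is only an illustration of the arithmetic, not the real implementation: the function name usage_ratio, the millicore units, and the sample numbers are all made up for the example.

```python
from typing import Sequence


def usage_ratio(
    usage_millicores: Sequence[float],
    request_millicores: Sequence[float],
    target_utilization: float,
) -> float:
    """Hypothetical helper mirroring the two formulas in the answer.

    utilization = sum of usage over all pods / sum of requests over all pods
    usageRatio  = utilization * 100 / targetUtilization
    """
    total_requests = sum(request_millicores)
    if total_requests == 0:
        raise ValueError("pods have no resource requests")
    utilization = sum(usage_millicores) / total_requests
    return utilization * 100 / target_utilization


if __name__ == "__main__":
    # Two pods, each requesting 500m, with a 90% target (made-up numbers).
    print(usage_ratio([450, 450], [500, 500], 90))  # roughly 1.0: on target
    print(usage_ratio([500, 500], [500, 500], 90))  # above 1.0: over target
```

A ratio above 1 means the pods as a group are using more than the target share of what they requested, and a ratio below 1 means they are using less.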

So I'd say that the 90% is the resource usage across all pods, computed as if they were a single pod requesting the sum of each pod's request, with the metrics collected as if they were all running on a single dedicated node.
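
To see how this "single big pod" reading differs from taking the mean of per-pod percentages, here is a small Python sketch. Both function names and all the numbers are hypothetical and only illustrate the arithmetic:

```python
from typing import Sequence


def aggregate_utilization(usage: Sequence[float], requests: Sequence[float]) -> float:
    """Sum of usage divided by sum of requests (the reading described above)."""
    return sum(usage) / sum(requests)


def mean_per_pod_utilization(usage: Sequence[float], requests: Sequence[float]) -> float:
    """Arithmetic mean of each pod's own usage/request ratio."""
    ratios = [u / r for u, r in zip(usage, requests)]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Made-up example: pod A uses 900m of a 1000m request,
    # pod B uses 50m of a 100m request.
    usage = [900, 50]
    requests = [1000, 100]
    print(aggregate_utilization(usage, requests))     # 950 / 1100
    print(mean_per_pod_utilization(usage, requests))  # mean of 0.9 and 0.5
```

The two values only diverge when pods have different requests. When every pod has the same request, which is the usual case for pods created from one template, both calculations give the same number.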

Answer 2

According to https://github.com/kubernetes/kubernetes/issues/78988#issuecomment-502106361 this is configuration-dependent and a matter of how the metrics server and the kubelet report usage. The HPA should only be using that information: https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/#cpu

I think the duration should be defined by the kubelet's --housekeeping-interval, which defaults to 10 seconds.

solution1
5 2020-04-04 17:15:50

solution2
0 2020-04-04 11:39:34