How to implement Kubernetes horizontal pod autoscaling with scale up/down policies?

Kubernetes v1.19 in AWS EKS

I'm trying to implement horizontal pod autoscaling in my EKS cluster, and am trying to mimic what we do now with ECS. With ECS, we do something similar to the following:
I'm trying to use the HorizontalPodAutoscaler kind, and helm create gives me this template. (Note I modified it to suit my needs, but the metrics stanza remains.)
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "microserviceChart.Name" . }}
  labels:
    {{- include "microserviceChart.Name" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "microserviceChart.Name" . }}
  minReplicas: {{ include "microserviceChart.minReplicas" . }}
  maxReplicas: {{ include "microserviceChart.maxReplicas" . }}
  metrics:
    {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
    {{- end }}
    {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
    - type: Resource
      resource:
        name: memory
        targetAverageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
    {{- end }}
{{- end }}
However, how do I fit the scale up/down information shown in Horizontal Pod Autoscaling into the above template, to match the behavior that I want?
The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed metrics (like CPU or memory).
There is an official walkthrough focusing on HPA and its scaling.

The algorithm that scales the number of replicas is the following:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
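As a quick sketch, the formula can be written as a few lines of Python (the function name is mine, not part of any Kubernetes API; this ignores the HPA's tolerance and stabilization logic):

```python
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     desired_metric: float) -> int:
    """desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]"""
    return math.ceil(current_replicas * (current_metric / desired_metric))

# 2 replicas at 80% utilization with a 75% target -> 3 replicas
print(desired_replicas(2, 80, 75))  # 3
```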
An example of (already rendered) autoscaling can be implemented with a YAML manifest like the one below:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: HPA-NAME
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: DEPLOYMENT-NAME
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 75
A side note! HPA will calculate both metrics and choose the one with the bigger desiredReplicas!
Addressing a comment I wrote under the question:

I think we misunderstood each other. It's perfectly okay to "scale up when CPU >= 90" but due to the logic behind the formula I don't think it will be possible to say "scale down when CPU <= 70". According to the formula it would be something in the midst of: scale up when CPU >= 90 and scale down when CPU <= 45.

This example could be misleading and not 100% true in all scenarios. Take a look at the following example:
HPA set to an averageUtilization of 75%. Quick calculations with some degree of approximation (the default tolerance for HPA is 0.1):
2 replicas:

- scale-up (by 1) should happen when currentMetricValue is >= 80%:

  x = ceil[2 * (80/75)], x = ceil[2.1(3)], x = 3

- scale-down (by 1) should happen when currentMetricValue is <= 33%:

  x = ceil[2 * (33/75)], x = ceil[0.88], x = 1
8 replicas:

- scale-up (by 1) should happen when currentMetricValue is >= 76%:

  x = ceil[8 * (76/75)], x = ceil[8.10(6)], x = 9

- scale-down (by 1) should happen when currentMetricValue is <= 64%:

  x = ceil[8 * (64/75)], x = ceil[6.82(6)], x = 7
Following this example, having 8 replicas with their currentMetricValue at 55 (desiredMetricValue set to 75) should scale-down to 6 replicas.
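The worked numbers above can be double-checked with a few lines of Python (a sanity sketch of the bare formula, leaving the tolerance aside):

```python
import math

# x = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)], target 75%
for replicas, metric, expected in [(2, 80, 3), (2, 33, 1),
                                   (8, 76, 9), (8, 64, 7), (8, 55, 6)]:
    x = math.ceil(replicas * (metric / 75))
    print(f"{replicas} replicas at {metric}% -> {x}")
    assert x == expected
```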
More information describing the decision making of HPA (for example why it doesn't scale) can be found by running:
$ kubectl describe hpa HPA-NAME
Name:                                                     nginx-scaler
Namespace:                                                default
Labels:                                                   <none>
Annotations:                                              <none>
CreationTimestamp:                                        Sun, 07 Mar 2021 22:48:58 +0100
Reference:                                                Deployment/nginx-scaling
Metrics:                                                  ( current / target )
  resource memory on pods (as a percentage of request):  5% (61903667200m) / 75%
  resource cpu on pods (as a percentage of request):     79% (199m) / 75%
Min replicas:                                             1
Max replicas:                                             10
Deployment pods:                                          5 current / 5 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type     Reason                   Age                   From                       Message
  ----     ------                   ----                  ----                       -------
  Warning  FailedGetResourceMetric  4m48s (x4 over 5m3s)  horizontal-pod-autoscaler  did not receive metrics for any ready pods
  Normal   SuccessfulRescale        103s                  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale        71s                   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale        71s                   horizontal-pod-autoscaler  New size: 5; reason: cpu resource utilization (percentage of request) above target
HPA scaling procedures can be modified by the changes introduced in Kubernetes version 1.18 and newer, where:

Support for configurable scaling behavior
Starting from v1.18 the v2beta2 API allows scaling behavior to be configured through the HPA behavior field. Behaviors are specified separately for scaling up and down in the scaleUp or scaleDown section under the behavior field. A stabilization window can be specified for both directions which prevents the flapping of the number of the replicas in the scaling target. Similarly, specifying scaling policies controls the rate of change of replicas while scaling.

Kubernetes.io: Docs: Tasks: Run application: Horizontal pod autoscale: Support for configurable scaling behavior
I'd reckon you could use the newly introduced fields like behavior and stabilizationWindowSeconds to tune your workload to your specific needs.
I also do recommend reaching out to the EKS documentation for more reference, support for metrics and examples.