How to implement Kubernetes horizontal pod autoscaling with scale up/down policies?

Kubernetes v1.19 in AWS EKS

I'm trying to implement horizontal pod autoscaling in my EKS cluster, and am trying to mimic what we do now with ECS. With ECS, we do something similar to the following:
I'm trying to use the HorizontalPodAutoscaler kind, and helm create gives me this template. (Note I modified it to suit my needs, but the metrics stanza remains.)
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "microserviceChart.Name" . }}
  labels:
    {{- include "microserviceChart.Name" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "microserviceChart.Name" . }}
  minReplicas: {{ include "microserviceChart.minReplicas" . }}
  maxReplicas: {{ include "microserviceChart.maxReplicas" . }}
  metrics:
    {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
    {{- end }}
    {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
    - type: Resource
      resource:
        name: memory
        targetAverageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
    {{- end }}
{{- end }}
However, how do I fit the scale up/down information shown in Horizontal Pod Autoscaling into the above template, to match the behavior that I want?
The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed metrics (like CPU or memory).
There is an official walkthrough focusing on HPA and its scaling.

The algorithm that scales the number of replicas is the following:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
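As a quick sketch, the formula can be written as a few lines of Python (the function name is mine, not part of any Kubernetes API; this ignores the HPA's tolerance and stabilization logic):

```python
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     desired_metric: float) -> int:
    """desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]"""
    return math.ceil(current_replicas * (current_metric / desired_metric))

# 2 replicas at 80% utilization with a 75% target -> 3 replicas
print(desired_replicas(2, 80, 75))  # 3
```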
An example of (already rendered) autoscaling can be implemented with a YAML manifest like the one below:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: HPA-NAME
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: DEPLOYMENT-NAME
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 75
A side note! HPA will calculate both metrics and choose the one with the bigger desiredReplicas!
Addressing a comment I wrote under the question:

I think we misunderstood each other. It's perfectly okay to "scale up when CPU >= 90" but due to the logic behind the formula I don't think it will be possible to say "scale down when CPU <= 70". According to the formula it would be something in the midst of: scale up when CPU >= 90 and scale down when CPU <= 45.

This example could be misleading and not 100% true in all scenarios. Take a look at the following example:
HPA set to an averageUtilization of 75%. Quick calculations with some degree of approximation (the default tolerance for HPA is 0.1):
2 replicas:

- scale-up (by 1) should happen when currentMetricValue is >= 80%:

  x = ceil[2 * (80/75)], x = ceil[2.1(3)], x = 3

- scale-down (by 1) should happen when currentMetricValue is <= 33%:

  x = ceil[2 * (33/75)], x = ceil[0.88], x = 1
8 replicas:

- scale-up (by 1) should happen when currentMetricValue is >= 76%:

  x = ceil[8 * (76/75)], x = ceil[8.10(6)], x = 9

- scale-down (by 1) should happen when currentMetricValue is <= 64%:

  x = ceil[8 * (64/75)], x = ceil[6.82(6)], x = 7
Following this example, having 8 replicas with their currentMetricValue at 55 (desiredMetricValue set to 75) should scale-down to 6 replicas.
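The worked numbers above can be double-checked with a few lines of Python (a sanity sketch of the bare formula, leaving the tolerance aside):

```python
import math

# x = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)], target 75%
for replicas, metric, expected in [(2, 80, 3), (2, 33, 1),
                                   (8, 76, 9), (8, 64, 7), (8, 55, 6)]:
    x = math.ceil(replicas * (metric / 75))
    print(f"{replicas} replicas at {metric}% -> {x}")
    assert x == expected
```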
More information describing the decision making of HPA (for example why it doesn't scale) can be found by running:
$ kubectl describe hpa HPA-NAME
Name:                                                     nginx-scaler
Namespace:                                                default
Labels:                                                   <none>
Annotations:                                              <none>
CreationTimestamp:                                        Sun, 07 Mar 2021 22:48:58 +0100
Reference:                                                Deployment/nginx-scaling
Metrics:                                                  ( current / target )
  resource memory on pods (as a percentage of request):  5% (61903667200m) / 75%
  resource cpu on pods (as a percentage of request):     79% (199m) / 75%
Min replicas:                                             1
Max replicas:                                             10
Deployment pods:                                          5 current / 5 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type     Reason                   Age                   From                       Message
  ----     ------                   ----                  ----                       -------
  Warning  FailedGetResourceMetric  4m48s (x4 over 5m3s)  horizontal-pod-autoscaler  did not receive metrics for any ready pods
  Normal   SuccessfulRescale        103s                  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale        71s                   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale        71s                   horizontal-pod-autoscaler  New size: 5; reason: cpu resource utilization (percentage of request) above target
HPA scaling procedures can be modified by the changes introduced in Kubernetes version 1.18 and newer, where:

Support for configurable scaling behavior
Starting from v1.18 the v2beta2 API allows scaling behavior to be configured through the HPA behavior field. Behaviors are specified separately for scaling up and down in the scaleUp or scaleDown section under the behavior field. A stabilization window can be specified for both directions which prevents the flapping of the number of the replicas in the scaling target. Similarly, specifying scaling policies controls the rate of change of replicas while scaling.

Kubernetes.io: Docs: Tasks: Run application: Horizontal pod autoscale: Support for configurable scaling behavior
I'd reckon you could use the newly introduced fields like behavior and stabilizationWindowSeconds to tune your workload to your specific needs.
I also do recommend reaching out to the EKS documentation for more reference, support for metrics and examples.