

How to auto scale on Kubernetes (GKE) with a pod that runs one per node and uses all available resources?

I think I have a pretty simple scenario: I need to auto-scale on Google Kubernetes Engine with a pod that runs one per node and uses all available remaining resources on the node.

"Remaining" resources means that there are certain basic pod services running on each node, such as logging and metrics, which need their requested resources. But everything left over should go to this particular pod, which is in fact the main web service for my cluster.

Also, these remaining resources should be available when the pod's container starts up, rather than being granted through vertical autoscaling with pod restarts. The reason is that the container has certain constraints that make restarts somewhat expensive: heavy disk caching, and licensing issues with some third-party software I use. So although the container/pod is certainly restartable, I'd like to avoid restarts except for rolling updates.

The cluster should scale nodes when CPU utilization gets too high (say, 70%). And I don't mean the requested CPU utilization of a node's pods, but rather the actual utilization, which is mainly determined by the web service's load.

How should I configure the cluster for this scenario? I've seen there's cluster autoscaling, vertical pod autoscaling, and horizontal pod autoscaling. There's also Deployment vs. DaemonSet, although it doesn't seem that DaemonSet is designed for pods that need to scale. So I think a Deployment may be necessary, but in a way that limits the web service to one pod per node (pod anti-affinity?).

How do I put all this together?

You could set up a Deployment with a resource request that equals a single node's allocatable resources (i.e., total resources minus the auxiliary services you mentioned). Then configure Horizontal Pod Autoscaling to scale up your deployment when CPU request utilization goes above 70%; this should do the trick, since in this case request utilization is essentially the same as total node resource utilization. However, if you do want to base scaling on actual node CPU utilization, there's always scaling by external metrics.
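A minimal sketch of what that could look like. All names, the image, and the request values are placeholders — you'd size the requests to your node pool's allocatable capacity minus the overhead of the system/logging DaemonSets:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-service          # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-service
  template:
    metadata:
      labels:
        app: web-service
    spec:
      containers:
      - name: web
        image: example.com/web-service:latest   # placeholder image
        resources:
          requests:
            cpu: "3500m"     # e.g. on a 4-vCPU node with ~500m used by system pods
            memory: "12Gi"   # likewise sized to the node's remaining memory
          # no limits: the container can consume all remaining node capacity
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-service
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out above 70% of requested CPU
```

Because the request covers essentially the whole node, 70% of request utilization approximates 70% of node utilization.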

Technically the Deployment's resource request doesn't have to exactly equal the remaining resources; it's enough for the request to be large enough to prevent two pods from being scheduled on the same node. As long as that's the case and there are no resource limits, the pod ends up consuming all the available node resources.
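If you want an explicit guarantee rather than relying on request sizing, a hard pod anti-affinity rule (which the question hints at) keeps the scheduler from ever co-locating two replicas. This fragment would go under `spec.template.spec` of the Deployment above; the `app: web-service` label is a placeholder matching whatever labels you actually use:

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: web-service
      topologyKey: kubernetes.io/hostname   # "one per node" granularity
```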

Finally, configure cluster autoscaling on your GKE node pool and you should be good to go. Vertical Pod Autoscaling doesn't really come into play here, as the pod resource request stays constant, and DaemonSets aren't applicable since, as mentioned, they can't be scaled via HPA.
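Enabling the cluster autoscaler on an existing node pool could look like this (cluster name, pool name, zone, and node limits are all placeholders):

```shell
gcloud container clusters update CLUSTER_NAME \
  --zone us-central1-a \
  --node-pool POOL_NAME \
  --enable-autoscaling --min-nodes 1 --max-nodes 10
```

With this in place, when the HPA adds a replica that no existing node can fit, the pending pod triggers the cluster autoscaler to provision a new node for it.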

