
HPA Implementation on single node kubernetes cluster

I am running a Kubernetes cluster on GKE. I am running a monolithic application and am now migrating to microservices, so both are running in parallel on the cluster.

The monolithic application is a simple Python app that uses around 200 MB of memory.

The K8s cluster is a simple single-node GKE cluster with 15 GB of memory and 4 vCPUs.

Now I am thinking of applying HPA to my microservices and the monolithic application.

On the single node I have also installed the Graylog stack (Elasticsearch, MongoDB, and Graylog pods), separated into the Devops namespace.

In another namespace, monitoring, Grafana, Prometheus, and Alertmanager are running.

An ingress controller and cert-manager are also running.

In the default namespace there is another Elasticsearch for application use, plus Redis and RabbitMQ. These are all single pods, either StatefulSets or Deployments with volumes.

Now I am thinking of applying HPA to the microservices and the application.
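Roughly what I have in mind for HPA (the Deployment name monolith and the thresholds here are just placeholders):

    # Scale the Deployment between 1 and 5 replicas, targeting about 70% CPU
    kubectl autoscale deployment monolith --cpu-percent=70 --min=1 --max=5

As far as I understand, the Deployment's containers need CPU resource requests set for this CPU-based target to work.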

Can someone suggest how to add a node pool on GKE and autoscale it? When I added a node to the pool and deleted the old node from the GCP console, the whole cluster restarted and the service went down for a while.

I am also thinking of using affinity/anti-affinity, so can someone suggest how to divide the infrastructure and implement HPA?

From the wording in your question, I suspect that you want to move your current workloads to the new pool without disruption.
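If the new pool does not exist yet, one way to add it is with the cluster autoscaler enabled, so GKE can add or remove nodes as your HPA-scaled pods need capacity. A sketch, with the cluster name, zone, machine type, and sizes as placeholders:

    # Add a second node pool with autoscaling between 1 and 3 nodes
    gcloud container node-pools create new-pool \
        --cluster=my-cluster \
        --zone=us-central1-a \
        --machine-type=n1-standard-4 \
        --num-nodes=1 \
        --enable-autoscaling --min-nodes=1 --max-nodes=3

Creating the pool by itself does not disrupt anything; the disruption comes when you drain the old node to move the workloads over.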

Since this action represents a voluntary disruption, you can start by defining a PodDisruptionBudget to control the number of pods that can be evicted in this voluntary disruption operation:

A PDB limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions.

The settings in the PDB depend on your application and your business needs; for a reference on the values to apply, you can check this.
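As an illustrative sketch only (the name, selector, and minimum below are assumptions to adapt to your own workloads), a PDB can be created imperatively like this:

    # Keep at least one pod matching app=monolith available during voluntary disruptions
    kubectl create poddisruptionbudget monolith-pdb \
        --selector=app=monolith \
        --min-available=1

Note that for a single-replica Deployment, --min-available=1 blocks the eviction entirely, so you may want to scale the Deployment to two replicas before draining, or use --max-unavailable instead.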

Following this, you can drain the nodes where your application is scheduled, since it will be "protected" by the budget, and drain uses the Eviction API instead of directly deleting the pods, which should make the evictions graceful.
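For example, assuming the old node sits in the cluster's original pool (the node, pool, cluster, and zone names below are placeholders):

    # Evict all pods from the old node; drain cordons it first, and because it
    # goes through the Eviction API the PodDisruptionBudget is honored
    kubectl drain <old-node-name> \
        --ignore-daemonsets \
        --delete-local-data   # only needed if pods use emptyDir volumes (their data is lost)

    # Once everything is rescheduled on the new pool, the old pool can be removed
    gcloud container node-pools delete default-pool \
        --cluster=my-cluster --zone=us-central1-a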

Regarding Affinity, I'm not sure how it fits into the aforementioned goal that you're trying to achieve. However, there is an answer on this particular point in the comments.
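If the intent behind affinity is simply to keep certain workloads on a specific pool (for example, the Graylog and monitoring stacks on one pool and the application on another), a simpler alternative to full affinity rules is a nodeSelector on the node-pool label that GKE sets on every node. A minimal sketch, assuming a Deployment named monolith and a pool named new-pool (both placeholders):

    # Pin the Deployment's pods to nodes of a given GKE node pool via nodeSelector
    kubectl patch deployment monolith -p \
        '{"spec":{"template":{"spec":{"nodeSelector":{"cloud.google.com/gke-nodepool":"new-pool"}}}}}'

Pod anti-affinity can be layered on top of this to spread replicas, but with a single node per pool it would not have much effect.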
