How to scale up a Kubernetes cluster with Terraform while avoiding downtime?

Here's the scenario: we have some applications running on a Kubernetes cluster on Azure. Currently our production cluster has one node pool with 3 nodes, which are fairly low on resources because we don't yet have many simultaneous active users or requests.

Our backend API app runs on three pods, one on each node. I was told I will need to increase resources soon (I'm thinking of more memory, or even replacing the nodes' VMs with better ones).

We set up everything Kubernetes-related using Terraform, and I know that replacing the VMs of the nodes is a destructive action, meaning the cluster will have to be replaced, and the new config and all deployments, services, etc. will have to be reapplied.

I am fairly new to the Kubernetes and Terraform world, meaning I can do the basics to get an application up and running, but I would like to learn the best practices for scaling and performance. How can I increase resources like this without any downtime for our services?

I'm wondering if having an extra node pool would help while I replace the VMs of the other one (I might be absolutely wrong here).

If there's any link, course, or tutorial you can point me to, it would be highly appreciated.

(Moved from comments)

In Azure, when you perform a cluster upgrade, there is a parameter called "max surge count", which is 1 by default. It means that when you update your cluster or node configuration, Azure first creates one extra node with the updated configuration, and only then safely drains and removes one of the old ones. More on this here: Azure - Node Surge Upgrade
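Since the setup in the question is managed with Terraform, here is a minimal sketch of where the surge setting lives in the azurerm provider, combined with the extra node pool idea from the question. All resource names, the pool name and the VM size are hypothetical, and it assumes an `azurerm_kubernetes_cluster.main` resource already exists in the configuration. As far as I know, surge settings apply to Kubernetes version and node image upgrades, while a different VM size generally means creating a new pool and moving workloads onto it, so treat this as a starting point rather than a tested recipe.

```hcl
# Sketch only: every name and size below is a placeholder, not from the question.
# Assumes azurerm_kubernetes_cluster.main is defined elsewhere in the same config.
resource "azurerm_kubernetes_cluster_node_pool" "apps_v2" {
  name                  = "appsv2"          # hypothetical pool name
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  vm_size               = "Standard_D4s_v3" # assumed larger VM size
  node_count            = 3

  upgrade_settings {
    # Number (or percentage, e.g. "33%") of extra nodes created
    # before old nodes are drained during an upgrade.
    max_surge = "1"
  }
}
```

Once the new pool is ready, the old nodes can be cordoned and drained with `kubectl drain` so the pods reschedule onto the new pool, and only then should the old pool be removed from the Terraform configuration.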

