
How do multiple replicas/pods help Kubernetes scale?

From what I understand, using multiple replicas as well as auto-scaling is supposed to help in the case that lots of people visit your website and make calls to services provided by your Kubernetes cluster.

How do the replicas help with scaling?

Aren't these extra pods all just running on the same computer with constant resources?
That would mean that they're all limited by a constant amount of CPU and memory.

Kubernetes has a couple of scaling mechanisms. The Horizontal Pod Autoscaler (HPA) is the most basic one, but not the only one.

With HPA you can spin up additional pods according to some metrics (most commonly CPU and memory). At some point your cluster nodes will no longer have enough free resources to satisfy the resource requests of your pods, and the new pods will sit in the Pending state because no node is available to schedule them on.
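To make the HPA step concrete, here is a minimal sketch of the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a small tolerance of 1. The function name `desired_replicas` is made up for illustration, and the 0.1 tolerance is assumed to match the usual default; the real controller also handles missing metrics, min/max bounds and stabilization windows, which this leaves out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Simplified HPA calculation (illustrative helper, not a Kubernetes API).

    current_metric and target_metric use the same unit, e.g. average CPU
    utilization in percent across the current pods.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# 4 pods averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))
# 4 pods averaging 30% CPU against a 60% target: scale in.
print(desired_replicas(4, 30, 60))
```

Note that this only decides how many pods should exist. Whether they actually get to run depends on the scheduler finding a node with enough free CPU and memory.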

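At that point the Cluster Autoscaler can kick in and, e.g., scale an AWS Auto Scaling group (ASG) or some other cloud node pool to add a new node to the cluster and make room for the pending pod(s). So the replicas are not confined to one machine with a fixed amount of CPU and memory: the pool of nodes itself can grow.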
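The interplay between Pending pods and new nodes can be sketched with a toy model. This is not the actual Cluster Autoscaler algorithm, just a first-fit simulation under simplifying assumptions: only CPU requests are considered (in millicores), all new nodes are identical, and the helper names `schedule` and `nodes_to_add` are hypothetical.

```python
def schedule(pod_cpu_requests, node_free_cpu):
    """Place pods first-fit on existing nodes; return the requests left Pending."""
    free = list(node_free_cpu)  # copy so the caller's list is not modified
    pending = []
    for req in pod_cpu_requests:
        for i, cap in enumerate(free):
            if req <= cap:
                free[i] -= req
                break
        else:
            # No node had enough free CPU: the pod stays Pending.
            pending.append(req)
    return pending


def nodes_to_add(pending, new_node_cpu):
    """Count identical new nodes needed to place the pending pods (first-fit)."""
    added = []  # free CPU remaining on each newly added node
    for req in pending:
        if req > new_node_cpu:
            raise ValueError("pod request exceeds the size of a new node")
        for i, cap in enumerate(added):
            if req <= cap:
                added[i] -= req
                break
        else:
            added.append(new_node_cpu - req)
    return len(added)


# Two nodes with 2000m free CPU each, ten replicas requesting 500m each.
pending = schedule([500] * 10, [2000, 2000])
print(len(pending))                 # pods that could not be scheduled
print(nodes_to_add(pending, 2000))  # extra nodes that would make room
```

In this example eight replicas fit on the two existing nodes, two stay Pending, and one additional node of the same size is enough to run them.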


 