
Scaling Docker containers in the real world

I have a few basic questions on scaling Docker containers:

I have 5 different apps. They are not connected to each other. Before having containers I would run 1 app per VM and scale them up and down individually in the cloud.

Now with containers I get isolation on top of a VM, so I can potentially run one host with 5 Docker containers, where each app is isolated in its own container.

As long as I have enough resources on my host, I can scale those containers up and down individually as my traffic grows or shrinks. e.g. I have 3 containers running app 1, but only 1 container running app 2.

At peak times app 3 gets a lot of traffic and I need to launch a 2nd host which runs only containers for app 3.

My first question is whether the above makes sense, or whether I have misunderstood something. My second question is what technology is currently available to get this all done in an automated way. I need a load balancer and an auto-scaling group capable of the above scenario, without me having to do manual interventions.

I looked into AWS ECS and am not quite sure if it can satisfy my needs as outlined above.

Does anyone know how to achieve this, or is there a better way of managing and scaling my 5 apps which I am missing?

UPDATE:

Via Twitter I have been pointed to Kubernetes, and specifically to the docs on the Horizontal Pod Autoscaler.

Might be useful for others as well. I will update this question as I learn more.
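For reference, the Horizontal Pod Autoscaler's core scaling rule is simple enough to sketch. This is a hedged approximation of the formula in the Kubernetes docs (desired = ceil(currentReplicas × currentMetric / targetMetric)); the real controller adds tolerances and stabilization windows on top of it:

```python
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Replica count the HPA would aim for, per the documented formula,
    clamped to the configured min/max bounds."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# App 3 at peak: 3 pods averaging 90% CPU against a 50% target
print(desired_replicas(3, 90.0, 50.0))  # 6, i.e. ceil(3 * 90/50)
```

When the metric drops back below target, the same formula scales the replica count down again, which matches the "grow and shrink per app" behavior described above.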

There are several options, but none that I know of does it all. You will need two things: autoscaling hosts according to signals, then autoscaling containers on those hosts.

The following are solutions to deploy and scale containers on the hosts (though not necessarily to auto-scale them):

Kubernetes is an orchestration tool which lets you schedule and (with the optional autoscaler) autoscale pods (groups of containers) in the cluster. It makes sure your containers are running somewhere if a host fails. Google Container Engine (GKE) offers this as a service; however, I am not sure it has the same ability to autoscale the number of VMs in the cluster that AWS does.

Mesos: somewhat similar to Kubernetes, but not dedicated to running containers.

Docker Swarm: the Docker multi-host deployment solution. It lets you control many hosts as if they were a single Docker host. I don't believe it has any kind of 'autoscaling' capability, and I don't believe it takes care of making sure containers are always running somewhere: it's basically Docker for a cluster.

[EDIT] Docker supports restarting failed containers with the restart=always option. Also, as of Docker 1.11, Swarm is a mode of the Docker daemon and supports rescheduling containers on node failure: it will restart containers on a different node if a node is no longer available.

Docker 1.11+ is becoming a lot like Kubernetes in terms of functionality. It has some nice features (like TLS between nodes by default), but it still lacks things like static IPs and storage provisioning.
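As a sketch of what that looks like in practice, here is a hypothetical stack file for swarm mode (image names are placeholders; the deploy/restart_policy keys require the newer compose file format and a Docker version with swarm mode enabled):

```yaml
# Deployed with: docker stack deploy -c stack.yml myapps
version: "3"
services:
  app1:
    image: myorg/app1:latest   # placeholder image name
    deploy:
      replicas: 3              # e.g. 3 containers for app 1
      restart_policy:
        condition: on-failure  # Swarm restarts/reschedules failed tasks
  app2:
    image: myorg/app2:latest   # placeholder image name
    deploy:
      replicas: 1              # only 1 container for app 2
```

Scaling a service up or down is then a matter of changing the replica count, e.g. `docker service scale myapps_app1=5`.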

None of these solutions will autoscale the number of hosts for you, but they can scale the number of containers on the hosts.

For autoscaling hosts, solutions are specific to your cloud provider, so these are dedicated solutions. The key part for you is to integrate the two: AWS allows deployment of Kubernetes on CoreOS; I don't think they offer this as a service, so you would need to deploy your own CoreOS cluster with Kubernetes.

Now my personal opinion (and disclaimer)

I have mostly used Kubernetes on GKE and on bare metal, as well as Swarm about 6 months ago, and I run an infrastructure with ~35 services on GKE:

Frankly, GKE with Kubernetes as a Service offers most of what you want, but it's not AWS. Scaling hosts is still a bit tricky and will require some work.

Setting up your own Kubernetes or Mesos on AWS or bare metal is very feasible, but there is quite a learning curve: it all depends on how strongly you feel about being on AWS and whether you are willing to spend the time.

Swarm is probably the easiest to get working with, but also more limited. Still, a homebuilt script can do the core job: use the AWS APIs to scale hosts, and Swarm to deploy. The availability guarantee, though, would require you to monitor and take care of re-launching containers if a node fails.
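A minimal sketch of the decision logic such a homebuilt script might use, assuming a uniform CPU reservation per container. The actual AWS call (setting an Auto Scaling group's desired capacity) and the Swarm deploy are deliberately left out, since those details depend on your setup:

```python
import math

def hosts_needed(containers_per_app: dict, cpu_per_container: float,
                 host_cpu_capacity: float) -> int:
    """Number of hosts to request from the AWS API so that the total
    CPU reserved by all app containers fits on the fleet."""
    total_cpu = sum(containers_per_app.values()) * cpu_per_container
    return max(1, math.ceil(total_cpu / host_cpu_capacity))

# 3 containers for app1, 1 for app2, 4 for app3 at peak,
# each reserving 1 vCPU, on 4-vCPU hosts:
print(hosts_needed({"app1": 3, "app2": 1, "app3": 4}, 1.0, 4.0))  # 2
```

A real script would run this periodically, compare the result with the current group size, and only call the scaling API when they differ.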

Other than that, there are also container hosting providers that may do the job for you:

I would take a look at Tutum (recently acquired by Docker, actually). It ties into CI and I believe it has autoscaling capabilities.

https://www.tutum.co/

UPDATE: this is supported by AWS ECS with Task Placement Constraints.

  1. Have your ECS cluster served by two auto-scaling groups (ASGs).
  2. In the first ASG, set min, max and desired size all to 1.

     Tag this instance with a custom attribute ALLOW_ALL_APPS = TRUE. Do this in the user data script.

  3. In the second ASG, set min and desired size to 0 and max size to 1 (I'm assuming you only want 2 instances).

     Tag this instance with the custom attribute ALLOW_ALL_APPS = FALSE. Again in the user data script.

  4. The scale-up alarm for the second ASG will be determined by load on the first ASG.

     If you know when peak times are for app 3, you can bump it up preemptively with a scheduled scaling action.

  5. Scale down the second ASG when its load drops enough that the first ASG can handle the traffic on its own.

  6. In the service definitions for apps 1, 2, 4 and 5, add placement constraints restricting them to run only on nodes where ALLOW_ALL_APPS = TRUE.

  7. In the service definition for app 3, there are no placement constraints.

  8. Service auto scaling is configured for all apps based on container or application metrics.
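To make step 6 concrete, here is a sketch of the relevant fragment of a service definition (the attribute name follows the answer above; field names are as I understand the ECS task placement API, so verify against the current docs):

```json
{
  "serviceName": "app1",
  "placementConstraints": [
    {
      "type": "memberOf",
      "expression": "attribute:ALLOW_ALL_APPS == TRUE"
    }
  ]
}
```

On the instance side, the user data script from step 2 would register the custom attribute with something like `echo 'ECS_INSTANCE_ATTRIBUTES={"ALLOW_ALL_APPS": "TRUE"}' >> /etc/ecs/ecs.config` before the ECS agent starts.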
