Multiple threads inside docker container

I need to spawn N threads inside a docker container. I am going to receive a list of elements, then divide it into chunks, and each thread will process one chunk.

So I am using a docker container with one process and N threads. Is this good practice in docker? I think so, because we have, e.g., the Apache web server, which handles connections by spawning threads.

Or would it be better to spawn N containers, one for each chunk? If so, what is the correct way to do this?

A container as such has nothing to do with the computation you need to perform. The question you are really asking is whether you should have multiple processes doing the processing, or multiple threads spawned by the same process.

A container is just a platform for running your application in the environment you want. Period. That means you would be running a process inside a container to run your business logic. Multiple containers simply means multiple processes, and, as is commonly advised, you should go for multiple threads rather than multiple processes, since spawning a new process (in your case, a new container) would eat up more resources and would also require more memory. So it is better to have just one container that spawns multiple threads to do the job for you.
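To make the "one container, one process, N threads" setup concrete, here is a minimal Python sketch of the chunking scheme described in the question. The names `split_into_chunks`, `process_chunk`, and `process_all` are hypothetical, and `process_chunk` just sums numbers as a stand-in for the real per-chunk work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal chunks."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional item each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when items are fewer than chunks
            chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Hypothetical placeholder for the real per-chunk work."""
    return sum(chunk)


def process_all(items, n_threads):
    """Hand each chunk to a worker thread; results keep chunk order."""
    chunks = split_into_chunks(items, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(process_chunk, chunks))
```

For instance, `process_all(list(range(10)), 3)` splits the list into three chunks and returns one result per chunk. Note that in standard CPython builds the global interpreter lock keeps CPU-bound Python code from running in parallel across threads, so for heavy computation a process pool inside the same container may be the better fit.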

However, it also depends on the configuration of the underlying machine on which the container is started. If it makes sense to spawn multiple containers, each running multiple threads, because of the multicore capabilities of the underlying hardware, you should do that as well.
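One way to take the machine configuration into account is to size the worker pool from the CPUs the process is actually allowed to use. The sketch below is only an illustration: `available_cpus` and `choose_worker_count` are hypothetical helper names, and the rule "no more workers than chunks or usable CPUs" is an assumption, not a general recommendation.

```python
import os


def available_cpus():
    """Best-effort count of the CPUs this process may run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: reflects CPU pinning such as `docker run --cpuset-cpus`.
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity.
    return os.cpu_count() or 1


def choose_worker_count(n_chunks, max_workers=None):
    """Cap the number of workers by the chunk count and the usable CPUs."""
    limit = max_workers if max_workers is not None else available_cpus()
    return max(1, min(n_chunks, limit))
```

A CPU time quota set on the container may not be reflected in these counts, so it can be worth passing `max_workers` explicitly when the container runs under such a limit.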

Short answer:

Run your program as a single docker container. Think of a Docker container as a lightweight isolated environment, akin to a virtual environment, where you can run a program/service. This service can run multiple threads, all launched from the parent program; it is still one service running in a single Docker container.

Explanation:

Let's assume you have a program that spawns threads to do some work. This program might use a thread pool to do some computation on a set of chunks, or it could be a web server like Apache. It could even be some Python code that instantiates a process pool to do the number crunching. In all these cases, all the threads and processes belong to a master process that can be thought of as a single program or service. This single program is triggered through a single user command, the command that you specify in the Dockerfile ENTRYPOINT.

For example, you can run an Apache server container using the official Apache image (httpd) on Docker Hub:

docker run -dit --name my-apache-app -v "$PWD":/usr/local/apache2/htdocs/ httpd:2.4

This will run the Apache web server as a single container, irrespective of how many threads it executes. The operator can easily refer to that container by name to stop, restart, or delete it using the docker commands. This is also more convenient, as we don't need to worry about mounting volumes, opening ports, and linking multitudes of containers so that they can communicate with each other.

So the main point is that you want to spawn one container per service instance. If you wanted to launch duplicate instances of the parent process, for example to run Apache on two machines as part of a load-balanced configuration, then you would run two containers, one on each host.

As an aside, if you had a use case where you needed to run diverse jobs in a batch system, where each job required specific libraries to be installed, then that type of use case would benefit from the environment isolation that one would achieve by running different containers. But this is not what you asked. Your question specifically mentioned a web server spawning threads, and a process using threads to do work on chunks, and for those cases you spawn a single container for the service/program.
