
How to add new user to docker image when running distributed airflow architecture using docker-compose

(THE ORIGINAL QUESTION WAS EDITED TO MAKE IT CLEARER)

  1. SOLUTION AT THE END OF THE QUESTION
  2. ANOTHER SOLUTION IN THE ANSWER

The goal and the setup

The main goal is to run container-based processing (using the DockerOperator) when the Airflow Celery worker is also running inside a Docker container. At the moment, I'm testing the setup on one machine, but in the end I'll run the Celery worker containers on separate machines in the same network, sharing some of the Airflow-specific mount points (dags, logs, plugins), user IDs, etc.

I'm launching the whole setup from a docker-compose.yml where I set AIRFLOW_UID to match my UID on the host machine and AIRFLOW_GID to 0, as suggested in the Airflow documentation. On the host, my UID belongs to the docker group, but it doesn't belong to group 0. The /var/run/docker.sock is mounted into the containers.

TEST 1

I followed the example presented here: https://towardsdatascience.com/using-apache-airflow-dockeroperator-with-docker-compose-57d0217c8219 , using the above-mentioned setup with the official Airflow image 2.1.4 and the DockerOperator. The task run fails because the default user doesn't have the needed permissions on /var/run/docker.sock . (I still need to check whether adding the user to group 0 on the host would solve the issue, as pointed out by @JarekPotiuk in his comment. The problem is that group 0 is the root group, and most likely I won't get permission to add the user to it.)

[2021-09-27 05:38:30,863] {taskinstance.py:1463} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.6/http/client.py", line 1291, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1337, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1286, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1046, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.6/http/client.py", line 984, in send
    self.connect()
  File "/home/airflow/.local/lib/python3.6/site-packages/docker/transport/unixconn.py", line 30, in connect
    sock.connect(self.unix_socket)
PermissionError: [Errno 13] Permission denied
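
The traceback ends in a plain sock.connect() on the Unix socket, so the failure can be reproduced without Airflow. The following is a minimal sketch, assuming it is run inside the worker container as the same user that runs the task. The function name probe_docker_socket is hypothetical, and only the Python standard library is used.

```python
import os
import socket


def probe_docker_socket(path="/var/run/docker.sock"):
    """Try to connect to a Unix socket and report the outcome as a string."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return "ok"
    except PermissionError:
        # Same error as in the traceback: [Errno 13] Permission denied.
        return "permission denied"
    except (FileNotFoundError, ConnectionRefusedError):
        # The socket is not mounted, or nothing is listening on it.
        return "unavailable"
    finally:
        sock.close()


if __name__ == "__main__":
    # The numeric IDs are what gets compared against the socket's owner and group.
    print("uid:", os.getuid(), "gid:", os.getgid(), "groups:", os.getgroups())
    print("docker.sock:", probe_docker_socket())
```

If this prints "permission denied", the problem is the UID/GID of the container process, not the DockerOperator configuration.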

TEST 2

I created a custom image from the official image by adding 'newuser' with a UID that matches my UID on the host and a 'docker' group whose GID matches the one on the host.

However, when I launch the setup, the user I created in the image build phase is not there, and I can't understand why. There is a 'default' user with uid=1234 and gid=0. This default user is also created if I use the official image and just define AIRFLOW_UID in the docker-compose.yml.

Dockerfile:

FROM apache/airflow:2.1.0

USER root
RUN useradd newuser -u 1234 -g 0

RUN groupadd --gid 986 docker \
    && usermod -aG docker newuser
USER newuser

Also, if I don't create the newuser and just add the airflow user to the docker group, then the airflow user is added to the docker group as expected.

Does docker-compose overwrite the users created at the image build phase? What would be the best way to solve this issue?

SOLUTION

This solution makes it possible to use the DockerOperator from the Airflow container to launch Docker containers on the host.

You can choose either the default UID=50000 and GID=0 or a custom UID and GID=0. Create a docker group on the host and add the chosen UID to it. Then add the airflow user inside the container to the docker group. You can do this by adding the group in the compose file:

group_add:
  - <docker GID>

In addition, you have to mount the docker.sock file into the container:

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

and add the variable AIRFLOW__CORE__ENABLE_XCOM_PICKLING=True
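
The <docker GID> placeholder is the numeric ID of the group that owns the socket on the host. A small sketch for looking it up, assuming it is run on the host and that no user-namespace remapping is in use. The helper names socket_group_id and group_add_fragment are hypothetical.

```python
import os


def socket_group_id(path="/var/run/docker.sock"):
    """Return the numeric GID that owns the file at `path`."""
    return os.stat(path).st_gid


def group_add_fragment(gid):
    """Render the compose `group_add` entry for the given GID."""
    return f"group_add:\n  - {gid}"


if __name__ == "__main__":
    sock_path = "/var/run/docker.sock"
    if os.path.exists(sock_path):
        print(group_add_fragment(socket_group_id(sock_path)))
    else:
        print(f"{sock_path} not found")
```

Using the numeric GID rather than the group name avoids depending on a 'docker' group existing inside the image.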

I'm launching the whole setup from a docker-compose.yml where I set AIRFLOW_UID=1234 and AIRFLOW_GID=0. I'm using a docker image based on the official Airflow image, with the addition that I have created 'newuser' with gid=1234 and a 'docker' group with a gid that matches the one on the host.

You should not do it at all. The user will be created automatically by Airflow's image entrypoint when you use a different UID than the default; see https://airflow.apache.org/docs/docker-stack/entrypoint.html#allowing-arbitrary-user-to-run-the-container . In fact, everything you want to do should be possible without having to extend the Airflow image.

What you need to do is create the user that you want to run as inside the container ON THE HOST, not in the container. And it should belong to the docker group ON THE HOST, not in the container.

Docker uses the same kernel and users that are defined on the host system, so when you run something as a user in the container, it runs with the "host" user's privileges. So if you map your docker socket into the container, it will be able to use the socket and run docker commands, because it will have the right permissions on the host.
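
To make the point about shared users concrete: only numeric IDs are compared, so a name such as 'newuser' or 'docker' inside the image is irrelevant. The following is a simplified model of the classic Unix owner/group/other check, not the kernel's actual implementation (it ignores ACLs and capabilities). The socket mode 0o660 and owner root are assumptions for illustration; UID 1234 and GID 986 are the values used in the question.

```python
def can_read_write(mode, owner_uid, owner_gid, uid, gids):
    """Simplified Unix permission check for read+write access.

    Exactly one permission class applies: owner, then group, then other.
    """
    if uid == 0:
        return True  # root bypasses the read/write permission bits
    if uid == owner_uid:
        bits = (mode >> 6) & 0o7
    elif owner_gid in gids:
        bits = (mode >> 3) & 0o7
    else:
        bits = mode & 0o7
    return (bits & 0o6) == 0o6


# Assumed socket ownership: root:986 with mode 0o660 (srw-rw----).
SOCK = dict(mode=0o660, owner_uid=0, owner_gid=986)

# UID 1234 with only group 0 falls into "other", which has no access.
print(can_read_write(**SOCK, uid=1234, gids=[0]))       # False
# With GID 986 as a supplementary group, the group class applies and grants rw.
print(can_read_write(**SOCK, uid=1234, gids=[0, 986]))  # True
```

This is why membership in the docker group has to be expressed as a numeric GID that matches the host, whichever way it is granted.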

Therefore (if you run docker-compose as a regular user who already belongs to the docker group), the best way is the one suggested in the quick-start, i.e. run Airflow as the "host" user you are logged in as: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html

This also makes all the files created in the container belong to the "logged in user" (if they are created in directories mounted into the container, such as the logs directory).

But if your goal is to use it in an "unattended" environment, then creating a new user on your host and adding that user to both the 0 and docker groups should likely solve the problem.

As a complement to the great answer from @JarekPotiuk: if, as indicated in your comments, the problem is related to permission issues when using the DockerOperator, you can try the following approach.

The idea is to include in the Airflow docker-compose.yml file a service based on the bobrik/socat image. Something like:

docker-proxy:
  image: bobrik/socat
  command: "TCP4-LISTEN:2375,fork,reuseaddr UNIX-CONNECT:/var/run/docker.sock"
  ports:
    - 2375:2375
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
  restart: always

This effectively creates a bridge to your host's docker daemon and allows you to run your containers using the DockerOperator without permission issues, by providing an appropriate value for the docker_url argument:

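from airflow.providers.docker.operators.docker import DockerOperator

docker_based_task = DockerOperator(
    task_id="a_docker_based_one",
    # Reach the host daemon through the docker-proxy service instead of the Unix socket.
    docker_url="tcp://docker-proxy:2375",
    # ...
)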
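
Before pointing the DockerOperator at the proxy, it can help to confirm that the daemon answers through it. A minimal sketch using only the standard library: it calls the Docker Engine API's /_ping endpoint over plain HTTP. The host name docker-proxy and port 2375 come from the compose service above and are assumed to resolve only from containers on the same compose network; the function name docker_api_reachable is hypothetical.

```python
import http.client


def docker_api_reachable(host="docker-proxy", port=2375, timeout=3.0):
    """Return True if GET /_ping on the Docker Engine API answers with HTTP 200."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/_ping")
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Covers DNS failures, refused connections, timeouts and malformed replies.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("docker-proxy reachable:", docker_api_reachable())
```

Run from inside the worker container, a False result points at networking or the proxy service rather than at socket permissions.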
