How to execute a docker command from inside a docker container on a Windows host

The situation is as follows: I have successfully developed a super simple ETL process locally that pulls data from some remote location and then writes the unprocessed data into a MongoDB container on my local Windows machine. Now I want to schedule this process with Apache-Airflow, using the DockerOperator for every task, i.e. I want to build a docker image of my source code and then execute the source code inside that image via the DockerOperator. Since I am working on a Windows machine, I can only use Airflow from inside a docker container to actually trigger the Airflow DAGs. Both the Airflow container (called webserver below) and the Mongo container (called mongo below) are specified in the docker-compose.yml file that you can see at the end.
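For illustration, such a DAG might look roughly like the following sketch (the DAG id, the image name my-etl:latest, and the script extract.py are hypothetical; the import path is the Airflow 1.10.x one):

from datetime import datetime

from airflow import DAG
from airflow.operators.docker_operator import DockerOperator

with DAG(dag_id="simple_etl",
         start_date=datetime(2020, 1, 1),
         schedule_interval="@daily",
         catchup=False) as dag:

    extract = DockerOperator(
        task_id="extract",
        image="my-etl:latest",                    # hypothetical image built from the ETL source code
        command="python extract.py",              # hypothetical entry point inside the sibling container
        docker_url="unix://var/run/docker.sock",  # talk to the daemon through the mounted socket
        auto_remove=True,                         # delete the sibling container once the task finishes
    )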

As far as I understand it, every time my simple ETL DAG is triggered and the DockerOperator is executed, the webserver container creates a new "sibling" container for each ETL task, executes the source code in that new container, and deletes the new container again once the task is finished. If my understanding is correct so far, the webserver container needs to be able to execute docker commands such as docker build ... in order to create these sibling containers.
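Conceptually, each such task boils down to the webserver container asking the host's docker daemon to run something like this (image name and command hypothetical, matching the sketch above):

docker run --rm my-etl:latest python extract.py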

To test this theory, I added the volumes /var/run/docker.sock:/var/run/docker.sock and /usr/bin/docker:/usr/bin/docker to the definition of the webserver container in the docker-compose.yml file, so that the webserver container can use the docker daemon of my host (Windows) machine. Then I started the webserver and mongo containers with docker-compose up -d, entered the webserver container with docker exec -it <name_of_webserver_container> /bin/bash, and tried the simple docker command docker ps --all. However, the output of this command was bash: docker: command not found, so it seems that Docker is not properly installed in the webserver container. How can I make sure that Docker is installed in the webserver container so that the sibling containers can be created?

Below you can find the relevant parts of the docker-compose.yml file and of the Dockerfile used for the webserver container.

The docker-compose.yml file, located in the project root directory:

webserver:
        build: ./docker-airflow
        restart: always
        privileged: true
        depends_on:
            - postgres  # some other service I cut out from this post
            - mongo
            - mongo-express  # some other service I cut out from this post
        environment:
            - LOAD_EX=n
            - EXECUTOR=Local
            - POSTGRES_USER=some_user
            - POSTGRES_PASSWORD=some_pw
            - POSTGRES_DB=airflowdb
        volumes:
            # DAG folder
            - ./docker-airflow/dags:/usr/local/airflow/dags
            # Add path for external python modules
            - ./src:/home/python_modules
            # Add path for airflow workspace folder
            - ./docker-airflow/workdir:/home/workdir
            # Mount the docker socket from the host (currently my laptop) into the webserver container
            - //var/run/docker.sock:/var/run/docker.sock  # the double // is necessary on a Windows host
        ports:
            # Change port to 8081 to avoid Jupyter conflicts
            - 8081:8080
        command: webserver
        healthcheck:
            test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
            interval: 30s
            timeout: 30s
            retries: 3
        networks:
            - mynet

The Dockerfile for the webserver container, located in the docker-airflow folder:

FROM puckel/docker-airflow:1.10.4

# Add the external modules folder and the DAG folder to the PYTHONPATH
ENV PYTHONPATH "${PYTHONPATH}:/home/python_modules:/usr/local/airflow/dags"

# Install the optional packages and change the user to airflow again
COPY requirements.txt requirements.txt
USER root
RUN pip install -r requirements.txt

# Install docker inside the webserver container
RUN pip install -U pip && pip install docker
ENV SHARE_DIR /usr/local/share

# Install simple text editor for debugging
RUN ["apt-get", "update"]
RUN ["apt-get", "-y", "install", "vim"]

USER airflow

EDIT/UPDATE

After incorporating Noe's suggestions, I changed the Dockerfile of the webserver container to the following:

FROM puckel/docker-airflow:1.10.4

# Add the external modules folder and the DAG folder to the PYTHONPATH
ENV PYTHONPATH "${PYTHONPATH}:/home/python_modules:/usr/local/airflow/dags"

# Install the optional packages and change the user to airflow again
COPY requirements.txt requirements.txt
USER root
RUN pip install -r requirements.txt

# Install docker inside the webserver container
RUN curl -sSL https://get.docker.com/ | sh
ENV SHARE_DIR /usr/local/share

# Install simple text editor for debugging
RUN ["apt-get", "update"]
RUN ["apt-get", "-y", "install", "vim"]

USER airflow

I also added docker==4.1.0 to the requirements.txt file (referenced in the Dockerfile above), which contains all the packages to be installed in the webserver container.

However, now, when I start the services with docker-compose up --build -d, enter the webserver container with docker exec -it <name_of_webserver_container> /bin/bash, and type the simple docker command docker ps --all, I get the following output:

Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/containers/json?all=1: dial unix /var/run/docker.sock: connect: permission denied

So it seems that I still need to grant some rights/privileges, which confuses me, because in the webserver section of the docker-compose.yml file I have already set privileged: true. Does anybody know the cause of this problem?
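For reference, one way to diagnose such a permission error is to compare the ownership of the mounted socket with the groups of the user inside the container. A sketch, run as root inside the container (the GID 999 is only an example; the actual value varies by host):

ls -l /var/run/docker.sock    # shows the UID/GID that owns the socket
id airflow                    # shows the groups the airflow user belongs to

# common workaround: give the airflow user a group matching the socket's GID
groupadd -g 999 docker
usermod -aG docker airflow

As the update below shows, simply running as root (by dropping USER airflow) resolves it as well.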

EDIT/UPDATE/ANSWER

After removing USER airflow from the Dockerfile of the webserver container, I can now execute docker commands inside the webserver container!

What you are trying to do is called docker in docker.

You need to do these things:

  • Install the docker client in the container

Add RUN curl -sSL https://get.docker.com/ | sh

  • Mount the docker socket

You did this correctly by mounting //var/run/docker.sock:/var/run/docker.sock

  • Run your container in privileged mode

Add privileged: true to your container

In your specific case, you need to do the following (a consolidated Dockerfile sketch follows this list):

  • Remove RUN pip install -U pip && pip install docker because we have already installed docker
  • Remove USER airflow; you need to use the default user or the root user
  • Add docker==4.1.0 to requirements.txt
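Putting these points together, the webserver Dockerfile from the question would end up roughly like this (a sketch; the base image and paths are unchanged from the question):

FROM puckel/docker-airflow:1.10.4

ENV PYTHONPATH "${PYTHONPATH}:/home/python_modules:/usr/local/airflow/dags"

# requirements.txt now also lists docker==4.1.0 (the Python SDK)
COPY requirements.txt requirements.txt
USER root
RUN pip install -r requirements.txt

# Install the docker client (replaces the pip install docker line)
RUN curl -sSL https://get.docker.com/ | sh
ENV SHARE_DIR /usr/local/share

# Note: no trailing USER airflow, so docker commands run as root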

@Noe's approach also worked for me. I additionally had to upgrade my Ubuntu WSL distribution from V1 to V2 with wsl --set-version Ubuntu-20.04 2.
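For reference, the current WSL version of each distribution can be checked before converting (standard WSL commands):

wsl -l -v                         # list distributions and their WSL versions
wsl --set-version Ubuntu-20.04 2  # convert the distribution to WSL 2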

Here is a Dockerfile + docker-compose setup for Airflow 2.1.1.

Dockerfile

FROM apache/airflow:2.1.1

ENV PYTHONPATH "${PYTHONPATH}:/home/python_modules:/opt/airflow/dags"

COPY requirements.txt requirements.txt
USER root
RUN pip install -r requirements.txt

RUN curl -sSL https://get.docker.com/ | sh
ENV SHARE_DIR /usr/local/share

docker-compose.yml

---
version: '3'
x-airflow-common:
  &airflow-common
  build: .
  # image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.1}
  #
  # group_add:
  #   - 0
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    # Need as env var otherwise container crashes while exiting. Airflow Issue # 13487
    AIRFLOW__CORE__ENABLE_XCOM_PICKLING: 'true'
    AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: 5 # Just to have a fast load in the front-end. Do not use in prod w/ config
    # Enable the Airflow API
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
    # _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-snowflake-connector-python==2.3.10 boto3==1.15.18 botocore==1.18.18 paramiko==2.6.0 docker==5.0.0}
    # PYTHONPATH: "${PYTHONPATH}:/home/python_modules:/opt/airflow/dags"
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    # Pass the Docker Daemon as a volume to allow the webserver containers to start docker images
    # Windows requires a leading double slash (//) to address the Docker socket on the host
    - //var/run/docker.sock:/var/run/docker.sock
  #user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
  #user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-0}"
  depends_on:
    redis:
      condition: service_healthy
    postgres:
      condition: service_healthy

services:
  postgres:
    image: postgres:13
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "airflow"]
      interval: 5s
      retries: 5
    restart: always

  redis:
    image: redis:latest
    ports:
      - 6379:6379
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 30s
      retries: 50
    restart: always

  airflow-webserver:
    <<: *airflow-common
    # Give extended privileges to the container
    command: webserver
    ports:
      - 8080:8080
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    healthcheck:
      test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"']
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

  airflow-worker:
    <<: *airflow-common
    # Give extended privileges to the container
    command: celery worker
    healthcheck:
      test:
        - "CMD-SHELL"
        - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

  # Runs airflow-db-init and airflow-db-upgrade
  # Creates a new user airflow/airflow
  airflow-init:
    <<: *airflow-common
    command: version
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}

  flower:
    <<: *airflow-common
    command: celery flower
    ports:
      - 5555:5555
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

volumes:
  postgres-db-volume:
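A usage note on this compose file (the standard workflow for this setup): run the init service once before starting everything else, so that the metadata database is migrated and the airflow/airflow user exists:

docker-compose up airflow-init   # one-off: runs the db upgrade and creates the user
docker-compose up -d             # then start webserver, scheduler, worker, flower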

I just mounted the docker binary itself into the container; not sure whether that is recommended:

  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - /usr/bin/docker:/usr/bin/docker

In this case I don't need to build a new image just to install the docker CLI.
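A quick sanity check for this approach (the container name webserver_1 is hypothetical): if both the binary and the socket are mounted correctly, docker version reports both the client and the server:

docker exec -it webserver_1 docker version

Note that this relies on the host's docker binary being runnable inside the container (compatible shared libraries), which is why installing the CLI into the image is the more portable option.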
