
How to install packages in Airflow (docker-compose)?

This question is very similar to an existing one. The only difference is that I run Airflow in Docker.

Step by step:

  1. Put docker-compose.yaml in the PyCharm project
  2. Put requirements.txt in the PyCharm project
  3. Run docker-compose up
  4. Run a DAG and receive a ModuleNotFoundError

I want to start Airflow using docker-compose with the dependencies from requirements.txt. These dependencies should be available to the PyCharm interpreter and during DAG execution.

Is there a solution that doesn't require rebuilding the image?

I got the answer in the Airflow GitHub discussions. The only way to install extra Python packages at the moment is to build your own image. I will try to explain this solution in more detail.

Step 1. Put the Dockerfile, docker-compose.yaml and requirements.txt files in the project directory

Step 2. Paste the code below into the Dockerfile:

FROM apache/airflow:2.1.0
COPY requirements.txt .
RUN pip install -r requirements.txt

Step 3. Paste into docker-compose.yaml the code that you can find in the official documentation. Replace the line image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.0} with build: . as shown here:

---
version: '3'
x-airflow-common:
  &airflow-common
  build: .
  # REPLACED # image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.0}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
  user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
  depends_on:
    redis:
      condition: service_healthy
    postgres:
      condition: service_healthy

# ...

Your project directory at this moment should look like this:

airflow-project
|docker-compose.yaml
|Dockerfile
|requirements.txt

Step 4. Run docker-compose up to start Airflow. docker-compose should build your image automatically from the Dockerfile. After changing requirements.txt, run docker-compose build to rebuild the image and update the dependencies.
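
For reference, the rebuild cycle from Step 4 could be wrapped as below. This is only a sketch: the rebuild_airflow function and the COMPOSE variable are hypothetical names, not part of Airflow or Docker. Setting COMPOSE="echo docker-compose" gives a dry run that prints the commands instead of executing them.

```shell
# Hypothetical wrapper; COMPOSE defaults to the real docker-compose binary.
COMPOSE="${COMPOSE:-docker-compose}"

rebuild_airflow() {
    # Rebuild the image so changes in requirements.txt are picked up,
    # then recreate the containers in the background.
    $COMPOSE build && $COMPOSE up -d
}
```

Calling rebuild_airflow from the project directory after editing requirements.txt should then refresh the running containers with the new dependencies.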

Is there a solution that doesn't require rebuilding the image?

Yes, there is now. As of October 2021 (v2.2.0), it is available as an environment variable:

_PIP_ADDITIONAL_REQUIREMENTS

It is used in the docker-compose.yml file. That should do the trick without building a complete image, which some of the other answers explain (very well, actually :-).

See: https://airflow.apache.org/docs/apache-airflow/stable/docker-compose.yaml

Official documentation: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#environment-variables-supported-by-docker-compose
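
As a sketch of how this could be wired up from an existing requirements.txt: the helper below (a hypothetical requirements_to_env function, not part of Airflow) joins the package lines into one space-separated value. It assumes that your docker-compose.yaml passes _PIP_ADDITIONAL_REQUIREMENTS through to the containers, as the official file linked above does, and that docker-compose reads a .env file placed next to it.

```shell
# Hypothetical helper: print the packages listed in a requirements file
# as a single space-separated line, skipping blank lines and comments.
requirements_to_env() {
    # $1: path to the requirements file
    grep -v -E '^[[:space:]]*(#|$)' "$1" | tr '\n' ' ' | sed 's/[[:space:]]*$//'
}

# Example usage (run next to docker-compose.yaml):
#   echo "_PIP_ADDITIONAL_REQUIREMENTS=$(requirements_to_env requirements.txt)" >> .env
```

Note that the packages are then installed each time a container starts, so this approach suits quick experiments better than long-running deployments.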

Another alternative is to update your docker-compose.yml file and add the following line with all the install commands you need:

  command: -c "pip3 install apache-airflow-providers-sftp  apache-airflow-providers-ssh --user"

Then re-run the init service and start Airflow again:

docker-compose up airflow-init
docker-compose up

1. Create new Airflow docker image with installed Python requirements

Check which Airflow image your docker-compose.yaml is using and use that image as the base. In my case it's apache/airflow:2.3.2. In the same folder where you have your docker-compose.yaml, create a Dockerfile with the following content:

FROM apache/airflow:2.3.2
COPY requirements.txt /requirements.txt
RUN pip install --user --upgrade pip
RUN pip install --no-cache-dir --user -r /requirements.txt

2. Build new Airflow image

In the same folder, run:

docker build . --tag pyrequire_airflow:2.3.2

3. Use new image in your docker-compose.yaml

Find the name of the Airflow image used in your docker-compose.yaml under AIRFLOW_IMAGE_NAME. Change:

image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.3.2}

To:

image: ${AIRFLOW_IMAGE_NAME:-pyrequire_airflow:2.3.2}
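
Since the image line already reads the AIRFLOW_IMAGE_NAME variable, an alternative that leaves docker-compose.yaml untouched might be to set that variable instead, for example in a .env file next to it. A minimal sketch, assuming the image tag built in step 2:

```shell
# Append the override to .env in the same folder as docker-compose.yaml;
# docker-compose then uses this value instead of the default after ":-".
echo "AIRFLOW_IMAGE_NAME=pyrequire_airflow:2.3.2" >> .env
```

With this in place, docker-compose up should start the services from the custom image without any edit to the compose file.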
