
Airflow - Kubernetes Executor: How to mount only the directory of a Persistent Volume Claim that corresponds to the run_id

I am using Airflow with the Kubernetes executor.

It works when I use executor_config to mount a PersistentVolumeClaim.

However, I would like to mount only a subPath, and that subPath should be dynamic, something like this:

executor_config={
    "KubernetesExecutor": {
        "volumes": [
            {
                "name": "workdir-volume",
                "persistentVolumeClaim": {"claimName": "my-volume-claim"},
            },
        ],
        "volume_mounts": [
            {
                "mountPath": "/app/workdir/",
                "name": "workdir-volume",
                "subPath": "{{ run_id }}_{{ ds }}",
            },
        ],
    }
},

It doesn't work for two reasons:

  • executor_config is not in template_fields, so I created a new operator that includes it.

  • As I understand it, the rendering only happens after the pod has started: the rendered task shown in the dashboard looks fine, but the subPath of the mounted directory is not rendered.

Does someone have an idea on how to do this?

As far as I know, that is not doable, because run_id and ds are only available within each DAG/task run. Handle this in your script instead, by passing the values as parameters in the task definition. Example:

t1 = BashOperator(
    task_id='t1',
    bash_command="extract.py --path='{{ run_id }}'",
)
...

The script can then use that parameter to create the per-run subdirectory itself.
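For illustration, here is a minimal sketch of what the script side might look like. It assumes the whole claim is mounted at /app/workdir (the mountPath from the question, without a subPath) and that the script receives the run_id through --path as in the example above. The names WORKDIR_ROOT, safe_dir_name and prepare_run_dir are hypothetical, not part of Airflow.

```python
import argparse
from pathlib import Path

# Assumption: the full PVC is mounted here, with no subPath.
WORKDIR_ROOT = Path("/app/workdir")


def safe_dir_name(run_id: str) -> str:
    """Replace characters that are awkward in directory names (':' and '+' are common in run_ids)."""
    return "".join(c if c.isalnum() or c in "-_" else "_" for c in run_id)


def prepare_run_dir(run_id: str, root: Path = WORKDIR_ROOT) -> Path:
    """Create (if needed) and return the subdirectory dedicated to this run."""
    run_dir = root / safe_dir_name(run_id)
    run_dir.mkdir(parents=True, exist_ok=True)  # idempotent, so task retries are fine
    return run_dir


def main(argv=None):
    parser = argparse.ArgumentParser(description="Work inside a per-run directory.")
    parser.add_argument("--path", required=True, help="run_id rendered by Airflow")
    args = parser.parse_args(argv)
    run_dir = prepare_run_dir(args.path)
    print(run_dir)  # the rest of the script would read and write under run_dir
```

In extract.py you would call main() under the usual if __name__ == "__main__" guard. Every task of the same run then resolves the same directory from its run_id, which gives the per-run isolation that the dynamic subPath was meant to provide, although each pod can still see the other runs' directories.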
