Can I execute python scripts from within a cloud composer DAG?

(First time user of cloud composer) All examples I have seen define very simple Python functions within the DAG.

I have multiple lengthy Python scripts I want to run. Can I put these inside a task?

If so, is it better to use the PythonOperator or to call them from the BashOperator?

E.g. something like:

default_dag_args = {}
with models.DAG('jobname', schedule_interval=datetime.timedelta(days=1), default_args=default_dag_args) as dag:
    do_stuff1 = python_operator.PythonOperator(
        task_id='task_1',
        python_callable=myscript1.py)
    do_stuff2 = python_operator.PythonOperator(
        task_id='task_2',
        python_callable=myscript2.py)

If you put your Python scripts into separate files, you can use either PythonOperator or BashOperator to execute them.

Let's assume you place your Python scripts under the following folder structure:

dags/
    my_dag.py
    tasks/
         myscript1.py
         myscript2.py
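For the PythonOperator approach shown next, each script needs an importable entrypoint. A minimal sketch of what tasks/myscript1.py might contain (the main() name is an assumption, chosen to match the callable used in the DAG below):

```python
# tasks/myscript1.py -- hypothetical contents
def main():
    # the lengthy script logic lives here
    print("running myscript1")
    return "myscript1 finished"

if __name__ == "__main__":
    # lets the same file also be run directly, e.g. by a BashOperator
    main()
```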

Using PythonOperator in my_dag.py

from datetime import timedelta

from airflow.models import DAG
from airflow.operators.python_operator import PythonOperator
from tasks import myscript1, myscript2

default_dag_args = {}

with DAG(
    "jobname",
    schedule_interval=timedelta(days=1),
    default_args=default_dag_args,
) as dag:
    do_stuff1 = PythonOperator(
        task_id="task_1",
        python_callable=myscript1.main,  # assume entrypoint is main()
    )
    do_stuff2 = PythonOperator(
        task_id="task_2",
        python_callable=myscript2.main,  # assume entrypoint is main()
    )

Using BashOperator in my_dag.py

from datetime import timedelta

from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator

default_dag_args = {}

with DAG(
    "jobname",
    schedule_interval=timedelta(days=1),
    default_args=default_dag_args,
) as dag:
    do_stuff1 = BashOperator(
        task_id="task_1",
        bash_command="python /path/to/myscript1.py",
    )
    do_stuff2 = BashOperator(
        task_id="task_2",
        bash_command="python /path/to/myscript2.py",
    )
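The hard-coded /path/to/ above is fragile because the worker's working directory is not guaranteed. One hedged alternative (assuming the scripts sit in a tasks/ folder next to the DAG file, as in the tree above) is to build the path from the DAG file's own location:

```python
import os

# Sketch: resolve the script path relative to this DAG file,
# so the command works regardless of the worker's current directory.
DAG_DIR = os.path.dirname(os.path.abspath(__file__))
SCRIPT_1 = os.path.join(DAG_DIR, "tasks", "myscript1.py")

# This string would then be passed as bash_command to the BashOperator.
command = f"python {SCRIPT_1}"
```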
