
Airflow: run a Python script connected via gcsfuse using the PythonOperator

I want to run a Python script that is stored in this GCP directory:

 /home/airflow/gcsfuse/dags/external/projectXYZ/test.py

I previously used the BashOperator to execute the script. That works in principle, but I get errors from some functions in some Python libraries. I therefore want to test whether the PythonOperator works. For the BashOperator I used the following code snippet:

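# Airflow 2.x import path (assumed); Airflow 1.10 uses airflow.operators.bash_operator
from airflow.operators.bash import BashOperator

# Run the script with the worker's default `python` interpreter
run_python = BashOperator(
    task_id='run_python',
    bash_command='python /home/airflow/gcsfuse/dags/external/projectXYZ/test.py',
)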

For the PythonOperator, I saw some posts that import a function from a Python script. However, I don't know how to get Airflow to recognize such an import. The only way I can exchange files between GCP and Airflow is through the gcsfuse/dags/external folder. How can I execute the file from this path instead of calling a function in the PythonOperator?

After some research and testing, I came to the conclusion that it is not possible to execute a Python file located in a GCP storage bucket with the PythonOperator. If the Python file sits in a GCP storage bucket that is connected to Airflow via gcsfuse, you need to use the BashOperator. If you want to use the PythonOperator, you have two options: either write your Python code inside your DAG and call a function with the PythonOperator, or import a function from a Python file that is already stored on the Airflow storage itself and then call that function with the PythonOperator.
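For illustration, the first PythonOperator option (code written inside the DAG file) could look roughly like the sketch below. It assumes the Airflow 2.x import path `airflow.operators.python`. The names `process_numbers` and `python_operator_sketch` and the sample values are hypothetical and do not come from the original script. The Airflow import is guarded so that the callable can also be tried without Airflow installed.

```python
from datetime import datetime


def process_numbers(values):
    """Hypothetical task body, written directly in the DAG file."""
    total = sum(values)
    print(f"sum of {values} is {total}")
    return total


try:
    # Assumed Airflow 2.x import paths.
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    # Airflow is not installed here; only the plain function is usable.
    DAG = None
    PythonOperator = None

if DAG is not None:
    with DAG(
        dag_id="python_operator_sketch",  # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        catchup=False,
    ) as dag:
        # The operator calls the function itself; op_kwargs supplies its arguments.
        run_python = PythonOperator(
            task_id="run_python",
            python_callable=process_numbers,
            op_kwargs={"values": [1, 2, 3]},
        )
```

The second option would presumably differ only in where the function lives: it would be imported from a module on the Airflow storage rather than defined in the DAG file, and then passed as `python_callable` in the same way.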

Feel free to correct me if I am mistaken.
