I have a packaged DAG where I store my module in a sub-folder. I'm using PythonVirtualenvOperator and want to access this module from the virtual env.
The folder structure:
dags/
    packaged_dag.zip/
        dag.py
        package/
            my_module.py
            __init__.py
dag.py
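from airflow import models
from airflow.operators.python_operator import PythonVirtualenvOperator


def my_function(**kwargs):
    # Runs inside the virtualenv; this import is the one that fails.
    from package import my_module


# default_dag_args (and the remaining DAG arguments) are not shown in the question.
with models.DAG(
        default_args=default_dag_args) as dag:
    virtualenv_task = PythonVirtualenvOperator(
        task_id="virtualenv_python",
        python_callable=my_function,
        system_site_packages=True,
        dag=dag,
    )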
With this, I get a "module not found" error for package. If I move the import to the top of the main DAG file (like the PythonVirtualenvOperator import), it works fine, but I want to import the module from inside the virtualenv.
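One plausible explanation for the error is that the callable runs in a separate interpreter whose sys.path does not contain the zip file. The sketch below is not Airflow code; it only illustrates the underlying Python behaviour with a temporary zip and a fresh interpreter. The helper names build_packaged_dag and import_in_fresh_interpreter are hypothetical, and whether adding the zip to sys.path inside the callable fits a given Airflow deployment is an assumption to verify.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path


def build_packaged_dag(directory: Path) -> Path:
    """Create a zip that mimics packaged_dag.zip with package/my_module.py inside."""
    zip_path = directory / "packaged_dag.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("package/__init__.py", "")
        archive.writestr("package/my_module.py", "VALUE = 42\n")
    return zip_path


def import_in_fresh_interpreter(zip_path: Path, add_zip_to_path: bool):
    """Try `from package import my_module` in a new, isolated interpreter."""
    code = "from package import my_module; print(my_module.VALUE)"
    if add_zip_to_path:
        # Python can import from a zip archive once the archive is on sys.path.
        code = f"import sys; sys.path.insert(0, {str(zip_path)!r}); " + code
    # -I isolates the child from the current directory and PYTHON* variables.
    return subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
    )


with tempfile.TemporaryDirectory() as tmp:
    zip_path = build_packaged_dag(Path(tmp))
    without_path = import_in_fresh_interpreter(zip_path, add_zip_to_path=False)
    with_path = import_in_fresh_interpreter(zip_path, add_zip_to_path=True)

print("without zip on sys.path, return code:", without_path.returncode)
print("with zip on sys.path, output:", with_path.stdout.strip())
```

The first call fails because the new interpreter knows nothing about the archive; the second succeeds only because the archive path was inserted into sys.path before the import.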
This worked for me:
dags/
    folder/
        my_module.py
    main.py
main.py
# Imports assumed for Airflow 2.x; they were not shown in the original answer.
from airflow import DAG
from airflow.operators.python import PythonVirtualenvOperator
from airflow.utils.dates import days_ago

default_args = {
    "owner": "airflow"}

dag = DAG(
    dag_id='example',
    default_args=default_args,
    schedule_interval=None,
    start_date=days_ago(2),
)


def my_module_virtualenv():
    """
    Example function that will be performed in a virtual environment.
    Importing inside the function ensures that it will not attempt to import the
    library before it is installed.
    """
    from dags.folder.my_module import main as my_module
    my_module()


check_database = PythonVirtualenvOperator(
    task_id="my_module",
    python_callable=my_module_virtualenv,
    requirements=["psycopg2-binary==2.8.6"],
    system_site_packages=True,
    dag=dag,
)
It's not standard practice, since there's no __init__.py, but it works. __init__.py was giving me trouble when I tried to docker-compose up airflow-init.
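The import works without __init__.py most likely because Python 3 treats plain directories as namespace packages, as long as the parent directory is on sys.path. The sketch below shows that behaviour on its own, outside Airflow. The directory name demo_dags is a hypothetical stand-in for dags, chosen to avoid clashing with any real package, and the body of main is made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Mirror the dags/folder/my_module.py layout, with no __init__.py anywhere.
    module_dir = root / "demo_dags" / "folder"
    module_dir.mkdir(parents=True)
    (module_dir / "my_module.py").write_text(
        "def main():\n    return 'ran my_module'\n"
    )
    has_init = (module_dir / "__init__.py").exists()

    # The parent of demo_dags must be importable for the dotted path to resolve.
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        my_module = importlib.import_module("demo_dags.folder.my_module")
        result = my_module.main()
    finally:
        sys.path.remove(str(root))

print("__init__.py present:", has_init)
print("result:", result)
```

If the parent directory is not on sys.path in the environment where the task runs, the same dotted import would fail, so this depends on how the deployment sets up its paths.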