I am defining a global variable timestamp, with multiple operators referencing this variable. It seems that the variable is redefined each time an operator runs. Below is a minimal reproducible example. I expected both test and test2 to print the same timestamp, but they print different timestamps (seconds apart) in Airflow.
import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

start_date = datetime.datetime(
    year=2022,
    month=3,
    day=30,
    hour=18,
    minute=0,
)

timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")


def test():
    print(timestamp)


def test2():
    print(timestamp)


with DAG(
    'airflow_test',
    description='airflow test',
    max_active_runs=1,
    start_date=start_date,
) as dag:
    test = PythonOperator(
        task_id='test',
        python_callable=test,
        dag=dag
    )
    test2 = PythonOperator(
        task_id='test2',
        python_callable=test2,
        dag=dag
    )
    test >> test2
What is actually happening internally when this script is run by Airflow that causes this to occur?
I prefer to use Airflow's built-in helpers, such as days_ago for the start date. For example:

from datetime import timedelta

from airflow.utils.dates import days_ago

project_cfg = {
    'owner': 'airflow',
    'email': ['your-email@example.com'],
    'email_on_failure': True,
    'start_date': days_ago(1),
    'retries': 1,
    'retry_delay': timedelta(hours=1),
}
For your question, I would use XCom to pass the value between tasks instead of relying on a global variable.
Another way is to write a function and call it from each task, or to use a local variable.
Or try this:
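import datetime

from airflow import DAG
from airflow.decorators import task
from airflow.utils.dates import days_ago

with DAG(dag_id="test",
         start_date=days_ago(3),
         schedule_interval="@daily",
         catchup=False) as dag:

    @task
    def test1():
        # Computed once, when this task runs; the return value is pushed to XCom.
        return datetime.datetime.now().strftime("%Y%m%d_%H%M%S")

    @task
    def test2(timestamp):
        # Receives test1's value through XCom instead of recomputing it.
        print(timestamp)

    test2(test1())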
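As for why a module-level timestamp differs between tasks: as far as I understand, Airflow does not run the DAG file once and share its state. The scheduler parses the file repeatedly, and each task instance runs in its own process that loads the file again, so any top-level expression is re-evaluated every time. Here is a minimal sketch of that effect without Airflow. The names load_fresh and fake_dag_file are made up for illustration, and the short sleep only stands in for the gap between two task runs:

```python
import importlib.util
import pathlib
import tempfile
import time

# Stand-in for a DAG file: one module-level expression, as in the question.
# Microseconds (%f) are included only so the difference shows up quickly.
DAG_FILE_SOURCE = (
    "import datetime\n"
    "timestamp = datetime.datetime.now().strftime('%Y%m%d_%H%M%S_%f')\n"
)


def load_fresh(path):
    """Execute the file from scratch, roughly as a new process parsing it would."""
    spec = importlib.util.spec_from_file_location("fake_dag_file", str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs all top-level code again
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "fake_dag_file.py"
    path.write_text(DAG_FILE_SOURCE)

    first = load_fresh(path).timestamp   # what the first task would see
    time.sleep(0.05)                     # stand-in for the delay between tasks
    second = load_fresh(path).timestamp  # what the second task would see

print(first)
print(second)
print(first != second)
```

Each load produces its own value, which is why the two tasks in the question print different timestamps. Anything that must be identical across tasks in a run should be computed inside a task and passed along (for example through XCom), not at the top level of the file.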
I hope this helps.