
Airflow execution_date wrong value

I have to run a Spark job, and in that job we pass a date as an argument so it reads the correct directory. I am using Airflow to schedule the job. Below is some info:

start_date

from datetime import datetime
import pendulum

local_tz = pendulum.timezone("Asia/Kolkata")
'start_date': datetime(year=2020, month=8, day=3, tzinfo=local_tz)

schedule_interval

schedule_interval='20 0 * * *'

value to pass in job

{{ (execution_date + macros.timedelta(hours=5,minutes=30) - macros.timedelta(days=1)).strftime("%Y/%m/%d") }}

We have to run this job just after midnight for the previous day, but this expression gives me the date for the day before yesterday. I added 5:30 because our Airflow instance uses UTC time.
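To see why the result is off by one day, the template can be evaluated by hand. A minimal stdlib-only sketch (the timestamps are illustrative, assuming the DAG runs daily at 00:20 UTC):

```python
from datetime import datetime, timedelta, timezone

# A run that actually executes on 2020-08-05 00:20 UTC covers the schedule
# period that *started* one interval earlier, so Airflow hands the task:
execution_date = datetime(2020, 8, 4, 0, 20, tzinfo=timezone.utc)

# The template from the question: shift to IST, then go back one day.
value = (execution_date + timedelta(hours=5, minutes=30)
         - timedelta(days=1)).strftime("%Y/%m/%d")
print(value)  # 2020/08/03 -- the day *before* yesterday, not yesterday
```

Because `execution_date` already lags the run time by one schedule interval, the extra `- timedelta(days=1)` pushes the result back a second day.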

Can anybody explain what is happening here, with a reference?

Thanks

Below is the definition of execution date:

The execution time in Airflow is not the actual run time, but rather the start timestamp of its schedule period. For example, the execution time of the first DAG run is 2019-12-05 7:00:00, though it is executed on 2019-12-06.

Taken from https://towardsdatascience.com/apache-airflow-tips-and-best-practices-ff64ce92ef8#:~:text=The%20execution%20time%20in%20Airflow,on%202019%E2%80%9312%E2%80%9306

You don't need the `- macros.timedelta(days=1)` in your value. For a run that executes just after midnight, `execution_date` is already the start of the previous day's schedule period, so subtracting another day gives you the day before yesterday.
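A quick check of the corrected expression, i.e. keeping only the IST shift (stdlib-only sketch; the run date is illustrative):

```python
from datetime import datetime, timedelta, timezone

# execution_date handed to the run that executes on 2020-08-05 00:20 UTC:
execution_date = datetime(2020, 8, 4, 0, 20, tzinfo=timezone.utc)

# Drop the extra "- timedelta(days=1)"; shift to IST only.
value = (execution_date + timedelta(hours=5, minutes=30)).strftime("%Y/%m/%d")
print(value)  # 2020/08/04 -- yesterday relative to the run date, as wanted
```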
