简体   繁体   中英

How to Calculate time difference between two date columns

Assume the below dataframe, df

   Start_Date   End_Date
0   20201101    20201130
1   20201201    20201231
2   20210101    20210131
3   20210201    20210228
4   20210301    20210331

How to Calculate time difference between two date columns in days?

Required Output

   Start_Date   End_Date   Diff_in_Days
0   20201101    20201130
1   20201201    20201231
2   20210101    20210131
3   20210201    20210228
4   20210301    20210331

First idea is convert columns to datetimes, get difference and convert timedeltas to days by Series.dt.days :

df['Diff_in_Days'] = (pd.to_datetime(df['End_Date'], format='%Y%m%d')
                        .sub(pd.to_datetime(df['Start_Date'], format='%Y%m%d'))
                        .dt.days)
print (df)
   Start_Date  End_Date  Diff_in_Days
0    20201101  20201130            29
1    20201201  20201231            30
2    20210101  20210131            30
3    20210201  20210228            27
4    20210301  20210331            30

Another solution better if processing datetimes later is reassign back columns and use solution above:

df['Start_Date'] = pd.to_datetime(df['Start_Date'], format='%Y%m%d')
df['End_Date'] = pd.to_datetime(df['End_Date'], format='%Y%m%d')

df['Diff_in_Days'] = df['End_Date'].sub(df['Start_Date']).dt.days
print (df)
  Start_Date   End_Date  Diff_in_Days
0 2020-11-01 2020-11-30            29
1 2020-12-01 2020-12-31            30
2 2021-01-01 2021-01-31            30
3 2021-02-01 2021-02-28            27
4 2021-03-01 2021-03-31            30

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM