Work with date column in DataFrame in Python Pandas?

I have a DataFrame like the one below:

import pandas as pd

rng = pd.date_range('2020-12-01', periods=5, freq='D')
df = pd.DataFrame({"ID" : [1, 2, 2, 1, 3],
                   "status" : ["acc", "rem", "rem", "acc", "other"], "date" : rng})

I need to create a DataFrame with these columns:

  1. New1 = number of days from the last "acc" agreement until today (28.12)
  2. New2 = number of days from the last "rem" agreement until today (28.12)

The expected result was posted as an image, which is not reproduced here.

Like this:

In [2608]: t = pd.to_datetime('today').normalize()

In [2615]: x = abs(df.groupby(['ID', 'status'])['date'].max() - t).dt.days.reset_index()
In [2619]: y = x.pivot(index='ID', columns='status', values='date')

In [2620]: y
Out[2620]: 
status   acc  other   rem
ID                       
1       24.0    NaN   NaN
2        NaN    NaN  25.0
3        NaN   23.0   NaN

Note: You can rename acc and rem to New1 and New2. I've kept the original names so the result is easier to follow.
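The rename mentioned in the note could look roughly like the sketch below. It assumes a fixed reference date of 2020-12-28 (the "today" from the question) instead of the real current date, so the numbers are reproducible; the names `last`, `days` and `result` are only illustrative.

```python
import pandas as pd

rng = pd.date_range('2020-12-01', periods=5, freq='D')
df = pd.DataFrame({"ID": [1, 2, 2, 1, 3],
                   "status": ["acc", "rem", "rem", "acc", "other"],
                   "date": rng})

# Assumption: fixed reference date (28.12 from the question) for reproducibility
t = pd.Timestamp('2020-12-28')

# Latest date per (ID, status) pair
last = df.groupby(['ID', 'status'])['date'].max()

# Whole days between the reference date and each latest date
days = (t - last).dt.days.reset_index(name='days')

# One row per ID, one column per status
y = days.pivot(index='ID', columns='status', values='days')

# Keep only acc/rem, rename them, and flatten back to ordinary columns
result = (y.reindex(columns=['acc', 'rem'])
           .rename(columns={'acc': 'New1', 'rem': 'New2'})
           .rename_axis(columns=None)
           .reset_index())
print(result)
```

An ID with no "acc" or "rem" row (ID 3 here) ends up with NaN in both new columns.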

code:

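# Last date per status (assumes the rows are already sorted by date)
df = df.groupby(['status'])['date'].agg('last').reset_index()
# Subtract full dates, not day-of-month values, so the result stays correct across months
df['diff'] = (pd.to_datetime('today').normalize() - df['date']).dt.days.abs()
df_final = df.pivot(columns='status', values='diff')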

output:

df_final
Out[104]: 
status   acc  other   rem
0       24.0    NaN   NaN
1        NaN   23.0   NaN
2        NaN    NaN  25.0
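One thing to watch with this kind of calculation: subtracting day-of-month values only gives the elapsed days when both dates fall in the same month, whereas subtracting full timestamps works across month boundaries. A small sketch with made-up dates (both are hypothetical, not taken from the question):

```python
import pandas as pd

today = pd.Timestamp('2021-01-03')  # hypothetical "today"
last = pd.Timestamp('2020-12-04')   # hypothetical last agreement date

# Day-of-month arithmetic ignores the month change
by_day_of_month = abs(today.day - last.day)

# Timestamp subtraction yields a Timedelta with the real elapsed days
by_timedelta = (today - last).days

print(by_day_of_month, by_timedelta)  # 1 30
```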
