简体   繁体   中英

pandas fill missing time intervals as given in a dataframe

I have a DataFrame looking like:

gap_id species time_start time_stop
1 wheat 2021-11-22 00:01:00 2021-11-22 00:03:00
2 fescue 2021-12-18 05:52:00 2021-12-18 05:53:00

I would like to expand the DataFrame such that I get as many rows as the number of minutes between time_start and time_stop for each gap_id :

gap_id species time
1 wheat 2021-11-22 00:01:00
1 wheat 2021-11-22 00:02:00
1 wheat 2021-11-22 00:03:00
2 fescue 2021-12-18 05:52:00
2 fescue 2021-12-18 05:53:00

I've tried the method pd.data_range but I don't know how to couple it with a groupby made on gap_id

Thanks in advance

If small DataFrame and performance is not important generate for each row date_range and then use DataFrame.explode :

df['time'] = df.apply(lambda x: pd.date_range(x['time_start'], x['time_stop'], freq='T'), axis=1)
df = df.drop(['time_start','time_stop'], axis=1).explode('time')

print (df)
   gap_id species                time
0       1   wheat 2021-11-22 00:01:00
0       1   wheat 2021-11-22 00:02:00
0       1   wheat 2021-11-22 00:03:00
1       2  fescue 2021-12-18 05:52:00
1       2  fescue 2021-12-18 05:53:00

For large DataFrames repeat indices by difference start and stop columns in minutes first and then add counter by GroupBy.cumcount with convert to timedeltas by to_timedelta :

df['time_start'] = pd.to_datetime(df['time_start'])
df['time_stop'] = pd.to_datetime(df['time_stop'])

df = (df.loc[df.index.repeat(df['time_stop'].sub(df['time_start']).dt.total_seconds() // 60 + 1)]
        .drop('time_stop', axis=1)
        .rename(columns={'time_start':'time'}))
       
td = pd.to_timedelta(df.groupby(level=0).cumcount(), unit='Min')

df['time'] += td
df = df.reset_index(drop=True)
print (df)
   gap_id species                time
0       1   wheat 2021-11-22 00:01:00
1       1   wheat 2021-11-22 00:02:00
2       1   wheat 2021-11-22 00:03:00
3       2  fescue 2021-12-18 05:52:00
4       2  fescue 2021-12-18 05:53:00

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM