Aggregate a time series DataFrame by 15-minute intervals

I am combining a bunch of different datasets to create an aggregation to analyse in 15 minute intervals.

The dataframe I currently have looks like this:

<bound method NDFrame.to_clipboard of                        id                       user_id  sentiment  magnitude  \
2020-10-04 14:06:00  10.0  cPL1Fg7BqRXvSFKeU1mJT7KCCTq2       -0.1        0.1   
2020-10-04 14:06:05  11.0  cPL1Fg7BqRXvSFKeU1mJT7KCCTq2       -0.8        0.8   
2020-10-05 12:28:58  12.0  cPL1Fg7BqRXvSFKeU1mJT7KCCTq2       -0.2        0.2   
2020-10-05 12:29:16  13.0  cPL1Fg7BqRXvSFKeU1mJT7KCCTq2       -0.2        0.2   
2020-10-05 12:29:31  14.0  cPL1Fg7BqRXvSFKeU1mJT7KCCTq2        0.2        0.2   

                     angry  disgusted  fearful  happy  neutral  sad  \
2020-10-04 14:06:00    NaN        NaN      NaN    NaN      NaN  NaN   
2020-10-04 14:06:05    NaN        NaN      NaN    NaN      NaN  NaN   
2020-10-05 12:28:58    NaN        NaN      NaN    NaN      NaN  NaN   
2020-10-05 12:29:16    NaN        NaN      NaN    NaN      NaN  NaN   
2020-10-05 12:29:31    NaN        NaN      NaN    NaN      NaN  NaN   

                     surprised  heartRate  steps  
2020-10-04 14:06:00        NaN        NaN    NaN  
2020-10-04 14:06:05        NaN        NaN    NaN  
2020-10-05 12:28:58        NaN        NaN    NaN  
2020-10-05 12:29:16        NaN        NaN    NaN  
2020-10-05 12:29:31        NaN        NaN    NaN  >

I want to aggregate the dataframe into 15 minute intervals.

I think groupby is the best approach, but I'm finding it hard to get it to work well.

Thanks in advance,

There are two options: use resample, or use groupby with pd.Grouper (which is performant).
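For completeness, the resample option could look roughly like the sketch below. It assumes the timestamps are the DataFrame's index, as in the question's printout; the sample frame and the name resampled are made up for illustration, and only two of the question's columns are reproduced.

```python
import pandas as pd

# Hypothetical sample mirroring two columns from the question;
# the timestamps are the index, as in the question's printout.
df = pd.DataFrame(
    {
        "id": [10.0, 11.0, 12.0, 13.0, 14.0],
        "sentiment": [-0.1, -0.8, -0.2, -0.2, 0.2],
    },
    index=pd.to_datetime(
        [
            "2020-10-04 14:06:00",
            "2020-10-04 14:06:05",
            "2020-10-05 12:28:58",
            "2020-10-05 12:29:16",
            "2020-10-05 12:29:31",
        ]
    ),
)

# resample works directly on a DatetimeIndex; each 15-minute bin
# is labelled by its left edge (14:00, 14:15, ...).
resampled = df.resample("15min").sum()
print(resampled.head())
```

Like the pd.Grouper version, this produces one row for every 15-minute bin between the first and last timestamp, including bins that contain no data.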

Here is an example that uses pd.Grouper to sum the column values over 15-minute intervals.

Code

import pandas as pd
df.groupby(pd.Grouper(key='date', freq='15min')).sum().reset_index()

Input sample from your data (with the timestamps in a column named date, which is what key='date' expects)

    date                 id
0   2020-10-04 14:06:00 10.0
1   2020-10-04 14:06:05 11.0
2   2020-10-05 12:28:58 12.0
3   2020-10-05 12:29:16 13.0
4   2020-10-05 12:29:31 14.0

Output (first five rows; intervals with no data sum to 0.0)

    date           id
0   2020-10-04 14:00:00 21.0
1   2020-10-04 14:15:00 0.0
2   2020-10-04 14:30:00 0.0
3   2020-10-04 14:45:00 0.0
4   2020-10-04 15:00:00 0.0
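If the zero-filled empty intervals are not wanted, or if different columns need different aggregations, one possible variation is sketched below. It is only an illustration: the sample frame, the output names n_rows and mean_sentiment, and the choice of count and mean are assumptions, not something taken from the question.

```python
import pandas as pd

# Hypothetical sample: 'date' is a regular column, as in the input above.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            [
                "2020-10-04 14:06:00",
                "2020-10-04 14:06:05",
                "2020-10-05 12:28:58",
                "2020-10-05 12:29:16",
                "2020-10-05 12:29:31",
            ]
        ),
        "id": [10.0, 11.0, 12.0, 13.0, 14.0],
        "sentiment": [-0.1, -0.8, -0.2, -0.2, 0.2],
    }
)

grouped = df.groupby(pd.Grouper(key="date", freq="15min"))

# Named aggregation: count the rows and average the sentiment per interval.
summary = grouped.agg(
    n_rows=("id", "count"),
    mean_sentiment=("sentiment", "mean"),
)

# Keep only the intervals that actually contain data.
summary = summary[summary["n_rows"] > 0].reset_index()
print(summary)
```

With this sample, only the two intervals that contain rows remain (starting at 2020-10-04 14:00:00 and 2020-10-05 12:15:00).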
