Pandas - groupby cumulative timeperiod

Question

Here's my problem: Imagine a dataframe indexed by time.

import pandas as pd

df = pd.DataFrame(
    index=["00:00:00", "00:00:08", "00:00:14", "00:00:21", "00:00:23", "00:00:49"],
    data={"col1": ["a", "b", "a", "a", "c", "d"], "col2": [4, 4, 4, 6, 6, 7], "col3": [2, 17, 2, 2, 3, 50]},
)

I would now like to group the data by cumulative time in 15-second steps and apply a function to each group, i.e. to the timestamps in 00:00:00 - 00:00:15, 00:00:00 - 00:00:30, 00:00:00 - 00:00:45, etc.

For example, say that in each of those intervals I want to sum col2 and col3 over the rows where col1 is "a", and then divide the col2 sum by the col3 sum.

The output should be something like:

         output
00:00:15    2
00:00:30    2.3333

Appreciate any help!

Answer 1

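First convert the index to timedeltas with to_timedelta and add 15 seconds, which shifts each row so that its bin is labelled with the end of the interval rather than the start. Then keep only the "a" rows by boolean indexing with Series.eq (==).

Next, sum each 15-second bin with DataFrame.resample, accumulate those sums with DataFrame.cumsum, and finally divide one column by the other with Series.div: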

df.index = pd.to_timedelta(df.index) + pd.Timedelta(15, unit='s')

# select the numeric columns explicitly so that col1 is not summed as strings
df = df.loc[df['col1'].eq('a'), ['col2', 'col3']].resample('15s').sum().cumsum()
df['out'] = df['col2'].div(df['col3'])
print(df)
          col2  col3       out
00:00:15     8     4  2.000000
00:00:30    14     6  2.333333
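
To make the chain easier to follow, here is a minimal sketch that splits it into named intermediate steps. The variable names only_a, per_bin and cumulative are introduced here for illustration and are not part of the original answer; the data is the frame from the question.

```python
import pandas as pd

df = pd.DataFrame(
    index=["00:00:00", "00:00:08", "00:00:14", "00:00:21", "00:00:23", "00:00:49"],
    data={
        "col1": ["a", "b", "a", "a", "c", "d"],
        "col2": [4, 4, 4, 6, 6, 7],
        "col3": [2, 17, 2, 2, 3, 50],
    },
)

# Shift every timestamp by 15 s so that each bin is labelled with its end time.
df.index = pd.to_timedelta(df.index) + pd.Timedelta(15, unit="s")

# Keep only the "a" rows and the two numeric columns that get summed.
only_a = df.loc[df["col1"].eq("a"), ["col2", "col3"]]

# Per-interval sums: one row per 15-second bin.
per_bin = only_a.resample("15s").sum()

# Running totals turn per-interval sums into cumulative windows from 00:00:00.
cumulative = per_bin.cumsum()
cumulative["out"] = cumulative["col2"].div(cumulative["col3"])
print(cumulative)
```

Printing per_bin before the cumsum step shows the non-cumulative sums (8 and 4 for the first bin, 6 and 2 for the second), which is a quick way to check that the binning does what you expect.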

An alternative is to convert the index to datetimes:

df.index = pd.to_datetime(df.index) + pd.Timedelta(15, unit='s')

# select the numeric columns explicitly so that col1 is not summed as strings
df = df.loc[df['col1'].eq('a'), ['col2', 'col3']].resample('15s').sum().cumsum()
df['out'] = df['col2'].div(df['col3'])
print(df)
                     col2  col3       out
2019-03-21 00:00:15     8     4  2.000000
2019-03-21 00:00:30    14     6  2.333333
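
The datetime variant attaches a calendar date to every label (2019-03-21 in the output above, presumably the day the code was run). If only the clock time is wanted, one option is to format the index back to strings after the computation. The sketch below assumes that string labels are acceptable in the result; the explicit format argument and the name result are additions for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    index=["00:00:00", "00:00:08", "00:00:14", "00:00:21", "00:00:23", "00:00:49"],
    data={
        "col1": ["a", "b", "a", "a", "c", "d"],
        "col2": [4, 4, 4, 6, 6, 7],
        "col3": [2, 17, 2, 2, 3, 50],
    },
)

# With an explicit time-only format the date part defaults to 1900-01-01.
df.index = pd.to_datetime(df.index, format="%H:%M:%S") + pd.Timedelta(15, unit="s")

result = df.loc[df["col1"].eq("a"), ["col2", "col3"]].resample("15s").sum().cumsum()
result["out"] = result["col2"].div(result["col3"])

# Replace the DatetimeIndex with plain HH:MM:SS strings, dropping the date.
result.index = result.index.strftime("%H:%M:%S")
print(result)
```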

Answer 2

import pandas as pd

df = pd.DataFrame(index=["00:00:00", "00:00:08","00:00:14","00:00:21","00:00:23","00:00:49"],data={"col1":["a","b","a","a", "c", "d"], "col2":[4,4,4,6,6,7], "col3":[2,17,2,2,3,50]})
df.index = pd.to_datetime(df.index, format='%H:%M:%S')
# current pandas no longer accepts resample(..., how='sum'); call .sum() on the resampler
df = df.loc[df['col1'] == 'a', ['col2', 'col3']].resample('15s').sum().cumsum()
df['output'] = df['col2'] / df['col3']
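
Note that this answer does not shift the index, so each bin is labelled with the start of its interval (00:00:00, 00:00:15) rather than the end shown in the question. One way to get end-of-interval labels without adding 15 seconds is to pass label="right" to resample. This is a sketch of that variation, not something the original answer does; the name result is introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    index=["00:00:00", "00:00:08", "00:00:14", "00:00:21", "00:00:23", "00:00:49"],
    data={
        "col1": ["a", "b", "a", "a", "c", "d"],
        "col2": [4, 4, 4, 6, 6, 7],
        "col3": [2, 17, 2, 2, 3, 50],
    },
)
df.index = pd.to_datetime(df.index, format="%H:%M:%S")

result = (
    df.loc[df["col1"] == "a", ["col2", "col3"]]
    # Bins include their left edge, exclude their right edge,
    # and are labelled by the right edge.
    .resample("15s", closed="left", label="right")
    .sum()
    .cumsum()
)
result["output"] = result["col2"] / result["col3"]
print(result)
```

With closed="left", a row stamped exactly 00:00:15 falls into the bin labelled 00:00:30, which matches the behaviour of the shifted version in Answer 1.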

2 answers

solution1
3 ACCEPTED 2019-03-21 10:05:16

solution2
1 2019-03-21 10:15:20
