I have the following code, which generates a time series with 1-minute steps, but I would like the time gaps to be filled. For example, 13:58 is missing from the output below. Every IP should be represented in the gap with zero values.
import pandas as pd

flow = {
    'date': ['2020-11-13 13:57:51', '2020-11-13 13:57:51', '2020-11-13 13:57:52',
             '2020-11-13 13:59:53', '2020-11-13 13:59:54'],
    'source_ip': ['192.168.1.1', '192.168.1.2', '10.0.0.1', '192.168.1.1', '192.168.1.1'],
    'destination_ip': ['10.0.0.1', '10.0.0.1', '192.168.1.1', '192.168.1.2', '192.168.1.2'],
    'source_bytes': [5, 1, 2, 3, 3],
}
df = pd.DataFrame(flow, columns=['date', 'source_ip', 'destination_ip', 'source_bytes'])
df['date'] = pd.to_datetime(df['date'])

# Melt source/destination into a single 'ip' column, then aggregate per ip and minute.
df2 = (df.melt(['date', 'source_bytes'], value_name='ip')
         .groupby(['ip', pd.Grouper(key='date', freq='1min')])['source_bytes']
         .agg(['sum', 'min', 'mean'])
         .unstack(fill_value=0)
         .stack()
         .reset_index()
)
print(df2)
            ip                date  sum  min      mean
0     10.0.0.1 2020-11-13 13:57:00    8    1  2.666667
1     10.0.0.1 2020-11-13 13:59:00    0    0  0.000000
2  192.168.1.1 2020-11-13 13:57:00    7    2  3.500000
3  192.168.1.1 2020-11-13 13:59:00    6    3  3.000000
4  192.168.1.2 2020-11-13 13:57:00    1    1  1.000000
5  192.168.1.2 2020-11-13 13:59:00    6    3  3.000000
How can this be achieved?
First, change unstack so that it unstacks the first level (ip), which leaves a DatetimeIndex on the rows. Then add DataFrame.asfreq to insert the missing minutes:
# flow is the dict defined in the question.
df = pd.DataFrame(flow, columns=['date', 'source_ip', 'destination_ip', 'source_bytes'])
df['date'] = pd.to_datetime(df['date'])

df2 = (df.melt(['date', 'source_bytes'], value_name='ip')
         .groupby(['ip', pd.Grouper(key='date', freq='1min')])['source_bytes']
         .agg(['sum', 'min', 'mean'])
         .unstack(0, fill_value=0)       # move ip to the columns; rows keep the DatetimeIndex
         .asfreq('1min', fill_value=0)   # add every missing minute, filled with 0
         .stack()                        # move ip back to the rows
         .reset_index()
)
print(df2)
                 date           ip  sum  min      mean
0 2020-11-13 13:57:00     10.0.0.1    8    1  2.666667
1 2020-11-13 13:57:00  192.168.1.1    7    2  3.500000
2 2020-11-13 13:57:00  192.168.1.2    1    1  1.000000
3 2020-11-13 13:58:00     10.0.0.1    0    0  0.000000
4 2020-11-13 13:58:00  192.168.1.1    0    0  0.000000
5 2020-11-13 13:58:00  192.168.1.2    0    0  0.000000
6 2020-11-13 13:59:00     10.0.0.1    0    0  0.000000
7 2020-11-13 13:59:00  192.168.1.1    6    3  3.000000
8 2020-11-13 13:59:00  192.168.1.2    6    3  3.000000
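To see in isolation what the asfreq step contributes, here is a minimal sketch on a toy frame. The names wide and filled and the two-column data are made up for illustration; only DataFrame.asfreq itself comes from the answer above. It assumes the rows are already indexed by a DatetimeIndex, which is why the answer unstacks ip first.

```python
import pandas as pd

# Hypothetical toy data: one column per ip, two minutes with a gap at 13:58.
idx = pd.to_datetime(['2020-11-13 13:57:00', '2020-11-13 13:59:00'])
wide = pd.DataFrame({'10.0.0.1': [8, 0], '192.168.1.1': [7, 6]}, index=idx)

# asfreq reindexes onto a regular 1-minute grid between the first and last
# timestamps; rows that did not exist before are filled with fill_value.
filled = wide.asfreq('1min', fill_value=0)
print(filled)
```

The result has three rows, with a new 13:58 row holding 0 for every ip. Note that asfreq only fills gaps between the first and last timestamps already present; it does not extend the range beyond them.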