Pandas dataframe resample and count events per day
I have a dataframe with a time index. I can resample the data to get, for example, the mean per day, but I would also like to get the counts per day. Here is a sample:
import datetime
import pandas as pd
import numpy as np
dates = pd.date_range(datetime.datetime(2012, 4, 5, 11, 0),
                      datetime.datetime(2012, 4, 7, 7, 0), freq='5H')
var1 = np.random.sample(dates.size) * 10.0
var2 = np.random.sample(dates.size) * 10.0
df = pd.DataFrame(data={'var1': var1, 'var2': var2}, index=dates)
df1 = df.resample('D').mean()
I'd also like to get a third column, 'count', holding the number of rows per day:
count
3
5
7
Thank you very much!
Use Resampler.agg and then flatten the MultiIndex in the columns:
df1 = df.resample('D').agg({'var1': 'mean','var2': ['mean', 'size']})
df1.columns = df1.columns.map('_'.join)
df1 = df1.rename(columns={'var2_size':'count'})
print(df1)
            var1_mean  var2_mean  count
2012-04-05   3.992166   4.968410      3
2012-04-06   6.843105   6.193568      5
2012-04-07   4.568436   3.135089      1
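To make the flattening step easier to follow, here is a minimal sketch on deterministic data (the values 0.0 to 8.0 are an assumption chosen for illustration, not the random sample from the question). It shows the two-level column labels that `agg` produces for a dict containing a list, and how `'_'.join` turns each label tuple into a single string.

```python
import pandas as pd

# Same time range as in the question; a Timedelta is used for the 5-hour step
# to avoid depending on a particular spelling of the hour alias.
idx = pd.date_range("2012-04-05 11:00", "2012-04-07 07:00",
                    freq=pd.Timedelta(hours=5))
values = [float(i) for i in range(len(idx))]  # illustrative data, not random
df = pd.DataFrame({"var1": values, "var2": values}, index=idx)

agg = df.resample("D").agg({"var1": "mean", "var2": ["mean", "size"]})

# The columns are a MultiIndex of (column, function) tuples.
multi_labels = list(agg.columns)

# Each tuple is joined into one string, e.g. ('var2', 'size') -> 'var2_size'.
flat = agg.copy()
flat.columns = flat.columns.map("_".join)
flat = flat.rename(columns={"var2_size": "count"})
print(multi_labels)
print(flat)
```

With this time range the rows fall 3, 5, and 1 per day, which matches the count column in the output above.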
Alternative solution with Grouper:
df1 = df.groupby(pd.Grouper(freq='D')).agg({'var1': 'mean','var2': ['mean', 'size']})
df1.columns = df1.columns.map('_'.join)
df1 = df1.rename(columns={'var2_size':'count'})
print(df1)
            var1_mean  var2_mean  count
2012-04-05   3.992166   4.968410      3
2012-04-06   6.843105   6.193568      5
2012-04-07   4.568436   3.135089      1
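As a possible variation on the Grouper approach (not part of the original answer), groupby objects also accept named aggregation, available since pandas 0.25, which names the output columns directly so no flattening or renaming is needed. A minimal sketch, again on illustrative deterministic data:

```python
import pandas as pd

idx = pd.date_range("2012-04-05 11:00", "2012-04-07 07:00",
                    freq=pd.Timedelta(hours=5))
values = [float(i) for i in range(len(idx))]  # illustrative data, not random
df = pd.DataFrame({"var1": values, "var2": values}, index=idx)

# Each keyword is an output column name; each value is (input column, function).
named = df.groupby(pd.Grouper(freq="D")).agg(
    var1_mean=("var1", "mean"),
    var2_mean=("var2", "mean"),
    count=("var2", "size"),
)
print(named)
```

The result has the same three flat columns as the dict-based version, at the cost of spelling out each output column explicitly.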
EDIT: Alternatively, compute the means and the row counts separately and join them:
r = df.resample('D')
df1 = r.mean().add_suffix('_mean').join(r.size().rename('count'))
print(df1)
            var1_mean  var2_mean  count
2012-04-05   7.840487   6.885030      3
2012-04-06   4.762477   5.091455      5
2012-04-07   2.702414   6.046200      1
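One detail worth keeping in mind when picking the counting function: `size` counts all rows in each bin, while `count` counts only non-missing values. The sample data in the question has no NaN, so both give the same numbers there. The tiny frame below is a made-up example to show the difference.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime(["2012-04-05 11:00", "2012-04-05 16:00", "2012-04-06 02:00"])
df = pd.DataFrame({"var1": [1.0, np.nan, 3.0]}, index=idx)  # hypothetical data

r = df.resample("D")
rows_per_day = r.size()               # counts rows, NaN included
non_null_per_day = r["var1"].count()  # counts non-NaN values only

print(rows_per_day.tolist())      # [2, 1]
print(non_null_per_day.tolist())  # [1, 1]
```

If the goal is the number of events per day regardless of missing values, `size` is the safer choice; `count` answers how many valid observations each day has.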