Calculate percentiles/quantiles for a timeseries with resample or groupby - pandas
I have a time series of hourly values and I am trying to derive some basic statistics on a weekly/monthly basis.
If we use the following abstract dataframe, where each column is a time series:
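import numpy as np
import pandas as pd
# 2400 hourly timestamps starting 2016-01-01 ('h' = hourly; older pandas also accepts 'H')
rng = pd.date_range('1/1/2016', periods=2400, freq='h')
df = pd.DataFrame(np.random.randn(len(rng), 4), columns=list('ABCD'), index=rng)
print(df[:5])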
returns:
A B C D
2016-01-01 00:00:00 1.521581 0.102335 0.796271 0.317046
2016-01-01 01:00:00 -0.369221 -0.179821 -1.340149 -0.347298
2016-01-01 02:00:00 0.750247 0.698579 0.440716 0.362159
2016-01-01 03:00:00 -0.465073 1.783315 1.165954 0.142973
2016-01-01 04:00:00 1.995332 1.230331 -0.135243 1.189431
I can call:
r = df.resample('W-MON')
and then use r.min(), r.mean(), and r.max(), which all work fine. For instance, print(r.min()[:5]) returns:
A B C D
2016-01-04 -2.676778 -2.450659 -2.401721 -3.209390
2016-01-11 -2.710066 -2.372032 -2.864887 -2.387026
2016-01-18 -2.984805 -2.527528 -3.414003 -2.616434
2016-01-25 -2.625299 -2.947864 -2.642569 -2.262959
2016-02-01 -2.100062 -2.568878 -3.008864 -2.315566
However, if I try to calculate percentiles using the quantile method, i.e. r.quantile(0.95), I get only one value for each column rather than one per resampling period:
A 0.090502
B 0.136594
C 0.058720
D 0.125131
Is there a way to combine grouping / resampling with quantiles?
Thanks
I think you can use Resampler.apply, because Resampler.quantile is not implemented yet (as of the pandas version used here):
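import numpy as np
import pandas as pd
np.random.seed(1234)
rng = pd.date_range('1/1/2016', periods=2400, freq='h')
df = pd.DataFrame(np.random.randn(len(rng), 4), columns=list('ABCD'), index=rng)
#print (df)
# weekly bins ending on Monday; the lambda computes the 95th percentile of each bin
r = df.resample('W-MON')
print(r.apply(lambda x: x.quantile(0.95)))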
A B C D
2016-01-04 1.540236 1.925962 1.439512 1.606239
2016-01-11 1.727545 1.520913 1.596961 1.652290
2016-01-18 1.595396 1.669630 1.763577 1.933235
2016-01-25 1.500270 1.604542 1.648790 1.778329
2016-02-01 1.608245 1.791356 1.548159 1.786005
2016-02-08 1.531625 1.408163 1.300414 1.877863
2016-02-15 1.818673 1.613632 1.498623 1.524481
2016-02-22 1.557928 1.566523 1.974486 1.727555
2016-02-29 1.530757 1.529591 1.869422 1.433620
2016-03-07 1.651609 1.452537 1.585765 1.414499
2016-03-14 1.311807 1.717968 1.410036 1.903715
2016-03-21 1.529065 1.693964 1.784480 1.708263
2016-03-28 1.636786 1.405565 1.809235 1.802555
2016-04-04 1.768068 1.564308 1.552492 1.801424
2016-04-11 1.824578 1.794437 1.649749 1.564300
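For readers on a more recent pandas: as far as I know, Resampler.quantile was added in later releases (0.24 onwards, I believe), so the apply workaround may no longer be needed. The sketch below assumes such a version is installed and simply checks that the direct call agrees with the apply approach. The variable names via_apply and via_quantile are only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(1234)
rng = pd.date_range('2016-01-01', periods=2400, freq='h')
df = pd.DataFrame(np.random.randn(len(rng), 4), columns=list('ABCD'), index=rng)

r = df.resample('W-MON')

# Workaround from the answer: compute the quantile inside apply
via_apply = r.apply(lambda x: x.quantile(0.95))

# Direct call; assumes a pandas version that implements Resampler.quantile
via_quantile = r.quantile(0.95)

print(via_quantile.head())
# Both approaches should give the same weekly 95th percentiles
print(np.allclose(via_apply.values, via_quantile.values))
```

If the direct call still returns a single value per column, the installed pandas is probably too old for it, and the apply version remains the safer choice.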
With groupby, it is possible to use DataFrameGroupBy.quantile:
# pd.TimeGrouper was removed in later pandas releases; pd.Grouper(freq=...) is its replacement
g = df.groupby(pd.Grouper(freq='W-MON'))
print(g.quantile(0.95))
A B C D
2016-01-04 1.540236 1.925962 1.439512 1.606239
2016-01-11 1.727545 1.520913 1.596961 1.652290
2016-01-18 1.595396 1.669630 1.763577 1.933235
2016-01-25 1.500270 1.604542 1.648790 1.778329
2016-02-01 1.608245 1.791356 1.548159 1.786005
2016-02-08 1.531625 1.408163 1.300414 1.877863
2016-02-15 1.818673 1.613632 1.498623 1.524481
2016-02-22 1.557928 1.566523 1.974486 1.727555
2016-02-29 1.530757 1.529591 1.869422 1.433620
2016-03-07 1.651609 1.452537 1.585765 1.414499
2016-03-14 1.311807 1.717968 1.410036 1.903715
2016-03-21 1.529065 1.693964 1.784480 1.708263
2016-03-28 1.636786 1.405565 1.809235 1.802555
2016-04-04 1.768068 1.564308 1.552492 1.801424
2016-04-11 1.824578 1.794437 1.649749 1.564300
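Since the question also asks about monthly statistics, here is a minimal sketch of the same groupby idea with month-start bins and several quantiles at once. It assumes pd.Grouper with freq='MS' and that DataFrameGroupBy.quantile accepts a list of quantiles (which it does in the versions I have used). The names monthly and p95 are only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(1234)
rng = pd.date_range('2016-01-01', periods=2400, freq='h')
df = pd.DataFrame(np.random.randn(len(rng), 4), columns=list('ABCD'), index=rng)

# Group by calendar month, labelled by the first day of each month
g = df.groupby(pd.Grouper(freq='MS'))

# Passing a list yields a MultiIndex of (month, quantile) rows
monthly = g.quantile([0.05, 0.5, 0.95])
print(monthly.head(6))

# Pull out a single quantile level as a plain month-indexed frame
p95 = monthly.xs(0.95, level=1)
print(p95)
```

The MultiIndex result keeps all requested quantiles together; xs is one way to slice out a single level when a flat table per quantile is easier to work with.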