Calculate percentiles/quantiles for a timeseries with resample or groupby - pandas

I have a time series of hourly values and I am trying to derive some basic statistics on a weekly/monthly basis.

Take the following abstract dataframe, where each column is a time series:

import numpy as np
import pandas as pd
rng = pd.date_range('1/1/2016', periods=2400, freq='H')
df = pd.DataFrame(np.random.randn(len(rng), 4), columns=list('ABCD'), index=rng)

print(df[:5]) returns:

                            A         B         C         D
2016-01-01 00:00:00  1.521581  0.102335  0.796271  0.317046
2016-01-01 01:00:00 -0.369221 -0.179821 -1.340149 -0.347298
2016-01-01 02:00:00  0.750247  0.698579  0.440716  0.362159
2016-01-01 03:00:00 -0.465073  1.783315  1.165954  0.142973
2016-01-01 04:00:00  1.995332  1.230331 -0.135243  1.189431

I can call:

r = df.resample('W-MON')

and then use r.min(), r.mean() and r.max(), which all work fine. For instance, print(r.min()[:5]) returns:

                   A         B         C         D
2016-01-04 -2.676778 -2.450659 -2.401721 -3.209390
2016-01-11 -2.710066 -2.372032 -2.864887 -2.387026
2016-01-18 -2.984805 -2.527528 -3.414003 -2.616434
2016-01-25 -2.625299 -2.947864 -2.642569 -2.262959
2016-02-01 -2.100062 -2.568878 -3.008864 -2.315566

However, if I try to calculate percentiles using the quantile method, i.e. r.quantile(0.95), I get a single value for each column instead of one value per week:

A    0.090502
B    0.136594
C    0.058720
D    0.125131

Is there a way to combine the grouping/resampling with quantiles as arguments?

Thanks

I think you can use Resampler.apply, because Resampler.quantile is not implemented yet:

import numpy as np
import pandas as pd
np.random.seed(1234)
rng = pd.date_range('1/1/2016', periods=2400, freq='H')
df = pd.DataFrame(np.random.randn(len(rng), 4), columns=list('ABCD'), index=rng)
#print (df)

r = df.resample('W-MON')

print (r.apply(lambda x: x.quantile(0.95)))
                   A         B         C         D
2016-01-04  1.540236  1.925962  1.439512  1.606239
2016-01-11  1.727545  1.520913  1.596961  1.652290
2016-01-18  1.595396  1.669630  1.763577  1.933235
2016-01-25  1.500270  1.604542  1.648790  1.778329
2016-02-01  1.608245  1.791356  1.548159  1.786005
2016-02-08  1.531625  1.408163  1.300414  1.877863
2016-02-15  1.818673  1.613632  1.498623  1.524481
2016-02-22  1.557928  1.566523  1.974486  1.727555
2016-02-29  1.530757  1.529591  1.869422  1.433620
2016-03-07  1.651609  1.452537  1.585765  1.414499
2016-03-14  1.311807  1.717968  1.410036  1.903715
2016-03-21  1.529065  1.693964  1.784480  1.708263
2016-03-28  1.636786  1.405565  1.809235  1.802555
2016-04-04  1.768068  1.564308  1.552492  1.801424
2016-04-11  1.824578  1.794437  1.649749  1.564300
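
For readers on a newer pandas: as far as I know, Resampler.quantile has been available since pandas 0.24, so the lambda may no longer be needed. The following is only a minimal sketch under that assumption. The names via_apply and via_quantile are purely illustrative, and the hourly frequency is written as '60min' so the snippet does not depend on which spelling of the hourly alias the installed version accepts.

```python
import numpy as np
import pandas as pd

np.random.seed(1234)
# 2400 hourly timestamps starting 2016-01-01 ("60min" is one hour)
rng = pd.date_range("2016-01-01", periods=2400, freq="60min")
df = pd.DataFrame(np.random.randn(len(rng), 4), columns=list("ABCD"), index=rng)

# Weekly bins ending on Monday
r = df.resample("W-MON")

# The approach from the answer: apply a per-group quantile
via_apply = r.apply(lambda x: x.quantile(0.95))

# The direct call (assumes a pandas version that implements Resampler.quantile)
via_quantile = r.quantile(0.95)

print(via_quantile.head())
```

If both calls succeed, the two frames should hold the same weekly 95th percentiles, which is an easy way to check what the installed version does.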

With groupby it is possible to use DataFrameGroupBy.quantile:

# pd.TimeGrouper has been removed from pandas; pd.Grouper(freq=...) replaces it
g = df.groupby(pd.Grouper(freq='W-MON'))

print (g.quantile(0.95))
                   A         B         C         D
2016-01-04  1.540236  1.925962  1.439512  1.606239
2016-01-11  1.727545  1.520913  1.596961  1.652290
2016-01-18  1.595396  1.669630  1.763577  1.933235
2016-01-25  1.500270  1.604542  1.648790  1.778329
2016-02-01  1.608245  1.791356  1.548159  1.786005
2016-02-08  1.531625  1.408163  1.300414  1.877863
2016-02-15  1.818673  1.613632  1.498623  1.524481
2016-02-22  1.557928  1.566523  1.974486  1.727555
2016-02-29  1.530757  1.529591  1.869422  1.433620
2016-03-07  1.651609  1.452537  1.585765  1.414499
2016-03-14  1.311807  1.717968  1.410036  1.903715
2016-03-21  1.529065  1.693964  1.784480  1.708263
2016-03-28  1.636786  1.405565  1.809235  1.802555
2016-04-04  1.768068  1.564308  1.552492  1.801424
2016-04-11  1.824578  1.794437  1.649749  1.564300
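
Since the question also mentions monthly statistics, the groupby route can be extended to several quantiles at once and to calendar months. This is a sketch rather than a definitive recipe: it assumes a reasonably recent pandas in which DataFrameGroupBy.quantile accepts a list of quantiles, and the names weekly_q and monthly_stats are made up for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(1234)
# 2400 hourly timestamps starting 2016-01-01 ("60min" is one hour)
rng = pd.date_range("2016-01-01", periods=2400, freq="60min")
df = pd.DataFrame(np.random.randn(len(rng), 4), columns=list("ABCD"), index=rng)

# Weekly groups ending on Monday, the same bins as resample("W-MON")
weekly = df.groupby(pd.Grouper(freq="W-MON"))

# A list of quantiles yields one row per (week, quantile) pair
weekly_q = weekly.quantile([0.05, 0.50, 0.95])

# Monthly groups keyed by calendar period instead of a month-end offset
monthly = df.groupby(df.index.to_period("M"))

# Basic statistics plus the 95th percentile for a single column
monthly_stats = monthly["A"].agg(["min", "mean", "max"])
monthly_stats["q95"] = monthly["A"].quantile(0.95)

print(weekly_q.head(6))
print(monthly_stats)
```

Grouping on index.to_period("M") is one way to get monthly bins without relying on a month-end offset alias; resample or pd.Grouper with a monthly frequency would be the more direct equivalent of the weekly example.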
