Apply qcut to rolling analysis

I would like to apply pandas qcut to a rolling window, but I'm not sure how to go about it. The idea is to take the last 20 days, find the values that fall in the upper quartile, and compute the average of those values, returning one average per window of the rolled time series.

So if I have:

import pandas as pd

s = pd.Series([5, 6, 10, 12, 13, 13, 20, 21, 22])
s.rolling(2, 2).apply(lambda x: pd.qcut(x, 5))

This results in:

0   NaN
1   NaN
2   NaN
3   NaN
4   NaN
5   NaN
6   NaN
7   NaN
8   NaN
dtype: float64

How do I get the qcut intervals for each rolling window? Thanks. Note that the example uses a 2-day rolling window just to keep things simple.

I think you can do it by selecting, inside your apply, the values of x that fall in the highest quartile. With a rolling window of 6 and q=4, you can do:

print(s.rolling(6, 6).apply(lambda x: x[pd.qcut(x, q=4, labels=[1, 2, 3, 4]) == 4].mean()))
0     NaN
1     NaN
2     NaN
3     NaN
4     NaN
5    13.0
6    20.0
7    20.5
8    21.5
dtype: float64

I use the labels parameter so that the highest quartile (labelled 4 here) can be selected by name. Its interval boundaries differ from window to window, so I'm not sure how else to select it.
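One possible way to do it differently, offered only as a sketch: skip qcut and compare each window against its own 75th percentile. The helper name top_quartile_mean is hypothetical, and the code assumes that "upper quartile" means values strictly above the 0.75 quantile, which corresponds to qcut's right-closed top bin. It also assumes apply is called with raw=False so that each window arrives as a Series.

```python
import pandas as pd

s = pd.Series([5, 6, 10, 12, 13, 13, 20, 21, 22])


def top_quartile_mean(window: pd.Series) -> float:
    """Mean of the values strictly above the window's 75th percentile.

    Hypothetical helper: mirrors selecting qcut's top bin (q75, max],
    but returns NaN instead of raising when no value exceeds q75.
    """
    threshold = window.quantile(0.75)
    return window[window > threshold].mean()


# raw=False passes each window to the function as a Series.
result = s.rolling(6, min_periods=6).apply(top_quartile_mean, raw=False)
print(result)
```

On this series the result matches the qcut-based output above. The main difference shows up when a window has repeated values that make the quantile edges coincide: qcut raises a ValueError about non-unique bin edges unless duplicates='drop' is passed, whereas this version simply yields NaN when no value lies above the threshold.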
