
Non-overlapping rolling windows in pandas dataframes

I am familiar with the pandas rolling window functions, but they always have a step size of 1. I want to apply a moving aggregate function in pandas where the windows don't overlap.

In this DataFrame (posted as an image in the original, not reproduced here):

df.rolling(2).min()

will yield:

N/A 519 566 727 1099 12385

But I want a fixed window with a step size of 2, so that it yields:

519 727 12385

Because with a fixed window, it should advance by the size of the window instead.

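The rolling function had no such built-in argument when this was written. You can compute the usual rolling function and then keep only every nth row, starting at row n-1 (n=2 in your case). Since pandas 1.5, rolling also accepts a step argument, but it is equivalent to slicing the result with [::step], so it starts at the first row rather than at the first full window.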

df.rolling(n).min()[n-1::n]
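
As a minimal sketch of this slicing approach: the original data was posted as an image, so the values and the column name `value` below are hypothetical, chosen only so that the rolling minimum matches the output quoted in the question. The sketch uses `.iloc` to make the positional slice explicit, which should behave the same as the plain slice above.

```python
import pandas as pd

# Hypothetical data: values picked so rolling(2).min() reproduces
# the output quoted in the question (the real data is not available).
df = pd.DataFrame({"value": [519, 566, 727, 1099, 12385, 20000]})
n = 2

# Ordinary rolling minimum: one result per row, first n-1 rows are NaN.
rolled = df.rolling(n).min()

# Keep rows n-1, 2n-1, 3n-1, ...: exactly the windows that do not overlap.
non_overlapping = rolled.iloc[n - 1::n]
print(non_overlapping)
```

The result keeps the index labels of the last row in each window (1, 3, 5 here), and the values are floats because the rolling result contains NaN.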

As you mentioned in your comment, this can perform many redundant computations whose results are discarded (especially if n is large). Instead, you could use the following code, which partitions (groups) the data into bins of size n:

df.groupby(df.index // n).min()

I did not check whether it's indeed more efficient, but I believe it should be.
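
A minimal sketch of the grouping approach, again on hypothetical data with a made-up column name `value`. Note that `df.index // n` relies on the default integer index (0, 1, 2, ...). The variant below bins by row position with `numpy.arange` instead, which is an assumption added here so that the same idea also works for other index types.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the real DataFrame from the question is not available.
df = pd.DataFrame({"value": [519, 566, 727, 1099, 12385, 20000]})
n = 2

# Rows 0..n-1 fall into bin 0, rows n..2n-1 into bin 1, and so on.
bins = np.arange(len(df)) // n

# One aggregate per bin; each row is visited exactly once.
grouped = df.groupby(bins).min()
print(grouped)
```

Unlike the slicing approach, the result is indexed by bin number (0, 1, 2, ...), and a trailing group with fewer than n rows is still aggregated rather than dropped.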
