Non-overlapping rolling windows in pandas groupby

Question

I want to create non-overlapping rolling (sliding) windows in a pandas groupby.

import pandas as pd
df1 = pd.DataFrame( {'a1':['A','A','B','B','B','B','B','B'],'a2':[1,1,1,2,2,2,2,2], 'b':[1,2,5,5,5,4,6,2]})

For an overlapping rolling window, I can do this:

df1.groupby(['a1','a2']).rolling(2).mean()

But is there any way to make it non-overlapping?

The output should look like this:

pd.DataFrame({'a1':['A','B','B','B','B'],'a2':[1,1,2,2,2],'b':[1.5,float('nan'),5,5,float('nan')]})

Explanation

When a1 is A and a2 is 1, the values of b are 1 and 2. Averaging them gives 1.5.
When a1 is B and a2 is 1, the only value of b is 5. As the number of values is less than the length of the sliding window, the result is NaN.
When a1 is B and a2 is 2, the values of b are 5, 5, 4, 6, 2. As the sliding window is 2, the means are (5+5)/2=5 and (4+6)/2=5. The last value is NaN because the remaining chunk is shorter than the sliding window.
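
The expected values for one group can be checked by hand in plain Python. This is only a sketch of the arithmetic described above; `chunk_means` is a hypothetical helper written for illustration, not a pandas function.

```python
import math

def chunk_means(values, window=2):
    """Mean of each consecutive, non-overlapping chunk; NaN for a short tail."""
    out = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        # Only a full chunk gets a mean; an incomplete one becomes NaN
        out.append(sum(chunk) / window if len(chunk) == window else math.nan)
    return out

# Values of b for a1 == 'B', a2 == 2
means = chunk_means([5, 5, 4, 6, 2])
print(means)
```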

Answer 1

Well, one approach (not very elegant) is to do:

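import numpy as np

def non_overlapping_mean(x, window=2):
    # Integer division labels consecutive chunks of `window` rows: 0, 0, 1, 1, ...
    chunks = x.groupby(np.arange(len(x)) // window)
    # A chunk shorter than the window (the tail) yields NaN
    return chunks.apply(lambda c: np.nan if len(c) < window else c.mean())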


res = df1.groupby(['a1', 'a2'])['b'].apply(non_overlapping_mean).droplevel(-1).reset_index()
print(res)

Output

  a1  a2    b
0  A   1  1.5
1  B   1  NaN
2  B   2  5.0
3  B   2  5.0
4  B   2  NaN

The main idea is to group the rows into consecutive chunks, which is done here:

x.groupby(np.arange(len(x)) // window)
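
To see what that grouping key looks like, here is a small sketch on a standalone Series holding the five b values of one group. The variable names (`labels`, `sizes`, `means`) are illustrative assumptions, and masking with `where` is a variation on the answer's `apply`/`lambda`, not the code from the answer itself.

```python
import numpy as np
import pandas as pd

window = 2
x = pd.Series([5, 5, 4, 6, 2])

# Integer division turns positions 0..4 into chunk labels
labels = np.arange(len(x)) // window
print(labels)  # [0 0 1 1 2]

grouped = x.groupby(labels)
sizes = grouped.size()
# Keep the mean only where the chunk is full; shorter chunks become NaN
means = grouped.mean().where(sizes == window)
print(means)
```

The last label (2) covers a single row, so its mean is masked out, which matches the NaN in the expected output.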

Answer 1: score 2, accepted, 2020-12-27 19:25:00
