How to get the average of the next n rows for every row in Python
How can I compute the average of the next n rows for every row in a data frame? I have a data frame that looks like this:
Object|Value
A|1
B|2
C|3
D|4
E|5
F|6
G|7
H|8
I|9
J|10
K|11
L|12
M|13
and I want to average the next 3 rows for every row, so the output would look like:
Object|Value|Average_3
A|1|3
B|2|4
C|3|5
D|4|6
... and so on
I was thinking of doing something like:
df['average_3']=df['value'].apply(lambda x: x.shift(1)+x.shift(2)+x.shift(3))
However, n will not always be the same, so I was wondering how I can apply a for loop inside the lambda function, and also how this would handle the last n rows, since they won't have all the future rows to average over. Sorry for the weird formatting.
Use rolling with shift.
Please note that the window size is 3 and the minimum number of observations is also 3. The initial .shift(-1) excludes the current row, so each window starts at the next row. I then used .shift(-2) to move all the values up two spots (because your window is 3; if your window were 4 it would be .shift(-3)). The last three values are null because rows 10 to 12 have fewer than three observations after them.
import pandas as pd
from io import StringIO
# sample data
s = """Object|Value
A|1
B|2
C|3
D|4
E|5
F|6
G|7
H|8
I|9
J|10
K|11
L|12
M|13"""
df = pd.read_csv(StringIO(s), sep='|')
# use rolling with shift on the numeric column only (the string column 'Object' cannot be averaged)
df['Average_3'] = df['Value'].shift(-1).rolling(3, min_periods=3).mean().shift(-2)
Output:
Object Value Average_3
0 A 1 3.0
1 B 2 4.0
2 C 3 5.0
3 D 4 6.0
4 E 5 7.0
5 F 6 8.0
6 G 7 9.0
7 H 8 10.0
8 I 9 11.0
9 J 10 12.0
10 K 11 NaN
11 L 12 NaN
12 M 13 NaN
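On the question of the last n rows: in the output above they are NaN because the minimum number of observations equals the window size. If partial averages over whatever rows remain are acceptable instead, one possible sketch (not part of the original answer) uses pandas' FixedForwardWindowIndexer with min_periods=1. The column name Average_3_partial and the rebuilt sample frame are illustrative assumptions.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Rebuild the sample data so the sketch is self-contained.
df = pd.DataFrame({"Object": list("ABCDEFGHIJKLM"), "Value": range(1, 14)})

n = 3  # number of following rows to average (assumed for illustration)

# A forward-looking window covers the current position and the n - 1 after it.
indexer = FixedForwardWindowIndexer(window_size=n)

# shift(-1) drops the current row from the window, so the window spans
# the next n rows. min_periods=1 allows shorter windows near the end.
df["Average_3_partial"] = (
    df["Value"].shift(-1).rolling(window=indexer, min_periods=1).mean()
)

print(df.tail(4))
```

With the sample data, only the final row stays NaN, since no rows follow it; rows K and L average the two and one remaining values respectively.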
Based on your comment, are you looking to create a function?
def rolling_func(df, window, min_window):
    # mean of the next `window` rows for each row, stored in a new column
    df[f'Average_{window}'] = df['Value'].shift(-1).rolling(window, min_periods=min_window).mean().shift(-(window - 1))
    return df
rolling_func(df, 4, 4)
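As a sanity check on the shift arithmetic, the same quantity can be computed in plain Python without pandas. This is only an illustrative sketch: next_n_mean is a hypothetical helper name, not something from the answer, and it returns None wherever fewer than n rows follow.

```python
def next_n_mean(values, n):
    """Mean of the n values following each position; None if fewer than n remain."""
    result = []
    for i in range(len(values)):
        window = values[i + 1 : i + 1 + n]  # the next n rows, excluding row i
        result.append(sum(window) / n if len(window) == n else None)
    return result


values = list(range(1, 14))  # same Value column as the sample data

print(next_n_mean(values, 3))  # starts 3.0, 4.0, 5.0, ... and ends with three None
print(next_n_mean(values, 4))  # starts 3.5, 4.5, ... and ends with four None
```

Comparing this list against the Average_3 or Average_4 column is a quick way to confirm that the combination of .shift(-1) and .shift(-(window - 1)) lines up with the intended rows.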