How can I compute the average of the next n rows for every row in a data frame? I have a data frame that looks like this:
Object|Value
A|1
B|2
C|3
D|4
E|5
F|6
G|7
H|8
I|9
J|10
K|11
L|12
M|13
and I want to average the next 3 rows for every row, so the output would look like:
Object|Value|Average_3
A|1|3
B|2|4
C|3|5
D|4|6
... and so on
I was thinking of doing something like
df['average_3']=df['value'].apply(lambda x: x.shift(1)+x.shift(2)+x.shift(3))
However, the number of rows n will not always be the same, so I was wondering how I can apply a for loop inside the lambda function. Also, how will this handle the last n rows, since they won't have all the future rows to average over? Sorry for the weird formatting.
Use rolling with shift.
Note that the window size is 3 and the minimum number of observations is also 3. The first .shift(-1) excludes the current row, and the final .shift(-2) moves all the rolling means up two spots because the window is 3. In general the final shift is -(window - 1), so a window of 4 would need .shift(-3). The last three values are null because rows 10 to 12 do not have three observations after them.
import pandas as pd
from io import StringIO
# sample data
s = """Object|Value
A|1
B|2
C|3
D|4
E|5
F|6
G|7
H|8
I|9
J|10
K|11
L|12
M|13"""
df = pd.read_csv(StringIO(s), sep='|')
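# use rolling with shift on the numeric column only
# (rolling over the whole frame fails on the non-numeric Object column in recent pandas versions)
df['Average_3'] = df['Value'].shift(-1).rolling(3, 3).mean().shift(-2)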
Output:
   Object  Value  Average_3
0       A      1        3.0
1       B      2        4.0
2       C      3        5.0
3       D      4        6.0
4       E      5        7.0
5       F      6        8.0
6       G      7        9.0
7       H      8       10.0
8       I      9       11.0
9       J     10       12.0
10      K     11        NaN
11      L     12        NaN
12      M     13        NaN
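If you would rather get a partial average for the trailing rows instead of NaN, one option (not part of the answer above, just a sketch) is to reverse the column, take an ordinary rolling mean with min_periods=1, and reverse back. The column name Average_3_partial is only an illustrative choice. Note that the last row still has no following rows, so it stays NaN.

```python
import pandas as pd

# same sample data as in the question
df = pd.DataFrame({"Object": list("ABCDEFGHIJKLM"), "Value": range(1, 14)})
n = 3

# Reversing the column turns a trailing window into a forward-looking one.
# min_periods=1 lets the window shrink near the end instead of producing NaN.
forward_incl = df["Value"][::-1].rolling(n, min_periods=1).mean()[::-1]

# The forward window still includes the current row, so shift up by one
# to keep only the rows that come after it.
df[f"Average_{n}_partial"] = forward_incl.shift(-1)

print(df.tail(4))
```

Here row K averages the two remaining rows (12 and 13), row L gets the single remaining value, and row M is NaN.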
Based on your comment, are you looking to create a function?
def rolling_func(df, window, min_window):
    # mean of the next `window` rows, excluding the current row
    df[f'Average_{window}'] = df['Value'].shift(-1).rolling(window, min_window).mean().shift(-(window - 1))
    return df

rolling_func(df, 4, 4)
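As an alternative sketch for a variable n, pandas also provides a forward-looking window indexer, which avoids the second shift. This assumes pandas 1.1 or later, where pandas.api.indexers.FixedForwardWindowIndexer is available. The helper name next_n_mean and its parameters are hypothetical, not from the answer above.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer


def next_n_mean(values, n, min_periods=None):
    """Mean of the n rows after each row (hypothetical helper).

    The forward window covers the current row and the n - 1 rows after it,
    so the result is shifted up by one to exclude the current row.
    """
    if min_periods is None:
        min_periods = n  # require a full window by default
    indexer = FixedForwardWindowIndexer(window_size=n)
    return values.rolling(window=indexer, min_periods=min_periods).mean().shift(-1)


df = pd.DataFrame({"Object": list("ABCDEFGHIJKLM"), "Value": range(1, 14)})

# one column per window size, no lambda or explicit row loop needed
for n in (3, 4):
    df[f"Average_{n}"] = next_n_mean(df["Value"], n)

print(df)
```

Because the helper returns a Series rather than modifying the frame, it can be called in a loop for as many window sizes as needed. Passing min_periods=1 would give partial averages for the trailing rows.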