
How to get the average for the next n rows for every row in Python

How can I compute the average of the next n rows for every row in a data frame? I have a data frame that looks like this:

Object|Value
    A|1
    B|2
    C|3
    D|4
    E|5
    F|6
    G|7
    H|8
    I|9
    J|10
    K|11
    L|12
    M|13

I want to average the next 3 rows for every row, so the output would look like this:

Object|Value|Average_3
    A|1|3
    B|2|4
    C|3|5
    D|4|6

... and so on

I was thinking of doing something like

df['average_3']=df['value'].apply(lambda x: x.shift(1)+x.shift(2)+x.shift(3))

However, the number of rows n will not always be the same, so I was wondering how I can apply a for loop inside the lambda function. Also, how would this handle the last n rows, since they won't have all the future rows to average over? Sorry for the weird formatting.

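Use rolling with shift. Note that the window size is 3 and the minimum number of observations is also 3. The first .shift(-1) excludes the current row, so each window covers only the rows that follow it. Rolling windows look backward, so the final .shift(-2) moves the results up two rows to line them up with the row the window belongs to (for a window of 3 the shift is -2; for a window of 4 it would be .shift(-3)). The last three values are null because rows 10 to 12 have fewer than three rows after them.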

import pandas as pd
from io import StringIO

# sample data
s = """Object|Value
A|1
B|2
C|3
D|4
E|5
F|6
G|7
H|8
I|9
J|10
K|11
L|12
M|13"""
df = pd.read_csv(StringIO(s), sep='|')
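# use rolling with shift; select the numeric column so the string
# column 'Object' is not passed to rolling().mean()
df['Average_3'] = df['Value'].shift(-1).rolling(3, 3).mean().shift(-2)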

Output:

   Object  Value  Average_3
0       A      1        3.0
1       B      2        4.0
2       C      3        5.0
3       D      4        6.0
4       E      5        7.0
5       F      6        8.0
6       G      7        9.0
7       H      8       10.0
8       I      9       11.0
9       J     10       12.0
10      K     11        NaN
11      L     12        NaN
12      M     13        NaN
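The question also asks what happens to the last n rows. As a hedged alternative to the trailing shift, pandas provides a forward-looking window indexer, FixedForwardWindowIndexer in pandas.api.indexers. The sketch below assumes a pandas version that ships it; the column name Average_3_partial is only illustrative and not part of the original answer. It shows both choices: require all n following rows, or average whatever following rows exist.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# same sample data as in the question
df = pd.DataFrame({"Object": list("ABCDEFGHIJKLM"), "Value": range(1, 14)})

n = 3  # number of following rows to average
indexer = FixedForwardWindowIndexer(window_size=n)

# shift(-1) drops the current row, so each forward window
# covers the next n rows only
nxt = df["Value"].shift(-1)

# require all n following rows: rows near the end become NaN
df["Average_3"] = nxt.rolling(indexer, min_periods=n).mean()

# accept partial windows: average whatever following rows exist
df["Average_3_partial"] = nxt.rolling(indexer, min_periods=1).mean()

print(df.tail(4))
```

With min_periods=1 only the very last row stays empty, since nothing follows it. In the shift-based version, lowering the minimum observations does not have the same effect, because the final shift always leaves the last window-1 rows empty.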

Update

Based on your comment, are you looking to create a function?

def rolling_func(df, window, min_window):
    # average of the next `window` rows of 'Value', excluding the current row
    df[f'Average_{window}'] = df['Value'].shift(-1).rolling(window, min_window).mean().shift(-(window - 1))
    return df

rolling_func(df, 4, 4)
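Since the question mentions a for loop and a varying n, one way to gain confidence in the rolling/shift expression is to compare it against an explicit loop for several window sizes. This is only a sanity-check sketch: next_n_mean_loop is a hypothetical helper written for this comparison, and it assumes that rows with fewer than n following rows should be NaN.

```python
import pandas as pd

def next_n_mean_loop(values, n):
    """Average of the n rows after each position; NaN if fewer than n remain."""
    out = []
    for i in range(len(values)):
        nxt = values.iloc[i + 1 : i + 1 + n]  # the rows after position i
        out.append(nxt.mean() if len(nxt) == n else float("nan"))
    return pd.Series(out, index=values.index)

df = pd.DataFrame({"Object": list("ABCDEFGHIJKLM"), "Value": range(1, 14)})

for n in (2, 3, 4):
    # vectorised version: drop the current row, roll, then realign
    fast = df["Value"].shift(-1).rolling(n, n).mean().shift(-(n - 1))
    slow = next_n_mean_loop(df["Value"], n)
    # raises AssertionError if the two approaches disagree
    pd.testing.assert_series_equal(fast, slow, check_names=False)
    df[f"Average_{n}"] = fast

print(df)
```

The loop is easier to read but runs in Python for every row, so the vectorised expression is usually the better choice on large frames.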
