Pandas Rolling Mean Depending on Row Value

Question

I have the following dataframe in pandas:

Visitor    Home     Visitor_rtg    Home_rtg
 Sixers    Celtics     83.4         100.7
 Thunder   Warriors    97.7         105.6
 Bucks     Hornets     110.2        109.2
 Pelicans  Sixers      88.1         89.2
 Hornets   Wizards     90.1         102.3

What I would like to do is keep the rolling mean of rtg for each team, regardless of if they are home or visitor. So for example in the fourth row the sixers home_rtg would be 86.3 because (83.4+89.2) / 2 . And the final row the Hornets would be 99.65 because (109.2+90.1)/2 I have found df.rolling() but cant figure out how to get this to only calculate per team. The ordering also matters because I want to track the rating progression throughout the season.

Expected output:

Visitor    Home     Visitor_rtg    Home_rtg
 Sixers    Celtics     83.4         100.7
 Thunder   Warriors    97.7         105.6
 Bucks     Hornets     110.2        109.2
 Pelicans  Sixers      88.1         *86.3
 Hornets   Wizards     *99.65        102.3

Answer 1

`stack` + `concat` + `groupby` + `mean`

IIUC, you can stack your dataframe, and groupby + mean the values. No need for rolling

f = df[['Visitor', 'Home']].stack().reset_index(drop=True)
s = df[['Visitor_rtg', 'Home_rtg']].stack().reset_index(drop=True)
pd.concat([f, s], axis=1).groupby(0)[1].mean()

Outputs

Bucks       110.2
Celtics     100.7
Hornet      109.2
Pelicans     88.1
Sixers       86.3
Thunder      97.7
Warriors    105.6
Name: 1, dtype: float64

Edited answer

To get your desired output, you can use expanding (not rolling) to get the mean and transform to broadcast back to each original row. In the end, just reshape underlying values of your dataframe to get final output

df2 = pd.concat([f, s], axis=1)
df2[1] = df2.groupby(0).transform(lambda s: s.expanding().mean())
pd.DataFrame(df2.values.reshape(-1,4), columns=df.columns)


    Visitor     Home    Visitor_rtg Home_rtg
0   Sixers      83.4    Celtics     100.7
1   Thunder     97.7    Warriors    105.6
2   Bucks       110.2   Hornets     109.2
3   Pelicans    88.1    Sixers      86.3
4   Hornets     99.65   Wizards     102.3

Answer 2

A not particularly pleasant solution:

df = pd.DataFrame([['Sixers', 'Celtics', 83.4, 100.7], 
                   ['Pelicans', 'Sixers', 88.1, 89.2]], 
                  columns=['Visitor', 'Home', 'Visitor_rtg', 'Home_rtg'])
melted = df.melt(['Visitor', 'Home'])
pd.concat([melted.loc[melted['variable'] == 'Visitor_rtg', ['Visitor', 'value']].rename(columns={'Visitor': 'Team'}), 
            melted.loc[melted['variable'] == 'Home_rtg', ['Home', 'value']].rename(columns={'Home': 'Team'})], 
           axis=0, 
           ignore_index=True
      ).groupby('Team').mean()

Answer 3

You can use reshape on the underlying numpy arrays for an efficient solution.

Step 1
There is quite a bit of reshaping involved, and finally putting the data into an empty numpy array.

home = df.iloc[:, [1,3]]
away = df.iloc[:, [0,2]]
total = np.empty((home.shape[0] * 2, home.shape[1]), dtype=object)
total[::2] = away
total[1::2] = home

Step 2
Now that we have the data formatted, we can use expanding.mean and a bit more reshaping

d = pd.DataFrame({'team': total[:, 0], 'score': total[:, 1].astype(float)}) 
out = d.groupby('team').score.expanding().mean().reset_index().sort_values(by='level_1') 

pd.DataFrame(
    out[['team', 'score']].values.reshape(-1, 4)[:, [0,2,1,3]],
    columns=df.columns
)

    Visitor      Home Visitor_rtg Home_rtg
0    Sixers   Celtics        83.4    100.7
1   Thunder  Warriors        97.7    105.6
2     Bucks   Hornets       110.2    109.2
3  Pelicans    Sixers        88.1     86.3
4   Hornets   Wizards       99.65    102.3

Pandas Rolling Mean Depending on Row Value

Question

3 answers

solution1
1 ACCPTED 2018-11-14 21:03:07

`stack` + `concat` + `groupby` + `mean`

Edited answer

solution2
0 2018-11-14 21:07:33

solution3
0 2018-11-14 21:24:34

Pandas Rolling Mean Depending on Row Value

Question

3 answers

solution1 1 ACCPTED 2018-11-14 21:03:07

stack + concat + groupby + mean

Edited answer

solution2 0 2018-11-14 21:07:33

solution3 0 2018-11-14 21:24:34

solution1
1 ACCPTED 2018-11-14 21:03:07

`stack` + `concat` + `groupby` + `mean`

solution2
0 2018-11-14 21:07:33

solution3
0 2018-11-14 21:24:34