Here is a sample of my pandas dataframe:
Player_A  Player_B  Gain_A  Gain_B
John      Max       -3       3
Max       Lucy       4      -4
Lucy      John       1      -1
Max       John      -5       5
John      Lucy      -2       2
I wish to create a new column, 'Sum_2_A', which holds the sum of the two most recent 'Gain' values of the player in 'Player_A' (not including the value from the current row). A player's earlier games count whether they appeared as Player_A or Player_B.
I.e., the expected output for the given sample would be as follows:
Player_A  Player_B  Gain_A  Gain_B  Sum_2_A
John      Max       -3       3      -3
Max       Lucy       4      -4       4
Lucy      John       1      -1       1
Max       John      -5       5       7
John      Lucy      -2       2       4
I can do it via for loops, but it's way too slow to be useful. Any help is appreciated.
Thanks
IIUC, you can convert the data to long form with pd.wide_to_long, then take a rolling sum within each player's group:
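import pandas as pd

# Reshape to long form: one row per (original row index, side A/B)
new_df = (pd.wide_to_long(df.reset_index(), stubnames=['Player', 'Gain'],
                          i='index', j='type',
                          sep='_', suffix='.*')
            .sort_index()
         )

# Rolling sum over each player's last 3 gains, minus the current gain,
# leaves the sum of the 2 previous gains; where a player has fewer than
# 3 appearances so far, fall back to the current gain
new_df['Sum_2'] = (new_df.groupby('Player')
                         .Gain.rolling(3).sum()
                         .reset_index('Player', drop=True)
                         .sort_index()
                         .sub(new_df['Gain'])
                         .fillna(new_df['Gain'])
                  )

# Back to wide form: one column per (field, side)
new_df.unstack('type')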
Output:
      Player        Gain     Sum_2
type       A     B     A   B     A     B
index
0       John   Max    -3   3  -3.0   3.0
1        Max  Lucy     4  -4   4.0  -4.0
2       Lucy  John     1  -1   1.0  -1.0
3        Max  John    -5   5   7.0  -4.0
4       John  Lucy    -2   2   4.0  -3.0
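For anyone who wants to check this against the sample, here is a minimal self-contained sketch of the same idea written a little differently: stack both sides into long form by hand, then use shift(1).rolling(2) per player so the current row is excluded directly. The names long_df, game, side and prev_two are made up for illustration. The fallback to the current row's gain when a player has fewer than two earlier games is an assumption read off the expected output in the question, not something the question states.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Player_A": ["John", "Max", "Lucy", "Max", "John"],
    "Player_B": ["Max", "Lucy", "John", "John", "Lucy"],
    "Gain_A": [-3, 4, 1, -5, -2],
    "Gain_B": [3, -4, -1, 5, 2],
})

# Build a long frame with one row per (game, side); "game" is the original row index
parts = []
for side in ("A", "B"):
    part = df[[f"Player_{side}", f"Gain_{side}"]].copy()
    part.columns = ["Player", "Gain"]
    part["game"] = df.index
    part["side"] = side
    parts.append(part)

long_df = (
    pd.concat(parts, ignore_index=True)
    .sort_values(["game", "side"])
    .reset_index(drop=True)
)

# For each player: shift by one to drop the current game, then sum the
# two games before it (NaN when fewer than two earlier games exist)
prev_two = long_df.groupby("Player")["Gain"].transform(
    lambda s: s.shift(1).rolling(2).sum()
)

# Assumed fallback (inferred from the expected output): use the current gain
long_df["Sum_2"] = prev_two.fillna(long_df["Gain"])

# Attach the side-A values back to the original frame, aligned on row index
df["Sum_2_A"] = long_df.loc[long_df["side"] == "A"].set_index("game")["Sum_2"]

print(df)
```

On the sample this gives Sum_2_A values of -3, 4, 1, 7 and 4 (as floats), matching the expected output. Using transform keeps the result aligned with long_df's index, so no reset_index or sort_index is needed after the groupby.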