Fetching most recent values in pandas dataframe
Here is a sample of my pandas dataframe:
Player_A  Player_B  Gain_A  Gain_B
John      Max           -3       3
Max       Lucy           4      -4
Lucy      John           1      -1
Max       John          -5       5
John      Lucy          -2       2
I wish to create a new column, 'Sum_2_A', which displays the sum of the two most recent earlier instances of a player's 'Gain' (not including the value from the current row). Earlier instances count whether the player appeared as Player_A or Player_B; where a player has fewer than two earlier instances, the expected output shows the current row's Gain_A instead.
That is, the expected output for the given sample would be as follows:
Player_A  Player_B  Gain_A  Gain_B  Sum_2_A
John      Max           -3       3       -3
Max       Lucy           4      -4        4
Lucy      John           1      -1        1
Max       John          -5       5        7
John      Lucy          -2       2        4
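To make the rule concrete, here is a minimal plain-Python sketch of it. The function name sum_2_a and the rows list are made up for illustration, and the fallback to the current row's Gain_A for players with fewer than two earlier rows is an assumption inferred from the expected output, not something stated explicitly.

```python
from collections import defaultdict

# Sample rows from the question: (Player_A, Player_B, Gain_A, Gain_B)
rows = [
    ("John", "Max", -3, 3),
    ("Max", "Lucy", 4, -4),
    ("Lucy", "John", 1, -1),
    ("Max", "John", -5, 5),
    ("John", "Lucy", -2, 2),
]

def sum_2_a(rows):
    """Return the Sum_2_A value for each row, in order (hypothetical helper)."""
    history = defaultdict(list)  # player -> gains from earlier rows, either side
    result = []
    for player_a, player_b, gain_a, gain_b in rows:
        previous = history[player_a]
        if len(previous) >= 2:
            # Sum of the two most recent earlier gains, current row excluded
            result.append(sum(previous[-2:]))
        else:
            # Assumed fallback, inferred from the expected output
            result.append(gain_a)
        # Record this row's gains only after computing the value for it
        history[player_a].append(gain_a)
        history[player_b].append(gain_b)
    return result

print(sum_2_a(rows))
```

This is only a reference for checking results on small inputs; it loops row by row and is not meant to be fast.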
I can do it via for loops, but it's way too slow to be useful. Any help is appreciated.
Thanks
IIUC, you can convert the data to long form, then take a rolling sum on a groupby:
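import pandas as pd

# Reshape to long form: one row per (original row, side A/B)
new_df = (pd.wide_to_long(df.reset_index(), stubnames=['Player', 'Gain'],
                          i='index', j='type',
                          sep='_', suffix='.*')
            .sort_index()
         )

# Rolling sum of 3 per player minus the current gain = sum of the previous two;
# players with fewer than two earlier rows get NaN, filled with the current gain
new_df['Sum_2'] = (new_df.groupby('Player')
                         .Gain.rolling(3).sum()
                         .reset_index('Player', drop=True)
                         .sort_index()
                         .sub(new_df['Gain'])
                         .fillna(new_df['Gain'])
                  )

# Back to wide form: one row per original row, columns split by side
new_df.unstack('type')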
Output:
      Player       Gain    Sum_2
type       A     B    A  B     A    B
index
0       John   Max   -3  3  -3.0  3.0
1        Max  Lucy    4 -4   4.0 -4.0
2       Lucy  John    1 -1   1.0 -1.0
3        Max  John   -5  5   7.0 -4.0
4       John  Lucy   -2  2   4.0 -3.0
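The snippet above assumes df already exists and leaves the result with two-level columns. As a hedged, self-contained sketch, the version below rebuilds the sample frame, uses the same wide_to_long reshape, and flattens the columns back to names like Sum_2_A. The shift-then-rolling step is a swapped-in variant of the rolling(3)-minus-current idea, not the original answer's exact code, and the names long_df and wide are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Player_A": ["John", "Max", "Lucy", "Max", "John"],
    "Player_B": ["Max", "Lucy", "John", "John", "Lucy"],
    "Gain_A": [-3, 4, 1, -5, -2],
    "Gain_B": [3, -4, -1, 5, 2],
})

# Long form: one row per (original row, side), in chronological order
long_df = (
    pd.wide_to_long(df.reset_index(), stubnames=["Player", "Gain"],
                    i="index", j="type", sep="_", suffix=".*")
    .sort_index()
)

# For each player: shift by one to drop the current row, then sum a window of 2
prev_two = long_df.groupby("Player")["Gain"].transform(
    lambda s: s.shift().rolling(2).sum()
)

# Fewer than two earlier rows gives NaN; fall back to the current gain,
# matching the expected output in the question
long_df["Sum_2"] = prev_two.fillna(long_df["Gain"])

# Back to wide form, flattening ('Sum_2', 'A') into 'Sum_2_A' and so on
wide = long_df.unstack("type")
wide.columns = [f"{name}_{side}" for name, side in wide.columns]

print(wide)
```

Shifting before the rolling sum avoids subtracting the current gain afterwards, which may matter if the gains are floats and exact results are wanted.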