Calculate the moving average of one column based on the values of another column in a Python dataframe (pandas)

Question

I am trying to create a column of the 10-day moving average of points for NBA players. My dataframe has game-by-game statistics for each player, and I would like the moving average column to contain the 10-day moving average at that point. I have tried df.groupby('player')['points'].rolling(10, 1).mean(), but this just gives me the number of points scored on that day as the moving average. All of the players from each day are listed and then the dataframe moves on to the following day, so I could have a couple hundred rows with the same date but different players' stats. Any help would be greatly appreciated. Thanks.

Answer 1

As stated, you really should provide a sample dataset and show what you are trying to achieve. However, I love working with sports data, so I don't mind putting in the minute or so to get a sample set.

So basically you need to do a rolling mean on a groupby. You'll notice that the first 9 rows of each player are blank (NaN), because a 10-game window does not yet have 10 rows to take the mean of. You can change that by setting min_periods to 1. Also, when you do this, you want to make sure your data is sorted by date (which here it already is).
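To make the effect of min_periods concrete, here is a minimal sketch on a made-up Series. The point values and the window of 3 are illustrative assumptions chosen to keep the output short; they are not taken from the real game logs.

```python
import pandas as pd

# Hypothetical points for one player over five games.
pts = pd.Series([10, 20, 30, 40, 50])

# Window of 3 with a full window required: the first two rows are NaN.
strict = pts.rolling(3, min_periods=3).mean()

# Window of 3 needing only one observation: averages whatever is available.
lenient = pts.rolling(3, min_periods=1).mean()

print(pd.DataFrame({"pts": pts, "strict": strict, "lenient": lenient}))
```

With a window of 10 the same pattern holds: min_periods=10 leaves the first 9 rows empty, while min_periods=1 fills them with the mean of the games played so far.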

import pandas as pd

player_link_list = ['https://www.basketball-reference.com/players/l/lavinza01/gamelog/2021/',
                    'https://www.basketball-reference.com/players/v/vucevni01/gamelog/2021/',
                    'https://www.basketball-reference.com/players/j/jamesle01/gamelog/2021/',
                    'https://www.basketball-reference.com/players/d/davisan02/gamelog/2021/']

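dfs = []
for link in player_link_list:
    # Take the last table on the page, which is used here as the game log.
    df = pd.read_html(link)[-1]
    # Drop the header rows that repeat inside the table body.
    df = df[df['Rk'].ne('Rk')].copy()
    # Non-numeric PTS entries (e.g. 'Inactive') become NaN and are dropped.
    df['PTS'] = pd.to_numeric(df['PTS'], errors='coerce')
    df = df.dropna(subset=['PTS'])
    # The player ID is the fourth-from-last URL segment, e.g. 'lavinza01'.
    df['Player'] = link.split('/')[-4]
    dfs.append(df)

# Stack all players into one frame with a fresh, unique index.
df = pd.concat(dfs, ignore_index=True)

# Per-player rolling mean over 10 games; use min_periods=1 to fill the first 9 rows.
df['rolling_10_avg'] = df.groupby('Player')['PTS'].transform(lambda s: s.rolling(10, min_periods=10).mean())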
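
The same pattern can be tried without scraping anything. The sketch below uses a small invented frame laid out by date rather than by player, as described in the question. The column names `date`, `player` and `points`, the data, and the window of 2 are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical game log: rows are not in per-player chronological order.
games = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-03", "2021-01-01", "2021-01-01",
                            "2021-01-02", "2021-01-02", "2021-01-03"]),
    "player": ["a", "a", "b", "a", "b", "b"],
    "points": [30, 10, 8, 20, 12, 16],
})

# Sort so that each player's games are in chronological order.
games = games.sort_values(["player", "date"])

# transform returns a result aligned to the frame's index,
# so it can be assigned straight back as a new column.
games["rolling_avg"] = (
    games.groupby("player")["points"]
         .transform(lambda s: s.rolling(2, min_periods=1).mean())
)

print(games)
```

Swapping the window to 10 and the column names to the ones in your own dataframe should give the per-player moving average the question asks for.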

Solution 1
2 2021-04-23 07:49:48