
How to combine rows in a pandas dataframe that have the same value in one column

I have a pandas dataframe of NBA player stats from the 2019-2020 season. Some players' names show up more than once because they played on multiple teams during the season. I want to organize the dataframe so that each player's name appears only once, and for the players whose names appear more than once, I want to take the average of all their stats and put it into one row.

For example, if there was a player that played on 3 different teams and appeared in 3 consecutive rows, I want to combine those 3 rows into one row, with that new row being the average of all the stats for the three rows.

Here is an example of player names appearing multiple times:

[image: example]

Is there any simple way to do this? I don't know how many times a player might appear, and I don't know how many players' names appear multiple times. I want to iterate through the dataframe and take the average of all stats for rows that have the same player name.

If needed, I can delete the 'Tm' column, or any of the string columns really (besides 'Player') since I don't absolutely need those, but I'd rather keep them if possible.

You can use the groupby method for this:

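# Pick the numeric stat columns by dtype (not by column name)
cols = df.select_dtypes(include='number').columns
# One row per player, each stat averaged across that player's rows
df.groupby('Player', as_index=False)[cols].mean()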
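If you would rather keep string columns such as 'Tm', one option is to give groupby a per-column aggregation: the mean for numeric columns and a joined list of unique values for everything else. The following is a minimal sketch, not a definitive implementation. The sample data and the helper names average_by_player and join_unique are hypothetical and only there for illustration; the column names 'Player' and 'Tm' are taken from the question.

```python
import pandas as pd

# Hypothetical sample data; only the 'Player' and 'Tm' names come from the question.
df = pd.DataFrame({
    "Player": ["A. Smith", "A. Smith", "A. Smith", "B. Jones"],
    "Tm": ["ATL", "BOS", "CHI", "DEN"],
    "PTS": [10.0, 20.0, 30.0, 8.0],
    "AST": [2.0, 4.0, 6.0, 1.0],
})


def join_unique(values):
    """Join the distinct values of a string column into one comma-separated string."""
    return ", ".join(values.astype(str).unique())


def average_by_player(frame, key="Player"):
    """Collapse rows sharing the same key: average numeric columns, join the rest."""
    numeric_cols = frame.select_dtypes(include="number").columns
    agg_spec = {}
    for col in frame.columns:
        if col == key:
            continue  # the grouping column is not aggregated
        agg_spec[col] = "mean" if col in numeric_cols else join_unique
    # as_index=False keeps the key as a regular column in the result
    return frame.groupby(key, as_index=False).agg(agg_spec)


result = average_by_player(df)
print(result)
```

Here the three "A. Smith" rows collapse into one row with PTS 20.0, AST 4.0 and Tm "ATL, BOS, CHI". Note that this is an unweighted mean of the rows; if the stats should be weighted by games played on each team, a weighted average would be needed instead.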
