I'm trying to generalize the question I asked here.
The mlb dataframe looks like:
Player Position Salary Year
0 Mike Witt Pitcher 1400000 1988
1 George Hendrick Outfielder 989333 1988
2 Chili Davis Outfielder 950000 1988
3 Brian Downing Designated Hitter 900000 1988
4 Bob Boone Catcher 883000 1988
5 Bob Boone Catcher 883000 1989
6 Frank Smith Catcher 993000 1988
7 Frank Smith Pitcher 1300000 1989
Note that the same player may be listed multiple times for different years. I'm trying to find the player with maximum total salary for each position. Output should be something like:
Position Player Salary
0 Pitcher Mike Witt 1400000
1 Outfielder George Hendrick 989333
2 Designated Hitter Brian Downing 900000
3 Catcher Bob Boone 1766000
I think I need to do something like group by Position, then group by Player, then sum for each player and find the maximum. But I'm having trouble doing this.
Once I do
positions = mlb.groupby("Position")
I'm stuck on the next step. I think a nested groupby on Player is necessary, but I don't know how to proceed.
This is messy but gets the job done.
import pandas as pd

df = pd.DataFrame({'Player':['Mike Witt','George Hendrick','Chili Davis','Brian Downing','Bob Boone','Bob Boone'],
'Position':['Pitcher','Outfielder','Outfielder','Designated Hitter','Catcher','Catcher'],
'Salary':[1400000,989333, 950000,900000,883000,900000],
'Year':[1988,1988,1988,1988,1988,1988]})
# Total salary per (Player, Position) pair
gp = df.groupby(['Player','Position'])['Salary'].sum().reset_index()
# Sort by total, highest first, then keep the first row seen for each position
# (DataFrame.sort was removed from pandas; sort_values is its replacement)
gp.sort_values('Salary',ascending=False).drop_duplicates('Position')
OR
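# Pick the row with the largest total in each position (a plain .max() takes each column's maximum independently, so Player and Salary can come from different rows)
gp.loc[gp.groupby('Position')['Salary'].idxmax()]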
As @dawg mentioned, this treats a player who appears under multiple positions as a separate player for each position, so the salaries shown here are per-position totals.
Player Position Salary
0 Bob Boone Catcher 1783000
4 Mike Witt Pitcher 1400000
3 George Hendrick Outfielder 989333
1 Brian Downing Designated Hitter 900000
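To see the per-position treatment on a player who actually changes position, here is a minimal sketch that runs the same idea on the rows from the question, where Frank Smith is a Catcher in 1988 and a Pitcher in 1989. The names totals and top are just illustrative, and picking the winner with idxmax is one option among several.

```python
import pandas as pd

# Sample rows copied from the question, including a player who changed position.
mlb = pd.DataFrame({
    'Player': ['Mike Witt', 'George Hendrick', 'Chili Davis', 'Brian Downing',
               'Bob Boone', 'Bob Boone', 'Frank Smith', 'Frank Smith'],
    'Position': ['Pitcher', 'Outfielder', 'Outfielder', 'Designated Hitter',
                 'Catcher', 'Catcher', 'Catcher', 'Pitcher'],
    'Salary': [1400000, 989333, 950000, 900000, 883000, 883000, 993000, 1300000],
    'Year': [1988, 1988, 1988, 1988, 1988, 1989, 1988, 1989],
})

# Total salary per (Player, Position) pair; Frank Smith gets two separate rows.
totals = mlb.groupby(['Player', 'Position'], as_index=False)['Salary'].sum()

# For each position, keep the row whose total salary is largest.
top = totals.loc[totals.groupby('Position')['Salary'].idxmax()].reset_index(drop=True)
print(top)
```

Frank Smith's Catcher and Pitcher salaries are never added together, so he loses to Bob Boone at Catcher and to Mike Witt at Pitcher.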
Try this
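# Total salary per (Position, Player) pair
g = df.groupby(['Position', 'Player'])['Salary'].sum()
# For each position, keep the (Position, Player) entry with the largest total
# (print is a function in Python 3, and max(level=...) is gone from current pandas)
print(g.loc[g.groupby(level='Position').idxmax().tolist()])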