Pandas dataframe building summary table

Example df:

id home away winner loser winner_score loser_score
0  A    B    A      B     20            10
1  C    D    D      C     20            5
2  A    D    D      A     30            0
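
For anyone who wants to reproduce this, the example frame can be built roughly as follows. This is a minimal sketch that assumes the `id` column in the printout is just the default integer index rather than a real column.

```python
import pandas as pd

# One row per match. The leading 0/1/2 in the printed table is assumed
# to be the default index, so it is not created as a column here.
df = pd.DataFrame({
    "home": ["A", "C", "A"],
    "away": ["B", "D", "D"],
    "winner": ["A", "D", "D"],
    "loser": ["B", "C", "A"],
    "winner_score": [20, 20, 30],
    "loser_score": [10, 5, 0],
})
print(df)
```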

My goal is to build a win/loss/for/against sort of table. End result:

Team  W  L  Total for  Total against
A     1  1  20         40
B     0  1  10         20
C     0  1  5          20
D     2  0  50         5

I can use groupby('winner') and groupby('loser') separately to get the win and loss counts, but I can't work out how to group so that I get the summed scores (and the means, which should work much the same way). My main issue is that when a team has never won a match, I end up with NaNs.

My method now is along the lines of:

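by_winner = df.groupby('winner')[['winner_score', 'loser_score']].sum()
by_loser = df.groupby('loser')[['winner_score', 'loser_score']].sum()

# Index alignment gives NaN for any team missing from either side
overall_table['score_for'] = by_winner['winner_score'] + by_loser['loser_score']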

I'm not sure how to even phrase the question. I would like to compute these stats from data where one line = one match, but I don't know how to group by winner and loser together so that I get the summed results for all teams at once.

Let's try:

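import pandas as pd

# Win/loss counts per team: one-hot the stacked winner/loser columns,
# then sum within each original column name (index level 1).
counts = (pd.get_dummies(df[['winner', 'loser']].stack())
   .groupby(level=1, sort=False).sum().T
)

# Scores from the winning side's point of view
winner_score = (df.groupby('winner')[['winner_score','loser_score']].sum()
   .rename(columns={'winner_score':'for', 'loser_score':'against'})
)

# Scores from the losing side's point of view
loser_score = (df.groupby('loser')[['winner_score','loser_score']].sum()
   .rename(columns={'winner_score':'against', 'loser_score':'for'})
)

# fill_value=0 keeps teams that appear on only one side from turning into NaN
pd.concat((counts, winner_score.add(loser_score, fill_value=0)), axis=1)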

Output:

   winner  loser  against   for
A       1      1     40.0  20.0
B       0      1     20.0  10.0
C       0      1     20.0   5.0
D       2      0      5.0  50.0
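
Since the question also asks how to get everything for all teams at once, another possible approach is to reshape the data so that each match contributes one row per team, and then use a single groupby. The sketch below is only one way to do this, not the method used above. The names `won`, `lost`, `long_form`, `score_for` and `score_against` are made up for illustration, and the frame is rebuilt from the question's example so the block runs on its own.

```python
import pandas as pd

# Example data from the question (index assumed to be the default 0..2).
df = pd.DataFrame({
    "home": ["A", "C", "A"],
    "away": ["B", "D", "D"],
    "winner": ["A", "D", "D"],
    "loser": ["B", "C", "A"],
    "winner_score": [20, 20, 30],
    "loser_score": [10, 5, 0],
})

# Each match seen from the winner's side.
won = pd.DataFrame({
    "team": df["winner"],
    "W": 1,
    "L": 0,
    "score_for": df["winner_score"],
    "score_against": df["loser_score"],
})

# The same matches seen from the loser's side.
lost = pd.DataFrame({
    "team": df["loser"],
    "W": 0,
    "L": 1,
    "score_for": df["loser_score"],
    "score_against": df["winner_score"],
})

# One row per team per match, so a single groupby covers every team
# and no NaN appears for teams that never won (or never lost).
long_form = pd.concat([won, lost], ignore_index=True)

table = long_form.groupby("team").sum()
means = long_form.groupby("team")[["score_for", "score_against"]].mean()

print(table)
print(means)
```

With this layout, `table` holds the W, L, total-for and total-against columns from the question, and per-team means come from the same grouping without any extra alignment step.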
