I am researching ATP Tour men's tennis data. I have a Pandas dataframe that contains ~60,000 matches, sorted by date. Every row contains information and statistics about one match, split between the winner and the loser. I am trying to calculate the Elo rating of both the winner and the loser for every match (that is, for every row). To calculate the Elo rating, one needs the Elo rating of both players after their previous match. The difficulty is that the winner of the current match might have been the loser of his previous match, so the 'winner_player_id' value of the current row might appear in the 'loser_player_id' column of that earlier row.
I am not sure how to efficiently select the previous Elo ratings for both players per row, as this requires searching across two columns.
Every row includes the following columns:
array(['match_id', 'tourney_dates', 'round_order', 'tourney_name',
'tourney_year_id', 'tourney_round_name', 'winner_player_id',
'winner_slug', 'loser_player_id', 'loser_slug', 'elo_player_1', 'elo_player_2'])
Your time is appreciated!
One approach is to sort the winner and loser within each row by player name or ID, so each pair appears in the same order regardless of who won. Here's an example using name columns (the same works with 'winner_player_id' and 'loser_player_id'):
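import numpy as np
import pandas as pd

# Sort the two names within each row, then keep the result (join is not in-place)
df = df.join(pd.DataFrame(
    np.sort(df[['winner_name', 'loser_name']].values, axis=1),
    columns=['name1', 'name2'], index=df.index))
df[['winner_name', 'loser_name', 'name1', 'name2']].head(10)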
Output:
   winner_name     loser_name         name1              name2
0  Nicklas Kulti   Michael Stich      Michael Stich      Nicklas Kulti
1  Michael Stich   Jim Courier        Jim Courier        Michael Stich
2  Nicklas Kulti   Magnus Larsson     Magnus Larsson     Nicklas Kulti
3  Jim Courier     Martin Sinner      Jim Courier        Martin Sinner
4  Michael Stich   Jimmy Arias        Jimmy Arias        Michael Stich
5  Nicklas Kulti   Fabrice Santoro    Fabrice Santoro    Nicklas Kulti
6  Magnus Larsson  Patrik Kuhnen      Magnus Larsson     Patrik Kuhnen
7  Jim Courier     Paul Haarhuis      Jim Courier        Paul Haarhuis
8  Nicklas Kulti   Magnus Gustafsson  Magnus Gustafsson  Nicklas Kulti
9  Michael Stich   Gilad Bloom        Gilad Bloom        Michael Stich
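For the Elo calculation itself, each player's latest rating is needed whether he won or lost his previous match. One way to avoid searching both columns is to walk the date-sorted rows once and keep the current rating of every player in a dictionary keyed by player id. The following is only a sketch under assumptions that are not in the question: the function name pre_match_elo, a starting rating of 1500, a constant K-factor of 32, and the output column names in the usage comment are all made up for illustration.

```python
def pre_match_elo(winner_ids, loser_ids, k=32.0, start=1500.0):
    """Return two lists: each winner's and each loser's rating BEFORE the match.

    Assumes the ids are given in chronological match order.
    k and start are illustrative defaults, not values from the question.
    """
    ratings = {}  # player id -> latest rating, regardless of last result
    winner_before, loser_before = [], []
    for w, l in zip(winner_ids, loser_ids):
        rw = ratings.get(w, start)  # unseen players get the start rating
        rl = ratings.get(l, start)
        winner_before.append(rw)
        loser_before.append(rl)
        # Standard Elo expected score of the eventual winner
        expected_w = 1.0 / (1.0 + 10.0 ** ((rl - rw) / 400.0))
        delta = k * (1.0 - expected_w)
        ratings[w] = rw + delta  # winner gains what the loser gives up
        ratings[l] = rl - delta
    return winner_before, loser_before


# Tiny demo with made-up ids: "b" loses match 1, then wins match 2
w_before, l_before = pre_match_elo(["a", "b", "a"], ["b", "c", "c"])

# With the date-sorted dataframe from the question this could be used as:
# df["winner_elo_before"], df["loser_elo_before"] = pre_match_elo(
#     df["winner_player_id"], df["loser_player_id"])
```

A single pass like this is linear in the number of matches, so ~60,000 rows should not be a problem. Matches played on the same date are processed in row order, so a secondary sort on something like 'round_order' may be worth considering.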