I have a dataframe ranksdf containing player names, dates, and each player's ranking on that date. The date column is a parsed datetime object (maybe relevant for date comparison later):
player date ranking
A 20120601 1
B 20120601 2
C 20120601 3
A 20130601 1
B 20130601 2
C 20130601 3
What I want to do is add a new column that counts each player's tournament wins up to that date. The information on tournament wins comes from another dataframe called matchesdf:
t_name t_date w_name round
X 20120101 A F
X 20120101 A SF
Y 20120201 B F
Y 20120201 B SF
Z 20130101 A F
t_name = tournament name
t_date = date of the tournament
w_name = winner name
round = the round in the tournament (F = Final, SF = Semifinal)
From the second dataframe I know when a specific player won a tournament by a given time by counting the rows where round equals F.
So what I want to do is add a new column to ranksdf counting the tournament wins, but only up to ranksdf.date.
In pseudocode something like this:
ranksdf['t_wins'] = ranksdf.apply(lambda x: matchesdf[(matchesdf['t_date'] < x['date']) & (matchesdf['w_name'] == x['player']) & (matchesdf['round'] == 'F')].count())
So, the constraints on looking up the info in matchesdf are the time (because I want to know only the wins up to the time of the ranking in ranksdf), the player name obviously, and the round (because tournament wins are defined by winning the Final).
The result should look like this:
player date ranking t_wins
A 20120601 1 1
B 20120601 2 1
C 20120601 3 0
A 20130601 1 2
B 20130601 2 1
C 20130601 3 0
Thanks for helping me.
Just add axis=1 to your apply call and it will work:
ranksdf["t_wins"] = ranksdf.apply(lambda x: len(matchesdf[(matchesdf['t_date'] < x['date']) & (matchesdf['w_name'] == x['player']) & (matchesdf['round'] == 'F')]), axis=1)
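For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the two sample dataframes from the question and applies the row-wise lookup. The helper name count_wins is only illustrative, and the date strings are assumed to be in YYYYMMDD form as shown in the question. With the default axis=0, apply passes each column to the function instead of each row, which is why x['date'] cannot be looked up without axis=1.

```python
import pandas as pd

# Sample data from the question; dates are assumed to be YYYYMMDD strings.
ranksdf = pd.DataFrame({
    "player": ["A", "B", "C", "A", "B", "C"],
    "date": pd.to_datetime(["20120601"] * 3 + ["20130601"] * 3, format="%Y%m%d"),
    "ranking": [1, 2, 3, 1, 2, 3],
})

matchesdf = pd.DataFrame({
    "t_name": ["X", "X", "Y", "Y", "Z"],
    "t_date": pd.to_datetime(
        ["20120101", "20120101", "20120201", "20120201", "20130101"],
        format="%Y%m%d",
    ),
    "w_name": ["A", "A", "B", "B", "A"],
    "round": ["F", "SF", "F", "SF", "F"],
})


def count_wins(row):
    """Count finals won by row['player'] strictly before row['date'].

    Hypothetical helper name; it does the same lookup as the lambda above.
    """
    mask = (
        (matchesdf["t_date"] < row["date"])
        & (matchesdf["w_name"] == row["player"])
        & (matchesdf["round"] == "F")
    )
    # Summing a boolean mask counts the matching rows.
    return int(mask.sum())


# axis=1 makes apply pass one row at a time to count_wins.
ranksdf["t_wins"] = ranksdf.apply(count_wins, axis=1)
print(ranksdf)
```

On the sample data, the t_wins column comes out as 1, 1, 0, 2, 1, 0, matching the expected result in the question. Since this scans matchesdf once per row of ranksdf, it may be slow on large frames.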