
How to combine two pandas dataframes value by value

I have two dataframes: players (which only has a PlayerID column) and dates (which only has a date column). I want a new dataframe that contains every date for each player. In my case, players has about 2600 rows and dates has 1100 rows. I used two nested for loops to do this, but it is really slow. Is there a faster way to do it with a built-in function? Thanks.

My loop:

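import pandas as pd

# Slow approach: build the result one row at a time with nested loops.
player_elo = pd.DataFrame(columns=['PlayerID', 'Date'])
for row in players.itertuples():
    pl = players.at[row.Index, 'PlayerID']
    for i in dates.itertuples():
        # Use the inner loop's index (i), not the outer loop's (row)
        dt = dates.at[i.Index, 0]
        new = pd.DataFrame({'PlayerID': [pl], 'Date': [dt]})
        # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
        player_elo = pd.concat([player_elo, new], ignore_index=True)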

If you add a key column holding the same constant value to both dataframes, you can build the Cartesian product you are looking for with pd.merge(). Merging on that key pairs every row of players with every row of dates.

import pandas as pd

players = pd.DataFrame([['A'], ['B'], ['C']], columns=['PlayerID'])

dates = pd.DataFrame([['12/12/2012'],['12/13/2012'],['12/14/2012']], columns=['Date'])
dates['Date'] = pd.to_datetime(dates['Date'])

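# Add a constant key to both frames so every player row matches every date row
players['key'] = 1
dates['key'] = 1
# Merge on the key, then keep only the original columns
print(pd.merge(players, dates, on='key')[['PlayerID', 'Date']])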

Output

  PlayerID       Date
0        A 2012-12-12
1        A 2012-12-13
2        A 2012-12-14
3        B 2012-12-12
4        B 2012-12-13
5        B 2012-12-14
6        C 2012-12-12
7        C 2012-12-13
8        C 2012-12-14
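
As a possible alternative, assuming pandas 1.2 or newer is available, merge() accepts how='cross', which produces the same Cartesian product without a temporary key column. The sketch below reuses the small illustrative frames from this answer, not the asker's real data. For the sizes mentioned in the question (about 2600 players and 1100 dates) the result would have roughly 2600 * 1100 = 2,860,000 rows.

```python
import pandas as pd

# Same illustrative data as in the answer above (not the asker's real tables).
players = pd.DataFrame({'PlayerID': ['A', 'B', 'C']})
dates = pd.DataFrame(
    {'Date': pd.to_datetime(['12/12/2012', '12/13/2012', '12/14/2012'])}
)

# how='cross' (pandas 1.2+) pairs every row of players with every row of dates,
# so no helper 'key' column has to be added and dropped again.
player_elo = players.merge(dates, how='cross')

# Sanity check: one row per (player, date) pair.
assert len(player_elo) == len(players) * len(dates)
print(player_elo)
```

Either way, the work is done in a single vectorized call instead of creating and concatenating one tiny dataframe per pair, which is what makes the nested loops slow.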


 