
How to combine two pandas dataframes value by value

I have two dataframes: players (which only has playerid) and dates (which only has date). I want a new dataframe that contains every date for every player. In my case, the players df has about 2600 rows and the dates df has 1100 rows. I used two nested for loops to do this, but it is really slow. Is there a faster way to do it with some function? Thanks.

My loop:

player_elo = pd.DataFrame(columns = ['PlayerID','Date'])
for row in players.itertuples():
    idx = row.Index
    pl = players.at[idx,'PlayerID']
    for i in dates.itertuples():
        idd = i.Index  # index of the current dates row (was row.Index, the player's index)
        dt = dates.at[idd, 0]
        new = {'PlayerID': [pl], 'Date': [dt]}
        new = pd.DataFrame(new)
        player_elo = player_elo.append(new)

If both dataframes share a key column that holds the same value in every row, you can build the Cartesian product you are looking for with pd.merge(): every row of one df then matches every row of the other.

import pandas as pd

players = pd.DataFrame([['A'], ['B'], ['C']], columns=['PlayerID'])

dates = pd.DataFrame([['12/12/2012'],['12/13/2012'],['12/14/2012']], columns=['Date'])
dates['Date'] = pd.to_datetime(dates['Date'])

players['key'] = 1
dates['key'] = 1
print(pd.merge(players, dates,on='key')[['PlayerID', 'Date']])

Output

      PlayerID   Date
0        A     2012-12-12
1        A     2012-12-13
2        A     2012-12-14
3        B     2012-12-12
4        B     2012-12-13
5        B     2012-12-14
6        C     2012-12-12
7        C     2012-12-13
8        C     2012-12-14
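
As a possible alternative to the helper key column, here is a minimal sketch that assumes pandas 1.2 or later, where merge() accepts how="cross". It also shows a second route through pd.MultiIndex.from_product. The names player_elo_alt and idx are only illustrative, and the sample data mirrors the small example above rather than the real 2600 x 1100 case.

```python
import pandas as pd

players = pd.DataFrame({"PlayerID": ["A", "B", "C"]})
dates = pd.DataFrame(
    {"Date": pd.to_datetime(["2012-12-12", "2012-12-13", "2012-12-14"])}
)

# how="cross" (assumes pandas >= 1.2) pairs every row of players with
# every row of dates, so no temporary 'key' column is needed.
player_elo = players.merge(dates, how="cross")

# Alternative: build the product as a MultiIndex, then turn the index
# levels back into ordinary columns.
idx = pd.MultiIndex.from_product(
    [players["PlayerID"], dates["Date"]], names=["PlayerID", "Date"]
)
player_elo_alt = idx.to_frame(index=False)

print(player_elo.shape)  # (9, 2)
```

Both variants do the pairing in one vectorized call instead of appending row by row, which is where the nested loops lose most of their time. With the sizes from the question the result would have about 2600 * 1100 = 2.86 million rows.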
