
More efficient way to do the same merge on multiple columns in a dataframe?

Input:

df1

    OFF_P1  OFF_P2  OFF_P3  OFF_P4  OFF_P5  GAME_ID
0   1629675 1627736 1630162 201976  1629020 22101224
1   201599  1630178 202699  1629680 201980  22101228
2   1630191 1630180 1630587 1630240 1628402 22101228
3   1627759 201143  1628464 1628369 203935  22101223
4   1630573 1630271 1630238 1628436 1630346 22101223

df2

    PLAYER_ID GAME_ID   PTS
0   201980    21900002  28
1   201586    21900001  13
2   1628366   21900001  8
3   200755    21900001  16
4   202324    21900001  6

Desired output:

    OFF_P1  OFF_P2  OFF_P3  OFF_P4  OFF_P5  GAME_ID  OFF_P1_PTS  OFF_P2_PTS  etc...
0   1629675 1627736 1630162 201976  1629020 22101224 28          13          ...
1   201599  1630178 202699  1629680 201980  22101228 12
2   1630191 1630180 1630587 1630240 1628402 22101228 14
3   1627759 201143  1628464 1628369 203935  22101223 8
4   1630573 1630271 1630238 1628436 1630346 22101223 19

I would like to merge the PTS column from df2 into df1, once for each of the columns OFF_P1, OFF_P2, etc.

Is there a more efficient way to do this than something like the below?

df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P1'], right_on=['GAME_ID', 'PLAYER_ID'])
df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P2'], right_on=['GAME_ID', 'PLAYER_ID'])
df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P3'], right_on=['GAME_ID', 'PLAYER_ID'])
df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P4'], right_on=['GAME_ID', 'PLAYER_ID'])
df1 = df1.merge(df2, left_on=['GAME_ID', 'OFF_P5'], right_on=['GAME_ID', 'PLAYER_ID'])

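I would prefer the MultiIndex.map approach: build a Series of PTS indexed by (GAME_ID, PLAYER_ID), then look up each (GAME_ID, OFF_Px) pair in it.

# PTS keyed by (GAME_ID, PLAYER_ID)
d = df2.set_index(['GAME_ID', 'PLAYER_ID'])['PTS']

# Match only the original OFF_P<n> columns, so that re-running the loop
# does not pick up the new *_PTS columns and create *_PTS_PTS ones
for c in df1.filter(regex=r'^OFF_P\d+$'):
    df1[f'{c}_PTS'] = df1.set_index(['GAME_ID', c]).index.map(d)

print(df1)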

    OFF_P1   OFF_P2   OFF_P3   OFF_P4   OFF_P5   GAME_ID  OFF_P1_PTS  OFF_P2_PTS  OFF_P3_PTS  OFF_P4_PTS  OFF_P5_PTS
0  1629675  1627736  1630162   201976  1629020  22101224        28.0         NaN         NaN         NaN         NaN
1   201599  1630178   202699  1629680   201980  22101228         NaN        13.0         NaN         NaN         NaN
2  1630191  1630180  1630587  1630240  1628402  22101228         NaN         NaN         NaN         NaN         NaN
3  1627759   201143  1628464  1628369   203935  22101223        16.0         NaN         NaN         NaN         NaN
4  1630573  1630271  1630238  1628436  1630346  22101223         NaN         NaN         NaN         NaN         NaN
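
For anyone who wants to try the idea in isolation, here is a minimal, self-contained sketch on toy data. The frame names (lineups, box), the tiny IDs, and the point values are made up for illustration and are not the asker's data; only the column names follow the question. It builds the lookup keys with pd.MultiIndex.from_frame instead of set_index(...).index, which should be equivalent here.

```python
import pandas as pd

# Hypothetical toy data: two lineups with two player slots each
lineups = pd.DataFrame({
    "OFF_P1": [1, 3],
    "OFF_P2": [2, 4],
    "GAME_ID": [100, 200],
})
# Hypothetical box scores; player 4 has no row for game 200
box = pd.DataFrame({
    "PLAYER_ID": [1, 2, 3],
    "GAME_ID": [100, 100, 200],
    "PTS": [28, 13, 8],
})

# Lookup Series keyed by (GAME_ID, PLAYER_ID)
points = box.set_index(["GAME_ID", "PLAYER_ID"])["PTS"]

# Fix the list of player columns up front so new *_PTS columns are never re-processed
player_cols = [c for c in lineups.columns
               if c.startswith("OFF_P") and not c.endswith("_PTS")]

for col in player_cols:
    # One (GAME_ID, player) key per row, looked up in the points Series
    keys = pd.MultiIndex.from_frame(lineups[["GAME_ID", col]])
    lineups[f"{col}_PTS"] = keys.map(points)

print(lineups)
```

Pairs that are missing from the lookup come back as NaN, which is why the new columns can end up as floats. As far as I know, the lookup also expects each (GAME_ID, PLAYER_ID) pair to appear only once in df2, so duplicates would need to be dropped or aggregated first.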

