Merging Pandas DataFrames with keys in different columns

Question

I'm trying to merge two Pandas DataFrames, which are as follows:

import pandas as pd

df1 = pd.DataFrame({'PAIR': ['140-120', '200-280', '350-310', '410-480', '500-570'],
                    'SCORE': [99, 70, 14, 84, 50]})
print(df1)

      PAIR  SCORE
0  140-120     99
1  200-280     70
2  350-310     14
3  410-480     84
4  500-570     50

df2 = pd.DataFrame({'PAIR1': ['140-120', '280-200', '350-310', '480-410', '500-570'],
                    'PAIR2': ['120-140', '200-280', '310-350', '410-480', '570-500'],
                    'BRAND' : ['A', 'V', 'P', 'V', 'P']})
print(df2)

     PAIR1    PAIR2 BRAND
0  140-120  120-140     A
1  280-200  200-280     V
2  350-310  310-350     P
3  480-410  410-480     V
4  500-570  570-500     P

If you take a closer look, you will notice that each value in the PAIR column of df1 matches the value in either PAIR1 or PAIR2 of df2. In df2, each key is present in both orders (e.g. 140-120 and 120-140).

My goal is to merge the two DataFrames to obtain the following result:

      PAIR  SCORE BRAND
0  140-120     99     A
1  200-280     70     V
2  350-310     14     P
3  410-480     84     V
4  500-570     50     P

I first tried to merge df1 with df2 in the following way:

df3 = pd.merge(left = df1, right = df2, how = 'left', left_on = 'PAIR', right_on = 'PAIR1')

Then I took the resulting DataFrame df3 and merged it with df2 again:

df4 = pd.merge(left = df3, right = df2, how = 'left', left_on = 'PAIR', right_on = 'PAIR2')

print(df4)

      PAIR  SCORE  PAIR1_x  PAIR2_x BRAND_x  PAIR1_y  PAIR2_y BRAND_y
0  140-120     99  140-120  120-140       A      NaN      NaN     NaN
1  200-280     70      NaN      NaN     NaN  280-200  200-280       V
2  350-310     14  350-310  310-350       P      NaN      NaN     NaN
3  410-480     84      NaN      NaN     NaN  480-410  410-480       V
4  500-570     50  500-570  570-500       P      NaN      NaN     NaN

This is not my desired result. I don't know how else to account for the fact that the correct key might be in either PAIR1 or PAIR2. Any help would be appreciated.

Answer 1

A somewhat clumsy solution: build a Series that maps each pair in df2 to its corresponding brand, then pass this mapping to df1['PAIR'].map().

# Build a series whose index maps pairs to values
mapper = df2.melt(id_vars='BRAND').set_index('value')['BRAND']
mapper
value
140-120    A
280-200    V
350-310    P
480-410    V
500-570    P
120-140    A
200-280    V
310-350    P
410-480    V
570-500    P
Name: BRAND, dtype: object

# Use the mapper on df1['PAIR']
df1['BRAND'] = df1['PAIR'].map(mapper)
df1
      PAIR  SCORE BRAND
0  140-120     99     A
1  200-280     70     V
2  350-310     14     P
3  410-480     84     V
4  500-570     50     P
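
One caveat with the mapper approach: as far as I can tell, Series.map expects the mapping Series to have a unique index, so a pair that appears more than once in df2 would make it fail. The sketch below is a self-contained variant that drops repeated keys first. Keeping the first brand seen for a repeated key is an assumption made for illustration, not something the question specifies.

```python
import pandas as pd

df1 = pd.DataFrame({'PAIR': ['140-120', '200-280', '350-310', '410-480', '500-570'],
                    'SCORE': [99, 70, 14, 84, 50]})
df2 = pd.DataFrame({'PAIR1': ['140-120', '280-200', '350-310', '480-410', '500-570'],
                    'PAIR2': ['120-140', '200-280', '310-350', '410-480', '570-500'],
                    'BRAND': ['A', 'V', 'P', 'V', 'P']})

# Stack PAIR1 and PAIR2 into one 'value' column and index BRAND by it
mapper = df2.melt(id_vars='BRAND').set_index('value')['BRAND']

# Assumption: if a key repeats, keep the first brand seen for it
mapper = mapper[~mapper.index.duplicated(keep='first')]

# assign() returns a new DataFrame, so df1 itself is left untouched
result = df1.assign(BRAND=df1['PAIR'].map(mapper))
print(result)
```

With the sample data no key repeats, so the duplicate filter changes nothing and the result is the same as above.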

Answer 2

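# Stack PAIR1 and PAIR2 into a single key column so that one merge covers both orderings
temp_df1 = df2[['PAIR1', 'BRAND']]

# rename() returns a new DataFrame; calling it with inplace=True on a slice of df2
# triggers a SettingWithCopyWarning
temp_df2 = df2[['PAIR2', 'BRAND']].rename(columns={'PAIR2': 'PAIR1'})

big_df = pd.concat([temp_df1, temp_df2])

pd.merge(df1, big_df, how='left', left_on='PAIR', right_on='PAIR1')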
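
Because the merge above uses different column names on each side, the helper column PAIR1 stays in the result. A minimal sketch of the same idea that renames both key columns to PAIR before stacking them, so the output has only PAIR, SCORE and BRAND. The variable name keys and the use of validate='many_to_one' (which asks pandas to raise if a key appears more than once on the right side) are additions for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'PAIR': ['140-120', '200-280', '350-310', '410-480', '500-570'],
                    'SCORE': [99, 70, 14, 84, 50]})
df2 = pd.DataFrame({'PAIR1': ['140-120', '280-200', '350-310', '480-410', '500-570'],
                    'PAIR2': ['120-140', '200-280', '310-350', '410-480', '570-500'],
                    'BRAND': ['A', 'V', 'P', 'V', 'P']})

# Build one long lookup table with a single key column named PAIR
keys = pd.concat([
    df2[['PAIR1', 'BRAND']].rename(columns={'PAIR1': 'PAIR'}),
    df2[['PAIR2', 'BRAND']].rename(columns={'PAIR2': 'PAIR'}),
], ignore_index=True)

# Left merge on the shared column; validate guards against duplicated keys in the lookup
result = pd.merge(df1, keys, how='left', on='PAIR', validate='many_to_one')
print(result)
```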

Answer 3

You are merging successively on the column pairs PAIR/PAIR1 and PAIR/PAIR2, both times with how='left', which is what creates all the NaN values.

Take a look at Pandas Merging 101.

For your current implementation, you need to take a subset of the current result and remove the NaNs.
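
One way to do that, sketched here under the assumption that each row matched in exactly one of the two merges, is to coalesce BRAND_x and BRAND_y with combine_first and keep only the columns of interest:

```python
import pandas as pd

df1 = pd.DataFrame({'PAIR': ['140-120', '200-280', '350-310', '410-480', '500-570'],
                    'SCORE': [99, 70, 14, 84, 50]})
df2 = pd.DataFrame({'PAIR1': ['140-120', '280-200', '350-310', '480-410', '500-570'],
                    'PAIR2': ['120-140', '200-280', '310-350', '410-480', '570-500'],
                    'BRAND': ['A', 'V', 'P', 'V', 'P']})

# The two successive left merges from the question
df3 = pd.merge(df1, df2, how='left', left_on='PAIR', right_on='PAIR1')
df4 = pd.merge(df3, df2, how='left', left_on='PAIR', right_on='PAIR2')

# Take BRAND_x where it is present, otherwise fall back to BRAND_y
df4['BRAND'] = df4['BRAND_x'].combine_first(df4['BRAND_y'])

# Keep only the columns from the desired result
result = df4[['PAIR', 'SCORE', 'BRAND']]
print(result)
```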

A much simpler solution would be to manipulate PAIR in df1 so that it follows the same pattern (large-small or small-large) as PAIR1 or PAIR2:

# Normalise every key so its two parts appear in sorted order,
# e.g. '140-120' -> '120-140' (the pattern used by PAIR2 in this data)
df1['FOR_MERGE'] = df1['PAIR'].map(lambda x: '-'.join(sorted(x.split('-'))))

df2['FOR_MERGE'] = df2['PAIR1'].map(lambda x: '-'.join(sorted(x.split('-'))))

# Merge on the shared normalised key
pd.merge(df1[['FOR_MERGE', 'SCORE']], df2[['FOR_MERGE', 'BRAND']], how='left', on='FOR_MERGE')
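
The merge above returns the normalised key rather than the original PAIR column. A small self-contained sketch that keeps PAIR and drops the helper column afterwards. The function name normalise_pair and the column name KEY are hypothetical, chosen only for this example.

```python
import pandas as pd

def normalise_pair(pair: str) -> str:
    """Return the pair with its two parts in sorted order, e.g. '140-120' -> '120-140'."""
    return '-'.join(sorted(pair.split('-')))

df1 = pd.DataFrame({'PAIR': ['140-120', '200-280', '350-310', '410-480', '500-570'],
                    'SCORE': [99, 70, 14, 84, 50]})
df2 = pd.DataFrame({'PAIR1': ['140-120', '280-200', '350-310', '480-410', '500-570'],
                    'PAIR2': ['120-140', '200-280', '310-350', '410-480', '570-500'],
                    'BRAND': ['A', 'V', 'P', 'V', 'P']})

# Add the normalised key to copies of both frames, leaving df1 and df2 unchanged
left = df1.assign(KEY=df1['PAIR'].map(normalise_pair))
right = df2.assign(KEY=df2['PAIR1'].map(normalise_pair))[['KEY', 'BRAND']]

# Merge on the helper key, then drop it so only PAIR, SCORE and BRAND remain
result = left.merge(right, how='left', on='KEY').drop(columns='KEY')
print(result)
```

The ordering only has to be applied the same way on both sides, so plain string sorting is enough here.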

Answer 1
3, accepted, 2020-02-25 22:05:19

Answer 2
1, 2020-02-25 22:07:53

Answer 3
0, 2020-02-25 22:08:55