Join two dataframes on multiple columns in Python
I have two dataframes named df1 and df2.
df1 =
   col1  col2  count
0     1    36    200
1    12    15    200
2    13    17    100
df2 =
   product_id product_name
0           1          abc
1           2          xyz
2           3         aaaa
3          12        qwert
4          13          sed
5          15         qase
6          36         asdf
7          17         zxcv
The entries in col1 and col2 are product_id values from df2.
I want to make a new dataframe, df3, with the following columns and entries.
df3 =
col1 | col1_name | col2 | col2_name | count
0 1 | abc | 36 | asdf | 200
1 12 | qwert | 15 | qase | 200
2 13 | sed | 17 | zxcv | 100
i.e. add col1_name and col2_name wherever product_id from df2 equals the col1 and col2 values.
Is it possible to do so with:
df3 = pd.concat([df1, df2], axis=1)
My knowledge of pandas and Python is at a beginner level. Is there a way to do this? Thanks in advance.
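On the pd.concat idea: with axis=1, concat places the frames side by side and aligns them on the row index, not on matching product_id values, so it does not perform a lookup. A minimal sketch, rebuilding the two frames from the question (the variable name side_by_side is only for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [1, 12, 13],
                    "col2": [36, 15, 17],
                    "count": [200, 200, 100]})
df2 = pd.DataFrame({"product_id": [1, 2, 3, 12, 13, 15, 36, 17],
                    "product_name": ["abc", "xyz", "aaaa", "qwert",
                                     "sed", "qase", "asdf", "zxcv"]})

# axis=1 glues columns together row by row, matching on the index (0, 1, 2, ...)
side_by_side = pd.concat([df1, df2], axis=1)
print(side_by_side)

# Row 1 pairs col1 == 12 with product_id == 2 ('xyz'), not with 'qwert',
# and rows 3..7 have NaN in the df1 columns because df1 has only three rows.
```

So the result has eight rows and the names do not correspond to col1 or col2; a lookup by value is needed instead.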
I think you can use map with a dict generated from df2, and then sort the column names with sort_index:
d = df2.set_index('product_id')['product_name'].to_dict()
print (d)
{1: 'abc', 2: 'xyz', 3: 'aaaa', 36: 'asdf', 17: 'zxcv', 12: 'qwert', 13: 'sed', 15: 'qase'}
df1['col1_name'] = df1.col1.map(d)
df1['col2_name'] = df1.col2.map(d)
df1 = df1.sort_index(axis=1)
print (df1)
   col1 col1_name  col2 col2_name  count
0     1       abc    36      asdf    200
1    12     qwert    15      qase    200
2    13       sed    17      zxcv    100
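For anyone who wants to run this end to end, here is a self-contained sketch of the same map idea. It passes a Series indexed by product_id to map instead of a dict, and it selects the column order explicitly rather than relying on alphabetical sorting. The names lookup and out are only illustrative, and the sketch assumes product_id is unique in df2; ids missing from df2 would come back as NaN.

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [1, 12, 13],
                    "col2": [36, 15, 17],
                    "count": [200, 200, 100]})
df2 = pd.DataFrame({"product_id": [1, 2, 3, 12, 13, 15, 36, 17],
                    "product_name": ["abc", "xyz", "aaaa", "qwert",
                                     "sed", "qase", "asdf", "zxcv"]})

# Series whose index is product_id and whose values are product_name
lookup = df2.set_index("product_id")["product_name"]

out = df1.copy()  # leave the original df1 untouched
for col in ["col1", "col2"]:
    # map looks each id up in the index of `lookup`
    out[col + "_name"] = out[col].map(lookup)

# put each name column next to its id column
out = out[["col1", "col1_name", "col2", "col2_name", "count"]]
print(out)
```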
If you only want the name columns, drop col1 and col2:
df1 = df1.drop(['col1','col2'], axis=1)
print (df1)
  col1_name col2_name  count
0       abc      asdf    200
1     qwert      qase    200
2       sed      zxcv    100
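Since the title asks about joining, the same result can probably also be reached with two left merges, one per id column. This is an alternative sketch, not the approach above; it renames the df2 columns on the fly so that each merge key matches, and it assumes product_id is unique in df2 (duplicate ids would multiply rows in the output).

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [1, 12, 13],
                    "col2": [36, 15, 17],
                    "count": [200, 200, 100]})
df2 = pd.DataFrame({"product_id": [1, 2, 3, 12, 13, 15, 36, 17],
                    "product_name": ["abc", "xyz", "aaaa", "qwert",
                                     "sed", "qase", "asdf", "zxcv"]})

df3 = df1
for col in ["col1", "col2"]:
    # rename df2 so its key is called col1 / col2 and its value col1_name / col2_name
    names = df2.rename(columns={"product_id": col,
                                "product_name": col + "_name"})
    # how="left" keeps every row of df1, in its original order
    df3 = df3.merge(names, on=col, how="left")

df3 = df3[["col1", "col1_name", "col2", "col2_name", "count"]]
print(df3)
```

For a simple id-to-name lookup like this one, map is usually the shorter option; merge becomes more useful when several columns from df2 have to be brought over at once.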