DataFrame join in pandas on part of the string in the joining columns
I have two DataFrames in pandas and want to INNER join them on JoinColumn. However, the values in Table1.JoinColumn look like 1_String1_0, while the values in Table2.JoinColumn look like 2_String1_1. Is there a way to join the two tables without first splitting the column values on "_" and joining afterwards?
import pandas as pd

idx = ['a', 'b', 'c', 'd']
Table1 = pd.DataFrame({'JoinColumn': pd.Series(['1_Abc_0', '2_Cde_1', '3_Efg_0', '5_xyz_1'], index=idx),
                       'Col2': pd.Series([10, 20, 30, 40], index=idx),
                       'Col3': pd.Series([1, 2., 3., 4.], index=idx)})
Table2 = pd.DataFrame({'JoinColumn': pd.Series(['2_Abc_1', '2_Cde_0', '6_Efg_0', '9_xyz_2'], index=idx),
                       'Col2': pd.Series([10, 20, 30, 40], index=idx),
                       'Col3': pd.Series([1, 2., 3., 4.], index=idx)})
I want to merge these two tables on JoinColumn, matching only on the middle values such as "Abc", "xyz", etc.
I think splitting is necessary, but the resulting Series can be passed directly to the left_on and right_on parameters:
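# Merge on the middle token of JoinColumn (e.g. 'Abc' from '1_Abc_0').
# pd.merge performs an inner join by default.
df = pd.merge(Table1,
              Table2,
              left_on=Table1['JoinColumn'].str.split('_').str[1],
              right_on=Table2['JoinColumn'].str.split('_').str[1])
print(df)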
Col2_x Col3_x JoinColumn_x Col2_y Col3_y JoinColumn_y
0 10 1.0 1_Abc_0 10 1.0 2_Abc_1
1 20 2.0 2_Cde_1 20 2.0 2_Cde_0
2 30 3.0 3_Efg_0 30 3.0 6_Efg_0
3 40 4.0 5_xyz_1 40 4.0 9_xyz_2
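As a variation on the merge above, the middle token can be stored in an explicit helper column before merging. This is only a minimal sketch: the column name join_key and the suffixes _t1 and _t2 are illustrative choices, not part of the original answer. One possible reason to prefer it is that, depending on the pandas version, passing Series to left_on and right_on may add an extra key column (such as key_0) to the result, whereas a named helper column keeps the join key visible under a name you control.

```python
import pandas as pd

idx = ['a', 'b', 'c', 'd']
Table1 = pd.DataFrame({'JoinColumn': ['1_Abc_0', '2_Cde_1', '3_Efg_0', '5_xyz_1'],
                       'Col2': [10, 20, 30, 40],
                       'Col3': [1., 2., 3., 4.]}, index=idx)
Table2 = pd.DataFrame({'JoinColumn': ['2_Abc_1', '2_Cde_0', '6_Efg_0', '9_xyz_2'],
                       'Col2': [10, 20, 30, 40],
                       'Col3': [1., 2., 3., 4.]}, index=idx)


def middle_token(series):
    """Return the text between the first and second '_' of each value."""
    return series.str.split('_').str[1]


# 'join_key' is a hypothetical helper column name used only for the merge.
left = Table1.assign(join_key=middle_token(Table1['JoinColumn']))
right = Table2.assign(join_key=middle_token(Table2['JoinColumn']))

# Inner join on the helper column; the suffixes mark which table each
# overlapping column came from.
merged = left.merge(right, on='join_key', how='inner', suffixes=('_t1', '_t2'))
print(merged)
```

If the helper column is not wanted in the final result, it can be removed afterwards with merged.drop(columns='join_key').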
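Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you repost this content, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.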