Joining DataFrames in pandas on part of the string in the join columns

I have two DataFrames in pandas and I want to INNER join them on JoinColumn. However, the values in Table1.JoinColumn look like 1_String1_0, while the values in Table2.JoinColumn look like 2_String1_1. Is there a way to join the two tables without first splitting the column values on "_" and joining afterwards?

import pandas as pd

idx = ['a', 'b', 'c', 'd']

Table1 = pd.DataFrame({
    'JoinColumn': pd.Series(['1_Abc_0', '2_Cde_1', '3_Efg_0', '5_xyz_1'], index=idx),
    'Col2': pd.Series([10, 20, 30, 40], index=idx),
    'Col3': pd.Series([1, 2., 3., 4.], index=idx),
})

Table2 = pd.DataFrame({
    'JoinColumn': pd.Series(['2_Abc_1', '2_Cde_0', '6_Efg_0', '9_xyz_2'], index=idx),
    'Col2': pd.Series([10, 20, 30, 40], index=idx),
    'Col3': pd.Series([1, 2., 3., 4.], index=idx),
})

I want to merge these two tables on JoinColumn, matching only on the middle part of each value, such as "Abc" or "xyz".

I think splitting is necessary, but you do not have to store the result in a new column: the split Series can be passed directly to the left_on and right_on parameters:

df = pd.merge(Table1, 
              Table2, 
              left_on=Table1['JoinColumn'].str.split('_').str[1],
              right_on=Table2['JoinColumn'].str.split('_').str[1])

print (df)
   Col2_x  Col3_x JoinColumn_x  Col2_y  Col3_y JoinColumn_y
0      10     1.0      1_Abc_0      10     1.0      2_Abc_1
1      20     2.0      2_Cde_1      20     2.0      2_Cde_0
2      30     3.0      3_Efg_0      30     3.0      6_Efg_0
3      40     4.0      5_xyz_1      40     4.0      9_xyz_2
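
As a possible alternative, here is a minimal sketch that pulls out the middle token with a regular expression instead of str.split, and merges on an explicit temporary key column. It assumes the key is always the text between the first and second underscore. The helper name middle_token, the column name key, and the lowercase table names are hypothetical and not part of the original answer.

```python
import pandas as pd

table1 = pd.DataFrame({
    "JoinColumn": ["1_Abc_0", "2_Cde_1", "3_Efg_0", "5_xyz_1"],
    "Col2": [10, 20, 30, 40],
})
table2 = pd.DataFrame({
    "JoinColumn": ["2_Abc_1", "2_Cde_0", "6_Efg_0", "9_xyz_2"],
    "Col2": [10, 20, 30, 40],
})


def middle_token(s: pd.Series) -> pd.Series:
    """Return the text between the first and second underscore (hypothetical helper)."""
    # expand=False returns a Series because the pattern has a single capture group
    return s.str.extract(r"^[^_]*_([^_]*)_", expand=False)


# Attach the extracted key to temporary copies of both tables, then inner-join on it
merged = table1.assign(key=middle_token(table1["JoinColumn"])).merge(
    table2.assign(key=middle_token(table2["JoinColumn"])),
    on="key",
    how="inner",
    suffixes=("_x", "_y"),
)

print(merged[["key", "JoinColumn_x", "JoinColumn_y"]])
```

Because assign returns a copy, the original tables are left unchanged, and the key column stays visible in the result so the match can be checked by eye. Values that do not fit the pattern get a missing key, so it may be worth checking for NaN in key before merging.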

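Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site's URL or the address of the original post. For any questions, contact yoyou2525@163.com.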

 