
Joining pandas DataFrames on part of the string in the join column

I have two DataFrames in pandas and want to INNER join them on JoinColumn. However, the values in Table1.JoinColumn look like 1_String1_0, while the values in Table2.JoinColumn look like 2_String1_1. Is there a way to join the two tables without first splitting the column values on "_" and joining afterwards?

import pandas as pd

Table1 = pd.DataFrame({
    'JoinColumn': pd.Series(['1_Abc_0', '2_Cde_1', '3_Efg_0', '5_xyz_1'], index=['a', 'b', 'c', 'd']),
    'Col2': pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd']),
    'Col3': pd.Series([1, 2., 3., 4.], index=['a', 'b', 'c', 'd']),
})

Table2 = pd.DataFrame({
    'JoinColumn': pd.Series(['2_Abc_1', '2_Cde_0', '6_Efg_0', '9_xyz_2'], index=['a', 'b', 'c', 'd']),
    'Col2': pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd']),
    'Col3': pd.Series([1, 2., 3., 4.], index=['a', 'b', 'c', 'd']),
})

I want to merge these two tables on JoinColumn, matching only on the middle part of each value, such as "Abc" or "xyz".

I think splitting is necessary, but you do not need a separate step for it: the left_on and right_on parameters accept a Series, so the split result can be passed to them directly:

df = pd.merge(Table1, 
              Table2, 
              left_on=Table1['JoinColumn'].str.split('_').str[1],
              right_on=Table2['JoinColumn'].str.split('_').str[1])

print(df)
   Col2_x  Col3_x JoinColumn_x  Col2_y  Col3_y JoinColumn_y
0      10     1.0      1_Abc_0      10     1.0      2_Abc_1
1      20     2.0      2_Cde_1      20     2.0      2_Cde_0
2      30     3.0      3_Efg_0      30     3.0      6_Efg_0
3      40     4.0      5_xyz_1      40     4.0      9_xyz_2
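
If you would rather keep the extracted key visible in the result, one possible variation is to store it in a temporary column and merge on that column. This is only a sketch: the helper name middle_token and the column name join_key are made up for illustration, the frames are trimmed to two columns with a default index, and it assumes every value has the form prefix_key_suffix.

```python
import pandas as pd

# Trimmed-down versions of the tables from the question (default index).
table1 = pd.DataFrame({
    "JoinColumn": ["1_Abc_0", "2_Cde_1", "3_Efg_0", "5_xyz_1"],
    "Col2": [10, 20, 30, 40],
})
table2 = pd.DataFrame({
    "JoinColumn": ["2_Abc_1", "2_Cde_0", "6_Efg_0", "9_xyz_2"],
    "Col2": [10, 20, 30, 40],
})


def middle_token(series):
    """Return the part between the first and second underscore (hypothetical helper)."""
    return series.str.split("_").str[1]


# assign() returns copies, so the original frames are left untouched.
merged = table1.assign(join_key=middle_token(table1["JoinColumn"])).merge(
    table2.assign(join_key=middle_token(table2["JoinColumn"])),
    on="join_key",
    how="inner",
    suffixes=("_x", "_y"),
)

print(merged)
```

The named key column makes it easy to check which rows were matched, and it can be removed afterwards with merged.drop(columns="join_key") if it is not needed.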
