
How do I merge two DataFrames in Python only if two columns match across both?

df1 is all_cases

df2 is all_bca

The following code merges all_cases with all_bca (3 value columns). It joins rows if either EFE / Manual E-Form / Gate Pass No. or Realization Date matches across the two DataFrames.

cross = pd.merge(
    all_cases,
    all_bca[['EFE / Manual E-Form / Gate Pass No.', 'Realization Date', 'BCA(FC)',
             'Foreign Bank Charges', 'Agent/Brokerage Commision']],
    on=('EFE / Manual E-Form / Gate Pass No.', 'Realization Date'),
    how='left')

I want to merge only if both columns match. How do I do that?

all_cases

EFE / Manual E-Form / Gate Pass No.    Realization Date      
123456                                 1/1/2019         
789654                                 2/18/2019                    
852147                                 1/3/2018             
93258                                  1/4/2019           

all_bca

EFE / ......    Realization Date      BCA(FC)     Charges       Commision
123456             8/1/2019           88           8               8
789654             2/18/2019          300          30              10
852147             1/3/2018           500          25              20
93258              1/4/2019           1000         20              30
2530245            1/1/2019           333          33              33

desired result

EFE     Realization Date  BCA(FC)  Charges  Commision  Check
123456  1/1/2019          -        -        -          Not Match
789654  2/18/2019         300      30       10         Match
852147  1/3/2018          500      25       20         Match
93258   1/4/2019          -        -        -          Not Match


current output

EFE     Realization Date  BCA(FC)  Charges  Commision  Check
123456  1/1/2019          88       8        8          Match
789654  2/18/2019         300      30       10         Match
852147  1/3/2018          500      25       20         Match
93258   1/4/2019          1000     20       30         Match

Use merge with both columns listed in on and an inner join (the default). Only rows where both columns match are kept.

import pandas as pd

# Sample data from the question. Note that the last date in all_cases is
# '1/4/2019 ' with a trailing space, so it does not equal '1/4/2019' in all_bca.
all_cases = pd.DataFrame([['123456','1/1/2019'],['789654','2/18/2019'],['852147','1/3/2018'],['93258','1/4/2019 ']],
                      columns=['EFE / Manual E-Form / Gate Pass No.','Realization Date'])
all_bca = pd.DataFrame([['123456','8/1/2019','88','8','8'],
                        ['789654','2/18/2019','300','30','10'],
                        ['852147','1/3/2018','500','25','20'],
                        ['93258','1/4/2019','1000','20','30'],
                        ['2530245','1/1/2019','333','33','33']],
                      columns=['EFE / Manual E-Form / Gate Pass No.','Realization Date','BCA(FC)','Charges','Commision'])
# A row is kept only when BOTH key columns are equal in the two frames.
cross = all_cases.merge(all_bca, 
                        on=['EFE / Manual E-Form / Gate Pass No.','Realization Date'],
                        how='inner')
print(cross)

Output:

  EFE / Manual E-Form / Gate Pass No. Realization Date  ... Charges Commision
0                              789654        2/18/2019  ...      30        10
1                              852147         1/3/2018  ...      25        20

[2 rows x 5 columns]
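
Because merge compares key values exactly, a stray space or a differently formatted date is enough to stop a row from matching (the '1/4/2019 ' entry in the sample all_cases has a trailing space). If that kind of difference is unintended in your real data, a rough sketch of cleaning the keys first could look like this. The helper name normalize_keys is made up for illustration, and the month/day/year date format is an assumption based on the sample values:

```python
import pandas as pd

key_cols = ['EFE / Manual E-Form / Gate Pass No.', 'Realization Date']

all_cases = pd.DataFrame(
    [['123456', '1/1/2019'], ['789654', '2/18/2019'],
     ['852147', '1/3/2018'], ['93258', '1/4/2019 ']],  # trailing space as in the sample
    columns=key_cols)
all_bca = pd.DataFrame(
    [['123456', '8/1/2019', '88', '8', '8'],
     ['789654', '2/18/2019', '300', '30', '10'],
     ['852147', '1/3/2018', '500', '25', '20'],
     ['93258', '1/4/2019', '1000', '20', '30'],
     ['2530245', '1/1/2019', '333', '33', '33']],
    columns=key_cols + ['BCA(FC)', 'Charges', 'Commision'])

def normalize_keys(df):
    """Hypothetical helper: strip whitespace and parse dates in the key columns."""
    out = df.copy()
    out[key_cols[0]] = out[key_cols[0]].astype(str).str.strip()
    # Assumes month/day/year strings such as '2/18/2019'.
    out[key_cols[1]] = pd.to_datetime(
        out[key_cols[1]].astype(str).str.strip(), format='%m/%d/%Y')
    return out

# With cleaned keys, 93258 now matches on both columns as well.
cross = normalize_keys(all_cases).merge(
    normalize_keys(all_bca), on=key_cols, how='inner')
print(cross)
```

Skip this step if the difference between the two values is meaningful in your data.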

EDIT1:

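If you want to keep the unmatched rows as well and flag which rows matched, pass indicator. Note that how='right', as used here, keeps every row of all_bca; use how='left' instead to keep every row of all_cases: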

cross = all_cases.merge(all_bca,
                        on=['EFE / Manual E-Form / Gate Pass No.','Realization Date'],
                        how='right',
                        indicator='check')
print(cross)

Output:

  EFE / Manual E-Form / Gate Pass No. Realization Date  ... Commision       check
0                              789654        2/18/2019  ...        10        both
1                              852147         1/3/2018  ...        20        both
2                              123456         8/1/2019  ...         8  right_only
3                               93258         1/4/2019  ...        30  right_only
4                             2530245         1/1/2019  ...        33  right_only

[5 rows x 6 columns]

The rows with check == 'both' are the ones that match on both columns; right_only rows exist only in all_bca.
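
To get something closer to the desired result in the question (every row of all_cases kept, with a Match / Not Match label), one possible sketch is a left merge plus the indicator column. The Check column name comes from the question; the intermediate column name _merge_status is just an illustrative choice:

```python
import pandas as pd

key_cols = ['EFE / Manual E-Form / Gate Pass No.', 'Realization Date']

all_cases = pd.DataFrame(
    [['123456', '1/1/2019'], ['789654', '2/18/2019'],
     ['852147', '1/3/2018'], ['93258', '1/4/2019 ']],  # trailing space as in the sample
    columns=key_cols)
all_bca = pd.DataFrame(
    [['123456', '8/1/2019', '88', '8', '8'],
     ['789654', '2/18/2019', '300', '30', '10'],
     ['852147', '1/3/2018', '500', '25', '20'],
     ['93258', '1/4/2019', '1000', '20', '30'],
     ['2530245', '1/1/2019', '333', '33', '33']],
    columns=key_cols + ['BCA(FC)', 'Charges', 'Commision'])

# how='left' keeps every row of all_cases; the indicator column records
# whether a row with the same pair of key values was found in all_bca.
cross = all_cases.merge(all_bca, on=key_cols, how='left',
                        indicator='_merge_status')

# Turn the indicator into the Match / Not Match label from the question.
cross['Check'] = (cross['_merge_status'] == 'both').map(
    {True: 'Match', False: 'Not Match'})
cross = cross.drop(columns='_merge_status')
print(cross)
```

Unmatched rows get NaN in BCA(FC), Charges and Commision, which you can replace with '-' via fillna if you want the exact layout shown in the question.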
