
Pandas - Merge 2 df with same column names but exclusive values

  • I have one main DataFrame, MainDF, with a column key and other columns that are not relevant here.
  • I also have two other DataFrames, dfA and dfB, each with two columns, key and tariff. The keys in dfA and dfB are mutually exclusive, i.e. no key appears in both dfA and dfB.
  • On MainDF, I do MainDF.merge(dfA, how='left', on='key'), which adds the column tariff for the keys that are in both MainDF and dfA, and puts NaN for every key in MainDF that is not in dfA.
  • Now I need to do MainDF.merge(dfB, how='left', on='key') to add the tariff for the keys that are in MainDF but not in dfA.
  • When I do the second merge, it creates two columns, tariff_x and tariff_y, because tariff is already in MainDF after the first merge. However, since the keys are exclusive, I need to keep a single tariff column holding the non-NaN value wherever one exists.
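The two-step merge described in the list above can be reproduced with a small sketch. The sample values and the extra column other are made up for illustration; only the names MainDF, dfA, dfB, key and tariff come from the question. It also shows the column-coalescing workaround, assuming fillna is an acceptable way to pick the non-NaN value.

```python
import pandas as pd

# Hypothetical sample data; only the column names key and tariff come from the question.
MainDF = pd.DataFrame({'key': ['a', 'b', 'c', 'd'], 'other': [10, 20, 30, 40]})
dfA = pd.DataFrame({'key': ['a', 'b'], 'tariff': [1.0, 2.0]})
dfB = pd.DataFrame({'key': ['c'], 'tariff': [3.0]})

# The first merge adds tariff. The second merge finds tariff on both sides,
# so pandas applies its default suffixes _x and _y.
step1 = MainDF.merge(dfA, how='left', on='key')
step2 = step1.merge(dfB, how='left', on='key')
print(list(step2.columns))

# Workaround: take tariff_x where it is present, otherwise tariff_y.
step2['tariff'] = step2['tariff_x'].fillna(step2['tariff_y'])
result = step2.drop(columns=['tariff_x', 'tariff_y'])
print(result)
```

Key d appears in neither dfA nor dfB, so its tariff stays NaN.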

How should I do this in a Pythonic way? I could add a new column that takes either tariff_x or tariff_y, but I don't find that very elegant.

Thanks

You can concat dfA and dfB first, before merging with MainDF:

MainDF.merge(pd.concat([dfA, dfB], axis=0), how='left', on='key')
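As a minimal runnable sketch of that one-liner: the sample values and the column other are hypothetical, and the intermediate name lookup is introduced here only for readability.

```python
import pandas as pd

# Hypothetical sample data; only the column names key and tariff come from the question.
MainDF = pd.DataFrame({'key': ['a', 'b', 'c', 'd'], 'other': [10, 20, 30, 40]})
dfA = pd.DataFrame({'key': ['a', 'b'], 'tariff': [1.0, 2.0]})
dfB = pd.DataFrame({'key': ['c'], 'tariff': [3.0]})

# Stack the two lookup tables row-wise; their keys do not overlap.
lookup = pd.concat([dfA, dfB], axis=0, ignore_index=True)

# A single left merge now yields one tariff column, with no _x/_y suffixes.
merged = MainDF.merge(lookup, how='left', on='key')
print(merged)
```

Because only one merge touches the tariff column, there is nothing to coalesce afterwards.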

Do you need something like this:

import pandas as pd

dfA = pd.DataFrame({'tariff': [1, 2, 3], 'A': list('abc')})
dfB = pd.DataFrame({'tariff': [4, 5, 6], 'A': list('def')})

# Stack both frames row-wise and renumber the index from 0
dfJoin = pd.concat([dfA, dfB], ignore_index=True)

   tariff  A
0       1  a
1       2  b
2       3  c
3       4  d
4       5  e
5       6  f

Now you can merge MainDF with dfJoin.
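A sketch of that final merge, assuming a hypothetical MainDF that shares the key column A with dfJoin. The validate='many_to_one' argument is an optional addition, not part of the answer above: it asks pandas to raise an error if a key occurs more than once in dfJoin, which guards the assumption that the keys of dfA and dfB are exclusive.

```python
import pandas as pd

dfA = pd.DataFrame({'tariff': [1, 2, 3], 'A': list('abc')})
dfB = pd.DataFrame({'tariff': [4, 5, 6], 'A': list('def')})
dfJoin = pd.concat([dfA, dfB], ignore_index=True)

# Hypothetical main frame; 'z' has no match in dfA or dfB.
MainDF = pd.DataFrame({'A': list('afz'), 'other': [10, 20, 30]})

# Left merge on the shared key column. validate checks that each key
# is unique on the right-hand side (dfJoin).
merged = MainDF.merge(dfJoin, how='left', on='A', validate='many_to_one')
print(merged)
```

Unmatched keys get NaN in tariff, so the column is converted to float in the result.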

