I have two DataFrames that look like this:
      bin  last_4       brand   name  chargeback
0  112233    1234        visa    Joe           0
1  445566    5678        visa   Susy           0
2  778899    9012  mastercard  James           0

      bin  last_4  chargeback
0  112233    1234           1
1  445566    5678           1
I want to get the following result:
      bin  last_4       brand   name  chargeback
0  112233    1234        visa    Joe           0
1  445566    5678        visa   Susy           0
2  778899    9012  mastercard  James           0
3  112233    1234        visa    Joe           1
4  445566    5678        visa   Susy           1
I have already tried several variations of pd.merge(). However, when I called pd.merge(df_1, df_2, how='outer', on=['bin', 'last_4']), I got only 3 rows, with the 'chargeback' column duplicated, like this:
      bin  last_4       brand   name  chargeback_x  chargeback_y
0  112233    1234        visa    Joe             0           1.0
1  445566    5678        visa   Susy             0           1.0
2  778899    9012  mastercard  James             0           NaN
And when I called pd.merge(df_1, df_2, how='outer', on=['bin', 'last_4', 'chargeback']), I got NaN values in the 'brand' and 'name' columns:
      bin  last_4       brand   name  chargeback
0  112233    1234        visa    Joe           0
1  445566    5678        visa   Susy           0
2  778899    9012  mastercard  James           0
3  112233    1234         NaN    NaN           1
4  445566    5678         NaN    NaN           1
How can I get these replicated rows with the full information?
You can use pd.concat with pd.merge:
pd.concat([df1, df2.merge(df1.drop('chargeback', axis=1), how='left', on=['bin', 'last_4'])])
Out[1]:
      bin  last_4       brand   name  chargeback
0  112233    1234        visa    Joe           0
1  445566    5678        visa   Susy           0
2  778899    9012  mastercard  James           0
0  112233    1234        visa    Joe           1
1  445566    5678        visa   Susy           1
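Since the second DataFrame is missing some information ('brand' and 'name'), left-merge it with the first DataFrame on 'bin' and 'last_4', but drop the first DataFrame's 'chargeback' column beforehand so that only the second DataFrame's 'chargeback' values survive. Then concat this merged DataFrame with the first DataFrame.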
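For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the same approach. The sample data is rebuilt from the tables in the question, and the intermediate name `enriched` is my own. It also passes ignore_index=True to pd.concat, which is an addition to the one-liner in this answer: it renumbers the rows 0 to 4, as in the desired result, instead of repeating 0 and 1.

```python
import pandas as pd

# Sample data reconstructed from the tables in the question.
df1 = pd.DataFrame({
    "bin": [112233, 445566, 778899],
    "last_4": [1234, 5678, 9012],
    "brand": ["visa", "visa", "mastercard"],
    "name": ["Joe", "Susy", "James"],
    "chargeback": [0, 0, 0],
})
df2 = pd.DataFrame({
    "bin": [112233, 445566],
    "last_4": [1234, 5678],
    "chargeback": [1, 1],
})

# Look up brand and name for each row of df2. df1's chargeback column is
# dropped first, so the merged frame keeps only df2's chargeback values.
# "enriched" is a hypothetical name used for illustration.
enriched = df2.merge(
    df1.drop(columns="chargeback"), how="left", on=["bin", "last_4"]
)

# Stack the enriched rows under df1 and renumber the index from 0.
result = pd.concat([df1, enriched], ignore_index=True)
print(result)
```

Note that this assumes each ('bin', 'last_4') pair appears at most once in df1. If df1 contained duplicate pairs, the left merge would produce one output row per match.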