I have two dataframes that share the same client id and date columns but hold different amounts.
I am trying to build another dataframe that takes the amount from dfA where a matching row exists and keeps the 0 from dfB where dfA has no matching row.
dfA:
client_id date amount
0 1 2020-07-11 100
1 1 2020-07-10 90
2 1 2020-07-09 80
3 1 2020-07-12 70
3 1 2020-07-01 86
dfB:
client_id date amount
0 1 2020-07-11 0
1 1 2020-07-10 0
2 1 2020-07-09 0
3 1 2020-07-07 0
4 1 2020-07-06 0
5 1 2020-07-05 0
5 1 2020-07-04 0
3 1 2020-07-03 0
4 1 2020-07-02 0
5 1 2020-07-01 0
I want to get:
dfResult:
client_id date amount
0 1 2020-07-11 100
1 1 2020-07-10 90
2 1 2020-07-09 80
3 1 2020-07-07 70
4 1 2020-07-06 0
5 1 2020-07-05 0
5 1 2020-07-04 0
3 1 2020-07-03 0
4 1 2020-07-02 0
5 1 2020-07-01 86
You can concat the dataframes together, sort by amount in descending order so that dfA's non-zero rows come first, and then drop duplicates on client_id and date.
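import pandas as pd
dfResult = (
pd.concat([dfA, dfB])
.sort_values(by='amount', ascending=False)  # largest amount first within each key
.drop_duplicates(subset=['client_id', 'date'], keep='first')
.sort_values(by=['client_id', 'date'], ascending=[True, False])
.reset_index(drop=True)  # drop=True avoids adding an extra 'index' column
)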
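As a self-contained sketch of the concat approach, the snippet below rebuilds the question's sample data (dates kept as plain strings, which is an assumption) and applies the same steps. It relies on dfA's amounts being larger than dfB's zeros, and it also keeps dfA rows whose date is missing from dfB (2020-07-12 here). The name `restricted` is only illustrative and shows one way to drop those extra rows.

```python
import pandas as pd

# Sample data from the question; dates are kept as ISO strings (an assumption).
dfA = pd.DataFrame({
    "client_id": [1, 1, 1, 1, 1],
    "date": ["2020-07-11", "2020-07-10", "2020-07-09", "2020-07-12", "2020-07-01"],
    "amount": [100, 90, 80, 70, 86],
})
dfB = pd.DataFrame({
    "client_id": [1] * 10,
    "date": ["2020-07-11", "2020-07-10", "2020-07-09", "2020-07-07", "2020-07-06",
             "2020-07-05", "2020-07-04", "2020-07-03", "2020-07-02", "2020-07-01"],
    "amount": [0] * 10,
})

dfResult = (
    pd.concat([dfA, dfB])
    .sort_values(by="amount", ascending=False)   # non-zero dfA rows sort first
    .drop_duplicates(subset=["client_id", "date"], keep="first")
    .sort_values(by=["client_id", "date"], ascending=[True, False])
    .reset_index(drop=True)
)

# Optional: keep only the (client_id, date) pairs that appear in dfB.
restricted = dfResult.merge(dfB[["client_id", "date"]],
                            on=["client_id", "date"], how="inner")

print(dfResult)
print(restricted)
```

If the amounts can be negative, sorting by amount no longer guarantees that the dfA row wins, so a merge is probably the safer choice in that case.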
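Try this, which looks up each dfB date in dfA and returns a Series of amounts (note that it matches on date only, not on client_id):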
(
dfB.date.map(
dfA.set_index('date')['amount'].to_dict()
).fillna(0.0)
)
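If more than one client is present, a date-only lookup could pick up another client's amount. A possible variant, sketched here under the assumption that each (client_id, date) pair is unique in dfA, looks up both keys through a MultiIndex. The names `lookup` and `keys` are illustrative only.

```python
import pandas as pd

# Sample data from the question; dates are kept as ISO strings (an assumption).
dfA = pd.DataFrame({
    "client_id": [1, 1, 1, 1, 1],
    "date": ["2020-07-11", "2020-07-10", "2020-07-09", "2020-07-12", "2020-07-01"],
    "amount": [100, 90, 80, 70, 86],
})
dfB = pd.DataFrame({
    "client_id": [1] * 10,
    "date": ["2020-07-11", "2020-07-10", "2020-07-09", "2020-07-07", "2020-07-06",
             "2020-07-05", "2020-07-04", "2020-07-03", "2020-07-02", "2020-07-01"],
    "amount": [0] * 10,
})

# Amounts from dfA indexed by (client_id, date); the pairs must be unique.
lookup = dfA.set_index(["client_id", "date"])["amount"]

# One key per dfB row, in dfB's row order.
keys = pd.MultiIndex.from_frame(dfB[["client_id", "date"]])

# Keys missing from dfA become NaN after reindex and are filled with 0.
dfResult = dfB.assign(amount=lookup.reindex(keys).fillna(0).to_numpy())
print(dfResult)
```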
Or, to match on both client_id and date and get the full dataframe back:
(
dfB.merge(
dfA, on=['client_id', 'date'], suffixes=("_x", ""), how='left'
).fillna(0.0).drop(columns=["amount_x"])
)
client_id date amount
0 1 2020-07-11 100.0
1 1 2020-07-10 90.0
2 1 2020-07-09 80.0
3 1 2020-07-07 0.0
4 1 2020-07-06 0.0
5 1 2020-07-05 0.0
5 1 2020-07-04 0.0
3 1 2020-07-03 0.0
4 1 2020-07-02 0.0
5 1 2020-07-01 86.0
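For reference, here is a runnable sketch of the left-merge idea on the question's sample data (dates kept as plain strings, which is an assumption). Instead of filling every NaN with 0.0, this variant falls back to dfB's own amount, so it should also work if dfB holds values other than 0. The suffixes `_b` and `_a` are arbitrary names chosen for the example, and dfA is assumed to have at most one row per (client_id, date).

```python
import pandas as pd

# Sample data from the question; dates are kept as ISO strings (an assumption).
dfA = pd.DataFrame({
    "client_id": [1, 1, 1, 1, 1],
    "date": ["2020-07-11", "2020-07-10", "2020-07-09", "2020-07-12", "2020-07-01"],
    "amount": [100, 90, 80, 70, 86],
})
dfB = pd.DataFrame({
    "client_id": [1] * 10,
    "date": ["2020-07-11", "2020-07-10", "2020-07-09", "2020-07-07", "2020-07-06",
             "2020-07-05", "2020-07-04", "2020-07-03", "2020-07-02", "2020-07-01"],
    "amount": [0] * 10,
})

# A left merge keeps every dfB row; unmatched rows get NaN in amount_a.
merged = dfB.merge(dfA, on=["client_id", "date"], how="left",
                   suffixes=("_b", "_a"))

# Prefer dfA's amount and fall back to dfB's amount where dfA has no match.
merged["amount"] = (
    merged["amount_a"].fillna(merged["amount_b"]).astype(dfB["amount"].dtype)
)

dfResult = merged[["client_id", "date", "amount"]]
print(dfResult)
```

Casting back to dfB's dtype keeps the amounts as integers, since the NaN values introduced by the merge would otherwise turn the column into floats.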