I have two DataFrames that I want to join on two columns, keeping only a single record whenever the inner join produces more than one match.
When I combine both DataFrames using an inner join on 'Patient_id' and 'diag_date', I get
I want only index '934814' of DF1 (Nasal Steroids) to map to '42775' of DF2, and not to any other index. I don't want to group by patient_id and take the last record; this has to happen while merging the two tables. I want only the last matching row from the inner join, not all of them. Can you please suggest some solutions?
Thanks a lot!
Use DataFrame.drop_duplicates with keep='last' and the columns used for the join, before DataFrame.merge:
# Keep only the last DF1 row per join key, then inner-join with DF2
df = (DF1.drop_duplicates(['Patient_id','Prescription_date'], keep='last')
         .merge(DF2, on=['Patient_id','Prescription_date']))
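To see the effect, here is a minimal sketch on made-up data. The column names follow the snippet above, but the values, the 'Drug' and 'Diagnosis' columns, and every index other than 934814 and 42775 are assumptions for illustration only, not the asker's real tables.

```python
import pandas as pd

# Hypothetical sample data (values invented for illustration).
DF1 = pd.DataFrame(
    {
        "Patient_id": [1, 1, 1, 2],
        "Prescription_date": ["2020-01-05", "2020-01-05", "2020-02-01", "2020-03-10"],
        "Drug": ["Antihistamine", "Nasal Steroids", "Decongestant", "Antibiotic"],
    },
    index=[934812, 934814, 934820, 934900],
)
DF2 = pd.DataFrame(
    {
        "Patient_id": [1, 2],
        "Prescription_date": ["2020-01-05", "2020-03-10"],
        "Diagnosis": ["Rhinitis", "Sinusitis"],
    },
    index=[42775, 42800],
)

keys = ["Patient_id", "Prescription_date"]

# Plain inner join: both DF1 rows for patient 1 on 2020-01-05 match DF2.
plain = DF1.merge(DF2, on=keys)

# Drop earlier duplicates of each key first, so only the last DF1 row
# per (Patient_id, Prescription_date) takes part in the join.
deduped = DF1.drop_duplicates(subset=keys, keep="last").merge(DF2, on=keys)

print(plain)
print(deduped)
```

Note that "last" here means last in DF1's current row order. If the last record should be decided by some other column, sort DF1 by that column before calling drop_duplicates.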