Hi, I have two dataframes with many rows of data. For simplicity, I've pulled out just a few rows as an example:
df1:
id  valid  name       note
--------------------------------------------------------
1   yes    tom, jane  He is a engineer.She is a teacher.
1   no     tim        He's a doctor
2   no     john       He's a student
df2:
id  name  note               Criterior1  Criterior2  valid
----------------------------------------------------------
1   tom   He is a engineer.  yes         no          no
1   jane  She is a teacher.  yes         no          no
1   tim   He's a doctor.     yes         no          yes
2   john  He's a student     no          yes         yes
df1 is similar to df2, except that in df1 I combined the cell values of the 'name' and 'note' columns for rows that share the same 'id' and 'valid' values.
I want to combine them into one dataframe, taking the id/valid/name/note columns from df1 and the Criterior1/Criterior2 columns from df2, matched by id, like so:
df3:
id  valid  name       note                                Criterior1  Criterior2
--------------------------------------------------------------------------------
1   yes    tom, jane  He is a engineer.She is a teacher.  yes         no
1   no     tim        He's a doctor                       yes         no
2   no     john       He's a student                      no          yes
I tried several approaches, for example:
df3 = df2.merge(df1, how="left")
For some reason, I get NaN values in the rows where I combined values, such as id=1 and valid=yes. For rows that I did not combine, such as id=1 and valid=no, the merge works without any issue.
df3:
id  valid  name       note                                Criterior1  Criterior2
--------------------------------------------------------------------------------
1   yes    tom, jane  He is a engineer.She is a teacher.  NaN         NaN
1   no     tim        He's a doctor                       yes         no
2   no     john       He's a student                      no          yes
Try this:
df1.merge(df2, on='id', how='left')
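When you don't specify the 'on' parameter, merge joins on all columns the two dataframes have in common. Here those are id, valid, name and note, so a combined row such as name 'tom, jane' has no exact match in df2 and its Criterior1/Criterior2 values come back as NaN.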
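To make the default-join behaviour concrete, here is a minimal sketch that rebuilds the sample frames from the question and merges on 'id' only. One caveat: merging on 'id' against the full df2 pairs each df1 row with every df2 row of that id and duplicates the shared columns with _x/_y suffixes. The sketch therefore first reduces df2 to one row of criteria per id. That step assumes Criterior1/Criterior2 are the same for every row of a given id, which holds for the sample data but may not hold for the real data. The names common and criteria are only illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the question.
df1 = pd.DataFrame({
    "id": [1, 1, 2],
    "valid": ["yes", "no", "no"],
    "name": ["tom, jane", "tim", "john"],
    "note": ["He is a engineer.She is a teacher.",
             "He's a doctor", "He's a student"],
})
df2 = pd.DataFrame({
    "id": [1, 1, 1, 2],
    "name": ["tom", "jane", "tim", "john"],
    "note": ["He is a engineer.", "She is a teacher.",
             "He's a doctor.", "He's a student"],
    "Criterior1": ["yes", "yes", "yes", "no"],
    "Criterior2": ["no", "no", "no", "yes"],
    "valid": ["no", "no", "yes", "yes"],
})

# Columns present in both frames: these are the join keys
# merge uses when 'on' is omitted.
common = df1.columns.intersection(df2.columns)
print(sorted(common))

# Assumption: the criteria are constant within each id, so one row
# per id is enough. Only the key and the wanted columns are kept.
criteria = df2[["id", "Criterior1", "Criterior2"]].drop_duplicates()

# validate="many_to_one" raises a MergeError if an id still appears
# more than once in criteria, i.e. if the assumption above is false.
df3 = df1.merge(criteria, on="id", how="left", validate="many_to_one")
print(df3)
```

If the criteria can differ between rows of the same id, the drop_duplicates step leaves several rows per id and the validate check fails. You would then need to decide how to aggregate them (for example with groupby) before merging.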