Pandas dataframe left join returning NaN when column value is different

Hi, I have two dataframes with many rows of data. For simplicity, here are just a few rows from each as an example:

df1:
id     valid     name           note
------------------------------------------------------------------
1      yes       tom, jane      He is a engineer.She is a teacher.
1      no        tim            He's a doctor
2      no        john           He's a student



df2:
id     name      note                Criterior1    Criterior2    valid 
---------------------------------------------------------------------------------
1      tom       He is a engineer.   yes           no            no
1      jane      She is a teacher.   yes           no            no
1      tim       He's a doctor.      yes           no            yes
2      john      He's a student      no            yes           yes

df2 is similar to df1, but in df1 I combined the cell values of the 'name' and 'note' columns for rows that share the same 'id' and 'valid' values.

I want to combine them into one dataframe, taking the id/valid/name/note columns from df1 and the Criterior1/Criterior2 columns from df2, matched by id, like so:

df3:
id     valid     name           note                                  Criterior1    Criterior2
---------------------------------------------------------------------------------------------
1      yes       tom, jane      He is a engineer.She is a teacher.    yes           no
1      no        tim            He's a doctor                         yes           no
2      no        john           He's a student                        no            yes

I tried several approaches, such as:

df3 = df2.merge(df1, how="left")

For some reason, I get NaN values in the rows where I combined values, such as id=1 and valid=yes. Rows that I did not combine, such as id=1 and valid=no, merge without any issue.

df3:
id     valid     name           note                                  Criterior1    Criterior2
---------------------------------------------------------------------------------------------
1      yes       tom, jane      He is a engineer.She is a teacher.    NaN           NaN
1      no        tim            He's a doctor                         yes           no
2      no        john           He's a student                        no            yes

Try this:

df1.merge(df2, on='id', how='left')

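When you don't specify the on parameter, merge joins on all columns the two dataframes have in common. Here that means id, valid, name and note, so a row whose name and note were combined in df1 has no exact match in df2, and its Criterior columns come back as NaN. Passing on='id' restricts the join key to id alone.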
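One caveat: joining on id alone pairs each df1 row with every df2 row that has the same id, and the other shared columns (valid, name, note) come back twice with _x and _y suffixes. The sketch below is one possible way to get the df3 layout from the question. It assumes that Criterior1 and Criterior2 are identical for every df2 row with a given id, which holds for the sample rows but may not hold for the full data. The variable names default_merge and criteria are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question
df1 = pd.DataFrame({
    "id": [1, 1, 2],
    "valid": ["yes", "no", "no"],
    "name": ["tom, jane", "tim", "john"],
    "note": ["He is a engineer.She is a teacher.", "He's a doctor", "He's a student"],
})
df2 = pd.DataFrame({
    "id": [1, 1, 1, 2],
    "name": ["tom", "jane", "tim", "john"],
    "note": ["He is a engineer.", "She is a teacher.", "He's a doctor.", "He's a student"],
    "Criterior1": ["yes", "yes", "yes", "no"],
    "Criterior2": ["no", "no", "no", "yes"],
    "valid": ["no", "no", "yes", "yes"],
})

# Without `on`, the join key is every shared column (id, valid, name, note),
# so the combined "tom, jane" row finds no exact match in df2 and gets NaN.
default_merge = df1.merge(df2, how="left")

# Assumption: the criteria are constant per id. Keep one criteria row per id,
# then join on id only. validate="many_to_one" makes pandas raise an error
# if some id still has more than one criteria row.
criteria = df2[["id", "Criterior1", "Criterior2"]].drop_duplicates()
df3 = df1.merge(criteria, on="id", how="left", validate="many_to_one")

print(df3)
```

Selecting only id and the two Criterior columns from df2 before merging avoids the suffixed duplicate columns, and dropping duplicates keeps df3 at one row per df1 row. If the criteria can differ between rows with the same id, this approach does not apply as written, and you would first need to decide how to aggregate them.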

