简体   繁体   中英

get the matching rows for multiple columns pandas dataframe

Consider my first data frame df1

      col1 col2  col3
0    hello    q     1
1    world    q     2
2  welcome    r     3
3    hello    t     4

And second data frame df2

    col1 col2  col3
0  hello    q     2

Need output like

      col1 col2  col3
0    hello    q     2
1    world    q     2
2  welcome    r     3
3    hello    t     4

'col1' and 'col2' should be equal and if 'col3' differs get the output and replace the value in dataframe first

I tried to use merge

df1.merge(df2, on=['col1', 'col2'])

    col1 col2  col3_x  col3_y
0  hello    q       1       2

But I don't know what to do next.

Option 1

merge , append , drop_duplicates

df1.drop('col3', 1).merge(df2).append(df1).drop_duplicates(['col1', 'col2'])

      col1 col2  col3
0    hello    q     2
1    world    q     2
2  welcome    r     3
3    hello    t     4

Option 2

set_index and combine_first

cols = ['col1', 'col2']
df2.set_index(cols).combine_first(
    df1.set_index(cols)
).reset_index().astype(df1.dtypes)

      col1 col2  col3
0    hello    q     2
1    hello    t     4
2  welcome    r     3
3    world    q     2

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM