
Combine Pandas Data Frames if Values Match in a Column

I have 2 data frames that I want to merge/combine based on a condition.

Let's say these are the two data frames.

df1
    name    tpye    option  store
    a       2       8       0
    b       4       9       8
    c       3       6       2
    g       3       2       7
    k       1       6       2
    m       3       6       5


df2
   name     red     green   yellow
   a        r       g       y
   b        r       g       y
   m        r       g       y   

What I am trying to do: if a df2['name'] value exists in df1['name'], add the red, green and yellow columns for that row to final_df.

So final_df would look like:

name    tpye    option  store   red     green   yellow
a       2       8       0       r       g       y
b       4       9       8       r       g       y
c       3       6       2
g       3       2       7
k       1       6       2
m       3       6       5       r       g       y

Try this. It works because pandas aligns on the index when assigning columns, so each row of df1 picks up the values from the df2 row with the same name. It is most reliable when name is unique within each dataframe.

df1 = df1.set_index('name')
df2 = df2.set_index('name')
df1[['red', 'green', 'yellow']] = df2[['red', 'green', 'yellow']]
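
For reference, here is a minimal self-contained sketch of the index-based assignment above. The two frames are rebuilt by hand from the question (the final reset_index step and the name final_df are my additions, not part of the answer). Note that rows of df1 with no match in df2 come out as NaN rather than blank.

```python
import pandas as pd

# Frames rebuilt from the question; "tpye" is the column name used there.
df1 = pd.DataFrame({
    "name": ["a", "b", "c", "g", "k", "m"],
    "tpye": [2, 4, 3, 3, 1, 3],
    "option": [8, 9, 6, 2, 6, 6],
    "store": [0, 8, 2, 7, 2, 5],
})
df2 = pd.DataFrame({
    "name": ["a", "b", "m"],
    "red": ["r", "r", "r"],
    "green": ["g", "g", "g"],
    "yellow": ["y", "y", "y"],
})

# Index both frames by the key so the assignment aligns rows by name.
df1 = df1.set_index("name")
df2 = df2.set_index("name")

# Each new column is aligned on the index; names absent from df2 get NaN.
df1[["red", "green", "yellow"]] = df2[["red", "green", "yellow"]]

# Turn "name" back into an ordinary column.
final_df = df1.reset_index()
print(final_df)
```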

Alternatively, pd.merge will work, as @PaulH mentioned:

final_df = df1.merge(df2, how='left', on='name')
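
A sketch of the merge route on the question's data, assuming the original frames with name as an ordinary column. The fillna("") step is my addition, only there to reproduce the blank cells shown in the desired output; drop it if NaN is acceptable.

```python
import pandas as pd

# Frames rebuilt from the question; "tpye" is the column name used there.
df1 = pd.DataFrame({
    "name": list("abcgkm"),
    "tpye": [2, 4, 3, 3, 1, 3],
    "option": [8, 9, 6, 2, 6, 6],
    "store": [0, 8, 2, 7, 2, 5],
})
df2 = pd.DataFrame({
    "name": list("abm"),
    "red": ["r"] * 3,
    "green": ["g"] * 3,
    "yellow": ["y"] * 3,
})

# A left merge keeps every row of df1, in df1's order, and adds df2's
# columns where the name matches; unmatched rows get NaN.
final_df = df1.merge(df2, how="left", on="name")

# Optional: show unmatched cells as empty strings instead of NaN.
final_df = final_df.fillna("")
print(final_df)
```

Unlike the index assignment, merge returns a new dataframe and leaves df1 and df2 untouched.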

You can use the pandas join function. The first dataframe should be the one whose rows you want to keep in full. For example:

import pandas as pd

d1 = pd.DataFrame({'col1': [1, 2 , 4],  'col2': [3, 4 , 5]})

d2 = pd.DataFrame({'col1': [1, 10], 'col3': [3, 4]})

joined = d1.set_index('col1').join(d2.set_index('col1'))

Which gives exactly what you want:

>>> joined

      col2  col3
col1            
1        3   3.0
2        4   NaN
4        5   NaN
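
The 3.0 in col3 appears because NaN forces the column to float. If that matters, one possible follow-up (my suggestion, not part of the answer) is to cast to the nullable "Int64" dtype and, if wanted, restore col1 as a regular column. The name result is only illustrative.

```python
import pandas as pd

d1 = pd.DataFrame({"col1": [1, 2, 4], "col2": [3, 4, 5]})
d2 = pd.DataFrame({"col1": [1, 10], "col3": [3, 4]})

# Same join as above: keep all rows of d1, match on col1.
joined = d1.set_index("col1").join(d2.set_index("col1"))

# The nullable "Int64" dtype keeps integers and marks missing
# entries as <NA> instead of upcasting the column to float.
joined["col3"] = joined["col3"].astype("Int64")

# Optional: make col1 an ordinary column again.
result = joined.reset_index()
print(result)
```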
