Pandas add column to df based on values in a second df

Question

I have two separate dataframes, df1 and df2. Both contain an id column that links rows between them. df2 has a group column that df1 does not. I would like to go through each id in df1, check whether it is in df2, and if so copy the group value into df1 under a new column of the same name. Would it be easiest to write a function that loops through the rows, or is there a pandas trick I could use here?

Answer 1

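import pandas as pd

# Sample frames that share an 'id' column
df1 = pd.DataFrame([[1, 'a'],
                    [2, 'b'],
                    [3, 'c']], columns=['id', 'attr'])
df2 = pd.DataFrame([[2, 'd'],
                    [3, 'e'],
                    [4, 'f']], columns=['id', 'group'])

# A left merge keeps every row of df1; ids missing from df2 get NaN in 'group'.
# merge returns a new DataFrame, so assign the result back.
df1 = df1.merge(df2, how='left', on='id')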
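
Since the question also asks whether each id in df1 appears in df2, the indicator parameter of merge may help. The following is a minimal sketch that reuses the sample frames from this answer; the name checked is introduced here only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 'a'], [2, 'b'], [3, 'c']], columns=['id', 'attr'])
df2 = pd.DataFrame([[2, 'd'], [3, 'e'], [4, 'f']], columns=['id', 'group'])

# indicator=True adds a '_merge' column: 'both' when the id was found in df2,
# 'left_only' when it exists only in df1 (its 'group' is then NaN).
checked = df1.merge(df2, how='left', on='id', indicator=True)
print(checked)
```

Rows marked left_only are the ids that had no match in df2, so their group value is missing rather than copied.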

Answer 2

You can merge the two dataframes into one by joining them on the id column, then keep only the columns you need:

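import pandas as pd
df1 = pd.merge(df1, df2, how='left', on='id')
# 'unwanted_column' is a placeholder; drop returns a new frame, so reassign
df1 = df1.drop(columns='unwanted_column')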
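
If df2 carries more columns than you need, selecting id and group before the merge avoids the separate drop step. The sketch below assumes each id appears at most once in df2; the extra column and the names merged, lookup, and mapped are hypothetical, added only for illustration. It also shows Series.map as a possible alternative when a single column is being copied.

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3], 'attr': ['a', 'b', 'c']})
# 'extra' stands in for a column of df2 that should not end up in df1
df2 = pd.DataFrame({'id': [2, 3, 4], 'group': ['d', 'e', 'f'], 'extra': [10, 20, 30]})

# Option 1: merge only the columns that are needed
merged = df1.merge(df2[['id', 'group']], how='left', on='id')

# Option 2: build an id -> group lookup and map it onto df1's ids
# (assumes ids in df2 are unique; unmatched ids become NaN)
lookup = df2.set_index('id')['group']
mapped = df1.assign(group=df1['id'].map(lookup))
```

Both options leave df1 itself unchanged and return a new frame with the group column added.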

solution1
3 ACCEPTED 2016-08-11 16:29:09

solution2
2 2016-08-11 16:33:18