Assign a column from another DataFrame of a different size, matched by key

Question

I have two dataframes:

df1 = 
key A  B  C  
r1  1  2  7  
r2  6  3  3  

df2 = 
key A  B  C  D  E
r1  1  2  3  4  7
r1  1  3  2  1  5
r2  5  7  1  2  2
r2  6  2  4  9  3
r3  1  2  7  7  1
r4  9  0  2  1  2

I want to add a column E to df1 that takes its value from the first occurrence of each key in df2.

So df1 will be:

df1 = 
key A  B  C  E
r1  1  2  7  7
r2  6  3  3  2

What is the best way to do so?

Answer 1

Use GroupBy.first with DataFrame.join:

df = df1.join(df2.groupby('key')['E'].first(), on='key')
print (df)
  key  A  B  C  E
0  r1  1  2  7  7
1  r2  6  3  3  2
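
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same join. The intermediate name first_e is only for illustration and is not part of the original answer.

```python
import pandas as pd

# Frames reconstructed from the question
df1 = pd.DataFrame({"key": ["r1", "r2"], "A": [1, 6], "B": [2, 3], "C": [7, 3]})
df2 = pd.DataFrame({
    "key": ["r1", "r1", "r2", "r2", "r3", "r4"],
    "A": [1, 1, 5, 6, 1, 9],
    "B": [2, 3, 7, 2, 2, 0],
    "C": [3, 2, 1, 4, 7, 2],
    "D": [4, 1, 2, 9, 7, 1],
    "E": [7, 5, 2, 3, 1, 2],
})

# Series indexed by key, holding the first E value of each group
first_e = df2.groupby("key")["E"].first()

# Look up each df1.key in that index and attach the matching E
result = df1.join(first_e, on="key")
print(result)
```

The join works because the grouped Series is indexed by key and keeps the name E, so on='key' matches the df1 column against that index.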

Or DataFrame.drop_duplicates with DataFrame.merge:

df = df1.merge(df2.drop_duplicates('key')[['key','E']], on='key', how='left')
print (df)
  key  A  B  C  E
0  r1  1  2  7  7
1  r2  6  3  3  2
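
The two approaches give the same result on the sample data, but they can differ when E contains missing values: GroupBy.first skips NaN by default, while drop_duplicates keeps the first row of each key as it is. A small sketch with made-up data (the NaN is an assumption and does not appear in the question):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"key": ["r1", "r2"]})
# Hypothetical data: the first r1 row has a missing E
df2 = pd.DataFrame({"key": ["r1", "r1", "r2"], "E": [np.nan, 5.0, 2.0]})

# GroupBy.first takes the first non-null E per key
via_first = df1.join(df2.groupby("key")["E"].first(), on="key")

# drop_duplicates keeps the first row per key, NaN included
via_dedup = df1.merge(df2.drop_duplicates("key")[["key", "E"]], on="key", how="left")

print(via_first)
print(via_dedup)
```

If "first occurrence" should mean the first row regardless of missing values, the drop_duplicates version is probably the closer match.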

EDIT:

If column E might not exist in df2, modify the second solution to select the columns with Index.intersection:

print (df2)
  key  A  B  C  D  E1
0  r1  1  2  3  4   7
1  r1  1  3  2  1   5
2  r2  5  7  1  2   2
3  r2  6  2  4  9   3
4  r3  1  2  7  7   1
5  r4  9  0  2  1   2

cols = ['key'] + df2.columns.intersection(['E']).tolist()
print (cols)
['key']

df = df1.merge(df2.drop_duplicates('key')[cols], on='key', how='left')
print (df)
  key  A  B  C
0  r1  1  2  7
1  r2  6  3  3
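
The same idea can be wrapped in a small helper. This is only a sketch: the function name add_first_match and its parameters are hypothetical, and it assumes the wanted columns do not already exist in the left frame.

```python
import pandas as pd

def add_first_match(left, right, key, wanted):
    """Left-merge the first row per key from right, keeping only the
    wanted columns that actually exist in right (hypothetical helper)."""
    present = right.columns.intersection(wanted).tolist()
    # Avoid selecting the key twice if it was also listed in wanted
    cols = [key] + [c for c in present if c != key]
    return left.merge(right.drop_duplicates(key)[cols], on=key, how="left")

df1 = pd.DataFrame({"key": ["r1", "r2"], "A": [1, 6], "B": [2, 3], "C": [7, 3]})
df2 = pd.DataFrame({"key": ["r1", "r1", "r2", "r2"], "E": [7, 5, 2, 3]})

with_e = add_first_match(df1, df2, "key", ["E"])

# Same call when the column is named E1 instead: df1 comes back unchanged
without_e = add_first_match(df1, df2.rename(columns={"E": "E1"}), "key", ["E"])
print(with_e)
print(without_e)
```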

solution1
1 ACCEPTED 2020-07-08 12:08:03