How to update one pandas dataframe with another dataframe (update the old data and add new data)

Question

df1 and df2 have the same structure. I want to update df1's records with df2's where the "key" values match, and also add the records from df2 whose "key" value does not exist in df1. What kind of function should I use? Thanks.

df columns: assignee id issuetype key

df1:

assignee id issuetype key
Tom       1    bug    TP-1 
Jane      2    bug    TP-2 
Tim       3    bug    TP-3

df2:

assignee id issuetype key
Tom       1    story  TP-1 
Anna      2    bug    TP-2 
Tim       3    bug    TP-3 
Jane      4    bug    TP-4

df1 after updating using df2:

assignee id issuetype key
Tom       1    story  TP-1 
Anna      2    bug    TP-2 
Tim       3    bug    TP-3 
Jane      4    bug    TP-4

Answer 1

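Use concat with DataFrame.drop_duplicates. Put df2 first: drop_duplicates keeps the first occurrence of each key by default, so the df2 rows win and only the df1 rows whose key is missing from df2 survive: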

import pandas as pd

# df2 comes first, so its rows are the ones kept for duplicated keys
df = pd.concat([df2, df1]).drop_duplicates(subset=['key'])
print(df)
  assignee  id issuetype   key
0      Tom   1     story  TP-1
1     Anna   2       bug  TP-2
2      Tim   3       bug  TP-3
3     Jane   4       bug  TP-4
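
For a version that can be run as-is, here is a minimal sketch that rebuilds the sample frames from the question and applies the same concat approach. The name merged, the explicit keep="first", and the reset_index(drop=True) call are additions for illustration and are not part of the original answer; keep="first" is already the default and is spelled out only to make the ordering dependency visible.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "assignee": ["Tom", "Jane", "Tim"],
    "id": [1, 2, 3],
    "issuetype": ["bug", "bug", "bug"],
    "key": ["TP-1", "TP-2", "TP-3"],
})
df2 = pd.DataFrame({
    "assignee": ["Tom", "Anna", "Tim", "Jane"],
    "id": [1, 2, 3, 4],
    "issuetype": ["story", "bug", "bug", "bug"],
    "key": ["TP-1", "TP-2", "TP-3", "TP-4"],
})

# df2 is listed first, so for every duplicated key the df2 row is kept
# and the older df1 row is dropped.
merged = (
    pd.concat([df2, df1])
    .drop_duplicates(subset=["key"], keep="first")
    .reset_index(drop=True)  # tidy 0..n-1 index regardless of input order
)
print(merged)
```

Because no missing values are introduced along the way, the id column keeps its integer dtype with this approach.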

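Or use DataFrame.update with DataFrame.reindex. Reindexing df1 to the union of both indexes adds the rows that exist only in df2 as NaN, which is why id is converted to float in the output: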

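cols = df1.columns  # remember the original column order
df1 = df1.set_index('key')
df2 = df2.set_index('key')
# extend df1 with the rows and columns that exist only in df2 (filled with NaN)
df1 = df1.reindex(columns=df1.columns.union(df2.columns, sort=False),
                  index=df1.index.union(df2.index, sort=False))

# overwrite df1 in place with the non-NA values from df2, aligned on key
df1.update(df2)
# turn key back into a column and restore the original column order
df1 = df1.reset_index().reindex(columns=cols)
print(df1)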

  assignee   id issuetype   key
0      Tom  1.0     story  TP-1
1     Anna  2.0       bug  TP-2
2      Tim  3.0       bug  TP-3
3     Jane  4.0       bug  TP-4
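
If the float id column is unwanted, it can be cast back once the update has filled every gap. The following is a minimal self-contained sketch of the same update approach with that extra step, assuming the sample frames from the question and assuming that no id is still missing after the update (astype would raise otherwise). The names left, right, and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "assignee": ["Tom", "Jane", "Tim"],
    "id": [1, 2, 3],
    "issuetype": ["bug", "bug", "bug"],
    "key": ["TP-1", "TP-2", "TP-3"],
})
df2 = pd.DataFrame({
    "assignee": ["Tom", "Anna", "Tim", "Jane"],
    "id": [1, 2, 3, 4],
    "issuetype": ["story", "bug", "bug", "bug"],
    "key": ["TP-1", "TP-2", "TP-3", "TP-4"],
})

cols = df1.columns  # original column order
left = df1.set_index("key")
right = df2.set_index("key")

# Add the keys (and columns) that exist only in df2; new cells are NaN,
# which turns the integer id column into float.
left = left.reindex(
    columns=left.columns.union(right.columns, sort=False),
    index=left.index.union(right.index, sort=False),
)

# In-place overwrite with the non-NA values from df2, aligned on key
left.update(right)

result = left.reset_index().reindex(columns=cols)
# Assumption: every id is filled at this point, so the cast is safe
result["id"] = result["id"].astype("int64")
print(result)
```

Note that update skips NaN values in df2, so a missing value in df2 never erases an existing value in df1; the concat approach, by contrast, replaces whole rows.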

solution1
1 ACCEPTED 2020-09-11 07:06:04