
How to update one pandas dataframe with another dataframe (update the old data and add new data)

df1 and df2 have the same structure. I want to update the records in df1 with those from df2 where the "key" values match, and also add the records from df2 whose "key" value does not exist in df1. What function should I use? Thanks.

df columns: assignee id issuetype key

df1:

assignee id issuetype key
Tom       1    bug    TP-1 
Jane      2    bug    TP-2 
Tim       3    bug    TP-3 

df2:

assignee id issuetype key
Tom       1    story  TP-1 
Anna      2    bug    TP-2 
Tim       3    bug    TP-3 
Jane      4    bug    TP-4

Expected df1 after updating with df2:

assignee id issuetype key
Tom       1    story  TP-1 
Anna      2    bug    TP-2 
Tim       3    bug    TP-3 
Jane      4    bug    TP-4

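Use pd.concat with DataFrame.drop_duplicates. df2 is listed first, so for a duplicated key its row is the one that is kept (drop_duplicates keeps the first occurrence by default):

# df2 comes first so that its rows win over df1's for matching keys
df = pd.concat([df2, df1]).drop_duplicates(subset=['key'])
print(df)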
  assignee  id issuetype   key
0      Tom   1     story  TP-1
1     Anna   2       bug  TP-2
2      Tim   3       bug  TP-3
3     Jane   4       bug  TP-4
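
For reference, here is a self-contained sketch of the concat approach that rebuilds the sample frames from the question, so it can be run as-is. The variable name `merged` and the explicit `ignore_index=True` / `reset_index(drop=True)` calls are additions for illustration and are not part of the original answer; they only make the resulting index a clean 0..n-1 range.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "assignee": ["Tom", "Jane", "Tim"],
    "id": [1, 2, 3],
    "issuetype": ["bug", "bug", "bug"],
    "key": ["TP-1", "TP-2", "TP-3"],
})
df2 = pd.DataFrame({
    "assignee": ["Tom", "Anna", "Tim", "Jane"],
    "id": [1, 2, 3, 4],
    "issuetype": ["story", "bug", "bug", "bug"],
    "key": ["TP-1", "TP-2", "TP-3", "TP-4"],
})

# Stack df2 on top of df1, then keep only the first row seen for each key.
# Because df2 comes first, its version of a record wins over df1's.
merged = (
    pd.concat([df2, df1], ignore_index=True)
    .drop_duplicates(subset=["key"], keep="first")
    .reset_index(drop=True)
)
print(merged)
```

Rows of df1 whose key does not appear in df2 would also survive, since they are never duplicates. The column dtypes are unchanged because no missing values are introduced along the way.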

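Or use DataFrame.update with DataFrame.reindex:

cols = df1.columns
# index both frames by the key so that rows are aligned on it
df1 = df1.set_index('key')
df2 = df2.set_index('key')
# extend df1 with any columns and keys that exist only in df2 (filled with NaN)
df1 = df1.reindex(columns=df1.columns.union(df2.columns, sort=False),
                  index=df1.index.union(df2.index, sort=False))

# overwrite df1 in place with the non-NaN values from df2
df1.update(df2)
# restore 'key' as a column and the original column order
df1 = df1.reset_index().reindex(columns=cols)
print(df1)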

  assignee   id issuetype   key
0      Tom  1.0     story  TP-1
1     Anna  2.0       bug  TP-2
2      Tim  3.0       bug  TP-3
3     Jane  4.0       bug  TP-4
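
Note that id is shown as 1.0, 2.0, ... in this output: reindex adds the new TP-4 row filled with NaN, which forces the integer column to float before update fills it in. If the integer type matters, one possible fix is to cast the column back afterwards. The sketch below assumes the same sample data as the question and that no NaN remains in id after the update. The names `left`, `right`, `result` and `id_dtype` are illustrative only, and the reindex is limited to the index because both frames already share the same columns.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "assignee": ["Tom", "Jane", "Tim"],
    "id": [1, 2, 3],
    "issuetype": ["bug", "bug", "bug"],
    "key": ["TP-1", "TP-2", "TP-3"],
})
df2 = pd.DataFrame({
    "assignee": ["Tom", "Anna", "Tim", "Jane"],
    "id": [1, 2, 3, 4],
    "issuetype": ["story", "bug", "bug", "bug"],
    "key": ["TP-1", "TP-2", "TP-3", "TP-4"],
})

cols = df1.columns            # remember the original column order
id_dtype = df1["id"].dtype    # remember the original integer dtype

left = df1.set_index("key")
right = df2.set_index("key")

# Add rows for keys that exist only in df2 (all values NaN for now)
left = left.reindex(index=left.index.union(right.index, sort=False))

# Overwrite in place with every non-NaN value from df2
left.update(right)

# Restore 'key' as a column, the original column order, and the integer dtype
result = left.reset_index().reindex(columns=cols).astype({"id": id_dtype})
print(result)
```

Keep in mind that update skips NaN values in df2, so a missing value in df2 never overwrites an existing value in df1. The concat approach behaves differently there, because it takes df2's whole row.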


 