简体   繁体   中英

How do I merge two Pandas DataFrames and add the overlapping columns

I am trying to merge multiple DataFrames on same DocID then sum up the weights but when I do merge it creates Weight_x,Weight_y. This would be fine for only two DataFrames but the amount of Dataframes to merge changes based on user input so merging creates Weight_x, Weight_y multiple times. So how can I merge more than 2 DataFrames such that they are merging on DocID and Weight is Summed?

Example:

df1= DocID Weight
     1     4
     2     7
     3     8
df2= DocID Weight
     1     5
     2     9
     8     1
finalDf=
     DocID Weight
     1     9
     2     16

You can merge, set the 'DocID' column as the index, then sum the remaining columns together. Then you can reindex and rename the columns in the resulting final_df as needed:

df_final = pd.merge(df1, df2, on=['DocID']).set_index(['DocID']).sum(axis=1)
df_final = pd.DataFrame({"DocID": df_final.index, "Weight":df_final}).reset_index(drop=True)

Output:

>>> df_final
   DocID  Weight
0      1       9
1      2      16
df1.set_index('DocID').add(df2.set_index('DocID')).dropna()

        Weight
DocID   
1       9.0
2       16.0

Can you try this pd.merge(df1, df2, on=['DocID']).set_index(['DocID']).sum(axis=1) You can now give any name to the sum column.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM