简体   繁体   中英

merge and sum two dataframes where columns match python pandas

I might be approaching this is the wrong way. If I have two dataframes a and b:

Dataframe A:

a b c
1 2 4
1 6 5
1 8 7

and Dataframe B:

a b c d
1 2 4 9
1 6 5 7
1 8 7 10

And I want to join them, but cruically sum the columns where the column name matchs, and keep the column names that don't match. So the final joined table would look like:

a   b   c   d
2   4   8   9
2   12  10  7
2   16  14  10

Note: The indexs would always match (ie the same number of records in the same order)

You could call add on the larger df passing in the other df and call fillna :

In [18]:
df1.add(df).fillna(df1)

Out[18]:
   a   b   c   d
0  2   4   8   9
1  2  12  10   7
2  2  16  14  10

Another way is to use combine_first :

In [20]:
df1.add(df).combine_first(df1)

Out[20]:
   a   b   c   d
0  2   4   8   9
1  2  12  10   7
2  2  16  14  10

You can align the dataframe, fill NAN with zero and perform simple addition.

d = pd.DataFrame(data, columns=['a','b','c'])
d2 = pd.DataFrame(data2, columns=['a','b','c','d'])
d, d2 = d.align(d2, fill_value=0)
sum_df = d + d2

In [23]: d
Out[23]: 
   a  b  c  d
0  1  2  4  0
1  1  6  5  0
2  1  8  7  0

In [24]: d2
Out[24]: 
   a  b  c   d
0  1  2  4   9
1  1  6  5   7
2  1  8  7  10

In [25]:sum_df
Out[25]: 
   a   b   c   d
0  2   4   8   9
1  2  12  10   7
2  2  16  14  10

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM