Sum columns of two pandas dataframes of different sizes only for certain rows

Question

I have two pandas dataframes looking like:

df1:
      n  column1
0   5.0      0.0
1   6.0      0.0
2   7.0      0.0
3   8.0      0.0
4   9.0      0.0
5  10.0      0.0

df2:
     n  column2
0  6.0      1.0
1  7.0      1.0
2  8.0      1.0

I want to sum column1 and column2 only for rows where n is the same. Desired output looks like:

df3:
      n  column1
0   5.0      0.0
1   6.0      1.0
2   7.0      1.0
3   8.0      1.0
4   9.0      0.0
5  10.0      0.0

Please note that:

Values of n vary from one case to another, so I can't simply fill the columns of df2 with zeroes and perform a classical sum.
Values of n must not be modified in the end, so I'd like to avoid workarounds such as shifting n values to make them match the row indexes.

What I've tried so far produces something like:
```
      n  column1
0   5.0      1.0
1   6.0      1.0
2   7.0      1.0
3   8.0      NaN
4   9.0      NaN
5  10.0      NaN
```
This happens because the sum is by default aligned on the row index labels the two frames share, rather than on n as I want.
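
The question does not show the attempted code, so the following is only a guess at it: a minimal sketch, assuming the attempt was a plain `+` between the two columns. It rebuilds the frames from the tables above and shows that the addition matches rows by index label (0 to 5 against 0 to 2) and ignores n.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({"n": [5.0, 6.0, 7.0, 8.0, 9.0, 10.0], "column1": [0.0] * 6})
df2 = pd.DataFrame({"n": [6.0, 7.0, 8.0], "column2": [1.0] * 3})

# Assumed attempt: plain addition aligns on the row index, not on n.
naive = df1["column1"] + df2["column2"]

# Index labels 0-2 exist in both frames and are summed;
# labels 3-5 exist only in df1, so the result there is NaN.
print(naive)
```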

How can I do this with pandas built-in functions?

Answer 1

Use Series.add, but first turn the n columns into indexes with set_index so that the addition aligns on n:

df = (df2.set_index('n')['column2']
         .add(df1.set_index('n')['column1'], fill_value=0)
         .reset_index(name='column1'))
print (df)
      n  column1
0   5.0      0.0
1   6.0      1.0
2   7.0      1.0
3   8.0      1.0
4   9.0      0.0
5  10.0      0.0
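
For reuse, the same idea can be wrapped in a small function. This is only a sketch: `add_on_key` and its parameter names are hypothetical, and it assumes the key values are unique within each frame, since index alignment with duplicate labels may not behave as intended. Note that the result comes back ordered by the key.

```python
import pandas as pd

def add_on_key(left, right, key, left_col, right_col):
    """Hypothetical helper: sum left_col and right_col where key values match."""
    summed = left.set_index(key)[left_col].add(
        right.set_index(key)[right_col],
        fill_value=0,  # a key present in only one frame contributes 0 from the other
    )
    # Turn the key index back into a regular column and name the values column.
    return summed.reset_index(name=left_col)

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({"n": [5.0, 6.0, 7.0, 8.0, 9.0, 10.0], "column1": [0.0] * 6})
df2 = pd.DataFrame({"n": [6.0, 7.0, 8.0], "column2": [1.0] * 3})

df3 = add_on_key(df1, df2, "n", "column1", "column2")
print(df3)
```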

Another solution uses merge with a left join, which keeps every row of df1:

df = (df1.merge(df2, on='n', how='left'))
df['column1'] = df['column2'].add(df['column1'], fill_value=0)
df = df.drop('column2', axis=1)
print (df)
      n  column1
0   5.0      0.0
1   6.0      1.0
2   7.0      1.0
3   8.0      1.0
4   9.0      0.0
5  10.0      0.0

Answer 2

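I solved it by merging the dataframes and summing the columns in pandas:

import pandas as pd

df = pd.merge(df1, df2, how='outer', on='n')

# fill_value=0 treats a missing side as 0 before adding, so no value is lost
df['sum'] = df['column1'].add(df['column2'], fill_value=0)

df[['n', 'sum']]

The result looks like this: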

      n  sum
0   5.0  0.0
1   6.0  1.0
2   7.0  1.0
3   8.0  1.0
4   9.0  0.0
5  10.0  0.0
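
A note on where the NaN fill happens in a merge-based sum: with the question's data column1 is all zeros, so filling after the addition and filling before it give the same numbers. The sketch below uses hypothetical nonzero values (not from the question) to show that the two differ in general.

```python
import pandas as pd

# Hypothetical data: column1 is nonzero so the difference is visible.
df1 = pd.DataFrame({"n": [5.0, 6.0, 7.0], "column1": [10.0, 20.0, 30.0]})
df2 = pd.DataFrame({"n": [6.0], "column2": [1.0]})

merged = pd.merge(df1, df2, how="outer", on="n")

# NaN + number is NaN, so filling after the sum discards column1's value
# on every row that has no match in df2.
fill_after = (merged["column1"] + merged["column2"]).fillna(0)

# fill_value=0 substitutes 0 for the missing side before adding.
fill_before = merged["column1"].add(merged["column2"], fill_value=0)

print(pd.DataFrame({"n": merged["n"],
                    "fill_after": fill_after,
                    "fill_before": fill_before}))
```

Filling before the addition (or using `fill_value`) keeps the unmatched rows of column1 intact.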

2 answers

solution1
3 ACCEPTED 2018-10-18 06:35:45

solution2
0 2018-10-18 06:44:25