
How can I add together the data of two DataFrames?

I want to add together data from two dataframes in this way:

>>> df1 = pd.DataFrame({'col1': [1, 2, 3], 'col2': [2, 3, 2],
...                     'col3': ['aaa', 'bbb', 'ccc']})
>>> df1
   col1  col2 col3
0     1     2  aaa
1     2     3  bbb
2     3     2  ccc

>>> df2 = pd.DataFrame({'col1': [4, 4, 5], 'col2': [4, 4, 5],
...                     'col3': ['some', 'more', 'third']})
>>> df2
   col1  col2   col3
0     4     4   some
1     4     4   more
2     5     5  third

I would like the result to be:

>>> result
   col1  col2   col3
0     4     4   some
1     4     4   more
2     9     7  third
3     1     2    aaa
4     2     3    bbb

That is: if rows in the two DataFrames share the same col3 value, their col1 values should be added together, and likewise their col2 values. If a col3 value exists in only one DataFrame, its row should simply be appended. The order of the rows doesn't matter, and I don't need to keep df1 and df2; I only care about the result.

What is the best way to achieve this?

The data is loaded from separate CSV files that look exactly like the frames above, so maybe there is an alternative way to do it as well? I just want to save the result as a CSV file in the same layout.

Use pd.concat to stack the two DataFrames, then groupby on col3 and sum the numeric columns:

pd.concat([df1, df2]).groupby('col3').sum().reset_index().reindex(columns=['col1', 'col2', 'col3'])

Output:

   col1  col2   col3
0     1     2    aaa
1     2     3    bbb
2     4     4   more
3     4     4   some
4     9     7  third
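
Since the question mentions that the data comes from CSV files, here is a minimal sketch of the same concat/groupby approach applied directly to files. The helper name combine_csv_files and the file names in the usage note are made up for illustration, and the sketch assumes every file has the same columns, with col3 as the key and the other columns numeric.

```python
import pandas as pd


def combine_csv_files(paths, key="col3"):
    """Read CSV files with identical columns and sum the numeric columns per key.

    Hypothetical helper: assumes `key` is present in every file and all
    other columns are numeric.
    """
    frames = [pd.read_csv(path) for path in paths]
    # Remember the column order of the first file so the result matches it.
    columns = list(frames[0].columns)
    combined = pd.concat(frames, ignore_index=True)
    # as_index=False keeps the key as a regular column instead of the index.
    summed = combined.groupby(key, as_index=False).sum()
    return summed[columns]
```

It could then be used along these lines (file names are placeholders): combine_csv_files(['first.csv', 'second.csv']).to_csv('result.csv', index=False). Passing index=False keeps the row index out of the saved file, so the output has the same layout as the inputs.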
