
How can I add together the data of two DataFrames?

I want to add together data from two dataframes in this way:

>>> df1 = pd.DataFrame({'col1': [1, 2, 3], 'col2': [2, 3, 2],
...                     'col3': ['aaa', 'bbb', 'ccc']})
>>> df1
   col1  col2 col3
0     1     2  aaa
1     2     3  bbb
2     3     2  ccc

>>> df2 = pd.DataFrame({'col1': [4, 4, 5], 'col2': [4, 4, 5],
...                     'col3': ['some', 'more', 'third']})

>>> df2
   col1  col2   col3
0     4     4   some
1     4     4   more
2     5     5  third

I would like the result to be:

>>> result
   col1  col2   col3
0     4     4   some
1     4     4   more
2     9     7  third
3     1     2    aaa
4     2     3    bbb

That is: if the same col3 value exists in both dataframes, the col1 and col2 values of those rows should be added together. If it doesn't exist, the rows should just be concatenated. The order of the rows doesn't matter, and I don't need to keep df1 and df2; I only care about the result.

What is the best way to achieve this?

I've just loaded the data from different CSV files that look exactly like that, so maybe there is an alternative way to do it as well? I just want to save the result as a CSV file that looks like the above.

Let's use pd.concat and groupby to sum the values.

pd.concat([df1, df2]).groupby('col3').sum().reset_index().reindex(columns=['col1', 'col2', 'col3'])

Output:

   col1  col2   col3
0     1     2    aaa
1     2     3    bbb
2     4     4   more
3     4     4   some
4     9     7  third
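
Since the question mentions reading from and writing back to CSV files, here is a minimal end-to-end sketch of the same concat-and-groupby idea. The CSV contents are hypothetical stand-ins held in memory (with real files you would pass their paths to pd.read_csv and to_csv), and the third row of the first file is labelled `third` purely so that one key appears in both inputs.

```python
import io
import pandas as pd

# In-memory stand-ins for the two CSV files (hypothetical contents);
# with real files, pass their paths to pd.read_csv instead.
csv_a = io.StringIO("col1,col2,col3\n1,2,aaa\n2,3,bbb\n3,2,third\n")
csv_b = io.StringIO("col1,col2,col3\n4,4,some\n4,4,more\n5,5,third\n")

df1 = pd.read_csv(csv_a)
df2 = pd.read_csv(csv_b)

# Stack the rows, then sum the numeric columns for each distinct col3 value.
# Keys that occur only once pass through unchanged.
result = (
    pd.concat([df1, df2], ignore_index=True)
      .groupby("col3", as_index=False)[["col1", "col2"]]
      .sum()
      .reindex(columns=["col1", "col2", "col3"])
)

# Write the combined table back out as CSV text;
# index=False leaves out the row labels.
out = io.StringIO()
result.to_csv(out, index=False)
print(out.getvalue())
```

With these sample values the two `third` rows collapse into a single row (col1 = 8, col2 = 7), while the other four rows are carried over as they are. Note that groupby sorts the result by col3, which should be fine here because the row order doesn't matter.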
