Combining rows on a Dataframe based on a specific column value and add other values

I have a dataframe and I'm looking to combine the rows that share one particular column value, but leave all the other rows as they were.

For example:

c1            c2

shark         20
pizza         25
asteroid      10
shark         90
asteroid      30

I want to combine only the rows whose c1 value is shark and add up their c2 values, so I get this:

c1            c2

shark         110
pizza         25
asteroid      10
asteroid      30

groupby combines everything, which is not what I want.

I'm guessing df.loc will get me somewhere, but I'm not sure how to apply the result back to the dataframe so that it actually changes it.

You can do this using Groupby.sum and pd.concat (or df.append on pandas versions before 2.0, where that method still exists):

Pick the rows from df where c1 == 'shark', group by c1 and sum c2. Then append the remaining rows, where c1 != 'shark', to that result.

In [2009]: x = df[df.c1 == 'shark'].groupby('c1', as_index=False).sum()
In [2012]: res = pd.concat([x, df[df.c1 != 'shark']])

In [2013]: res
Out[2013]: 
         c1   c2
0     shark  110
1     pizza   25
2  asteroid   10
4  asteroid   30
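
The same steps can be sketched as a self-contained script. The helper name combine_rows and its parameters are made up for illustration and are not part of pandas; ignore_index=True is an extra choice here that renumbers the result 0..3 instead of keeping the original labels 0, 1, 2, 4.

```python
import pandas as pd

df = pd.DataFrame({
    'c1': ['shark', 'pizza', 'asteroid', 'shark', 'asteroid'],
    'c2': [20, 25, 10, 90, 30],
})

def combine_rows(frame, column, value):
    """Hypothetical helper: sum the rows where frame[column] == value,
    and leave every other row untouched."""
    mask = frame[column] == value
    # Collapse only the matching rows into a single summed row.
    combined = frame[mask].groupby(column, as_index=False).sum()
    # Put the untouched rows after the summed row and renumber the index.
    return pd.concat([combined, frame[~mask]], ignore_index=True)

res = combine_rows(df, 'c1', 'shark')
print(res)
```

Note that the summed row ends up first regardless of where the matching rows sat in the original frame.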

Here is a go:

import pandas as pd

df = pd.DataFrame({'c1': ['shark', 'pizza', 'asteroid', 'shark', 'asteroid'],
                   'c2': [20, 25, 10, 90, 30]})
# Sum c2 for each distinct value of c1 (this aggregates every group, not only 'shark')
pd.pivot_table(df, index='c1', aggfunc='sum')

           c2
c1           
asteroid   40
pizza      25
shark     110
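
In the output above, asteroid is also summed (40), whereas the question wants the two asteroid rows kept apart. One possible way to restrict the aggregation to shark is to group on a helper key instead of on c1 directly. This is only a sketch: the names is_target and group_id are invented here, and it assumes the index labels are unique and that -1 is not already used as a label.

```python
import pandas as pd

df = pd.DataFrame({
    'c1': ['shark', 'pizza', 'asteroid', 'shark', 'asteroid'],
    'c2': [20, 25, 10, 90, 30],
})

# All 'shark' rows share the group id -1; every other row keeps its own
# index label as its id, so it forms a group of exactly one row.
# Assumption: the index is unique and does not contain -1.
is_target = df['c1'] == 'shark'
group_id = df.index.to_series().where(~is_target, -1)

res = (
    df.groupby(group_id, sort=False)          # keep first-appearance order
      .agg({'c1': 'first', 'c2': 'sum'})      # one-row groups pass through unchanged
      .reset_index(drop=True)
)
print(res)
```

Because sort=False keeps groups in order of first appearance, the combined shark row stays where the first shark row was.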
