Combining rows of a DataFrame based on a specific column value and adding the other values
I have a dataframe and I want to combine the rows that share one particular value in a column, while leaving all the other rows as they were.
For example:
c1 c2
shark 20
pizza 25
asteroid 10
shark 90
asteroid 30
I want to combine only the rows whose c1 value is shark, adding up their c2 values, so that I get this:
c1 c2
shark 110
pizza 25
asteroid 10
asteroid 30
groupby combines everything, which is not what I want.
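To illustrate the problem, here is a minimal sketch of a plain groupby on the example data. The frame is rebuilt from the table above; the variable name grouped is only for illustration. The two asteroid rows get merged as well, which is the part that is not wanted.

```python
import pandas as pd

# Example data copied from the table in the question.
df = pd.DataFrame({
    "c1": ["shark", "pizza", "asteroid", "shark", "asteroid"],
    "c2": [20, 25, 10, 90, 30],
})

# Grouping on c1 sums every repeated key, not only shark,
# so the two asteroid rows collapse into a single row too.
grouped = df.groupby("c1", as_index=False)["c2"].sum()
print(grouped)
```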
I'm guessing df.loc will get me somewhere, but I'm not sure how to apply the result back to the dataframe so that it actually changes it.
You can do this using GroupBy.sum and df.append (note that df.append was removed in pandas 2.0; on newer versions pd.concat does the same job):
Pick the rows from df where c1 == 'shark', then group by c1 and sum c2. Then append the remaining rows, where c1 != 'shark', to that result.
In [2009]: x = df[df.c1 == 'shark'].groupby('c1', as_index=False).sum()
In [2012]: res = x.append(df[df.c1 != 'shark'])
In [2013]: res
Out[2013]:
c1 c2
0 shark 110
1 pizza 25
2 asteroid 10
4 asteroid 30
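For readers on a recent pandas, the same two steps can be sketched with pd.concat in place of append. This is a hedged rewrite of the session above rather than the original answer's code; the names is_shark and sharks are introduced here for illustration, and the frame is rebuilt from the question's table.

```python
import pandas as pd

# Example data copied from the table in the question.
df = pd.DataFrame({
    "c1": ["shark", "pizza", "asteroid", "shark", "asteroid"],
    "c2": [20, 25, 10, 90, 30],
})

# Boolean mask marking the rows to be merged.
is_shark = df["c1"] == "shark"

# Collapse the shark rows into one row with c2 summed.
sharks = df[is_shark].groupby("c1", as_index=False)["c2"].sum()

# Stack the untouched rows underneath the merged row.
res = pd.concat([sharks, df[~is_shark]])
print(res)
```

The untouched rows keep their original index labels (1, 2, 4); passing ignore_index=True to pd.concat renumbers the result from 0 instead. The merged shark row always comes first with this approach, whatever position the shark rows had in df.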
Here is a go:
import pandas as pd
df = pd.DataFrame({'c1': ['shark', 'pizza', 'asteroid', 'shark', 'asteroid'], 'c2': [20, 25, 10, 90, 30]})
# Sum c2 for every distinct value of c1; the string 'sum' avoids the FutureWarning newer pandas raises for np.sum
pd.pivot_table(df, index='c1', aggfunc='sum')
c2
c1
asteroid 40
pizza 25
shark 110
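Note that the pivot table sums the asteroid rows as well (40), so it does not reproduce the output asked for in the question. A possible variation that merges only one value is sketched below. It is an assumption-laden alternative, not part of the answer above: the names target and key are hypothetical, and the trick is to give every non-matching row its own group key so that only the shark rows share one.

```python
import numpy as np
import pandas as pd

# Example data copied from the table in the question.
df = pd.DataFrame({
    "c1": ["shark", "pizza", "asteroid", "shark", "asteroid"],
    "c2": [20, 25, 10, 90, 30],
})

target = "shark"  # hypothetical name for the value whose rows should be merged

# Rows equal to target share the key -1; every other row gets its own
# position as key, so it forms a group of one and stays unchanged.
key = np.where(df["c1"] == target, -1, np.arange(len(df)))

# sort=False keeps groups in order of first appearance.
res = (
    df.groupby(key, sort=False)
      .agg({"c1": "first", "c2": "sum"})
      .reset_index(drop=True)
)
print(res)
```

Because groups are kept in order of first appearance, the merged row sits where the first shark row was, and the remaining rows keep their relative order.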