I have a pandas DataFrame like this:
>>> df = pd.DataFrame({'MONTREGL':[10,10,2222,35,200,56,5555],'SINID':['aaa','aaa','aaa','bbb','bbb','ccc','ccc'],'EXTRA':[400,400,400,500,500,333,333]})
>>> df
MONTREGL SINID EXTRA
0 10 aaa 400
1 10 aaa 400
2 2222 aaa 400
3 35 bbb 500
4 200 bbb 500
5 56 ccc 333
6 5555 ccc 333
I want to sum the column MONTREGL
for each groupby SINID
...
So I get 2242 for aaa and so on... ALSO I want to keep the value of column EXTRA
.
This is the expected result:
MONTREGL SINID EXTRA
0 2242 aaa 400
1 235 bbb 500
2 5611 ccc 333
Thanks for your help in advance!
I ended up using this script:
dff = df.groupby(["SINID","EXTRA"]).MONTREGL.sum().reset_index()
And it works in this test and production.
The code below works for your example:
df1 = df.groupby(["SINID"]).sum()
df1['EXTRA'] = df.groupby(["SINID"]).mean()['EXTRA']
Result :
MONTREGL EXTRA
SINID
aaa 2242 400.0
bbb 235 500.0
ccc 5611 333.0
my suggestion would be to filter you dataframe with conditions related to other columns then apply sum function,
it goes something like this.
import pandas as pd
df=pd.Dataframe({a:[1,2,3],b:[2001,2015,2019],c:[1,0,1]})
aux=df[df.c>0]
sa=aux.a.sum()
sb=aux.b.sum()
My syntax may not be correct ( i didnt run the code ) but it will probably work and lead you to your answer
Good luck.
I know this post is old, but this might be helpful for others:
Using loc: df.loc[df['SINID'] == aaa].MONTREGL.sum()
Using groupby: df.groupby('SINID')['MONTREGL'].sum()
A similar question is answered in the following link (check Alex Riley's response):
How do I sum values in a column that match a given condition using pandas?
Good luck,
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.