How to sum all values in a df column (Col1) where three other columns (Col2, Col3, Col4) match

I have four columns in my df:

Col1 Col2 Col3 Col4
1 0 0 1
5 0 0 0
6 1 0 1
1 0 0 1

and I want to get to this df:

Col1 Col2 Col3 Col4
2 0 0 1
5 0 0 0
6 1 0 1

by summing the Col1 values of rows whose other three columns (Col2, Col3, Col4) match.

Edit: I tried

df = df.groupby(["col2","col3","col4"]).col1.sum()

but print(df) clearly shows those columns not summed. Is it possible I should try to force the column to a type first?

You can use groupby() to group by the three other columns (Col2, Col3, Col4), then use .agg() to sum only Col1:

df.groupby(['Col2', 'Col3', 'Col4'], as_index=False).agg({'Col1':'sum'})

   Col2  Col3  Col4  Col1
0     0     0     0     5
1     0     0     1     2
2     1     0     1     6

(Please note the columns are now in a different order. You can re-arrange them if necessary)
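
For reference, here is a minimal self-contained sketch of the whole approach, using the sample data from the question. Two things in it are assumptions rather than part of the original answer: the pd.to_numeric() call, which only matters if Col1 was loaded as strings, and sort=False, which keeps the groups in order of first appearance instead of sorting them by key.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": [1, 5, 6, 1],
    "Col2": [0, 0, 1, 0],
    "Col3": [0, 0, 0, 0],
    "Col4": [1, 0, 1, 1],
})

# Assumption: coerce Col1 to a numeric dtype in case it was read as text.
# With the integer data above this is a no-op.
df["Col1"] = pd.to_numeric(df["Col1"])

# Group by the three key columns only; Col1 is the column being summed.
keys = ["Col2", "Col3", "Col4"]
result = df.groupby(keys, as_index=False, sort=False).agg({"Col1": "sum"})

# Restore the original column order
result = result[["Col1", "Col2", "Col3", "Col4"]]
print(result)
```

With sort=False the rows should come out in the same order as the desired frame in the question; dropping it sorts the rows by the key columns instead.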
