How to sum all values in a df column (col1) where three other columns (col2,col3,col4) match

Question

I have four columns in my df:

Col1	Col2	Col4
1	0	1
5	0	0
6	1	1
1	0	1

and I want to get to this df:

Col1	Col2	Col4
2	0	1
5	0	0
6	1	1

by summing col1 values when the other three columns are all the same.

Edit: I tried

df = df.groupby(["col2","col3","col4"]).col1.sum()

but print(df) shows that the col1 values were not summed. Should I convert the column to a numeric type first?

Answer 1

You can use groupby() to group by the three key columns (Col2, Col3, Col4), then use .agg() to sum only Col1:

df.groupby(['Col2', 'Col3', 'Col4'], as_index=False).agg({'Col1': 'sum'})

   Col2  Col3  Col4  Col1
0     0     0     1     2
1     0     0     1     5
2     1     0     0     6

(Please note the columns are now in a different order. You can re-arrange them if necessary)
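
For reference, here is a minimal self-contained sketch of the same idea that also restores the original column order. The Col3 values are an assumption (all zeros), since the question's table does not show them. The to_numeric step is only a precaution in case Col1 was read in as text, and sort=False is an optional extra that keeps groups in order of first appearance:

```python
import pandas as pd

# Sample data from the question; the Col3 values are assumed (all zeros)
# because the question's table does not list them.
df = pd.DataFrame({
    "Col1": [1, 5, 6, 1],
    "Col2": [0, 0, 1, 0],
    "Col3": [0, 0, 0, 0],
    "Col4": [1, 0, 1, 1],
})

# Precaution: make sure Col1 is numeric before summing.
df["Col1"] = pd.to_numeric(df["Col1"])

# Group by the three key columns only, then sum Col1 within each group.
# sort=False keeps the groups in order of first appearance.
keys = ["Col2", "Col3", "Col4"]
result = df.groupby(keys, as_index=False, sort=False).agg({"Col1": "sum"})

# The key columns come first after groupby, so put Col1 back in front.
result = result[["Col1", "Col2", "Col3", "Col4"]]
print(result)
```

Note that the result is assigned to a new variable rather than back to df, so the original data stays available for comparison.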

solution1
0 2022-09-15 14:22:57
