I have four columns in my df:
Col1 | Col2 | Col3 | Col4 |
---|---|---|---|
1 | 0 | 0 | 1 |
5 | 0 | 0 | 0 |
6 | 1 | 0 | 1 |
1 | 0 | 0 | 1 |
and I want to get to this df:
Col1 | Col2 | Col3 | Col4 |
---|---|---|---|
2 | 0 | 0 | 1 |
5 | 0 | 0 | 0 |
6 | 1 | 0 | 1 |
by summing col1 values when the other three columns are all the same.
Edit: I tried
df = df.groupby(["col2","col3","col4"]).col1.sum()
but print(df) clearly shows that the values are not summed. Do I need to convert the column to a numeric type first?
You can use groupby() to group by the three key columns (Col2, Col3, Col4), then use .agg() to sum only Col1:
df.groupby(['Col2', 'Col3', 'Col4'], as_index=False).agg({'Col1': 'sum'})
   Col2  Col3  Col4  Col1
0     0     0     0     5
1     0     0     1     2
2     1     0     1     6
(Please note the columns are now in a different order. You can re-arrange them if necessary)
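For completeness, here is a minimal, self-contained sketch of the whole round trip using the sample data from the question. The `pd.to_numeric` line is an assumption on my part: it only matters if Col1 was loaded as strings, which the question does not confirm. `sort=False` is also my addition, used to keep the groups in order of first appearance rather than sorted by key.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1": [1, 5, 6, 1],
    "Col2": [0, 0, 1, 0],
    "Col3": [0, 0, 0, 0],
    "Col4": [1, 0, 1, 1],
})

# Assumption: only needed if Col1 was read in as strings.
# On an integer column this is a no-op.
df["Col1"] = pd.to_numeric(df["Col1"])

# Group on the three key columns and sum Col1 within each group.
# sort=False keeps groups in order of first appearance.
result = df.groupby(["Col2", "Col3", "Col4"], as_index=False, sort=False)["Col1"].sum()

# Put the columns back in their original order.
result = result[["Col1", "Col2", "Col3", "Col4"]]
print(result)
```

With this data, `result` should contain the three rows from the desired table in the question (Col1 values 2, 5, 6). Note that the column names are case-sensitive, so "col1" and "Col1" are treated as different columns.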