
Sum values in one column based on specific values in other column

I have a DataFrame in pandas, for example:

df = pd.DataFrame({"a": [0, 0, 1, 1, 0], "penalty": ["12", "15", "13", "100", "22"]})

How can I sum the values in column "penalty", but only for the rows that have 0 in column "a"?

You can filter your dataframe with this:

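import pandas as pd

# penalty holds integers here, so .sum() adds the values numerically
data = {'a': [0, 0, 1, 1, 0], 'penalty': [12, 15, 13, 100, 22]}
df = pd.DataFrame(data)
# select penalty for the rows where a equals 0, then sum
print(df.loc[df['a'].eq(0), 'penalty'].sum())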

This selects the penalty column for the rows where column a equals 0, then calls .sum() on the result, which returns the expected output (49). The only change I made was to remove the quotation marks so that the values in the penalty column are interpreted as integers rather than strings. If the input values have to be strings, you can convert the column first with df['penalty'] = df['penalty'].astype(int)
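As a minimal sketch of the string case mentioned above, the following starts from the question's original data (penalty stored as strings), converts the column with astype(int), and then applies the same filter. The variable name total is only illustrative.

```python
import pandas as pd

# Data as given in the question: penalty values are strings
df = pd.DataFrame({"a": [0, 0, 1, 1, 0],
                   "penalty": ["12", "15", "13", "100", "22"]})

# Convert the strings to integers before summing
df["penalty"] = df["penalty"].astype(int)

# Sum penalty only for the rows where a equals 0 (12 + 15 + 22)
total = df.loc[df["a"].eq(0), "penalty"].sum()
print(total)  # 49
```

Note that astype(int) raises an error if any value cannot be parsed as an integer, so this assumes every entry in the column is a clean numeric string.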

Filter the rows that have 0 in column a, then calculate the sum of the penalty column.

import pandas as pd

data = {'a': [0, 0, 1, 1, 0], 'penalty': [12, 15, 13, 100, 22]}
df = pd.DataFrame(data)
# keep only the rows where a is 0, then sum their penalty values
print(df[df.a == 0].penalty.sum())
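Both answers rely on the same idea: a boolean mask that is True for the rows to keep. The sketch below builds that mask explicitly and shows that the two selection styles give the same total. The names mask, total_loc and total_rows are illustrative and not part of either answer.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 0, 1, 1, 0],
                   "penalty": [12, 15, 13, 100, 22]})

# True where column a equals 0
mask = df["a"] == 0
print(mask.tolist())  # [True, True, False, False, True]

# Style 1: select rows and column in one step with .loc
total_loc = df.loc[mask, "penalty"].sum()

# Style 2: filter the rows first, then pick the column
total_rows = df[mask]["penalty"].sum()

print(total_loc, total_rows)  # 49 49
```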
