
Pandas dataframe - Sum a column with respect to values in another column

I have data that looks like this:

data = {"doc1" : {'a': 2 , 'b': 1,'c':3}, "doc2" :  {'a': 1 , 'b': 1,'c':3}, "doc3" : {'a': 1 , 'b': 1,'c':3}}

I convert it into a dataframe:

import pandas as pd
df = pd.DataFrame.from_dict(data, orient='index')

The dataframe looks like this:

      a  c  b
doc1  2  3  1
doc2  1  3  1
doc3  1  3  1

Now I want to sum all the values in column b for the rows where column a equals 1.

So the value I want is 2.

Is there an easy way to do this without iterating through both columns? I checked other posts and found this, which uses .loc:

df.loc[df['a'] == 1, 'b'].sum()

But for some reason, I can't seem to make it work with my dataframe.

Please let me know.

Thanks.

You are very close. See below.

>>> df[df['a'] == 1]['b'].sum()
2

Instead of using .loc , try just filtering the dataframe first ( df[df['a'] == 1] ), then selecting the column 'b' , and then summing.

Edit: I'll leave this here for future reference, although depending on the version of pandas you're using, your solution should work (thanks, @maxymoo). I'm running 0.18.1 and both approaches worked.
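For anyone who wants to check this end to end, here is a minimal sketch that rebuilds the dataframe from the question and runs both approaches side by side. The variable names (mask, total_filter, total_loc) are only illustrative, and this assumes a reasonably recent pandas; I have not tried to reproduce whatever stopped .loc from working in the question.

```python
import pandas as pd

# Data from the question: one inner dict per document.
data = {
    "doc1": {"a": 2, "b": 1, "c": 3},
    "doc2": {"a": 1, "b": 1, "c": 3},
    "doc3": {"a": 1, "b": 1, "c": 3},
}
df = pd.DataFrame.from_dict(data, orient="index")

# Boolean mask: True for the rows where column 'a' equals 1 (doc2 and doc3).
mask = df["a"] == 1

# Approach from this answer: filter the rows, then select 'b', then sum.
total_filter = df[mask]["b"].sum()

# Approach from the question: select rows and column in one step with .loc.
total_loc = df.loc[mask, "b"].sum()

print(total_filter, total_loc)
```

Both totals should come out as 2 here, since only doc2 and doc3 have a equal to 1 and each has b equal to 1.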
