Pandas dataframe - Sum a column with respect to values in another column
I have data that looks like this:
data = {"doc1" : {'a': 2 , 'b': 1,'c':3}, "doc2" : {'a': 1 , 'b': 1,'c':3}, "doc3" : {'a': 1 , 'b': 1,'c':3}}
I convert it into a dataframe:
df = pd.DataFrame.from_dict(data,orient='index')
The dataframe looks like this:
      a  c  b
doc1  2  3  1
doc2  1  3  1
doc3  1  3  1
Now I want to sum all the values in column b where the value in column a is 1.
So the value I want is 2.
Is there an easy way to do this rather than iterating through both columns? I checked other posts and found the following, which uses .loc:
df.loc[df['a'] == 1, 'b'].sum()
But for some reason, I can't seem to get it to work with my dataframe.
Please let me know.
Thanks.
You are very close. See below.
>>> df[df['a'] == 1]['b'].sum()
2
Instead of using .loc, try filtering the dataframe first ( df[df['a'] == 1] ), then selecting the column 'b', and then summing.
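To make the three steps (filter, select, sum) easier to follow, here is a minimal sketch that splits them apart using the data from the question. The intermediate names mask, filtered, b_values and total are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Data and dataframe exactly as given in the question
data = {
    "doc1": {"a": 2, "b": 1, "c": 3},
    "doc2": {"a": 1, "b": 1, "c": 3},
    "doc3": {"a": 1, "b": 1, "c": 3},
}
df = pd.DataFrame.from_dict(data, orient="index")

# Illustrative intermediate names (not from the original post)
mask = df["a"] == 1        # boolean Series: True where column a equals 1
filtered = df[mask]        # keep only the rows where the mask is True
b_values = filtered["b"]   # select column b from the remaining rows
total = b_values.sum()     # add those values up

print(total)  # 2
```

Only doc2 and doc3 pass the filter, and each has b equal to 1, which is where the 2 comes from.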
Edit: I'll leave this here for future reference, although depending on the version of pandas you're using, your solution should work (thanks, @maxymoo). I'm running 0.18.1 and both approaches worked.
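If you want to check this on your own installation, a small sketch like the following prints the pandas version and runs both forms side by side. The names loc_total and chained_total are just illustrative.

```python
import pandas as pd

# Data and dataframe exactly as given in the question
data = {
    "doc1": {"a": 2, "b": 1, "c": 3},
    "doc2": {"a": 1, "b": 1, "c": 3},
    "doc3": {"a": 1, "b": 1, "c": 3},
}
df = pd.DataFrame.from_dict(data, orient="index")

# Form from the question: row mask and column label in a single .loc call
loc_total = df.loc[df["a"] == 1, "b"].sum()

# Form from this answer: filter the rows first, then pick the column
chained_total = df[df["a"] == 1]["b"].sum()

print(pd.__version__)
print(loc_total, chained_total)
```

If the two numbers differ from what you expect on your real data, one thing that may be worth checking (this is a guess, since the question does not say what went wrong) is df.dtypes: if column a holds strings rather than integers, the comparison with 1 would match no rows.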