Sum a column in a pandas dataframe where a condition is met in one column, but grouped by another
I have a dataframe like this:
Ref_No Definition Total_to_Add
0 ref1 B 20
1 ref2 A 30
2 ref1 B 40
3 ref2 A 50
4 ref1 B 60
5 ref2 B 50
6 ref1 B 60
7 ref2 B 50
8 ref1 B 60
For each reference, I want to sum Total_to_Add over the rows that are 'B' and share the same reference number (I'll have another column for A). There are hundreds of reference numbers.
I can sum those that meet a single condition like this:
df['ANSWER'] = df[df['Definition']=='A']['Total_to_Add'].sum()
Or I can group by a reference like this:
df['ANSWER']=(df.groupby('Ref_No')['Total_to_Add'].transform('sum'))
But I can't seem to combine these, i.e. create a new column that sums Total_to_Add only where Definition is 'B', totalled per Ref_No.
I'm aiming for output like the below:
Ref_No Definition Total_to_Add Total_'B'
0 ref1 B 20 240
1 ref2 A 30 100
2 ref1 B 40 240
3 ref2 A 50 100
4 ref1 B 60 240
5 ref2 B 50 100
6 ref1 B 60 240
7 ref2 B 50 100
8 ref1 B 60 240
Any wisdom appreciated! Thanks
Replace non-B values with 0 using Series.where and then use GroupBy.transform:
df['ANSWER']= (df['Total_to_Add'].where(df.Definition=='B', 0)
.groupby(df['Ref_No']).transform('sum'))
print (df)
Ref_No Definition Total_to_Add Total_'B' ANSWER
0 ref1 B 20 240 240
1 ref2 A 30 100 100
2 ref1 B 40 240 240
3 ref2 A 50 100 100
4 ref1 B 60 240 240
5 ref2 B 50 100 100
6 ref1 B 60 240 240
7 ref2 B 50 100 100
8 ref1 B 60 240 240
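For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same two steps. The dataframe is rebuilt from the table in the question; the intermediate name b_only is only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Ref_No": ["ref1", "ref2", "ref1", "ref2", "ref1", "ref2", "ref1", "ref2", "ref1"],
    "Definition": ["B", "A", "B", "A", "B", "B", "B", "B", "B"],
    "Total_to_Add": [20, 30, 40, 50, 60, 50, 60, 50, 60],
})

# Step 1: keep Total_to_Add where Definition is 'B', otherwise use 0.
b_only = df["Total_to_Add"].where(df["Definition"].eq("B"), 0)

# Step 2: sum within each Ref_No and broadcast the total back to every row.
df["ANSWER"] = b_only.groupby(df["Ref_No"]).transform("sum")

print(df)
```

Because the non-B rows are replaced with 0 rather than dropped, the result keeps the original index and every row gets its group's total.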
Try:
df['Total_B'] = (df['Definition'].eq('B').mul(df['Total_to_Add'])
.groupby(df['Ref_No']).transform('sum'))
[out]
Ref_No Definition Total_to_Add Total_B
0 ref1 B 20 240
1 ref2 A 30 100
2 ref1 B 40 240
3 ref2 A 50 100
4 ref1 B 60 240
5 ref2 B 50 100
6 ref1 B 60 240
7 ref2 B 50 100
8 ref1 B 60 240
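Since the question mentions wanting a second column for 'A', the same boolean-multiply pattern can be looped over each definition. This is only a sketch: the column names Total_A and Total_B are assumptions, and the data is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Ref_No": ["ref1", "ref2", "ref1", "ref2", "ref1", "ref2", "ref1", "ref2", "ref1"],
    "Definition": ["B", "A", "B", "A", "B", "B", "B", "B", "B"],
    "Total_to_Add": [20, 30, 40, 50, 60, 50, 60, 50, 60],
})

for label in ["A", "B"]:
    # True/False times the amount gives the amount for matching rows, 0 otherwise.
    matching = df["Definition"].eq(label).mul(df["Total_to_Add"])
    # Sum per Ref_No and broadcast back to every row of that Ref_No.
    df[f"Total_{label}"] = matching.groupby(df["Ref_No"]).transform("sum")

print(df)
```

A Ref_No with no rows for a given label gets 0 in that column, as ref1 does for Total_A here.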
This puts the per-Ref_No sum of 'Total_to_Add' into the 'Total_B' column for rows where 'Definition' == 'B'. Rows with any other Definition are dropped by the filter before grouping, so they end up as NaN in 'Total_B' after assignment:
df['Total_B']=df[df['Definition']=='B'].groupby(by=['Ref_No','Definition'])['Total_to_Add'].transform('sum')
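The sketch below shows what filtering before grouping does to the rows that are not 'B', and one possible way to fill every row instead: compute a single total per Ref_No and map it back. The column name Total_B_filtered and the variable names are assumptions for illustration; the data is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Ref_No": ["ref1", "ref2", "ref1", "ref2", "ref1", "ref2", "ref1", "ref2", "ref1"],
    "Definition": ["B", "A", "B", "A", "B", "B", "B", "B", "B"],
    "Total_to_Add": [20, 30, 40, 50, 60, 50, 60, 50, 60],
})

is_b = df["Definition"].eq("B")

# Filtering first: the result only carries the index labels of the 'B' rows,
# so assigning it back leaves NaN on the other rows (index 1 and 3 here).
df["Total_B_filtered"] = df[is_b].groupby("Ref_No")["Total_to_Add"].transform("sum")

# Alternative: one total per Ref_No from the 'B' rows, then look it up per row.
b_totals = df.loc[is_b].groupby("Ref_No")["Total_to_Add"].sum()
df["Total_B"] = df["Ref_No"].map(b_totals)

print(df)
```

With the map approach, a Ref_No that has no 'B' rows at all would be missing from b_totals and come back as NaN; appending .fillna(0) is one way to handle that case.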
I would use transform, after masking the non-B values:
s=df['Total_to_Add'].mask(df.Definition!='B').groupby(df['Ref_No']).transform('sum')
s
0 240.0
1 100.0
2 240.0
3 100.0
4 240.0
5 100.0
6 240.0
7 100.0
8 240.0
Name: Total_to_Add, dtype: float64
df['New']=s
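The totals above come out as float64 because mask replaces the non-B values with NaN, which promotes the integer column to float. If integer totals are preferred, a cast after the transform is one option. A minimal sketch, with the data rebuilt from the question:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Ref_No": ["ref1", "ref2", "ref1", "ref2", "ref1", "ref2", "ref1", "ref2", "ref1"],
    "Definition": ["B", "A", "B", "A", "B", "B", "B", "B", "B"],
    "Total_to_Add": [20, 30, 40, 50, 60, 50, 60, 50, 60],
})

# mask() turns non-B amounts into NaN; the grouped sum skips NaN values.
s = df["Total_to_Add"].mask(df["Definition"].ne("B")).groupby(df["Ref_No"]).transform("sum")

# The sums contain no NaN, so casting back to integers is safe here.
df["New"] = s.astype("int64")

print(df)
```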