How can I iterate over a pandas DataFrame so I can divide specific values based on a condition?
How can I sum the values of a specific column based on a condition?
Here is the DataFrame and my code:
import pandas as pd
data = {
'year': ['2000','2000', '2000', '2000','2000','2000','2000','2000','2000','2000','2000','2000','2000','2000','2000',
'2001','2001','2001','2001','2001','2001','2001','2001','2001','2001','2001','2001','2001','2001','2001',
'2002','2002','2002','2002','2002','2002','2002','2002','2002','2002','2002','2002','2002','2002','2002'],
'type':[2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4],
'other_type':[0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4],
'Fee':[0,0,0,0,0,33,40,50,2,33,0,0,0,0,0,
30,50,10,200,45,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,30,50,10,200,45]
}
dfobj = pd.DataFrame(data)
dfobj.head()
I want to filter on the type column: where type equals 3 and other_type is 0 or 1, sum the Fee values of those rows. This is what I have:
row_Sum = dfobj.loc[(dfobj['type']==3)&(dfobj['other_type'] <2)].sum(axis=0,numeric_only=True)
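The problem is that all years are lumped together in the result. I tried the following, but it is inefficient because it handles one year at a time, and the real df has thousands of rows and many columns and years.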
row_Sum = dfobj.loc[(dfobj['year']=='2000')&(dfobj['type']==3)&(dfobj['other_type'] <2)].sum(axis=0,numeric_only=True)
The main goal is to apply the conditional sum across all years.
Any help would be greatly appreciated, thank you!
You can filter the rows and then use groupby.sum:
m = dfobj['type'].eq(3) & dfobj['other_type'].isin([0, 1])
out = dfobj.loc[m, 'Fee'].groupby(dfobj['year']).sum()
output:
year
2000 73
2001 0
2002 0
Name: Fee, dtype: int64
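One thing to watch with the filter-then-group approach: a year that has no row matching the condition will not appear in the result at all. If you need every year in the output, one possible variant (a sketch, not the only way) is to zero out the non-matching rows with Series.where instead of dropping them. The small frame below is made up for illustration and is not the data from the question.

```python
import pandas as pd

# Hypothetical sample: year 2001 has no row that satisfies the condition.
df = pd.DataFrame({
    "year": ["2000", "2000", "2000", "2001", "2001"],
    "type": [3, 3, 2, 2, 3],
    "other_type": [0, 1, 0, 1, 4],
    "Fee": [33, 40, 5, 7, 9],
})

# Same condition as above: type is 3 and other_type is 0 or 1.
mask = df["type"].eq(3) & df["other_type"].isin([0, 1])

# Filter first: only years with at least one matching row are returned.
filtered = df.loc[mask, "Fee"].groupby(df["year"]).sum()

# Replace non-matching fees with 0 instead: every year is kept.
all_years = df["Fee"].where(mask, 0).groupby(df["year"]).sum()

print(filtered)
print(all_years)
```

Here filtered contains only 2000, while all_years also lists 2001 with a sum of 0. With the data in the question both versions give the same result, because every year has at least one matching row.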