Count a certain value for each country
I am trying to replicate Excel's COUNTIF function with pandas but have hit a roadblock.
I have this dataframe. I need to count the Yes values for each country, quarter by quarter. The expected output is shown below.
result.head(3)
Country  Jan 1  Feb 1  Mar 1  Apr 1  May 1  Jun 1  Quarter_1  Quarter_2
FRANCE   Yes    Yes    No     No     No     No     2          0
BELGIUM  Yes    Yes    No     Yes    No     No     2          1
CANADA   Yes    No     No     Yes    No     No     1          1
I tried the following, but pandas returns a single total instead, showing 5 for every row under Quarter_1. I don't know how to compute this per Country. Any help would be appreciated!
result['Quarter_1'] = (len(result[result['Jan 1'] == 'Yes'])
                       + len(result[result['Feb 1'] == 'Yes'])
                       + len(result[result['Mar 1'] == 'Yes']))
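The attempt above gives 5 in every row because each `len(result[result[col] == 'Yes'])` counts matching rows across the whole frame (3 for Jan 1, 2 for Feb 1, 0 for Mar 1), and assigning that single number to a column broadcasts it to every country. A minimal sketch of a row-wise count instead, assuming a frame rebuilt from the three sample rows in the question; `q1_cols` is a name introduced here for illustration:

```python
import pandas as pd

# Sample data rebuilt from the three rows shown in the question (an assumption;
# the real frame may have more rows and columns).
result = pd.DataFrame({
    'Country': ['FRANCE', 'BELGIUM', 'CANADA'],
    'Jan 1': ['Yes', 'Yes', 'Yes'],
    'Feb 1': ['Yes', 'Yes', 'No'],
    'Mar 1': ['No', 'No', 'No'],
    'Apr 1': ['No', 'Yes', 'Yes'],
    'May 1': ['No', 'No', 'No'],
    'Jun 1': ['No', 'No', 'No'],
})

# Hypothetical helper list: the month columns that make up the first quarter.
q1_cols = ['Jan 1', 'Feb 1', 'Mar 1']

# Compare each cell to 'Yes' and sum across the columns (axis=1),
# giving one count per row instead of one total for the whole frame.
result['Quarter_1'] = result[q1_cols].eq('Yes').sum(axis=1)
print(result[['Country', 'Quarter_1']])
```

This works well for one or two quarters; with many month columns, listing them by hand gets tedious, which is what the grouping approach addresses.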
We can take the number of columns and floor-divide the column positions by 3 to assign each month column to a quarter. Then we group the columns by these labels and take the sum.
Finally, we add the prefix Quarter_:
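import numpy as np

df = df.set_index('Country')
# Label each month column with its quarter: 1, 1, 1, 2, 2, 2, ...
# (the + 1 makes the prefixed names start at Quarter_1 rather than Quarter_0)
grps = np.arange(len(df.columns)) // 3 + 1
dfn = (
    df.join(df.eq('Yes')
              .groupby(grps, axis=1)   # axis=1 groups columns, not rows
              .sum()
              .astype(int)
              .add_prefix('Quarter_'))
      .reset_index()
)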
Or use a list comprehension to rename the columns:
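import numpy as np

df = df.set_index('Country')
grps = np.arange(len(df.columns)) // 3  # 0-based quarter label per month column
dfn = df.eq('Yes').groupby(grps, axis=1).sum().astype(int)
# Shift the 0-based labels so the names read Quarter_1, Quarter_2, ...
dfn.columns = [f'Quarter_{col+1}' for col in dfn.columns]
df = df.join(dfn).reset_index()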
   Country  Jan 1  Feb 1  Mar 1  Apr 1  May 1  Jun 1  Quarter_1  Quarter_2
0  FRANCE   Yes    Yes    No     No     No     No     2          0
1  BELGIUM  Yes    Yes    No     Yes    No     No     2          1
2  CANADA   Yes    No     No     Yes    No     No     1          1
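Note that `groupby(..., axis=1)` is deprecated as of pandas 2.1. A sketch of the same quarterly count that avoids it by transposing, grouping the rows, and transposing back. The sample frame is rebuilt from the three rows above (an assumption), and `counts` is a name introduced here for illustration:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the three rows shown above (an assumption).
df = pd.DataFrame({
    'Country': ['FRANCE', 'BELGIUM', 'CANADA'],
    'Jan 1': ['Yes', 'Yes', 'Yes'],
    'Feb 1': ['Yes', 'Yes', 'No'],
    'Mar 1': ['No', 'No', 'No'],
    'Apr 1': ['No', 'Yes', 'Yes'],
    'May 1': ['No', 'No', 'No'],
    'Jun 1': ['No', 'No', 'No'],
}).set_index('Country')

# One quarter label per month column: [1, 1, 1, 2, 2, 2]
grps = np.arange(len(df.columns)) // 3 + 1

counts = (
    df.eq('Yes')        # boolean frame: True where the cell is 'Yes'
      .T                # months become rows, countries become columns
      .groupby(grps)    # group the month rows by quarter label
      .sum()            # count the True values per quarter
      .T                # back to one row per country
      .astype(int)
      .add_prefix('Quarter_')
)

dfn = df.join(counts).reset_index()
print(dfn)
```

The labels in `grps` rely on the month columns being in calendar order and starting at the first month of a quarter; if that does not hold for the real data, an explicit mapping from column name to quarter would be safer.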