Count all words in comma separated strings per group in pandas
I would like to count the schools (separated by commas) for each state in the data frame given below.
Dataframe:
State Counties Schools_list
S1 C1 GradeA,GradeB,GradeC
S1 C1 GradeD
S2 C1 GradeA,GradeB
S2 C2 GradeC
S3 C2 GradeA,GradeB
S3 C3 GradeC,GradeD
Expected output:
State Schools_count
S1 4
S2 3
S3 4
How can I count the comma-separated schools in the last column, grouped by State?
A simple solution here is to count the commas in each row and add 1, since n commas separate n + 1 schools:
df['Schools_list'].str.count(',').add(1).groupby(df.State).sum()
State
S1 4
S2 3
S3 4
Name: Schools_list, dtype: int64
Note that the grouping comes after the counting: the per-row counts are grouped by State and then summed to get the total for each state.
To get the result as a DataFrame:
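For anyone who wants to run the comma-counting approach end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question; the intermediate names `per_row` and `schools_count` are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "State": ["S1", "S1", "S2", "S2", "S3", "S3"],
    "Counties": ["C1", "C1", "C1", "C2", "C2", "C3"],
    "Schools_list": [
        "GradeA,GradeB,GradeC", "GradeD", "GradeA,GradeB",
        "GradeC", "GradeA,GradeB", "GradeC,GradeD",
    ],
})

# n commas separate n + 1 items, so add 1 to each row's comma count.
per_row = df["Schools_list"].str.count(",").add(1)

# Sum the per-row counts within each state.
schools_count = per_row.groupby(df["State"]).sum()
print(schools_count)
```

This assumes every row holds at least one school: an empty string would still be counted as 1, so such rows would need to be handled separately.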
(df['Schools_list'].str.count(',')
.add(1)
.groupby(df.State)
.sum()
.reset_index(name='Schools_count'))
State Schools_count
0 S1 4
1 S2 3
2 S3 4
You can also split on commas and take the length of the resulting lists, but this is a bit slower. (The pattern ',+' is treated as a regular expression, so a run of consecutive commas counts as a single separator.)
df['Schools_list'].str.split(',+').str.len().groupby(df.State).sum()
State
S1 4
S2 3
S3 4
Name: Schools_list, dtype: int64
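To check the "a bit slower" claim on your own machine, the two approaches can be compared with a rough sketch like the one below. The helper names `count_commas` and `split_and_len`, the repetition factor, and the use of `timeit` are assumptions made for illustration. The split here uses a plain ',' separator rather than ',+', which gives the same counts as long as there are no consecutive commas. Timings will vary with the pandas version and the data size.

```python
import timeit

import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "State": ["S1", "S1", "S2", "S2", "S3", "S3"],
    "Schools_list": [
        "GradeA,GradeB,GradeC", "GradeD", "GradeA,GradeB",
        "GradeC", "GradeA,GradeB", "GradeC,GradeD",
    ],
})


def count_commas(frame):
    # Count commas per row, add 1, then sum per state.
    counts = frame["Schools_list"].str.count(",").add(1)
    return counts.groupby(frame["State"]).sum()


def split_and_len(frame):
    # Split each row into a list, take its length, then sum per state.
    lengths = frame["Schools_list"].str.split(",").str.len()
    return lengths.groupby(frame["State"]).sum()


# Repeat the sample rows so the timing is not dominated by call overhead.
big = pd.concat([df] * 10_000, ignore_index=True)

# Both approaches should agree before their speed is compared.
same = bool((count_commas(big) == split_and_len(big)).all())
print("results agree:", same)

t_count = timeit.timeit(lambda: count_commas(big), number=5)
t_split = timeit.timeit(lambda: split_and_len(big), number=5)
print(f"count commas: {t_count:.3f}s, split and len: {t_split:.3f}s")
```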