Sum of value occurrences grouped by another column in a pandas DataFrame
I need to count the occurrences of each value in column name, grouped by column industry. The goal is to get the count of each name per industry. My data looks like this:
industry name
Home Mike
Home Mike,Angela,Elliot
Fashion Angela,Elliot
Fashion Angela,Elliot
The desired output is:
Home Mike:2 Angela:1 Elliot:1
Fashion Angela:2 Elliot:2
Moving this out of the comments; debugged and confirmed working:
# count() in the next line won't work without an extra column
df['name_list'] = df['name'].str.split(',')
df.explode('name_list').groupby(['industry', 'name_list']).count()
Result:
                    name
industry name_list
Fashion  Angela        2
         Elliot        2
Home     Angela        1
         Elliot        1
         Mike          2
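For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and uses size() instead of count(), which is my own variation: size() counts rows per group, so the extra name_list column is not needed. It assumes pandas 0.25 or later, where explode is available; the variable name counts is just illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "industry": ["Home", "Home", "Fashion", "Fashion"],
    "name": ["Mike", "Mike,Angela,Elliot", "Angela,Elliot", "Angela,Elliot"],
})

counts = (
    df.assign(name=df["name"].str.split(","))  # each cell becomes a list of names
      .explode("name")                         # one row per (industry, name) pair
      .groupby(["industry", "name"])
      .size()                                  # number of rows in each group
)
print(counts)
```

The result is a Series indexed by (industry, name), so a single count can be looked up with counts.loc[("Home", "Mike")].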
You may use collections.Counter to return a Series of dictionaries, as follows:
from collections import Counter
s = df.name.str.split(',').groupby(df.industry).sum().agg(Counter)
Out[506]:
industry
Fashion {'Angela': 2, 'Elliot': 2}
Home {'Mike': 2, 'Angela': 1, 'Elliot': 1}
Name: name, dtype: object
Note: Each cell is a Counter object. Counter is a subclass of dict, so you can apply dictionary operations to it just as you would to a regular dictionary.
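As a rough illustration of treating the Counter objects as dictionaries, the sketch below formats them into the layout asked for in the question. Note that it is not the one-liner above: it builds the Counters with a plain loop over the groups, and the names counters and lines are my own. It relies on Counter keeping first-seen order (Python 3.7+) and on sort=False to keep the industries in their original order.

```python
from collections import Counter

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "industry": ["Home", "Home", "Fashion", "Fashion"],
    "name": ["Mike", "Mike,Angela,Elliot", "Angela,Elliot", "Angela,Elliot"],
})

# One Counter per industry, built by splitting every cell in the group
counters = {
    industry: Counter(n for cell in names for n in cell.split(","))
    for industry, names in df.groupby("industry", sort=False)["name"]
}

# Ordinary dict iteration works on a Counter
lines = [
    f"{industry} " + " ".join(f"{n}:{c}" for n, c in counter.items())
    for industry, counter in counters.items()
]
print("\n".join(lines))
```

Other dictionary-style calls work the same way, for example counters["Home"]["Mike"] or counters["Home"].most_common(1).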