I need to count the occurrences of each value in the column name, grouped by the column industry. The goal is to get the count of each name per industry. My data looks like this:
industry  name
Home      Mike
Home      Mike,Angela,Elliot
Fashion   Angela,Elliot
Fashion   Angela,Elliot
The desired output is:
Home Mike:2 Angela:1 Elliot:1
Fashion Angela:2 Elliot:2
Moving this out of the comments; debugged and confirmed working:
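# count() needs a leftover column to count, so split the names into a
# helper column and keep the original 'name' column for counting
df['name_list'] = df['name'].str.split(',')
# one row per (industry, single name), then count rows in each group
df.explode('name_list').groupby(['industry', 'name_list']).count()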
Result:
                    name
industry name_list
Fashion  Angela        2
         Elliot        2
Home     Angela        1
         Elliot        1
         Mike          2
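For reference, here is a minimal self-contained sketch of the same split-and-explode idea that avoids the helper column by using size() instead of count(). The DataFrame is rebuilt from the sample data in the question; the variable names counts and wide are only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "industry": ["Home", "Home", "Fashion", "Fashion"],
    "name": ["Mike", "Mike,Angela,Elliot", "Angela,Elliot", "Angela,Elliot"],
})

# Split each cell into a list, give every name its own row,
# then count the rows in each (industry, name) group.
counts = (
    df.assign(name=df["name"].str.split(","))
      .explode("name")
      .groupby(["industry", "name"])
      .size()
)

# Optional: one row per industry, one column per name, 0 where a name is absent
wide = counts.unstack(fill_value=0)
print(counts)
print(wide)
```

size() counts the rows in each group, so it does not need a leftover column the way count() does (count() tallies non-null values in the remaining columns).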
You may use collections.Counter to return a Series of dictionaries, as follows:
from collections import Counter
s = df.name.str.split(',').groupby(df.industry).sum().agg(Counter)
Out[506]:
industry
Fashion {'Angela': 2, 'Elliot': 2}
Home {'Mike': 2, 'Angela': 1, 'Elliot': 1}
Name: name, dtype: object
Note: each cell is a Counter object. Counter is a subclass of dict, so you can apply dictionary operations to it just as you would to a dictionary.
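As an illustration of those dictionary operations, the sketch below builds one Counter per industry and formats it into the "name:count" lines the question asks for. It is a variation under stated assumptions, not the answer's exact code: the Counters are built with a plain loop over the groups rather than with agg, and the names per_industry and formatted are hypothetical.

```python
from collections import Counter

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "industry": ["Home", "Home", "Fashion", "Fashion"],
    "name": ["Mike", "Mike,Angela,Elliot", "Angela,Elliot", "Angela,Elliot"],
})

# One Counter per industry; sort=False keeps the industries in order of appearance
per_industry = {
    industry: Counter(n for cell in names for n in cell.split(","))
    for industry, names in df.groupby("industry", sort=False)["name"]
}

# Counter behaves like a dict: iterate over items() to build "name:count" strings
formatted = {
    industry: " ".join(f"{n}:{c}" for n, c in counter.items())
    for industry, counter in per_industry.items()
}

for industry, line in formatted.items():
    print(industry, line)
```

Unlike a plain dict, a Counter returns 0 for a missing key instead of raising KeyError, so per_industry["Fashion"]["Mike"] is safe to look up.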