sum of value occurrence grouped by another column pandas df

Question

I need to count the occurrences of each value in the column name, grouped by the column industry. The goal is to get the total count of each name per industry. My data looks like this:

industry            name
Home             Mike
Home             Mike,Angela,Elliot
Fashion          Angela,Elliot
Fashion          Angela,Elliot

The desired output is:

Home Mike:2 Angela:1 Elliot:1
Fashion Angela:2 Elliot:2

Answer 1

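Moved out of the comments, debugged, and verified to work:

# Split each comma-separated string into a list, stored in a new column.
# count() tallies the non-null values of the remaining columns, so the
# original 'name' column has to stay around for it to have something to count.
df['name_list'] = df['name'].str.split(',')
df.explode('name_list').groupby(['industry', 'name_list']).count()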

Result:

                    name
industry name_list      
Fashion  Angela        2
         Elliot        2
Home     Angela        1
         Elliot        1
         Mike          2
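
To get from that table to the one-line-per-industry layout asked for in the question, the counts can be formatted with a short loop. The following is a minimal sketch, assuming the sample data from the question. It swaps count() for size(), so no helper column is needed, and passes sort=False so names stay in order of first appearance. The variable names counts and lines are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "industry": ["Home", "Home", "Fashion", "Fashion"],
    "name": ["Mike", "Mike,Angela,Elliot", "Angela,Elliot", "Angela,Elliot"],
})

# Split the strings into lists, give every name its own row,
# then count the rows per (industry, name) pair.
counts = (
    df.assign(name=df["name"].str.split(","))
      .explode("name")
      .groupby(["industry", "name"], sort=False)
      .size()
)

# Build one "industry name:count ..." line per industry.
lines = [
    industry + " " + " ".join(f"{name}:{n}" for (_, name), n in group.items())
    for industry, group in counts.groupby(level="industry", sort=False)
]

print("\n".join(lines))
```

size() counts rows per group regardless of which other columns exist, which is why the extra name_list column is not required here.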

Answer 2

You may use collections.Counter to return a series of dictionaries as follows:

from collections import Counter
s = df.name.str.split(',').groupby(df.industry).sum().agg(Counter)

Out[506]:
industry
Fashion               {'Angela': 2, 'Elliot': 2}
Home       {'Mike': 2, 'Angela': 1, 'Elliot': 1}
Name: name, dtype: object

Note: each cell is a Counter object. Counter is a subclass of dict, so you can use it like any other dictionary.
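
As an illustration of what "dictionary operations" means here, the sketch below builds the per-industry Counter objects with a plain Python loop (an alternative to the groupby one-liner above, not the same method) and then queries them. It assumes the sample data from the question; the name counters is hypothetical.

```python
from collections import Counter

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "industry": ["Home", "Home", "Fashion", "Fashion"],
    "name": ["Mike", "Mike,Angela,Elliot", "Angela,Elliot", "Angela,Elliot"],
})

# One Counter per industry, filled row by row.
counters = {}
for industry, names in zip(df["industry"], df["name"]):
    counters.setdefault(industry, Counter()).update(names.split(","))

home = counters["Home"]
fashion = counters["Fashion"]

print(home["Mike"])          # ordinary key lookup
print(home["Zoe"])           # a missing key yields 0 instead of raising KeyError
print(home.most_common(1))   # most frequent name with its count
print(dict(home))            # convert to a plain dict
print(home + fashion)        # Counters add up key by key
```

Adding two Counter objects is a convenient way to get totals across industries without going back to the DataFrame.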

2 answers

solution1
1 ACCEPTED 2020-08-17 18:46:42

solution2
0 2020-08-17 18:48:46
