Sum of value occurrences grouped by another column in a pandas DataFrame

I need to count the occurrences of each value in column name, grouped by column industry. The goal is to get the count of each name per industry. My data looks like this:

industry            name
Home             Mike
Home             Mike,Angela,Elliot
Fashion          Angela,Elliot
Fashion          Angela,Elliot

The desired output is:

Home Mike:2 Angela:1 Elliot:1
Fashion Angela:2 Elliot:2

Moving this out of the comments; debugged and confirmed working:

# count() needs a non-key column to count, so the split goes into a new column and 'name' is kept
df['name_list'] = df['name'].str.split(',')
df.explode('name_list').groupby(['industry', 'name_list']).count()

Result:

                    name
industry name_list      
Fashion  Angela        2
         Elliot        2
Home     Angela        1
         Elliot        1
         Mike          2
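
For reference, here is a self-contained sketch of the same explode-then-group idea. It assumes the sample data from the question is rebuilt by hand, and it swaps in size() for count() so that no helper column is needed. The variable names (exploded, counts) are illustrative only and do not come from the answer.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: plain comma-separated strings)
df = pd.DataFrame({
    "industry": ["Home", "Home", "Fashion", "Fashion"],
    "name": ["Mike", "Mike,Angela,Elliot", "Angela,Elliot", "Angela,Elliot"],
})

# Split the comma-separated names into lists, then give each name its own row
exploded = df.assign(name=df["name"].str.split(",")).explode("name")

# size() counts rows per (industry, name) pair and needs no extra column
counts = exploded.groupby(["industry", "name"]).size()

# Print one line per industry in the "name:count" layout asked for in the question
for industry, group in counts.groupby(level="industry"):
    pairs = " ".join(f"{name}:{n}" for (_, name), n in group.items())
    print(industry, pairs)
```

Because groupby sorts its keys by default, the industries and names come out in alphabetical order rather than in the order shown in the question.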

You can use collections.Counter to return a Series of dictionaries, as follows:

from collections import Counter
# split each row into a list, concatenate the lists per industry, then count names per list
s = df.name.str.split(',').groupby(df.industry).sum().apply(Counter)

Out[506]:
industry
Fashion               {'Angela': 2, 'Elliot': 2}
Home       {'Mike': 2, 'Angela': 1, 'Elliot': 1}
Name: name, dtype: object

Note: Each cell is a Counter object. Counter is a subclass of dict, so you can apply dictionary operations to it just as you would to a dictionary.
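
To illustrate the note, here is a minimal sketch of dictionary-style access on one cell. It assumes the sample data from the question and builds the Counters with a plain Python loop rather than the groupby chain, purely to keep the example independent of pandas version details. The names per_industry and home are hypothetical.

```python
from collections import Counter

import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "industry": ["Home", "Home", "Fashion", "Fashion"],
    "name": ["Mike", "Mike,Angela,Elliot", "Angela,Elliot", "Angela,Elliot"],
})

# Build one Counter per industry by feeding it each row's list of names
per_industry = {}
for industry, names in zip(df["industry"], df["name"].str.split(",")):
    per_industry.setdefault(industry, Counter()).update(names)

# A Series whose cells are Counter objects, indexed by industry
s = pd.Series(per_industry, name="name")

home = s.loc["Home"]
print(home["Mike"])          # ordinary dict lookup
print(home.get("Zoe", 0))    # dict.get with a default for a missing name
print(home.most_common(1))   # Counter-specific helper: the most frequent name
print(dict(home))            # convert to a plain dict if needed
```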

