How to group a dataframe with 2 columns and get highest count?

I have the following dataframe (df). Col A is a category and Col B is an item in that category.

| Col A | Col B |
|:------|:------|
| red   | car   |
| red   | car   |
| red   | truck |
| red   | ball  |
| blue  | bus   |
| blue  | bus   |
| blue  | bus   |
| blue  | truck |
| blue  | car   |

I want to get another dataframe (df2) showing the number of rows for each distinct category in Col A, followed by the most frequently occurring item in Col B for that category and its count, as below:

| Col A | Count A | Col B | Count B |
|:------|--------:|:------|--------:|
| red   |       4 | car   |       2 |
| blue  |       5 | bus   |       3 |

Any idea on how to generate this dataframe?

I have tried this command:

df2 = df.groupby('Col A')['Col B'].apply(lambda x: x.value_counts().index[0]).reset_index()

and I get the following result:

| Col A | Col B |
|:------|:------|
| red   | car   |
| blue  | bus   |

I don't know how to get the two count columns. Any ideas?

You can use Counter from the collections module:

from collections import Counter

# Per category: count the rows and collect the Col B items into a list
final = df.groupby(['Col A']).agg({'Col A': 'count', 'Col B': list})

'''
       Col A                        Col B
Col A                                    
blue       5  [bus, bus, bus, truck, car]
red        4      [car, car, truck, ball]
'''

# Most common item in each list, and how many times it occurs
final['Col_b'] = final['Col B'].apply(lambda x: Counter(x).most_common(1)[0][0])
final['Count_b'] = final['Col B'].apply(lambda x: Counter(x).most_common(1)[0][1])

# Drop the helper list column and rename the row count before resetting the index
final = final.drop('Col B', axis=1).rename(columns={'Col A': 'Count A'}).reset_index()

Output:

|    | Col A   |   Count A | Col_b   |   Count_b |
|---:|:--------|----------:|:--------|----------:|
|  0 | blue    |         5 | bus     |         3 |
|  1 | red     |         4 | car     |         2 |
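
As an alternative, the same result can probably be reached with pandas alone, without collecting the items into lists. The following is only a sketch, not part of the answer above: it rebuilds the sample data from the question, and the intermediate names `count_a`, `pair_counts` and `top` are made up for illustration. It assumes that when two items tie for the highest count in a category, keeping any one of them is acceptable.

```python
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({
    "Col A": ["red"] * 4 + ["blue"] * 5,
    "Col B": ["car", "car", "truck", "ball",
              "bus", "bus", "bus", "truck", "car"],
})

# Number of rows in each category
count_a = df.groupby("Col A").size().rename("Count A").reset_index()

# Number of occurrences of each (category, item) pair
pair_counts = (
    df.groupby(["Col A", "Col B"]).size().rename("Count B").reset_index()
)

# Sort pairs by count, highest first, then keep the first row per category.
# On a tie, only one of the tied items survives.
top = (
    pair_counts.sort_values("Count B", ascending=False, kind="stable")
    .drop_duplicates("Col A")
)

# Attach the per-category row count and order the columns as requested
df2 = top.merge(count_a, on="Col A")[["Col A", "Count A", "Col B", "Count B"]]
df2 = df2.sort_values("Col A").reset_index(drop=True)

print(df2)
```

This keeps the column names `Col B` and `Count B` asked for in the question, because no helper list column occupies the `Col B` name.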
