Groupby multiple columns to find the unique count of one column using Python pandas

Question

I have a dataframe like:

column1    column2    column3
 ram        tall        good
 rohan      short       fine
 ajay       tall        best
 alia       tall        good
 aman       medium      fine
 john       short       good
 jack       short       fine

Now I need output like:

Unique count of good for tall, short, medium, based on column1 ->

tall=2 , short=1 , medium=0

Unique count of fine for tall, short, medium, based on column1 ->

tall=0 , short=2 , medium=1

Unique count of best for tall, short, medium, based on column1 ->

tall=1 , short=0 , medium=0

I am a beginner in pandas. Thanks in advance.

Answer 1

Let's try pd.crosstab:

pd.crosstab(df['column3'], df['column2'])

column2  medium  short  tall
column3                     
best          0      0     1
fine          1      2     0
good          0      1     2
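For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and reads one row of the crosstab back as a plain dict. The variable names `counts` and `good_counts` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Rebuild the sample frame shown in the question.
df = pd.DataFrame({
    "column1": ["ram", "rohan", "ajay", "alia", "aman", "john", "jack"],
    "column2": ["tall", "short", "tall", "tall", "medium", "short", "short"],
    "column3": ["good", "fine", "best", "good", "fine", "good", "fine"],
})

# Rows are column3 values, columns are column2 values, cells are row counts.
counts = pd.crosstab(df["column3"], df["column2"])

# Pull out a single row, e.g. how many "good" entries each height has.
good_counts = counts.loc["good"].to_dict()
print(good_counts)
```

Combinations that never occur in the data (such as best/medium) are filled with 0 by crosstab, which is what produces the medium=0 entries the question asks for.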

Answer 2

Use value_counts + unstack:

res = df[['column3', 'column2']].value_counts().unstack('column2', fill_value=0)
print(res)

Output

column2  medium  short  tall
column3                     
best          0      0     1
fine          1      2     0
good          0      1     2

As an alternative, groupby + unstack:

res = df.groupby(['column3', 'column2']).count().unstack('column2', fill_value=0)
print(res)

Output (groupby)

        column1           
column2  medium short tall
column3                   
best          0     0    1
fine          1     2    0
good          0     1    2
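The extra `column1` level in the header above appears because `count()` is applied to every remaining column. One possible way to avoid it, sketched here as a variation rather than as part of the original answer, is to use `size()`, which returns a single Series of group sizes before unstacking.

```python
import pandas as pd

# Same sample frame as in the question.
df = pd.DataFrame({
    "column1": ["ram", "rohan", "ajay", "alia", "aman", "john", "jack"],
    "column2": ["tall", "short", "tall", "tall", "medium", "short", "short"],
    "column3": ["good", "fine", "best", "good", "fine", "good", "fine"],
})

# size() counts rows per (column3, column2) pair and yields a Series,
# so unstacking gives a frame with a single-level column header.
res = df.groupby(["column3", "column2"]).size().unstack("column2", fill_value=0)
print(res)
```

Note that `size()` counts rows, while `count()` counts non-null values per column. With this sample data, which has no missing values, the two give the same numbers.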

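The idea behind both approaches is to create an index and then unstack it. If you want to match the same order specified in the question, convert to Categorical first: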

df['column2'] = pd.Categorical(df['column2'], categories=['tall', 'short', 'medium'], ordered=True)
res = df[['column3', 'column2']].value_counts().unstack('column2', fill_value=0)
print(res)

Output

column2  tall  short  medium
column3                     
best        1      0       0
fine        0      2       1
good        2      1       0
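If you would rather not change the dtype of `column2`, a different option (not the approach shown in the answer above) is to reorder the finished table with `reindex`. The sketch below also formats each row in the `tall=2 , short=1 , medium=0` style used in the question. The names `order` and `lines` are illustrative.

```python
import pandas as pd

# Same sample frame as in the question.
df = pd.DataFrame({
    "column1": ["ram", "rohan", "ajay", "alia", "aman", "john", "jack"],
    "column2": ["tall", "short", "tall", "tall", "medium", "short", "short"],
    "column3": ["good", "fine", "best", "good", "fine", "good", "fine"],
})

# Desired column order, taken from the question.
order = ["tall", "short", "medium"]

# reindex reorders the columns; fill_value=0 covers any category missing from the data.
res = pd.crosstab(df["column3"], df["column2"]).reindex(columns=order, fill_value=0)

# Build one "height=count" line per column3 value.
lines = {
    quality: " , ".join(f"{height}={count}" for height, count in row.items())
    for quality, row in res.iterrows()
}
for quality, line in lines.items():
    print(f"{quality}: {line}")
```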

Solution 1
5 2020-12-26 09:36:58

Solution 2
1 Accepted 2020-12-26 09:31:09
