Pandas: How to group by one column and show count for unique values for all other columns per group?

(Data sample and attempts at the end of the question.)

With a dataframe such as this:

    Type    Class   Area    Decision
0   A       1       North   Yes
1   B       1       North   Yes
2   C       2       South   No
3   A       3       South   No
4   B       3       South   No
5   C       1       South   No
6   A       2       North   Yes
7   B       3       South   Yes
8   B       1       North   No

How can I group by Decision and count, for each unique value in the other columns, how many rows fall under each Decision, so that I end up with this:

Decision  Area_North  Area_South   Class_1  Class_2  Type_A  Type_B  Type_C
Yes       3           1            2        0        2       2       1
No        1           4            1        1        1       2       2

I was sure I could get a good start using groupby().agg() like this:

dfg = df.groupby('Decision').agg({'Type':'count',
                           'Class':'count',
                           'Decision':'count'})

And then pivot the result, but that is far from enough. I'll need to include the unique values of all other columns somehow. I was sure I'd seen somewhere that you could replace 'Position':'count' with 'Position':pd.Series.unique, but I can't seem to get it to work.

Code:

import pandas as pd

df = pd.DataFrame({'Type': {0: 'A',
                          1: 'B',
                          2: 'C',
                          3: 'A',
                          4: 'B',
                          5: 'C',
                          6: 'A',
                          7: 'B',
                          8: 'B'},
                     'Class': {0: 1, 1: 1, 2: 2, 3: 3, 4: 3, 5: 1, 6: 2, 7: 3, 8: 1},
                     'Area': {0: 'North',
                          1: 'North',
                          2: 'South',
                          3: 'South',
                          4: 'South',
                          5: 'South',
                          6: 'North',
                          7: 'South',
                          8: 'North'},
                     'Decision': {0: 'Yes',
                          1: 'Yes',
                          2: 'No',
                          3: 'No',
                          4: 'No',
                          5: 'No',
                          6: 'Yes',
                          7: 'Yes',
                          8: 'No'}})

dfg = df.groupby('Decision').agg({'Type':'count',
                           'Class':'count',
                           'Decision':'count'})
dfg

Use DataFrame.melt with DataFrame.pivot_table, then flatten the MultiIndex in the columns:

df = df.melt('Decision').pivot_table(index='Decision', 
                                     columns=['variable','value'], 
                                     aggfunc='size', 
                                     fill_value=0)
df.columns = df.columns.map('{0[0]}_{0[1]}'.format)
df = df.reset_index()
print (df)
  Decision  Area_North  Area_South  Class_1  Class_2  Class_3  Type_A  Type_B  \
0       No           1           4        2        1        2       1       2   
1      Yes           3           1        2        1        1       2       2   

   Type_C  
0       2  
1       0  
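
To make the steps above easier to follow, here is a minimal, self-contained sketch that runs the same melt + pivot_table idea one stage at a time. The intermediate names long_df, counts and result are only illustrative (they are not in the original answer), and the f-string used to flatten the columns is an assumed equivalent of the '{0[0]}_{0[1]}'.format mapping.

```python
import pandas as pd

# Same data as in the question, written with plain lists.
df = pd.DataFrame({
    'Type': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'B'],
    'Class': [1, 1, 2, 3, 3, 1, 2, 3, 1],
    'Area': ['North', 'North', 'South', 'South', 'South',
             'South', 'North', 'South', 'North'],
    'Decision': ['Yes', 'Yes', 'No', 'No', 'No', 'No', 'Yes', 'Yes', 'No'],
})

# Step 1: reshape to long format. Every non-Decision cell becomes one row
# with the columns Decision, variable (old column name) and value.
long_df = df.melt('Decision')

# Step 2: count the rows for each (Decision, variable, value) combination.
# aggfunc='size' counts rows; fill_value=0 covers combinations that never occur.
counts = long_df.pivot_table(index='Decision',
                             columns=['variable', 'value'],
                             aggfunc='size',
                             fill_value=0)

# Step 3: flatten the two-level column index into names such as 'Area_North'.
counts.columns = [f'{variable}_{value}' for variable, value in counts.columns]
result = counts.reset_index()

print(result)
```

The long format is what lets a single pivot count values from columns of different types (strings and integers) in one pass, instead of aggregating each column separately.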

Alternatively, use melt with groupby + value_counts:

s=df.melt('Decision').groupby(['Decision','variable']).\
    value.value_counts().unstack(level=[1,2],fill_value=0)
variable  Area       Class       Type      
value    South North     1  3  2    B  C  A
Decision                                   
No           4     1     2  2  1    2  2  1
Yes          1     3     2  1  1    2  0  2

You can also flatten the columns above with:

s.columns = s.columns.map('{0[0]}_{0[1]}'.format) 
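
For reference, a self-contained sketch of this second approach, including the column flattening, might look as follows. It selects the melted column with ['value'] rather than attribute access, and the final sort_index call is an optional addition (not part of the original answer) that puts the columns in alphabetical order, since the unstacked columns otherwise follow the order produced by value_counts.

```python
import pandas as pd

# Same data as in the question, written with plain lists.
df = pd.DataFrame({
    'Type': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'B'],
    'Class': [1, 1, 2, 3, 3, 1, 2, 3, 1],
    'Area': ['North', 'North', 'South', 'South', 'South',
             'South', 'North', 'South', 'North'],
    'Decision': ['Yes', 'Yes', 'No', 'No', 'No', 'No', 'Yes', 'Yes', 'No'],
})

# Count each value per (Decision, variable) group, then move the
# 'variable' and 'value' index levels into the columns.
s = (df.melt('Decision')
       .groupby(['Decision', 'variable'])['value']
       .value_counts()
       .unstack(level=[1, 2], fill_value=0))

# Flatten the two-level columns into names such as 'Type_A'.
s.columns = [f'{variable}_{value}' for variable, value in s.columns]

# Optional (assumption, not in the original answer): alphabetical column order.
s = s.sort_index(axis=1)

print(s)
```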
