
Percentage of the total size after GROUP BY in Python

code_module  final_result
AAA          Distinction      44
             Fail             91
             Pass            487
             Withdrawn       126

This is the output of the following Python code:

 studentInfo.groupby(['code_module','final_result']).agg({'code_module':[np.size]})
  • I want to calculate each final_result's share of the total.
  • The math is AAA.pass / AAA.total.
  • The total is the sum of all the numbers above.
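As a quick check of the arithmetic being asked for, here is a minimal sketch in plain Python. It uses only the four counts shown in the output above; the names `counts`, `total`, and `shares` are illustrative and not part of the original code.

```python
# Counts for module AAA, copied from the groupby output in the question.
counts = {"Distinction": 44, "Fail": 91, "Pass": 487, "Withdrawn": 126}

# The denominator is the sum of all counts within the module.
total = sum(counts.values())

# Share of each final_result within the module (e.g. AAA.pass / AAA.total).
shares = {result: n / total for result, n in counts.items()}

for result, share in shares.items():
    print(f"{result:<12}{share:.6f}")
```

For example, Pass works out to 487 / 748, roughly 0.651.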

I believe you need SeriesGroupBy.value_counts with the parameter normalize=True:

s1 = studentInfo.groupby('code_module')['final_result'].value_counts(normalize=True)
print (s1)
code_module  final_result
AAA          Pass            0.651070
             Withdrawn       0.168449
             Fail            0.121658
             Distinction     0.058824
Name: final_result, dtype: float64
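If you want to try this without the original data, the following is a small self-contained sketch. It assumes a stand-in `studentInfo` rebuilt from the counts in the question (one row per student), and it multiplies by 100 to show percentages rather than fractions; the `pct` name is just for illustration.

```python
import pandas as pd

# Hypothetical stand-in for studentInfo: one row per student,
# rebuilt from the counts shown in the question.
counts = {"Distinction": 44, "Fail": 91, "Pass": 487, "Withdrawn": 126}
studentInfo = pd.DataFrame({
    "code_module": "AAA",
    "final_result": [r for r, n in counts.items() for _ in range(n)],
})

# Fraction of each final_result within its code_module, scaled to percent.
pct = (
    studentInfo.groupby("code_module")["final_result"]
    .value_counts(normalize=True)
    .mul(100)
    .round(2)
)
print(pct)
```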

Or simplify your solution with DataFrameGroupBy.size and divide by the sum per first level of the MultiIndex:

s = studentInfo.groupby(['code_module','final_result']).size()
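# Series.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
s2 = s.div(s.groupby(level=0).sum(), level=0)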
print (s2)
code_module  final_result
AAA          Distinction     0.058824
             Fail            0.121658
             Pass            0.651070
             Withdrawn       0.168449
dtype: float64
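To see that the division really happens per first level of the MultiIndex, here is a sketch with two modules. The AAA counts come from the question; the BBB module and its counts are invented purely for illustration.

```python
import pandas as pd

# AAA counts are from the question; BBB is a made-up second module.
index = pd.MultiIndex.from_tuples(
    [
        ("AAA", "Distinction"), ("AAA", "Fail"),
        ("AAA", "Pass"), ("AAA", "Withdrawn"),
        ("BBB", "Fail"), ("BBB", "Pass"), ("BBB", "Withdrawn"),
    ],
    names=["code_module", "final_result"],
)
s = pd.Series([44, 91, 487, 126, 10, 60, 30], index=index)

# Total per module, then divide each row by the total of its own module.
totals = s.groupby(level="code_module").sum()
s2 = s.div(totals, level="code_module")
print(s2)

# Each module's fractions should add up to 1.
print(s2.groupby(level="code_module").sum())
```

Passing `level` to `div` is what aligns each row with the total of its own module instead of a single grand total.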

The difference between the solutions is that value_counts returns the Series in descending order, so the first element is the most frequently occurring one, while size keeps the group keys in sorted order.
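The ordering difference can be seen on a tiny example. This is only a sketch with a made-up six-row frame named `df`; after `sort_index()` the two results should hold the same values in the same order.

```python
import pandas as pd

# Made-up data: 3 Pass, 2 Fail, 1 Withdrawn in a single module.
df = pd.DataFrame({
    "code_module": ["AAA"] * 6,
    "final_result": ["Pass", "Pass", "Pass", "Fail", "Fail", "Withdrawn"],
})

# value_counts: ordered by frequency within each module.
by_count = df.groupby("code_module")["final_result"].value_counts(normalize=True)

# size + div: ordered by the group keys.
sizes = df.groupby(["code_module", "final_result"]).size()
by_label = sizes.div(sizes.groupby(level=0).sum(), level=0)

order_by_count = list(by_count.index.get_level_values("final_result"))
order_by_label = list(by_label.index.get_level_values("final_result"))
print(order_by_count)
print(order_by_label)

# Sorting the value_counts result by index lines it up with the size-based one.
aligned = by_count.sort_index()
print(list(aligned.index) == list(by_label.index))
```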
