Counting categories within subsets of years and dividing by the total count within each subset

I am counting the number of negative and positive numbers within each year. Ultimately, I want the percentage of negative and positive values for each year.

I tried grouping by year and counting the categories, but the new column appears with no name.

    df1= df.groupby(['Year','Count of Negative/Positive Margins'])['Count of Negative/Positive Margins'].count()

    df1.head()
    Out[194]: 
    Year  Count of Negative/Positive Margins
    2005  1                                     4001
          2                                     1373
    2006  1                                     4046
          2                                     1304
    2007  1                                     4156
    Name: Count of Negative/Positive Margins, dtype: int64 

This is my expected output:

    2005  1                                     74%
          2                                     26%
    .
    .
    .

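Use SeriesGroupBy.value_counts, grouping only by the Year column and passing normalize=True, so each count is divided by the total for its year. Then multiply by 100, round with Series.round, cast to integers so the result reads 74 rather than 74.0, convert to strings, and append %. Passing name='percentage' to reset_index gives the resulting column a name:

    df = (df.groupby('Year')['Count of Negative/Positive Margins']
            .value_counts(normalize=True)    # share of each category within its year
            .mul(100)                        # fraction -> percent
            .round()
            .astype(int)                     # drop the trailing ".0" left by round()
            .astype(str)
            .add('%')
            .reset_index(name='percentage')  # name the result column
         )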
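For comparison, here is a minimal sketch of an alternative that makes the "divide by the total count within the subset" step explicit, rather than relying on normalize=True. The sample data and the helper name yearly_percentages are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; 1 and 2 stand in for the two margin categories.
df = pd.DataFrame({
    "Year": [2005, 2005, 2005, 2005, 2006, 2006],
    "Count of Negative/Positive Margins": [1, 1, 1, 2, 1, 2],
})

col = "Count of Negative/Positive Margins"


def yearly_percentages(frame):
    """Return each category's share of its year's rows as a percent string."""
    # Rows per (Year, category) pair; the result is a Series with a MultiIndex.
    counts = frame.groupby(["Year", col]).size()
    # Total rows per year, broadcast back onto every (Year, category) entry.
    totals = counts.groupby(level="Year").transform("sum")
    pct = counts.div(totals).mul(100).round().astype(int).astype(str).add("%")
    # Naming the values here avoids the unnamed column from the question.
    return pct.reset_index(name="percentage")


result = yearly_percentages(df)
print(result)
```

With this sample, 2005 has three rows in category 1 and one in category 2, so its shares come out as 75% and 25%. Unlike value_counts, which orders categories by frequency within each year, this version keeps them sorted by category value.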
