How to find the ratio from a groupby function in a pandas Series

Question

I have used groupby to group the dataset by occupation and gender. Now I want to find the ratio between males and females for each occupation. I am unable to think of how to proceed.

Answer 1

Here's one way using pandas.pivot_table and vectorised pandas calculations. Note that this method removes the need to perform a separate groupby.

import pandas as pd

df = pd.DataFrame([['A', 'F'], ['A', 'F'], ['A', 'M'], ['B', 'M'], ['B', 'M'], ['B', 'F'],
                   ['C', 'M'], ['C', 'M'], ['D', 'F']], columns=['Occupation', 'Gender'])

# pivot input dataframe
res = df.pivot_table(index='Occupation', columns='Gender', aggfunc='size', fill_value=0)

# calculate ratios
sums = res[['F', 'M']].sum(axis=1)
res['FemaleRatio'] = res['F'] / sums
res['MaleRatio'] = res['M'] / sums

print(res)

Gender      F  M  FemaleRatio  MaleRatio
Occupation                              
A           2  1     0.666667   0.333333
B           1  2     0.333333   0.666667
C           0  2     0.000000   1.000000
D           1  0     1.000000   0.000000
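
As a possible shortcut for the same row-wise shares, pandas.crosstab can count and normalise in one call. This is only a sketch on the same toy data as above; the name `shares` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 'F'], ['A', 'F'], ['A', 'M'], ['B', 'M'], ['B', 'M'], ['B', 'F'],
     ['C', 'M'], ['C', 'M'], ['D', 'F']],
    columns=['Occupation', 'Gender'],
)

# normalize='index' divides every count by its row (occupation) total,
# so each row of the result sums to 1
shares = pd.crosstab(df['Occupation'], df['Gender'], normalize='index')
print(shares)
```

Occupation/gender combinations that never occur show up as 0.0 rather than being dropped, which matches the fill_value=0 behaviour of the pivot above.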

Answer 2

Maybe quite late to the party, but here's what I believe is the exact answer:

# create pivot
male_ratio = users.pivot_table(index='occupation', columns='gender', aggfunc='size', fill_value=0)

# calculate the male share of each occupation as a percentage
sums = male_ratio[['F', 'M']].sum(axis=1)
male_ratio['MaleRatio'] = (100 * male_ratio['M'] / sums).round(1)

# result
male_ratio['MaleRatio']

occupation
administrator     54.4
artist            53.6
doctor           100.0
educator          72.6
engineer          97.0
entertainment     88.9
executive         90.6
healthcare        31.2
homemaker         14.3
lawyer            83.3
librarian         43.1
marketing         61.5
none              55.6
other             65.7
programmer        90.9
retired           92.9
salesman          75.0
scientist         90.3
student           69.4
technician        96.3
writer            57.8
Name: MaleRatio, dtype: float64
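
The figures above are the male share of each occupation, not a male-to-female ratio. If a literal M/F ratio is what's wanted, something along these lines might work. It is a sketch only: the small `users` frame is a hypothetical stand-in for the real table, and returning NaN for occupations with no women is my assumption about how to handle division by zero.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real `users` table
users = pd.DataFrame({
    'occupation': ['artist', 'artist', 'artist', 'doctor', 'doctor', 'writer', 'writer'],
    'gender':     ['M', 'M', 'F', 'M', 'M', 'F', 'M'],
})

# Count rows per (occupation, gender) and spread gender into columns
counts = users.groupby(['occupation', 'gender']).size().unstack(fill_value=0)

# Swap zero female counts for NaN so the ratio is NaN instead of inf
male_to_female = counts['M'] / counts['F'].replace(0, np.nan)
print(male_to_female)
```

Without the replace step, an occupation with no women would come out as inf, which tends to break later sorting or averaging.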

Answer 3

# x: count of each gender within each occupation
x = users.groupby(['occupation', 'gender'])['gender'].count()
# y: total count per occupation
y = users.groupby('occupation')['gender'].count()
# percentage share of each gender within its occupation
r = (x / y * 100).round(2)
print(r)

occupation     gender
administrator  F          45.57
               M          54.43
artist         F          46.43
               M          53.57
doctor         M         100.00
educator       F          27.37
               M          72.63
engineer       F           2.99
               M          97.01
entertainment  F          11.11
               M          88.89
executive      F           9.38
               M          90.62
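
The two counts and the division can probably be collapsed into one step with value_counts(normalize=True) on the grouped column. A minimal sketch, using a made-up `users` frame in place of the real data:

```python
import pandas as pd

# Hypothetical stand-in for the real `users` table
users = pd.DataFrame({
    'occupation': ['artist', 'artist', 'artist', 'doctor', 'doctor', 'writer', 'writer'],
    'gender':     ['M', 'M', 'F', 'M', 'M', 'F', 'M'],
})

# normalize=True returns each gender's fraction within its occupation group
r = (users.groupby('occupation')['gender']
          .value_counts(normalize=True)
          .mul(100)
          .round(2))
print(r)
```

As in the output above, a gender that does not occur in an occupation simply has no row, so doctor would list only M at 100.00.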

3 answers

Answer 1
3 2018-06-24 14:27:14

Answer 2
1 2018-12-24 14:50:57

Answer 3
1 Accepted
