
pandas divide value by aggregated sum based on indicator

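Before this gets marked as a duplicate, I have already looked at the following: Question 1, Question 2, source3

For each farmer, I am trying to calculate two things:
1) the share of the farmer's ripe fruit that is fruit x: %(ripe fruit x)/(total ripe fruit)
2) the share of fruit x that is ripe: %(ripe fruit x)/(total fruit x)

Both are based on a ripe-fruit indicator (1 means ripe, 0 means not ripe).

Input: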

import pandas as pd

df = pd.DataFrame({'Farmer': ['Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Tims','Tims','Tims','Tims'],
                 'Fruit':['Apple','Apple','Apple','Grape','Grape','Grape','Grape','Cherry','Cherry','Cherry','Cherry','Cherry','Cherry','Cherry','Cherry'],
                 'Type': ['Red','Yellow','Green','Red seedless','Red with seeds','Green','Purple','Montmorency','Morello','Bing','Rainer','Montmorency','Morello','Bing','Rainer'],
                 'Number':[2,6,2,1,1,6,2,3,1,3,3,3,1,3,3],
                 'Ripe':[1,1,0,1,0,1,1,0,0,0,1,0,0,0,1]})
df

    Farmer  Fruit   Number  Ripe    Type
0   Sallys  Apple   2        1      Red
1   Sallys  Apple   6        1      Yellow
2   Sallys  Apple   2        0      Green
3   Sallys  Grape   1        1      Red seedless
4   Sallys  Grape   1        0      Red with seeds
5   Sallys  Grape   6        1      Green
6   Sallys  Grape   2        1      Purple
7   Sallys  Cherry  3        0      Montmorency
8   Sallys  Cherry  1        0      Morello
9   Sallys  Cherry  3        0      Bing
10  Sallys  Cherry  3        1      Rainer
11  Tims    Cherry  3        0      Montmorency
12  Tims    Cherry  1        0      Morello
13  Tims    Cherry  3        0      Bing
14  Tims    Cherry  3        1      Rainer

Desired output:

    Farmer  Fruit   %(ripe fruit x)/(total ripe fruit)  %(ripe fruit x)/(total fruit x)
0   Sallys  Apple   40                                  80
1   Sallys  Grape   45                                  90
2   Sallys  Cherry  15                                  30
3   Tims    Cherry  100                                 30
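
To make the two definitions concrete, here is a minimal sketch that works out the first row of the desired output (Sallys, Apple) by hand with boolean masks. It rebuilds only Sallys' rows from the input above; the names sallys, is_apple, is_ripe and the pct_* variables are illustrative and not part of the original question.

```python
import pandas as pd

# Only Sallys' rows from the question, with the columns the ratios need.
sallys = pd.DataFrame({
    'Fruit':  ['Apple'] * 3 + ['Grape'] * 4 + ['Cherry'] * 4,
    'Number': [2, 6, 2, 1, 1, 6, 2, 3, 1, 3, 3],
    'Ripe':   [1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1],
})

is_apple = sallys['Fruit'] == 'Apple'
is_ripe = sallys['Ripe'] == 1

ripe_apples = sallys.loc[is_apple & is_ripe, 'Number'].sum()  # 2 + 6 = 8
total_ripe = sallys.loc[is_ripe, 'Number'].sum()              # 8 + 9 + 3 = 20
total_apples = sallys.loc[is_apple, 'Number'].sum()           # 2 + 6 + 2 = 10

# 1) ripe apples as a share of all of Sallys' ripe fruit
pct_of_ripe = 100 * ripe_apples / total_ripe
# 2) ripe apples as a share of all of Sallys' apples
pct_apples_ripe = 100 * ripe_apples / total_apples

print(pct_of_ripe, pct_apples_ripe)
```

The two printed values should correspond to the 40 and 80 in the Sallys/Apple row.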

First aggregate with sum and reshape with unstack, then divide by the relevant sums with div:

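# Total Number per Farmer/Fruit/Ripe, with the Ripe values 0 and 1 as columns
df1 = df.groupby(['Farmer','Fruit','Ripe'], sort=False)['Number'].sum().unstack()

# Series.sum(level=0) was removed in pandas 2.0; groupby(level=0) gives the per-farmer ripe total
a = df1[1].div(df1[1].groupby(level=0).transform('sum')).mul(100)
# ripe fruit x as a share of all fruit x (row total over Ripe 0 and 1)
b = df1[1].div(df1.sum(axis=1)).mul(100)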

keys = ('%(ripe fruit x)/(total ripe fruit)','%(ripe fruit x)/(total fruit x)')
df2 = pd.concat([a,b], axis=1, keys=keys).reset_index()
print(df2)
   Farmer   Fruit  %(ripe fruit x)/(total ripe fruit)  \
0  Sallys   Apple                                40.0   
1  Sallys   Grape                                45.0   
2  Sallys  Cherry                                15.0   
3    Tims  Cherry                               100.0   

   %(ripe fruit x)/(total fruit x)  
0                             80.0  
1                             90.0  
2                             30.0  
3                             30.0  
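
As an alternative, the same percentages can be computed without unstack. This is a sketch under the assumption that Ripe only ever holds 0 or 1, so Number * Ripe is the ripe count of each row. The column names RipeNumber, pct_of_ripe and pct_ripe_of_fruit are made up for this example. One possible advantage: unstack leaves NaN for a farmer/fruit pair that has no rows for one of the Ripe values, whereas summing a 0/1-weighted column yields a plain 0 there.

```python
import pandas as pd

# Same data as the question, minus the Type column, which the ratios do not use.
df = pd.DataFrame({
    'Farmer': ['Sallys'] * 11 + ['Tims'] * 4,
    'Fruit':  ['Apple'] * 3 + ['Grape'] * 4 + ['Cherry'] * 8,
    'Number': [2, 6, 2, 1, 1, 6, 2, 3, 1, 3, 3, 3, 1, 3, 3],
    'Ripe':   [1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1],
})

# Per Farmer/Fruit: total fruit (Number) and ripe fruit (RipeNumber).
g = (df.assign(RipeNumber=df['Number'] * df['Ripe'])
       .groupby(['Farmer', 'Fruit'], sort=False)[['Number', 'RipeNumber']]
       .sum())

# Each farmer's total ripe fruit, repeated on every row of that farmer.
ripe_per_farmer = g.groupby(level='Farmer')['RipeNumber'].transform('sum')

out = pd.DataFrame({
    'pct_of_ripe': 100 * g['RipeNumber'] / ripe_per_farmer,
    'pct_ripe_of_fruit': 100 * g['RipeNumber'] / g['Number'],
}).reset_index()

print(out)
```

A farmer with no ripe fruit at all would still produce a 0/0 division (NaN) in pct_of_ripe, so that case would need separate handling.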

