How to do Groupby Ratio calculation in Pandas

I have a pandas DataFrame as below. For each origin and destination combination, I need to calculate the success ratio of the FLAG column, i.e. the share of rows in that combination where FLAG equals Y.

Input

ORG DSTN    FLAG
LON SIN      Y
ADL SIN      N
SIN LON      N
LON SIN      Y
LON SIN      N
ADL SIN      Y
ADL SIN      N
SIN LON      Y
SIN LON      Y
SIN LON      Y
SIN LON      N
LON SIN      N

Expected Output

ORG DSTN    FLAG    Ratio
LON SIN      Y       0.5
ADL SIN      N       0.3
SIN LON      N       0.6
LON SIN      Y       0.5
LON SIN      N       0.5
ADL SIN      Y       0.3
ADL SIN      N       0.3
SIN LON      Y       0.6
SIN LON      Y       0.6
SIN LON      Y       0.6
SIN LON      N       0.6
LON SIN      N       0.5

How can this be done in pandas?

Using value_counts with normalize=True:

s = (df.groupby(['ORG', 'DSTN']).FLAG
        .value_counts(normalize=True).rename('Ratio').reset_index()
)

Then change the rows where FLAG equals N to the corresponding Y ratio (1 minus the N ratio), and merge the result back onto the original frame:

s.loc[s.FLAG.eq('N'), 'Ratio'] = 1.0 - s.Ratio
df.merge(s, how='left')

   ORG DSTN FLAG     Ratio
0   LON  SIN    Y  0.500000
1   ADL  SIN    N  0.333333
2   SIN  LON    N  0.600000
3   LON  SIN    Y  0.500000
4   LON  SIN    N  0.500000
5   ADL  SIN    Y  0.333333
6   ADL  SIN    N  0.333333
7   SIN  LON    Y  0.600000
8   SIN  LON    Y  0.600000
9   SIN  LON    Y  0.600000
10  SIN  LON    N  0.600000
11  LON  SIN    N  0.500000
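
For anyone who wants to reproduce the result above, here is a minimal self-contained sketch of the same value_counts approach. The `rows` list is just the sample input from the question typed in by hand, and the names `is_n` and `result` are mine, not the answerer's. The merge keys are spelled out explicitly, which the original snippet leaves implicit.

```python
import pandas as pd

# Sample input from the question, rebuilt by hand (assumption: no other columns).
rows = [
    ("LON", "SIN", "Y"), ("ADL", "SIN", "N"), ("SIN", "LON", "N"),
    ("LON", "SIN", "Y"), ("LON", "SIN", "N"), ("ADL", "SIN", "Y"),
    ("ADL", "SIN", "N"), ("SIN", "LON", "Y"), ("SIN", "LON", "Y"),
    ("SIN", "LON", "Y"), ("SIN", "LON", "N"), ("LON", "SIN", "N"),
]
df = pd.DataFrame(rows, columns=["ORG", "DSTN", "FLAG"])

# Share of each FLAG value within every (ORG, DSTN) group.
s = (df.groupby(["ORG", "DSTN"])["FLAG"]
       .value_counts(normalize=True)
       .rename("Ratio")
       .reset_index())

# On N rows, store the Y share instead: 1 minus the N share.
is_n = s["FLAG"].eq("N")
s.loc[is_n, "Ratio"] = 1.0 - s.loc[is_n, "Ratio"]

# Left merge keeps the original row order of df.
result = df.merge(s, how="left", on=["ORG", "DSTN", "FLAG"])
print(result)
```

The 0.3 in the expected output is 1/3 rounded, since ADL/SIN has one Y out of three rows.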

You can also group, then replace every value with the group's proportion of FLAG == 'Y':

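# group_keys=False keeps the original row index so that assign can align the
# result (as far as I know this is needed on pandas >= 2.0, harmless before)
df.assign(Ratio=df.groupby(['ORG', 'DSTN'], group_keys=False).FLAG.apply(lambda x: x.replace('Y|N', (x == 'Y').mean(), regex=True)))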
Out[174]: 
    ORG DSTN FLAG     Ratio
0   LON  SIN    Y  0.500000
1   ADL  SIN    N  0.333333
2   SIN  LON    N  0.600000
3   LON  SIN    Y  0.500000
4   LON  SIN    N  0.500000
5   ADL  SIN    Y  0.333333
6   ADL  SIN    N  0.333333
7   SIN  LON    Y  0.600000
8   SIN  LON    Y  0.600000
9   SIN  LON    Y  0.600000
10  SIN  LON    N  0.600000
11  LON  SIN    N  0.500000
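
A further variant, not taken from either answer above: since the ratio is just the mean of a "FLAG is Y" indicator within each group, it can probably be written with `transform('mean')`, which avoids both the merge and the regex replace. This is a sketch under the assumption that FLAG only ever holds Y or N; the `rows` list is again the question's sample data typed in by hand, and `is_y` is a name I made up.

```python
import pandas as pd

# Sample input from the question, rebuilt by hand.
rows = [
    ("LON", "SIN", "Y"), ("ADL", "SIN", "N"), ("SIN", "LON", "N"),
    ("LON", "SIN", "Y"), ("LON", "SIN", "N"), ("ADL", "SIN", "Y"),
    ("ADL", "SIN", "N"), ("SIN", "LON", "Y"), ("SIN", "LON", "Y"),
    ("SIN", "LON", "Y"), ("SIN", "LON", "N"), ("LON", "SIN", "N"),
]
df = pd.DataFrame(rows, columns=["ORG", "DSTN", "FLAG"])

# 1.0 where FLAG is Y, 0.0 otherwise.
is_y = df["FLAG"].eq("Y").astype(float)

# transform returns one value per original row, so it aligns with df directly.
df["Ratio"] = is_y.groupby([df["ORG"], df["DSTN"]]).transform("mean")
print(df)
```

Because transform keeps the original index, the row order is unchanged and a group with no Y rows simply gets 0.0.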


 