How to do a groupby ratio calculation in pandas
I have a pandas dataframe as below. For each origin and destination combination, I need to calculate the success ratio of the FLAG column, i.e. the share of rows where its value equals Y.
Input
ORG DSTN FLAG
LON SIN Y
ADL SIN N
SIN LON N
LON SIN Y
LON SIN N
ADL SIN Y
ADL SIN N
SIN LON Y
SIN LON Y
SIN LON Y
SIN LON N
LON SIN N
Expected Output
ORG DSTN FLAG Ratio
LON SIN Y 0.5
ADL SIN N 0.3
SIN LON N 0.6
LON SIN Y 0.5
LON SIN N 0.5
ADL SIN Y 0.3
ADL SIN N 0.3
SIN LON Y 0.6
SIN LON Y 0.6
SIN LON Y 0.6
SIN LON N 0.6
LON SIN N 0.5
How can this be done in pandas?
Using value_counts with normalize=True:
s = (df.groupby(['ORG', 'DSTN']).FLAG
.value_counts(normalize=True).rename('Ratio').reset_index()
)
Then change the rows where FLAG equals N to their corresponding Y value (1 minus the N share), and merge the result back onto the original frame:
s.loc[s.FLAG.eq('N'), 'Ratio'] = 1.0 - s.Ratio
df.merge(s, how='left')
ORG DSTN FLAG Ratio
0 LON SIN Y 0.500000
1 ADL SIN N 0.333333
2 SIN LON N 0.600000
3 LON SIN Y 0.500000
4 LON SIN N 0.500000
5 ADL SIN Y 0.333333
6 ADL SIN N 0.333333
7 SIN LON Y 0.600000
8 SIN LON Y 0.600000
9 SIN LON Y 0.600000
10 SIN LON N 0.600000
11 LON SIN N 0.500000
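For anyone who wants to reproduce this, here is a minimal end-to-end sketch of the steps above. It assumes the sample data from the question is rebuilt by hand; the names `rows` and `out` are only illustrative and not part of the original answer.

```python
import pandas as pd

# Rebuild the sample frame from the question (assumed to match the Input table).
rows = [
    ("LON", "SIN", "Y"), ("ADL", "SIN", "N"), ("SIN", "LON", "N"),
    ("LON", "SIN", "Y"), ("LON", "SIN", "N"), ("ADL", "SIN", "Y"),
    ("ADL", "SIN", "N"), ("SIN", "LON", "Y"), ("SIN", "LON", "Y"),
    ("SIN", "LON", "Y"), ("SIN", "LON", "N"), ("LON", "SIN", "N"),
]
df = pd.DataFrame(rows, columns=["ORG", "DSTN", "FLAG"])

# Share of each FLAG value within every (ORG, DSTN) pair.
s = (
    df.groupby(["ORG", "DSTN"])["FLAG"]
    .value_counts(normalize=True)
    .rename("Ratio")
    .reset_index()
)

# For N rows, swap the N share for the Y share of the same pair.
s.loc[s["FLAG"].eq("N"), "Ratio"] = 1.0 - s["Ratio"]

# Left merge on the shared columns keeps the original row order of df.
out = df.merge(s, how="left")
print(out)
```

The merge joins on all shared columns (ORG, DSTN and FLAG), which is why the N rows have to carry the Y share before merging.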
You can also group, then replace every value with the group's proportion of FLAG == 'Y':
df.assign(Ratio=df.groupby(['ORG','DSTN']).FLAG.apply(lambda x:x.replace('Y|N',(x=='Y').mean(),regex=True)))
Out[174]:
ORG DSTN FLAG Ratio
0 LON SIN Y 0.500000
1 ADL SIN N 0.333333
2 SIN LON N 0.600000
3 LON SIN Y 0.500000
4 LON SIN N 0.500000
5 ADL SIN Y 0.333333
6 ADL SIN N 0.333333
7 SIN LON Y 0.600000
8 SIN LON Y 0.600000
9 SIN LON Y 0.600000
10 SIN LON N 0.600000
11 LON SIN N 0.500000
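The same idea can also be sketched with `transform`, which returns a result aligned to the original index. This may be a safer choice on newer pandas versions, where `groupby(...).apply` can return a result indexed by the group keys and so fail to line up inside `assign`. This is a variant, not the answer's own code, and the name `is_y` is only illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame({
    "ORG":  ["LON", "ADL", "SIN", "LON", "LON", "ADL",
             "ADL", "SIN", "SIN", "SIN", "SIN", "LON"],
    "DSTN": ["SIN", "SIN", "LON", "SIN", "SIN", "SIN",
             "SIN", "LON", "LON", "LON", "LON", "SIN"],
    "FLAG": ["Y", "N", "N", "Y", "N", "Y",
             "N", "Y", "Y", "Y", "N", "N"],
})

# Boolean mask: True where the flag is a success.
is_y = df["FLAG"].eq("Y")

# The mean of a boolean mask is the share of True values; transform
# broadcasts each group's mean back to every row of that group.
df["Ratio"] = is_y.groupby([df["ORG"], df["DSTN"]]).transform("mean")
print(df)
```

Because the mean of a boolean mask is the fraction of True values, no regex replacement or merge step is needed here.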