Apply function to grouped data counts in pandas
>>> new_confirmIOC.groupby(['ErrorCode','ResponseType']).OrderID.count()
ErrorCode ResponseType
0 CANCEL_ORDER_CONFIRM 80
TRADE_CONFIRM 31
1 CANCEL_ORDER_CONFIRM 80
TRADE_CONFIRM 31
How do I add each count's percentage of its ErrorCode total, e.g. 80/111 and 31/111 for ErrorCode 0, and so on?
I tried:
new_confirmIOC.groupby(['ErrorCode','ResponseType']).OrderID.count().apply(lambda x: x / x.sum())
But it gives me:
ErrorCode ResponseType
0 CANCEL_ORDER_CONFIRM 1
TRADE_CONFIRM 1
1 CANCEL_ORDER_CONFIRM 1
TRADE_CONFIRM 1
Name: OrderID, dtype: int64
I think you need groupby by the first level (ErrorCode) and then divide each count by its group's sum:
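# Count orders per (ErrorCode, ResponseType); the result is a Series with a MultiIndex
df = new_confirmIOC.groupby(['ErrorCode','ResponseType']).OrderID.count()
# Regroup by ErrorCode only and divide each count by its group's total.
# group_keys=False keeps the original two-level index (recent pandas versions
# would otherwise prepend an extra ErrorCode level).
df = df.groupby(level='ErrorCode', group_keys=False).apply(lambda x: x / x.sum())
print(df)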
ErrorCode ResponseType
0 CANCEL_ORDER_CONFIRM 0.720721
TRADE_CONFIRM 0.279279
1 CANCEL_ORDER_CONFIRM 0.720721
TRADE_CONFIRM 0.279279
Name: val, dtype: float64
Another solution uses transform:
df = df.div(df.groupby(level='ErrorCode').transform('sum'))
print (df)
ErrorCode ResponseType
0 CANCEL_ORDER_CONFIRM 0.720721
TRADE_CONFIRM 0.279279
1 CANCEL_ORDER_CONFIRM 0.720721
TRADE_CONFIRM 0.279279
Name: val, dtype: float64
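To try the transform approach end to end, here is a minimal sketch. The DataFrame below is a made-up stand-in for new_confirmIOC (80 cancel confirmations and 31 trade confirmations per ErrorCode, matching the counts in the question), and the names counts, totals and share are only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for new_confirmIOC: 80 cancels and 31 trades per ErrorCode.
rows = []
for error_code in (0, 1):
    rows += [(error_code, "CANCEL_ORDER_CONFIRM")] * 80
    rows += [(error_code, "TRADE_CONFIRM")] * 31
new_confirmIOC = pd.DataFrame(rows, columns=["ErrorCode", "ResponseType"])
new_confirmIOC["OrderID"] = range(len(new_confirmIOC))  # dummy order ids

# Counts per (ErrorCode, ResponseType), as in the question.
counts = new_confirmIOC.groupby(["ErrorCode", "ResponseType"]).OrderID.count()

# transform("sum") returns a Series with the same index as counts,
# where every row holds the total of its ErrorCode group.
totals = counts.groupby(level="ErrorCode").transform("sum")

# Element-wise division gives each count's share of its group total.
share = counts.div(totals)
print(share)
```

Because transform keeps the original index, the division lines up row by row and no index manipulation is needed afterwards.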
Thank you FLab for the comment:
The result of .count is a Series, so the apply function would operate element by element (not on the entire column as it would for a pandas DataFrame).
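A small sketch can make the element-by-element behaviour visible. The Series and the helper record below are hypothetical; the helper simply logs every argument that Series.apply hands to it, so you can see that it receives individual values rather than a whole group.

```python
import pandas as pd

# Hypothetical counts for a single ErrorCode.
counts = pd.Series(
    [80, 31],
    index=["CANCEL_ORDER_CONFIRM", "TRADE_CONFIRM"],
    name="OrderID",
)

seen = []  # collects every argument the applied function receives


def record(x):
    """Log the argument and return it unchanged."""
    seen.append(x)
    return x


counts.apply(record)

# One call per element, and none of the arguments is a Series,
# so there is no group of values to sum over inside the function.
print(len(seen))
print(any(isinstance(x, pd.Series) for x in seen))

# Dividing the whole Series by its sum gives the intended shares.
share = counts / counts.sum()
print(share)
```

This is why the division has to happen on the Series (or on each group of it), not inside a function applied to single elements.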