How to use multiple lambda functions with a pandas groupby
I'm calculating various evaluation metrics for a dataframe with several groups. Here's my code:
import pandas as pd
from sklearn.metrics import mean_squared_error, median_absolute_error
temp = pd.DataFrame({"group":['A','A','B','B'],"actual":[10,2,3,4],"pred":[0,1,2,3]})
temp.groupby("group").apply(lambda x : mean_squared_error(x['actual'],x['pred'])).to_frame('MSE').reset_index()
group MSE
0 A 50.5
1 B 1.0
temp.groupby("group").apply(lambda x : median_absolute_error(x['actual'],x['pred'])).to_frame('MAE').reset_index()
group MAE
0 A 5.5
1 B 1.0
If I have 5 metrics, then I need to write groupby + apply 5 times, or run it as a loop. Is there a native way in pandas to compute several such metrics on a single groupby object?
Maybe something like this:
temp.groupby("group").agg({"MAE": lambda x : median_absolute_error(x['actual'],x['pred']),"MSE": lambda x : mean_squared_error(x['actual'],x['pred'])})
group MSE MAE
0 A 50.5 5.5
1 B 1.0 1.0
The code is wrong, but I think you get what I'm trying to do.
Try returning a Series from groupby apply:
new_df = temp.groupby("group", as_index=False).apply(
    lambda x: pd.Series({'MSE': mean_squared_error(x['actual'], x['pred']),
                         'MAE': median_absolute_error(x['actual'], x['pred'])})
)
new_df:
group MSE MAE
0 A 50.5 5.5
1 B 1.0 1.0
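If the list of metrics grows, the same idea can be driven by a dictionary so that each new metric is one extra entry. The following is only a sketch: the names `METRICS`, `evaluate`, `mse` and `median_ae` are made up for illustration, and the two metric functions are small NumPy stand-ins so the snippet runs without scikit-learn (`mean_squared_error` and `median_absolute_error` should be usable in their place). Selecting the `actual` and `pred` columns before `apply` is an assumption about what the metrics need; it also keeps the grouping column out of the function, which some recent pandas releases warn about.

```python
import numpy as np
import pandas as pd


def mse(actual, pred):
    # Mean squared error (stand-in for sklearn's mean_squared_error).
    diff = np.asarray(actual, dtype=float) - np.asarray(pred, dtype=float)
    return float(np.mean(diff ** 2))


def median_ae(actual, pred):
    # Median absolute error (stand-in for sklearn's median_absolute_error).
    diff = np.asarray(actual, dtype=float) - np.asarray(pred, dtype=float)
    return float(np.median(np.abs(diff)))


# Hypothetical registry: output column name -> metric function.
METRICS = {"MSE": mse, "MAE": median_ae}


def evaluate(group):
    # Return one Series per group; pandas stacks them into a DataFrame
    # with one column per metric.
    return pd.Series(
        {name: fn(group["actual"], group["pred"]) for name, fn in METRICS.items()}
    )


temp = pd.DataFrame({"group": ["A", "A", "B", "B"],
                     "actual": [10, 2, 3, 4],
                     "pred": [0, 1, 2, 3]})

result = temp.groupby("group")[["actual", "pred"]].apply(evaluate).reset_index()
print(result)
```

Adding a third metric then only means adding another key to `METRICS`; the groupby call itself stays the same.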