Ensemble two models with Python

I have a regression task and am predicting with linear regression and random forest models. I need some hints or a code example showing how to ensemble them (simple averaging is already done). Here are my model implementations in Python:

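import numpy as np

# Random ~70/30 train/test split via a boolean row mask.
# NOTE: the mask is built from happiness2 but applied to happiness22 below,
# so the two frames must have the same number of rows.
np.random.seed(42)
mask = np.random.rand(happiness2.shape[0]) <= 0.7

print('Train set shape {0}, test set shape {1}'.format(happiness2[mask].shape, happiness2[~mask].shape))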

from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(happiness22[mask].drop(['Country', 'Happiness_Score_2017',
                               'Happiness_Score_2018','Happiness_Score_2019'], axis=1).fillna(0), 
       happiness22[mask]['Happiness_Score_2019'] )

pred = lr.predict(happiness22[~mask].drop(['Country', 'Happiness_Score_2017',
                               'Happiness_Score_2018','Happiness_Score_2019'], axis=1).fillna(0)) 
print('RMSE = {0:.04f}'.format(np.sqrt(np.mean((pred - happiness22[~mask]['Happiness_Score_2019'])**2)))) 

from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor(n_estimators=100)
rf.fit(happiness22[mask].drop(['Country', 'Happiness_Score_2017',
                               'Happiness_Score_2018','Happiness_Score_2019'], axis=1).fillna(0), 
       happiness22[mask]['Happiness_Score_2019'] )
pred3 = rf.predict(happiness22[~mask].drop(['Country', 'Happiness_Score_2017',
                               'Happiness_Score_2018','Happiness_Score_2019'], axis=1).fillna(0))
print('RMSE = {0:.04f}'.format(np.sqrt(np.mean((pred3 - happiness22[~mask]['Happiness_Score_2019'])**2))))

avepred=(pred+pred3)/2
print('RMSE = {0:.04f}'.format(np.sqrt(np.mean((avepred - happiness22[~mask]['Happiness_Score_2019'])**2))))

First, evaluate each model (linear regression and random forest) on a validation set and compute its error (MSE, for instance). Then weight each model according to this error, giving the lower-error model the larger weight, and use these weights when combining the predictions.
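One possible way to turn the validation errors into weights is to make each weight proportional to 1/MSE and normalize so the weights sum to 1. The sketch below is only an illustration of that idea, not the only valid scheme. The helper names (`inverse_mse_weights`, `weighted_prediction`) and the toy arrays are hypothetical and stand in for real model predictions.

```python
import numpy as np

def inverse_mse_weights(y_val, val_preds):
    """Return one weight per model, proportional to 1/MSE on the validation set.

    Hypothetical helper: inverse-MSE is one common choice, assumed here.
    """
    y_val = np.asarray(y_val, dtype=float)
    mse = np.array([np.mean((np.asarray(p, dtype=float) - y_val) ** 2)
                    for p in val_preds])
    inv = 1.0 / np.maximum(mse, 1e-12)  # guard against division by zero
    return inv / inv.sum()              # normalize so the weights sum to 1

def weighted_prediction(preds, weights):
    """Combine per-model predictions (n_models x n_samples) with the given weights."""
    preds = np.asarray(preds, dtype=float)
    return np.average(preds, axis=0, weights=weights)

# Toy numbers standing in for validation predictions of two models.
y_val = np.array([1.0, 2.0, 3.0, 4.0])
val_pred_a = np.array([1.0, 2.0, 3.0, 6.0])  # validation MSE = 1.0
val_pred_b = np.array([3.0, 4.0, 5.0, 6.0])  # validation MSE = 4.0

weights = inverse_mse_weights(y_val, [val_pred_a, val_pred_b])

# Toy numbers standing in for the two models' test-set predictions.
test_pred_a = np.array([5.0, 6.0])
test_pred_b = np.array([10.0, 1.0])
ensemble = weighted_prediction([test_pred_a, test_pred_b], weights)

print('weights  =', weights)
print('ensemble =', ensemble)
```

With the models from the question, `val_pred_a` and `val_pred_b` would be `lr.predict(X_val)` and `rf.predict(X_val)`, where `X_val` is an assumed validation split carved out of the training rows. It should stay separate from the `~mask` rows, so that the final RMSE is still measured on data that was not used to choose the weights.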

You can also use the COBRA ensemble method (developed by Guedj et al.): https://modal.lille.inria.fr/pycobra/
