Using sklearn's bagging classifier to predict continuous values

Question

Can I use sklearn's BaggingClassifier to produce continuous predictions? Is there a similar package? My understanding is that the bagging classifier predicts several classifications with different models, then reports the majority answer. It seems like this algorithm could instead be used to generate a probability for each class and then report the mean value.

from sklearn.ensemble import BaggingClassifier, ExtraTreesClassifier

trees = BaggingClassifier(ExtraTreesClassifier())
trees.fit(X_train, Y_train)
Y_pred = trees.predict(X_test)

Answer 1

If you're interested in predicting probabilities for the classes in your classifier, you can use the predict_proba method, which gives you a probability for each class. It's a one-line change to your code:

trees = BaggingClassifier(ExtraTreesClassifier())
trees.fit(X_train,Y_train)
Y_pred = trees.predict_proba(X_test)

The shape of Y_pred will be [n_samples, n_classes].
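As a rough illustration of the shape described above, here is a minimal sketch on synthetic data. The dataset, the three-class setup, and the estimator sizes are assumptions made for the example, not part of the original question. It also checks that taking the most probable class from predict_proba gives the same labels as predict, which is how BaggingClassifier behaves when the base estimator supports predict_proba.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, ExtraTreesClassifier
from sklearn.model_selection import train_test_split

# Synthetic 3-class data (an assumption; it stands in for the asker's X and Y).
X, Y = make_classification(n_samples=200, n_features=6, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

trees = BaggingClassifier(ExtraTreesClassifier(n_estimators=10, random_state=0),
                          random_state=0)
trees.fit(X_train, Y_train)

# One row per test sample, one column per class; each row sums to 1.
proba = trees.predict_proba(X_test)
print(proba.shape)

# Picking the most probable class per row recovers hard labels.
labels = trees.classes_[np.argmax(proba, axis=1)]
print(np.array_equal(labels, trees.predict(X_test)))
```

The columns of proba follow the order of trees.classes_, so that attribute is the safe way to map a column index back to a class label.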

If your Y_train values are continuous and you want to predict those continuous values (i.e., you're working on a regression problem), then you can use the BaggingRegressor instead.
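For the regression case, a minimal sketch might look like the following. The synthetic data and the choice of ExtraTreesRegressor as the base estimator are assumptions that mirror the question's setup. The last step recomputes the mean of the individual estimators' predictions by hand, to show that the bagged prediction is that average.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, ExtraTreesRegressor
from sklearn.model_selection import train_test_split

# Synthetic continuous target (an assumption; it stands in for the asker's data).
X, Y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

trees = BaggingRegressor(ExtraTreesRegressor(n_estimators=10, random_state=0),
                         n_estimators=5, random_state=0)
trees.fit(X_train, Y_train)

# Continuous predictions, one float per test sample.
Y_pred = trees.predict(X_test)
print(Y_pred.shape, Y_pred.dtype)

# Average the fitted estimators by hand, using the feature subset each one saw.
per_model = np.array([
    est.predict(X_test[:, feats])
    for est, feats in zip(trees.estimators_, trees.estimators_features_)
])
manual_mean = per_model.mean(axis=0)
print(np.allclose(Y_pred, manual_mean))
```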

Answer 2

I typically use BaggingRegressor() for continuous values, and then compare performance using RMSE. Example below:

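import math
from sklearn import metrics
from sklearn.ensemble import BaggingRegressor

trees = BaggingRegressor()
trees.fit(X_train, Y_train)
# RMSE is the square root of the mean squared error on the test set
scores_RMSE = math.sqrt(metrics.mean_squared_error(Y_test, trees.predict(X_test)))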
