Ensemble learning in Python: Random Forest, SVM, KNN

I am trying to build an ensemble of three classifiers: Random Forest, SVM and KNN. To do this, I'm using VotingClassifier together with GridSearchCV. The code works fine if I use Logistic Regression, Random Forest and Gaussian Naive Bayes:

clf11 = LogisticRegression(random_state=1)
clf12 = RandomForestClassifier(random_state=1)
clf13 = GaussianNB()

But I'm a beginner and I don't know what I did wrong in the code below. Here is my attempt with Random Forest, KNN and SVM:

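from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score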

clf11 = RandomForestClassifier(n_estimators=100,criterion="entropy")
clf12 = KNeighborsClassifier(n_neighbors=best_k)
clf13 = SVC(kernel='rbf', probability=True)
eclf1 = VotingClassifier(estimators=[('lr', clf11), ('rf', clf12), ('gnb', clf13)],voting='hard')

params = {'lr__C': [1.0, 100.0], 'rf__n_estimators': [20, 200]}

grid1 = GridSearchCV(estimator=eclf1, param_grid=params, cv=30)
grid1.fit(X_train,y_train)
grid1_predicted = grid1.predict(X_test)
print('Accuracy score : {}%'.format(accuracy_score(y_test,grid1_predicted)*100))
scores_dict['Logistic-Random-Gaussian'] = accuracy_score(y_test,grid1_predicted)*100

Whenever I run this, I get:

Invalid parameter estimator VotingClassifier.

These are the errors I'm getting.

Is it possible to ensemble Random Forest, SVM and KNN?

If not, is there another way to do it?

The code posted is the following:

clf11 = RandomForestClassifier(n_estimators=100,criterion="entropy")
clf12 = KNeighborsClassifier(n_neighbors=best_k)
clf13 = SVC(kernel='rbf', probability=True)
eclf1 = VotingClassifier(estimators=[('lr', clf11), ('rf', clf12), ('gnb', clf13)],voting='hard')

params = {'lr__C': [1.0, 100.0], 'rf__n_estimators': [20, 200]}

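Here, the grid key 'lr__C' passes the hyperparameter C to the estimator named "lr", which is a RandomForestClassifier. RandomForestClassifier has no C parameter, so this will not work. Likewise, 'rf__n_estimators' targets the estimator named "rf", which is a KNeighborsClassifier and has no n_estimators parameter.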

You must use hyperparameters that are valid for the classifiers being used. The estimator names "lr", "rf" and "gnb" should also be replaced with names that match the actual classifiers, so that each key in the grid refers to a hyperparameter that is valid for the classifier it names.
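One way to check which keys a grid may use is to list the parameters the ensemble exposes through get_params(). The following is only a sketch: the names "rf", "knn" and "svc" are illustrative, n_neighbors=5 stands in for best_k (whose value is not shown in the question), and check_grid is a hypothetical helper, not part of scikit-learn.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Same three classifiers as in the question; n_neighbors=5 is an
# assumed placeholder for best_k.
eclf = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, criterion="entropy")),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("svc", SVC(kernel="rbf", probability=True)),
    ],
    voting="hard",
)

# Nested parameters appear as "<estimator name>__<parameter name>".
valid_names = sorted(eclf.get_params().keys())
print([n for n in valid_names if n.startswith(("rf__", "knn__", "svc__"))])


def check_grid(estimator, param_grid):
    """Hypothetical helper: return the grid keys the estimator does not accept."""
    valid = estimator.get_params()
    return [key for key in param_grid if key not in valid]


# "rf__C" is reported because RandomForestClassifier has no C parameter.
print(check_grid(eclf, {"rf__C": [1.0, 100.0], "knn__n_neighbors": [3, 5]}))
```

Any key that check_grid reports would make GridSearchCV fail with an "Invalid parameter" error when fit is called.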

The following would work:

clf11 = RandomForestClassifier(n_estimators=100,criterion="entropy")
clf12 = KNeighborsClassifier(n_neighbors=best_k)
clf13 = SVC(kernel='rbf', probability=True)
eclf1 = VotingClassifier(estimators=[('rf', clf11), ('knn', clf12), ('svc', clf13)],voting='hard')

params = {'svc__C': [1.0, 100.0], 'rf__n_estimators': [20, 200]}
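For completeness, here is a sketch of the corrected ensemble run end to end. The question's X_train, y_train and best_k are not shown, so this assumes a synthetic dataset from make_classification, n_neighbors=5 in place of best_k, and a smaller grid and cv=3 to keep the run short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the asker's data (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Estimator names match the classifiers they label.
eclf = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(criterion="entropy", random_state=1)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("svc", SVC(kernel="rbf")),
    ],
    voting="hard",
)

# Every key is "<estimator name>__<parameter valid for that estimator>".
params = {
    "svc__C": [1.0, 100.0],
    "rf__n_estimators": [20, 50],
    "knn__n_neighbors": [3, 5],
}

grid = GridSearchCV(estimator=eclf, param_grid=params, cv=3)
grid.fit(X_train, y_train)

accuracy = accuracy_score(y_test, grid.predict(X_test)) * 100
print("Best parameters:", grid.best_params_)
print("Accuracy score: {:.1f}%".format(accuracy))
```

With voting='hard' the SVC does not need probability=True, so the sketch omits it; class probabilities are only required for voting='soft'.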


 