Ensemble learning Python-Random Forest, SVM, KNN

I am trying to build an ensemble of the Random Forest, SVM and KNN classifiers, using VotingClassifier together with GridSearchCV. The code works fine when I use Logistic Regression, Random Forest and Gaussian Naive Bayes:

clf11 = LogisticRegression(random_state=1)
clf12 = RandomForestClassifier(random_state=1)
clf13 = GaussianNB()

But I can't tell what is wrong in the code below, as I'm a beginner. Here is my attempt with Random Forest, KNN and SVM:

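from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score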

clf11 = RandomForestClassifier(n_estimators=100,criterion="entropy")
clf12 = KNeighborsClassifier(n_neighbors=best_k)
clf13 = SVC(kernel='rbf', probability=True)
eclf1 = VotingClassifier(estimators=[('lr', clf11), ('rf', clf12), ('gnb', clf13)],voting='hard')

params = {'lr__C': [1.0, 100.0], 'rf__n_estimators': [20, 200]}

grid1 = GridSearchCV(estimator=eclf1, param_grid=params, cv=30)
grid1.fit(X_train,y_train)
grid1_predicted = grid1.predict(X_test)
print('Accuracy score : {}%'.format(accuracy_score(y_test,grid1_predicted)*100))
scores_dict['Logistic-Random-Gaussian'] = accuracy_score(y_test,grid1_predicted)*100

Whenever I run this I get

Invalid parameter estimator VotingClassifier.

These are the errors I'm getting (shown in two attached screenshots, not reproduced here).

Is it possible to ensemble Random Forest, SVM and KNN?

If not, is there any other way to do it?

The code posted is the following:

clf11 = RandomForestClassifier(n_estimators=100,criterion="entropy")
clf12 = KNeighborsClassifier(n_neighbors=best_k)
clf13 = SVC(kernel='rbf', probability=True)
eclf1 = VotingClassifier(estimators=[('lr', clf11), ('rf', clf12), ('gnb', clf13)],voting='hard')

params = {'lr__C': [1.0, 100.0], 'rf__n_estimators': [20, 200]}

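Here you are passing the hyperparameter C to the estimator named 'lr', which is a RandomForestClassifier. That will not work, because RandomForestClassifier has no C parameter. Likewise, 'rf__n_estimators' targets the estimator named 'rf', which is a KNeighborsClassifier and has no n_estimators parameter.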

You must use hyperparameters that are valid for the classifiers actually being used. The estimator names "lr", "rf" and "gnb" should be replaced by names that match the classifiers, and the grid should then contain only hyperparameters that are valid for each kind of classifier.
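One way to check which names are valid is to ask the ensemble itself: get_params() returns every parameter name that GridSearchCV will accept, in the name__parameter form. A minimal sketch, assuming n_neighbors=5 in place of best_k (whose value is not given in the question) and a small helper names_for that is only illustrative:

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Same three classifiers as in the question. n_neighbors=5 is an assumed
# stand-in for best_k, which the question does not define.
eclf = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, criterion="entropy")),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("svc", SVC(kernel="rbf", probability=True)),
    ],
    voting="hard",
)

# Every key returned by get_params() is a legal key for param_grid.
valid_names = sorted(eclf.get_params().keys())


def names_for(prefix):
    """Return the tunable parameter names of the estimator called `prefix`."""
    return [name for name in valid_names if name.startswith(prefix + "__")]


print(names_for("rf"))   # includes 'rf__n_estimators', but no 'rf__C'
print(names_for("knn"))  # includes 'knn__n_neighbors'
print(names_for("svc"))  # includes 'svc__C'
```

Any key in the grid that does not appear in this list is rejected when GridSearchCV tries to set it.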

The following would work:

clf11 = RandomForestClassifier(n_estimators=100,criterion="entropy")
clf12 = KNeighborsClassifier(n_neighbors=best_k)
clf13 = SVC(kernel='rbf', probability=True)
eclf1 = VotingClassifier(estimators=[('rf', clf11), ('knn', clf12), ('svc', clf13)],voting='hard')

params = {'svc__C': [1.0, 100.0], 'rf__n_estimators': [20, 200]}
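For completeness, here is a minimal end-to-end sketch of the corrected setup. The synthetic data from make_classification, n_neighbors=5 (standing in for best_k, which the question does not define) and cv=3 are assumptions chosen only to keep the example small and runnable; they do not come from the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Assumed toy data; replace with your own X_train, X_test, y_train, y_test.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Estimator names now match the classifiers they refer to.
clf11 = RandomForestClassifier(n_estimators=100, criterion="entropy", random_state=1)
clf12 = KNeighborsClassifier(n_neighbors=5)  # 5 is an assumed value for best_k
clf13 = SVC(kernel="rbf", probability=True)
eclf1 = VotingClassifier(
    estimators=[("rf", clf11), ("knn", clf12), ("svc", clf13)],
    voting="hard",
)

# Each key is <estimator name>__<parameter valid for that estimator>.
params = {"svc__C": [1.0, 100.0], "rf__n_estimators": [20, 200]}

# cv=3 is an assumption to keep the run short; the question used cv=30.
grid1 = GridSearchCV(estimator=eclf1, param_grid=params, cv=3)
grid1.fit(X_train, y_train)

grid1_predicted = grid1.predict(X_test)
accuracy = accuracy_score(y_test, grid1_predicted) * 100
print("Best parameters: {}".format(grid1.best_params_))
print("Accuracy score : {}%".format(accuracy))
```

Since voting='hard' only uses predicted class labels, probability=True on the SVC is not strictly needed here; it becomes necessary if you switch to voting='soft'.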
