
ValueError when training a model with RandomForestClassifier

from sklearn import ensemble
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder
import time
from sklearn import metrics
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
enc = preprocessing.OneHotEncoder()
onehotencoder = OneHotEncoder(categories='auto')
enc.fit(X)
onehotlabels = enc.transform(X).toarray()
onehotlabels.shape
clf=RandomForestClassifier(n_estimators=10)
clf.fit(X_train,y_train)
y_pred = clf.predict(X_test)
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))
predict = clf.predict(X_test)
print("Evaluation on Test Set",predict)

I am using the code above to train a model with RandomForestClassifier, but I get the following error:

ValueError: could not convert string to float: 'gorilla'

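I can't tell for sure from your code, because the data structures of X, X_train and X_test are not shown. However, I suspect that the onehotlabels variable is never used: the encoder output is computed, but the classifier is fitted on X_train, which presumably still holds the raw strings. If the one-hot encoded data had been passed to the classifier, the string 'gorilla' would not have appeared in it.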
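To illustrate the suspected cause, here is a minimal sketch that fits the classifier directly on raw string features. The toy arrays X_raw and y are hypothetical and are not taken from the question; the point is only that unencoded strings reaching fit() raise a ValueError of this kind.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data: one categorical string column, never encoded.
X_raw = np.array([["gorilla"], ["cat"], ["gorilla"], ["cat"]])
y = np.array([1, 0, 1, 0])

clf = RandomForestClassifier(n_estimators=10, random_state=0)

try:
    # The forest needs numeric input, so scikit-learn tries to cast the
    # strings to float and fails on the first one it cannot convert.
    clf.fit(X_raw, y)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```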

So I suggest checking that the train/test split is performed on the encoded array rather than on the raw X, for example:

X_train, X_test = train_test_split(onehotlabels)
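Putting the pieces together, a fuller sketch might look like the following. It assumes a single categorical column and a binary label, both invented here for illustration (X, y and the variable name X_encoded are not from the question), and it passes the labels to train_test_split as well so that features and targets stay aligned.

```python
import numpy as np
from sklearn import metrics
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for the asker's X and y.
X = np.array([["gorilla"], ["cat"], ["dog"], ["gorilla"],
              ["cat"], ["dog"], ["gorilla"], ["cat"]])
y = np.array([1, 0, 0, 1, 0, 0, 1, 0])

# Encode the string categories as 0/1 indicator columns.
enc = OneHotEncoder(handle_unknown="ignore")
X_encoded = enc.fit_transform(X).toarray()

# Split the ENCODED features (not the raw X) together with the labels.
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X_train, y_train)  # numeric input only, so no string conversion error
y_pred = clf.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
```

In a real project it is usually safer to fit the encoder on the training split only (or to wrap it in a Pipeline) so that no information from the test set leaks into preprocessing; the sketch encodes first only to mirror the order used in the question.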

