Getting an error with random forest model using sklearn
I ran the following code to fit a random forest model. I used a Kaggle dataset:
Data link: https://www.kaggle.com/arnavr10880/winedataset-eda-ml/data?select=WineQT.csv
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold,cross_val_score,GridSearchCV
from sklearn import linear_model
from sklearn.ensemble import RandomForestRegressor
import numpy as np
data= pd.read_csv("C:/Users/Downloads/Model Test Data.csv")
y=data.loc[: ,["y"]]
x=data.iloc[:,1:]
x_train, x_test,y_train, y_test = train_test_split(x,y)
rf=RandomForestRegressor()
params = {
    'n_estimators' : [300,500],
    'max_depth' : np.array([8,9,12]),
    'random_state' : [0],
}
scoring = ["neg_mean_absolute_error","neg_mean_squared_error"]
for score in scoring:
    print("score %s" % scoring)
    clf= GridSearchCV(rf,param_grid=params,scoring="%s" %score,verbose=False)
    clf.fit(x_train,y_train)
    print("Best parameters:")
    print(clf.best_params_)
    means=clf.cv_results_["mean_test_score"]
    stds=clf.cv_results_["std_test_score"]
    for mean,sd,params in zip(means,stds, clf.cv_results_["params"]):
        print("%0.3f (+/-%0.3f) for %r" %(mean,2*sd,params) )
However, I get the following error:
Parameter grid for parameter (max_depth) needs to be a list or numpy array,
but got (<class 'int'>). Single values need to be wrapped in a list with one element.
Can anyone help me solve this?
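When you run your example, the first score in the for loop prints fine. Inspecting the params variable after that shows {'max_depth': 12, 'n_estimators': 500, 'random_state': 0}, so you have accidentally overwritten the params search space with one specific parameter combination. On the second pass through the loop, GridSearchCV receives that dict, in which max_depth is a plain int rather than a list, hence the error.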
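The overwrite can be reproduced without scikit-learn at all. The snippet below is a minimal sketch: `tried` is a hypothetical stand-in for `clf.cv_results_["params"]`, and the values are copied from the question only for illustration. It shows that a Python for loop does not open a new scope, so its loop variable rebinds any outer name it shares.

```python
# Stand-in for the search space from the question.
params = {
    "n_estimators": [300, 500],
    "max_depth": [8, 9, 12],
    "random_state": [0],
}

# Hypothetical stand-in for clf.cv_results_["params"]:
# one dict per parameter combination that was tried.
tried = [
    {"max_depth": 8, "n_estimators": 300, "random_state": 0},
    {"max_depth": 12, "n_estimators": 500, "random_state": 0},
]

# The loop variable shares its name with the grid, so each
# iteration rebinds params in the enclosing scope.
for params in tried:
    pass

# After the loop, params is the last combination, not the grid.
leaked = params
print(leaked)
print(type(leaked["max_depth"]))
```

After the loop, `max_depth` is a single int, which is exactly what the error message complains about.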
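Looking at your code again, this happens in the printing loop at the end, where the loop variable is also named params:
for mean,sd,params in zip(means,stds, clf.cv_results_["params"]):
    print("%0.3f (+/-%0.3f) for %r" %(mean,2*sd,params) )
So use a different variable name here.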
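One way the corrected loop might look is sketched below. This is not the asker's exact setup: synthetic data from `make_regression` replaces the CSV, the grid is shrunk and `cv=3` is set so it runs quickly, and the names `candidate` and `best` are illustrative choices. The only change that matters is that the inner loop no longer reuses the name `params`.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: synthetic data stands in for the CSV from the question.
x, y = make_regression(n_samples=120, n_features=5, noise=0.1, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

# Assumption: a deliberately small grid so the sketch runs quickly.
params = {
    "n_estimators": [10, 20],
    "max_depth": [2, 3],
    "random_state": [0],
}
scoring = ["neg_mean_absolute_error", "neg_mean_squared_error"]
best = {}  # best parameter combination found for each metric

for score in scoring:
    print("score %s" % score)
    clf = GridSearchCV(RandomForestRegressor(), param_grid=params,
                       scoring=score, cv=3)
    clf.fit(x_train, y_train)
    best[score] = clf.best_params_
    print("Best parameters:")
    print(clf.best_params_)
    means = clf.cv_results_["mean_test_score"]
    stds = clf.cv_results_["std_test_score"]
    # The loop variable is candidate, not params, so the grid
    # is still intact when the next metric is evaluated.
    for mean, sd, candidate in zip(means, stds, clf.cv_results_["params"]):
        print("%0.3f (+/-%0.3f) for %r" % (mean, 2 * sd, candidate))
```

With the rename in place, both scoring metrics are evaluated against the same grid.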