ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1
I am trying to train a model using train_test_split and a decision tree regressor:
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score
# TODO: Make a copy of the DataFrame, using the 'drop' function to drop the given feature
new_data = samples.drop('Fresh', axis=1)
# TODO: Split the data into training and testing sets using the given feature as the target
X_train, X_test, y_train, y_test = train_test_split(new_data, samples['Fresh'], test_size=0.25, random_state=0)
# TODO: Create a decision tree regressor and fit it to the training set
regressor = DecisionTreeRegressor(random_state=0)
regressor = regressor.fit(X_train, y_train)
# TODO: Report the score of the prediction using the testing set
score = cross_val_score(regressor, X_test, y_test, cv=3)
print(score)
When running this, I get the error:
ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1.
If I change the value of cv to 1, I get:
ValueError: k-fold cross-validation requires at least one train/test split by setting n_splits=2 or more, got n_splits=1.
Some sample rows of the data look like:
   Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicatessen
0  14755   899     1382    1765                56           749
1   1838  6380     2824    1218              1216           295
2  22096  3575     7041   11422               343          2564
If the number of splits is greater than the number of samples, you will get the first error. Check the snippet from the source code given below:
if self.n_splits > n_samples:
    raise ValueError(
        ("Cannot have number of splits n_splits={0} greater"
         " than the number of samples: {1}.").format(self.n_splits,
                                                     n_samples))
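As a rough illustration of how n_samples can end up as 1: cross_val_score in the question receives X_test, and for a float test_size, train_test_split rounds the test-set size up, i.e. ceil(test_size * n_rows). The sketch below reproduces only that arithmetic in plain Python. The names n_test_rows and smallest_frame_for_cv are hypothetical helpers for this illustration, not scikit-learn functions.

```python
from math import ceil


def n_test_rows(n_rows, test_size=0.25):
    """Rows placed in the test set for a float test_size (rounded up)."""
    return ceil(test_size * n_rows)


def smallest_frame_for_cv(cv, test_size=0.25):
    """Smallest number of input rows whose test set has at least `cv` rows."""
    n_rows = 1
    while n_test_rows(n_rows, test_size) < cv:
        n_rows += 1
    return n_rows


# Illustrative frame sizes only; none of these is taken from the question.
for n_rows in (3, 4, 9, 12):
    print(n_rows, "rows ->", n_test_rows(n_rows), "test rows")

print("rows needed for cv=3:", smallest_frame_for_cv(3))
```

Under this arithmetic, a test set of one row with test_size=0.25 implies an input frame of at most four rows, while cv=3 on X_test needs at least nine rows before the split.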
If the number of folds is less than or equal to 1, you will get the second error. That is your case when you set cv = 1. Check the source code:
if n_folds <= 1:
    raise ValueError(
        "k-fold cross validation requires at least one"
        " train / test split by setting n_folds=2 or more,"
        " got n_folds={0}.".format(n_folds))
An educated guess: the number of samples in X_test is less than 3 (the first error message reports 1). Check that carefully.
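One way to check this before calling cross_val_score is a small pre-flight test on the cv value and the row count. The sketch below is a plain-Python approximation of the two conditions in the quoted source snippets. check_cv is a hypothetical helper name, its messages are simplified rather than scikit-learn's exact wording, and the sizes passed in are illustrative.

```python
def check_cv(n_splits, n_samples):
    """Return None if k-fold is feasible, otherwise a short reason string.

    Hypothetical helper that mirrors the two conditions from the quoted
    source snippets; it is not part of scikit-learn.
    """
    if n_splits <= 1:
        return "need n_splits >= 2, got {0}".format(n_splits)
    if n_splits > n_samples:
        return "n_splits={0} exceeds n_samples={1}".format(n_splits, n_samples)
    return None


# In the question's code, len(X_test) would be passed as n_samples.
print(check_cv(3, 1))    # the failing combination from the question
print(check_cv(1, 1))    # cv=1 is rejected regardless of the data size
print(check_cv(3, 110))  # 110 is an arbitrary illustrative size
```

If the check returns a reason, the options are a smaller cv (but at least 2) or more rows in the data passed to cross_val_score.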