
ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1

I am trying to train a model using train_test_split and a decision tree regressor:

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

# TODO: Make a copy of the DataFrame, using the 'drop' function to drop the given feature
new_data = samples.drop('Fresh', axis=1)

# TODO: Split the data into training and testing sets using the given feature as the target
X_train, X_test, y_train, y_test = train_test_split(new_data, samples['Fresh'], test_size=0.25, random_state=0)

# TODO: Create a decision tree regressor and fit it to the training set
regressor = DecisionTreeRegressor(random_state=0)
regressor = regressor.fit(X_train, y_train)

# TODO: Report the score of the prediction using the testing set
score = cross_val_score(regressor, X_test, y_test, cv=3)

print(score)

When I run this, I get the error:

ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1.

If I change the value of cv to 1, I get:

ValueError: k-fold cross-validation requires at least one train/test split by setting n_splits=2 or more, got n_splits=1.

Some sample rows of the data look like:

   Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicatessen
0  14755   899     1382    1765                56           749
1   1838  6380     2824    1218              1216           295
2  22096  3575     7041   11422               343          2564

If the number of splits is greater than the number of samples, you get the first error. The check in the scikit-learn source code looks like this:

if self.n_splits > n_samples:
    raise ValueError(
        ("Cannot have number of splits n_splits={0} greater"
         " than the number of samples: {1}.").format(self.n_splits,
                                                     n_samples))
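
A minimal sketch of this check firing, assuming a scikit-learn release where KFold lives in sklearn.model_selection (as in the question's imports) and assuming that cross_val_score with an integer cv falls back to plain KFold for a regressor. The helper name kfold_error and the dummy zero matrix are made up for illustration, and the exact wording of the message may differ between versions.

```python
import numpy as np
from sklearn.model_selection import KFold


def kfold_error(n_splits, n_samples):
    """Return the ValueError message KFold produces, or None if splitting succeeds."""
    # Dummy feature matrix (hypothetical); only the number of rows matters here.
    X = np.zeros((n_samples, 5))
    try:
        list(KFold(n_splits=n_splits).split(X))
    except ValueError as exc:
        return str(exc)
    return None


print(kfold_error(3, 1))   # more splits than samples: an error message
print(kfold_error(3, 10))  # enough samples: None
```

With a single row there is no way to form three folds, so the first call reports an error while the second does not.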

If the number of folds is less than or equal to 1, you get the second error. In your case, cv = 1. The corresponding check in the source code (taken from an older release, where the parameter was still called n_folds rather than n_splits):

if n_folds <= 1:
    raise ValueError(
        "k-fold cross validation requires at least one"
        " train / test split by setting n_folds=2 or more,"
        " got n_folds={0}.".format(n_folds))

An educated guess: the number of samples in X_test is less than 3. Check that carefully, for example by printing X_test.shape before calling cross_val_score.
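
To make that guess concrete, here is a rough sketch of the arithmetic. It assumes that train_test_split rounds a fractional test_size up (ceil(test_size * n_samples)), which matches the behaviour I know of but is worth confirming for your version. The function names and the row counts 12 and 400 are hypothetical. If samples held only the three rows shown in the question (also an assumption), the test set would contain exactly one row, which matches the "number of samples: 1" in the error.

```python
import math


def expected_test_size(n_samples, test_size=0.25):
    """Approximate number of rows train_test_split puts in the test set.

    Assumption: a fractional test_size is rounded up.
    """
    return math.ceil(test_size * n_samples)


def max_usable_cv(n_test_samples):
    """Largest fold count k-fold accepts for this many samples.

    k-fold needs 2 <= n_splits <= n_samples, so fewer than 2 samples
    leaves no valid choice and None is returned.
    """
    return n_test_samples if n_test_samples >= 2 else None


# Illustrative row counts only; 3 corresponds to the rows shown in the question.
for n_samples in (3, 12, 400):
    n_test = expected_test_size(n_samples)
    print(n_samples, n_test, max_usable_cv(n_test))
```

If the goal is only to report a score on the held-out test set, regressor.score(X_test, y_test) avoids the fold-count constraint altogether. Cross-validation is usually run on a larger portion of the data than a small test split.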
