ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1

Question

I am trying to train a model using train_test_split and a decision tree regressor:

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

# TODO: Make a copy of the DataFrame, using the 'drop' function to drop the given feature
new_data = samples.drop('Fresh', axis=1)

# TODO: Split the data into training and testing sets using the given feature as the target
X_train, X_test, y_train, y_test = train_test_split(new_data, samples['Fresh'], test_size=0.25, random_state=0)

# TODO: Create a decision tree regressor and fit it to the training set
regressor = DecisionTreeRegressor(random_state=0)
regressor = regressor.fit(X_train, y_train)

# TODO: Report the score of the prediction using the testing set
score = cross_val_score(regressor, X_test, y_test, cv=3)

print(score)

When I run this, I get the error:

ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1.

If I change the value of cv to 1, I get:

ValueError: k-fold cross-validation requires at least one train/test split by setting n_splits=2 or more, got n_splits=1.

A few sample rows of the data look like this:

   Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicatessen
0  14755   899     1382    1765                56           749
1   1838  6380     2824    1218              1216           295
2  22096  3575     7041   11422               343          2564

Answer 1

The first error is raised when the number of splits is greater than the number of samples. See this snippet from the source code:

if self.n_splits > n_samples:
    raise ValueError(
        ("Cannot have number of splits n_splits={0} greater"
         " than the number of samples: {1}.").format(self.n_splits,
                                                     n_samples))
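
To see this check fire in isolation, here is a minimal sketch that assumes a recent scikit-learn where KFold lives in sklearn.model_selection. The array X is made-up data with a single row; it is not the asker's DataFrame. The exact wording of the message may differ slightly between versions.

```python
import numpy as np
from sklearn.model_selection import KFold

# Made-up data: one sample with five features (not the asker's DataFrame).
X = np.zeros((1, 5))

try:
    # split() is a generator, so list() is needed to make the check run.
    list(KFold(n_splits=3).split(X))
except ValueError as err:
    first_error = str(err)
    print(first_error)
```

Asking for three splits of a single sample is enough to trigger the error, which is the same situation cross_val_score ends up in when it is handed a one-row X_test with cv=3.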

The second error is raised when the number of folds is less than or equal to 1. In your case, cv = 1. See the source code:

if n_folds <= 1:
    raise ValueError(
        "k-fold cross validation requires at least one"
        " train / test split by setting n_folds=2 or more,"
        " got n_folds={0}.".format(n_folds))

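An educated guess: X_test contains fewer than 3 samples. The first error message reports "number of samples: 1", which suggests X_test holds only a single row. Double-check that.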
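
One way to double-check is to look at the size of the test set right after the split. The sketch below assumes, purely for illustration, that samples contains only the three rows shown in the question; the array rows is a hypothetical stand-in for that DataFrame.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for `samples`: just the three rows from the question.
rows = np.array([
    [14755,  899, 1382,  1765,   56,  749],
    [ 1838, 6380, 2824,  1218, 1216,  295],
    [22096, 3575, 7041, 11422,  343, 2564],
])
X, y = rows[:, 1:], rows[:, 0]  # 'Fresh' (first column) is the target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# With 3 rows and test_size=0.25, the test set is rounded up to one row.
n_test = len(X_test)
print(n_test)
```

Under that assumption the test set has one row, so cv=3 cannot work on it. If the real X_test is this small, cross-validating on a larger set of rows (with at least as many samples as cv) avoids the error.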
