K-fold cross-validation to reduce overfitting: implementation problem

Question

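This is my first attempt at using cross-validation, and I am running into an error.

First, my dataset looks like this:

To avoid, or at least reduce, overfitting in my model, I am trying to use k-fold cross-validation.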

from sklearn.model_selection import KFold

X, y = creation_X_y()  # function that cleans my data
kf = KFold(n_splits=5)

for train_index, test_index in kf.split(X):
    print("Train:", train_index, "Validation:", test_index)
    X_train = X[train_index]
    X_test = X[test_index]
    y_train, y_test = y[train_index], y[test_index]

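However, I get the following error and have not found a way to fix it. I understand that it is looking for these values in the columns, but shouldn't it be looking in the index instead? For example, could I use X.loc[train_index]?

Thanks in advance for your time and help!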

Answer 1

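Your assumption is correct, with one adjustment: use .iloc[index] rather than .loc[index]. kf.split(X) yields integer row positions, and .iloc selects by position, whereas .loc selects by index label and X[...] on a DataFrame looks up column labels. Here is the code: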

from sklearn.model_selection import KFold

X, y = creation_X_y()  # function that cleans my data
kf = KFold(n_splits=5)

for train_index, test_index in kf.split(X):
    print("Train:", train_index, "Validation:", test_index)
    # kf.split yields integer row positions, so select rows by position
    X_train = X.iloc[train_index]
    X_test = X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]
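
To illustrate why position-based selection matters, here is a small sketch on made-up data. The DataFrame, its column name, and its index labels (100 to 109) are assumptions chosen so that the labels differ from the row positions; they are not the asker's data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical data: index labels 100..109 differ from row positions 0..9.
X = pd.DataFrame({"feature": np.arange(10)}, index=np.arange(100, 110))
y = pd.Series(np.arange(10) % 2, index=X.index)

kf = KFold(n_splits=5)
fold_sizes = []
for train_index, test_index in kf.split(X):
    # train_index and test_index are positions in 0..n-1, so use .iloc.
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    fold_sizes.append((len(X_train), len(X_test)))

# X.loc[test_index] would raise a KeyError on this data,
# because the positions 0..9 are not among the index labels.
```

With a default RangeIndex, .loc and .iloc happen to select the same rows, which can hide the problem until the data is filtered or re-indexed during cleaning.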

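Another approach is to have creation_X_y() return numpy.array objects instead, so that the original X[train_index] indexing works unchanged.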
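
A minimal sketch of the NumPy variant follows. The body of creation_X_y() here is a hypothetical stand-in (the real cleaning function is not shown in the question), and the column names "a", "b", and "target" are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold


def creation_X_y():
    # Hypothetical stand-in for the asker's cleaning function.
    df = pd.DataFrame({
        "a": np.arange(10.0),
        "b": np.arange(10.0) * 2,
        "target": np.arange(10) % 2,
    })
    # Convert to NumPy arrays before returning.
    X = df[["a", "b"]].to_numpy()
    y = df["target"].to_numpy()
    return X, y


X, y = creation_X_y()
kf = KFold(n_splits=5)

for train_index, test_index in kf.split(X):
    # NumPy arrays accept integer index arrays directly.
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
```

The trade-off is that column names and index labels are lost after the conversion, so this fits best when the downstream model only needs the raw values.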

Solution 1
1 2022-06-09 12:32:29