k-fold cross validation procedure to split the data into training and test

Question

I'm trying to use a k-fold cross validation procedure to split the data into training and test subsets, but I'm not sure how to do this:

    import pandas as pd
    from sklearn import model_selection

    products = pd.read_csv("product_imgs.csv")

    kf = model_selection.KFold(n_splits=2, shuffle=True)
    for train_index, test_index in kf.split(products):
        print('train: %s, test: %s' % (products[train_index], products[test_index]))

Example of the data, which describes images (label 1 = cars, label 0 = vans):

Error:

KeyError: "None of [Int64Index([    0,     1,     2,     3,     5,     6,     7,     9,    10,\n               11,\n            ...\n            13981, 13982, 13986, 13987, 13989, 13990, 13993, 13995, 13996,\n            13997],\n           dtype='int64', length=7000)] are in the [columns]"

Answer 1

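The train_index and test_index returned by kf.split() are arrays of integer row positions, not column names. Writing products[train_index] asks the DataFrame for columns with those labels, which is why you get the KeyError. In your print call, select rows with .loc instead, as shown in the code below. This works here because read_csv gives the DataFrame a default 0..n-1 index, so labels and positions coincide; if the index has been changed (for example after filtering or set_index), use .iloc, which always selects by position.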

print('train: %s, test: %s' % (products.loc[train_index, :], products.loc[test_index, :]))
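For reference, here is a minimal, self-contained sketch of the same loop. It assumes a small made-up DataFrame in place of product_imgs.csv (the column names "image" and "label" are hypothetical, since the real file is not shown), fixes random_state only so the run is repeatable, and uses .iloc because the indices from kf.split() are positions.

```python
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical stand-in for product_imgs.csv; the column names are assumptions.
products = pd.DataFrame({
    "image": [f"img_{i}.jpg" for i in range(6)],
    "label": [1, 0, 1, 0, 1, 0],  # 1 = car, 0 = van
})

kf = KFold(n_splits=2, shuffle=True, random_state=0)

fold_sizes = []
for train_index, test_index in kf.split(products):
    # train_index and test_index are NumPy arrays of row positions,
    # so rows are selected by position with .iloc.
    train = products.iloc[train_index]
    test = products.iloc[test_index]
    fold_sizes.append((len(train), len(test)))
    print(f"train:\n{train}\ntest:\n{test}\n")
```

With 6 rows and n_splits=2, each fold trains on 3 rows and tests on the other 3, and every row lands in the test set exactly once across the two folds.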

Solution 1
0 2019-11-05 22:03:51
