How to train-test split and cross-validate in Surprise?

Question

I wrote the following code, which works:

from surprise.model_selection import cross_validate

cross_validate(algo,dataset,measures=['RMSE', 'MAE'],cv=5, verbose=False, n_jobs=-1)

However, when I do this (note that the trainset, not the whole dataset, is passed to cross_validate):

from surprise.model_selection import train_test_split
trainset, testset = train_test_split(dataset, test_size=test_size)
cross_validate(algo, trainset, measures=['RMSE', 'MAE'],cv=5, verbose=False, n_jobs=-1)

It gives the following error:

AttributeError: 'Trainset' object has no attribute 'raw_ratings'

I looked it up, and the Surprise documentation says that Trainset objects are not the same as Dataset objects, which makes sense.

However, the documentation does not say how to convert a Trainset to a Dataset.

My questions are:
1. Is it possible to convert a Surprise Trainset to a Surprise Dataset?
2. If not, what is the correct way to train-test split the whole dataset and then cross-validate?

Answer 1

From my understanding, cross_validate performs the trainset/testset splits for you. So your first snippet is correct: it splits the dataset into 5 folds (cv=5), and each fold in turn serves as the test set while the other 4 are used for training.

If you want a simple train/test split instead, see this example from the docs.

Answer 1: -1 2020-05-09 08:00:49