
How to train-test split and cross-validate in Surprise?

The following code works:

from surprise.model_selection import cross_validate

cross_validate(algo, dataset, measures=['RMSE', 'MAE'], cv=5, verbose=False, n_jobs=-1)

However, when I pass the trainset to cross_validate instead of the whole dataset:

from surprise.model_selection import train_test_split
trainset, testset = train_test_split(dataset, test_size=test_size)
cross_validate(algo, trainset, measures=['RMSE', 'MAE'], cv=5, verbose=False, n_jobs=-1)

I get the following error:

AttributeError: 'Trainset' object has no attribute 'raw_ratings'

I looked it up, and the Surprise documentation says that Trainset objects are not the same as Dataset objects, which makes sense.

However, the documentation does not say how to convert a Trainset to a Dataset.

My questions are:

  1. Is it possible to convert a Surprise Trainset to a Surprise Dataset?
  2. If not, what is the correct way to train-test split the whole dataset and cross-validate?

  1. From my understanding, cross_validate performs the trainset/testset splits for you. So your first snippet is correct: with cv=5 it splits the dataset into 5 folds, and each fold in turn serves as the test set while the other 4 are used for training.

If you want a simple train/test split instead, see the train_test_split example in the docs.
