TensorFlow Dataset train/test split

Question

I am trying to load the coil100 dataset from TensorFlow Datasets. According to the documentation, this dataset comes with only a train split. I want to split it into train and test sets to experiment locally, but even after carefully reading the TensorFlow Datasets documentation I keep running into problems. This is my attempt:

import tensorflow_datasets as tfds

ds_train, ds_info = tfds.load(
    'coil100',
    split=['train'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

train = ds_train[0][0: 7000]
test = ds_train[0][7000:]

However, it leads to this error:

TypeError: '_OptionsDataset' object is not subscriptable

I am having a lot of trouble understanding how some datasets are prepared, since the returned data are not subscriptable, and this whole thing is not clearly explained in the docs. Is there any additional resource that would help me understand how to work with any dataset from this library?

Answer 1

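See the TensorFlow Datasets documentation: Splits and Slicing. tfds.load returns tf.data.Dataset objects, which cannot be indexed with [], so the slicing has to be expressed in the split argument instead. What you need is this: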

tfds.load('coil100', split=['train[:7000]', 'train[7000:]'])
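
As a fuller illustration of the call above, here is a minimal sketch that builds the two slicing strings and unpacks the result. The helper names split_specs and load_train_test are hypothetical (they are not part of TFDS), and the other arguments are simply copied from the question; treat it as one possible arrangement rather than the canonical way to do it.

```python
def split_specs(n_train, split="train"):
    """Build TFDS slicing strings for a cut at n_train (hypothetical helper)."""
    return [f"{split}[:{n_train}]", f"{split}[{n_train}:]"]


def load_train_test(name="coil100", n_train=7000):
    """Load two disjoint slices of the only available split (hypothetical helper)."""
    # Imported lazily so split_specs() can be used without TFDS installed.
    import tensorflow_datasets as tfds

    # Passing a list of split strings returns a list of datasets in the same order.
    (ds_train, ds_test), ds_info = tfds.load(
        name,
        split=split_specs(n_train),
        as_supervised=True,  # assumption: same setting as in the question
        with_info=True,
    )
    return ds_train, ds_test, ds_info
```

The returned objects are tf.data.Dataset instances, so they are consumed by iterating (for example, for image, label in ds_train.take(1): ...) rather than by indexing. The slicing syntax also accepts percentages, so split_specs("80%") yields 'train[:80%]' and 'train[80%:]'.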

solution1
2 ACCEPTED 2020-12-21 13:58:39