
TensorFlow Dataset train/test split

I am trying to load the coil100 dataset from TensorFlow Datasets. According to the documentation, this dataset comes with only a train split. I want to split it into train and test sets to experiment locally, but even after carefully reading the TensorFlow Datasets documentation I keep running into problems. This is my attempt:

import tensorflow_datasets as tfds

ds_train, ds_info = tfds.load(
    'coil100',
    split=['train'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

train = ds_train[0][0: 7000]
test = ds_train[0][7000:]

However, it leads to this error:

TypeError: '_OptionsDataset' object is not subscriptable

I am having a lot of trouble understanding how some datasets are prepared, since the returned data cannot be indexed, and this whole thing is not clearly explained in the docs. Is there any additional resource where I could finally understand how to deal with any dataset from this library?

See the TensorFlow Datasets documentation on splits and slicing. tfds.load returns tf.data.Dataset objects, which do not support indexing, so the slicing has to go into the split string instead. What you need is this:

tfds.load('coil100', split=['train[:7000]', 'train[7000:]'])
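A slightly fuller sketch of the same idea, in case it helps. The helper names split_specs and load_train_test are made up for illustration and are not part of TFDS; the only library call is tfds.load with the slicing strings from the answer. The boundary of 7000 comes from the question, and a percentage such as "80%" should work in the same position, since the slicing syntax accepts both absolute and percent boundaries.

```python
def split_specs(boundary, split="train"):
    """Build the two TFDS slicing strings for a given boundary.

    `boundary` may be an absolute index (7000) or a percentage ("80%").
    Hypothetical helper, written only for this example.
    """
    return [f"{split}[:{boundary}]", f"{split}[{boundary}:]"]


def load_train_test(name="coil100", boundary=7000):
    """Load one dataset per slicing string and return them with the info object.

    Hypothetical wrapper around tfds.load, written only for this example.
    """
    # Imported lazily so split_specs can be used without TFDS installed.
    import tensorflow_datasets as tfds

    # Passing a list of splits returns a list of datasets in the same order.
    (ds_train, ds_test), ds_info = tfds.load(
        name,
        split=split_specs(boundary),
        as_supervised=True,  # same flag as in the question
        with_info=True,
    )
    return ds_train, ds_test, ds_info
```

After calling load_train_test(), ds_train and ds_test are ordinary tf.data.Dataset objects: they are iterated (for example with ds_train.take(1)) rather than indexed, which is why the original ds_train[0][0:7000] raised the TypeError.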


 