
How to split a tensorflow dataset into train, test and validation in a Python script?

In a Jupyter notebook with TensorFlow 2.0.0, I performed an 80-10-10 train-validation-test split this way:

import tensorflow_datasets as tfds
from os import getcwd
splits = tfds.Split.ALL.subsplit(weighted=(80, 10, 10))

filePath = f"{getcwd()}/../tmp2/"
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True, split=splits, data_dir=filePath)

However, when I try to run the same code locally, I get this error:

AttributeError: type object 'Split' has no attribute 'ALL'

I have seen that I can create two sets this way:

splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True, split=['train[:80]','test[80:90]'], data_dir=filePath)

but I do not know how to add a third set.

tfds.Split.ALL.subsplit and tfds.Split.TRAIN.subsplit are apparently deprecated and no longer supported.

Some datasets already come split into train and test. In that case I found the following solution (using the Fashion MNIST dataset as an example):

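# A slice applies only to the split name it follows, and bounds without "%" are
# example indices, so each split is sliced by percentage and the pieces joined with "+".
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True,
                         split=['train[:80%]+test[:80%]',
                                'train[80%:90%]+test[80%:90%]',
                                'train[90%:]+test[90%:]'],
                         data_dir=filePath)
(train_examples, validation_examples, test_examples) = splits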
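
When the same percentages have to be applied to several splits, the split strings can be generated rather than typed by hand. The following is a minimal sketch, not part of the original answer: percent_split_specs and load_three_way are hypothetical helper names, and the sketch assumes the dataset ships with splits named train and test.

```python
def percent_split_specs(split_names, boundaries):
    """Build TFDS split strings that cut every named split at the same percentages.

    split_names: names of the existing splits, e.g. ["train", "test"]
    boundaries:  cumulative percentages, e.g. [80, 90] for an 80/10/10 split
    """
    edges = [0, *boundaries, 100]
    specs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        start = "" if lo == 0 else f"{lo}%"
        stop = "" if hi == 100 else f"{hi}%"
        # A slice applies to one split name at a time, so repeat it per split.
        specs.append("+".join(f"{name}[{start}:{stop}]" for name in split_names))
    return specs


def load_three_way(name, data_dir=None):
    """Hypothetical wrapper: load train/validation/test subsets with tfds.load."""
    # Imported lazily so percent_split_specs can be used without TFDS installed.
    import tensorflow_datasets as tfds

    specs = percent_split_specs(["train", "test"], [80, 90])
    (train, validation, test), info = tfds.load(
        name, split=specs, as_supervised=True, with_info=True, data_dir=data_dir
    )
    return train, validation, test, info


print(percent_split_specs(["train", "test"], [80, 90]))
# ['train[:80%]+test[:80%]', 'train[80%:90%]+test[80%:90%]', 'train[90%:]+test[90%:]']
```

Because every subset takes a different percentage range of each original split, the three subsets should not share examples.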

The same approach works for me with the rock_paper_scissors dataset on tfds:

# Slice train and test separately by percentage, then join the pieces with "+".
splits = ['train[:80%]+test[:80%]',
          'train[80%:90%]+test[80%:90%]',
          'train[90%:]+test[90%:]']

splits, info = tfds.load('rock_paper_scissors', split=splits, as_supervised=True, with_info=True)

(train_examples, validation_examples, test_examples) = splits

num_examples = info.splits['train'].num_examples
num_classes = info.features['label'].num_classes
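
Note that info.splits['train'].num_examples counts only the dataset's original train split, not the combined training subset built above. A rough way to estimate the size of each combined subset is sketched below. subset_sizes is a hypothetical helper, the counts used are the well-known Fashion MNIST sizes (60000 train, 10000 test), and the rounding is an assumption: when a boundary does not fall on a whole example, TFDS may round differently.

```python
def subset_sizes(split_sizes, boundaries):
    """Estimate how many examples land in each percentage-based subset.

    split_sizes: examples per original split, e.g. {"train": 60000, "test": 10000}
                 (in practice these could come from info.splits[name].num_examples)
    boundaries:  cumulative percentages, e.g. [80, 90] for an 80/10/10 split
    """
    edges = [0, *boundaries, 100]
    sizes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        total = 0
        for n in split_sizes.values():
            # Integer arithmetic (assumption): exact only when n * pct / 100 is whole.
            total += n * hi // 100 - n * lo // 100
        sizes.append(total)
    return sizes


fashion_mnist = {"train": 60000, "test": 10000}
print(subset_sizes(fashion_mnist, [80, 90]))  # [56000, 7000, 7000]
```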


