How to load images and labels separately in a dataset loaded by tensorflow_datasets
import tensorflow_datasets as tfds
train_ds = tfds.load('cifar100', split='train[:90%]').shuffle(1024).batch(32)
val_ds = tfds.load('cifar100', split='train[-10%:]').shuffle(1024).batch(32)
I want to convert train_ds and val_ds into something like x_train, y_train and x_val, y_val (x for images and y for labels). The Keras API uses a train/test data split (this seems to be the case in sklearn too), but I do not want to use any test data at all here.
I have tried this, but it didn't work. I understand why it doesn't work, but I don't know how else to convert my training data into images and labels:
x_train = train_ds['image']
# TypeError: 'BatchDataset' object is not subscriptable
This is not the best way, but I first created lists so that I could inspect them. I think you want something like:
train_ds = tfds.load('mnist', split='train[:90%]')
# tfds.as_numpy yields one dict of NumPy arrays per (unbatched) example
train_examples_labels = tfds.as_numpy(train_ds)
x_train = []
y_train = []
for features_labels in train_examples_labels:
    x_train.append(features_labels['image'])
    y_train.append(features_labels['label'])
features_labels is a dictionary here:
features_labels.keys()
dict_keys(['image', 'label'])
Afterwards you can convert the lists into NumPy arrays:
import numpy as np
x_train = np.array(x_train, dtype = 'float32')
y_train = np.array(y_train, dtype = 'float32')
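The loop above reads one example at a time from an unbatched dataset. If the dataset has already been batched, as in the question, each element is a dict of batch arrays, so the pieces can be concatenated along the first axis instead. Here is a minimal sketch of that idea. The names batches_to_arrays and fake_batches are hypothetical, and the synthetic batches only stand in for what iterating over tfds.as_numpy(train_ds) would be expected to yield, so that nothing has to be downloaded:

```python
import numpy as np

def batches_to_arrays(batches):
    """Concatenate an iterable of {'image', 'label'} batch dicts into two arrays."""
    images, labels = [], []
    for batch in batches:
        images.append(batch["image"])
        labels.append(batch["label"])
    # Join the batches along axis 0, the example axis
    return np.concatenate(images, axis=0), np.concatenate(labels, axis=0)

def fake_batches(num_examples=10, batch_size=4):
    """Synthetic stand-in (an assumption) for a batched image dataset."""
    for start in range(0, num_examples, batch_size):
        n = min(batch_size, num_examples - start)  # the last batch may be smaller
        yield {
            "image": np.zeros((n, 32, 32, 3), dtype=np.uint8),
            "label": np.arange(start, start + n, dtype=np.int64),
        }

x_demo, y_demo = batches_to_arrays(fake_batches())
print(x_demo.shape, y_demo.shape)
```

With a real dataset, the same function could presumably be called as batches_to_arrays(tfds.as_numpy(train_ds)). Note that shuffle() changes the order of the examples, but images and labels stay paired because they travel in the same dict.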
I found a better solution:
# batch_size=-1 loads each split as one full batch; as_supervised=True returns (image, label) tuples
train_ds, val_ds = tfds.load(name="cifar100", split=('train[:90%]','train[-10%:]'), batch_size=-1, as_supervised=True)
x_train, y_train = tfds.as_numpy(train_ds)
x_val, y_val = tfds.as_numpy(val_ds)
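As a quick sanity check on the two split strings, the arithmetic can be worked out by hand. The sketch below uses a hypothetical helper, percent_slice_bounds, which is not part of tensorflow_datasets and ignores its rounding options. It assumes the cifar100 training split holds 50,000 examples, which divides evenly by 100:

```python
def percent_slice_bounds(num_examples, start_pct=None, stop_pct=None):
    """Hypothetical helper: index range selected by a slice like 'train[a%:b%]'."""
    def to_index(pct, default):
        if pct is None:
            return default
        idx = num_examples * pct // 100
        # Negative percentages count back from the end, like Python slices
        return idx + num_examples if pct < 0 else idx
    return to_index(start_pct, 0), to_index(stop_pct, num_examples)

NUM_TRAIN = 50_000  # assumed size of the cifar100 'train' split

train_bounds = percent_slice_bounds(NUM_TRAIN, stop_pct=90)   # 'train[:90%]'
val_bounds = percent_slice_bounds(NUM_TRAIN, start_pct=-10)   # 'train[-10%:]'
print(train_bounds, val_bounds)
```

Under these assumptions the training slice ends exactly where the validation slice begins, so the two splits should not overlap and together cover the whole training set.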