
How can I properly get my Dataset to create?

I have the following code:

imagepaths = tf.convert_to_tensor(imagepaths, dtype=tf.string)
labels = tf.convert_to_tensor(labels, dtype=tf.int32)

# Build a TF Queue, shuffle data
image, label = tf.data.Dataset.from_tensor_slices((imagepaths, labels))

and am getting the following error:

image, label = tf.data.Dataset.from_tensor_slices((imagepaths, labels))
ValueError: too many values to unpack (expected 2)

Shouldn't Dataset.from_tensor_slices treat this as the length of the tensor rather than the number of inputs? How can I fix this issue, or combine the data tensors into one variable more effectively? For reference: there are 1800 imagepaths and 1800 labels corresponding to each other, and to be clear, the imagepaths are paths to the files where the jpg images are located. My goal after this is to shuffle the dataset and build the neural network model.
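The unpack fails because from_tensor_slices returns a single Dataset object, not an (image, label) pair; tuple unpacking then iterates the dataset and finds 1800 elements instead of the 2 it expected. A minimal sketch of the working pattern, with two toy paths standing in for the real 1800:

```python
import tensorflow as tf

# Toy stand-ins for the real 1800 imagepaths and labels
imagepaths = tf.constant(["img_0.jpg", "img_1.jpg"])
labels = tf.constant([0, 1], dtype=tf.int32)

# from_tensor_slices returns ONE Dataset whose elements are
# (path, label) pairs -- assign it to a single variable
dataset = tf.data.Dataset.from_tensor_slices((imagepaths, labels))

# Each element is a pair; unpack per element, not at creation time
for image_path, label in dataset:
    print(image_path.numpy(), label.numpy())
```

Unpacking happens inside the loop, one element at a time, which is what the question's `image, label = ...` line was trying to do all at once.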

That code is right here:

# Read images from disk
image = tf.read_file(image)
image = tf.image.decode_jpeg(image, channels=CHANNELS)

# Resize images to a common size
image = tf.image.resize_images(image, [IMG_HEIGHT, IMG_WIDTH])

# Normalize
image = image * 1.0/127.5 - 1.0

# Create batches
X, Y = tf.train.batch([image, label], batch_size=batch_size,
                      capacity=batch_size * 8,
                      num_threads=4)
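Note that tf.train.batch belongs to the old queue-based input pipeline; in the tf.data API, the same read/decode/resize/normalize steps move into a function passed to Dataset.map. A runnable sketch under assumed placeholder values for IMG_HEIGHT, IMG_WIDTH, and CHANNELS, writing one dummy jpg so the pipeline has a real file to read:

```python
import tensorflow as tf

IMG_HEIGHT, IMG_WIDTH, CHANNELS = 64, 64, 3  # placeholder sizes

def load_image(path, label):
    # Read the file, decode the jpg, resize, and normalize to [-1, 1]
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image, channels=CHANNELS)
    image = tf.image.resize(image, [IMG_HEIGHT, IMG_WIDTH])
    image = image / 127.5 - 1.0
    return image, label

# Write one small dummy jpg so the sketch runs end to end
pixels = tf.cast(
    tf.random.uniform([8, 8, 3], maxval=256, dtype=tf.int32), tf.uint8)
tf.io.write_file("/tmp/dummy.jpg", tf.io.encode_jpeg(pixels))

dataset = tf.data.Dataset.from_tensor_slices((["/tmp/dummy.jpg"], [0]))
dataset = dataset.map(load_image).batch(1)
```

The map function runs lazily per element, so the 1800 images are only read and decoded as the dataset is consumed.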

Try to do this:

def transform(image_path, label):
  # each dataset element is an (image_path, label) pair, so the
  # map function receives two arguments
  return image_path, label

# pass the two tensors as a tuple; zipping them into a Python list
# would try to pack the string paths and int labels into a single
# tensor and fail on the mixed dtypes
dataset = tf.data.Dataset.from_tensor_slices((imagepaths, labels))
dataset = dataset.map(transform)

and if you want to have a look at your dataset, you can do it like this:

for e in dataset.take(1):
    print(e)

You can chain multiple map functions, and after that use shuffle and batch on your dataset to prepare it for training ;)
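For the shuffle-and-batch step the answer mentions, a minimal sketch (toy paths and labels stand in for the real 1800; the buffer_size and batch size values are illustrative, not prescribed by the answer):

```python
import tensorflow as tf

imagepaths = tf.constant(["a.jpg", "b.jpg", "c.jpg", "d.jpg"])
labels = tf.constant([0, 1, 0, 1], dtype=tf.int32)

dataset = (
    tf.data.Dataset.from_tensor_slices((imagepaths, labels))
    .shuffle(buffer_size=4)   # buffer covering the whole (toy) set
    .batch(2)                 # yield (paths, labels) in groups of 2
)

for paths, lbls in dataset:
    print(paths.shape, lbls.shape)  # (2,) (2,)
```

For a true uniform shuffle, buffer_size should be at least the dataset size (1800 in the question's case); smaller buffers trade shuffle quality for memory.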

