Cannot convert a Python list to a Tensorflow Dataset (InvalidArgumentError: Shapes of all inputs must match...)

Question

I'm trying to build a neural network (following a YouTube guide, though I had to change the data input code), and I need a batched dataset for the train function to work properly (I don't know why, and I'm not even sure about it). But when I try to convert a list of training data to a Dataset using tensorflow.data.Dataset.from_tensor_slices(train_data), I receive an error message:

InvalidArgumentError
{{function_node __wrapped__Pack_N_3_device_/job:localhost/replica:0/task:0/device:GPU:0}} Shapes of all inputs must match: values[0].shape = [105,105,3] != values[2].shape = [1] [Op:Pack] name: 0

The train_data list consists of 560 lists, each with 3 elements inside:

<tf.Tensor: shape=(105, 105, 3), dtype=float32, numpy=array([[["105x105 3-channel image with my face"]]], dtype=float32)>
<tf.Tensor: shape=(105, 105, 3), dtype=float32, numpy=array([[["different image with the same properties"]]], dtype=float32)>
<tf.Tensor: shape=(1,), dtype=float32, numpy=array(["1. or 0. (float), a label, showing if these pictures are actually the pictures of the same person"], dtype=float32)>

I am pretty sure that all of the shapes in the train_data list are exactly as described.

Some shape information, obtained with the .shape attribute:

train_data.shape #"AttributeError: 'list' object has no attribute 'shape'" - main list
train_data[0].shape #"AttributeError: 'list' object has no attribute 'shape'" - sublist, with 3 elements
train_data[0][0].shape #"TensorShape([105, 105, 3])" - first image
train_data[0][0][0].shape #"TensorShape([105, 3])" - first row of image pixels, ig
train_data[0][0][0][0].shape #"TensorShape([3])" - pixel in the left upper corner

Here is what I tried: the label of the image pairs (1. or 0.) was previously just an integer. Then I received an error saying that everything here should be the same type, float32. Then I tried to convert it to a tensor, but that changed nothing except the last part of the current error message, which used to say "values[2].shape = []".

I really have no idea what could lead to the error. I don't have any Tensorflow experience.

Sorry if my English is bad.

Edit: here is the code that takes the images out of a certain directory. May cause eye bleeding.

for i in os.listdir("t"):
    for ii in os.listdir(os.path.join("t", i)):
        td.append([
                   [
                    tensorflow.expand_dims(
                     tensorflow.io.decode_jpeg(
                      tensorflow.io.read_file(os.path.join("t", i, ii) + "\\" + os.listdir(os.path.join("t", i, ii))[0])) / 255, 0), 
                    tensorflow.expand_dims(
                     tensorflow.io.decode_jpeg(
                      tensorflow.io.read_file(os.path.join("t", i, ii) + "\\2.jpeg")) / 255, 0)],
                    tensorflow.convert_to_tensor(
                     float(
                      os.listdir(os.path.join("t", i, ii))[0][0]
                     )
                    )
                  ])

I added some spaces to make it a bit more readable. td = train_data. Yeah, I could've messed something up there.

Edit 2: To answer Mohammad's question, here are the output data shapes from the code they gave me:

td.shape #AttributeError: 'list' object has no attribute 'shape' - main list
td[0].shape #AttributeError: 'list' object has no attribute 'shape' - sublist, with a list and a label
td[0][0].shape #AttributeError: 'list' object has no attribute 'shape' - subsublist, with 2 images
td[0][1].shape #TensorShape([]) - label
td[0][0][0].shape #TensorShape([1, 105, 105, 3]) - first image
td[0][0][1].shape #TensorShape([1, 105, 105, 3]) - second image

It can be shown as:

train_data = [  [[x1, x2], y],  [[x1, x2], y], ... ]

Answer 1

Replicating the problem:

import tensorflow as tf

x1 = tf.random.normal((105, 105, 3))
x2 = tf.random.normal((105, 105, 3))
y = tf.random.normal((1,))

# 560 entries, each a list of two images and one label
array_list = [[x1, x2, y]] * 560
tf.data.Dataset.from_tensor_slices(array_list)
# InvalidArgumentError ... values[0].shape = [105,105,3] != values[2].shape = [1]
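As far as I understand it, the error comes from how from_tensor_slices treats its argument: a Python list is packed into one tensor (hence the Pack op in the message), while a tuple is kept as a structure with one component per entry. The following is a minimal sketch of that difference, using zero-filled stand-in tensors rather than real images; the name packed_ok is only there for illustration.

```python
import tensorflow as tf

x1 = tf.zeros((105, 105, 3))
x2 = tf.zeros((105, 105, 3))
y = tf.zeros((1,))

# A list is packed into a single tensor, so mismatched shapes fail.
try:
    tf.convert_to_tensor([x1, x2, y])
    packed_ok = True
except (tf.errors.InvalidArgumentError, ValueError):
    packed_ok = False

# A tuple is kept as a structure with one component per entry.
# Each component needs a leading "examples" axis, here of size 1.
ds = tf.data.Dataset.from_tensor_slices((x1[None], x2[None], y[None]))

print(packed_ok)
print(ds.element_spec)
```

So the components may have different shapes from each other, as long as they sit in a tuple and share the same first dimension.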

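Fix:

# Flatten to a single list: [x1, x2, y, x1, x2, y, ...]
flatten_list = sum(array_list, [])

# Separate both images and the label (every third item, offsets 0, 1 and 2)
X1 = tf.squeeze(tf.stack(flatten_list[0::3]))
X2 = tf.squeeze(tf.stack(flatten_list[1::3]))
y = tf.squeeze(tf.stack(flatten_list[2::3]))

# Construct the dataset; each element is ((image1, image2), label)
ds = tf.data.Dataset.from_tensor_slices(((X1, X2), y))
for data in ds.take(1):
    print(data)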
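Since the question asks for a batched dataset, here is a rough sketch of the batching step. It uses 8 random stand-in pairs instead of the real 560, and the batch size and shuffle buffer are arbitrary values picked for the example, not something taken from the guide.

```python
import tensorflow as tf

# Stand-in data: 8 image pairs with one label each (real data comes from disk).
X1 = tf.random.normal((8, 105, 105, 3))
X2 = tf.random.normal((8, 105, 105, 3))
y = tf.zeros((8,))

ds = tf.data.Dataset.from_tensor_slices(((X1, X2), y))

# Shuffle the examples, then group them into batches of 4.
batched = ds.shuffle(buffer_size=8).batch(4)

for (a, b), labels in batched.take(1):
    print(a.shape, b.shape, labels.shape)
```

Each batch then carries a leading batch axis on both images and on the labels.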

Answer 2

Your data currently has this structure, so split it into separate lists before building the dataset:

import tensorflow as tf

x1 = tf.random.normal((105, 105, 3))
x2 = tf.random.normal((105, 105, 3))
y = tf.random.normal((1,))

# Current structure: a list of [[x1, x2], y] entries
train_list = [[[x1, x2], y], [[x1, x2], y], [[x1, x2], y], [[x1, x2], y]]

# Split it into three parallel lists: first images, second images, labels
x1 = [train_list[x][:1][0][0] for x in range(len(train_list))]
x2 = [train_list[x][:1][0][1] for x in range(len(train_list))]
y = [train_list[x][1:] for x in range(len(train_list))]

tf.data.Dataset.from_tensor_slices(((x1, x2), y))

<TensorSliceDataset element_spec=((TensorSpec(shape=(105, 105, 3), dtype=tf.float32, name=None), TensorSpec(shape=(105, 105, 3), dtype=tf.float32, name=None)), TensorSpec(shape=(1, 1), dtype=tf.float32, name=None))>
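Note that the label comes out with shape (1, 1) because the [1:] slice wraps each label in an extra list. If you would rather keep the label at shape (1,), a possible variant is to unpack each entry directly. This is only a sketch with zero-filled stand-in tensors, and the names first, second and labels are made up for the example.

```python
import tensorflow as tf

x1 = tf.zeros((105, 105, 3))
x2 = tf.zeros((105, 105, 3))
y = tf.zeros((1,))
train_list = [[[x1, x2], y]] * 4

# Unpack each [[x1, x2], y] entry directly instead of slicing.
first = [pair[0] for pair, _ in train_list]
second = [pair[1] for pair, _ in train_list]
labels = [label for _, label in train_list]

ds = tf.data.Dataset.from_tensor_slices(((first, second), labels))
print(ds.element_spec)
```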

Alternatively, change the code that loads the images and labels from disk so that it builds the three lists directly. This will save time:

import os
import tensorflow

x1 = []
x2 = []
y = []
for i in os.listdir("t"):
    for ii in os.listdir(os.path.join("t", i)):
        pair_dir = os.path.join("t", i, ii)
        first_file = os.listdir(pair_dir)[0]
        # First image of the pair, scaled to [0, 1], with a leading axis of size 1
        x1.append(tensorflow.expand_dims(
            tensorflow.io.decode_jpeg(
                tensorflow.io.read_file(os.path.join(pair_dir, first_file))) / 255, 0))
        # Second image of the pair
        x2.append(tensorflow.expand_dims(
            tensorflow.io.decode_jpeg(
                tensorflow.io.read_file(os.path.join(pair_dir, "2.jpeg"))) / 255, 0))
        # The label is the first character of the first file's name
        y.append(tensorflow.convert_to_tensor(float(first_file[0])))

tensorflow.data.Dataset.from_tensor_slices(((x1, x2), y))
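Another option, which is my own suggestion rather than anything from the question, is to put only the file paths and labels into the dataset and decode the images inside map, so that not every image has to sit in a Python list first. Below is a sketch under that assumption. The temporary directory, the file names and the load_pair helper are all hypothetical stand-ins so that the example runs without the "t" folder.

```python
import os
import tempfile

import tensorflow as tf


def load_pair(path_a, path_b, label):
    """Read two JPEG files and scale their pixel values to [0, 1]."""
    def read(path):
        img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
        return tf.cast(img, tf.float32) / 255.0
    return (read(path_a), read(path_b)), label


# Hypothetical stand-in files: four pairs of black 105x105 JPEG images.
tmp = tempfile.mkdtemp()
jpeg = tf.io.encode_jpeg(tf.zeros((105, 105, 3), dtype=tf.uint8))
paths_a, paths_b, labels = [], [], []
for n in range(4):
    a = os.path.join(tmp, f"{n}_a.jpeg")
    b = os.path.join(tmp, f"{n}_b.jpeg")
    tf.io.write_file(a, jpeg)
    tf.io.write_file(b, jpeg)
    paths_a.append(a)
    paths_b.append(b)
    labels.append(float(n % 2))

# The dataset holds paths and labels; images are decoded on the fly.
ds = tf.data.Dataset.from_tensor_slices((paths_a, paths_b, labels))
ds = ds.map(load_pair).batch(2)

for (img_a, img_b), lab in ds.take(1):
    print(img_a.shape, img_b.shape, lab.shape)
```

With the real data you would fill paths_a, paths_b and labels from os.listdir in the same loop as above.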
