
ValueError: Shape must be rank 0 but is rank 1 for 'ReadFile' (op: 'ReadFile') with input shapes: [1]

Here is my overall problem: I am trying to build a tf.data.Dataset containing images. I read the image IDs (filenames) from a CSV file so that I can combine them with the corresponding labels, which are in the same CSV file.

df_imageIDs = pd.read_csv(file_labels,
                          usecols=["ImageId"])
df_imageIDs = df_imageIDs.apply(lambda ID: data_dir_train + ID, axis=1)
ds_filenames_images = tf.data.Dataset.from_tensor_slices(df_imageIDs.to_numpy())

After reading the image IDs with pandas and prepending the directory path, I map an image-decoding function that calls tf.io.read_file over my dataset. This throws the error.

def decode_image(file_path):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img, channels=parameters["CHANNELS"])
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, (parameters["IMAGE_HEIGHT"], parameters["IMAGE_WIDTH"]))
    return img

ds_images = ds_filenames_images.map(decode_image,
                                    num_parallel_calls=AUTOTUNE)

Before that, I tried tf.data.Dataset.list_files, which worked fine, but the ImageIds were in the wrong order.

ds_filenames_imagesx = tf.data.Dataset.list_files(data_dir_train + "*.jpg",
                                                  shuffle=False)

The difference seems to be in how pandas delivers the filenames, even though both have the same data type, string. Printing elements from the latter method (list_files) gives this:

tf.Tensor(/home/wid35008/airbus-ship-detection/train_v2/000155de5.jpg, shape=(), dtype=string)

Printing the tensors from the method I want to use (pandas plus from_tensor_slices) gives this:

tf.Tensor(['/home/wid35008/airbus-ship-detection/train_v2/0002756f7.jpg'], shape=(1,), dtype=string)

I don't know how to interpret this difference or how to find a workaround. Does anybody have a solution to this problem? Thanks in advance!

To answer my own question, for myself and others who might stumble upon this: I reshaped the elements after creating the dataset so that each tensor has scalar shape () again, which is what tf.io.read_file expects.

ds_filenames_images = tf.data.Dataset.from_tensor_slices(df_imageIDs.to_numpy())
ds_filenames_images = ds_filenames_images.map(lambda t: tf.reshape(t, []))
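
To see the effect of the reshape in isolation, here is a minimal sketch on toy data. The file names are hypothetical placeholders, and the 2-D array only imitates what DataFrame.to_numpy() returns for a single-column DataFrame; it is an illustration under those assumptions, not the original pipeline.

```python
import numpy as np
import tensorflow as tf

# Hypothetical file names; shape (2, 1) imitates to_numpy() on a one-column DataFrame.
paths_2d = np.array([["a.jpg"], ["b.jpg"]])

# Slicing along the first axis leaves each element with shape (1,), i.e. rank 1.
ds = tf.data.Dataset.from_tensor_slices(paths_2d)
print(ds.element_spec.shape.rank)

# Reshaping to [] turns each element into a scalar string tensor, i.e. rank 0.
ds_scalar = ds.map(lambda t: tf.reshape(t, []))
print(ds_scalar.element_spec.shape.rank)
```

Once the elements are rank 0, a function that calls tf.io.read_file can be mapped over the dataset without the rank error.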

The docs give a more in-depth explanation of this: Tensor Shapes
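
A possible alternative, not part of the original fix, is to avoid the extra axis on the pandas side: selecting the column as a Series rather than keeping a one-column DataFrame should give a 1-D array, so from_tensor_slices would yield scalar elements directly. The sketch below uses a hypothetical in-memory DataFrame and directory in place of the real CSV and data_dir_train.

```python
import pandas as pd

# Hypothetical stand-ins for the CSV contents and the image directory.
df = pd.DataFrame({"ImageId": ["000155de5.jpg", "0002756f7.jpg"]})
data_dir_train = "train_v2/"

# One-column DataFrame (as in the question): to_numpy() keeps a second axis.
as_frame = df[["ImageId"]].apply(lambda row: data_dir_train + row, axis=1).to_numpy()

# Series: to_numpy() is one-dimensional, one string per element.
as_series = (data_dir_train + df["ImageId"]).to_numpy()

print(as_frame.shape)
print(as_series.shape)
```

Passing the 1-D array to tf.data.Dataset.from_tensor_slices would then make the reshape step unnecessary.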
