Here is my overall problem: I am trying to build a tf.data.Dataset containing images. I read the image IDs (filenames) from a CSV file so that I can combine them with the corresponding labels, which are in the same CSV file. [CSV-file screenshot]
df_imageIDs = pd.read_csv(file_labels,
                          usecols=["ImageId"])
df_imageIDs = df_imageIDs.apply(lambda ID: data_dir_train + ID, axis=1)
ds_filenames_images = tf.data.Dataset.from_tensor_slices(df_imageIDs.to_numpy())
After reading the image IDs with pandas and prepending the directory path, I map an image-decoding function over my dataset; the function calls tf.io.read_file. This is where the error is thrown.
def decode_image(file_path):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img, channels=parameters["CHANNELS"])
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, (parameters["IMAGE_HEIGHT"], parameters["IMAGE_WIDTH"]))
    return img

ds_images = ds_filenames_images.map(decode_image,
                                    num_parallel_calls=AUTOTUNE)
What I had tried before was tf.data.Dataset.list_files, which worked fine, but the ImageIds were in the wrong order.
ds_filenames_imagesx = tf.data.Dataset.list_files(data_dir_train + "*.jpg",
                                                  shuffle=False)
The difference seems to be in how pandas hands over the filenames, although both have the same data type, string. Printing elements from the latter method (list_files) resulted in this:
tf.Tensor(/home/wid35008/airbus-ship-detection/train_v2/000155de5.jpg, shape=(), dtype=string)
Printing the tensors from the method I want to use (the pandas route) gives this:
tf.Tensor(['/home/wid35008/airbus-ship-detection/train_v2/0002756f7.jpg'], shape=(1,), dtype=string)
I don't know how to interpret the difference or how to work around it. Does anybody have a solution to this problem? Thanks in advance!
To give a solution for myself and others who might stumble upon this: tf.io.read_file expects a scalar string, but each element of my dataset had shape (1,). I reshaped the elements after creating the dataset so that each tensor is a scalar again.
ds_filenames_images = tf.data.Dataset.from_tensor_slices(df_imageIDs.to_numpy())
ds_filenames_images = ds_filenames_images.map(lambda t: tf.reshape(t, []))
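An alternative worth considering is to flatten the array on the NumPy side before building the dataset, so no extra map step is needed. The following is only a minimal sketch: the paths are made-up stand-ins for what df_imageIDs.to_numpy() returns, and it assumes that array has shape (N, 1), as the printed tensors in the question suggest.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for df_imageIDs.to_numpy(): one column, so shape (2, 1).
paths_2d = np.array([["/tmp/train_v2/000155de5.jpg"],
                     ["/tmp/train_v2/0002756f7.jpg"]])

# Flatten to shape (2,), so that slicing yields scalar elements.
paths_1d = paths_2d.reshape(-1)

ds_filenames_images = tf.data.Dataset.from_tensor_slices(paths_1d)

# Each element is now a scalar string tensor, which tf.io.read_file accepts.
print(ds_filenames_images.element_spec)
```

Both approaches should lead to the same dataset elements; reshaping the array up front just avoids running a reshape on every element.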
The docs give a more in-depth explanation of this: Tensor Shapes
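To illustrate where the extra dimension comes from, here is a small pandas-only sketch. The file names and the prefix are placeholder values, not the real data. It shows that applying the lambda to a one-column DataFrame keeps it a DataFrame, whose to_numpy() is 2-D, while selecting the column as a Series gives a 1-D array.

```python
import pandas as pd

# Hypothetical stand-ins for the CSV contents and data_dir_train.
df_imageIDs = pd.DataFrame({"ImageId": ["000155de5.jpg", "0002756f7.jpg"]})
prefix = "/tmp/train_v2/"

# As in the question: apply over rows of a one-column DataFrame.
# The result is still a DataFrame, so to_numpy() has shape (2, 1).
as_frame = df_imageIDs.apply(lambda ID: prefix + ID, axis=1)
frame_shape = as_frame.to_numpy().shape

# Selecting the column yields a Series; to_numpy() then has shape (2,).
as_series = prefix + df_imageIDs["ImageId"]
series_shape = as_series.to_numpy().shape

print(frame_shape, series_shape)
```

Slicing the (N, 1) array along its first axis yields elements of shape (1,), which matches the tensor printed in the question; slicing the (N,) array yields scalars.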