How do I use tf.data.Dataset with TIFF files for image segmentation?

Question

I have two sets of files: masks and images. There is no TIFF decoder in 'tensorflow' itself, but there is one in 'tfio.experimental'. The TIFF files have more than 4 channels.

This code doesn't work:

    import glob
    import os

    import numpy as np
    import tifffile as tiff
    import tensorflow as tf
    import tensorflow_io as tfio

    # Write 100 random 8-channel images and masks as TIFF files.
    os.makedirs('new1', exist_ok=True)
    os.makedirs('new2', exist_ok=True)
    for i in range(100):
      a = np.random.random((30, 30, 8))
      b = np.random.randint(10, size=(30, 30, 8))
      tiff.imwrite('new1/images' + str(i) + '.tif', a)
      tiff.imwrite('new2/images' + str(i) + '.tif', b)

    # Sort so that image i and mask i end up at the same position.
    paths1 = sorted(glob.glob('new1/*.*'))
    paths2 = sorted(glob.glob('new2/*.*'))

    def load(image_file, mask_file):
      image = tf.io.read_file(image_file)
      image = tfio.experimental.image.decode_tiff(image)

      mask = tf.io.read_file(mask_file)
      mask = tfio.experimental.image.decode_tiff(mask)

      # The decoded tensors report shape (None, None, None) here.
      input_image = tf.cast(image, tf.float32)
      mask_image = tf.cast(mask, tf.uint8)
      return input_image, mask_image

    AUTO = tf.data.experimental.AUTOTUNE
    BATCH_SIZE = 32

    dataloader = tf.data.Dataset.from_tensor_slices((paths1, paths2))

    dataloader = (
        dataloader
        .shuffle(1024)
        .map(load, num_parallel_calls=AUTO)
        .batch(BATCH_SIZE)
        .prefetch(AUTO)
    )

It is impossible to keep the entire dataset in memory, and saving to NumPy arrays offers no easy solution either. The code above raises no error directly, but the shape of the images is (None, None, None).

'model.fit' then gives an error.

Is there an alternative way to save the arrays? The only solution I see is brute force: manually feeding random batches in a custom training loop.

Answer 1

I found a solution to my question: a DataGenerator can work with any kind of file.

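    import math

    import numpy as np
    import tifffile as tiff
    import tensorflow as tf

    class Gen(tf.keras.utils.Sequence):
        """Loads batches of TIFF images and masks from lists of file paths."""

        def __init__(self, x_set, y_set, batch_size):
            super().__init__()
            self.x, self.y = x_set, y_set
            self.batch_size = batch_size

        def __len__(self):
            # Number of batches per epoch; the last batch may be smaller.
            return math.ceil(len(self.x) / self.batch_size)

        def __getitem__(self, idx):
            # Select the file names that belong to batch number idx.
            start = idx * self.batch_size
            stop = (idx + 1) * self.batch_size
            batch_x = self.x[start:stop]
            batch_y = self.y[start:stop]

            # Read the files from disk only when the batch is requested.
            images = np.array([tiff.imread(name) for name in batch_x])
            masks = np.array([tiff.imread(name) for name in batch_y])
            return images, masks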
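
The index arithmetic used by `__len__` and `__getitem__` above can be checked in isolation, without reading any files. This is only a minimal sketch: `batch_slices` is a hypothetical helper written for illustration, not part of Keras.

```python
import math


def batch_slices(n_files, batch_size):
    """Return the (start, stop) file indices covered by each batch."""
    n_batches = math.ceil(n_files / batch_size)  # same formula as __len__
    return [
        # Python slicing clips at the end of the list; min() makes that explicit.
        (idx * batch_size, min((idx + 1) * batch_size, n_files))
        for idx in range(n_batches)
    ]


print(batch_slices(100, 32))
# [(0, 32), (32, 64), (64, 96), (96, 100)]
```

With the 100 files and batch size 32 from the question, this gives four batches, the last holding only 4 files.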

This works without any problems.
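
If a tf.data pipeline is still preferred, one possible alternative is to read the files with 'tifffile' inside `tf.numpy_function` and then restore the static shape with `set_shape`, since that wrapper does not carry shape information. The following is a sketch under assumptions, not a tested recipe for every dataset: `N_CHANNELS`, `_read_pair` and `make_dataset` are hypothetical names, every file is assumed to have 8 channels as in the question, and `tf.data.AUTOTUNE` assumes TensorFlow 2.4 or newer.

```python
import numpy as np
import tensorflow as tf
import tifffile

N_CHANNELS = 8  # assumption: all images and masks have 8 channels


def _read_pair(image_path, mask_path):
    # Runs as ordinary Python/NumPy code; the paths arrive as bytes.
    image = tifffile.imread(image_path.decode()).astype(np.float32)
    mask = tifffile.imread(mask_path.decode()).astype(np.uint8)
    return image, mask


def load(image_path, mask_path):
    image, mask = tf.numpy_function(
        _read_pair, [image_path, mask_path], [tf.float32, tf.uint8])
    # numpy_function returns tensors of unknown shape, so declare
    # the parts that are known (here: rank 3 and the channel count).
    image.set_shape([None, None, N_CHANNELS])
    mask.set_shape([None, None, N_CHANNELS])
    return image, mask


def make_dataset(image_paths, mask_paths, batch_size=32):
    ds = tf.data.Dataset.from_tensor_slices((image_paths, mask_paths))
    return (
        ds.shuffle(len(image_paths))
        .map(load, num_parallel_calls=tf.data.AUTOTUNE)
        .batch(batch_size)
        .prefetch(tf.data.AUTOTUNE)
    )
```

Called as `make_dataset(paths1, paths2)`, this yields batches that can be passed to 'model.fit'. Batching requires all images to share the same height and width. Note that `tf.numpy_function` runs under the Python GIL and is not saved with the graph, so the Sequence-based generator above remains the simpler option.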

Answer 1 (3, accepted, 2021-01-29 06:48:02)