
How to implement a sliding window model in TF/Keras for image segmentation?

I'm working on semantic image segmentation with U-net based models. The input images are of different dimensions (between 300 and 600 pixels in each axis). My approach so far has been to rescale the images to standard dimensions and work from there.

Now I want to try a sliding window approach, extracting e.g. 64x64 patches from the original images (no rescaling), and train a model on that. I'm not sure how to implement this efficiently.

For the training phase, I already have an online augmentation object (a Keras Sequence) for random transforms. Should I add a patch extraction step in there? If I do that, I'll be slicing numpy arrays and yielding them, which doesn't sound very efficient. Is there a better way to do this?
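For reference, this is roughly the patch-extraction step I have in mind for the Sequence's `__getitem__` (a pure-numpy sketch; `random_patch` and its arguments are my own placeholder names):

```python
import numpy as np

def random_patch(image, mask, patch=64, rng=np.random):
    """Extract one random patch (and the matching mask crop) from a full image.

    Works for images of any size >= `patch` in each axis; `image` is (H, W, C)
    and `mask` is (H, W) or (H, W, classes).
    """
    h, w = image.shape[:2]
    top = rng.randint(h - patch + 1)
    left = rng.randint(w - patch + 1)
    sl = (slice(top, top + patch), slice(left, left + patch))
    return image[sl], mask[sl]

# Inside the Sequence's __getitem__, a batch would then be built roughly as:
# xs, ys = zip(*(random_patch(img, msk) for img, msk in sampled_pairs))
# return np.stack(xs), np.stack(ys)
```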

And for the prediction phase: again, should I extract patches from the images in numpy and feed them to the model? If I choose overlapping windows (e.g. patch dimensions 64x64 and strides 32x32), should I manually (in numpy) weight/average/concatenate the raw patch predictions from the model to output a full-scale segmentation? Or is there a better way to handle this?
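Concretely, the manual stitching I'd be doing in numpy looks something like this (a sketch; `stitch_patches` and its signature are my own naming, and overlaps are resolved by plain averaging):

```python
import numpy as np

def stitch_patches(pred_patches, positions, out_shape, patch=64):
    """Average overlapping patch predictions back into a full-size map.

    pred_patches: (N, patch, patch, classes) model outputs
    positions:    list of (top, left) for each patch
    out_shape:    (H, W) of the original image
    Overlapping pixels are summed and divided by a per-pixel count.
    """
    h, w = out_shape
    classes = pred_patches.shape[-1]
    acc = np.zeros((h, w, classes), dtype=np.float64)
    count = np.zeros((h, w, 1), dtype=np.float64)
    for p, (top, left) in zip(pred_patches, positions):
        acc[top:top + patch, left:left + patch] += p
        count[top:top + patch, left:left + patch] += 1
    return acc / np.maximum(count, 1)  # avoid division by zero at uncovered pixels
```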

I'm using TF 2.1, by the way. Any help is appreciated.

Although it might sound inefficient to break an image into smaller patches before training your model, it has one huge benefit. When all patches are extracted up front, the training pipeline can shuffle them across the whole dataset, which in turn leads to a less-biased model. However, if you feed your model one image at a time and break it into patches on the fly, each batch still consists of patches from a single image.

In order to efficiently break an image into small patches, you can use:

skimage.util.view_as_windows(arr_in, window_shape, step=1)

You can define the window shape and the step of the rolling window. For example:

>>> import numpy as np
>>> from skimage.util.shape import view_as_windows
>>> A = np.arange(4*4).reshape(4,4)
>>> A
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
>>> window_shape = (2, 2)
>>> B = view_as_windows(A, window_shape)
>>> B[0, 0]
array([[0, 1],
       [4, 5]])
>>> B[0, 1]
array([[1, 2],
       [5, 6]])
