
How should image preprocessing and data augmentation be done for semantic segmentation?

I have a small, imbalanced dataset of 4116 aerial images, each 224x224x3 (RGB). Because the dataset is not big enough, I will very likely run into overfitting. Image preprocessing and data augmentation help to tackle this problem, as explained below.

"Overfitting is caused by having too few samples to learn from, rendering you unable to train a model that can generalize to new data. Given infinite data, your model would be exposed to every possible aspect of the data distribution at hand: you would never overfit. Data augmentation takes the approach of generating more training data from existing training samples, by augmenting the samples via a number of random transformations that yield believable-looking images."

Deep Learning with Python by François Chollet, pages 138-139, section 5.2.5, Using data augmentation.

I've read Medium - Image Data Preprocessing for Neural Networks and gone through Stanford's CS230 - Data Preprocessing and CS231 - Data Preprocessing course material. This is highlighted once more in an SO question, and I understand that there is no "one fits all" solution. Here is what prompted me to ask this question:

"No translation augmentation was used since we want to achieve high spatial resolution."

Reference: Researchgate - Semantic Segmentation of Small Objects and Modeling of Uncertainty in Urban Remote Sensing Images Using Deep Convolutional Neural Networks



I know that I will use the Keras ImageDataGenerator class, but I don't know which techniques and which parameters to use for semantic segmentation of small objects. Could someone enlighten me? Thanks in advance. :)

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=20,      # is a value in degrees (0–180)
    width_shift_range=0.2,  # is a range within which to randomly translate pictures horizontally.
    height_shift_range=0.2, # is a range within which to randomly translate pictures vertically.
    shear_range=0.2,        # is for randomly applying shearing transformations.
    zoom_range=0.2,         # is for randomly zooming inside pictures.
    horizontal_flip=True,   # is for randomly flipping half the images horizontally
    fill_mode='nearest',    # is the strategy used for filling in newly created pixels, which can appear after a rotation or a width/height shift
    featurewise_center=True,
    featurewise_std_normalization=True)

datagen.fit(X_train)

Augmentation and preprocessing always depend on the problem at hand. Think of every augmentation that could plausibly enlarge your dataset. Most importantly, do not perform extreme augmentations that produce training samples which could never occur in real data. If you do not expect real examples to be horizontally flipped, do not apply horizontal flips, since that gives your model false information. Think of all the changes that can happen in your input images and try to artificially produce new images from your existing ones. Keras has many built-in functions for this, but for each one, make sure it does not generate examples that are unlikely to appear at the input of your model.
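As a rough illustration of keeping augmentations realistic: aerial tiles usually have no canonical "up", so flips and 90-degree rotations are often plausible (that is an assumption about your data, so check it). For segmentation, whatever geometric transform you pick has to be applied identically to the image and its label mask. The sketch below uses plain NumPy rather than ImageDataGenerator. `augment_pair` and the random stand-in arrays are hypothetical names and data, not something from the question.

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Apply one random rotation/flip identically to an image and its mask.

    image: array of shape (H, W, C); mask: integer label map of shape (H, W).
    Assumes square tiles, so 90-degree rotations keep the shape unchanged.
    """
    # Rotate by a random multiple of 90 degrees. No interpolation is involved,
    # so label values stay exact and no new pixels have to be filled in.
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))

    # Flip left-right with probability 0.5, again on both arrays.
    if rng.random() < 0.5:
        image = image[:, ::-1]
        mask = mask[:, ::-1]

    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))            # stand-in for one RGB tile
mask = rng.integers(0, 5, size=(224, 224))   # stand-in label map, 5 classes
aug_image, aug_mask = augment_pair(image, mask, rng)
```

Because these transforms only rearrange pixels, the per-class pixel counts of the mask are unchanged, and small objects are not blurred or resampled the way they can be with arbitrary rotations, shears, or zooms.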

As you said, there is no "one fits all" solution, because everything depends on the data. Analyse the data and build everything around it.

About the small objects: one direction you should look into is loss functions that emphasise the impact of the target volumes relative to the background. Look at the Dice Loss or the Generalised Dice Loss.
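To make the difference concrete, here is a minimal NumPy sketch of both losses under some assumptions: predictions are per-class probabilities of shape (N, H, W, C), the ground truth is one-hot, and the Generalised Dice weights are the inverse squared class volumes. The function names and the tiny 8x8 example are hypothetical. For actual training you would write the same arithmetic with your framework's tensor operations so that it is differentiable.

```python
import numpy as np


def dice_loss(probs, onehot, eps=1e-7):
    """Soft Dice loss, averaged over classes.

    probs, onehot: arrays of shape (N, H, W, C).
    """
    axes = (0, 1, 2)  # sum over batch and spatial dimensions, keep classes
    intersection = np.sum(probs * onehot, axis=axes)
    denom = np.sum(probs, axis=axes) + np.sum(onehot, axis=axes)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice.mean()


def generalised_dice_loss(probs, onehot, eps=1e-7):
    """Generalised Dice loss with inverse squared-volume class weights.

    Assumes every class occurs at least once in the batch; an absent class
    would get a very large weight and dominate the result.
    """
    axes = (0, 1, 2)
    ref_volume = np.sum(onehot, axis=axes)
    weights = 1.0 / (ref_volume ** 2 + eps)  # small classes weigh more
    intersection = np.sum(probs * onehot, axis=axes)
    denom = np.sum(probs, axis=axes) + np.sum(onehot, axis=axes)
    ratio = np.sum(weights * intersection) / (np.sum(weights * denom) + eps)
    return 1.0 - 2.0 * ratio


# A 4-pixel object on a 64-pixel tile: background vs. one small class.
labels = np.zeros((1, 8, 8), dtype=int)
labels[0, 3:5, 3:5] = 1
onehot = np.eye(2)[labels]                          # shape (1, 8, 8, 2)
all_background = np.eye(2)[np.zeros_like(labels)]   # misses the object

loss_dice = dice_loss(all_background, onehot)
loss_gdl = generalised_dice_loss(all_background, onehot)
```

In this toy case a prediction that labels everything as background is right on 60 of 64 pixels, yet both losses stay high because the small object is missed entirely, and the Generalised Dice Loss penalises it more strongly than the plain class-averaged Dice loss.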
