
Image preprocessing in deep learning

Original: 2017-01-02 14:44:36 7 5 image-processing/ deep-learning/ object-detection

I am experimenting with deep learning on images. I have about 4000 images from different cameras, with different lighting conditions, image resolutions, and view angles.

My question is: what kind of image preprocessing would help improve object detection? (For example: contrast/color normalization, denoising, etc.)

5 answers

When pre-processing images before feeding them into a neural network, it is better to make the data zero-centered first, then try a normalization technique. This usually helps accuracy, because the data is scaled to a fixed range rather than left with arbitrarily large or very small values.
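As a rough illustration of zero-centering, here is a minimal NumPy sketch. The function name `zero_center_and_scale`, the random stand-in data, and the choice of a per-channel mean (rather than a full mean image) are assumptions made for the example, not something the answer specifies.

```python
import numpy as np

def zero_center_and_scale(images):
    """Scale uint8 images to [0, 1], then subtract the per-channel mean."""
    x = images.astype(np.float32) / 255.0      # shape (N, H, W, C)
    channel_mean = x.mean(axis=(0, 1, 2))      # one mean per color channel
    return x - channel_mean, channel_mean

# Hypothetical stand-in for a training set of 8 RGB images.
rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
centered, mean = zero_center_and_scale(train)
```

The mean should be computed on the training images only and then reused unchanged for validation and test images.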

An example image: [image not available]

Here is an explanation of it from the Stanford CS231n 2016 lectures.

* *

Normalization refers to normalizing the data dimensions so that they are of approximately the same scale. For image data there are two common ways of achieving this normalization. One is to divide each dimension by its standard deviation, once it has been zero-centered: (X /= np.std(X, axis = 0)). Another form of this preprocessing normalizes each dimension so that the min and max along the dimension is -1 and 1 respectively. It only makes sense to apply this preprocessing if you have a reason to believe that different input features have different scales (or units), but they should be of approximately equal importance to the learning algorithm. In case of images, the relative scales of pixels are already approximately equal (and in range from 0 to 255), so it is not strictly necessary to perform this additional preprocessing step.

* *

Link for the above extract: http://cs231n.github.io/neural-networks-2/
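The two normalization forms described in the extract can be sketched in NumPy as follows. The data matrix `X` is a hypothetical example with features on deliberately different scales; only the `np.std(X, axis=0)` division comes from the extract itself.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 100 samples, 5 features on very different scales.
X = rng.normal(size=(100, 5)) * np.array([1.0, 10.0, 100.0, 0.1, 5.0])

# Form 1: zero-center, then divide each dimension by its standard deviation.
X_std = X - np.mean(X, axis=0)
X_std /= np.std(X_std, axis=0)

# Form 2: rescale each dimension so its min is -1 and its max is 1.
x_min = X.min(axis=0)
x_max = X.max(axis=0)
X_minmax = 2.0 * (X - x_min) / (x_max - x_min) - 1.0
```

After either step, all five features occupy a comparable range, which is the point of the extract; for raw pixels that already share the 0 to 255 range, the gain is smaller.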

This is certainly a late reply to this post, but hopefully it helps those who stumble upon it.

Here's an article I found online, Image Data Pre-Processing for Neural Networks. I thought it was a good article on how the network should be trained.

The main gist of the article:

1) Data (images) fed into the NN should be scaled to the image size the NN is designed to take, usually a square, e.g. 100x100 or 250x250.

2) Consider the MEAN (left image) and STANDARD DEVIATION (right image) values of all the input images in a particular set of images.

3) Normalize image inputs by subtracting the mean from each pixel and then dividing the result by the standard deviation, which makes convergence faster while training the network. The resulting distribution resembles a Gaussian curve centred at zero.

4) Dimensionality reduction: collapse RGB into a grayscale image when the neural network's performance is allowed to be invariant to that dimension, or to make the training problem more tractable.
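The four steps above could be sketched roughly as below, using NumPy only. The helper names (`resize_nearest`, `to_grayscale`, `preprocess`), the nearest-neighbor resize, and the decision to convert to grayscale before computing the mean and standard deviation images are all assumptions for illustration; the article may order or implement the steps differently, and a real pipeline would more likely use an image library for resizing.

```python
import numpy as np

def resize_nearest(img, size):
    """Resize an (H, W, C) image to (size, size, C) by nearest-neighbor sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows[:, None], cols]

def to_grayscale(img):
    """Collapse RGB to one channel using the ITU-R BT.601 luma weights."""
    return img @ np.array([0.299, 0.587, 0.114])

def preprocess(images, size=100):
    """Resize, convert to grayscale, then normalize with per-pixel mean and std."""
    batch = np.stack([to_grayscale(resize_nearest(im, size).astype(np.float64))
                      for im in images])
    mean_img = batch.mean(axis=0)   # the "mean image" of the set
    std_img = batch.std(axis=0)     # the "standard deviation image" of the set
    return (batch - mean_img) / (std_img + 1e-8), mean_img, std_img

# Hypothetical stand-ins for photos of different resolutions.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8)
          for h, w in [(120, 160), (240, 320), (200, 200), (96, 128)]]
normalized, mean_img, std_img = preprocess(images, size=100)
```

The small constant added to `std_img` only guards against division by zero at pixels that happen to be constant across the whole set.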

In addition to what is mentioned above, a good way to improve the quality of low-resolution (LR) images is super-resolution using deep learning. This means building a deep learning model that converts a low-resolution image to a high-resolution (HR) one. We can convert a high-resolution image to a low-resolution image by applying degradation functions (filters such as blurring). In other words, LR = degradation(HR), where the degradation function converts the high-resolution image to low resolution. If we can find the inverse of this function, we can convert a low-resolution image to high resolution. This can be treated as a supervised learning problem and solved using deep learning to approximate the inverse function. I came across this interesting article introducing super-resolution using deep learning. I hope this helps.
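To make LR = degradation(HR) concrete, here is a minimal sketch of how supervised training pairs could be generated. The `degrade` function below (a box blur combined with downsampling, implemented as average pooling) is just one hypothetical choice of degradation, and the random arrays stand in for real high-resolution images.

```python
import numpy as np

def degrade(hr, factor=2):
    """Hypothetical degradation: box-blur and downsample an (H, W, C) image."""
    h, w = hr.shape[:2]
    h, w = h - h % factor, w - w % factor      # crop to a multiple of factor
    blocks = hr[:h, :w].reshape(h // factor, factor, w // factor, factor, -1)
    return blocks.mean(axis=(1, 3))            # average each factor x factor block

rng = np.random.default_rng(0)
hr_images = rng.random((16, 64, 64, 3))        # stand-in for real HR images
lr_images = np.stack([degrade(hr, factor=2) for hr in hr_images])
# (lr_images, hr_images) now form input/target pairs for a supervised model.
```

A model trained to map `lr_images` back to `hr_images` learns an approximate inverse of this particular degradation, so the degradation used for training should resemble what the real cameras do.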

Have a read through this; hopefully it will be helpful. The idea is to split the input image into parts. This is called R-CNN (here are some examples). There are two stages to this process: object detection and segmentation. Object detection is the process where certain objects in the foreground are detected by observing changes in gradient. Segmentation is the process where the objects are put together in an image with high contrast. High-level image detectors use Bayesian optimization, which can detect what could happen next using the local optimization point.

Basically, in answer to your question, all of the pre-processing options you have listed seem good: contrast and colour normalization help the computer recognise different objects, and denoising makes the gradients easier to distinguish.

I hope all of this information is useful to you!
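As a rough sketch of what contrast normalization and denoising might look like in practice, the NumPy example below applies a 3x3 median filter followed by a percentile-based contrast stretch to a grayscale image. The function names, the percentile limits, and the choice of a median filter are assumptions for illustration; the answer does not name specific techniques.

```python
import numpy as np

def stretch_contrast(img, low=2, high=98):
    """Linearly map the [low, high] percentile range of img onto [0, 1]."""
    lo, hi = np.percentile(img, [low, high])
    return np.clip((img - lo) / max(hi - lo, 1e-8), 0.0, 1.0)

def median_denoise(img):
    """3x3 median filter on a 2-D grayscale image, edges padded by reflection."""
    padded = np.pad(img, 1, mode="reflect")
    h, w = img.shape
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    return np.median(windows, axis=0)

rng = np.random.default_rng(0)
gray = 0.4 + 0.2 * rng.random((32, 32))   # low-contrast stand-in image
gray[5, 5] = 1.0                          # one isolated "salt" noise pixel
clean = median_denoise(gray)
stretched = stretch_contrast(clean)
```

Denoising runs first here so that the isolated bright pixel does not distort the percentiles used for stretching.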

To improve an image, you first need to identify the issue in it, e.g. low contrast, non-uniform illumination, etc. Once you can identify the issue in your dataset, you will be able to find the right solution and apply it. That will improve your object detection accuracy and results.
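One way to start identifying such issues across a dataset is to compute simple per-image statistics. The sketch below is a crude heuristic under stated assumptions: the function `diagnose`, the quadrant comparison, and both thresholds are hypothetical starting points that would need tuning on real data.

```python
import numpy as np

def diagnose(gray, contrast_thresh=0.35, illum_thresh=0.15):
    """Flag likely issues in a 2-D grayscale image with values in [0, 1].

    Both thresholds are arbitrary starting points, not established values.
    """
    p2, p98 = np.percentile(gray, [2, 98])
    h, w = gray.shape
    # Compare the mean brightness of the four quadrants as a crude illumination check.
    quads = [gray[:h // 2, :w // 2], gray[:h // 2, w // 2:],
             gray[h // 2:, :w // 2], gray[h // 2:, w // 2:]]
    quad_means = [q.mean() for q in quads]
    return {
        "low_contrast": bool(p98 - p2 < contrast_thresh),
        "non_uniform_illumination": bool(max(quad_means) - min(quad_means) > illum_thresh),
    }

rng = np.random.default_rng(0)
flat = 0.5 + 0.05 * rng.random((64, 64))             # narrow intensity range
ramp = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))   # dark on the left, bright on the right
flat_report = diagnose(flat)
ramp_report = diagnose(ramp)
```

Running a check like this over all images gives a quick picture of which problems dominate the dataset before choosing a correction.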