
Image semantic segmentation of repeating patterns without CNNs

Suppose I have one or more tiles, each consisting of a single pattern (e.g. materials like wood, concrete, gravel...), that I would like to train my classifier on. I'll then use the trained classifier to determine which class each pixel in another image belongs to.

Below are examples of two tiles I would like to train the classifier on:

(images: a concrete tile and a wood tile)

And let's say I want to segment the image below to identify the pixels belonging to the door and those belonging to the wall. It's just an example; I know this image isn't made of exactly the same patterns as the tiles above:

(image: a door)

For this specific problem, is it necessary to use convolutional neural networks? Or is there a way to achieve my goal with a shallow neural network or any other classifier, combined with texture features, for example?

I've already implemented a classifier with scikit-learn which works on tile pixels individually (see the code below, where training_data is a vector of singletons), but I want to train the classifier on texture patterns instead.

# train classifier (the import is needed in a standalone script)
from sklearn.linear_model import SGDClassifier

classifier = SGDClassifier()
classifier.fit(training_data, training_target)

# classify each pixel of a given grayscale image
test_data = image_gray.flatten().reshape((-1, 1))
predictions = classifier.predict(test_data)
image_classified = predictions.reshape(image_gray.shape)

I was reading this review of recent deep learning methods used for image segmentation, and the results seem accurate, but since I've never used any CNN before, I feel intimidated by it.

Convolutional Neural Networks (CNNs) are high-performance tools for image recognition (including semantic segmentation) and have been shown to be very sensitive to texture. The field of computer vision long predates the current wave of interest in deep learning, however, and there are various other tools that are still relevant, often with smaller requirements for computational resources and/or training data.

For this specific problem, is it necessary to use convolutional neural networks?

It very much depends on what your metrics for success are. There are other tools that do not involve the use of CNNs; whether they will give you a satisfactory level of detection accuracy can only be determined by practical testing.

Or is there a way to achieve my goal with a shallow neural network or any other classifier, combined with texture features for example?

A shallow neural network will have some detection capability, although (unlike CNNs) it does not exhibit translational invariance and so is sensitive to small displacements of the target. Such a network is likely to have more success if used to classify small patches of the image; classifying an image patch within a sliding window is not that unlike how a CNN works, of course. It is also possible to approximate a CNN using an equivalent multi-layer perceptron (MLP); that would be another approach, if your definition of 'shallow' permits.
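As an illustration of that patch-based idea, here is a minimal sketch that trains an MLP on small grayscale patches and then labels each pixel of a test image via a sliding window. The names tiles and tile_labels are hypothetical placeholders for your labelled tile images and their integer class labels; image_gray is the test image from the question's own code:

import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.neural_network import MLPClassifier

PATCH = 15  # patch side length in pixels (odd, so each patch has a centre pixel)

# build training data: random PATCH x PATCH patches of each tile,
# flattened into feature vectors and labelled with the tile's class
X, y = [], []
for tile, label in zip(tiles, tile_labels):  # hypothetical labelled inputs
    patches = extract_patches_2d(tile, (PATCH, PATCH),
                                 max_patches=500, random_state=0)
    X.append(patches.reshape(len(patches), -1))
    y.extend([label] * len(patches))
X = np.vstack(X)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300)
clf.fit(X, y)

# label every pixel of the test image by classifying the patch centred
# on it (a simple, slow sliding window)
pad = PATCH // 2
padded = np.pad(image_gray, pad, mode='reflect')
rows, cols = image_gray.shape
segmented = np.empty((rows, cols), dtype=int)
for r in range(rows):
    windows = np.stack([padded[r:r + PATCH, c:c + PATCH] for c in range(cols)])
    segmented[r] = clf.predict(windows.reshape(cols, -1))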

Two approaches that do not require neural networks:

Histogram of Oriented Gradients: The HOG descriptor extracts image features using a histogram of gradients along the horizontal and vertical axes. This produces a feature vector that can then be classified, for example with a Support Vector Machine (SVM) or a shallow neural network (MLP). This would be a viable approach to classifying image patches without using CNNs. The scikit-image package has a HOG function, and there is a full worked example of classification of HOG features here. From the documentation:

from skimage.feature import hog
from skimage import data, exposure

image = data.astronaut()

# compute HOG features (fd) and a visualisation image (hog_image)
fd, hog_image = hog(image, orientations=8, pixels_per_cell=(16, 16),
                    cells_per_block=(1, 1), visualize=True, multichannel=True)
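A short sketch of the classification step that follows, assuming patches and patch_labels are hypothetical arrays of same-sized grayscale texture patches and their class labels:

import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

# one HOG feature vector per training patch
features = np.array([
    hog(patch, orientations=8, pixels_per_cell=(8, 8), cells_per_block=(1, 1))
    for patch in patches
])

svm = LinearSVC()
svm.fit(features, patch_labels)

# a new patch is classified from its HOG vector in the same way
print(svm.predict(features[:1]))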

Felzenszwalb's efficient graph-based image segmentation: There are a number of segmentation algorithms in the scikit-image.segmentation toolbox. Felzenszwalb's is one of them, and (broadly speaking) it performs a clustering of image regions based on edges. More info here. From the module documentation:

from skimage.segmentation import felzenszwalb
from skimage.data import coffee

img = coffee()
# each pixel of `segments` gets the integer label of its region
segments = felzenszwalb(img, scale=3.0, sigma=0.95, min_size=5)
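The returned segments array assigns an integer region label to every pixel of img; a quick way to inspect the result is to overlay the region boundaries, for example with mark_boundaries from the same package:

import matplotlib.pyplot as plt
from skimage.segmentation import mark_boundaries

# draw the boundary of each Felzenszwalb region over the original image
plt.imshow(mark_boundaries(img, segments))
plt.axis('off')
plt.show()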

Hope that helps.

You can use U-Net or SegNet for image segmentation. In fact, you add residual layers to your CNN to get this kind of result:

(image: segmentation result)

About U-Net:

Arxiv: U-Net: Convolutional Networks for Biomedical Image Segmentation

(image: U-Net architecture)

About SegNet:

Arxiv: SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

(image: SegNet architecture)

Here are simple code examples (keras==1.1.0):

U-Net:

# imports for Keras 1.1.0 (this old API uses Convolution2D, Merge, nb_epoch, ...)
import numpy as np
from keras.models import Sequential
from keras.layers import (Convolution2D, MaxPooling2D, UpSampling2D,
                          Activation, Reshape, Lambda, Merge)
from keras.layers.normalization import BatchNormalization
from keras.optimizers import SGD
from keras import backend as K

shape = 60
batch_size = 30
nb_classes = 10
img_rows, img_cols = shape, shape
nb_filters = 32
pool_size = (2, 2)
kernel_size = (3, 3)
input_shape = (shape, shape, 1)

reg = 0.001
learning_rate = 0.013
decay_rate = 5e-5
momentum = 0.9

sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=True)
# output side length: 60 -> 58 (conv) -> 29 (pool) -> 58 (upsample) -> 56 (conv) -> 54 (final conv)
shape2 = 54

# shared encoder/decoder trunk
recog0 = Sequential()
recog0.add(Convolution2D(20, 3, 3,
                         border_mode='valid',
                         input_shape=input_shape))
recog0.add(BatchNormalization(mode=2))

# note: plain assignment -- recog is an alias of recog0, not a copy,
# so every layer added below also lands in recog0
recog = recog0
recog.add(Activation('relu'))
recog.add(MaxPooling2D(pool_size=(2, 2)))
recog.add(UpSampling2D(size=(2, 2)))
recog.add(Convolution2D(20, 3, 3, init='glorot_uniform'))
recog.add(BatchNormalization(mode=2))
recog.add(Activation('relu'))

for i in range(0, 2):
    print(i, recog0.layers[i].name)

# recog_res is another alias of the same model; the backend function below
# reads the output of an intermediate layer (x_train is assumed to be a
# (1, 60, 60, 1) array loaded elsewhere)
recog_res = recog0
part = 1
print(recog0.layers[part].name)
get_0_layer_output = K.function([recog0.layers[0].input, K.learning_phase()],
                                [recog0.layers[part].output])

get_0_layer_output([x_train, 0])[0][0]  # inspect the first sample's activations

pred = [np.argmax(get_0_layer_output([x_train, 0])[0][i]) for i in range(0, len(x_train))]

loss = x_train - pred
loss = loss.astype('float32')  # diagnostic residual; unused in this U-Net variant

# identity skip connection, then average-merge it with the main branch
recog_res.add(Lambda(lambda x: x, input_shape=(56, 56, 20), output_shape=(56, 56, 20)))

recog2 = Sequential()
recog2.add(Merge([recog, recog_res], mode='ave'))
recog2.add(Activation('relu'))
recog2.add(Convolution2D(20, 3, 3, init='glorot_uniform'))
recog2.add(BatchNormalization(mode=2))
recog2.add(Activation('relu'))
recog2.add(Convolution2D(1, 1, 1, init='glorot_uniform'))
recog2.add(Reshape((shape2, shape2, 1)))
recog2.add(Activation('relu'))

recog2.compile(loss='mean_squared_error', optimizer=sgd, metrics=['mae'])
recog2.summary()

# x_train2 (the target image, shape2 * shape2 pixels) is assumed to be loaded elsewhere
x_train3 = x_train2.reshape((1, shape2, shape2, 1))

recog2.fit(x_train, x_train3,
           nb_epoch=25,
           batch_size=30, verbose=1)

SegNet:

# same imports and Keras 1.1.0 API as the U-Net example above
shape = 60
batch_size = 30
nb_classes = 10
img_rows, img_cols = shape, shape
nb_filters = 32
pool_size = (2, 2)
kernel_size = (3, 3)
input_shape = (shape, shape, 1)
# output side length: 60 -> 57 (4x4 conv) -> 28 -> 14 -> 28 -> 56 -> 54 (final 3x3 conv)
shape2 = 54

reg = 0.001
learning_rate = 0.012
decay_rate = 5e-5
momentum = 0.9

sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=True)

# encoder trunk
recog0 = Sequential()
recog0.add(Convolution2D(20, 4, 4,
                         border_mode='valid',
                         input_shape=input_shape))
recog0.add(BatchNormalization(mode=2))
recog0.add(MaxPooling2D(pool_size=(2, 2)))

# as before, recog is an alias of recog0, not a copy
recog = recog0
recog.add(Activation('relu'))
recog.add(MaxPooling2D(pool_size=(2, 2)))
recog.add(UpSampling2D(size=(2, 2)))
recog.add(Convolution2D(20, 1, 1, init='glorot_uniform'))
recog.add(BatchNormalization(mode=2))
recog.add(Activation('relu'))

for i in range(0, 8):
    print(i, recog0.layers[i].name)

# read an intermediate layer's output (x_train assumed loaded, as above)
recog_res = recog0
part = 8
print(recog0.layers[part].name)
get_0_layer_output = K.function([recog0.layers[0].input, K.learning_phase()],
                                [recog0.layers[part].output])
get_0_layer_output([x_train, 0])[0][0]  # inspect the first sample's activations
pred = [np.argmax(get_0_layer_output([x_train, 0])[0][i]) for i in range(0, len(x_train))]

loss = x_train - pred
loss = loss.astype('float32')

# skip connection that subtracts the mean residual before the merge
recog_res.add(Lambda(lambda x: x - np.mean(loss), input_shape=(28, 28, 20), output_shape=(28, 28, 20)))

# decoder: sum-merge, upsample, project down to a single channel
recog2 = Sequential()
recog2.add(Merge([recog, recog_res], mode='sum'))
recog2.add(UpSampling2D(size=(2, 2)))
recog2.add(Convolution2D(1, 3, 3, init='glorot_uniform'))
recog2.add(BatchNormalization(mode=2))
recog2.add(Reshape((shape2 * shape2,)))
recog2.add(Reshape((shape2, shape2, 1)))
recog2.add(Activation('relu'))
recog2.compile(loss='mean_squared_error', optimizer=sgd, metrics=['mae'])
recog2.summary()

# x_train2 (the target image) is assumed to be loaded elsewhere
x_train3 = x_train2.reshape((1, shape2, shape2, 1))

recog2.fit(x_train, x_train3,
           nb_epoch=400,
           batch_size=30, verbose=1)

Then threshold the predicted output to separate the segmentation colours.
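For instance, a minimal sketch of that final thresholding step (the 0.5 cutoff is an assumed value; tune it to your own data):

import numpy as np

# binarise the network's reconstruction into two segments
prediction = recog2.predict(x_train)[0, :, :, 0]  # same input as the fit call above
mask = (prediction > 0.5).astype(np.uint8)  # assumed threshold of 0.5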
