
How does Keras set the dimensions in this network which has CNN and dense layers?

I need some help to understand what's going on here.

My goal is to have a network that receives sizeXsize images and returns sizeXsize binary matrices. The output of the network should be a binary sizeXsize matrix that indicates whether or not each pixel has a feature.

For example, think of a corner detection network where the output layer tells whether a pixel is exactly the tip of a corner. Namely, we want to detect only the pixel of this corner:

[image not available]

The first layers in the network are defined as follows:

from keras import models, layers
import numpy as np

size = 5

input_image = layers.Input(shape=(size, size, 1))

b = layers.Conv2D(5, (3, 3), activation='relu', padding='same')(input_image)
b = layers.MaxPooling2D((2, 2), strides=1, padding='same')(b)
b = layers.Conv2D(5, (3, 3), activation='relu', padding='same')(b)
b_out = layers.MaxPooling2D((2, 2), strides=1, padding='same')(b)

Until now I have maintained the dimensions of the original input layer (sizeXsize).
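As a rough check of why the size is preserved, here is a minimal sketch of the shape arithmetic, assuming the usual rule that a layer with padding='same' produces ceil(input_size / stride) outputs along each spatial axis. The helper name same_padding_output_size is made up for illustration and is not part of Keras.

```python
import math


def same_padding_output_size(input_size, stride):
    """Spatial output size of a conv/pooling layer that uses padding='same'."""
    return math.ceil(input_size / stride)


size = 5
shape = size
# Conv2D -> MaxPooling2D -> Conv2D -> MaxPooling2D, all with stride 1
for layer_stride in (1, 1, 1, 1):
    shape = same_padding_output_size(shape, layer_stride)

print(shape)  # 5: every layer keeps the spatial size
```

With a stride of 2 the same rule would give ceil(5 / 2) = 3, so it is the combination of padding='same' and strides=1 that keeps the feature maps at sizeXsize.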

Now I would like to have a dense layer as the output layer with sizeXsize pixels.

If I use output = layers.Dense(size, activation='sigmoid')(b_out), the layer that is built has shape sizeXsizeXsize, and if I use output = layers.Dense(1, activation='sigmoid')(b_out), the shape is sizeXsize. How come?

This is the building and compilation part of the code:

model = models.Model(input_image, output)
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()

What am I missing here? Isn't output = layers.Dense(1, activation='sigmoid')(b_out) just a single neuron?

The thing is that if I train:

n_images = 100
data = np.random.randint(0, 2, (n_images, size, size, 1))
# the labels are the input itself, so the network only has to learn the identity mapping
labels = data

model.fit(data, labels, verbose=1, batch_size=4, epochs=20)

and if I test it:

data1 = np.random.randint(0,2,(n_images,size,size,1))
score, acc = model.evaluate(data1,data1, verbose=1)


print('Test score:', score)
print('Test accuracy:', acc)

a=np.random.randint(0,2,(1,size,size,1))
prediction = model.predict(a)

print(a==np.round(prediction))

I get good accuracy, and the sizes seem to be correct for the output layer:

100/100 [==============================] - 0s 349us/step
Test score: 0.187119951248
Test accuracy: 0.926799981594
[[[[ True]
   [ True]
   [ True]
   [ True]
   [ True]]

  [[ True]
   [ True]
   [ True]
   [ True]
   [ True]]

  [[ True]
   [ True]
   [ True]
   [ True]
   [ True]]

  [[ True]
   [ True]
   [ True]
   [ True]
   [ True]]

  [[ True]
   [ True]
   [ True]
   [ True]
   [ True]]]]

If I read the Dense documentation:

units: Positive integer, dimensionality of the output space.

So how come, if I use layers.Dense(1, activation='sigmoid')(b_out), I get an output layer of sizeXsize?

The trick is not to use the conventional Dense layer but a convolutional layer with kernel size (1,1), i.e. you need something like the following:

b = layers.Conv2D(5, (3, 3), activation='relu', padding='same')(input_image)
b = layers.MaxPooling2D((2, 2), strides=1, padding='same')(b)
b = layers.Conv2D(5, (3, 3), activation='relu', padding='same')(b)
b = layers.MaxPooling2D((2, 2), strides=1, padding='same')(b)
# use a Conv2D layer with a 1x1 kernel instead of Dense
binary_out = layers.Conv2D(1, (1,1), activation='sigmoid', padding='same')(b)
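To see what a (1,1) convolution with one filter computes, here is a small NumPy sketch (not Keras code) that writes it out as one dot product over the channels per pixel and compares it with a matrix product over the last axis. The random arrays and the names x, kernel and bias are assumptions for illustration, and the sigmoid activation is left out.

```python
import numpy as np

rng = np.random.default_rng(0)
size, channels = 5, 5

x = rng.normal(size=(size, size, channels))     # one feature map, shaped like b
kernel = rng.normal(size=(1, 1, channels, 1))   # layout: (height, width, in, out)
bias = rng.normal(size=(1,))

# 1x1 convolution written out explicitly: a dot product over channels at each pixel
conv_out = np.zeros((size, size, 1))
for i in range(size):
    for j in range(size):
        conv_out[i, j, 0] = x[i, j, :] @ kernel[0, 0, :, 0] + bias[0]

# The same weights applied along the last axis in a single matrix product
dense_out = x @ kernel.reshape(channels, 1) + bias

print(conv_out.shape)                    # (5, 5, 1)
print(np.allclose(conv_out, dense_out))  # True
```

Under these assumptions the two computations agree, which is why a one-unit layer acting on the last axis and a one-filter (1,1) convolution produce outputs of the same shape.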

Your confusion stems from the fact that the Dense layer is currently implemented such that it is applied on the last axis of the input data. That's why, when you feed the output of the MaxPooling layer (i.e. b_out), which has a shape of (size, size, 5), to a Dense layer with one unit, you get an output of shape (size, size, 1). In this case, the single neuron in the Dense layer is connected to each of the 5 channel values at every pixel, and the same weights are reused at every pixel position (that's why, if you take a look at the summary() output, you will see that the Dense layer has 6 parameters: 5 weights plus one bias).
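The shapes and parameter counts can be reproduced with a short NumPy sketch. It only mimics a Dense layer without activation, and the helper dense_last_axis is hypothetical, not a Keras function.

```python
import numpy as np


def dense_last_axis(x, units, rng):
    """Mimic Dense(units) without activation: the weights act on the last axis only."""
    in_features = x.shape[-1]
    weights = rng.normal(size=(in_features, units))
    bias = np.zeros(units)
    out = x @ weights + bias           # leading axes (size, size) are left untouched
    n_params = weights.size + bias.size
    return out, n_params


rng = np.random.default_rng(0)
size = 5
b_out = rng.normal(size=(size, size, 5))   # stand-in for the pooling output

out_one, params_one = dense_last_axis(b_out, 1, rng)
out_size, params_size = dense_last_axis(b_out, size, rng)

print(out_one.shape, params_one)    # (5, 5, 1) 6
print(out_size.shape, params_size)  # (5, 5, 5) 30
```

The number of units only changes the length of the last axis, which matches the sizeXsize and sizeXsizeXsize outputs described in the question.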

You can either use a Dense layer (with one unit) or a Conv2D layer (with one filter) as the last layer. If you ask which one works better, the answer is that it depends on the specific problem you are working on and the data you have. However, you can take some ideas from image segmentation networks, where the image is first processed with a combination of Conv2D and MaxPooling2D layers (and its dimension is reduced as we go forward in the model), and then some upsampling layers and Conv2D layers are used to get back an image with the same size as the input image. Here is a sketch (though you don't need to use TimeDistributed and LSTM layers for your case).
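One practical point with such a downsample-then-upsample design is whether the output really comes back to the input size. The following plain-Python sketch traces the spatial size, assuming 2x2 max pooling with default strides and padding='valid' on the way down and 2x upsampling on the way up. The function names are made up for illustration.

```python
def pooled(n, pool=2):
    # size after max pooling with pool size = stride = pool and padding='valid'
    return n // pool


def upsampled(n, factor=2):
    # size after upsampling that repeats each row and column `factor` times
    return n * factor


def round_trip(size, depth=2):
    """Spatial size after `depth` pooling steps followed by `depth` upsampling steps."""
    n = size
    for _ in range(depth):
        n = pooled(n)
    for _ in range(depth):
        n = upsampled(n)
    return n


print(round_trip(8))  # 8: divisible by 2**depth, so the size is recovered
print(round_trip(5))  # 4: size 5 does not survive the round trip
```

Under these assumptions, an input such as size = 5 would need padding or cropping somewhere in the network to end up with a sizeXsize output again.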
