How to build a multi-class convolutional neural network with Keras

I am trying to implement a U-Net in Keras with the TensorFlow backend for an image segmentation task. The network takes images of size (128, 96) as input, together with mask images of shape (12288, 6), since the masks are flattened. I have 6 different classes (0-5), which gives the second dimension of the mask shape. The masks have been encoded to one-hot labels using the to_categorical() function. At the moment I use just a single input image and also use the same one as validation and test data.
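In outline, the masks are prepared roughly like this (a simplified sketch with a random placeholder mask; the real data loading is omitted):

import numpy as np
from keras.utils import to_categorical

# Placeholder for a real label mask of shape (128, 96) with integer classes 0-5.
mask = np.random.randint(0, 6, size=(128, 96))

# Flatten and one-hot encode: result has shape (12288, 6), matching the model output.
mask_one_hot = to_categorical(mask.flatten(), num_classes=6)

# The corresponding input image gets a channel axis: (128, 96) -> (1, 128, 96, 1).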

I would like the U-Net to perform image segmentation, where class 0 corresponds to the background. When I train my U-Net for only a few epochs (1-10), the resulting predicted mask seems to assign random classes to each pixel. When I train the network longer (50+ epochs), all pixels are classified as background. Since I train and test on the same image, I find this very strange, as I was expecting the network to overfit. How can I fix this problem? Could there be something wrong with the way I feed the mask images and the real images to the network?

I have tried manually giving weights to the network to put less emphasis on the background than on the other classes, and I have tried different combinations of losses, different ways of shaping the mask image, and many other things, but nothing gave good results.
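One way to implement such manual class weighting in Keras is a custom weighted categorical crossentropy. The sketch below is only illustrative: the weight values are made up and it assumes the output approximates a per-pixel probability distribution.

import numpy as np
import keras.backend as K

def weighted_categorical_crossentropy(weights):
    # `weights` is a vector of length 6, e.g. a lower weight for background (class 0).
    weights = K.constant(weights)

    def loss(y_true, y_pred):
        # y_true and y_pred have shape (batch, 12288, 6).
        y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
        # Weight each pixel's log-likelihood by the weight of its true class.
        weighted = y_true * K.log(y_pred) * weights
        return -K.sum(weighted, axis=-1)

    return loss

# Illustrative weights: background counts less than the other classes.
loss_fn = weighted_categorical_crossentropy(np.array([0.1, 1.0, 1.0, 1.0, 1.0, 1.0]))
# model.compile(optimizer=Adam(lr=1e-5), loss=loss_fn, metrics=['accuracy'])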

Below is the code of my network. It is based on the U-Net taken from this repository. I managed to train it for the two-class case with good results, but I don't know how to extend it to more classes.

# Imports required to run this snippet (assuming standalone Keras with the TensorFlow backend):
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate, Reshape
from keras.optimizers import Adam

def get_unet(self):

    inputs = Input((128, 96, 1))
    #Input shape=(?,128,96,1)

    conv1 = Conv2D(64, (3,3), activation = 'relu', padding = 'same',
      kernel_initializer = 'he_normal', input_shape=(None,128,96,6))(inputs)
    #Conv1 shape=(?,128,96,64)
    conv1 = Conv2D(64, (3,3), activation = 'relu', padding = 'same',
          kernel_initializer = 'he_normal')(conv1)
    #Conv1 shape=(?,128,96,64)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    #pool1 shape=(?,64,48,64)


    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(pool1)
    #Conv2 shape=(?,64,48,128)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(conv2)
    #Conv2 shape=(?,64,48,128)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    #Pool2 shape=(?,32,24,128)

    conv5 = Conv2D(256, (3,3), activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(pool2)
    conv5 = Conv2D(256, (3,3), activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(conv5)
    #Conv5 shape=(?,32,24,256)

    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv5))
    #Up8 shape=(?,64,48,128)
    merge8 = concatenate([conv2,up8], axis = 3)
    #Merge8 shape=(?,64,48,256)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(conv8)
    #Conv8 shape=(?,64,48,128)


    up9 = Conv2D(64, (2,2), activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv8))
    #Up9 shape=(?,128,96,64)
    merge9 = concatenate([conv1,up9], axis = 3)
    #Merge9 shape=(?,128,96,128)
    conv9 = Conv2D(64, (3,3), activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(merge9)
    conv9 = Conv2D(64, (3,3), activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(conv9)
    conv9 = Conv2D(6, (3,3), activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(conv9)
    #Conv9 shape=(?,128,96,6)

    conv10 = Conv2D(6, (1,1), activation = 'sigmoid')(conv9)
    conv10 = Reshape((128*96,6))(conv10)
    #Conv10 shape=(?,12288,6), matching the flattened one-hot masks

    model = Model(input = inputs, output = conv10)
    model.compile(optimizer = Adam(lr = 1e-5), loss = 'binary_crossentropy',
          metrics = ['accuracy'])

    return model

Can anyone point out what is wrong with my model?

Thank you @Daniel, your suggestions helped me in the end to get the U-Net to work. When running 500+ epochs I managed to get results that did not just classify the whole image as background. Also, instead of kernel_initializer='he_normal', kernel_initializer='zeros' or kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.07) worked for me. I used the 'sigmoid' activation function and loss='binary_crossentropy', and kept the 'relu' activation for all the hidden convolutional layers. I noticed that the network sometimes gets stuck in a local minimum where the loss does not improve anymore, so I need to restart training.
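Roughly, the changes amount to the following (only the final layers and the compile call are shown, and exactly which layers received the new initializer is simplified here):

from keras.initializers import TruncatedNormal

init = TruncatedNormal(mean=0.0, stddev=0.07)   # instead of 'he_normal'

conv9 = Conv2D(64, (3,3), activation='relu', padding='same',
               kernel_initializer=init)(merge9)
conv9 = Conv2D(6, (3,3), activation='relu', padding='same',
               kernel_initializer=init)(conv9)

conv10 = Conv2D(6, (1,1), activation='sigmoid')(conv9)   # sigmoid on the output layer
conv10 = Reshape((128*96, 6))(conv10)

model = Model(input=inputs, output=conv10)
model.compile(optimizer=Adam(lr=1e-5), loss='binary_crossentropy',
              metrics=['accuracy'])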

I don't see your prediction layer, which as far as I know must be a dense layer and not a convolutional layer. Maybe that's your problem.

In my experience, also with a U-Net for segmentation, it tends to do this:

  • Go totally black or totally white
  • After a long time in which the loss seems to be frozen, it finds its way.

I also use the "train on just one image" method to find that convergence; then adding the other images is OK.

But I had to try many times, and the only time it worked pretty fast was when I used:

  • final activation = 'sigmoid'
  • loss = 'binary_crossentropy'

But I wasn't using "relu" anywhere... perhaps that influences the convergence speed a little? Thinking about "relu", which outputs only zero or positive values, there is a big region of this function that has no gradient. Maybe having lots of "relu" activations creates a lot of "flat" areas without gradients? (I must think more about it to confirm.)

Try a few times (and have the patience to wait for many, many epochs) with different weight initializations.

There is also a chance that your learning rate is too big.


About to_categorical(): have you tried to plot/print your masks? Do they really look like what you expect them to?
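For instance, a quick sanity check like this (assuming the flattened masks sit in a NumPy array called masks) shows whether each class channel looks sensible:

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical: `masks` has shape (num_samples, 12288, 6), as fed to the network.
mask = masks[0].reshape(128, 96, 6)

fig, axes = plt.subplots(1, 6, figsize=(18, 3))
for c in range(6):
    axes[c].imshow(mask[:, :, c], cmap='gray')   # each channel should be a binary map
    axes[c].set_title('class {}'.format(c))
plt.show()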

