Keras: Using Dice coefficient Loss Function, val loss is not improving

Problem

I am doing two-class image segmentation and want to use a Dice coefficient loss function. However, the validation loss does not improve. How can I solve this problem?

What I did

Using one-hot encoding, I processed the label images so that they do not include the background label.

Code

Shape of X is (num of data, 256, 256, 1)  # grayscale

Shape of y is (num of data, 256, 256, 2)  # two classes, background label excluded

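import numpy as np
from tensorflow.keras.utils import to_categorical

# assuming y holds integer label maps with 0 = background, 1 and 2 = the two classes
one_hot_y = np.zeros((len(y), image_height, image_width, 2))
for i in range(len(y)):
  one_hot = to_categorical(y[i])    # one channel per label value
  one_hot_y[i] = one_hot[:,:,1:]    # drop channel 0 (background)
one_hot_y.shape  #->  (566, 256, 256, 2)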
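
One thing worth checking with this encoding, offered as a hedged observation rather than a diagnosis: once channel 0 is dropped, every background pixel has the all-zero target [0, 0], while a two-channel softmax output always sums to 1 per pixel. The plain-Python sketch below shows the mismatch for a single pixel. The helpers `one_hot` and `softmax` are illustrative stand-ins, not Keras calls, and the label convention (0 = background) is an assumption.

```python
import math

def one_hot(label, num_classes):
    """Return a one-hot list for an integer label (illustrative helper)."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed label convention: 0 = background, 1 and 2 = the two classes.
# Dropping the first channel mirrors one_hot[:,:,1:] in the question.
targets = {label: one_hot(label, 3)[1:] for label in (0, 1, 2)}
# targets[0] == [0.0, 0.0]: a background pixel has no "hot" channel left.

# Whatever logits the network produces, the softmax output sums to 1,
# so it can never equal the all-zero background target.
pred = softmax([-3.0, 2.0])
print(targets[0], pred, sum(pred))
```

If this applies to your data, two commonly used options are to keep the background channel (three softmax outputs) or to use a per-channel sigmoid in the last layer. Which one fits is a modeling choice that the thread does not settle.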

#### <-- Unet Model --> ####

from tensorflow import keras
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Concatenate, Conv2DTranspose
from tensorflow.keras import Model

def unet(image_height, image_width, num_classes):
    # inputs = Input(input_size)
    inputs = Input(shape=(image_height, image_width, 1),name='U-net')
    
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(pool1)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(pool2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

    conv4 = Conv2D(256, (3, 3), activation='relu', padding='same')(pool3)
    conv4 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)

    conv5 = Conv2D(512, (3, 3), activation='relu', padding='same')(pool4)
    conv5 = Conv2D(512, (3, 3), activation='relu', padding='same')(conv5)

    up6 = Concatenate()([Conv2DTranspose(256, (2, 2), strides=(2, 2), padding='same')(conv5), conv4])
    conv6 = Conv2D(256, (3, 3), activation='relu', padding='same')(up6)
    conv6 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv6)

    up7 = Concatenate()([Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv6), conv3])
    conv7 = Conv2D(128, (3, 3), activation='relu', padding='same')(up7)
    conv7 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv7)

    up8 = Concatenate()([Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(conv7), conv2])
    conv8 = Conv2D(64, (3, 3), activation='relu', padding='same')(up8)
    conv8 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv8)

    up9 = Concatenate()([Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(conv8), conv1])
    conv9 = Conv2D(32, (3, 3), activation='relu', padding='same')(up9)
    conv9 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv9)

    outputs = Conv2D(num_classes, (1, 1), activation='softmax')(conv9)
    
    return Model(inputs=[inputs], outputs=[outputs])


#### <-- Dice Score --> ####

from tensorflow.keras import backend as K
def dice_coef(y_true, y_pred):
  y_true_f = K.flatten(y_true)
  y_pred_f = K.flatten(y_pred)
  intersection = K.sum(y_true_f * y_pred_f)
  return (2. * intersection + 0.0001) / (K.sum(y_true_f) + K.sum(y_pred_f) + 0.0001)

def dice_coef_loss(y_true, y_pred):
  return 1 - dice_coef(y_true, y_pred)


#### <-- Fit the Model --> ####

from tensorflow.keras import optimizers
adam = optimizers.Adam(learning_rate=0.0001)
unet_model.compile(optimizer=adam, loss=[dice_coef_loss],metrics=[dice_coef])
hist = unet_model.fit(X_train,y_train, epochs=epochs, batch_size=batch_size,validation_data=(X_val,y_val), callbacks=[checkpoint,earlystopping])

I tried to reproduce your experiment. I used the Oxford-IIIT Pets dataset, whose labels have three classes: 1: Foreground, 2: Background, 3: Not classified. If class 1 ("Foreground") is removed, as you did, the val_loss does not change over the epochs. If the "Not classified" class is removed instead, the optimization seems to work. The model fails to discriminate between "Background" and "Not classified", which is plausible.
Besides, there is a small error in the calculation of the Dice coefficient: in the denominator, you need to take the sum of the squares. This changes nothing for y_true, whose entries are 0 or 1, but it does for y_pred.
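
To illustrate the point about the denominator, here is a small plain-Python sketch. The function names `dice_sum` and `dice_squares` and the four pixel values are made up for illustration. Squaring leaves a 0/1 target unchanged but shrinks soft predictions, so the two variants give different values for the same inputs.

```python
def dice_sum(y_true, y_pred, eps=1e-4):
    """Dice with plain sums in the denominator, as in the question."""
    inter = sum(t * p for t, p in zip(y_true, y_pred))
    return (2.0 * inter + eps) / (sum(y_true) + sum(y_pred) + eps)

def dice_squares(y_true, y_pred, eps=1e-4):
    """Dice with sums of squares in the denominator."""
    inter = sum(t * p for t, p in zip(y_true, y_pred))
    denom = sum(t * t for t in y_true) + sum(p * p for p in y_pred)
    return (2.0 * inter + eps) / (denom + eps)

# Made-up example: four pixels, binary target, soft prediction.
y_true = [1.0, 1.0, 0.0, 0.0]
y_pred = [0.8, 0.6, 0.3, 0.1]

# Squaring a 0/1 target changes nothing.
print(sum(t * t for t in y_true) == sum(y_true))
# Squaring soft predictions shrinks their sum.
print(sum(p * p for p in y_pred) < sum(y_pred))
# The two coefficients therefore differ on the same inputs.
print(dice_sum(y_true, y_pred), dice_squares(y_true, y_pred))
```

Both forms are in use in practice. The sketch only shows that they are not numerically equivalent once the predictions are soft probabilities.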

I can't say why your code doesn't work, but I can tell you how I do it. The differences are that I exclude the background and encode the target inside the Dice coefficient function.

I define my Dice coefficient as follows:

import numpy as np
from tensorflow.keras import backend as K

def dice_coef(y_true, y_pred, smooth=1):
    # flatten
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    # one-hot encode y with 3 labels: 0=background, 1=label1, 2=label2
    y_true_f = K.one_hot(K.cast(y_true_f, np.uint8), 3)
    y_pred_f = K.one_hot(K.cast(y_pred_f, np.uint8), 3)
    # calculate intersection and union excluding background using y[:,1:]
    intersection = K.sum(y_true_f[:,1:]* y_pred_f[:,1:], axis=[-1])
    union = K.sum(y_true_f[:,1:], axis=[-1]) + K.sum(y_pred_f[:,1:], axis=[-1])
    # apply dice formula
    dice = K.mean((2. * intersection + smooth)/(union + smooth), axis=0)
    return dice

def dice_loss(y_true, y_pred):
    # dice_coef must be called with both tensors
    return 1 - dice_coef(y_true, y_pred)
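
As a rough illustration of the idea in this answer (one-hot encode integer labels inside the function and ignore the background column), here is a plain-Python sketch. It is not the answer's exact formula: it assumes `y_true` and `y_pred` are flat lists of integer labels with 0 as background, and it pools all foreground classes into a single score. The name `foreground_dice` and the six example labels are hypothetical.

```python
def foreground_dice(y_true, y_pred, num_classes=3, smooth=1.0):
    """Dice over the non-background classes of two integer label lists."""
    intersection = 0.0
    total = 0.0  # sum of both foreground sizes (called "union" in the answer)
    for t, p in zip(y_true, y_pred):
        # One-hot encode both labels, then drop column 0 (background).
        t_hot = [1.0 if k == t else 0.0 for k in range(num_classes)][1:]
        p_hot = [1.0 if k == p else 0.0 for k in range(num_classes)][1:]
        intersection += sum(a * b for a, b in zip(t_hot, p_hot))
        total += sum(t_hot) + sum(p_hot)
    return (2.0 * intersection + smooth) / (total + smooth)

# Made-up labels for six pixels: 0 = background, 1 and 2 = classes.
y_true = [0, 1, 1, 2, 2, 0]
y_pred = [0, 1, 2, 2, 2, 1]
score = foreground_dice(y_true, y_pred)
print(score)
```

Because it works on hard integer labels, a function like this is usable as an evaluation metric. Casting predictions to integers is not differentiable, so a training loss would need soft (probability) inputs instead.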

I was also confused about this problem until I understood the following code:

import numpy as np
from keras import backend as K


def dice_loss(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f* y_pred_f)
    val = (2. * intersection + K.epsilon()) / (K.sum(y_true_f * y_true_f) + K.sum(y_pred_f * y_pred_f) + K.epsilon())
    return 1. - val


arr1 = np.array([[[9.6, 0.6, 0.3],
                  [0.3, 0.5, 0.5]],
                 [[0.5, 0.5, 0.5],
                  [0.5, 0.5, 0.5]],
                 [[0.5, 0.5, 0.5],
                  [0.5, 0.5, 0.5]],
                 [[0.5, 0.5, 0.5],
                  [0.5, 0.5, 0.5]]])

arr2 = np.array([[[9.6, 0.6, 0.3],
                  [0.3, 0.5, 0.5]],
                 [[0.5, 0.5, 0.5],
                  [0.5, 0.5, 0.5]],
                 [[0.5, 0.5, 0.5],
                  [0.5, 0.5, 0.5]],
                 [[0.5, 0.5, 0.5],
                  [0.5, 0.5, 0.5]]])

loss = dice_loss(arr1,arr2)
print(loss)
