How to write a conditional regression loss function based on binary cross-entropy loss for a Keras model

I am building a key-point detection system for human faces. The goal is to feed an image of a face into the model, which then detects anatomical landmarks in the image (eyes, nose) and outputs the pixel coordinates of the landmarks that are visible. There are three targets per landmark: x, y, and visible. x and y are the pixel coordinates, and visible indicates whether the landmark is in the image or not. The plan is to first compute a binary cross-entropy loss between the predicted visibility and the true visibility. The second loss is a regression loss (I'm using MAPE) between the predicted x, y coordinates and the targets. However, the regression loss should only be calculated for landmarks that are visible. The loss would look something like:

#Pseudo-code
def loss(y_true, y_pred):
    #Probability that landmark is in image
    #Compute binary cross entropy loss (needed in both branches)
    if y_true[2] == 1:
        #Landmark is visible: also compute MAPE regression loss
        total_loss = binary_loss + mape_loss
        return total_loss
    else:
        #Landmark is not visible: visibility loss only
        total_loss = binary_loss
        return total_loss

Once the loss function is written, how would I go about implementing it in code? I know how to create a model for each problem (predicting the coordinates, and separately predicting the visibility), but I'm not sure exactly how to combine the two heads with the conditional loss function. How would I combine the layers (Conv, Flatten, Dense for each head) to get the desired output? Thank you!

EDIT: I'm not able to upload the data, but here is an image of it. The first 9 columns are the coordinates and visibility of the landmarks. The last column is the corresponding image, which has been flattened. [image] When I load the data for training, these are the steps I follow:

import numpy as np
import pandas as pd

###Read in data file
file = "Directory/file.csv"
train_data = pd.read_csv(file)
###Convert each coordinate column to type float64
train_data['xreye'] = train_data['xreye'].astype(np.float64)
...
###Convert image column to string type
train_data['Image'] = train_data['Image'].astype(str)

#Image is feature, other values are labels to predict later
#Image column values are strings, also some missing values, have to split
##string by space and append it and handle missing values
imag = []
for i in range(len(train_data)):
    img = train_data['Image'][i].split(' ')
    img = ['0' if x == '' else x for x in img]      
    imag.append(img) 
#Convert to uint8 and reshape to (N, 256, 256, 1)
image_list = np.array(imag,dtype = 'uint8')
X_train = image_list.reshape(-1,256,256,1)

####Get pixel coordinates and visibility targets
training = train_data[['xreye','yreye','reyev','xleye','yleye','leyev','xtsept','ytsept','tseptv']]
y_train = []
for i in range(len(train_data)):
    y = training.iloc[i,:]
    y_train.append(y)

y_train = np.array(y_train, dtype='float')
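
As a quick sanity check of the image-parsing step, here is a minimal sketch run on a toy string. The helper name parse_pixels and the 2x2 size are illustrative assumptions and not part of the original code; the real images are 256x256.

```python
import numpy as np

def parse_pixels(pixel_string, height, width):
    """Turn a space-separated pixel string into a (height, width, 1) uint8 array."""
    tokens = pixel_string.split(' ')
    # An empty token marks a missing pixel; fill it with 0, as the loop above does.
    values = [0 if t == '' else int(t) for t in tokens]
    return np.array(values, dtype=np.uint8).reshape(height, width, 1)

# Toy 2x2 "image" whose third pixel is missing (note the double space).
img = parse_pixels('12 255  7', height=2, width=2)
print(img[..., 0])
```

Splitting on a single space (rather than calling split() with no argument) is what keeps the empty token in place, so a missing pixel does not shift the pixels that follow it.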

EDIT: Model code, loss function, and fit method.

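import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPool2D, Flatten, Dense
from tensorflow.keras.callbacks import ModelCheckpoint

###Loss function
visuals_mask = [False, False, True] * 3
def loss_func(y_true, y_pred):
    visuals_true = tf.boolean_mask(y_true, visuals_mask, axis=1)
    visuals_pred = tf.boolean_mask(y_pred, visuals_mask, axis=1)
    #Use the function form; the BinaryCrossentropy class is not called with (y_true, y_pred) in its constructor
    visuals_loss = tf.keras.losses.binary_crossentropy(visuals_true, visuals_pred)
    visuals_loss = tf.reduce_mean(visuals_loss)

    coords_true = tf.boolean_mask(y_true, ~np.array(visuals_mask), axis=1)
    coords_pred = tf.boolean_mask(y_pred, ~np.array(visuals_mask), axis=1)
    coords_loss = tf.keras.losses.mean_absolute_percentage_error(coords_true, coords_pred)
    coords_loss = tf.reduce_mean(coords_loss)

    return coords_loss + visuals_loss

####Model code
model = Sequential()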

model.add(Conv2D(32, (3,3), activation='relu', padding='same', use_bias=False, input_shape=(256,256,1)))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2,2)))

model.add(Conv2D(64, (3,3), activation='relu', padding='same', use_bias=False))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2,2)))

model.add(Conv2D(128, (3,3), activation='relu', padding='same', use_bias=False))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2,2)))

model.add(Flatten())

model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(9, activation='linear'))
model.summary()
model.compile(optimizer='adam', loss=loss_func)

###Model fit
checkpointer = ModelCheckpoint('C:/Users/Cloud/.spyder-py3/x_y_shift/weights/vis_coords_TEST.hdf5', monitor='val_loss', verbose=1, mode = 'min', save_best_only=True)
out = model.fit(X_train,y_train,epochs=5,batch_size=4,validation_split=0.1, verbose=1, callbacks=[checkpointer])

I can't be sure because I don't have data to reproduce the problem, but these are the steps I have in mind:

  1. Use boolean masking to get indexes 2, 5 and 8 from the output:
visuals_mask_ = [False, False, True] * 3

# in the loss function
visuals_true = tf.boolean_mask(y_true, visuals_mask_, axis=1) # do the same with preds
  2. Compute the loss for the visuals:
visuals_loss = binary_crossentropy(visuals_true, visuals_pred) # use sparse if that's the case
  3. Get the coordinates' outputs just like we did for the visuals, but with the inverted visuals_mask_. I believe tf.boolean_mask(y_true, tf.math.logical_not(visuals_mask_), axis=1) should work.
  4. Compute MAPE for the rest (coords_true and coords_pred).
  5. Get the means of both losses with tf.reduce_mean.
  6. Sum the losses and return the result.

I hope this provides some insight.

Edit: I tried the following and it seems to work:

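import numpy as np
import tensorflow as tf
from tensorflow.keras.losses import binary_crossentropy, mean_absolute_percentage_error

y_true = tf.convert_to_tensor(np.random.rand(32, 9))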
y_pred = tf.convert_to_tensor(np.random.rand(32, 9))

visuals_mask = [False, False, True] * 3

def loss_func(y_true, y_pred):
    visuals_true = tf.boolean_mask(y_true, visuals_mask, axis=1)
    visuals_pred = tf.boolean_mask(y_pred, visuals_mask, axis=1)
    visuals_loss = binary_crossentropy(visuals_true, visuals_pred)
    visuals_loss = tf.reduce_mean(visuals_loss)

    coords_true = tf.boolean_mask(y_true, ~np.array(visuals_mask), axis=1)
    coords_pred = tf.boolean_mask(y_pred, ~np.array(visuals_mask), axis=1)
    coords_loss = mean_absolute_percentage_error(coords_true, coords_pred)
    coords_loss = tf.reduce_mean(coords_loss)

    return coords_loss + visuals_loss

loss_func(y_true, y_pred)

What I assumed here is:

  • Your output actually has a length of 9, i.e. shape (batch_size, 9).
  • Custom loss calculations may differ between this demonstration and actual training because of eager execution.
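
Note that loss_func adds the MAPE term for every landmark, whereas the question asks for the regression loss to count only where the landmark is visible. A minimal sketch of one way to do that follows. It assumes the targets are laid out as [x, y, visible] per landmark, that the predicted visibility is already a probability in (0, 1) (for example from a sigmoid), and that the name masked_landmark_loss and the eps value are illustrative choices rather than anything from the original post.

```python
import tensorflow as tf

def masked_landmark_loss(y_true, y_pred, eps=1e-7):
    """y_true and y_pred have shape (batch, 9): [x, y, visible] for 3 landmarks."""
    # Regroup to (batch, landmark, value) so each role can be sliced directly.
    y_true = tf.reshape(tf.cast(y_true, tf.float32), (-1, 3, 3))
    y_pred = tf.reshape(tf.cast(y_pred, tf.float32), (-1, 3, 3))

    # Binary cross-entropy on visibility, computed for every landmark.
    vis_true = y_true[..., 2]
    vis_pred = tf.clip_by_value(y_pred[..., 2], eps, 1.0 - eps)
    bce = -(vis_true * tf.math.log(vis_pred)
            + (1.0 - vis_true) * tf.math.log(1.0 - vis_pred))
    bce = tf.reduce_mean(bce)

    # Absolute percentage error per landmark, averaged over x and y.
    coords_true = y_true[..., :2]
    coords_pred = y_pred[..., :2]
    ape = 100.0 * tf.abs(coords_true - coords_pred) / tf.maximum(tf.abs(coords_true), eps)
    ape = tf.reduce_mean(ape, axis=-1)

    # Weight by true visibility so hidden landmarks contribute nothing;
    # divide_no_nan yields 0 if no landmark in the batch is visible.
    mape = tf.math.divide_no_nan(tf.reduce_sum(ape * vis_true), tf.reduce_sum(vis_true))
    return bce + mape
```

Multiplying by the visibility target plays the role of the if/else in the pseudo-code while staying differentiable and batch-friendly; it could be passed to model.compile in the same way as loss_func.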

Edit 2: I tried it with this kind of model and it seems to work:

model = Sequential()

model.add(Conv2D(4, 10, data_format='channels_last', input_shape=(256, 256, 1)))
model.add(Flatten())
model.add(Dense(9, activation='sigmoid'))

model.compile('adam', loss=loss_func)
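
The model above applies a sigmoid to all nine outputs, which bounds the coordinate outputs to (0, 1) as well. For the two-head layout the question asks about, a rough sketch with the Keras functional API might look like the following. The shared trunk is copied from the model above; the head sizes, the linear activation on the coordinate head, and the reshape/concatenate trick to restore the [x, y, visible] ordering are assumptions made for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(256, 256, 1))

# Shared trunk (same toy layers as the Sequential model above).
x = layers.Conv2D(4, 10)(inputs)
x = layers.Flatten()(x)

# Head 1: unbounded x, y coordinates for 3 landmarks.
coords = layers.Dense(6, activation='linear')(x)
# Head 2: visibility probability for each of the 3 landmarks.
vis = layers.Dense(3, activation='sigmoid')(x)

# Interleave the heads back into [x, y, visible] per landmark.
coords = layers.Reshape((3, 2))(coords)
vis = layers.Reshape((3, 1))(vis)
merged = layers.Concatenate(axis=-1)([coords, vis])  # (batch, 3, 3)
outputs = layers.Reshape((9,))(merged)               # (batch, 9)

two_head_model = tf.keras.Model(inputs, outputs)
```

Because the output keeps the (batch_size, 9) shape and column order of y_train, a loss written for that layout, such as loss_func, can be passed to two_head_model.compile unchanged.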
