
Understanding the Keras MNIST Siamese Network and adapting it for triplets

I am currently adapting this Siamese network in Python with Keras. However, I do not understand how the loss works (not the function itself, but which parameters get passed where).

Okay, here is how I think this works, step by step:

distance = Lambda(euclidean_distance,
              output_shape=eucl_dist_output_shape)([processed_a, processed_b])

This is the line where the outputs of the two individual networks are combined; the custom Lambda layer applies the following functions:

def euclidean_distance(vects):
    x, y = vects
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    return K.sqrt(K.maximum(sum_square, K.epsilon()))


def eucl_dist_output_shape(shapes):
    shape1, shape2 = shapes
    return (shape1[0], 1)
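As a quick shape check (a NumPy stand-in for the Keras backend ops, purely to illustrate the shapes involved), two batches of 128-D embeddings go in and one column of distances comes out:

```python
import numpy as np

def euclidean_distance_np(x, y, eps=1e-7):
    # Sum squared differences over the feature axis, keeping a column vector,
    # then clamp before the sqrt, like K.maximum(sum_square, K.epsilon()) does.
    sum_square = np.sum(np.square(x - y), axis=1, keepdims=True)
    return np.sqrt(np.maximum(sum_square, eps))

a = np.random.rand(128, 128)  # batch of 128 embeddings, each 128-D
b = np.random.rand(128, 128)
d = euclidean_distance_np(a, b)
print(d.shape)  # (128, 1): one distance per pair in the batch
```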

So when each of the two inputs to this layer has shape (128, 128), i.e. a batch of 128 embeddings of length 128, the output has shape (128, 1). In the last step the loss is calculated with:

def contrastive_loss(y_true, y_pred):
    '''Contrastive loss from Hadsell-et-al.'06
    http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
    '''
    margin = 1
    square_pred = K.square(y_pred)
    margin_square = K.square(K.maximum(margin - y_pred, 0))
    return K.mean(y_true * square_pred + (1 - y_true) * margin_square)
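For reference, a NumPy rendering of the same loss (hypothetical helper name, for illustration only) makes the two roles explicit: y_true is a binary same/different label per pair, and y_pred is the scalar distance produced by the Lambda layer:

```python
import numpy as np

def contrastive_loss_np(y_true, y_pred, margin=1.0):
    # y_true: 1 for similar pairs, 0 for dissimilar pairs
    # y_pred: the per-pair distances coming out of the distance layer
    square_pred = np.square(y_pred)                            # pulls similar pairs together
    margin_square = np.square(np.maximum(margin - y_pred, 0))  # pushes dissimilar pairs apart
    return np.mean(y_true * square_pred + (1 - y_true) * margin_square)

# one similar pair at distance 0.2, one dissimilar pair at distance 0.3
print(contrastive_loss_np(np.array([1.0, 0.0]), np.array([0.2, 0.3])))
```

Note that dissimilar pairs already separated by more than the margin contribute nothing to the loss.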

Here, y_pred is the scalar distance computed by the Lambda layer and y_true is the binary ground-truth label for the pair (1 for similar, 0 for dissimilar).

Now I changed the Lambda layer to:

distance = Lambda(euclidean_distance,
                  output_shape=eucl_dist_output_shape)([processed_a, processed_b, processed_c])

so I now have three networks, with the following adapted functions (which should just combine the three outputs into one output with a shape of (128, 3)):

def euclidean_distance(vects):
    return vects


def eucl_dist_output_shape(shapes):
    shape1, shape2, shape3 = shapes
    return (shape1, shape2, shape3)

and then the new loss function:

def loss_desc_triplet(vects, margin=5):
    """Triplet loss.
    """
    d1, d2, d3 = vects
    d_pos = K.sqrt(K.sum(K.square(d1 - d2), axis=1))
    pair_dist_1_to_3 = K.sqrt(K.sum(K.square(d1 - d3), axis=1))
    d_neg = pair_dist_1_to_3

    return K.relu(d_pos - d_neg + margin)

But now I get this error:

File "DeepLearningWithAugmentationWithTriplets.py", line 233, in <module>
    output_shape=eucl_dist_output_shape)([processed_a, processed_b, processed_c])
File "lib/python3.7/site-packages/keras/engine/base_layer.py", line 497, in __call__
    arguments=user_kwargs)
File "lib/python3.7/site-packages/keras/engine/base_layer.py", line 565, in _add_inbound_node
    output_tensors[i]._keras_shape = output_shapes[i]
IndexError: list index out of range

But I am not sure what causes this.

I fixed the problem by concatenating the outputs:

merged_vector = concatenate([processed_a, processed_b, processed_c], axis=-1, name='merged_layer')

and then disassembling the vector in my loss function:

d1 = y_pred[:,0:128]
d2 = y_pred[:,128:256]
d3 = y_pred[:,256:384]

Though, I am not sure if that is the best solution.
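Putting the workaround together, here is a self-contained NumPy sketch of the concatenate-then-slice pattern (hypothetical function name; batch size chosen arbitrarily; margin matching the value above):

```python
import numpy as np

def triplet_loss_np(y_pred, margin=5.0):
    # y_pred: (batch, 384) - anchor, positive and negative embeddings
    # concatenated along the feature axis, as the merge layer produces them
    d1 = y_pred[:, 0:128]    # anchor
    d2 = y_pred[:, 128:256]  # positive
    d3 = y_pred[:, 256:384]  # negative
    d_pos = np.sqrt(np.sum(np.square(d1 - d2), axis=1))
    d_neg = np.sqrt(np.sum(np.square(d1 - d3), axis=1))
    # hinge: only penalize when the negative is not at least `margin` further away
    return np.mean(np.maximum(d_pos - d_neg + margin, 0))

merged = np.random.rand(32, 384)  # stands in for the output of the concatenate layer
print(triplet_loss_np(merged))
```

Inside Keras the same slicing works on y_pred directly, since the concatenated tensor is what the model outputs to the loss.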
