Keras - Implementing a custom loss function with multiple outputs

Question

I am trying to replicate a much smaller version of the AlphaGo Zero system, but I have run into a problem with the network model. The loss function I am supposed to implement is the following:

l = (z - v)^2 - pi^T log(p) + c * ||theta||^2

Where theta denotes the network weights, and:

z is the label for the value head, one of the network's two heads (a real value between -1 and 1), and v is the value predicted by the network.
pi is the target probability distribution over all actions, and p is the probability distribution over all actions predicted by the network.
c is the L2 regularization parameter.
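
As a worked illustration of these definitions, here is a minimal, dependency-free sketch that evaluates the loss for a single training example. It assumes the loss is the one from the AlphaGo Zero paper, l = (z - v)^2 - pi^T log(p) + c * ||theta||^2. The function name, the flat `weights` list standing in for the network parameters, and the small `eps` guard against log(0) are all hypothetical choices made for the example.

```python
import math


def alphago_zero_loss(z, v, pi, p, weights, c, eps=1e-12):
    """Loss for one example: (z - v)^2 - pi . log(p) + c * ||weights||^2."""
    # Squared error between the game outcome z and the predicted value v.
    value_term = (z - v) ** 2
    # Cross-entropy between the target policy pi and the predicted policy p.
    policy_term = -sum(t * math.log(q + eps) for t, q in zip(pi, p))
    # L2 penalty over a flat list of weights (stand-in for the parameters).
    l2_term = c * sum(w * w for w in weights)
    return value_term + policy_term + l2_term


# Example: two actions, the target puts all mass on the first one.
example_loss = alphago_zero_loss(
    z=1.0, v=0.5, pi=[1.0, 0.0], p=[0.5, 0.5], weights=[1.0, 2.0], c=0.1
)
print(example_loss)
```

The three terms are independent of each other, which is why a framework can compute the value and policy terms per output head and add the regularization term separately.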

I pass the network a list of channels (representing the game state) and an array (the same size as pi and p) indicating which actions are valid (1 if valid, 0 otherwise).

As you can see, the loss function uses both the targets and the network predictions. After extensive searching, it seems that a custom loss function can only take y_true and y_pred as parameters, even though I have two "y_true's" and two "y_pred's". I have tried indexing to get those values, but I'm pretty sure it is not working.

The network model and the custom loss function are in the code below:

from keras import backend as K
from keras import regularizers
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, concatenate
from keras.models import Model

def custom_loss(y_true, y_pred):

    # I am pretty sure this does not work

    output_prob_dist = y_pred[0]
    output_value = y_pred[1]
    label_prob_dist = y_true[0]
    label_value = y_true[1]

    mse_loss = K.mean(K.square(label_value - output_value), axis=-1)
    cross_entropy_loss = K.dot(K.transpose(label_prob_dist), output_prob_dist)

    return mse_loss - cross_entropy_loss

def define_model():
    """Neural Network model implementation using Keras + Tensorflow."""
    state_channels = Input(shape = (5,5,6), name='States_Channels_Input')
    valid_actions_dist = Input(shape = (32,), name='Valid_Actions_Input')

    conv = Conv2D(filters=10, kernel_size=2, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='Conv_Layer')(state_channels)
    pool = MaxPooling2D(pool_size=(2, 2), name='Pooling_Layer')(conv)
    flat = Flatten(name='Flatten_Layer')(pool)

    # Merge of the flattened channels (after pooling) and the valid action
    # distribution. Used only as input in the probability distribution head.
    merge = concatenate([flat, valid_actions_dist])

    #Probability distribution over actions
    hidden_fc_prob_dist_1 = Dense(100, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='FC_Prob_1')(merge)
    hidden_fc_prob_dist_2 = Dense(100, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='FC_Prob_2')(hidden_fc_prob_dist_1)
    output_prob_dist = Dense(32, kernel_regularizer=regularizers.l2(0.0001), activation='softmax', name='Output_Dist')(hidden_fc_prob_dist_2)

    #Value of a state
    hidden_fc_value_1 = Dense(100, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='FC_Value_1')(flat)
    hidden_fc_value_2 = Dense(100, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='FC_Value_2')(hidden_fc_value_1)
    output_value = Dense(1, kernel_regularizer=regularizers.l2(0.0001), activation='tanh', name='Output_Value')(hidden_fc_value_2)

    model = Model(inputs=[state_channels, valid_actions_dist], outputs=[output_prob_dist, output_value])

    model.compile(loss=custom_loss, optimizer='adam', metrics=['accuracy'])

    return model



# In the main method
model = define_model()
# ...
# MCTS routine to collect the data for the network input
# ...

x_train = [channels_input, valid_actions_dist_input]
y_train = [dist_probs_label, who_won_label]

model.fit(x_train, y_train, epochs=10)

In short, my question is: how do I correctly implement this custom loss function, which uses both network outputs and both label values?

Answer 1

I checked their Git repository and there is a lot going on. As shown in the equation, the final loss is the combination of three different losses, and training minimizes this combined loss. Their loss code is below:

    # train ops
    policy_cost = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(
            logits=logits, labels=tf.stop_gradient(labels['pi_tensor'])))

    value_cost = params['value_cost_weight'] * tf.reduce_mean(
        tf.square(value_output - labels['value_tensor']))

    reg_vars = [v for v in tf.trainable_variables()
                if 'bias' not in v.name and 'beta' not in v.name]
    l2_cost = params['l2_strength'] * \
        tf.add_n([tf.nn.l2_loss(v) for v in reg_vars])

    combined_cost = policy_cost + value_cost + l2_cost
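
To make the three terms concrete, here is a dependency-free Python sketch of what the snippet above computes on a small batch. The function names and the flat `weights` list are hypothetical stand-ins for the TensorFlow tensors. Two details are worth noticing: the policy term takes logits (pre-softmax scores), whereas the model in the question already ends in a softmax layer, and `tf.nn.l2_loss` returns sum(t ** 2) / 2, which is where the factor 0.5 comes from.

```python
import math


def softmax_cross_entropy_with_logits(logits, labels):
    """Cross-entropy between labels and softmax(logits), via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    # log softmax(x_i) = x_i - log_sum_exp
    return -sum(t * (x - log_sum_exp) for t, x in zip(labels, logits))


def combined_cost(batch_logits, batch_pi, batch_value, batch_z,
                  weights, value_cost_weight, l2_strength):
    n = len(batch_logits)
    # Mean policy cross-entropy over the batch.
    policy_cost = sum(
        softmax_cross_entropy_with_logits(logits, pi)
        for logits, pi in zip(batch_logits, batch_pi)
    ) / n
    # Weighted mean squared error of the value head.
    value_cost = value_cost_weight * sum(
        (v - z) ** 2 for v, z in zip(batch_value, batch_z)
    ) / n
    # L2 penalty with the same half factor as tf.nn.l2_loss.
    l2_cost = l2_strength * sum(0.5 * w * w for w in weights)
    return policy_cost + value_cost + l2_cost
```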

You can refer to this and adapt it to your model.
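
One way to adapt this in Keras, sketched under assumptions rather than taken from the question or the repository: when `compile` receives a list of losses, Keras applies one loss per output, in output order, and sums them, and it adds any `kernel_regularizer` penalties to the total on its own. That means the combined loss does not need a single function that sees both heads. The sketch below assumes `tensorflow.keras` is available, trims the hidden layers for brevity, uses random arrays as stand-ins for the MCTS data, and drops the `accuracy` metric, which is not meaningful for the value head. The equal `loss_weights` are also an assumption.

```python
import numpy as np
from tensorflow.keras import regularizers
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, Flatten,
                                     Dense, concatenate)
from tensorflow.keras.models import Model


def define_model(c=1e-4):
    reg = regularizers.l2(c)  # contributes c * sum(w ** 2) per kernel
    state_channels = Input(shape=(5, 5, 6), name='States_Channels_Input')
    valid_actions_dist = Input(shape=(32,), name='Valid_Actions_Input')

    conv = Conv2D(10, 2, activation='relu', kernel_regularizer=reg)(state_channels)
    flat = Flatten()(MaxPooling2D((2, 2))(conv))
    merge = concatenate([flat, valid_actions_dist])

    # Policy head: softmax distribution over the 32 actions.
    hidden_p = Dense(100, activation='relu', kernel_regularizer=reg)(merge)
    output_prob_dist = Dense(32, activation='softmax', kernel_regularizer=reg,
                             name='Output_Dist')(hidden_p)

    # Value head: scalar in [-1, 1].
    hidden_v = Dense(100, activation='relu', kernel_regularizer=reg)(flat)
    output_value = Dense(1, activation='tanh', kernel_regularizer=reg,
                         name='Output_Value')(hidden_v)

    model = Model(inputs=[state_channels, valid_actions_dist],
                  outputs=[output_prob_dist, output_value])
    # One loss per output, in the same order as `outputs`.
    model.compile(optimizer='adam',
                  loss=['categorical_crossentropy', 'mean_squared_error'],
                  loss_weights=[1.0, 1.0])
    return model


# Random stand-ins for the real training data (8 samples).
states = np.random.rand(8, 5, 5, 6).astype('float32')
masks = np.ones((8, 32), dtype='float32')
pi_labels = np.random.dirichlet(np.ones(32), size=8).astype('float32')
z_labels = np.random.choice([-1.0, 1.0], size=(8, 1)).astype('float32')

model = define_model()
history = model.fit([states, masks], [pi_labels, z_labels],
                    epochs=1, verbose=0)
```

Here `categorical_crossentropy` supplies the -pi^T log(p) term, `mean_squared_error` supplies (z - v)^2, and the regularizers supply the L2 term, so the labels are passed as a list in the same order as the outputs.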

Answer 1
0 2019-11-23 21:28:41
