Keras: clean implementation for multiple outputs and custom loss functions?
Keras - Implementation of custom loss function with multiple outputs
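I am trying to replicate a (much smaller) version of the AlphaGo Zero system, but I have run into a problem with the network model. The loss function I am supposed to implement is the following:
$$ l = (z - v)^2 - \pi^{\top} \log p + c \lVert \theta \rVert^2 $$
Where:
z is the label for who won the game, v is the value predicted by the network, pi is the label probability distribution over actions, p is the probability distribution over actions predicted by the network, c is the L2 regularization coefficient, and theta is the set of network parameters.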
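I pass the network a list of channels (representing the game state) and an array (the same size as pi and p) indicating which actions are actually valid (1 if valid, 0 otherwise).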
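The loss described above can be checked by hand on a small batch before wiring it into a network. The following is a NumPy sketch, assuming the AlphaGo Zero form of the loss (squared value error, cross-entropy between `pi` and `p`, plus an L2 penalty on the weights). The function name `alphago_zero_loss`, the `eps` guard against `log(0)`, and the default value of `c` are illustrative choices, not taken from the question.

```python
import numpy as np


def alphago_zero_loss(z, v, pi, p, weights, c=1e-4, eps=1e-12):
    """Batch-averaged loss: (z - v)^2 - pi . log(p) + c * ||theta||^2.

    z, v    : arrays of shape (batch,)  -- game outcome and predicted value
    pi, p   : arrays of shape (batch, n_actions) -- target and predicted policy
    weights : list of weight arrays standing in for theta (hypothetical)
    """
    value_term = (z - v) ** 2                              # shape (batch,)
    policy_term = -np.sum(pi * np.log(p + eps), axis=-1)   # shape (batch,)
    l2_term = c * sum(np.sum(w ** 2) for w in weights)     # scalar
    return np.mean(value_term + policy_term) + l2_term


# One example: the target picks action 0, the prediction is uniform over 2 actions.
loss = alphago_zero_loss(
    z=np.array([1.0]),
    v=np.array([0.5]),
    pi=np.array([[1.0, 0.0]]),
    p=np.array([[0.5, 0.5]]),
    weights=[],
)
```

Here the value term is 0.25 and the policy term is log(2), so the result is their sum, since no weights are passed.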
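As you can see, the loss function is computed from both the targets and the network's predictions. But after extensive searching, it seems that when implementing a custom loss function I can only pass y_true and y_pred as arguments, even though I have two "y_true"s and two "y_pred"s. I have tried using indexing to get these values, but I am pretty sure it does not work.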
The network model and the custom loss function are in the code below:
from keras import backend as K
from keras import regularizers
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, concatenate
from keras.models import Model


def custom_loss(y_true, y_pred):
    # I am pretty sure this does not work
    output_prob_dist = y_pred[0]
    output_value = y_pred[1]
    label_prob_dist = y_true[0]
    label_value = y_true[1]

    mse_loss = K.mean(K.square(label_value - output_value), axis=-1)
    cross_entropy_loss = K.dot(K.transpose(label_prob_dist), output_prob_dist)
    return mse_loss - cross_entropy_loss


def define_model():
    """Neural Network model implementation using Keras + Tensorflow."""
    state_channels = Input(shape=(5, 5, 6), name='States_Channels_Input')
    valid_actions_dist = Input(shape=(32,), name='Valid_Actions_Input')

    conv = Conv2D(filters=10, kernel_size=2, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='Conv_Layer')(state_channels)
    pool = MaxPooling2D(pool_size=(2, 2), name='Pooling_Layer')(conv)
    flat = Flatten(name='Flatten_Layer')(pool)

    # Merge of the flattened channels (after pooling) and the valid action
    # distribution. Used only as input in the probability distribution head.
    merge = concatenate([flat, valid_actions_dist])

    # Probability distribution over actions
    hidden_fc_prob_dist_1 = Dense(100, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='FC_Prob_1')(merge)
    hidden_fc_prob_dist_2 = Dense(100, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='FC_Prob_2')(hidden_fc_prob_dist_1)
    output_prob_dist = Dense(32, kernel_regularizer=regularizers.l2(0.0001), activation='softmax', name='Output_Dist')(hidden_fc_prob_dist_2)

    # Value of a state
    hidden_fc_value_1 = Dense(100, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='FC_Value_1')(flat)
    hidden_fc_value_2 = Dense(100, kernel_regularizer=regularizers.l2(0.0001), activation='relu', name='FC_Value_2')(hidden_fc_value_1)
    output_value = Dense(1, kernel_regularizer=regularizers.l2(0.0001), activation='tanh', name='Output_Value')(hidden_fc_value_2)

    model = Model(inputs=[state_channels, valid_actions_dist], outputs=[output_prob_dist, output_value])
    model.compile(loss=custom_loss, optimizer='adam', metrics=['accuracy'])
    return model


# In the main method
model = define_model()
# ...
# MCTS routine to collect the data for the network input
# ...
x_train = [channels_input, valid_actions_dist_input]
y_train = [dist_probs_label, who_won_label]
model.fit(x_train, y_train, epochs=10)
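In short, my question is: how do I correctly implement this custom loss function so that it uses both the network outputs and the network's label values?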
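I checked their git and there is a lot going on. As the equation shows, the final loss is a combination of three different losses, and the three networks are minimizing this one final loss. Their loss code is as follows: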
# train ops
policy_cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(
        logits=logits, labels=tf.stop_gradient(labels['pi_tensor'])))

value_cost = params['value_cost_weight'] * tf.reduce_mean(
    tf.square(value_output - labels['value_tensor']))

reg_vars = [v for v in tf.trainable_variables()
            if 'bias' not in v.name and 'beta' not in v.name]
l2_cost = params['l2_strength'] * \
    tf.add_n([tf.nn.l2_loss(v) for v in reg_vars])

combined_cost = policy_cost + value_cost + l2_cost
You can refer to this and adapt it accordingly.
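One way to adapt this to Keras is to give each output head its own loss instead of indexing into `y_true` and `y_pred`. Keras calls each loss with the targets and predictions of a single output and sums the per-output losses together with any `kernel_regularizer` penalties, which covers the three terms of the combined cost. The following is a minimal sketch under these assumptions: it uses `tensorflow.keras`, the model is a stripped-down stand-in for the one in the question (no convolution and no valid-actions input), and the names `policy_loss`, `value_loss` and `build_two_head_model` are hypothetical.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, regularizers


def policy_loss(y_true, y_pred):
    # Receives only the policy head's targets (pi) and predictions (p).
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)  # avoid log(0)
    return -tf.reduce_sum(y_true * tf.math.log(y_pred), axis=-1)


def value_loss(y_true, y_pred):
    # Receives only the value head's targets (z) and predictions (v).
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)


def build_two_head_model():
    # Simplified stand-in for the question's network (assumption).
    states = keras.Input(shape=(5, 5, 6), name="States_Channels_Input")
    flat = layers.Flatten()(states)
    hidden = layers.Dense(
        32, activation="relu",
        kernel_regularizer=regularizers.l2(0.0001))(flat)  # L2 term
    policy = layers.Dense(32, activation="softmax", name="Output_Dist")(hidden)
    value = layers.Dense(1, activation="tanh", name="Output_Value")(hidden)

    model = keras.Model(inputs=states, outputs=[policy, value])
    # One loss per output, in the same order as `outputs`.
    model.compile(optimizer="adam", loss=[policy_loss, value_loss])
    return model
```

Training then takes the targets as a list in the same order as the outputs, for example `model.fit(x_train, [dist_probs_label, who_won_label], epochs=10)`. If the two terms need different weights, `compile` also accepts a `loss_weights` list.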