Keras argmax has no gradient. How do I define a gradient for argmax?

Question

I am using keras.backend.argmax() in a Lambda layer. The model compiles fine but throws an error during fit().

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

My model:

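from keras import backend as K
from keras.layers import Input, Embedding, LSTM, Dropout, Dense, Lambda
from keras.models import Model

# train_data, train_label and vocabulary are defined elsewhere
latent_dim = 512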
encoder_inputs = Input(shape=(train_data.shape[1],))
encoder_dense = Dense(vocabulary, activation='softmax')
encoder_outputs = Embedding(vocabulary, latent_dim)(encoder_inputs)
encoder_outputs = LSTM(latent_dim, return_sequences=True)(encoder_outputs)
encoder_outputs = Dropout(0.5)(encoder_outputs)
encoder_outputs = encoder_dense(encoder_outputs)
encoder_outputs = Lambda(K.argmax, arguments={'axis':-1})(encoder_outputs)
encoder_outputs = Lambda(K.cast, arguments={'dtype':'float32'})(encoder_outputs)

encoder_dense1 = Dense(train_label.shape[1], activation='softmax')
decoder_embedding = Embedding(vocabulary, latent_dim)
decoder_lstm1 = LSTM(latent_dim, return_sequences=True)
decoder_lstm2 = LSTM(latent_dim, return_sequences=True)
decoder_dense2 = Dense(vocabulary, activation='softmax')

decoder_outputs = encoder_dense1(encoder_outputs)
decoder_outputs = decoder_embedding(decoder_outputs)
decoder_outputs = decoder_lstm1(decoder_outputs)
decoder_outputs = decoder_lstm2(decoder_outputs)
decoder_outputs = Dropout(0.5)(decoder_outputs)
decoder_outputs = decoder_dense2(decoder_outputs)
model = Model(encoder_inputs, decoder_outputs)
model.summary()

Model summary, for easier visualization:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_7 (InputLayer)         (None, 32)                0         
_________________________________________________________________
embedding_13 (Embedding)     (None, 32, 512)           2018816   
_________________________________________________________________
lstm_19 (LSTM)               (None, 32, 512)           2099200   
_________________________________________________________________
dropout_10 (Dropout)         (None, 32, 512)           0         
_________________________________________________________________
dense_19 (Dense)             (None, 32, 3943)          2022759   
_________________________________________________________________
lambda_5 (Lambda)            (None, 32)                0         
_________________________________________________________________
lambda_6 (Lambda)            (None, 32)                0         
_________________________________________________________________
dense_20 (Dense)             (None, 501)               16533     
_________________________________________________________________
embedding_14 (Embedding)     (None, 501, 512)          2018816   
_________________________________________________________________
lstm_20 (LSTM)               (None, 501, 512)          2099200   
_________________________________________________________________
lstm_21 (LSTM)               (None, 501, 512)          2099200   
_________________________________________________________________
dropout_11 (Dropout)         (None, 501, 512)          0         
_________________________________________________________________
dense_21 (Dense)             (None, 501, 3943)         2022759   
=================================================================
Total params: 14,397,283
Trainable params: 14,397,283
Non-trainable params: 0
_________________________________________________________________

I googled for a solution, but almost all of the results were about a faulty model. Some recommended not using the functions that cause the issue. However, as you can see, I cannot create this model without K.argmax (if you know any other way, please tell me). How do I solve this issue and train my model?

Answer 1

For obvious reasons, there is no gradient for the argmax function: its output is an integer index that stays constant under small changes to the input and jumps at ties, so how would a gradient even be defined? In order for your model to work, you need to make the layer non-trainable. As per this question (or the documentation), you need to pass trainable=False to your layer. As for the layer weights (if applicable), you probably want to set them to an identity matrix.
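
To make the "no gradient" point concrete, here is a minimal pure-Python sketch. It is not Keras code, and the helpers `argmax` and `finite_difference` are names made up for this illustration. It estimates the derivative of argmax numerically. Away from a tie the estimate is zero for every input, and at a tie it grows without bound as the step shrinks, which suggests there is nothing useful to backpropagate.

```python
def argmax(values):
    """Return the index of the largest element (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def finite_difference(values, index, eps=1e-3):
    """Central-difference estimate of d argmax / d values[index]."""
    up = list(values)
    down = list(values)
    up[index] += eps
    down[index] -= eps
    return (argmax(up) - argmax(down)) / (2 * eps)


scores = [0.1, 0.7, 0.2]

# Away from a tie, nudging any input leaves the winning index unchanged,
# so every estimated partial derivative is zero.
grads = [finite_difference(scores, i) for i in range(len(scores))]
print(grads)

# At a tie the output jumps from one index to another, so the estimate
# grows as eps shrinks instead of converging to a slope.
tied = [0.5, 0.5]
jump_small = finite_difference(tied, 0, eps=1e-3)
jump_smaller = finite_difference(tied, 0, eps=1e-6)
print(jump_small, jump_smaller)
```

The same behaviour holds for a softmax output fed into K.argmax: small weight updates upstream do not change the selected index, so the loss carries no slope information back through that op.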

Answer 1 posted 2018-09-16 20:58:58
