
Keras: Embedding layer mask_zero causing exception at subsequent layers

I am working on a model based on this paper, and I am getting an exception because the GlobalMaxPooling1D layer does not support masking.

I have an Embedding layer with the mask_zero argument set to True. However, since a subsequent GlobalMaxPooling1D layer does not support masking, I am getting an exception. The exception is expected: the documentation of the Embedding layer states that any layer following an Embedding layer with mask_zero=True must support masking.

However, as I am processing sentences with a variable number of words (i.e., variable-length input), I do need the masking in the Embedding layer. My question is: how should I alter my model so that masking remains part of it without causing a problem at the GlobalMaxPooling1D layer?

Below is the code for the model.

from keras.models import Sequential
from keras.layers import (Embedding, TimeDistributed, Bidirectional, LSTM,
                          Dropout, GlobalMaxPooling1D)
from keras import regularizers
from keras_contrib.layers import CRF  # CRF layer comes from keras-contrib

model = Sequential()
embedding_layer = Embedding(dictionary_size, num_word_dimensions,
                            weights=[embedding_weights], mask_zero=True,
                            embeddings_regularizer=regularizers.l2(0.0001))
model.add(TimeDistributed(embedding_layer,
                          input_shape=(max_conversation_length, timesteps)))

model.add(TimeDistributed(Bidirectional(LSTM(m // 2, return_sequences=True,
                                             kernel_regularizer=regularizers.l2(0.0001)))))
model.add(TimeDistributed(Dropout(0.2)))
model.add(TimeDistributed(GlobalMaxPooling1D()))  # this layer raises: no masking support
model.add(Bidirectional(LSTM(h // 2, return_sequences=True,
                             kernel_regularizer=regularizers.l2(0.0001)),
                        merge_mode='concat'))
model.add(Dropout(0.2))
crf = CRF(num_tags, sparse_target=False, kernel_regularizer=regularizers.l2(0.0001))
model.add(crf)
model.compile(optimizer, loss=crf.loss_function, metrics=[crf.accuracy])
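For reference, what I would want a masking-aware GlobalMaxPooling1D to compute, conceptually, is a max over the real timesteps only. A minimal NumPy sketch (toy data, not part of my model) shows the difference: plain max pooling lets the all-zero padding vectors leak into the result whenever every real value in a dimension is negative.

```python
import numpy as np

# Toy batch: 1 sentence, 4 timesteps, 3-dim embeddings;
# the last two timesteps are zero padding.
x = np.array([[[1.0, -2.0, -0.5],
               [0.2,  3.0, -1.0],
               [0.0,  0.0,  0.0],
               [0.0,  0.0,  0.0]]])
mask = np.array([[True, True, False, False]])  # True = real token

# Plain GlobalMaxPooling1D: max over ALL timesteps, padding included.
plain = x.max(axis=1)

# Masking-aware pooling: set padded steps to -inf before taking the max,
# so a padding vector can never win.
masked = np.where(mask[..., None], x, -np.inf).max(axis=1)

print(plain)   # [[ 1.   3.   0. ]]  <- padding zero wins in the last dim
print(masked)  # [[ 1.   3.  -0.5]]  <- max over real tokens only
```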

However, as I am processing sentences with variable number of words in them, I do need the masking in the Embedding layer.

Are you padding the sentences to make them equal in length? If so, then instead of masking you can let the model learn on its own that index 0 is padding and should therefore be ignored, so you would not need explicit masking. The same approach is used for dealing with missing values in the data, as suggested in this answer.
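To make the suggestion concrete, here is a minimal sketch of right-padding integer sequences to a common length with index 0 (in Keras this is normally done with keras.preprocessing.sequence.pad_sequences; pad_to below is a hypothetical stand-in, not a Keras function). The padded batch is then fed to an Embedding layer with mask_zero left at its default of False, reserving index 0 for padding.

```python
def pad_to(seqs, maxlen, pad_value=0):
    """Right-pad each integer sequence to maxlen, truncating longer ones.
    A stand-in for keras.preprocessing.sequence.pad_sequences."""
    return [(s + [pad_value] * maxlen)[:maxlen] for s in seqs]

# Hypothetical word-index sequences of varying length.
sentences = [[4, 7, 2], [9, 1], [3, 8, 6, 5]]
padded = pad_to(sentences, maxlen=4)
print(padded)  # [[4, 7, 2, 0], [9, 1, 0, 0], [3, 8, 6, 5]]
```

With this input the model sees index 0 at every padded position and can learn that its embedding carries no signal, which is the idea behind dropping mask_zero.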
