Can I concatenate an Embedding layer with a layer of shape (?, 5) in Keras?

I want to create an LSTM model. Given a sentence, the LSTM should predict a one-hot encoded value of length 4. This first step was easy.

The next thing I wanted to do is add additional information to my dataset. The information is a one-hot encoded vector of length 5.

My idea was to concatenate the Embedding layer with another Input before passing the data to the LSTM. For me this looks like:

from keras.layers import Input, Embedding, LSTM, Dense, concatenate
from keras.models import Model

# first input model: token indices -> embeddings, shape (None, MAX_SEQUENCE_LENGTH, EMBEDDING_SIZE)
main_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32', name='main_input')
embedding = Embedding(MAX_NB_WORDS, EMBEDDING_SIZE,
                      input_length=MAX_SEQUENCE_LENGTH)(main_input)

# second input model: one-hot category vector, shape (None, 5)
auxiliary_input = Input(shape=(5,), name='aux_input')
x = concatenate([embedding, auxiliary_input])  # mismatch: embedding is 3D, auxiliary_input is 2D

lstm = LSTM(HIDDEN_LAYER_SIZE)(x)

main_output = Dense(4, activation='sigmoid', name='main_output')(lstm)

model = Model(inputs=[main_input, auxiliary_input], outputs=main_output)

But if I try a setup like this, I get the following error: ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 50, 128), (None, 5)]

What does work is creating an LSTM on the embedding layer and concatenating its output with the auxiliary input, but then I cannot run another LSTM afterwards (I get the error: ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2).
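
In code, that setup looks roughly like this (continuing from the model above; the variable names are just for illustration):

lstm_1 = LSTM(HIDDEN_LAYER_SIZE)(embedding)  # 2D output: (None, HIDDEN_LAYER_SIZE)
x = concatenate([lstm_1, auxiliary_input])   # 2D: (None, HIDDEN_LAYER_SIZE + 5)
lstm_2 = LSTM(HIDDEN_LAYER_SIZE)(x)          # fails here: LSTM expects ndim=3, x has ndim=2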

So my question is: what is the right way to build an LSTM with an embedding-layer input plus additional data in Keras?

It seems that you are trying to pass additional information about the full sequence (and not about each token); that's why you have a mismatch problem. The Embedding output is a 3D tensor of shape (None, MAX_SEQUENCE_LENGTH, EMBEDDING_SIZE), here (None, 50, 128), while the auxiliary input is a 2D tensor of shape (None, 5), so the two cannot be concatenated directly.

There are several ways to tackle this problem, each with pros and cons.

(1) You can concatenate aux_data with the last output of your LSTM, i.e. concat_with_aux = concatenate([auxiliary_input, lstm]), and pass this concatenated vector to your model. This means that if you have two identical sequences with different categories, the output of the LSTM will be the same; after concatenation, it is the job of the dense classifier to use the concatenated result to produce the right output.
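
A minimal sketch of this first variant, reusing the names from the question (the hyperparameter values are just examples; 50 and 128 match the shapes in the error message):

from keras.layers import Input, Embedding, LSTM, Dense, concatenate
from keras.models import Model

MAX_SEQUENCE_LENGTH, MAX_NB_WORDS = 50, 20000  # example values
EMBEDDING_SIZE, HIDDEN_LAYER_SIZE = 128, 64    # example values

main_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32', name='main_input')
embedding = Embedding(MAX_NB_WORDS, EMBEDDING_SIZE,
                      input_length=MAX_SEQUENCE_LENGTH)(main_input)
auxiliary_input = Input(shape=(5,), name='aux_input')

# run the LSTM on the sequence alone; its last output is 2D: (None, HIDDEN_LAYER_SIZE)
lstm = LSTM(HIDDEN_LAYER_SIZE)(embedding)

# both tensors are now 2D, so they can be concatenated on the feature axis
concat_with_aux = concatenate([auxiliary_input, lstm])  # (None, 5 + HIDDEN_LAYER_SIZE)

main_output = Dense(4, activation='sigmoid', name='main_output')(concat_with_aux)
model = Model(inputs=[main_input, auxiliary_input], outputs=main_output)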

(2) If you want to pass the information directly at the input of the LSTM, you can, for example, create a new trainable Embedding layer for your categories:

# now you pass the category index (0, 1, 2, 3, 4), not the one-hot encoded form
auxiliary_input = Input(shape=(1,), name='aux_input')
embed_categories = Embedding(5, EMBEDDING_SIZE,
                             input_length=1)(auxiliary_input)  # (None, 1, EMBEDDING_SIZE)

# concatenate along the time axis: the category embedding becomes an extra
# "token" prepended to the sequence: (None, MAX_SEQUENCE_LENGTH + 1, EMBEDDING_SIZE)
x = concatenate([embed_categories, embedding], axis=1)

By doing that, your LSTM will be conditioned on your auxiliary information, and two identical sentences with different categories will have different last LSTM outputs.
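
To complete this second variant, the rest of the model stays as in the question (a sketch, continuing from the code above):

lstm = LSTM(HIDDEN_LAYER_SIZE)(x)  # x has shape (None, MAX_SEQUENCE_LENGTH + 1, EMBEDDING_SIZE)
main_output = Dense(4, activation='sigmoid', name='main_output')(lstm)
model = Model(inputs=[main_input, auxiliary_input], outputs=main_output)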
