
Possible to add a bidirectional LSTM before a CNN in Keras?

I am currently working on a system that classifies whether two sentences share the same content or not. For this purpose I use pretrained word vectors, so there is an array with the word vectors of sentence one (s1) and an array with the word vectors of sentence two (s2). In order to classify whether they are similar or not, I create a matrix by comparing all vectors in s1 pairwise with the vectors in s2. This matrix is then fed into a CNN classifier and trained on the data. This is all pretty straightforward.
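The pairwise-comparison step described above can be sketched in plain numpy. This is a minimal illustration, assuming cosine similarity as the comparison and hypothetical sentence lengths and a 200-dimensional embedding; the question does not specify the exact metric used:

```python
import numpy as np

def pairwise_cosine(s1, s2):
    """Compare every vector in s1 with every vector in s2 by cosine similarity.

    s1: (len1, dim) array of word vectors for sentence one
    s2: (len2, dim) array of word vectors for sentence two
    Returns a (len1, len2) similarity matrix.
    """
    s1n = s1 / np.linalg.norm(s1, axis=1, keepdims=True)
    s2n = s2 / np.linalg.norm(s2, axis=1, keepdims=True)
    return s1n @ s2n.T

# Hypothetical example: sentences of 5 and 7 tokens, 200-dim vectors.
s1 = np.random.rand(5, 200)
s2 = np.random.rand(7, 200)
m = pairwise_cosine(s1, s2)
print(m.shape)  # (5, 7)
```

The resulting matrix is what would then be fed to the CNN classifier as its input image.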

Now I would like to enhance this system by using bidirectional LSTMs on s1 and s2. The bidirectional LSTM should produce the hidden state of each vector in s1 and s2, and these hidden states should then be compared by pairwise cosine similarity in the same way the vectors of s1 and s2 were compared before. This is in order to capture the sentence-context information of each word in s1 and s2.

Now the question is how to do this in Keras. Currently I am using numpy/sklearn to create the matrices, which are then fed as training data into Keras. I found one implementation of what I want to do in plain tensorflow ( https://github.com/LiuHuiwen/Pairwise-Word-Interaction-Modeling-by-Tensorflow-1.0/blob/master/model.py ).

I assume that I will have to change the input data to consist of just the two arrays of word vectors for s1 and s2. Then I have to run the biLSTM first, get the hidden states, convert everything into matrices, and feed this into the CNN. The example in plain tensorflow seems quite clear to me, but I cannot come up with a way to do this in Keras. Is it possible at all in Keras, or does one have to resort to tensorflow directly in order to do the necessary calculations on the output of the biLSTM?

Keras RNN layers, including LSTM, can return not only the last output of the output sequence but also the full sequence of hidden states, using the return_sequences=True option.

https://keras.io/layers/recurrent/

If you want to connect a bidirectional LSTM layer before a CNN layer, the following code is an example:

from keras.layers import Input, LSTM, Bidirectional, Conv1D

inputs = Input(shape=(50, 200))  # 50 timesteps, 200-dim word vectors
seq = Bidirectional(LSTM(16, return_sequences=True))(inputs)  # (batch, 50, 32)
cnn = Conv1D(32, 3, padding="same", activation="relu")(seq)

Please note: if you want to use a Conv2D layer after the bidirectional LSTM layer, the input to Conv2D must be reshaped to ndim=4, as in the following code:

from keras.layers import Input, LSTM, Bidirectional, Conv2D, Reshape

inputs = Input(shape=(50, 200))  # 50 timesteps, 200-dim word vectors
seq = Bidirectional(LSTM(16, return_sequences=True))(inputs)  # (batch, 50, 32)
seq = Reshape((50, 32, 1))(seq)  # add a channel dimension for Conv2D
cnn = Conv2D(32, (3, 3), padding="same", activation="relu")(seq)
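To tie this back to the question's setup with two sentences, the pairwise cosine similarity between the two biLSTM output sequences can be computed inside the model with Keras' Dot layer (normalize=True makes the dot product a cosine similarity). The following is a sketch, not the asker's actual model; the sequence length of 50, the 200-dim vectors, and the shared biLSTM are assumptions:

```python
from keras.layers import Input, LSTM, Bidirectional, Dot, Reshape, Conv2D
from keras.models import Model

# Assumed shapes: both sentences padded to 50 tokens, 200-dim word vectors.
in1 = Input(shape=(50, 200))
in2 = Input(shape=(50, 200))

# One shared biLSTM applied to both sentences (assumption: shared weights).
bilstm = Bidirectional(LSTM(16, return_sequences=True))
h1 = bilstm(in1)  # (batch, 50, 32)
h2 = bilstm(in2)  # (batch, 50, 32)

# Pairwise cosine similarity between all hidden states: (batch, 50, 50).
sim = Dot(axes=2, normalize=True)([h1, h2])
sim = Reshape((50, 50, 1))(sim)  # channel dimension for Conv2D

cnn = Conv2D(32, (3, 3), padding="same", activation="relu")(sim)
model = Model([in1, in2], cnn)
```

So the similarity-matrix construction no longer needs to be done in numpy/sklearn; it becomes a differentiable layer, and the biLSTM is trained end to end with the CNN.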
