
LSTM for sequence to output in Keras

I'm really confused about how to solve/structure my task using an LSTM from Keras:

So I have sequences of vectors. Each sequence belongs to a certain output (a document in my case).

The vectors themselves are 500 features long (they represent a sentence).

The sequence length (how many sentences are in a document) varies, so I assume the sequences need to be padded so that each sequence is equally long, e.g. let's make each one 200 vectors long.
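That padding step can be sketched with Keras' `pad_sequences` helper; a minimal example, assuming `tensorflow.keras` is available (the document count and sentence counts below are made-up placeholders):

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Three "documents" with a varying number of "sentences";
# each sentence is a 500-dimensional feature vector.
docs = [np.random.rand(n_sentences, 500) for n_sentences in (3, 7, 5)]

# Pad every document to 200 sentences with zero vectors.
# padding='post' appends the zeros after the real sentences
# (the default, 'pre', would prepend them instead).
X = pad_sequences(docs, maxlen=200, dtype='float32', padding='post')
print(X.shape)  # (3, 200, 500)
```

The resulting array has the fixed `(samples, time_steps, features)` shape that Keras expects.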

Now, since each sequence belongs to a certain output/document (1 - 10,000), how do I frame my task?

Keras takes input of shape (#samples, time_steps, input_dim); so I guess it must be (#documents, #sentences, #features_of_each_sentence) - in my case: (10,000, 200, 500)

Correct?

So how do I train my model to predict which sentence most likely belongs to which document?

Is my output vector one-hot encoded: [1, 0, ...] for the first document, [0, 1, ...] for the second, etc.? Or is my output vector just [1, 1, 1, ..., 2, 2, 2, ...], so that I have a single output vector recording which sequence belongs to which document?
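For the one-hot variant, Keras ships a helper; a small sketch (the three-class label array here is just an illustration, not the real 10,000-document case):

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Integer document ids, one per sequence.
labels = np.array([0, 1, 2, 1])

# Expand each id into a one-hot row vector.
y = to_categorical(labels, num_classes=3)
print(y)
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Alternatively, you could keep the plain integer labels and train with `sparse_categorical_crossentropy`, which avoids materializing a 10,000-wide one-hot target for every sequence.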

I'm really confused. In the end I want to take the last layer to get a vector representation for each document.

so it looks something like: [image]

Would it be something like this (?):

model = Sequential()
# input_shape excludes the batch dimension: (time_steps, input_dim)
model.add(LSTM(50, input_shape=(200, 500)))
model.add(Dense(3))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer=some_optimizer)

If the first layer of your model is a recurrent layer, you don't have to specify the sequence length. You can then use masking to work with varying sequence sizes. For more information on Keras recurrent layers (including LSTM), check the documentation. For a simple example of LSTMs, you can look here.
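A hedged sketch of that masking approach, assuming `tensorflow.keras` and zero-padded inputs (the layer sizes are placeholders, not a tuned architecture):

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

model = Sequential([
    # None in the time dimension means "any sequence length".
    Input(shape=(None, 500)),
    # Timesteps whose features are all 0.0 (the padding vectors)
    # are skipped by the layers downstream.
    Masking(mask_value=0.0),
    # The LSTM's final hidden state is a 50-dim document representation.
    LSTM(50),
    # One softmax unit per document class.
    Dense(10000, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam')
```

Because of the `Masking` layer, padded sentences do not influence the LSTM's final state, so the learned 50-dimensional vector depends only on the real sentences in each document.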

Disclaimer: The technical posts on this site follow the CC BY-SA 4.0 license; if you repost, please credit this site or the original source. For any questions contact: yoyou2525@163.com.

 
© 2020-2024 STACKOOM.COM