
How to use LSTM with sequence data of varying length with keras without embedding?

I have input data where each example is some varying number of vectors of length k. In total I have n examples. So the dimensions of the input are n * ? * k, where the question mark symbolizes the varying length.

I want to input it to an LSTM layer in Keras, if possible without using embedding (it isn't your ordinary words dataset).

Could someone write a short example of how to do this?

The data is currently a doubly nested Python list, e.g.

example1 = [[1,0,1], [1,1,1]]
example2 = [[1,1,1]]
my_data = []
my_data.append(example1)
my_data.append(example2)

I think you could use pad_sequences. This should get all of your inputs to the same length.
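For instance, a minimal sketch assuming TensorFlow's bundled Keras, using the data from the question's snippet:

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

example1 = [[1, 0, 1], [1, 1, 1]]
example2 = [[1, 1, 1]]
my_data = [example1, example2]

# Pad every example with zero vectors up to the longest length (2 timesteps here).
# padding='post' appends the zeros at the end; dtype='float32' suits LSTM inputs.
padded = pad_sequences(my_data, padding='post', dtype='float32')
print(padded.shape)  # (2, 2, 3): n examples, max timesteps, k features
```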

You can use padding (pad_sequences) and a Masking layer.
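A sketch of that combination, again assuming TensorFlow's Keras; the layer sizes and the sigmoid head are illustrative choices, not from the original answer:

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense
from tensorflow.keras.preprocessing.sequence import pad_sequences

k = 3  # length of each feature vector
my_data = [[[1, 0, 1], [1, 1, 1]], [[1, 1, 1]]]
padded = pad_sequences(my_data, padding='post', dtype='float32')

model = Sequential([
    Input(shape=(None, k)),   # None: any number of timesteps
    Masking(mask_value=0.0),  # skip timesteps that are entirely zeros (the padding)
    LSTM(8),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
print(model.predict(padded).shape)  # (2, 1)
```

One caveat: a genuine all-zero timestep in your data would also be masked, so pick a mask_value that cannot occur as a real input vector.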

You can also train batches of different lengths in a manual training loop:

for e in range(epochs):
    for batch_x, batch_y in list_of_batches: # provided you separated the batches by length
        model.train_on_batch(batch_x, batch_y)

The key point in all of this is that your input_shape=(None, k).
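Putting the pieces together, a sketch of the manual loop. The length-grouping helper batches_by_length is hypothetical (the original answer only assumes the batches were separated by length somehow), and the model and labels are illustrative:

```python
import numpy as np
from collections import defaultdict
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

k = 3
model = Sequential([
    Input(shape=(None, k)),  # the crucial part: timestep dimension left as None
    LSTM(8),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

def batches_by_length(examples, labels):
    """Hypothetical helper: group examples by length so each batch is rectangular."""
    groups = defaultdict(list)
    for x, y in zip(examples, labels):
        groups[len(x)].append((x, y))
    for pairs in groups.values():
        xs = np.array([x for x, _ in pairs], dtype='float32')
        ys = np.array([y for _, y in pairs], dtype='float32')
        yield xs, ys

examples = [[[1, 0, 1], [1, 1, 1]], [[1, 1, 1]], [[0, 1, 0], [1, 0, 0]]]
labels = [1, 0, 1]
epochs = 2
for e in range(epochs):
    for batch_x, batch_y in batches_by_length(examples, labels):
        model.train_on_batch(batch_x, batch_y)
```

Because no padding is involved, no Masking layer is needed; the trade-off is that batch composition is fixed by sequence length rather than shuffled freely.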
