
How to add an attention layer to LSTM autoencoder built as sequential keras model in python?

So I want to build an autoencoder model for sequence data. I have started to build a sequential keras model in python and now I want to add an attention layer in the middle, but have no idea how to approach this. My model so far:

from keras.layers import LSTM, Dense, TimeDistributed, RepeatVector, Layer
from keras.models import Sequential
import keras.backend as K

model = Sequential()
model.add(LSTM(20, activation="relu", input_shape=(time_steps,n_features), return_sequences=False))
model.add(RepeatVector(time_steps, name="bottleneck_output"))
model.add(LSTM(30, activation="relu", return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))

model.compile(optimizer="adam", loss="mae")

So far I have tried to add an attention function copied from here

class attention(Layer):
    def __init__(self, **kwargs):
        super(attention, self).__init__(**kwargs)

    def build(self, input_shape):
        # expects a 3D input (batch, time_steps, features):
        # W scores each feature vector, b adds one bias per time step
        self.W = self.add_weight(name="att_weight", shape=(input_shape[-1], 1), initializer="normal")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1), initializer="zeros")
        super(attention, self).build(input_shape)

    def call(self, x):
        # one scalar score per time step, softmaxed over the time axis,
        # then used as weights for a sum over the time steps
        et = K.squeeze(K.tanh(K.dot(x, self.W) + self.b), axis=-1)
        at = K.softmax(et)
        at = K.expand_dims(at, axis=-1)
        output = x * at
        return K.sum(output, axis=1)

    def compute_output_shape(self, input_shape):
        # (batch, features): the time axis is summed away
        return (input_shape[0], input_shape[-1])

    def get_config(self):
        return super(attention, self).get_config()
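
As far as I can tell, the layer computes one score per time step via tanh(x·W + b), turns the scores into weights with a softmax over the time axis, and returns the weighted sum of the time-step vectors, i.e. one context vector per sample. A toy numpy sketch of the same computation for a single sample (made-up shapes, just for illustration):

import numpy as np

time_steps, features = 4, 3
x = np.random.rand(time_steps, features)   # LSTM outputs, one row per time step
W = np.random.rand(features, 1)            # att_weight
b = np.zeros((time_steps, 1))              # att_bias

e = np.tanh(x @ W + b).squeeze(-1)         # one score per time step, shape (4,)
a = np.exp(e) / np.exp(e).sum()            # softmax over the time axis
context = (x * a[:, None]).sum(axis=0)     # weighted sum over time, shape (3,)
print(context.shape)                       # (3,)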

and added it after the first LSTM, before the RepeatVector, i.e.:

model = Sequential()
model.add(LSTM(20, activation="relu", input_shape=(time_steps,n_features), return_sequences=False))
model.add(attention()) # this is added
model.add(RepeatVector(time_steps, name="bottleneck_output"))
model.add(LSTM(30, activation="relu", return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))

model.compile(optimizer="adam", loss="mae")

but the code gives an error because the dimensions somehow do not fit; the problem is in feeding the output of attention() into the RepeatVector:

ValueError: Input 0 is incompatible with layer bottleneck_output: expected ndim=2, found ndim=1

.... but according to model.summary() the output dimension of the attention layer is (None, 20), which is also the same as for the first lstm_1 layer. The code works without the attention layer.

I would also appreciate some explanation of why the solution solves the problem. I am fairly new to python and have trouble understanding what the class attention() is doing; I just copied it and tried to use it, which of course is not working....

Ok, I solved it. There has to be return_sequences=True in the first LSTM layer; then it works as it is. The attention layer expects a 3D sequence (samples, time_steps, features) and sums over the time axis, so the encoder LSTM has to return the full sequence rather than only its last hidden state.
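
For reference, a minimal sketch of the layout that works for me, assuming time_steps and n_features are defined beforehand and using the attention class from above:

model = Sequential()
model.add(LSTM(20, activation="relu", input_shape=(time_steps, n_features),
               return_sequences=True))   # return the full sequence for attention()
model.add(attention())                   # weighted sum over time -> (None, 20)
model.add(RepeatVector(time_steps, name="bottleneck_output"))
model.add(LSTM(30, activation="relu", return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))

model.compile(optimizer="adam", loss="mae")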
