How to implement Attention using Keras

I have a Keras model that takes a sequence of inputs and produces a sequence of outputs, where each input step has an associated output (label).

from keras.models import Sequential
from keras.layers import Masking, Bidirectional, LSTM, Dropout, Dense
from keras import optimizers

model = Sequential()
model.add(Masking(mask_value=5, input_shape=(Seq_in.shape[1], 1)))  # mask padded timesteps
model.add(Bidirectional(LSTM(256, return_sequences=True)))
model.add(Dropout(0.2))
model.add(Bidirectional(LSTM(256, return_sequences=True)))
model.add(Dropout(0.2))
model.add(Dense(n_Labels, activation='softmax'))  # n_Labels is the number of labels, which is 15
sgd = optimizers.SGD(lr=0.1, momentum=0.9, decay=1e-3, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=2, validation_data=(X_val, Y_val), verbose=1)
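For reference, this model expects 3-D inputs and one-hot, per-timestep targets. A minimal sketch with made-up sizes (the shapes below are illustrative, not taken from the question):

import numpy as np

# Illustrative shapes only: 100 sequences, 50 timesteps, 1 feature per step,
# and a one-hot label over n_Labels = 15 classes at every timestep.
num_samples, timesteps, n_Labels = 100, 50, 15
X_train = np.random.rand(num_samples, timesteps, 1)  # padded steps would carry mask_value=5
Y_train = np.eye(n_Labels)[np.random.randint(0, n_Labels, size=(num_samples, timesteps))]
print(X_train.shape, Y_train.shape)  # (100, 50, 1) (100, 50, 15)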

Now I want to implement the attention mechanism following Zhou et al., "Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification".

For each output, we compute the tanh of every output state in the sequence (Eq. 9 in the paper), then compute a softmax over those states with respect to the current output (Eq. 10), multiply each output state by its corresponding softmax (attention) weight (Eq. 11), sum the weighted states, and take the tanh of that sum, which gives the attention vector for the current output. Finally, we concatenate the attention vector with the output states.
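To make those steps concrete, here is a minimal sketch of how they could be written as a custom Keras layer. The layer name (PerStepAttention), the bilinear scoring weight W, and the per-timestep adaptation are my own assumptions for illustration; Zhou et al. compute a single sentence-level attention vector with a learned vector w, and this is not the paper's reference code.

from keras import backend as K
from keras.layers import Layer

class PerStepAttention(Layer):
    # For every timestep t, attend over all BiLSTM states H and concatenate
    # the resulting attention vector with h_t.  This is a per-timestep
    # adaptation of Zhou et al.'s Eq. 9-11, sketched for illustration.
    def __init__(self, **kwargs):
        super(PerStepAttention, self).__init__(**kwargs)
        self.supports_masking = True  # let the mask from Masking() pass through

    def build(self, input_shape):
        d = int(input_shape[-1])  # 2 * units for a Bidirectional LSTM
        # Bilinear scoring weight (an assumption; the paper scores states
        # with a single learned vector w for one sentence-level vector).
        self.W = self.add_weight(name='att_W', shape=(d, d),
                                 initializer='glorot_uniform', trainable=True)
        super(PerStepAttention, self).build(input_shape)

    def call(self, H, mask=None):
        M = K.tanh(H)                                   # Eq. 9: M = tanh(H)
        # scores[b, t, j]: how much output step t attends to state j
        scores = K.batch_dot(K.dot(H, self.W), M, axes=[2, 2])
        alpha = K.softmax(scores)                       # Eq. 10: attention weights
        r = K.batch_dot(alpha, H, axes=[2, 1])          # Eq. 11: weighted sum of states
        h_star = K.tanh(r)                              # tanh of the summary vector
        # Note: padded steps are not excluded from the softmax here; a full
        # implementation would mask the scores before the softmax.
        return K.concatenate([h_star, H], axis=-1)      # concat attention vector with states

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[1], 2 * int(input_shape[-1]))

    def compute_mask(self, inputs, mask=None):
        return mask

The bilinear score h_t · W · m_j is just one reasonable choice; a plain dot product between h_t and m_j would also match the description above.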

How can I do that? Is it possible with the Keras API, or do I have to write my own custom layer? Any help?

Thank you in advance.

There is no Keras API for this yet, but a number of hard-working programmers have made good implementations using Keras. You can try looking at the code in keras-monotonic-attention.
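Alternatively, if you write your own layer, something like the PerStepAttention sketch above can be dropped into the original Sequential model between the second BiLSTM and the Dense layer (again, an illustration, not a tested recipe):

from keras.models import Sequential
from keras.layers import Masking, Bidirectional, LSTM, Dropout, Dense

model = Sequential()
model.add(Masking(mask_value=5, input_shape=(Seq_in.shape[1], 1)))
model.add(Bidirectional(LSTM(256, return_sequences=True)))
model.add(Dropout(0.2))
model.add(Bidirectional(LSTM(256, return_sequences=True)))
model.add(Dropout(0.2))
model.add(PerStepAttention())                     # states become shape (timesteps, 4*256)
model.add(Dense(n_Labels, activation='softmax'))  # per-timestep label distribution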

