LSTM with self attention for multi class text classification
I am following the approach to self-attention in Keras from this link: How to add attention layer to a Bi-LSTM
I want to apply a Bi-LSTM to multi-class text classification with 3 classes.
I tried to apply attention in my code, but I get the following error. How can I fix this? Can someone help me?
Incompatible shapes: [100,3] vs. [64,3]
[[Node: training_1/Adam/gradients/loss_11/dense_14_loss/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@training_1/Adam/gradients/loss_11/dense_14_loss/mul_grad/Reshape_1"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](training_1/Adam/gradients/loss_11/dense_14_loss/mul_grad/Shape, training_1/Adam/gradients/loss_11/dense_14_loss/mul_grad/Shape_1)]]
from keras.models import Sequential
from keras.layers import Layer, Embedding, Bidirectional, LSTM, Dropout, Dense
from keras import backend as K

class attention(Layer):
    def __init__(self, return_sequences=False):
        self.return_sequences = return_sequences
        super(attention, self).__init__()

    def build(self, input_shape):
        # One score weight per feature and one bias per timestep
        self.W = self.add_weight(name="att_weight", shape=(input_shape[-1], 1),
                                 initializer="normal")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1),
                                 initializer="zeros")
        super(attention, self).build(input_shape)

    def call(self, x):
        # Attention weights over the timestep axis
        e = K.tanh(K.dot(x, self.W) + self.b)
        a = K.softmax(e, axis=1)
        output = x * a
        if self.return_sequences:
            return output
        # Collapse the timestep axis into a single context vector
        return K.sum(output, axis=1)
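To make the layer's intent concrete, here is a small shape check; the timestep and feature sizes below are made up purely for illustration:

import numpy as np
from keras.layers import Input
from keras.models import Model

# (batch, timesteps, features) in, (batch, features) out when return_sequences=False
inp = Input(shape=(10, 8))
out = attention(return_sequences=False)(inp)
m = Model(inp, out)
print(m.output_shape)                          # (None, 8)
print(m.predict(np.zeros((4, 10, 8))).shape)   # (4, 8)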
model = Sequential()
model.add(Embedding(17666, 100, input_length=409))
model.add(Bidirectional(LSTM(32, return_sequences=False)))
model.add(attention(return_sequences=True)) # receive 3D and output 2D
model.add(Dropout(0.3))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
from keras.callbacks import EarlyStopping
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=3)
history777 = model.fit(x_train, y_train,
                       batch_size=100,
                       epochs=30,
                       validation_data=(x_val, y_val),
                       callbacks=[es])
The model summary:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_14 (Embedding)     (None, 409, 100)          1766600
_________________________________________________________________
bidirectional_14 (Bidirectio (None, 64)                34048
_________________________________________________________________
attention_14 (attention)     (None, 64)                128
_________________________________________________________________
dropout_6 (Dropout)          (None, 64)                0
_________________________________________________________________
dense_14 (Dense)             (None, 3)                 195
=================================================================
Total params: 1,800,971
Trainable params: 1,800,971
Non-trainable params: 0
_________________________________________________________________
Note how you set the return_sequences parameter in the LSTM and in the attention layer.
Your output is 2-D, so the last return_sequences must be set to False and the others must be set to True: with return_sequences=False on the Bi-LSTM, the attention layer no longer receives the 3-D (batch, timesteps, features) sequence it expects, which is what produces the shape mismatch.
Your model must be:
model = Sequential()
model.add(Embedding(max_words, emb_dim, input_length=max_len))
model.add(Bidirectional(LSTM(32, return_sequences=True))) # return_sequences=True
model.add(attention(return_sequences=False)) # return_sequences=False
model.add(Dropout(0.3))
model.add(Dense(3, activation='softmax'))
Here is the full example: https://colab.research.google.com/drive/13l5eAHS5uTUsdqyQNm1Dr4JEXg7Fl2Bo?usp=sharing
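As a quick sanity check, here is a minimal runnable sketch of the corrected model on random data; all sizes (max_words, emb_dim, max_len, the 200 samples) are placeholders, not values from the question:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense

max_words, emb_dim, max_len = 1000, 50, 40
x_train = np.random.randint(0, max_words, size=(200, max_len))
y_train = np.eye(3)[np.random.randint(0, 3, size=200)]   # one-hot labels, 3 classes

model = Sequential()
model.add(Embedding(max_words, emb_dim, input_length=max_len))
model.add(Bidirectional(LSTM(32, return_sequences=True)))   # keep the full sequence
model.add(attention(return_sequences=False))                # collapse it to one vector
model.add(Dropout(0.3))
model.add(Dense(3, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Trains without the [100,3] vs [64,3] mismatch because attention now receives 3-D input
model.fit(x_train, y_train, batch_size=100, epochs=1, verbose=0)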