Mismatch between computational complexity of Additive attention and RNN cell

Original post: 2022-12-02 14:21:33 | Tags: machine-learning, deep-learning, nlp, recurrent-neural-network, attention-model

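According to the "Attention is all you need" paper: Additive attention (the classic attention used in RNNs by Bahdanau) computes the compatibility function using a feed-forward network with a single hidden layer. While the two are similar in theoretical complexity, ...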

Indeed, we can see here that the computational complexity of additive attention and dot-product (transformer) attention are both n²*d.

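However, if we look closer at additive attention, it is in fact an RNN cell, which has a computational complexity of n*d² (according to the same table).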

Thus, shouldn't the computational complexity of additive attention be n*d² instead of n²*d?

1 Answer

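Your claim that additive attention is in fact an RNN cell is what is leading you astray. Additive attention is implemented using a fully-connected shallow (1 hidden layer) feedforward neural network "between" the encoder and decoder RNNs, as shown below and described in the original paper by Bahdanau et al. (p. 3) [1]: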

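... an alignment model which scores how well the inputs around position j and the output at position i match. The score is based on the RNN hidden state s_{i-1} (just before emitting y_i, Eq. (4)) and the j-th annotation h_j of the input sentence.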

We parametrize the alignment model a as a feedforward neural network which is jointly trained with all the other components of the proposed system...

Figure 1: Attention mechanism diagram from [2].
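For reference, the alignment model in the quoted passage can be written out as below. This is a sketch in the notation of [1]: the weight names v_a, W_a and U_a follow that paper's appendix, T_x is the input sentence length, and c_i is the context vector passed to the decoder.

```latex
\begin{aligned}
e_{ij} &= a(s_{i-1}, h_j) = v_a^{\top} \tanh\left(W_a s_{i-1} + U_a h_j\right), \\
\alpha_{ij} &= \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \\
c_i &= \sum_{j=1}^{T_x} \alpha_{ij} h_j .
\end{aligned}
```

There is no recurrence in e_{ij}: each score depends only on one decoder state and one annotation.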

Therefore, the alignment score is calculated by adding the output of the decoder hidden state to the encoder outputs. So additive attention is not an RNN cell.
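A minimal NumPy sketch of the additive score described above may make the cost easier to see. It is an illustration under simplifying assumptions, not the implementation from [1]: the function name `additive_attention` and the argument names are hypothetical, every vector is given the same width d (the paper uses different sizes), and the decoder states are passed in as one batch even though a real decoder produces them one step at a time.

```python
import numpy as np

def additive_attention(S, H, W, U, v):
    """Additive (Bahdanau-style) attention, simplified for illustration.

    S: (m, d) decoder states s_{i-1}, one row per output position i
    H: (n, d) encoder annotations h_j, one row per input position j
    W, U: (d, d) projection matrices; v: (d,) scoring vector
    Returns attention weights (m, n) and context vectors (m, d).
    """
    Ws = S @ W.T  # (m, d): project decoder states, roughly m * d^2 operations
    Uh = H @ U.T  # (n, d): project annotations once, roughly n * d^2 operations

    # Add every (i, j) pair by broadcasting, apply tanh, then dot with v.
    # This is the pairwise step: roughly m * n * d operations.
    scores = np.tanh(Ws[:, None, :] + Uh[None, :, :]) @ v  # (m, n)

    # Numerically stable softmax over input positions j.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    context = weights @ H  # (m, d): weighted sum of annotations
    return weights, context

# Example with hypothetical sizes: n positions on both sides, width d.
rng = np.random.default_rng(0)
n, d = 5, 8
S = rng.normal(size=(n, d))
H = rng.normal(size=(n, d))
W = rng.normal(size=(d, d))
U = rng.normal(size=(d, d))
v = rng.normal(size=d)
weights, context = additive_attention(S, H, W, U, v)
```

With m = n, the pairwise scoring step touches n² pairs at a cost proportional to d each, which gives the n²*d term. The two projections cost on the order of n*d², comparable to the linear projections that dot-product attention also needs. Nothing in the function feeds a state back into itself, which is the sense in which it is a feed-forward scorer and not an RNN cell.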

References

[1] Bahdanau, D., Cho, K. and Bengio, Y., 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.

[2] Arbel, N., 2019. Attention in RNNs. Medium blog post.
