
Dropout for LSTM recurrent weights in tensorflow

Tensorflow's DropoutWrapper allows dropout to be applied to either the cell's inputs, outputs or states. However, I haven't seen an option to do the same thing for the recurrent weights of the cell (4 out of the 8 different matrices used in the original LSTM formulation). I just wanted to check that this is the case before implementing a wrapper of my own.
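For reference, a minimal sketch of the options DropoutWrapper does expose (TF 1.x API; the cell size and keep probabilities below are made up for illustration). All three arguments mask activations; none of them touch the recurrent weight matrices themselves:

```python
import tensorflow as tf  # assumes TF 1.x (or tf.compat.v1 in TF 2.x)

cell = tf.nn.rnn_cell.LSTMCell(num_units=128)
cell = tf.nn.rnn_cell.DropoutWrapper(
    cell,
    input_keep_prob=0.8,   # dropout on the inputs x_t fed to the cell
    output_keep_prob=0.8,  # dropout on the outputs h_t emitted by the cell
    state_keep_prob=1.0,   # dropout on the state passed to the next time step
)
```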

EDIT:

Apparently this functionality has been added in newer versions (my original comment referred to v1.4): https://github.com/tensorflow/tensorflow/issues/13103
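I haven't verified which release added it, but DropoutWrapper now accepts a variational_recurrent flag that reuses one dropout mask across all time steps rather than resampling it per step; a hedged sketch under that assumption (parameter values are illustrative), noting that it still masks activations rather than the weight matrices directly:

```python
import tensorflow as tf

cell = tf.nn.rnn_cell.LSTMCell(num_units=128)
cell = tf.nn.rnn_cell.DropoutWrapper(
    cell,
    state_keep_prob=0.9,
    variational_recurrent=True,  # one mask, reused at every time step
    dtype=tf.float32,            # required when variational_recurrent=True
)
```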

It's because the original LSTM model only applies dropout on the input and output layers (that is, only to the non-recurrent connections). This paper is considered the "textbook" description of the LSTM with dropout: https://arxiv.org/pdf/1409.2329.pdf
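To make the distinction concrete, here is a small sketch of that scheme in TF 1.x (my own illustrative setup, not code from the paper): dropout sits only on the non-recurrent connections between stacked layers, never on the step-to-step recurrent path:

```python
import tensorflow as tf

def dropped_layer(num_units, keep_prob):
    cell = tf.nn.rnn_cell.LSTMCell(num_units)
    # output_keep_prob masks the connection feeding the *next layer*,
    # not the recurrent connection feeding the next time step.
    return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)

stacked = tf.nn.rnn_cell.MultiRNNCell([dropped_layer(128, 0.8) for _ in range(2)])
```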

Recently some people have tried applying dropout in the recurrent layers as well. If you want to look at the implementation and the math behind it, search for "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks" by Yarin Gal. I'm not sure Tensorflow or Keras has already implemented this approach, though.
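(For what it's worth, recent tf.keras does expose a recurrent_dropout argument on its LSTM layer, which applies one mask to the recurrent connections at every time step, broadly in the spirit of Gal's method; a minimal sketch with made-up hyperparameters:)

```python
from tensorflow import keras

layer = keras.layers.LSTM(
    units=128,
    dropout=0.2,            # dropout on the input connections
    recurrent_dropout=0.2,  # dropout on the recurrent connections
)
```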
