
Does an LSTM model use trend in features?

Original post: 2022-06-29 13:30:18 | Tags: machine-learning, lstm, lstm-stateful

Does an LSTM take into account a trend in a feature? Or does it only see trends from the previous output (the predicted Y)?

To illustrate, imagine we have a trend in feature A. In our problem, we know that in the real world Y tends to decrease as A increases (an inverse relationship). However, Y is NOT related to the actual value of A.

For example, if A increases from 10 to 20, Y decreases by 1. If A increases from 40 to 50, Y decreases by 1 as well. Similarly, if A decreases from 30 to 20, Y should increase by 1, and if A decreases from 60 to 50, Y should increase by 1 as well.

In the above example, the LSTM will do well if it can understand the trend in feature A. However, if it uses the actual value of feature A, it will not be useful at all. The value of "10" or "50" for A above is in itself meaningless because there is no direct correlation between the values of A and Y; only the trend in A influences Y. A real-world example of this is the correlation between SPY (stock market) and VIX (volatility index). The VIX has an inverse correlation to the movement of SPY, but the actual value of SPY doesn't matter.
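The setup described above can be sketched with a few lines of NumPy. This is only an illustration under stated assumptions: the series `a` and the factor `-0.1` (a +10 move in A changes Y by -1) are hypothetical values taken from the numbers in the example, not real data.

```python
import numpy as np

# Hypothetical values for feature A, chosen to match the example above.
a = np.array([10.0, 20.0, 40.0, 50.0])

# Change in A between consecutive time steps.
delta_a = np.diff(a)

# Assumed rule: a +10 move in A changes Y by -1, whatever level A starts from.
delta_y = -0.1 * delta_a

for prev, curr, dy in zip(a[:-1], a[1:], delta_y):
    print(f"A: {prev:.0f} -> {curr:.0f}   change in Y: {dy:+.1f}")
```

The moves 10 -> 20 and 40 -> 50 produce the same change in Y even though the levels of A differ, which is the sense in which only the trend carries information here.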

I have spent a lot of time learning about LSTMs in general, but it's not clear whether the "memory" remembers the trend in a feature's value. From what I can see, it does not, and only remembers the previous "weights" of the features and the trend in the output (Y).

1 Answer

There are no "previous weights"; the weights are fixed at evaluation time. The network remembers whatever function of the previous inputs and the previous state it learns to remember, based on the recurrent weights it learned during training. The difference between an input at the current time-step and the previous time-step, or an approximation of a longer-range derivative, is certainly something that it could learn if that was useful.
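To make the point about fixed weights and carried state concrete, here is a minimal sketch of a hand-built linear recurrence. It is deliberately not an LSTM: there are no gates or nonlinearities, and the weights `w_in` and `w_prev` are set by hand rather than learned. The function name `run_difference_cell` is hypothetical. It only shows that a state plus fixed weights is enough to represent the difference between the current and previous input, which is the kind of function a trained recurrent network could approximate.

```python
import numpy as np

def run_difference_cell(xs, w_in=1.0, w_prev=-1.0):
    """Toy linear recurrence (not an LSTM): the state stores the previous input."""
    state = xs[0]  # start from the first input, so the first output is 0
    outputs = []
    for x in xs:
        # Fixed weights combine the current input with the remembered one.
        outputs.append(w_in * x + w_prev * state)
        # State update: remember the current input for the next step.
        state = x
    return np.array(outputs)

a = np.array([10.0, 20.0, 40.0, 50.0])  # hypothetical feature values
print(run_difference_cell(a))
```

The weights never change while the sequence is processed; only the state does. An actual LSTM would have to discover a comparable use of its cell state during training, and whether it does depends on the data and the training setup.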