LSTM Keras with multiple features: what am I doing wrong?

Question

The code below predicts the Close value (stock price) from 3 inputs: Close, Open and Volume. Dataset:

             Close    Open   Volume
Date                               
2019-09-20  5489.0  5389.0  1578781
2019-09-23  5420.0  5460.0   622325
2019-09-24  5337.5  5424.0   688395
2019-09-25  5343.5  5326.5   628849
2019-09-26  5387.5  5345.0   619344
...            ...     ...      ...
2020-03-30  4459.0  4355.0  1725236
2020-03-31  4715.0  4550.0  2433310
2020-04-01  4674.5  4596.0  1919728
2020-04-02  5050.0  4865.0  3860103
2020-04-03  5204.5  5050.0  3133078

[134 rows x 3 columns]

Info:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 134 entries, 2019-09-20 to 2020-04-03
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Close   134 non-null    float64
 1   Open    134 non-null    float64
 2   Volume  134 non-null    int64  
dtypes: float64(2), int64(1)

The question is how to correct the script so that it gives a sensible prediction for the last 10 days using the 3 features, because this is what I get:

Epoch 1/1
64/64 [==============================] - 6s 88ms/step - loss: 37135470.9219
[[32.588608]
 [32.587284]
 [32.586754]
 [32.587196]
 [32.58649 ]
 [32.58663 ]
 [32.586098]
 [32.58682 ]
 [32.586452]
 [32.588108]]
rmse: 4625.457010985681

The problem remains even if I remove the scaling (fit_transform) entirely. In other topics I was told there is no need to scale y_train. Full script code:

from math import sqrt
from numpy import concatenate
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding
from keras.layers import LSTM
import numpy as np
from datetime import datetime, timedelta
import yfinance as yf

start = (datetime.now() - timedelta(days=200)).strftime("%Y-%m-%d")
end = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
df = yf.download(tickers="LKOH.ME", start=start, end=end, interval="1d")
dataset = df.loc[start:end].filter(['Close', 'Open', 'Volume']).values
scaler = MinMaxScaler(feature_range=(0,1))

training_data_len = len(dataset) - 10 # last 10 days to test
train_data = dataset[0:int(training_data_len), :]
x_train = []
y_train = []

for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i, :]) # get all 3 features
    y_train.append(train_data[i, 0]) # 0 means we predict Close

x_train, y_train = np.array(x_train), np.array(y_train)
x_train = x_train.reshape((x_train.shape[0], x_train.shape[1]*x_train.shape[2])) # convert to 2d for fit_transform()
x_train = scaler.fit_transform(x_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

model = Sequential()
# Do I need to change it to input_shape=(x_train.shape[1], 3), because of 3 features?
model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(LSTM(50))
model.add(Dense(25))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train, y_train, batch_size=1, epochs=1)

test_data = dataset[training_data_len - 60:, :]
x_test = []
y_test = dataset[training_data_len:, 0]
for i in range(60, len(test_data)):
    x_test.append(test_data[i-60:i, :])

x_test = np.array(x_test)
x_test = x_test.reshape((x_test.shape[0], x_test.shape[1]*x_test.shape[2]))
x_test = scaler.fit_transform(x_test)
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

predictions = model.predict(x_test)
print(predictions)
print('rmse:', np.sqrt(np.mean(((predictions - y_test) ** 2))))

Answer 1

As @rvinas has already mentioned, we need to scale the values and then use inverse_transform to get the desired predicted outcome. You can find the reference here.

After making some small changes to the code, I was able to come up with satisfactory results. We can play around with the data scaling methodologies and model architectures to improve the results.
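
As a rough illustration of the scale-then-invert idea, here is a minimal sketch that uses plain NumPy to reproduce the per-column arithmetic MinMaxScaler applies with feature_range=(0, 1). The helper names (fit_min_max, scale, unscale_close) are hypothetical and not part of the original scripts. It also assumes you want to compute the scaling parameters from the training rows only, which is a design choice this answer's code does not make (it fits on the whole dataset).

```python
import numpy as np

def fit_min_max(train):
    """Per-column minimum and range, computed from the training rows only."""
    col_min = train.min(axis=0)
    col_range = train.max(axis=0) - col_min
    return col_min, col_range

def scale(data, col_min, col_range):
    """Apply (x - min) / (max - min) column by column."""
    return (data - col_min) / col_range

def unscale_close(scaled_close, col_min, col_range, close_col=0):
    """Map scaled Close values back to price units."""
    return scaled_close * col_range[close_col] + col_min[close_col]

# First five rows of the question's table, in the order Close, Open, Volume.
data = np.array([
    [5489.0, 5389.0, 1578781.0],
    [5420.0, 5460.0, 622325.0],
    [5337.5, 5424.0, 688395.0],
    [5343.5, 5326.5, 628849.0],
    [5387.5, 5345.0, 619344.0],
])

train, test = data[:4], data[4:]
col_min, col_range = fit_min_max(train)

train_scaled = scale(train, col_min, col_range)
test_scaled = scale(test, col_min, col_range)   # reuse the training parameters

# Pretend the model predicted the scaled Close exactly, then convert it back.
restored = unscale_close(test_scaled[:, 0], col_min, col_range)
print(restored)
```

With the model's scaled predictions in place of test_scaled[:, 0], the same last step would return them in price units, so an RMSE computed afterwards is comparable to the original prices.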

After some enhancements:

from math import sqrt
from numpy import concatenate
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import mean_squared_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Embedding
from tensorflow.keras.layers import LSTM
from tensorflow.keras.optimizers import SGD
import numpy as np
from datetime import datetime, timedelta
import yfinance as yf

start = (datetime.now() - timedelta(days=200)).strftime("%Y-%m-%d")
end = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
df = yf.download(tickers="LKOH.ME", start=start, end=end, interval="1d")
dataset = df.loc[start:end].filter(['Close', 'Open', 'Volume']).values
scaler = MinMaxScaler(feature_range=(0,1))


dataset = scaler.fit_transform(dataset)
training_data_len = len(dataset) - 10 # last 10 days to test
train_data = dataset[0:int(training_data_len), :]
x_train = []
y_train = []

for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i, :]) # get all 3 features
    y_train.append(train_data[i, 0]) # 0 means we predict Close


x_train, y_train = np.array(x_train), np.array(y_train)
x_train = x_train.reshape((x_train.shape[0], x_train.shape[1]*x_train.shape[2])) # convert to 2d for fit_transform()
x_train_scale = scaler.fit(x_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

model = Sequential()
# Do I need to change it to input_shape=(x_train.shape[1], 3), because of 3 features?
# yes, i did that.

model.add(LSTM(units=50,return_sequences=True, kernel_initializer='random_uniform', input_shape=(x_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=50,return_sequences=True, kernel_initializer='random_uniform'))
model.add(Dropout(0.2))
model.add(LSTM(units=50,return_sequences=True, kernel_initializer='random_uniform'))
model.add(Dropout(0.2))
model.add(LSTM(units=50, kernel_initializer='random_uniform'))
model.add(Dropout(0.2))
model.add(Dense(units=25, activation='relu'))
model.add(Dense(units=1))

# compile model
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()
model.fit(x_train, y_train, batch_size=5, epochs=2)

test_data = dataset[training_data_len - 60:, :]
x_test = []
y_test = dataset[training_data_len:, 0]
for i in range(60, len(test_data)):
    x_test.append(test_data[i-60:i, :])

x_test = np.array(x_test)
x_test = x_test.reshape((x_test.shape[0], x_test.shape[1]*x_test.shape[2]))
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

predictions = model.predict(x_test)
# predictions = y_train_scale.inverse_transform(predictions)
print(predictions)
print('rmse:', np.sqrt(np.mean(((predictions - y_test) ** 2))))
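
On the question in the code comment about input_shape: the script above still flattens each 60x3 window into 180 steps of one feature, so the three columns reach the LSTM interleaved along the time axis. A minimal sketch of the alternative, assuming you want 60 timesteps with 3 features each, is to keep the windows three-dimensional. The helper name make_windows and the random stand-in data are illustrative only.

```python
import numpy as np

def make_windows(data, lookback=60, target_col=0):
    """Return x with shape (samples, lookback, n_features) and y with shape (samples,)."""
    x, y = [], []
    for i in range(lookback, len(data)):
        x.append(data[i - lookback:i, :])   # the previous `lookback` rows, all features
        y.append(data[i, target_col])       # the value to predict (Close is column 0)
    return np.array(x), np.array(y)

# Stand-in for the 124 scaled training rows (134 rows minus the 10 test days).
rng = np.random.default_rng(0)
train_data = rng.random((124, 3))

x_train, y_train = make_windows(train_data)
print(x_train.shape)  # (64, 60, 3)
print(y_train.shape)  # (64,)
```

With arrays shaped like this, no flattening reshape is needed, and the first LSTM layer would take input_shape=(x_train.shape[1], x_train.shape[2]), that is (60, 3).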

Prediction 1:

start = (datetime.now() - timedelta(days=200)).strftime("%Y-%m-%d")

opt = SGD(lr=0.01, momentum=0.9, clipnorm=1.0, clipvalue=0.5)
model.compile(loss='mean_squared_error', optimizer=opt)

[[0.6151125 ]
 [0.6151124 ]
 [0.6151121 ]
 [0.6151119 ]
 [0.61511195]
 [0.61511236]
 [0.61511326]
 [0.615114  ]
 [0.61511385]
 [0.6151132 ]]
rmse: 0.24450220836260966

Prediction 2:

start = (datetime.now() - timedelta(days=1000)).strftime("%Y-%m-%d")

model.compile(optimizer='adam', loss='mean_squared_error')

[[0.647125  ]
 [0.6458076 ]
 [0.6405072 ]
 [0.63450944]
 [0.6315386 ]
 [0.6384401 ]
 [0.65666   ]
 [0.68073314]
 [0.703547  ]
 [0.72095114]]
rmse: 0.1236932687978488
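
One caveat about the rmse lines in these scripts: model.predict returns an array of shape (n, 1), while y_test has shape (n,). Subtracting the two broadcasts to an (n, n) matrix, so the printed figure may not be the element-wise RMSE it appears to be. A small sketch with made-up numbers shows the difference:

```python
import numpy as np

predictions = np.array([[0.60], [0.62], [0.64]])  # shape (3, 1), like model.predict output
y_test = np.array([0.50, 0.62, 0.70])             # shape (3,)

# Broadcasting compares every prediction with every target: shape (3, 3).
diff_broadcast = predictions - y_test
rmse_broadcast = np.sqrt(np.mean(diff_broadcast ** 2))

# Flattening the predictions first gives the intended element-wise difference.
diff_aligned = predictions.ravel() - y_test
rmse_aligned = np.sqrt(np.mean(diff_aligned ** 2))

print(diff_broadcast.shape, rmse_broadcast)
print(diff_aligned.shape, rmse_aligned)
```

Calling predictions.ravel() (or reshaping y_test to a column) before subtracting keeps the two arrays aligned.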

Stock market prices are highly unpredictable and volatile. This means that there are no consistent patterns in the data that allow you to model stock prices over time near-perfectly. So this needs a lot of R&D to come up with a good strategy.

Things that you can do:

Add more training data to your model, so that it can generalize better.
Make the model deeper.
Play around with the model hyperparameters to squeeze more performance out of the model. You can find a good reference here about hyperparameter tuning.

You can find more information about various other data preprocessing techniques and model architectures in the references below:

Stock Market Predictions with LSTM in Python

Machine Learning to Predict Stock Prices

Answer 2

Even though the posted answer is technically correct and provides useful references, I found it a bit annoying that the results of the fitting do not make much sense (you can notice that the predictions are constant, even though y_test isn't). Yes, scaling fixes the loss: with values on the order of 1000, the L2 loss makes any gradient-based algorithm very unstable, and Rishab's answer addresses that. Here is my code snippet, with the following changes in addition to the scaling:

Use more data. I arbitrarily chose 10000 days, but if there is more, you will probably get better results. 200 points are not sufficient to get any convergence better than a straight line.
With more points, use a larger batch, as otherwise it will take a while to fit.
With the larger batch, use more epochs (although in this case, more than 3 do not produce any better convergence).

Lastly, do not just look at the RMSE; plot your data. A small RMSE doesn't necessarily mean there is any meaningful fit.
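
One way to put an RMSE into context, offered here as an illustrative addition and not as part of the script below, is to compare it with a naive persistence baseline that predicts each day's Close as the previous day's Close. The numbers and the helper names (rmse, persistence_forecast) are made up for the sketch.

```python
import numpy as np

def rmse(pred, actual):
    """Element-wise RMSE; both inputs are flattened first to avoid broadcasting surprises."""
    pred, actual = np.ravel(pred), np.ravel(actual)
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

def persistence_forecast(series, n_test):
    """Predict each of the last n_test values as the value one step before it."""
    return series[-n_test - 1:-1]

# Made-up scaled Close series; the last 3 points play the role of the test set.
close = np.array([0.60, 0.61, 0.63, 0.62, 0.66, 0.70, 0.73, 0.75])
n_test = 3

y_test = close[-n_test:]
baseline = persistence_forecast(close, n_test)
print('baseline rmse:', rmse(baseline, y_test))
```

If the model's RMSE on the same test days is not clearly below this baseline, the small number probably says more about the scale of the data than about the quality of the fit.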

With the snippet below I got a reasonably good fit on the training data. And foreseeing the questions: yes, I totally know that I'm overfitting the data, but that is the least the convergence should be doing here, as fitting a straight line is much less meaningful for this kind of problem. This, at least, pretends to predict something.

from math import sqrt
from numpy import concatenate
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding
from keras.layers import LSTM
import numpy as np
from datetime import datetime, timedelta
import yfinance as yf
from matplotlib import pyplot as plt

start = (datetime.now() - timedelta(days=10000)).strftime("%Y-%m-%d")
end = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
df = yf.download(tickers="LKOH.ME", start=start, end=end, interval="1d")
scaler = MinMaxScaler(feature_range=(0,1))
dataset = scaler.fit_transform(df.loc[start:end].filter(['Close', 'Open', 'Volume']).values)

training_data_len = len(dataset) - 10 # last 10 days to test
train_data = dataset[0:int(training_data_len), :]
x_train = []
y_train = []

for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i, :]) # get all 3 features
    y_train.append(train_data[i, 0]) # 0 means we predict Close

x_train, y_train = np.array(x_train), np.array(y_train)
# flatten each 60x3 window into 180 steps of 1 feature: (samples, 180, 1)
x_train = x_train.reshape((x_train.shape[0], -1, 1))

model = Sequential()
# Do I need to change it to input_shape=(x_train.shape[1], 3), because of 3 features?
model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(LSTM(50))
model.add(Dense(25))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train, y_train, batch_size=100, epochs=3)

test_data = dataset[training_data_len - 60:, :]
x_test = []
y_test = dataset[training_data_len:, 0]
for i in range(60, len(test_data)):
    x_test.append(test_data[i-60:i, :])

x_test = np.array(x_test)
# same flattening as for x_train: (samples, 180, 1)
x_test = x_test.reshape((x_test.shape[0], -1, 1))

predictions = model.predict(x_test)
print(predictions)
print('rmse:', np.sqrt(np.mean(((predictions - y_test) ** 2))))

Here is the output:

>>> print(predictions)
[[0.64643383]
 [0.63276255]
 [0.6288108 ]
 [0.6320714 ]
 [0.6572328 ]
 [0.6998471 ]
 [0.7333    ]
 [0.7492812 ]
 [0.7503019 ]
 [0.75124526]]
>>> print('rmse:', np.sqrt(np.mean(((predictions - y_test) ** 2))))
rmse: 0.0712241892828221

To plot, use the following for the training data fit:

plt.plot(model.predict(x_train))
plt.plot(y_train)
plt.show()

and for the test predictions:

plt.plot(model.predict(x_test))
plt.plot(y_test)
plt.show()
