
How to predict future stock prices using LSTM in Keras

First of all, I must say I'm a beginner at AI. I have followed most of the tutorials on stock market prediction, and they are all pretty much the same. They take a data set and split it into two parts: a training set and a test set. They use the closing price of the stock to train a model. They then feed the test set, which also contains the closing price, into that model and plot two graphs, and conclude that the actual and predicted graphs are pretty much the same.

The GitHub repo of the tutorial: https://github.com/surajr/Stock-Predictor-using-LSTM/blob/master/Stock-Predictor-using-LSTM.ipynb

These are my questions:

  1. Why do all those tutorials put the closing price in the test set as well? They are only supposed to insert dates, right? After all, the closing price is what we are predicting. This is confusing; please explain.

  2. No one explains how to predict the values for the next 7 days. If we have a model, how do we get the closing values for the next 7 days?

Please help me clarify this. Thanks a lot.

Take a look at this link. I think it will get you going in the right direction.

https://www.datacamp.com/community/tutorials/lstm-python-stock-market

Why are all those tutorials putting the closing price in the test set as well?

The ultimate goal is to predict the movement (growth), which is the closing price minus the opening price. The ideal model is one whose predicted growth on the test set is very close to the actual growth. Growth is the main quantity the model is trying to predict, and it is the point of reference when you calculate the accuracy of the trained model.
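As a small illustration of that definition, here is a sketch that computes growth from opening and closing prices and compares it with a set of predictions. All the numbers are made up for the example, and mean absolute error is only one possible way to measure how close the predicted growth is to the actual growth.

```python
import numpy as np

# Hypothetical opening and closing prices for five trading days.
open_prices = np.array([100.0, 102.0, 101.0, 105.0, 104.0])
close_prices = np.array([102.0, 101.0, 105.0, 104.0, 108.0])

# Growth as defined above: closing price minus opening price.
growth = close_prices - open_prices

# Direction of the move: +1 for up, -1 for down, 0 for unchanged.
direction = np.sign(growth)

# Hypothetical model output for the same five days.
predicted_growth = np.array([1.5, -0.5, 3.0, 0.0, 4.5])

# Mean absolute error between predicted and actual growth.
mae = np.mean(np.abs(predicted_growth - growth))

print("growth:", growth)
print("direction:", direction)
print("MAE:", mae)
```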

They are only supposed to insert dates, right? Because we are predicting the closing price.

The model predicts the growth based on given factors. For a company, there are many factors that are quantified each day. I suspect the tutorial you followed uses a test set extracted for one particular day across different stocks: for example, extracting all parameters for all companies, but only for the 10th of January, and then checking how accurate the trained model is. The training set, on the other hand, usually contains stock data for more than one day.

No one is telling me how to predict the next 7 days' values. So if we have a model, how do we get the closing values for the next 7 days?

To predict the stock price with reasonable accuracy, you need a well-trained model, which means training it on many, many factors. The same model cannot predict stocks in different countries. A model might be suitable for predicting technology stocks (such as AAPL) but not stocks in other sectors.

Overall, this is a complicated subject. Financial advisers pay a massive amount of money just to use reliable models, and most of them use multiple models depending on their client's portfolio. These tutorials introduce the subject and teach you the main concepts. IMHO, the next step would be learning more and then competing on Kaggle.

In the training set, the closing value is included as an input because it is relevant to the "next day's" price, or the "price in X days" (for models that predict price movement over more than 1 day).

Note that in the training data, the future price (today + 1 day) is typically the target value (train_Y).

In the testing data, the closing price is included for the same reason: the model is being tested on how well it predicts the "future price" from the data available today.

In determining the accuracy of the model, the price prediction for (today + X days) is compared against the future value (test_Y) to determine the effectiveness of the prediction. Just like a human stock trader, if you are guessing whether the FUTURE price will be Y (i.e. up or down), you would have access to the current day's closing price, which is why it is a relevant input. Obviously, in a real-world setting, the accuracy of the prediction would only be known AFTER X days pass. When training and then testing a model, the data is typically historical, so out-of-sample values (like the price at today + X days) are used to determine accuracy, though the FUTURE value should definitely not be an input.
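The input/target relationship described above can be sketched as a sliding-window function. This is only an illustration, not the tutorial's code: `make_windows`, the synthetic `features` array, and the 75% split are hypothetical, and the lookback of 22 days with 3 features is chosen to echo the array shapes printed by the tutorial notebook.

```python
import numpy as np

def make_windows(features, close, lookback, horizon):
    """Build (X, y) pairs for supervised training.

    X[i] holds `lookback` consecutive days of features (closing price included).
    y[i] is the closing price `horizon` days after the last day in X[i],
    so the target itself never appears in the input window.
    """
    X, y = [], []
    last_start = len(features) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback                  # exclusive end of the input window
        X.append(features[start:end])
        y.append(close[end - 1 + horizon])      # future close used as the target
    return np.array(X), np.array(y)

# Synthetic data: 30 days, 3 features per day, with the close in the last column.
n_days, lookback, horizon = 30, 22, 1
features = np.arange(n_days * 3, dtype=float).reshape(n_days, 3)
close = features[:, 2]

X, y = make_windows(features, close, lookback, horizon)

# Chronological split: earlier windows for training, later ones for testing.
split = int(len(X) * 0.75)
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

Both `X_train` and `X_test` contain past closing prices, while `y_train` and `y_test` hold the later closing price that each window is scored against.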

  1. Why are all those tutorials putting the closing price in the test set as well? -> The closing price is an input variable that is required to calculate the stock price.

  2. As I read the code, it seems to predict the stock price from a 22-day history:

X_train (1173, 22, 3) y_train (1173,) X_test (130, 22, 3) y_test (130,)

I think you should re-train with inputs of shape (~~~, 7, 3) to predict the price 7 days after today.
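Another possible way to get values for the next 7 days, not necessarily what this answer intends, is to keep a one-step-ahead model and roll it forward, feeding each prediction back in as the newest closing price. The sketch below assumes this recursive scheme; `forecast_recursive` and `naive_drift` are hypothetical names, and `naive_drift` is only a stand-in for a trained model so that the example runs without Keras.

```python
import numpy as np

def forecast_recursive(predict_next, last_window, steps, close_col=-1):
    """Roll a one-step-ahead predictor forward `steps` days.

    predict_next: callable that takes a (lookback, n_features) window and
                  returns the next closing price.
    last_window:  the most recent (lookback, n_features) block of real data.
    """
    window = np.array(last_window, dtype=float)
    forecasts = []
    for _ in range(steps):
        next_close = float(predict_next(window))
        forecasts.append(next_close)
        # Assumption: the other future features are unknown, so the last row
        # is repeated and only the closing-price column is overwritten.
        next_row = window[-1].copy()
        next_row[close_col] = next_close
        # Drop the oldest day and append the predicted day.
        window = np.vstack([window[1:], next_row])
    return np.array(forecasts)

def naive_drift(window):
    """Stand-in for a trained model: last close plus the mean daily change."""
    closes = window[:, -1]
    return closes[-1] + np.mean(np.diff(closes))

# Synthetic 22-day history with 3 features per day (close in the last column).
closes = 100.0 + np.arange(22)
last_window = np.column_stack([closes - 0.5, closes + 1.0, closes])

forecasts = forecast_recursive(naive_drift, last_window, steps=7)
print(forecasts)
```

With a Keras model, `predict_next` would typically wrap `model.predict` on the window with a batch dimension added, plus whatever scaling the model was trained with. Errors compound with each step in this scheme, so later days are less reliable than the first.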
