Input of layer sequential is incompatible with the layer: shapes error in LSTM
I'm new to neural networks and I want to use them to compare with other machine learning methods. I have multivariate time series data covering approximately two years. I want to predict 'y' for the next few days based on the other variables using an LSTM. The final day of my data is 2020-07-31.
df.tail()
                y  holidays  day_of_month  day_of_week  month  quarter
Date
2020-07-27  32500         0            27            0      7        3
2020-07-28  33280         0            28            1      7        3
2020-07-29  31110         0            29            2      7        3
2020-07-30  37720         0            30            3      7        3
2020-07-31  32240         0            31            4      7        3
To train the LSTM model I also split the data into train and test data.
from sklearn.model_selection import train_test_split
split_date = '2020-07-27' #to predict the next 4 days
df_train = df.loc[df.index <= split_date].copy()
df_test = df.loc[df.index > split_date].copy()
X1=df_train[['day_of_month','day_of_week','month','quarter','holidays']]
y1=df_train['y']
X2=df_test[['day_of_month','day_of_week','month','quarter','holidays']]
y2=df_test['y']
X_train, y_train =X1, y1
X_test, y_test = X2,y2
Because I'm working with LSTM, some scaling is needed:
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
Now, onto the difficult part: the model.
num_units=50
activation_function = 'sigmoid'
optimizer = 'adam'
loss_function = 'mean_squared_error'
batch_size = 10
num_epochs = 100
# Initialize the RNN
regressor = Sequential()
# Adding the input layer and the LSTM layer
regressor.add(LSTM(units = num_units, return_sequences=True ,activation = activation_function,
input_shape=(X_train.shape[1], 1)))
# Adding the output layer
regressor.add(Dense(units = 1))
# Compiling the RNN
regressor.compile(optimizer = optimizer, loss = loss_function)
# Using the training set to train the model
regressor.fit(X_train_scaled, y_train, batch_size = batch_size, epochs = num_epochs)
However, I receive the following error:
ValueError: Input 0 of layer sequential_11 is incompatible with the layer: expected ndim=3, found
ndim=2. Full shape received: [None, 5]
I don't understand how we choose the parameters or the shape of the input. I've seen some videos and read some GitHub pages, and everyone seems to run LSTM in a different way, which makes it even more difficult to implement. The previous error is probably coming from the shape, but other than that, is everything else right? And how can I fix this so that it works? Thanks
EDIT: This similar question does not solve my problem. I've tried the solution from there:
x_train = X_train_scaled.reshape(-1, 1, 5)
x_test = X_test_scaled.reshape(-1, 1, 5)
(My X_test and y_test only have one column.) That solution doesn't seem to work either. I get this error now:
ValueError: Input 0 is incompatible with layer sequential_22: expected shape=
(None, None, 1), found shape=[None, 1, 5]
INPUT:
The problem is that your model expects a 3D input of shape (batch, sequence, features), but your X_train is actually a slice of a data frame, so a 2D array:
X1=df_train[['day_of_month','day_of_week','month','quarter','holidays']]
X_train, y_train =X1, y1
I assume your columns are supposed to be your features, so what you would usually do is "stack slices" of your df so that your X_train looks something like this:
Here is a dummy 2D data set of shape (15,5):
data = np.zeros((15,5))
array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]])
You can reshape it to add a sequence dimension between the batch and feature axes, for example (15,1,5):
data = data[:,np.newaxis,:]
array([[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]],
[[0., 0., 0., 0., 0.]]])
Same data, but presented in a different way. Now in this example, batch = 15 and sequence = 1. I don't know what the sequence length is in your case, but it can be anything.
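For a sequence length greater than 1, the "stack slices" idea can be sketched in plain NumPy as below. This is only an illustration: the helper name make_windows and the sequence length of 3 are assumptions, not something from the question.

```python
import numpy as np

def make_windows(data_2d, sequence_length):
    """Stack overlapping row slices of a (rows, features) array into an
    array of shape (windows, sequence_length, features)."""
    n_windows = data_2d.shape[0] - sequence_length + 1
    return np.stack([data_2d[i:i + sequence_length] for i in range(n_windows)])

# Dummy data with the same shape as the example above: 15 rows, 5 features
data = np.arange(15 * 5, dtype=float).reshape(15, 5)

windows = make_windows(data, sequence_length=3)
print(windows.shape)  # (13, 3, 5): batch = 13, sequence = 3, features = 5
```

With sequence_length=1 the helper returns the same (15,1,5) array as data[:,np.newaxis,:].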
MODEL:
Now in your model, Keras expects data of shape (batch, sequence, features), and input_shape describes that shape without the batch dimension, i.e. (sequence, features). When you pass this:
input_shape=(X_train.shape[1], 1)
This is what your model sees: (None, sequence = X_train.shape[1], num_features = 1). None is for the batch dimension. I don't think that's what you are trying to do, so once you've reshaped the data you should also correct input_shape to match the new array. For the (15,1,5) array above, that would be input_shape=(1, 5).
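Putting the two parts together, a minimal sketch might look like the following. The random arrays are hypothetical stand-ins for X_train_scaled and y_train, and the 8 LSTM units and single epoch are arbitrary choices to keep the run short. keras.Input(shape=(1, 5)) is used here in place of passing input_shape=(1, 5) to the first layer; both declare the same (sequence, features) shape.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, models

# Hypothetical stand-ins for X_train_scaled and y_train: 20 rows, 5 features
X_2d = np.random.rand(20, 5).astype("float32")
y = np.random.rand(20).astype("float32")

# Add a sequence axis of length 1: (batch, sequence, features) = (20, 1, 5)
X_3d = X_2d[:, np.newaxis, :]

model = models.Sequential()
# The declared shape excludes the batch dimension: (sequence, features)
model.add(keras.Input(shape=(X_3d.shape[1], X_3d.shape[2])))
# return_sequences is left at its default (False): one output per sample
model.add(layers.LSTM(8))
model.add(layers.Dense(1))
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(X_3d, y, batch_size=10, epochs=1, verbose=0)

pred = model.predict(X_3d, verbose=0)
print(pred.shape)  # one prediction per sample
```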
It is a multivariate regression problem you are solving using LSTM. Before jumping into the code, let's see what that means:
You have 5 features (holidays, day_of_month, day_of_week, month, quarter) per day for k days.
You want to predict the y of the n-th day.
No predictions are available for p records, where p is the sequence length.
The window dataset is created with the timeseries_dataset_from_array method.
Pictorially, what we want to achieve is shown below (figure not included):
For each LSTM cell unrolling, we pass in the 5 features of the day, and we unroll m times, where m is the sequence length. We are predicting the y of the last day.
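To make the pairing of windows and targets concrete, here is a small sketch with made-up numbers (6 days, m = 3, and y values 100 to 105 are all assumptions chosen for readability). Each window of m consecutive days is paired with the y of its last day, so the first m - 1 days never serve as a target.

```python
import numpy as np

m = 3                               # sequence length
n_days = 6
y = np.arange(100, 100 + n_days)    # y of day 0..5 is 100..105

# Window i covers days i .. i+m-1; its target is the y of the last day in it
window_days = [list(range(i, i + m)) for i in range(n_days - m + 1)]
targets = y[m - 1:]

for days, target in zip(window_days, targets):
    print(days, "->", target)
# [0, 1, 2] -> 102
# [1, 2, 3] -> 103
# [2, 3, 4] -> 104
# [3, 4, 5] -> 105
```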
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
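# Model
regressor = models.Sequential()
# return_sequences=False: one output per window (the y of its last day),
# so the predictions line up with the one-target-per-window dataset below
regressor.add(layers.LSTM(5, return_sequences=False))
regressor.add(layers.Dense(1))
regressor.compile(optimizer='sgd', loss='mse')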
# Dummy data
n = 10000
df = pd.DataFrame(
{
'y': np.arange(n),
'holidays': np.random.randn(n),
'day_of_month': np.random.randn(n),
'day_of_week': np.random.randn(n),
'month': np.random.randn(n),
'quarter': np.random.randn(n),
}
)
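# Train test split
# Note: train_test_split shuffles rows by default, which is harmless for this
# random dummy data; for a real time series pass shuffle=False to keep time order
train_df, test_df = train_test_split(df)
print(train_df.shape, test_df.shape)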
# Create y to be predicted
# given the last sequence_length days, predict today's y
# train data
sequence_length = 3
y_pred = train_df['y'][sequence_length-1:].values
# drop the last sequence_length-1 rows, which have no shifted target
train_df = train_df[:-(sequence_length-1)].copy()
train_df['y_pred'] = y_pred
# Validation data
y_pred = test_df['y'][sequence_length-1:].values
test_df = test_df[:-(sequence_length-1)].copy()
test_df['y_pred'] = y_pred
# Create window datagenerators
# Train data generator
train_X = train_df[['holidays','day_of_month','day_of_week','month','quarter']]
train_y = train_df['y_pred']
train_dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
train_X, train_y, sequence_length=sequence_length, shuffle=True, batch_size=4)
# Validation data generator
test_X = test_df[['holidays','day_of_month','day_of_week','month','quarter']]
test_y = test_df['y_pred']
test_dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
test_X, test_y, sequence_length=sequence_length, shuffle=True, batch_size=4)
# Finally fit the model
regressor.fit(train_dataset, validation_data=test_dataset, epochs=3)
Output:
(7500, 6) (2500, 6)
Epoch 1/3
1874/1874 [==============================] - 8s 3ms/step - loss: 9974697.3664 - val_loss: 8242597.5000
Epoch 2/3
1874/1874 [==============================] - 6s 3ms/step - loss: 8367530.7117 - val_loss: 8256667.0000
Epoch 3/3
1874/1874 [==============================] - 6s 3ms/step - loss: 8379048.3237 - val_loss: 8233981.5000
<tensorflow.python.keras.callbacks.History at 0x7f3e94bdd198>
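As a side note on reading those numbers (an observation added here, not part of the original answer): the dummy y is np.arange(10000) and the features are pure noise, so the best the model can do is predict roughly the mean of y. The MSE of that constant prediction is the variance of y, which the sketch below computes. It comes out at about 8.33 million, the same order as the losses printed above, which suggests the large values reflect the unscaled dummy target rather than a shape problem.

```python
import numpy as np

n = 10000
y = np.arange(n)  # same dummy target as in the example above

# MSE of always predicting the mean of y, i.e. the variance of y
baseline_mse = np.mean((y - y.mean()) ** 2)
print(baseline_mse)  # 8333333.25
```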