Input of layer sequential is incompatible with the layer: shape error in LSTM

Question

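I am new to neural networks and I want to use them to compare against other machine learning methods. I have a multivariate time series spanning about two years. I want to use an LSTM to predict "y" for the next few days based on the other variables. The last day in my data is 2020-07-31.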

df.tail()

                y  holidays  day_of_month  day_of_week  month  quarter
Date
2020-07-27  32500         0            27            0      7        3
2020-07-28  33280         0            28            1      7        3
2020-07-29  31110         0            29            2      7        3
2020-07-30  37720         0            30            3      7        3
2020-07-31  32240         0            31            4      7        3

To train the LSTM model, I also split the data into training and test sets.

from sklearn.model_selection import train_test_split
split_date = '2020-07-27' #to predict the next 4 days
df_train = df.loc[df.index <= split_date].copy()
df_test = df.loc[df.index > split_date].copy()
X1=df_train[['day_of_month','day_of_week','month','quarter','holidays']]
y1=df_train['y']
X2=df_test[['day_of_month','day_of_week','month','quarter','holidays']]
y2=df_test['y']

X_train, y_train =X1, y1
X_test, y_test = X2,y2

Since I am using an LSTM, some scaling is needed:

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

Now for the hard part: the model.

num_units=50
activation_function = 'sigmoid'
optimizer = 'adam'
loss_function = 'mean_squared_error'
batch_size = 10
num_epochs = 100

 # Initialize the RNN
regressor = Sequential()

 # Adding the input layer and the LSTM layer
regressor.add(LSTM(units = num_units, return_sequences=True ,activation = activation_function, 
input_shape=(X_train.shape[1], 1)))

 # Adding the output layer
regressor.add(Dense(units = 1))

 # Compiling the RNN
regressor.compile(optimizer = optimizer, loss = loss_function)

# Using the training set to train the model
regressor.fit(X_train_scaled, y_train, batch_size = batch_size, epochs = num_epochs)

However, I get the following error:

ValueError: Input 0 of layer sequential_11 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 5]

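I don't understand how we choose the parameters or the shape of the input. I have watched some videos and read some GitHub pages, and everyone seems to run LSTMs in a different way, which makes it even harder to implement. The error above probably comes from the shape, but apart from that, is everything else correct? How can I fix this? Thanks.

Edit: This similar question does not solve my problem. I have tried the solution from there: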

x_train = X_train_scaled.reshape(-1, 1, 5)
x_test  = X_test_scaled.reshape(-1, 1, 5)

(My X_test and y_test have only one column.) That solution does not seem to work either. I now get this error:

ValueError: Input 0 is incompatible with layer sequential_22: expected shape=(None, None, 1), found shape=[None, 1, 5]

Answer 1

Input:

The problem is that your model expects a 3D input of shape (batch, sequence, features), but your X_train is actually a slice of a DataFrame, so it is a 2D array:

X1=df_train[['day_of_month','day_of_week','month','quarter','holidays']]
X_train, y_train =X1, y1

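I assume your columns are supposed to be your features, so what you would usually do is stack slices of your df so that your X_train looks something like this:

Here is a dummy 2D dataset of shape (15,5):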

data = np.zeros((15,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

You can reshape it to add a sequence (time-step) dimension of length 1, giving shape (15,1,5):

data = data[:,np.newaxis,:] 

array([[[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]]])

Same data, just presented differently. In this example, batch = 15 and sequence = 1. I don't know what sequence length suits your case, but it can be anything.
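To make the "stack slices" idea concrete for a sequence length greater than 1, here is a minimal NumPy sketch. The sequence length of 3 and the dummy values are assumptions chosen only for illustration; they are not taken from the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Dummy 2D data: 15 days, 5 features per day (distinct values so windows are easy to inspect)
data = np.arange(15 * 5, dtype=float).reshape(15, 5)

seq_len = 3  # assumed sequence length, for illustration only

# sliding_window_view appends the window axis last: (num_windows, features, seq_len)
windows = sliding_window_view(data, window_shape=seq_len, axis=0)

# Reorder to the (batch, sequence, features) layout an LSTM expects
windows = windows.transpose(0, 2, 1)

print(windows.shape)  # (13, 3, 5)
```

Each window holds seq_len consecutive rows of the original array, so 15 days yield 15 - 3 + 1 = 13 overlapping samples.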

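Model:

Now, in your model, Keras expects inputs of shape (batch, sequence, features). The input_shape argument describes a single sample, (sequence, features), with the batch dimension left out. So when you pass this: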

input_shape=(X_train.shape[1], 1)

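this is what your model sees: (None, sequence = X_train.shape[1], num_features = 1), where None stands for the batch dimension. I don't think that is what you intended. Once you have reshaped the data, you should also correct input_shape to match the new array.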
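Putting the two pieces together, a minimal sketch of a matching reshape and input shape could look like the following. The random arrays are placeholders standing in for X_train_scaled and y_train, and dropping return_sequences (so the model emits one value per sample) is my assumption about what the question intends, not something the question states.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_samples, n_features = 100, 5  # placeholder sizes, not the asker's real data
X = np.random.rand(n_samples, n_features).astype("float32")  # stands in for X_train_scaled
y = np.random.rand(n_samples).astype("float32")              # stands in for y_train

# 2D (samples, features) -> 3D (samples, 1, features): one time step per sample
X_3d = X.reshape(-1, 1, n_features)

model = keras.Sequential([
    keras.Input(shape=(1, n_features)),  # (sequence, features); the batch axis is implicit
    layers.LSTM(50),                     # return_sequences defaults to False: one vector per sample
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(X_3d, y, batch_size=10, epochs=1, verbose=0)

print(model.output_shape)  # (None, 1)
```

With a sequence length of 1 the LSTM never sees more than one day at a time, so this only removes the shape error; feeding several past days per sample requires windowed data instead.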

Answer 2

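This is a multivariate regression problem that you are solving with an LSTM. Before jumping into the code, let's look at what that actually means.

Problem statement:

You have 5 features (holidays, day_of_month, day_of_week, month, quarter) per day for k days.
For any day n, given the features of the last m days, you want to predict y for day n.

Creating the window dataset:

First we need to decide how many days we want to feed to the model. This is called the sequence length (fixed to 3 in this example).
We have to split the data into chunks of sequence-length days to create the training and test datasets. This is done with a sliding window whose size is the sequence length.
Note that with this scheme a few records at the end of the series cannot be paired with a target: the code below trims the last sequence_length - 1 rows.
We will use the timeseries_dataset_from_array method to create the window dataset.
For more advanced usage, see the official TensorFlow documentation.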
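The pairing of windows and targets described above can be sketched by hand in NumPy. This is an illustration of the alignment only, not the internals of timeseries_dataset_from_array, and the tiny dataset (8 days, values derived from the day index) is hypothetical.

```python
import numpy as np

sequence_length = 3
n_days, n_features = 8, 5

# Hypothetical data: every feature of day d equals d, and y of day d equals 10 * d
features = np.repeat(np.arange(n_days)[:, None], n_features, axis=1).astype(float)
y = np.arange(n_days) * 10.0

windows, targets = [], []
for start in range(n_days - sequence_length + 1):
    end = start + sequence_length        # exclusive end of the window
    windows.append(features[start:end])  # days start .. end - 1
    targets.append(y[end - 1])           # y of the last day in the window

windows = np.stack(windows)  # shape (6, 3, 5)
targets = np.array(targets)  # shape (6,)
print(windows.shape, targets.shape)
```

The first window covers days 0 to 2 and is paired with the y of day 2, so 8 days produce 6 training samples.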

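LSTM model

What we want to achieve is the following: for each unrolled LSTM cell we pass in the 5 features of that day, and we unroll for m time steps, where m is the sequence length. We predict y for the last day.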
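Predicting only the last day's y corresponds to keeping only the final time step of the LSTM output. The sketch below compares the output shapes with and without return_sequences; the batch size and the 8 units are arbitrary illustrative values.

```python
import numpy as np
from tensorflow.keras import layers

batch, m, n_features = 4, 3, 5  # illustrative sizes; m is the sequence length
x = np.random.rand(batch, m, n_features).astype("float32")

last_only = layers.LSTM(8)(x)                          # output of the final time step only
every_step = layers.LSTM(8, return_sequences=True)(x)  # one output per time step

print(last_only.shape)   # (4, 8)
print(every_step.shape)  # (4, 3, 8)
```

A Dense(1) layer on top of the first variant gives one prediction per window, which matches a target of one y value per window.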

Code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

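# Model: a single LSTM layer followed by a dense output
regressor = models.Sequential()
# return_sequences is left at its default (False) so that only the last
# time step is kept and the model emits one prediction per window
regressor.add(layers.LSTM(5))
regressor.add(layers.Dense(1))
regressor.compile(optimizer='sgd', loss='mse')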

# Dummy data
n = 10000
df = pd.DataFrame(
    {
      'y': np.arange(n),
      'holidays': np.random.randn(n),
      'day_of_month': np.random.randn(n),
      'day_of_week': np.random.randn(n),
      'month': np.random.randn(n),
      'quarter': np.random.randn(n),     
    }
)

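# Train/test split
# Note: train_test_split shuffles rows by default; for a real time series,
# pass shuffle=False to keep the rows in chronological order
train_df, test_df = train_test_split(df)
print(train_df.shape, test_df.shape)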

# Create the target column:
# given the last sequence_length days, predict the last day's y

# Training data
sequence_length = 3
y_pred = train_df['y'][sequence_length-1:].values
train_df = train_df[:-(sequence_length-1)].copy()  # copy avoids SettingWithCopyWarning
train_df['y_pred'] = y_pred

# Validation data
y_pred = test_df['y'][sequence_length-1:].values
test_df = test_df[:-(sequence_length-1)].copy()
test_df['y_pred'] = y_pred

# Create the windowed datasets

# Training dataset
train_X = train_df[['holidays','day_of_month','day_of_week','month','quarter']]
train_y = train_df['y_pred']
train_dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    train_X, train_y, sequence_length=sequence_length, shuffle=True, batch_size=4)

# Validation dataset
test_X = test_df[['holidays','day_of_month','day_of_week','month','quarter']]
test_y = test_df['y_pred']
test_dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    test_X, test_y, sequence_length=sequence_length, shuffle=True, batch_size=4)

# Finally fit the model
regressor.fit(train_dataset, validation_data=test_dataset, epochs=3)

Output (loss values vary from run to run):

(7500, 6) (2500, 6)
Epoch 1/3
1874/1874 [==============================] - 8s 3ms/step - loss: 9974697.3664 - val_loss: 8242597.5000
Epoch 2/3
1874/1874 [==============================] - 6s 3ms/step - loss: 8367530.7117 - val_loss: 8256667.0000
Epoch 3/3
1874/1874 [==============================] - 6s 3ms/step - loss: 8379048.3237 - val_loss: 8233981.5000
<tensorflow.python.keras.callbacks.History at 0x7f3e94bdd198>
