LSTM Keras - Many to many classification Value error: incompatible shapes

Question

I have just started implementing an LSTM in Python with TensorFlow / Keras to test out an idea I had, but I am struggling to create a proper model. This post is mainly about a ValueError that I often get (see the code at the bottom), but any and all help with creating a proper LSTM model for the problem below is greatly appreciated.

For each day, I want to predict which of a group of events will occur. The idea is that some events are recurring / always occur after a certain amount of time has passed, whereas other events occur only rarely or without any structure. An LSTM should be able to pick up on these recurring events in order to predict their occurrences for days in the future.

To represent the events, I use a list with values 0 and 1 (non-occurrence and occurrence). So for example, if I have the events ["Going to school", "Going to the gym", "Buying a computer"], I have lists like [1, 0, 1], [1, 1, 0], [1, 0, 1], [1, 1, 0] etc. The idea is then that the LSTM will recognize that I go to school every day, go to the gym every other day, and that buying a computer is very rare. So following the sequence of vectors, for the next day it should predict [1, 0, 0].

So far I have done the following:

Create x_train: a numpy.array with shape (305, 60, 193). Each entry of x_train contains 60 consecutive days, where each day is represented by a vector of the same 193 events that can take place, as described above.
Create y_train: a numpy.array with shape (305, 1, 193). Similar to x_train, but y_train only contains 1 day per entry.

x_train[0] consists of days 1, 2, ..., 60 and y_train[0] contains day 61. x_train[1] then contains days 2, ..., 61 and y_train[1] contains day 62, etc. The idea is that the LSTM should learn to use data from the past 60 days, and that it can then iteratively start predicting/generating new vectors of event occurrences for future days.
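The windowing described above can be sketched with NumPy. This is only an illustration: the function name `make_windows`, the random data, and the total of 365 days are assumptions (365 days happens to yield the 305 samples mentioned in the question). Note that the targets come out as (305, 193) here; indexing them with `[:, np.newaxis, :]` would give the (305, 1, 193) layout from the question.

```python
import numpy as np

def make_windows(days, window=60):
    """Split a (n_days, n_features) array into sliding-window inputs and next-day targets."""
    n_samples = len(days) - window
    if n_samples <= 0:
        raise ValueError("need more days than the window length")
    # x[i] holds days i .. i+window-1; y[i] holds the day right after that window
    x = np.stack([days[i:i + window] for i in range(n_samples)])
    y = days[window:]
    return x, y

rng = np.random.default_rng(0)
# Hypothetical data: 365 days, 193 binary event indicators per day
days = (rng.random((365, 193)) < 0.1).astype("float32")

x_train, y_train = make_windows(days, window=60)
print(x_train.shape, y_train.shape)
```

Because consecutive windows overlap, the last day of x_train[1] is the same day as y_train[0], which is a quick sanity check that the alignment is right.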

I am really struggling with how to create a simple implementation of an LSTM that can handle this. So far I think I have figured out the following:

I need to start with the block of code below, where N_INPUTS = 60 and N_FEATURES = 193. I am not sure what N_BLOCKS should be, or whether the value it takes is strictly bound by some conditions. EDIT: According to https://zhuanlan.zhihu.com/p/58854907 it can be whatever I want.

model = Sequential()
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))

I should probably add a dense layer. If I want the output of my LSTM to be a vector with the 193 events, this should look as follows:

model.add(layers.Dense(193, activation='linear'))  # or some other activation function

I can also add a dropout layer to prevent overfitting, for example with model.add(layers.Dropout(0.2)), where 0.2 is the fraction of inputs that are set to 0 during training.
I need to add a model.compile(loss=..., optimizer=...). I am not sure whether the loss function (e.g. MSE or categorical_crossentropy) and the optimizer matter if I just want a working implementation.
I need to train my model, which I can achieve by using model.fit(x_train, y_train).
If all of the above works well, I can start to predict values for the next day using model.predict(the 60 days before the day I want to predict).
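On the question of whether the loss matters: since several events can occur on the same day, the targets are multi-label, and a softmax output (whose probabilities must sum to 1) cannot mark two events as likely at once. The NumPy sketch below illustrates this with hypothetical scores for the three example events; the helper functions are written out by hand for illustration. In Keras, the corresponding choice would presumably be activation='sigmoid' together with loss='binary_crossentropy', offered here as a suggestion rather than something stated in the post.

```python
import numpy as np

def softmax(z):
    """Probabilities that compete with each other and sum to 1 per row."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    """One independent probability per event."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    """Mean per-event binary cross-entropy."""
    p = np.clip(p, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

# Hypothetical raw scores for ["school", "gym", "computer"] on one day
logits = np.array([[4.0, 3.0, -5.0]])
# Target: school and gym both happen, no computer is bought
target = np.array([[1.0, 1.0, 0.0]])

p_soft = softmax(logits)
p_sig = sigmoid(logits)

print("softmax:", p_soft.round(3), "sum =", p_soft.sum())
print("sigmoid:", p_sig.round(3))
print("BCE with softmax outputs:", binary_crossentropy(target, p_soft))
print("BCE with sigmoid outputs:", binary_crossentropy(target, p_sig))
```

With the same scores, the sigmoid outputs can be high for both school and gym, while softmax has to split the probability mass between them.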

One of my attempts can be seen here:

print(x_train.shape)
print(y_train.shape)

model = keras.Sequential()
model.add(layers.LSTM(256, input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(y_train.shape[2], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
model.fit(x_train,y_train) #<- This line causes the ValueError

Output:
(305, 60, 193)
(305, 1, 193)
Model: "sequential_29"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_27 (LSTM)              (None, 256)               460800    
                                                                 
 dense_9 (Dense)             (None, 1)                 257       
                                                                 
=================================================================
Total params: 461,057
Trainable params: 461,057
Non-trainable params: 0
_________________________________________________________________
ValueError: Shapes (None, 1, 193) and (None, 193) are incompatible

Alternatively, I have tried replacing the line model.add(layers.Dense(y_train.shape[2], activation='softmax')) with model.add(layers.Dense(y_train.shape[1], activation='softmax')). This produces ValueError: Shapes (None, 1, 193) and (None, 1) are incompatible.

Are my ideas somewhat okay? How can I resolve this ValueError? Any help would be greatly appreciated.

EDIT: As suggested in the comments, changing the shape of y_train did the trick.

print(x_train.shape)
print(y_train.shape)

model = keras.Sequential()
model.add(layers.LSTM(193, input_shape=(x_train.shape[1], x_train.shape[2])))  # The 193 can be any number; see: https://zhuanlan.zhihu.com/p/58854907
model.add(layers.Dropout(0.2))
model.add(layers.Dense(y_train.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
model.fit(x_train,y_train)


(305, 60, 193)
(305, 193)
Model: "sequential_40"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm_38 (LSTM)              (None, 193)               298764    
                                                                 
 dropout_17 (Dropout)        (None, 193)               0         
                                                                 
 dense_16 (Dense)            (None, 193)               37442     
                                                                 
=================================================================
Total params: 336,206
Trainable params: 336,206
Non-trainable params: 0
_________________________________________________________________
10/10 [==============================] - 3s 89ms/step - loss: 595.5011

Now I am stuck on the fact that model.predict(x) requires x to be of the same size as x_train, and will output an array with the same size as y_train. I was hoping only one set of 60 days would be required to output the 61st day. Does anyone know how to achieve this?
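For what it is worth, Keras only fixes the trailing dimensions (60, 193) of the input; the number of samples in a batch is free. A single 60-day window therefore just needs a leading batch axis, giving shape (1, 60, 193), and the prediction comes back with shape (1, 193). The sketch below shows that reshaping plus the iterative roll-forward idea. The names `forecast` and `dummy_predict` and the 0.5 threshold are assumptions for illustration; `dummy_predict` merely stands in for `model.predict` so the example runs without a trained model.

```python
import numpy as np

def forecast(predict_fn, window, n_days, threshold=0.5):
    """Roll a (timesteps, features) window forward n_days, one predicted day at a time."""
    window = np.asarray(window, dtype="float32").copy()
    predictions = []
    for _ in range(n_days):
        batch = window[np.newaxis, ...]            # add batch axis: (1, timesteps, features)
        probs = np.asarray(predict_fn(batch))[0]   # drop batch axis: (features,)
        next_day = (probs >= threshold).astype("float32")
        predictions.append(next_day)
        # Drop the oldest day and append the predicted day to form the next window
        window = np.concatenate([window[1:], next_day[np.newaxis, :]], axis=0)
    return np.stack(predictions)

def dummy_predict(batch):
    """Hypothetical stand-in for model.predict: repeats the most recent day."""
    return batch[:, -1, :]

rng = np.random.default_rng(0)
last_60_days = (rng.random((60, 193)) < 0.1).astype("float32")

future = forecast(dummy_predict, last_60_days, n_days=7)
print(future.shape)
```

With a trained model, passing `model.predict` as `predict_fn` should work the same way. Thresholding at 0.5 assumes the outputs are per-event probabilities, so it fits a sigmoid output better than a softmax one.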

Answer 1

The solution may be to give y_train the shape (305, 193) instead of (305, 1, 193), since you predict a single day. This does not change the data, only its shape. You should then be able to train and predict, using model.add(layers.Dense(y_train.shape[1], activation='softmax')), of course.
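A minimal sketch of that reshape, using a random placeholder array with the shape from the question (the variable name `y_flat` is just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder targets with the shape from the question: (samples, 1, events)
y_train = (rng.random((305, 1, 193)) < 0.1).astype("float32")

# Remove the length-1 middle axis; the values themselves are untouched
y_flat = np.squeeze(y_train, axis=1)
print(y_flat.shape)

# Equivalent alternative using reshape
y_flat_alt = y_train.reshape(y_train.shape[0], y_train.shape[2])
```

After this, y_flat.shape[1] is 193, which is why the Dense layer in the answer uses y_train.shape[1] as its number of units.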

Solution 1
1 | Accepted | 2022-05-04 14:25:14

LSTM Keras - 多对多分类值错误：不兼容的形状

问题描述

1 个解决方案

解决方案1 1 已采纳 2022-05-04 14:25:14

解决方案1
1 已采纳 2022-05-04 14:25:14