
tensorflow/keras lstm input shape

I am trying to learn the Keras functional API through the tutorials from Keras, and when I try to modify the example, I seem to get a shape mismatch. The only difference between the tutorial code and the code below is that I removed the embedding layer, since mine is a regression problem.

Firstly, I am aware that LSTM expects 3 dimensions. In my example, I have:

TRAIN_BATCH_SIZE=32
MODEL_INPUT_BATCH_SIZE=128

headline_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 100)).astype(np.float32)
additional_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 5)).astype(np.float32)
labels = np.random.randint(0, 1 + 1, size=(MODEL_INPUT_BATCH_SIZE, 1))

main_input = Input(shape=(100,), dtype='float32', name='main_input')

lstm_out = LSTM(32)(main_input)

auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)

auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])

# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)

# This defines a model with two inputs and two outputs:
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])

model.compile(optimizer='rmsprop',
                loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
                loss_weights={'main_output': 1., 'aux_output': 0.2})

# And trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
                {'main_output': labels, 'aux_output': labels},
                epochs=2, batch_size=TRAIN_BATCH_SIZE)

When I run the above, I get:

ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2

So, I tried changing my input shape like so:

main_input = Input(shape=(100,1), dtype='float64', name='main_input')

and when I run this, I get:

ValueError: Error when checking input: expected main_input to have 3 dimensions, but got array with shape (128, 100)

I am perplexed and lost as to where the error is coming from. I would really appreciate some guidance on this.

EDIT

I have also tried setting:

headline_data = np.expand_dims(headline_data, axis=2)

and then used:

main_input = Input(shape=headline_data.shape, dtype='float64', name='main_input')

then, I get:

ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4

Seems really strange!

ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2

Your problem is with the shape of your data.

headline_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 100))
headline_data.shape

returns

(128, 100)

However, an LSTM input should have three dimensions: (batch_size, timesteps, features).

Without double-checking, you probably need to do something like:

headline_data = headline_data.reshape(128, 1, 100)  # reshape returns a new array, so assign the result
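
To make the target shape concrete, here is a minimal NumPy sketch (shapes and variable names follow the question): an LSTM expects input shaped (batch, timesteps, features), and there are two reasonable ways to add the missing axis, depending on whether the 100 values are one timestep with 100 features, or 100 timesteps with 1 feature each:

import numpy as np

headline_data = np.random.uniform(low=1, high=9000, size=(128, 100)).astype(np.float32)

# Option 1: one timestep with 100 features -> shape (128, 1, 100)
one_step = headline_data.reshape(128, 1, 100)

# Option 2: 100 timesteps with 1 feature each -> shape (128, 100, 1)
per_step = headline_data.reshape(128, 100, 1)  # same layout as np.expand_dims(headline_data, axis=2)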

Have a look at this post; it should clear everything up.

Link

* UPDATE *

Do the following:

headline_data = headline_data.reshape(128, 1, 100)
main_input = Input(shape=(1, 100), dtype='float32', name='main_input')
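
Two notes on why this works (my reading of how Keras shapes behave, so treat it as a hedged explanation): Input(shape=...) describes a single sample and excludes the batch axis, which is also why passing headline_data.shape (batch axis included) in the EDIT produced the "expected ndim=3, found ndim=4" error. If you would rather not hard-code the sizes, np.expand_dims gives the same layout:

import numpy as np
from tensorflow.keras.layers import Input

# assuming headline_data still has its original 2-D shape (128, 100)
headline_data = np.expand_dims(headline_data, axis=1)  # (128, 100) -> (128, 1, 100)
main_input = Input(shape=headline_data.shape[1:],      # (1, 100): batch axis dropped
                   dtype='float32', name='main_input')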

I tested it and it works, so let me know if it doesn't for you =)

---- Complete Code: ----

import numpy as np

from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, LSTM, Dense


TRAIN_BATCH_SIZE=32
MODEL_INPUT_BATCH_SIZE=128

headline_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 100)).astype(np.float32)
lstm_data = headline_data.reshape(MODEL_INPUT_BATCH_SIZE, 1, 100)  # (128, 100) -> (128, 1, 100)
additional_data = np.random.uniform(low=1, high=9000, size=(MODEL_INPUT_BATCH_SIZE, 5)).astype(np.float32)
labels = np.random.randint(0, 1 + 1, size=(MODEL_INPUT_BATCH_SIZE, 1))

main_input = Input(shape=(1,100), dtype='float32', name='main_input')

lstm_out = LSTM(32)(main_input)

auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)

auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])

# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)

# This defines a model with two inputs and two outputs:
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])

model.compile(optimizer='rmsprop',
                loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
                loss_weights={'main_output': 1., 'aux_output': 0.2})

# And trained it via:
model.fit({'main_input': lstm_data, 'aux_input': additional_data},
                {'main_output': labels, 'aux_output': labels},
                epochs=1000, batch_size=TRAIN_BATCH_SIZE)
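
As a quick sanity check appended to the script above (a hedged usage sketch; for a multi-output model, predict returns one array per output, in the order passed to Model(outputs=...)):

model.summary()
main_preds, aux_preds = model.predict(
    {'main_input': lstm_data, 'aux_input': additional_data},
    batch_size=TRAIN_BATCH_SIZE)
print(main_preds.shape, aux_preds.shape)  # expected: (128, 1) (128, 1)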
