Keras (TensorFlow): how is the shape of the last layer's tensor calculated?

Question

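I'm currently working on a convolutional NN that generates images and a recurrent NN that generates audio. I have built a generator for each, but for some reason the build_audio_generator model ends with a tensor of shape (?, 1) instead of (?, 28, 28, 1): Tensor("model_4/sequential_4/activation_4/Tanh:0", shape=(?, 1), dtype=float32). My question is: how do I change the code of build_audio_generator so that its output has the same shape as that of build_generator, (?, 28, 28, 1)?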

Code:

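from keras.layers import (Input, Dense, Reshape, Flatten, Dropout, multiply,
                          BatchNormalization, Activation, Embedding,
                          UpSampling2D, Conv2D, LSTM)
from keras.models import Sequential, Model

def build_generator(latent_dim, channels, num_classes):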

    model = Sequential()

    model.add(Dense(128 * 7 * 7, activation="relu", input_dim=latent_dim))
    model.add(Reshape((7, 7, 128)))
    model.add(BatchNormalization(momentum=0.8))
    model.add(UpSampling2D())
    model.add(Conv2D(128, kernel_size=3, padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(UpSampling2D())
    model.add(Conv2D(64, kernel_size=3, padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Conv2D(channels, kernel_size=3, padding='same'))
    model.add(Activation("tanh"))

    model.summary()

    noise = Input(shape=(latent_dim,))
    label = Input(shape=(1,), dtype='int32')


    label_embedding = Flatten()(Embedding(num_classes, 100)(label))

    model_input = multiply([noise, label_embedding])

    img = model(model_input)

    return Model([noise, label], img)

def build_audio_generator(latent_dim, num_classes):

    model = Sequential()
    model.add(LSTM(512, input_dim=latent_dim, return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(512, return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(512))
    model.add(Dense(256))
    model.add(Dropout(0.3))
    model.add(Dense(num_classes))
    model.add(Activation('tanh'))

    model.summary()

    noise = Input(shape=(None, latent_dim,))
    label = Input(shape=(1,), dtype='int32')
    label_embedding = Flatten()(Embedding(num_classes, 100)(label))
    model_input = multiply([noise, label_embedding])

    sound = model(model_input)

    return Model([noise, label], sound)

# Build the generator
generator = build_generator(100, 3, 1)
audio_generator = build_audio_generator(100, 1)

# The generator takes noise and the target label as input
# and generates the corresponding digit of that label
noise = Input(shape=(None, 100,))
label = Input(shape=(1,))

img = generator([noise, label])

audio = audio_generator([noise, label])

print('Audio: '+ str(audio))
print('Audio shape: ' + str(audio.shape))

print('IMG: '+str(img))
print('IMG shape: ' + str(img.shape))

Console output:

Audio: Tensor("model_4/sequential_4/activation_4/Tanh:0", shape=(?, 1), dtype=float32)
Audio shape: (?, 1)
IMG: Tensor("model_3/sequential_3/activation_3/Tanh:0", shape=(?, 28, 28, 1), dtype=float32)
IMG shape: (?, 28, 28, 1)
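
As a rough illustration of where the image shape comes from, the layer stack in build_generator can be traced by hand with a few shape rules. The helpers below (dense, reshape, upsampling2d, conv2d_same, trace_image_generator) are hypothetical plain-Python stand-ins written for this note, not Keras functions; they only mirror how the corresponding layers change a shape. BatchNormalization and Activation leave the shape unchanged and are skipped, and None plays the role of the "?" batch dimension.

```python
import math

def dense(shape, units):
    # Dense only replaces the size of the last axis.
    return shape[:-1] + (units,)

def reshape(shape, target):
    # Reshape keeps the batch axis and requires the same number of elements.
    assert math.prod(shape[1:]) == math.prod(target), "element count mismatch"
    return (shape[0],) + tuple(target)

def upsampling2d(shape, size=2):
    # UpSampling2D with its default size of 2 doubles height and width.
    batch, height, width, channels = shape
    return (batch, height * size, width * size, channels)

def conv2d_same(shape, filters):
    # Conv2D with padding="same" and stride 1 keeps height and width;
    # only the channel axis changes to the number of filters.
    batch, height, width, _ = shape
    return (batch, height, width, filters)

def trace_image_generator(latent_dim, channels):
    shape = (None, latent_dim)            # input noise vector
    shape = dense(shape, 128 * 7 * 7)     # (None, 6272)
    shape = reshape(shape, (7, 7, 128))   # (None, 7, 7, 128)
    shape = upsampling2d(shape)           # (None, 14, 14, 128)
    shape = conv2d_same(shape, 128)       # (None, 14, 14, 128)
    shape = upsampling2d(shape)           # (None, 28, 28, 128)
    shape = conv2d_same(shape, 64)        # (None, 28, 28, 64)
    shape = conv2d_same(shape, channels)  # (None, 28, 28, channels)
    return shape

print(trace_image_generator(100, 1))  # (None, 28, 28, 1)
```

The 28 x 28 size is therefore fixed by the Reshape to 7 x 7 followed by two doublings. The printed (?, 28, 28, 1) corresponds to channels=1; by the same rules, channels=3 would end in (?, 28, 28, 3).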

Answer 1

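I suppose you want 3D audio output, don't you?

Just keep return_sequences=True in all LSTMs. The last LSTM(512) in build_audio_generator uses the default return_sequences=False, so it returns only the final timestep and the time axis is dropped before the Dense layers; that is why the output is (?, 1). With return_sequences=True on every LSTM, the output becomes (?, timesteps, num_classes).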
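
The effect of that flag on the final shape can be sketched with the same kind of hand-written shape rules. The functions lstm, dense and trace_audio_generator below are hypothetical plain-Python helpers for illustration, not Keras calls; they assume the layer sizes from build_audio_generator, and Dropout and Activation are skipped because they do not change the shape.

```python
def lstm(shape, units, return_sequences=False):
    # An LSTM takes (batch, timesteps, features).
    # return_sequences=True keeps the time axis; False keeps only the last step.
    batch, timesteps, _ = shape
    if return_sequences:
        return (batch, timesteps, units)
    return (batch, units)

def dense(shape, units):
    # Dense only replaces the size of the last axis, also on 3D inputs.
    return shape[:-1] + (units,)

def trace_audio_generator(latent_dim, num_classes,
                          last_return_sequences, timesteps=None):
    shape = (None, timesteps, latent_dim)
    shape = lstm(shape, 512, return_sequences=True)
    shape = lstm(shape, 512, return_sequences=True)
    shape = lstm(shape, 512, return_sequences=last_return_sequences)
    shape = dense(shape, 256)
    shape = dense(shape, num_classes)
    return shape

# As posted: the third LSTM drops the time axis.
print(trace_audio_generator(100, 1, last_return_sequences=False))  # (None, 1)

# As suggested: every LSTM keeps the time axis.
print(trace_audio_generator(100, 1, last_return_sequences=True))   # (None, None, 1)
```

Getting all the way to (?, 28, 28, 1) would need more than the flag. One possibility, which is an assumption of this note and not part of the answer, is to fix the sequence length at 28 and make the last Dense 28 units wide, so that trace_audio_generator(100, 28, True, timesteps=28) gives (None, 28, 28); a Reshape((28, 28, 1)) layer could then add the channel axis.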

Answer 1: score 0, accepted, 2018-05-03 18:45:31
