
Keras CNN Autoencoder input shape is wrong

I have built a CNN autoencoder using Keras, and it worked fine on the MNIST test data set. I am now trying it with a different data set collected from another source. These are plain image files, and I read them in using cv2, which works fine. I then convert the images into a NumPy array, which I think also works fine. But when I call the .fit method, it gives me this error:

Error when checking target: expected conv2d_39 to have shape (100, 100, 1) but got array with shape (100, 100, 3)

I tried converting the images to grayscale, but they then have the shape (100, 100) and not (100, 100, 1), which is what the model wants. What am I doing wrong here?

Here is the code that I am using:

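import os

import cv2
import numpy as np
from keras.callbacks import TensorBoard
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

def read_in_images(path):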
    images = []
    for files in os.listdir(path):
        img = cv2.imread(os.path.join(path, files))
        if img is not None:
            images.append(img)
    return images

train_images = read_in_images(train_path)
test_images = read_in_images(test_path)
x_train = np.array(train_images)
x_test = np.array(test_images) # (36, 100, 100, 3)

input_img = Input(shape=(100,100,3))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)


x = Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(168, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)


autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')


autoencoder.fit(x_train, x_train,
            epochs=25,
            batch_size=128,
            shuffle=True,
            validation_data=(x_test, x_test),
            callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])

The model works fine with the MNIST data set but not with my own data set. Any help will be appreciated.

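Your input and output shapes are different: the model takes (100, 100, 3) images, but its last layer outputs a single channel, while the target passed to .fit is the 3-channel x_train. That triggers the error (I think).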

decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

should be

decoded = Conv2D(num_channels, (3, 3), activation='sigmoid', padding='same')(x)
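num_channels is not defined in the question's code. One way to obtain it is to read it off the shape of the training array. The helper below is only a sketch: the name infer_channels is hypothetical, and it assumes the channels-last layout shown in the question, (N, H, W, C), treating a 3-D batch (N, H, W) as grayscale without a channel axis.

```python
def infer_channels(batch_shape):
    """Return the channel count for a channels-last image batch shape.

    (N, H, W)    -> 1  (grayscale images without a channel axis)
    (N, H, W, C) -> C
    """
    if len(batch_shape) == 3:
        return 1
    if len(batch_shape) == 4:
        return batch_shape[-1]
    raise ValueError("expected a 3-D or 4-D batch shape, got %r" % (batch_shape,))


# Shape reported in the question for the color images loaded with cv2.
num_channels = infer_channels((36, 100, 100, 3))
```

With a real array this would be called as infer_channels(x_train.shape), and the result passed as the filter count of the last Conv2D layer.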

I ran some tests with the data loaded in grayscale like this:

img = cv2.imread(os.path.join(path, files), 0)

then expand the dimensions of the final loaded array like this:

x_train = np.expand_dims(x_train, -1)

and finally normalize your data with a simple:

x_train = x_train / 255.

(the input of your model must then be: input_img = Input(shape=(100, 100, 1)))

The loss becomes normal again and the model runs well!
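The grayscale preprocessing steps described here can be put together roughly as follows. This is a sketch, not the exact code used in the tests: prepare_grayscale_batch is a hypothetical helper name, and synthetic NumPy arrays stand in for the 2-D uint8 images that cv2.imread(path, 0) would return.

```python
import numpy as np


def prepare_grayscale_batch(images):
    """Stack 2-D uint8 grayscale images into a float32 (N, H, W, 1) array in [0, 1]."""
    batch = np.array(images)            # (N, H, W)
    batch = np.expand_dims(batch, -1)   # (N, H, W, 1): add the channel axis
    return batch.astype("float32") / 255.0


# Stand-ins for images read with cv2.imread(path, 0); all pixels set to white.
fake_images = [np.full((100, 100), 255, dtype=np.uint8) for _ in range(4)]
x_train = prepare_grayscale_batch(fake_images)
print(x_train.shape)  # (4, 100, 100, 1)
```

Scaling to [0, 1] matters here because the last layer uses a sigmoid activation, so the targets should lie in the same range as the outputs.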

UPDATE after comment

In order to keep all the RGB channels through the network, you need an output corresponding to your input shape.
Here, if you want images with shape (100, 100, 3), you need an output of shape (100, 100, 3) from your decoder.

The line decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x) will shrink the output to have shape (100, 100, 1).

So you simply need to change the number of filters. Here we want 3 color channels, so the conv must look like this:

decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
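To see why only the filter count of the last layer needs to change, it can help to trace the shapes by hand. The sketch below does that in plain Python for the layer stack in the question. The function names are hypothetical, and it assumes the usual size rules: 'same' pooling with stride 2 gives ceil(n / 2), UpSampling2D((2, 2)) doubles the size, a 3x3 convolution with 'same' padding keeps the size, and one with the default 'valid' padding removes 2 pixels.

```python
import math


def trace_spatial_size(size):
    """Follow one spatial dimension through the question's encoder and decoder."""
    # Encoder: three MaxPooling2D((2, 2), padding='same') layers.
    for _ in range(3):
        size = math.ceil(size / 2)
    # Decoder: the 'same' convolutions keep the size, so only these steps matter.
    size *= 2   # first UpSampling2D((2, 2))
    size *= 2   # second UpSampling2D((2, 2))
    size -= 2   # Conv2D(32, (3, 3)) without padding='same' (default 'valid')
    size *= 2   # third UpSampling2D((2, 2))
    return size


def decoder_output_shape(height, width, last_filters):
    """Shape of `decoded` for a given input size and final Conv2D filter count."""
    return (trace_spatial_size(height), trace_spatial_size(width), last_filters)


print(decoder_output_shape(100, 100, 1))  # (100, 100, 1)
print(decoder_output_shape(100, 100, 3))  # (100, 100, 3)
```

The height and width already come back to 100 (100, 50, 25, 13, then 26, 52, 50, 100), so the only mismatch with the (100, 100, 3) targets is the last axis, which is set by the number of filters in the final Conv2D.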

