CNN for binary classification of cat/dog images has accuracy no better than random

Question

I've adapted a simple CNN from a tutorial on Analytics Vidhya.

The problem is that my accuracy on a holdout set is no better than random. I am training on ~8600 images each of cats and dogs, which should be enough data for a decent model, but accuracy on the test set is 49%. Is there a glaring omission in my code somewhere?

import os
import numpy as np
import keras
from keras.models import Sequential
from sklearn.model_selection import train_test_split
from datetime import datetime
from PIL import Image
from keras.utils.np_utils import to_categorical
from sklearn.utils import shuffle


def main():

    cat=os.listdir("train/cats")
    dog=os.listdir("train/dogs")
    filepath="train/cats/"
    filepath2="train/dogs/"

    print("[INFO] Loading images of cats and dogs each...", datetime.now().time())
    #print("[INFO] Loading {} images of cats and dogs each...".format(num_images), datetime.now().time())
    images=[]
    label = []
    for i in cat:
        image = Image.open(filepath+i)
        image_resized = image.resize((300,300))
        images.append(image_resized)
        label.append(0) #for cat images

    for i in dog:
        image = Image.open(filepath2+i)
        image_resized = image.resize((300,300))
        images.append(image_resized)
        label.append(1) #for dog images

    images_full = np.array([np.array(x) for x in images])

    label = np.array(label)
    label = to_categorical(label)

    images_full, label = shuffle(images_full, label)

    print("[INFO] Splitting into train and test", datetime.now().time())
    (trainX, testX, trainY, testY) = train_test_split(images_full, label, test_size=0.25)


    filters = 10
    filtersize = (5, 5)

    epochs = 5
    batchsize = 32

    input_shape=(300,300,3)
    #input_shape = (30, 30, 3)

    print("[INFO] Designing model architecture...", datetime.now().time())
    model = Sequential()
    model.add(keras.layers.InputLayer(input_shape=input_shape))
    model.add(keras.layers.convolutional.Conv2D(filters, filtersize, strides=(1, 1), padding='same',
                                                data_format="channels_last", activation='relu'))
    model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(keras.layers.Flatten())

    model.add(keras.layers.Dense(units=2, input_dim=50,activation='softmax'))
    #model.add(keras.layers.Dense(units=2, input_dim=5, activation='softmax'))

    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    print("[INFO] Fitting model...", datetime.now().time())
    model.fit(trainX, trainY, epochs=epochs, batch_size=batchsize, validation_split=0.3)

    model.summary()

    print("[INFO] Evaluating on test set...", datetime.now().time())
    eval_res = model.evaluate(testX, testY)
    print(eval_res)

if __name__== "__main__":
    main()

Answer 1

For me, the problem comes from the size of your network: you have only one Conv2D layer with 10 filters. This is far too small to learn a deep representation of your images.
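To put a rough number on how small the question's network is, the sketch below counts its trainable parameters by hand. It assumes the usual formulas (kernel height x kernel width x input channels x filters, plus one bias per filter, for a convolution; inputs x units, plus one bias per unit, for a dense layer). The helper names are made up for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One kernel of shape (kernel_h, kernel_w, in_channels) per filter, plus a bias each.
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    # One weight per (input, unit) pair, plus a bias per unit.
    return inputs * units + units


# The question's model: 300x300x3 input -> Conv2D(10, (5, 5), padding='same')
# -> MaxPooling2D((2, 2)) -> Flatten -> Dense(2).
conv = conv2d_params(5, 5, 3, 10)
flat = (300 // 2) * (300 // 2) * 10   # 'same' conv keeps 300x300; pooling halves it
dense = dense_params(flat, 2)

print("conv params:", conv)
print("flattened features:", flat)
print("dense params:", dense)
```

Under these assumptions almost all of the weights sit in the single dense layer, and only a few hundred are available for learning image features.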

Try increasing this a lot by using blocks from common architectures like VGGNet!
Example of a block:

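from keras.layers import (BatchNormalization, Conv2D, Dropout, Input,
                          LeakyReLU, MaxPooling2D)

# Same input size as in the question
model_input = Input(shape=(300, 300, 3))

# Two 3x3 convolutions, each followed by an activation and batch normalization
x = Conv2D(32, (3, 3), padding='same')(model_input)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = Conv2D(32, (3, 3), padding='same')(x)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
# Downsample, then apply dropout for regularization
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Dropout(0.25)(x)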

You need to try stacking multiple blocks like that, increasing the number of filters in the later blocks in order to capture deeper features.
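As a minimal sketch of what stacking such blocks could look like, the hypothetical helpers `conv_block` and `build_model` below repeat the block with 32, 64 and 128 filters. The filter counts, the size of the dense head and the use of `LeakyReLU()` at its default slope (the argument name differs between Keras releases) are assumptions for illustration, not tuned values.

```python
from keras.layers import (Activation, BatchNormalization, Conv2D, Dense,
                          Dropout, Flatten, Input, LeakyReLU, MaxPooling2D)
from keras.models import Model


def conv_block(x, filters):
    """Two 3x3 convolutions (each with LeakyReLU and batch norm), then pool and dropout."""
    for _ in range(2):
        x = Conv2D(filters, (3, 3), padding='same')(x)
        x = LeakyReLU()(x)  # default slope; the answer's snippet sets 0.3 explicitly
        x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    return Dropout(0.25)(x)


def build_model(input_shape=(300, 300, 3), block_filters=(32, 64, 128)):
    """Stack one conv_block per entry in block_filters, then a small dense head."""
    model_input = Input(shape=input_shape)
    x = model_input
    for filters in block_filters:
        x = conv_block(x, filters)
    x = Flatten()(x)
    x = Dense(128)(x)
    x = LeakyReLU()(x)
    x = Dense(2)(x)
    x = Activation('softmax')(x)  # two-class probabilities, matching to_categorical labels
    return Model(inputs=model_input, outputs=x)
```

Calling `build_model()` and then `compile` and `fit` as in the question should work unchanged, since the output is still a two-way softmax.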

Another thing: you don't need to specify the input_dim of your dense layer; Keras takes care of that automatically!

Last but not least, you need a fully connected network, not just a single layer, in order to classify your images correctly.

For example:

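from keras.layers import Activation, Dense, Flatten, LeakyReLU

# x is the output of the last convolutional block
x = Flatten()(x)
x = Dense(256)(x)
x = LeakyReLU(alpha=0.3)(x)
x = Dense(128)(x)
x = LeakyReLU(alpha=0.3)(x)
# Two output units with softmax: one probability per class
x = Dense(2)(x)
x = Activation('softmax')(x)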

Try those changes and let me know how it goes!

Update after OP's questions

Images are complex; they contain a lot of information such as shapes, edges, colors, etc.

In order to capture as much of that information as possible, you need to pass the image through multiple convolutions, which learn different aspects of the image. Imagine, for example, that the first convolution learns to recognise squares, the second circles, the third edges, etc.

As for my second point, the final fully connected part acts like a classifier. The conv network outputs a vector that "represents" a dog or a cat; now you need to learn whether this kind of vector belongs to one class or the other.
Feeding that vector directly into the final layer is not enough to learn this mapping.

Is that clearer?

Last update for OP's second comment

Here are the two ways of defining a Keras model; both produce the same thing!

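from keras.layers import Activation, Dense, Input
from keras.models import Model, Sequential

# Functional API: each layer is called on the output of the previous one
model_input = Input(shape=(200, 1))
x = Dense(32)(model_input)
x = Dense(16)(x)
x = Activation('relu')(x)
model = Model(inputs=model_input, outputs=x)

# Sequential API: the same layers added in order
model = Sequential()
model.add(Dense(32, input_shape=(200, 1)))
model.add(Dense(16, activation='relu'))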

Example of architecture

# input_shape is (300, 300, 3), as in the question
model = Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))
# Block 1: two strided 3x3 convolutions with 32 filters, then pooling
model.add(keras.layers.Conv2D(32, (3, 3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.Conv2D(32, (3, 3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))

# Block 2: same layout with 64 filters
model.add(keras.layers.Conv2D(64, (3, 3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.Conv2D(64, (3, 3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))

model.add(keras.layers.Flatten())

# Fully connected classifier head
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(2, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
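To see what this architecture does to a 300x300 input, here is a small sketch that traces the spatial size through each layer by hand. It assumes the usual size rules: a convolution with padding='same' and stride s outputs ceil(size / s), and a 2x2 max pooling with the default 'valid' padding outputs floor((size - 2) / 2) + 1. The helper names are made up for illustration.

```python
import math


def conv_same(size, stride):
    # Conv2D with padding='same': output size is ceil(size / stride).
    return math.ceil(size / stride)


def pool_valid(size, pool=2):
    # MaxPooling2D with 'valid' padding and stride equal to the pool size.
    return (size - pool) // pool + 1


size = 300
trace = [size]
for _ in range(2):              # two blocks: conv, conv, pool
    for _ in range(2):          # two stride-2 convolutions per block
        size = conv_same(size, 2)
        trace.append(size)
    size = pool_valid(size)
    trace.append(size)

last_block_filters = 64
flat_features = size * size * last_block_filters

print("spatial size per layer:", trace)
print("features after Flatten:", flat_features)
```

With stride 2 on every convolution plus pooling, the feature map shrinks quickly, so the dense layers see a far smaller vector than in the question's model.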

Don't forget to normalize your data before feeding it into your network.

A simple images_full = images_full / 255.0 on your data can boost your accuracy a lot.
Try it with grayscale images too; it's more computationally efficient.
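A minimal NumPy sketch of both suggestions follows, run on a random stand-in batch instead of the real images. The function names are hypothetical, and the grayscale conversion uses the common ITU-R BT.601 luma weights, which is one reasonable choice and not something the answer specifies.

```python
import numpy as np


def normalize_images(images):
    # Scale uint8 pixel values from [0, 255] to floats in [0, 1].
    return images.astype("float32") / 255.0


def to_grayscale(images):
    # Weighted sum of the R, G, B channels (BT.601 luma weights, an assumption),
    # keeping a trailing channel axis so the shape is (N, H, W, 1).
    weights = np.array([0.299, 0.587, 0.114], dtype="float32")
    gray = images[..., :3] @ weights
    return gray[..., np.newaxis]


# Random stand-in for images_full: 4 RGB images of 300x300 pixels.
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(4, 300, 300, 3), dtype=np.uint8)

scaled = normalize_images(fake_batch)
gray = to_grayscale(scaled)

print(scaled.dtype, float(scaled.min()), float(scaled.max()))
print(gray.shape)
```

If you switch to grayscale input, the model's input_shape would need to become (300, 300, 1) to match.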

Answer 1 (accepted, 2 votes): 2019-07-10 20:30:11
