
Keras model not training layers, validation accuracy always 0.5

My Keras CNN model (based on an implementation of AlexNet) always has a training accuracy close to 0.5 (within ±0.02), and the validation accuracy is always exactly 0.5, no matter the epoch. It is a binary classification model where the train/val split is roughly 85/15, and within both of those sets the images are split 50/50 between the two classes.

It doesn't seem to matter which model architecture I use, or whether I initialise with random or ImageNet weights: the validation accuracy is always 0.5. In fact, when the images were not split 50/50 between the two classes, the validation accuracy reflected this instead (when significantly more images belonged to one class, the validation accuracy would always be 0.85).

Because of this last point, I have a suspicion the problem doesn't lie with the model or weight optimisation, but rather with my instantiation of the ImageDataGenerator class - although this is just an educated hunch at this stage.

I've included my code below; can anyone locate any blindingly obvious errors?

# Imports assumed from standalone Keras (the code also works with tensorflow.keras)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator

sz = 224                    # image width = height = 224
batch_size = 64
input_shape = (sz, sz, 3)   # RGB input assumed for the first Conv2D layer
train_data_dir = r"./crack_dataset/train"
validation_data_dir = r"./crack_dataset/validate"
nb_train_samples = 3416
nb_val_samples = 612

train_datagen = ImageDataGenerator(rescale=1./255)

validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(train_data_dir,
                                                    target_size = (sz, sz),
                                                    batch_size=batch_size,
                                                    class_mode='binary')

validation_generator = validation_datagen.flow_from_directory(validation_data_dir,
                                                              target_size = (sz, sz),
                                                              batch_size=batch_size,
                                                              class_mode='binary')

# Create Model 
model = Sequential()

model.add(Conv2D(filters=96, input_shape=input_shape, kernel_size=(11,11), strides=(4,4), padding='valid', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))

model.add(Conv2D(filters=256, kernel_size=(11,11), strides=(1,1), padding='valid', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))

model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='valid', activation='relu'))

model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='valid', activation='relu'))

model.add(Conv2D(filters=256, kernel_size=(3,3), strides=(1,1), padding='valid', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))

model.add(Flatten())
model.add(Dense(4096, input_shape=(256,), activation='relu'))  # input_shape is ignored here since this is not the first layer
model.add(Dropout(0.4))

model.add(Dense(4096, activation='relu'))
model.add(Dropout(0.4))

model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer=Adam(0.1), loss='binary_crossentropy', metrics=['accuracy'])

model.fit_generator(train_generator,
                    steps_per_epoch = nb_train_samples // batch_size, 
                    epochs=30,
                    validation_data=validation_generator,
                    validation_steps=nb_val_samples // batch_size)

The problem is the optimizer's learning rate: the value is far too big. As suggested in the comments, it should be set to a low value, close to 0.

You can see how the learning rate might influence the classification accuracy in the image below:

[Figure: example of how the learning rate affects model performance]
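As a minimal sketch of the suggested fix (the specific value 1e-4 is my assumption, a common starting point rather than something stated in the answer):

from keras.optimizers import Adam

# Recompile with a much smaller learning rate; 1e-4 is an assumed starting
# point and would still need tuning for this dataset.
model.compile(optimizer=Adam(1e-4),
              loss='binary_crossentropy',
              metrics=['accuracy'])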

This started as a comment, but it got too long, so I switched to a proper answer.

I haven't tried your CNN, but I would revise your layers. I'm not an expert in designing CNN architectures, but I don't see much point in having two consecutive layers with identical parameters (the 5th and 6th layers); from what I've learned so far it adds little. Try a simple architecture first and "scale up" (in terms of architectural complexity) from there.

Furthermore, repeatedly going up and down in the number of filters (the 1st, 3rd, 5th, 6th and 7th Conv2D layers) might not be the best strategy: every time you go up, you are asking the network to create more features from a smaller set. There are scenarios where this can be useful, but I don't think this is one of them.
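For illustration, a minimal sketch of the kind of simpler baseline this suggests starting from (the specific filter counts and layer sizes are my own assumptions, not something from the answer):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# A small baseline: filter counts grow steadily instead of going up and down.
baseline = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.4),
    Dense(1, activation='sigmoid'),   # binary classification head
])
baseline.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])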

Also, for the rescaling you don't necessarily need ImageDataGenerator; if you load the images as arrays, you can reach the same result with:

train_images = train_images / 255.0  # assuming train_images holds the raw pixel arrays

Simplify where you can.
