
50% accuracy in CNN on image binary classification

I have a collection of images of open and closed eyes.
The data is collected from the current directory using Keras in this way:

from keras.preprocessing.image import ImageDataGenerator

batch_size = 64
N_images = 84898  # total number of images
h, w = 90, 90     # target image height and width (matches the model's input_shape below)

datagen = ImageDataGenerator(rescale=1./255)

data_iterator = datagen.flow_from_directory(
    './Eyes',
    shuffle='False',  # beware: a non-empty string is truthy, so this still shuffles; use shuffle=False
    color_mode='grayscale',
    target_size=(h, w),
    batch_size=batch_size,
    class_mode='binary')

I've got a .csv file with the state of each eye.

I've built this Sequential model:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

num_filters = 8
filter_size = 3
pool_size = 2

model = Sequential([
  Conv2D(num_filters, filter_size, input_shape=(90, 90, 1)),
  MaxPooling2D(pool_size=pool_size),
  Flatten(),
  Dense(16, activation='relu'),
  Dense(2, activation='sigmoid'),  # two classes: one for "open" and one for "closed"
])

Model compilation:

model.compile(
    'adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

Finally, I fit the model on all the data with the following:

from keras.utils import to_categorical

model.fit(
  train_images,
  to_categorical(train_labels),
  epochs=3,
  validation_data=(test_images, to_categorical(test_labels)),
)

The result fluctuates around 50% and I do not understand why.

Your current model essentially has one convolutional layer. That is, num_filters convolutional filters (in this case 3 x 3 arrays) are defined and fit such that, when convolved with the image, they produce features that are as discriminative as possible between classes. You then perform max pooling to slightly reduce the dimension of the output CNN features before passing them to two dense layers.
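For reference, here is a quick way to see those dimensions (a sketch; the traced shapes assume the 90x90 grayscale input, 'valid' padding, and stride 1 as in your code):

model.summary()
# Conv2D(8, 3)     -> (None, 88, 88, 8)
# MaxPooling2D(2)  -> (None, 44, 44, 8)
# Flatten()        -> (None, 15488)
# Dense(16)        -> (None, 16)
# Dense(2)         -> (None, 2)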

I'd start by saying that one convolutional layer is almost certainly insufficient, especially with 3x3 filters. Basically, with a single convolutional layer, the most meaningful features you can extract are edges or lines. These features are only marginally more useful to a function approximator (i.e. your fully connected layers) than the raw pixel intensity values, because they still have an extremely high degree of variability both within a class and between classes. Consider that shifting an image of an eye 2 pixels to the left would produce completely different output values from your 1-layer CNN. You'd like the outputs of your CNN to be invariant to scale, rotation, illumination, and so on; the sketch below illustrates the shift case.
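To make that concrete, here is a minimal sketch (the random image and the untrained feature extractor are illustrative assumptions) that pushes an image and a 2-pixel-shifted copy of it through a single Conv2D + MaxPooling2D + Flatten stack and compares the flattened features:

import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten

rng = np.random.default_rng(0)
img = rng.random((1, 90, 90, 1)).astype('float32')  # one fake 90x90 grayscale image
shifted = np.roll(img, shift=2, axis=2)             # the same image shifted 2 pixels along the width

feat = Sequential([
    Conv2D(8, 3, input_shape=(90, 90, 1)),
    MaxPooling2D(2),
    Flatten(),
])

a = feat.predict(img)
b = feat.predict(shifted)
print(np.abs(a - b).mean())  # clearly non-zero: the features are not shift-invariant

The mean absolute difference is far from zero, which is exactly the variability the dense layers then have to fight.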

In practice, this means you're going to need more convolutional layers. Even the relatively simple VGG-16 has 13 convolutional layers, and modern residual networks often have over 100. Try writing a routine that defines progressively more complex networks until you start seeing performance gains.
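As a sketch of such a routine (the depths, filter counts, and dense width here are illustrative assumptions, not tuned values), you could stack repeated Conv2D + MaxPooling2D blocks and grow the depth until validation accuracy improves:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def build_model(n_blocks):
    """Stack n_blocks of Conv2D + MaxPooling2D, doubling the filters each block."""
    model = Sequential()
    filters = 16
    model.add(Conv2D(filters, 3, activation='relu', input_shape=(90, 90, 1)))
    model.add(MaxPooling2D(2))
    for _ in range(n_blocks - 1):
        filters *= 2
        model.add(Conv2D(filters, 3, activation='relu'))
        model.add(MaxPooling2D(2))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(2, activation='softmax'))
    return model

for depth in (2, 3, 4):  # 4 blocks is about the most a 90x90 input supports without padding
    model = build_model(depth)
    model.compile('adam', loss='categorical_crossentropy', metrics=['accuracy'])
    # fit as before, then compare validation accuracy across depths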

As a secondary point, you generally don't want a sigmoid() activation on your final layer outputs during training. It flattens the gradients and makes it much slower to backpropagate your loss. You don't actually care that the output values fall between 0 and 1; you only care about their relative magnitudes. Common practice is to use a cross-entropy loss that combines a log-softmax (whose gradient is more stable than a plain softmax followed by a log) with a negative log-likelihood loss. Since the log-softmax portion already maps the output values into the desired range, there's no need for the sigmoid activation.
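In Keras terms, one way to follow that advice (a sketch, assuming a TF 2.x-era Keras where CategoricalCrossentropy(from_logits=True) is available) is to drop the final activation entirely and let the loss apply the log-softmax internally:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.losses import CategoricalCrossentropy

model = Sequential([
    Conv2D(num_filters, filter_size, input_shape=(90, 90, 1)),
    MaxPooling2D(pool_size=pool_size),
    Flatten(),
    Dense(16, activation='relu'),
    Dense(2),  # no activation: the layer emits raw logits
])
model.compile(
    'adam',
    loss=CategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)

At prediction time you can still recover probabilities by applying a softmax to the logits yourself.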
