
50% accuracy in CNN on image binary classification

I have a collection of images of open and closed eyes.
The data is collected from the current directory using Keras in this way:

from keras.preprocessing.image import ImageDataGenerator

batch_size = 64
N_images = 84898  # total number of images
h, w = 90, 90     # target image height and width (matches the model's input_shape below)

datagen = ImageDataGenerator(rescale=1./255)

data_iterator = datagen.flow_from_directory(
    './Eyes',
    shuffle='False',  # beware: a non-empty string is truthy, so this still shuffles; use shuffle=False
    color_mode='grayscale',
    target_size=(h, w),
    batch_size=batch_size,
    class_mode='binary')

I've got a .csv file with the state of each eye.

I've built this Sequential model:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

num_filters = 8
filter_size = 3
pool_size = 2

model = Sequential([
  Conv2D(num_filters, filter_size, input_shape=(90, 90, 1)),
  MaxPooling2D(pool_size=pool_size),
  Flatten(),
  Dense(16, activation='relu'),
  Dense(2, activation='sigmoid'),  # two classes: one for "open" and one for "closed"
])

Model compilation:

model.compile(
    'adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

Finally, I fit the model on all the data with the following:

from keras.utils import to_categorical

model.fit(
  train_images,
  to_categorical(train_labels),
  epochs=3,
  validation_data=(test_images, to_categorical(test_labels)),
)

The result fluctuates around 50% and I do not understand why.

Your current model essentially has one convolutional layer. That is, num_filters convolutional filters (in this case 3 x 3 arrays) are defined and fit such that, when convolved with the image, they produce features that are as discriminative as possible between classes. You then perform max pooling to slightly reduce the dimension of the output CNN features before passing them to two dense layers.
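For reference, here is a quick way to see those dimensions (a sketch; the traced shapes assume the 90x90 grayscale input, 'valid' padding, and stride 1 as in your code):

model.summary()
# Conv2D(8, 3)     -> (None, 88, 88, 8)
# MaxPooling2D(2)  -> (None, 44, 44, 8)
# Flatten()        -> (None, 15488)
# Dense(16)        -> (None, 16)
# Dense(2)         -> (None, 2)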

I'd start by saying that one convolutional layer is almost certainly insufficient, especially with 3x3 filters. Basically, with a single convolutional layer, the most meaningful features you can extract are edges or lines. These features are only marginally more useful to a function approximator (i.e. your fully connected layers) than the raw pixel intensity values, because they still have an extremely high degree of variability both within a class and between classes. Consider that shifting an image of an eye 2 pixels to the left would produce completely different output values from your 1-layer CNN. You'd like the outputs of your CNN to be invariant to scale, rotation, illumination, and so on; the sketch below illustrates the shift case.
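To make that concrete, here is a minimal sketch (the random image and the untrained feature extractor are illustrative assumptions) that pushes an image and a 2-pixel-shifted copy of it through a single Conv2D + MaxPooling2D + Flatten stack and compares the flattened features:

import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten

rng = np.random.default_rng(0)
img = rng.random((1, 90, 90, 1)).astype('float32')  # one fake 90x90 grayscale image
shifted = np.roll(img, shift=2, axis=2)             # the same image shifted 2 pixels along the width

feat = Sequential([
    Conv2D(8, 3, input_shape=(90, 90, 1)),
    MaxPooling2D(2),
    Flatten(),
])

a = feat.predict(img)
b = feat.predict(shifted)
print(np.abs(a - b).mean())  # clearly non-zero: the features are not shift-invariant

The mean absolute difference is far from zero, which is exactly the variability the dense layers then have to fight.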

In practice, this means you're going to need more convolutional layers. Even the relatively simple VGG-16 has 13 convolutional layers, and modern residual networks often have over 100. Try writing a routine that defines progressively more complex networks until you start seeing performance gains.
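As a sketch of such a routine (the depths, filter counts, and dense width here are illustrative assumptions, not tuned values), you could stack repeated Conv2D + MaxPooling2D blocks and grow the depth until validation accuracy improves:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def build_model(n_blocks):
    """Stack n_blocks of Conv2D + MaxPooling2D, doubling the filters each block."""
    model = Sequential()
    filters = 16
    model.add(Conv2D(filters, 3, activation='relu', input_shape=(90, 90, 1)))
    model.add(MaxPooling2D(2))
    for _ in range(n_blocks - 1):
        filters *= 2
        model.add(Conv2D(filters, 3, activation='relu'))
        model.add(MaxPooling2D(2))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(2, activation='softmax'))
    return model

for depth in (2, 3, 4):  # 4 blocks is about the most a 90x90 input supports without padding
    model = build_model(depth)
    model.compile('adam', loss='categorical_crossentropy', metrics=['accuracy'])
    # fit as before, then compare validation accuracy across depths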

As a secondary point, you generally don't want a sigmoid() activation on your final layer outputs during training. It flattens the gradients and makes it much slower to backpropagate your loss. You don't actually care that the output values fall between 0 and 1; you only care about their relative magnitudes. Common practice is to use a cross-entropy loss that combines a log-softmax (whose gradient is more stable than a plain softmax followed by a log) with a negative log-likelihood loss. Since the log-softmax portion already maps the output values into the desired range, there's no need for the sigmoid activation.
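In Keras terms, one way to follow that advice (a sketch, assuming a TF 2.x-era Keras where CategoricalCrossentropy(from_logits=True) is available) is to drop the final activation entirely and let the loss apply the log-softmax internally:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.losses import CategoricalCrossentropy

model = Sequential([
    Conv2D(num_filters, filter_size, input_shape=(90, 90, 1)),
    MaxPooling2D(pool_size=pool_size),
    Flatten(),
    Dense(16, activation='relu'),
    Dense(2),  # no activation: the layer emits raw logits
])
model.compile(
    'adam',
    loss=CategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)

At prediction time you can still recover probabilities by applying a softmax to the logits yourself.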
