
Image binary classification with CNN, but it always predicts everything into one class

Introduction

I have a normal CNN network based on TensorFlow, and my goal is to train it and then use it to classify images into 2 classes.

About the training dataset

X: images (healthy, not healthy), 128 * 128

label: [1, 0] (not healthy) or [0, 1] (healthy)

I use TFRecords to build the dataset.
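For reference, here is a minimal sketch of how such a TFRecords file could be written and parsed with the TensorFlow 1.x API. The file path and feature keys are placeholders of my own, not taken from the question:

import numpy as np
import tensorflow as tf

# writer = tf.python_io.TFRecordWriter('train.tfrecords')  # hypothetical path

def write_example(writer, image, label):
    # image: uint8 array of shape [128, 128]; label: [1, 0] or [0, 1]
    example = tf.train.Example(features=tf.train.Features(feature={
        'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image.tobytes()])),
        'label': tf.train.Feature(int64_list=tf.train.Int64List(value=label)),
    }))
    writer.write(example.SerializeToString())

def parse_example(serialized):
    features = tf.parse_single_example(serialized, features={
        'image': tf.FixedLenFeature([], tf.string),
        'label': tf.FixedLenFeature([2], tf.int64),
    })
    image = tf.decode_raw(features['image'], tf.uint8)
    # Scaling to [0, 1] is an assumption; adjust to your own preprocessing.
    image = tf.cast(tf.reshape(image, [128 * 128]), tf.float32) / 255.0
    label = tf.cast(features['label'], tf.float32)
    return image, label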

About the CNN model

import numpy as np
import tensorflow as tf


def weight_variable(shape):
    # Weights initialized from a truncated normal distribution
    initial = tf.truncated_normal(shape, stddev=0.1, dtype=tf.float32)
    return tf.Variable(initial)


def bias_variable(shape):
    # Biases initialized to a small positive constant
    initial = tf.constant(0.1, shape=shape, dtype=tf.float32)
    return tf.Variable(initial)


def conv2d(x, W):
    # (input, filter, strides, padding)
    # input shape: [batch, height, width, in_channels]
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    # (value, ksize, strides, padding); 2x2 pooling halves height and width
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

def cnn_model():
    epochs = 1
    batch_size = 200
    learning_rate = 0.001
    hidden = 1024
    cap_c = 498
    cap_h = 478
    num = cap_c + cap_h  # total number of training samples
    image_size = 128
    label_size = 2
    ex = 2

    #train_loss = np.empty((num//(batch_size * ex)) * epochs)
    #train_acc = np.empty((num//(batch_size * ex)) * epochs)

    x = tf.placeholder(tf.float32, shape = [None, image_size * image_size])
    y = tf.placeholder(tf.float32, shape = [None, label_size])

    X_train_ = tf.reshape(x, [-1, image_size, image_size, 1])

    #First layer
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])

    h_conv1 = tf.nn.relu(conv2d(X_train_, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    #Second layer
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])

    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    #Third layer
    #W_conv3 = weight_variable([5, 5, 64, 128])
    #b_conv3 = bias_variable([128])

    #h_conv3 = tf.nn.relu(conv2d(h_pool2, W_conv3) + b_conv3)
    #h_pool3 = max_pool_2x2(h_conv3)

    #Full connect layer
    W_fc1 = weight_variable([64 * 64 * 32, hidden])
    b_fc1 = bias_variable([hidden])

    h_pool2_flat = tf.reshape(h_pool2, [-1, 64 * 64 * 32])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    #Output_Softmax

    W_fc2 = weight_variable([hidden, label_size])
    b_fc2 = bias_variable([label_size])

    y_conv = tf.nn.softmax(tf.matmul(h_fc1, W_fc2) + b_fc2)

    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = y, logits = y_conv))
    optimize = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y, 1)) 
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

After this comes the data-reading and session (training) section.
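That part is omitted in the question, but a minimal sketch of a feed_dict-based training loop, assuming a hypothetical helper next_batch(batch_size) that returns one batch of flattened images and one-hot labels, would be:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(epochs):
        for step in range(num // batch_size):
            batch_x, batch_y = next_batch(batch_size)  # hypothetical helper
            _, batch_loss, batch_acc = sess.run(
                [optimize, loss, accuracy],
                feed_dict={x: batch_x, y: batch_y})
            print('epoch %d step %d: loss %.4f, acc %.4f'
                  % (epoch, step, batch_loss, batch_acc))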

About the shapes

Regarding the placeholder shapes, if the batch size is 200:

X shape: [200, 128 * 128]

label shape: [200, 2]

output shape: [200, 2]

About the output result

I expected the predicted values to be trained toward [1, 0] or [0, 1] depending on the sample, but after about 5 steps the predicted values are all [1, 0] or all [0, 1]. For example, if the batch size is 5, the result will be

[[1, 0],
[1, 0],
[1, 0],
[1, 0],
[1, 0]] 

or the complete opposite. However, sometimes the result will be different for a while, like this:

[[1, 0],
[0, 1],
[1, 0],
[0, 1],
[1, 0]] 

But this only lasts for about 5 steps; after that, the results are all the same again.

About the loss and accuracy

Since the predicted results are not right, the loss does not converge. In other words, the loss and accuracy depend entirely on the X of the training dataset, which is meaningless.

My thinking

I do not think the problem is in the dataset (the TFRecords), since I have printed the image matrices and labels and they are all correct. So I think the problem lies in the model.

I have not found an answer that solves this problem through Google Search or other questions on SO; thank you very much if you can help me with this. Please let me know if you need more results or code for reference.

Answer

I think that your data might be unbalanced, i.e. the numbers of training samples are not roughly equal for the two classes. In your example, you might have many more healthy samples than unhealthy ones. In this case, the loss function is reduced significantly by classifying all samples into the majority class, but after that happens, the misclassified samples are unlikely to ever be classified correctly again.
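A quick way to check this is to count the labels before training; a sketch, assuming your labels are stacked as one-hot rows in a numpy array called labels:

import numpy as np

# labels: shape [num_samples, 2], rows are [1, 0] (not healthy) or [0, 1] (healthy)
counts = labels.sum(axis=0)
print('not healthy: %d, healthy: %d' % (int(counts[0]), int(counts[1])))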

You can try to resample your data in order to get roughly equal numbers of samples for both classes, as sketched below.
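One simple way, sketched here with numpy under the same assumptions as above (X of shape [num_samples, 128 * 128], one-hot labels), is to oversample the minority class with replacement:

import numpy as np

minority = np.argmin(labels.sum(axis=0))         # index of the rarer class
idx_min = np.where(labels[:, minority] == 1)[0]  # its sample indices
idx_maj = np.where(labels[:, minority] == 0)[0]

# Draw minority samples with replacement until both classes have equal counts.
extra = np.random.choice(idx_min, size=len(idx_maj), replace=True)
X_balanced = np.concatenate([X[idx_maj], X[extra]], axis=0)
y_balanced = np.concatenate([labels[idx_maj], labels[extra]], axis=0)

# Shuffle so that batches contain a mix of both classes.
perm = np.random.permutation(len(X_balanced))
X_balanced, y_balanced = X_balanced[perm], y_balanced[perm]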

Another approach is to use a weighted cross entropy: for example, you could compute the cross entropy for each sample and multiply it by a weight (to be exact, a tensor of one weight per sample); only after this should you apply tf.reduce_mean. You could, for instance, give a larger weight to the class containing fewer samples, and thus force the optimizer to pay more attention to those samples.

This should look like this:

weights = tf.placeholder(tf.float32, shape=[None])
loss = tf.reduce_mean(tf.multiply(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=y_conv), weights))

Of course, you need to fill weights with values at some point.
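For example, you could derive them from each batch's labels so that the rarer class gets a proportionally larger weight; a sketch, assuming one-hot numpy labels batch_y:

import numpy as np

def make_weights(batch_y):
    # Per-class weight = batch size / (2 * class count): rarer classes weigh more.
    counts = batch_y.sum(axis=0)
    class_w = batch_y.shape[0] / (2.0 * np.maximum(counts, 1))
    # One weight per sample, selected by that sample's class.
    return (batch_y * class_w).sum(axis=1)

# Feed them together with the batch, e.g.:
# sess.run(optimize, feed_dict={x: batch_x, y: batch_y, weights: make_weights(batch_y)})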
