Why do accuracy and loss stay the same during training?

Question

I have tried to modify the introductory tutorial from https://www.tensorflow.org/tutorials/keras/basic_classification to work with my own data. The goal is to classify images of dogs and cats. The code is very simple and given below. The problem is that the network does not seem to learn at all: training loss and accuracy stay the same after every epoch.

The images (X_training) and the labels (y_training) seem to have the right format: X_training.shape returns (18827, 80, 80, 3).

y_training is a one-dimensional list with entries in {0,1}.

I have checked several times that the "images" in X_training are correctly labeled: if X_training[i,:,:,:] represents a dog, then y_training[i] returns 1; if X_training[i,:,:,:] represents a cat, then y_training[i] returns 0.

Shown below is the complete Python file without the import statements.

#loading the data from 4 pickle files:
pickle_in = open("X_training.pickle","rb")
X_training = pickle.load(pickle_in)

pickle_in = open("X_testing.pickle","rb")
X_testing = pickle.load(pickle_in)

pickle_in = open("y_training.pickle","rb")
y_training = pickle.load(pickle_in)

pickle_in = open("y_testing.pickle","rb")
y_testing = pickle.load(pickle_in)


#normalizing the input data:
X_training = X_training/255.0
X_testing = X_testing/255.0


#building the model:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(80, 80,3)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(1,activation='sigmoid')
])
model.compile(optimizer='adam',loss='mean_squared_error',metrics=['accuracy'])


#running the model:
model.fit(X_training, y_training, epochs=10)

The code runs and trains for 10 epochs, but neither loss nor accuracy improves; they stay exactly the same after every epoch. The code works fine with the Fashion MNIST dataset used in the tutorial, with slight changes accounting for the difference between multiclass and binary classification and for the input shape.
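When loss and accuracy are identical after every epoch, one thing worth checking is whether the network outputs the same class for every image. This is an assumption about the symptom, not something established above. In that case the reported accuracy equals the share of the most frequent label. A minimal sketch, where `majority_baseline` and `example_labels` are made-up names for illustration and the labels are a stand-in for y_training:

```python
from collections import Counter

def majority_baseline(labels):
    """Return (most common label, fraction of samples carrying it)."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)

# Hypothetical stand-in for y_training: 6 dogs (1) and 4 cats (0).
example_labels = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
label, baseline = majority_baseline(example_labels)
print(f"always predicting {label} gives accuracy {baseline:.2f}")
```

If the accuracy printed by Keras matches this baseline on the real labels, the model is probably predicting a single class for every input.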

Answer 1

If you want to train a classification model, use binary_crossentropy as your loss function rather than mean_squared_error, which is meant for regression tasks.

Replace

model.compile(optimizer='adam',loss='mean_squared_error',metrics=['accuracy'])

with

model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
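One plausible reason the loss swap helps, offered here as an illustration rather than a diagnosis of the asker's run: with a sigmoid output, the gradient of the squared error with respect to the pre-activation carries an extra factor p(1 - p), which becomes tiny when the sigmoid saturates, while the cross-entropy gradient is simply p - y. The sketch below uses only the standard library; the function names are made up for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mse_grad(z, y):
    """Derivative with respect to z of (sigmoid(z) - y)**2."""
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)

def bce_grad(z, y):
    """Derivative with respect to z of -[y*log(p) + (1-y)*log(1-p)], p = sigmoid(z)."""
    return sigmoid(z) - y

# A confidently wrong prediction: true label 1, pre-activation -10.
z, y = -10.0, 1.0
print("MSE gradient:", mse_grad(z, y))
print("BCE gradient:", bce_grad(z, y))
```

For this saturated, wrong prediction the squared-error gradient is on the order of 1e-4, while the cross-entropy gradient is close to -1, so the cross-entropy loss still pushes the weights in the right direction.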

Furthermore, I would recommend using a linear activation on your dense layer instead of relu. Linear is the default when no activation argument is passed.

Replace

keras.layers.Dense(128, activation=tf.nn.relu),

with

keras.layers.Dense(128),
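The answer does not state its reasoning for this swap. As a neutral illustration of how the two activations differ, the sketch below (plain Python, with made-up function names) shows that relu passes no gradient for negative pre-activations, whereas the identity ("linear") activation always passes a gradient of 1.

```python
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Derivative of relu: 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0

def linear(x):
    # Identity activation, which Dense applies when no activation is given.
    return x

def linear_grad(x):
    return 1.0

for x in (-2.0, 3.0):
    print(x, relu(x), relu_grad(x), linear(x), linear_grad(x))
```

Keep in mind that a hidden layer without a nonlinearity adds no expressive power: a linear Dense layer followed by a sigmoid unit is still a linear classifier.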

And of course, to make better use of the power of neural networks, add some convolutional layers before your Flatten layer.

Answer 2

I have found a different implementation with a slightly more complex model that works. Here is the complete code without the import statements:

#global variables:
batch_size = 32
nr_of_epochs = 64
input_shape = (80,80,3)


#loading the data from 4 pickle files:
pickle_in = open("X_training.pickle","rb")
X_training = pickle.load(pickle_in)

pickle_in = open("X_testing.pickle","rb")
X_testing = pickle.load(pickle_in)

pickle_in = open("y_training.pickle","rb")
y_training = pickle.load(pickle_in)

pickle_in = open("y_testing.pickle","rb")
y_testing = pickle.load(pickle_in)



#building the model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
model = define_model()


#optional hook for image data augmentation (only rescaling is applied here)
train_datagen = ImageDataGenerator(rescale=1.0/255.0)
val_datagen = ImageDataGenerator(rescale=1.0/255.0)
train_generator = train_datagen.flow(X_training, y_training, batch_size=batch_size)
val_generator = val_datagen.flow(X_testing, y_testing, batch_size=batch_size)



#running the model
history = model.fit_generator(train_generator, steps_per_epoch=len(X_training) // batch_size,
                              epochs=nr_of_epochs, validation_data=val_generator,
                              validation_steps=len(X_testing) // batch_size)
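For reference, this is how the steps_per_epoch expression works out with the 18827 training images mentioned in the question. It is plain arithmetic and not part of the answer's code; the variable names are made up for illustration.

```python
import math

n_training = 18827   # from X_training.shape in the question
batch_size = 32      # from the answer's global variables

full_steps = n_training // batch_size                    # steps passed to Keras
leftover = n_training % batch_size                       # images beyond the full batches
steps_to_cover_all = math.ceil(n_training / batch_size)  # steps needed to see every image

print(full_steps, leftover, steps_to_cover_all)  # prints: 588 11 589
```

Floor division gives 588 steps of 32 images, so 11 images fall outside the step count on each pass; rounding up instead would include them as a final, smaller batch.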

Answer 1
1 2019-09-03 13:01:52

Answer 2
1 Accepted 2019-09-03 16:10:31