Why are accuracy and loss staying exactly the same while training?
I have tried to modify the introductory tutorial at https://www.tensorflow.org/tutorials/keras/basic_classification to work with my own data. The goal is to classify images of dogs and cats. The code is very simple and is given below. The problem is that the network does not seem to learn at all: training loss and accuracy stay the same after every epoch.
The images (X_training) and the labels (y_training) seem to have the right format: X_training.shape returns (18827, 80, 80, 3), and y_training is a one-dimensional list with entries in {0,1}.
I have checked several times that the images in X_training are correctly labeled: if X_training[i,:,:,:] represents a dog, then y_training[i] is 1; if X_training[i,:,:,:] represents a cat, then y_training[i] is 0.
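The format checks described above can be collected in one small helper. This is only a sketch: the name summarize_dataset and the synthetic arrays are assumptions for illustration, not part of the original code, and the real data would come from the pickle files instead.

```python
import numpy as np

def summarize_dataset(X, y):
    """Return basic facts about an image array X and a binary label list y."""
    X = np.asarray(X)
    y = np.asarray(y)
    return {
        "X_shape": X.shape,
        "n_labels": int(y.shape[0]),
        "label_values": sorted(int(v) for v in np.unique(y)),
        "positive_fraction": float(y.mean()),
        "pixel_min": float(X.min()),
        "pixel_max": float(X.max()),
    }

# Synthetic stand-in data (an assumption; not the real pickle files).
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(10, 80, 80, 3), dtype=np.uint8)
y_demo = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]

summary = summarize_dataset(X_demo, y_demo)
print(summary)
```

Besides the shape, the label values and the pixel range are worth a look: labels other than 0 and 1, or pixels that are already scaled before dividing by 255, would both be easy to miss by eye.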
Shown below is the complete Python file without the import statements.
#loading the data from 4 pickle files:
pickle_in = open("X_training.pickle","rb")
X_training = pickle.load(pickle_in)
pickle_in = open("X_testing.pickle","rb")
X_testing = pickle.load(pickle_in)
pickle_in = open("y_training.pickle","rb")
y_training = pickle.load(pickle_in)
pickle_in = open("y_testing.pickle","rb")
y_testing = pickle.load(pickle_in)
#normalizing the input data:
X_training = X_training/255.0
X_testing = X_testing/255.0
#building the model:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(80, 80, 3)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',loss='mean_squared_error',metrics=['accuracy'])
#running the model:
model.fit(X_training, y_training, epochs=10)
The code runs and trains for 10 epochs, but neither loss nor accuracy improves; both stay exactly the same after every epoch. The code works fine with the Fashion-MNIST dataset used in the tutorial, with slight changes accounting for the difference between multiclass and binary classification and for the input shape.
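One quick diagnostic for a metric that never moves is to compare it with the accuracy of always predicting the most frequent class. The sketch below uses a hypothetical helper, majority_class_accuracy, and made-up labels; it is not taken from the original post.

```python
def majority_class_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    labels = list(labels)
    if not labels:
        raise ValueError("labels must not be empty")
    positives = sum(1 for v in labels if v == 1)
    negatives = len(labels) - positives
    return max(positives, negatives) / len(labels)

# Hypothetical labels: 6 dogs (1) and 4 cats (0).
demo_labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
baseline = majority_class_accuracy(demo_labels)
print(f"majority-class baseline: {baseline:.2f}")
```

If the accuracy reported in every epoch equals this baseline, one possible explanation is that the network outputs the same class for every image.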
If you want to train a binary classification model, you should use binary_crossentropy as your loss function, not mean_squared_error, which is used for regression tasks.
Replace
model.compile(optimizer='adam',loss='mean_squared_error',metrics=['accuracy'])
with
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
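One commonly cited reason for preferring binary cross-entropy with a sigmoid output is the size of the gradient when the prediction is confidently wrong. The sketch below works this out by hand for a single output unit. The function names are made up for illustration, and this does not show that this effect is what stalled the network in the question.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mse_grad_wrt_logit(z, y):
    """d/dz of (sigmoid(z) - y)**2."""
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)

def bce_grad_wrt_logit(z, y):
    """d/dz of -[y*log(p) + (1-y)*log(1-p)] with p = sigmoid(z)."""
    return sigmoid(z) - y

# A confidently wrong prediction: true label 1, logit strongly negative.
z, y = -10.0, 1.0
g_mse = mse_grad_wrt_logit(z, y)
g_bce = bce_grad_wrt_logit(z, y)
print(f"MSE gradient: {g_mse:.2e}")
print(f"BCE gradient: {g_bce:.2e}")
```

With mean squared error the factor p(1 - p) shrinks the gradient towards zero once the sigmoid saturates, whereas the cross-entropy gradient stays close to the prediction error itself.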
Furthermore, I would recommend trying a linear activation instead of relu on your dense layer (omitting the activation argument gives a linear layer).
Replace
keras.layers.Dense(128, activation=tf.nn.relu),
with
keras.layers.Dense(128),
And of course, to make better use of the power of neural networks, add some convolutional layers before your Flatten layer.
I have found a different implementation with a slightly more complex model that works. Here is the complete code without the import statements:
#global variables:
batch_size = 32
nr_of_epochs = 64
input_shape = (80,80,3)
#loading the data from 4 pickle files:
pickle_in = open("X_training.pickle","rb")
X_training = pickle.load(pickle_in)
pickle_in = open("X_testing.pickle","rb")
X_testing = pickle.load(pickle_in)
pickle_in = open("y_training.pickle","rb")
y_training = pickle.load(pickle_in)
pickle_in = open("y_testing.pickle","rb")
y_testing = pickle.load(pickle_in)
#building the model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
model = define_model()
#generators that rescale pixel values to [0, 1]; image data augmentation options could be added here
train_datagen = ImageDataGenerator(rescale=1.0/255.0)
val_datagen = ImageDataGenerator(rescale=1.0/255.0)
train_generator = train_datagen.flow(X_training, y_training, batch_size=batch_size)
val_generator = val_datagen.flow(X_testing, y_testing, batch_size=batch_size)
#running the model
#note: fit_generator is deprecated in newer TensorFlow releases, where model.fit accepts generators directly
history = model.fit_generator(train_generator, steps_per_epoch=len(X_training) // batch_size,
                              epochs=nr_of_epochs, validation_data=val_generator,
                              validation_steps=len(X_testing) // batch_size)
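The steps_per_epoch value above uses floor division, so the final partial batch is not counted. A short worked example with the training set size from the question (18827 images) and the batch size of 32; the variable names are only for illustration:

```python
import math

n_training = 18827   # number of training images reported in the question
batch_size = 32      # batch size used in the code above

full_batches = n_training // batch_size            # value passed as steps_per_epoch
leftover = n_training - full_batches * batch_size  # images beyond the last full batch
all_batches = math.ceil(n_training / batch_size)   # count including the partial batch

print(full_batches, leftover, all_batches)
```

Using math.ceil instead would make each epoch cover every image once, at the cost of one smaller final batch.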