Difference between accuracy_score in scikit-learn and accuracy in Keras
Keras evaluate_generator accuracy and scikit learn accuracy_score inconsistent
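I am using the Keras ImageDataGenerator class to load data, train, and predict. I have tried the solutions here, but I still have the problem. I am not sure whether I have the same issue as the one mentioned here. My guess is that my y_pred and y_test are not correctly mapped to each other.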
validation_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    shuffle='False')
validation_generator2 = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    shuffle='False')
loss, acc = model.evaluate_generator(validation_generator,
                                     steps=math.ceil(validation_generator.samples / batch_size),
                                     verbose=0,
                                     workers=1)
y_pred = model.predict_generator(validation_generator2,
                                 steps=math.ceil(validation_generator2.samples / batch_size),
                                 verbose=0,
                                 workers=1)
y_pred = np.argmax(y_pred, axis=-1)
y_test = validation_generator2.classes[validation_generator2.index_array]
print('loss: ', loss, 'accuracy: ', acc) # loss: 0.47286026436090467 accuracy: 0.864
print('accuracy_score: ', accuracy_score(y_test, y_pred)) # accuracy_score: 0.095
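evaluate_generator from Keras and accuracy_score from scikit-learn report different accuracies. Naturally, when I use confusion_matrix(y_test, y_pred) from scikit-learn, it also gives me a wrong confusion matrix. What mistake am I making? (By y_test I mean y_true.)
Update: to show that y_test and y_pred are inconsistent, I printed the accuracy of each class.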
cm = confusion_matrix(y_test, y_pred)
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
cm.diagonal()
acc_each_class = cm.diagonal()
print('accuracy of each class: \n')
for i in range(len(labels)):
    print(labels[i], ' : ', acc_each_class[i])
print('\n')
'''
accuracy of each class:
cannoli : 0.085
dumplings : 0.065
edamame : 0.1
falafel : 0.125
french_fries : 0.12
grilled_cheese_sandwich : 0.13
hot_dog : 0.075
seaweed_salad : 0.085
tacos : 0.105
takoyaki : 0.135
'''
As you can see, the accuracy of every class is far too low.
Update 2: how I train the model, in case it helps
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training')
validation_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    shuffle='False')
validation_generator2 = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    shuffle='False')
loss = CategoricalCrossentropy()
model.compile(optimizer=SGD(lr=lr, momentum=momentum),
              loss=loss,
              metrics=['accuracy'])
history = model.fit_generator(train_generator,
                              steps_per_epoch=train_generator.samples // batch_size,
                              validation_data=validation_generator,
                              validation_steps=validation_generator.samples // batch_size,
                              epochs=epochs,
                              verbose=1,
                              callbacks=[csv_logger, checkpointer],
                              workers=12)
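My first impression is that you trained two different models. Many models contain some "random" element (for example, how the weights of a neural network are initialized), which automatically leads to slightly different classifiers. The accuracy you report from Keras is computed on "validation_generator", whereas the sklearn accuracy is computed on "validation_generator2". You could try this (note that I have not tried this code):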
validation_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    # Must be the boolean False: the string 'False' is truthy in Python,
    # so passing it would leave shuffling switched on.
    shuffle=False)
# Evaluate and predict on the same generator.
loss, acc = model.evaluate_generator(validation_generator,
                                     steps=math.ceil(validation_generator.samples / batch_size),
                                     verbose=0,
                                     workers=1)
y_pred = model.predict_generator(validation_generator,
                                 steps=math.ceil(validation_generator.samples / batch_size),
                                 verbose=0,
                                 workers=1)
# Convert class probabilities to predicted class indices.
y_pred = np.argmax(y_pred, axis=-1)
y_test = validation_generator.classes[validation_generator.index_array]
print('loss: ', loss, 'accuracy: ', acc)
print('accuracy_score: ', accuracy_score(y_test, y_pred))
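One detail in the generator setup may be worth checking: the question's generators pass shuffle='False', a string rather than a boolean. The sketch below only illustrates Python's truthiness rules; will_shuffle is a hypothetical stand-in for an `if shuffle:` style check, not actual Keras code, and whether a given Keras version tests the argument this way is an assumption.

```python
def will_shuffle(shuffle):
    """Hypothetical stand-in for an `if shuffle:` check inside a data iterator."""
    return True if shuffle else False

# A non-empty string is truthy, even when it spells "False".
print(bool('False'))          # True
print(bool(False))            # False

# So the string and the boolean lead to opposite behaviour.
print(will_shuffle('False'))  # True
print(will_shuffle(False))    # False
```

If the generator is indeed shuffling, labels read from it afterwards need not be in the same order as the predictions, which would fit an accuracy_score close to chance level.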
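First, you should use the same generator for evaluate_generator and predict_generator, as San said.
Second, I think accuracy in sklearn and accuracy in Keras are not exactly the same thing: according to the sklearn documentation, in the multiclass case accuracy_score is actually the Jaccard score.
This link shows the difference: https://stats.stackexchange.com/questions/255465/accuracy-vs-jaccard-for-multiclass-problem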
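As a sanity check on how far the two definitions can differ, here is a minimal hand-written sketch for the single-label multiclass case. It re-implements both metrics from their usual definitions (fraction of argmax matches for Keras-style categorical accuracy, fraction of equal labels for an accuracy_score-style metric); the function names and the toy data are made up for illustration and this is not the libraries' own code.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=lambda i: row[i])

def categorical_accuracy(one_hot_true, probs):
    """Fraction of samples whose argmax matches between one-hot labels and probabilities."""
    hits = [argmax(t) == argmax(p) for t, p in zip(one_hot_true, probs)]
    return sum(hits) / len(hits)

def label_accuracy(y_true, y_pred):
    """Fraction of samples whose integer labels are equal."""
    hits = [t == p for t, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)

# Toy data: 4 samples, 3 classes.
one_hot_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]

y_true = [argmax(r) for r in one_hot_true]  # [0, 1, 2, 1]
y_pred = [argmax(r) for r in probs]         # [0, 1, 1, 1]

print(categorical_accuracy(one_hot_true, probs))  # 0.75
print(label_accuracy(y_true, y_pred))             # 0.75
```

Under these definitions the two numbers agree whenever labels and predictions are aligned sample by sample, so a gap as large as 0.864 versus 0.095 seems more likely to come from the ordering of y_test and y_pred than from the metric itself.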
Simply reset the validation generator before calling model.predict_generator:
loss, acc = model.evaluate_generator(validation_generator,
                                     steps=math.ceil(validation_generator.samples / batch_size),
                                     verbose=0,
                                     workers=1)
# Start again from the first batch before predicting.
validation_generator2.reset()
y_pred = model.predict_generator(validation_generator2,
                                 steps=math.ceil(validation_generator2.samples / batch_size),
                                 verbose=0,
                                 workers=1)
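Resetting only helps if the labels are then read in the same order in which the samples were predicted. The simulation below is a rough sketch of what happens otherwise; it uses no Keras at all, and the names (classes, order_used, order_after) are invented stand-ins for generator.classes and for an index array that is re-permuted between the prediction pass and the moment the labels are read, which is an assumption about a shuffling generator.

```python
import random

rng = random.Random(0)
num_classes, per_class = 10, 100

# Stand-in for generator.classes: true label of every file, in directory order.
classes = [c for c in range(num_classes) for _ in range(per_class)]

def accuracy(y_true, y_pred):
    """Fraction of positions where the two label lists agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Order in which a shuffling generator fed the samples to predict.
order_used = list(range(len(classes)))
rng.shuffle(order_used)
y_pred = [classes[i] for i in order_used]  # pretend the model is perfect

# A different permutation, as if the index array had been reshuffled afterwards.
order_after = list(range(len(classes)))
rng.shuffle(order_after)

aligned_acc = accuracy([classes[i] for i in order_used], y_pred)
misaligned_acc = accuracy([classes[i] for i in order_after], y_pred)

print('aligned:   ', aligned_acc)     # 1.0, labels follow the prediction order
print('misaligned:', misaligned_acc)  # close to 1 / num_classes
```

Even a perfect model scores near 1 / num_classes when the label order is unrelated to the prediction order, which resembles the roughly 10% per-class accuracies reported in the question for 10 classes.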