Accuracy reported by Keras fit_generator differs from manually computed accuracy

Question

Everything worked fine when I used fit, but I ran into a problem when I switched to fit_generator.

I use a callback to compute the confusion matrix at the end of each training epoch.

However, the accuracy derived from the confusion matrix differs from the validation accuracy reported by Keras.

My code is below.

    metrics = Valid_checker(model_name, args.patience, (x_valid, y_valid), x_length_valid)       
    model.compile(optimizer=optimizers.RMSprop(lr=args.lr),
      loss=[first_loss],
      loss_weights=[1.],
      metrics={'capsnet': 'accuracy'})
    callback_list = [lr_decay, metrics]

    model.fit_generator(
                no_decoder_generator(x_train, y_train),
                steps_per_epoch=len(x_train),
                epochs=args.epochs,
                validation_data=no_decoder_generator(x_valid, y_valid),
                validation_steps=len(x_valid),
                callbacks=callback_list,
                #class_weight=class_weights,
                verbose=1)

Valid_checker is my callback, and no_decoder_generator is my data generator for the model without the decoder. The batch size for both training and validation is 1.

This is my Valid_checker class:

    class Valid_checker(keras.callbacks.Callback):
        def __init__(self, model_name, patience, val_data, x_length):
            super().__init__()
            self.best_score = 0
            self.patience = patience
            self.current_patience = 0 
            self.model_name = model_name
            self.validation_data = val_data
            self.x_length = x_length


        def on_epoch_end(self, epoch, logs={}):
            X_val, y_val = self.validation_data
            if args.decoder==1:
                y_predict, x_predict = model.predict_generator(no_decoder_generator(X_val, y_val), steps=len(X_val))
                y_predict = np.asarray(y_predict)
                x_predict = np.asarray(x_predict)                       

            else:
                y_predict = np.asarray(model.predict_generator(predict_generator(X_val), steps=len(X_val)))

            y_val, y_predict = get_utterence_label_pred(y_val, y_predict, self.x_length )
            cnf_matrix = get_accuracy_and_cnf_matrix(y_val, y_predict)[1]
            val_acc_custom =  get_accuracy_and_cnf_matrix(y_val, y_predict)[0]
            war = val_acc_custom[0]
            uar = val_acc_custom[1]
            score = round(0.2*war+0.8*uar,2)

            loss_message=''
            # custom ModelCheckpoint & early stopping by using UAR            
            loss_message='loss: %s - acc: %s - val_loss: %s - val_acc: %s'%(round(logs.get('loss'),4), round(logs.get('acc'),4), round(logs.get('val_loss'),4), round(logs.get('val_acc'),4))
            log('[Epoch %03d/%03d]'%(epoch+1, args.epochs))
            log(loss_message)
            log('Confusion matrix:')
            log('%s'%cnf_matrix)
            log('Valid [WAR] [UAR] [Custom] : %s [%s]'%(val_acc_custom,score))

            if score > self.best_score :
                model.save_weights(model_name)
                log('Epoch %05d: val_uar_acc improved from %s to %s saving model to %s'%(epoch+1, self.best_score, score, self.model_name))
                self.best_score = score
                self.current_patience = 0

            else :
                self.current_patience+=1

            # early stopping
            if self.current_patience == (self.patience-1):
                self.model.stop_training = True
                log('Epoch %05d: early stopping' % (epoch + 1)) 
            return

The WAR value should be equal to the val_acc reported by Keras, but the two values differ. Why does this happen? I have confirmed that there are no problems with get_utterence_label_pred and get_accuracy_and_cnf_matrix. Everything works well when I use the fit function.

My generators are below.

    def predict_generator(x):
        # Yield validation features one sample at a time, in order, forever.
        while True:
            for index in range(len(x)):
                feature = np.expand_dims(x[index], -1)
                feature = np.expand_dims(feature, 0)  # shape (1, input_height, input_width, 1)
                yield feature

    def no_decoder_generator(x, y):
        # Yield (feature, label) batches of size 1, reshuffled on every pass.
        while True:
            indexes = np.arange(len(x))
            np.random.shuffle(indexes)
            for index in indexes:
                feature = np.expand_dims(x[index], -1)
                feature = np.expand_dims(feature, 0)  # shape (1, input_height, input_width, 1)
                label = np.expand_dims(y[index], 0)
                yield (feature, label)

    Epoch 1/70
    1858/1858 [==============================] - 558s 300ms/step - loss: 1.0708 - acc: 0.5684 - val_loss: 0.9087 - val_acc: 0.6244
    [Epoch 001/070]
    loss: 1.0708 - acc: 0.5684 - val_loss: 0.9087 - val_acc: 0.6244
    Confusion matrix:
    [[ 0.    28.    68.     4.  ]
     [ 0.    13.33  80.     6.67]
     [ 0.96   2.88  64.42  31.73]
     [ 0.     0.     3.28  96.72]]
    Valid [WAR] [UAR] [Custom]: [62.44 43.62] [47.38]

    Epoch 2/70
    1858/1858 [==============================] - 262s 141ms/step - loss: 0.9526 - acc: 0.6254 - val_loss: 1.1998 - val_acc: 0.4537
    [Epoch 002/070]
    loss: 0.9526 - acc: 0.6254 - val_loss: 1.1998 - val_acc: 0.4537
    Confusion matrix:
    [[ 36.    12.    24.    28.  ]
     [ 20.     0.    46.67  33.33]
     [  4.81   0.96  24.04  70.19]
     [  0.     0.     0.   100.  ]]
    Valid [WAR] [UAR] [Custom]: [ 46.34 40.01] [41.28]
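For readers checking the numbers, the logged UAR and custom score can be reproduced from the confusion matrix alone. The sketch below assumes that each row of the printed matrix is a per-class percentage (so the diagonal holds per-class recall) and that UAR is the mean of that diagonal. The helper names are hypothetical and are not the question's get_accuracy_and_cnf_matrix, whose source is not shown.

```python
def uar_from_row_normalised(cnf_percent):
    """Mean of the diagonal of a row-normalised confusion matrix (values in %)."""
    recalls = [row[i] for i, row in enumerate(cnf_percent)]
    return sum(recalls) / len(recalls)


def custom_score(war, uar):
    """Weighted mix used in the callback: 0.2 * WAR + 0.8 * UAR, rounded to 2 places."""
    return round(0.2 * war + 0.8 * uar, 2)


# Confusion matrix and WAR copied from the epoch 1 log above.
cnf_epoch1 = [
    [0.0, 28.0, 68.0, 4.0],
    [0.0, 13.33, 80.0, 6.67],
    [0.96, 2.88, 64.42, 31.73],
    [0.0, 0.0, 3.28, 96.72],
]
war = 62.44

uar = round(uar_from_row_normalised(cnf_epoch1), 2)
score = custom_score(war, uar)
print(uar, score)  # 43.62 47.38, matching the logged UAR and custom score
```

Under these assumptions the UAR and the custom score are internally consistent with the confusion matrix, so the mismatch is confined to WAR versus val_acc: the two agree in epoch 1 (62.44 vs 0.6244) and differ in epoch 2 (46.34 vs 0.4537).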

Answer 1

I know! This may be because the model in memory holds the weights from the last epoch rather than those from the best epoch, so the manually computed accuracy is not the best accuracy. You can do something like this:

1. Save the best model to a file

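    from keras.callbacks import ModelCheckpoint

    # Keep only the checkpoint with the best validation accuracy so far.
    callbacks = [ModelCheckpoint(
        filepath='best_model.{epoch:02d}-{val_acc:.2f}.h5',
        monitor='val_acc', save_best_only=True, verbose=1)]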

2. Load the model

    from keras.models import load_model

    # Use the file name that the checkpoint callback actually wrote.
    model = load_model('best_model.03-0.69.h5')

Answer 2

I solved this problem by using a Sequence instead of a generator.

The reason this happens is discussed in the following issue:

https://github.com/keras-team/keras/issues/11878

A simple example that uses a Sequence can be found here:

https://medium.com/datadriveninvestor/keras-training-on-large-datasets-3e9d9dbc09d4
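As a rough illustration of that approach, here is a minimal sketch of a Sequence that yields the same batch-of-one shapes as no_decoder_generator from the question. The class name ValidationSequence is hypothetical, and the import path is an assumption: it uses tensorflow.keras, while the question's standalone Keras would use keras.utils.Sequence. This is not the code the answerer actually used.

```python
import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:
    # Fallback so the sketch can be exercised without Keras installed.
    Sequence = object


class ValidationSequence(Sequence):
    """Hypothetical indexed counterpart of no_decoder_generator (batch size 1)."""

    def __init__(self, x, y):
        super().__init__()
        self.x = x
        self.y = y

    def __len__(self):
        # One batch per sample, so the number of batches equals the sample count.
        return len(self.x)

    def __getitem__(self, index):
        feature = np.expand_dims(self.x[index], -1)  # add channel axis
        feature = np.expand_dims(feature, 0)         # add batch axis
        label = np.expand_dims(self.y[index], 0)     # add batch axis
        return feature, label


# Tiny synthetic check: 3 samples of shape (5, 4) with one-hot labels.
x_demo = np.zeros((3, 5, 4))
y_demo = np.eye(3)
seq = ValidationSequence(x_demo, y_demo)
feature, label = seq[1]
print(len(seq), feature.shape, label.shape)
```

Because a Sequence is indexed, a given index always returns the same sample, and its length tells Keras how many batches make up one full pass. An instance could be passed as validation_data=ValidationSequence(x_valid, y_valid) in place of the generator.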

Answer 1 (2), 2021-03-29 15:14:20

Answer 2 (1, accepted), 2019-04-10 12:41:32
