
how to save val_loss and val_acc in Keras

I have trouble with recording 'val_loss' and 'val_acc' in Keras. 'loss' and 'acc' are easy because they are always recorded in the history of model.fit.

The documentation says: 'val_loss' is recorded if validation is enabled in fit, and val_acc is recorded if validation and accuracy monitoring are enabled. But what does this mean?

My code is model.fit(train_data, train_labels, epochs=64, batch_size=10, shuffle=True, validation_split=0.2, callbacks=[history]).

As you see, I use 5-fold cross-validation and shuffle the data. In this case, how can I enable validation in fit to record 'val_loss' and 'val_acc'?

Thanks

From the Keras documentation, we have for the model.fit method:

fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None)

'val_loss' is recorded if validation is enabled in fit, and val_acc is recorded if validation and accuracy monitoring are enabled. - This is from the keras.callbacks.Callback() object, if used for the callbacks parameter in the above fit method. It can be used as below:

    from keras.callbacks import Callback
    logs = Callback()
    model.fit(train_data, train_labels,epochs = 64, batch_size = 10,shuffle = True,validation_split = 0.2, callbacks=[logs]) 
    # Instead of using the history callback, which you've used.

'val_loss' is recorded if validation is enabled in fit means: when using the model.fit method, you are using either the validation_split parameter or the validation_data parameter to specify the tuple (x_val, y_val) or (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch.

A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). - Keras Documentation (return value of the model.fit method)

In your model below:

model.fit(train_data, train_labels,epochs = 64,batch_size = 10,shuffle = True,validation_split = 0.2, callbacks=[history])

you are using the History callback. If you use a variable to save the return value of model.fit, like below:

history = model.fit(train_data, train_labels, epochs=64, batch_size=10, shuffle=True, validation_split=0.2)
history.history  # the History callback is added automatically, so it need not be passed in callbacks

history.history will output a dictionary for you with loss, acc, val_loss and val_acc, just like given below:

{'val_loss': [14.431451635814849,
  14.431451635814849,
  14.431451635814849,
  14.431451635814849,
  14.431451635814849,
  14.431451635814849,
  14.431451635814849,
  14.431451635814849,
  14.431451635814849,
  14.431451635814849],
 'val_acc': [0.1046428571712403,
  0.1046428571712403,
  0.1046428571712403,
  0.1046428571712403,
  0.1046428571712403,
  0.1046428571712403,
  0.1046428571712403,
  0.1046428571712403,
  0.1046428571712403,
  0.1046428571712403],
 'loss': [14.555215610322499,
  14.555215534028553,
  14.555215548560733,
  14.555215588524229,
  14.555215592157273,
  14.555215581258137,
  14.555215575808571,
  14.55521561940511,
  14.555215563092913,
  14.555215624854679],
 'acc': [0.09696428571428571,
  0.09696428571428571,
  0.09696428571428571,
  0.09696428571428571,
  0.09696428571428571,
  0.09696428571428571,
  0.09696428571428571,
  0.09696428571428571,
  0.09696428571428571,
  0.09696428571428571]}
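Given such a dictionary, the epoch with the lowest validation loss can be picked out directly. A minimal sketch (the hand-built history_dict below, with made-up numbers, stands in for a real history.history):

```python
# Stand-in for history.history, with made-up values for illustration.
history_dict = {
    'loss':     [0.90, 0.55, 0.42, 0.38, 0.40],
    'acc':      [0.60, 0.75, 0.82, 0.85, 0.84],
    'val_loss': [0.95, 0.60, 0.50, 0.52, 0.58],
    'val_acc':  [0.58, 0.72, 0.79, 0.78, 0.76],
}

# 0-based index of the epoch with the lowest validation loss
best_epoch = min(range(len(history_dict['val_loss'])),
                 key=history_dict['val_loss'].__getitem__)

print(best_epoch + 1)                       # 1-based epoch number -> 3
print(history_dict['val_acc'][best_epoch])  # val_acc at that epoch -> 0.79
```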

You can save the data either by using CSVLogger as below, as given in the comments, or by the longer method of writing a dictionary to a csv file, as given here: writing a dictionary to a csv

from keras.callbacks import CSVLogger

csv_logger = CSVLogger('training.log')
model.fit(X_train, Y_train, callbacks=[csv_logger])
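The "writing a dictionary to a csv" route mentioned above can be sketched with nothing but the standard csv module; history_dict below is a stand-in for a real history.history:

```python
import csv

# Stand-in for history.history from a short (3-epoch) training run.
history_dict = {
    'loss':     [0.90, 0.55, 0.42],
    'val_loss': [0.95, 0.60, 0.50],
}

with open('training_history.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['epoch'] + list(history_dict))  # header: epoch, loss, val_loss
    # zip(*values) pairs up the per-epoch entries of every metric
    for epoch, row in enumerate(zip(*history_dict.values()), start=1):
        writer.writerow([epoch] + list(row))
```

Each row then holds one epoch, much like CSVLogger's output.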

UPDATE: The val_accuracy dictionary key seems to no longer work today. No idea why, but I removed that code from here despite the OP asking how to log it (also, loss is what actually matters for comparing cross-validation results).

Using Python 3.7 and Tensorflow 2.0, the following worked for me, after much searching, guessing, and failing repeatedly. I started with someone else's script to get what I needed written to a .json file; it produces one such .json file per training run showing the validation loss per epoch, so you can see how the model converged (or did not); accuracy is logged but not as a performance metric.

NOTE: You need to fill in yourTrainDir, yourTrainingData, yourValidationData, yourOptimizer, yourLossFunctionFromKerasOrElsewhere, yourNumberOfEpochs, etc. to enable this code to run:

import numpy as np
import os
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, LambdaCallback
import json
model.compile(
    optimizer=yourOptimizer,
    loss=yourLossFunctionFromKerasOrElsewhere()
    )

# create a custom callback to enable future cross-validation efforts
yourTrainDir = os.getcwd() + '/yourOutputFolderName/'
uniqueID = np.random.randint(999999) # To distinguish validation runs by saved JSON name
epochValidationLog = open(
    yourTrainDir +
    'val_log_per_epoch_' +
    '{}_'.format(uniqueID) +
    '.json',
    mode='wt',
    buffering=1
    )
ValidationLogsCallback = LambdaCallback(
    on_epoch_end = lambda epoch,
        logs: epochValidationLog.write(
            json.dumps(
                {
                    'oneIndexedEpoch': epoch + 1,
                    'Validationloss': logs['val_loss']
                }
                ) + '\n'
            ),
    on_train_end = lambda logs: epochValidationLog.close()
    )

# set up the list of callbacks
callbacksList = [
    ValidationLogsCallback,
    EarlyStopping(patience=40, verbose=1),
    ]
results = model.fit(
    x=yourTrainingData,
    steps_per_epoch=len(yourTrainingData),
    validation_data=yourValidationData,
    validation_steps=len(yourValidationData),
    epochs=yourNumberOfEpochs,
    verbose=1,
    callbacks=callbacksList
    )

This produces a JSON file in the TrainDir folder recording the validation loss for each training epoch as its own dictionary-like item. Note that the epoch number is indexed to start at 1, so it matches the output of tensorflow, not the actual index in Python.

I am outputting to a .json file, but it could be anything. Here is my code for analyzing the JSON files produced; I could have put it all in one script but did not.

import os
from pathlib import Path
import json

currentDirectory = os.getcwd()
outFileName = 'CVResults.json'
outFile = open(outFileName, mode='wt')
validationLogPaths = Path().glob('val_log_per_epoch_*.json')

# Necessary list to detect short unique IDs for each training session
stringDecimalDigits = [
    '1',
    '2',
    '3',
    '4',
    '5',
    '6',
    '7',
    '8',
    '9',
    '0'
]
setStringDecimalDigits = set(stringDecimalDigits)
trainingSessionsList = []

# Load the JSON files into memory to allow reading.
for validationLogFile in validationLogPaths:
    trainingUniqueIDCandidate = str(validationLogFile)[18:21]

    # Pad unique IDs with fewer than three digits with zeros at front
    thirdPotentialDigitOfUniqueID = trainingUniqueIDCandidate[2]
    if setStringDecimalDigits.isdisjoint(thirdPotentialDigitOfUniqueID):
        secondPotentialDigitOfUniqueID = trainingUniqueIDCandidate[1]
        if setStringDecimalDigits.isdisjoint(secondPotentialDigitOfUniqueID):
            trainingUniqueID = '00' + trainingUniqueIDCandidate[:1]
        else:
            trainingUniqueID = '0' + trainingUniqueIDCandidate[:2]
    else:
        trainingUniqueID = trainingUniqueIDCandidate
    trainingSessionsList.append((trainingUniqueID, validationLogFile))
trainingSessionsList.sort(key=lambda x: x[0])
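As an aside, the digit-detection and zero-padding above can be written more compactly with str.isdigit and str.zfill; a sketch assuming the same val_log_per_epoch_*.json filename pattern (the filename here is a hypothetical example):

```python
filename = 'val_log_per_epoch_42_.json'  # hypothetical example filename
candidate = filename[18:21]              # '42_' -- same slice as above
# Keep only the digits, then left-pad to three characters
unique_id = ''.join(ch for ch in candidate if ch.isdigit()).zfill(3)
print(unique_id)  # -> '042'
```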

# Analyze and export cross-validation results
for replicate in range(len(dict(trainingSessionsList).keys())):
    validationLogFile = trainingSessionsList[replicate][1]
    fileOpenForReading = open(
        validationLogFile, mode='r', buffering=1
    )

    with fileOpenForReading as openedFile:
        jsonValidationData = [json.loads(line) for line in openedFile]

    bestEpochResultsDict = {}
    oneIndexedEpochsList = []
    validationLossesList = []
    for line in range(len(jsonValidationData)):
        tempDict = jsonValidationData[line]
        oneIndexedEpochsList.append(tempDict['oneIndexedEpoch'])
        validationLossesList.append(tempDict['Validationloss'])
    trainingStopIndex = min(
        range(len(validationLossesList)),
        key=validationLossesList.__getitem__
    )
    bestEpochResultsDict['Integer_unique_ID'] = trainingSessionsList[replicate][0]
    bestEpochResultsDict['Min_val_loss'] = validationLossesList[trainingStopIndex]
    bestEpochResultsDict['Last_train_epoch'] = oneIndexedEpochsList[trainingStopIndex]
    outFile.write(json.dumps(bestEpochResultsDict, sort_keys=True) + '\n')

outFile.close()

This last block of code creates a JSON file summarizing what is in the CVResults.json produced above:

from pathlib import Path
import json
import os
import statistics

outFile = open("CVAnalysis.json", mode='wt')
CVResultsPath = sorted(Path().glob('*CVResults.json'))
if len(CVResultsPath) > 1:
    print('\nPlease analyze only one CVResults.json file at a time.')
    userAnswer = input('\nI understand only one will be analyzed: y or n')
    if (userAnswer == 'y') or (userAnswer == 'Y'):
        print('\nAnalyzing results in file {}:'.format(str(CVResultsPath[0])))

# Load the first CVResults.json file into memory to allow reading.
CVResultsFile = CVResultsPath[0]
fileOpenForReading = open(
    CVResultsFile, mode='r', buffering=1
)

outFile.write(
    'Analysis of cross-validation results tabulated in file {}'.format(
        os.getcwd()
    ) +
    str(CVResultsFile) +
    ':\n\n'
)

with fileOpenForReading as openedFile:
    jsonCVResultsData = [json.loads(line) for line in openedFile]

minimumValidationLossesList = []
trainedOneIndexedEpochsList = []
for line in range(len(jsonCVResultsData)):
    tempDict = jsonCVResultsData[line]
    minimumValidationLossesList.append(tempDict['Min_val_loss'])
    trainedOneIndexedEpochsList.append(tempDict['Last_train_epoch'])
outFile.write(
    '\nTrained validation losses: ' +
    json.dumps(minimumValidationLossesList) +
    '\n'
)
outFile.write(
    '\nTraining epochs required: ' +
    json.dumps(trainedOneIndexedEpochsList) +
    '\n'
)
outFile.write(
    '\n\nMean trained validation loss: ' +
    str(round(statistics.mean(minimumValidationLossesList), 4)) +
    '\n'
)
outFile.write(
    'Median of mean trained validation losses per session: ' +
    str(round(statistics.median(minimumValidationLossesList), 4)) +
    '\n'
)
outFile.write(
    '\n\nMean training epochs required: ' +
    str(round(statistics.mean(trainedOneIndexedEpochsList), 1)) +
    '\n'
)
outFile.write(
    'Median of mean training epochs required per session: ' +
    str(round(statistics.median(trainedOneIndexedEpochsList), 1)) +
    '\n'
)
outFile.close()

It is possible to save the data of val_loss and val_acc using the ModelCheckpoint class of Keras.

from keras.callbacks import ModelCheckpoint

checkpointer = ModelCheckpoint(filepath='yourmodelname.hdf5', 
                               monitor='val_loss', 
                               verbose=1, 
                               save_best_only=False)

history = model.fit(X_train, y_train, epochs=100, validation_split=0.02, callbacks=[checkpointer])

history.history.keys()

# output
# dict_keys(['val_loss', 'val_mae', 'val_acc', 'loss', 'mae', 'acc'])

An important point: if you omit the validation_split parameter, you will only get the values of loss, mae and acc.

Hope this helps!
