I have trouble recording 'val_loss' and 'val_acc' in Keras. 'loss' and 'acc' are easy because they are always recorded in the history of model.fit.
The documentation says that 'val_loss' is recorded if validation is enabled in fit, and val_acc is recorded if validation and accuracy monitoring are enabled. But what does this mean?
My code is model.fit(train_data, train_labels, epochs=64, batch_size=10, shuffle=True, validation_split=0.2, callbacks=[history]).
As you see, I use 5-fold cross-validation and shuffle the data. In this case, how can I enable validation in fit to record 'val_loss' and 'val_acc'?
Thanks
From the Keras documentation, the signature of the model.fit method is:
fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None)
'val_loss' is recorded if validation is enabled in fit, and val_acc is recorded if validation and accuracy monitoring are enabled.
- This is from the documentation of the keras.callbacks.Callback() object, which can be passed in the callbacks parameter of the fit method above. It can be used as below:
from keras.callbacks import Callback
logs = Callback()
model.fit(train_data, train_labels,epochs = 64, batch_size = 10,shuffle = True,validation_split = 0.2, callbacks=[logs])
# Instead of using the history callback, which you've used.
'val_loss' is recorded if validation is enabled in fit means: when calling model.fit, you pass either the validation_split parameter or the validation_data parameter, the latter specifying the tuple (x_val, y_val) or the tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch.
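To make the difference between validation_split and real k-fold cross-validation concrete, here is a minimal pure-Python sketch. It assumes, as the Keras documentation describes, that validation_split holds out the last fraction of the samples before any shuffling. The helper names split_like_validation_split and k_fold_indices are made up for this illustration and are not part of Keras.

```python
import math


def split_like_validation_split(samples, validation_split):
    """Hold out the LAST fraction of the samples, without shuffling first."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1")
    split_at = int(math.floor(len(samples) * (1.0 - validation_split)))
    return samples[:split_at], samples[split_at:]


def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


data = list(range(10))

# One fixed hold-out set, which is what validation_split=0.2 amounts to.
train_part, val_part = split_like_validation_split(data, 0.2)

# Five different hold-out sets, which is what 5-fold cross-validation needs.
folds = list(k_fold_indices(len(data), 5))
```

So validation_split=0.2 gives a single 80/20 hold-out, not five folds. For real 5-fold cross-validation you would train a fresh model once per fold and pass that fold's held-out samples through validation_data.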
A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable). - Keras Documentation ( Return value for model.fit method)
In your model below:
model.fit(train_data, train_labels,epochs = 64,batch_size = 10,shuffle = True,validation_split = 0.2, callbacks=[history])
you are using the History callback. fit also returns a History object, so if you assign the return value of model.fit to a variable, like below:
history = model.fit(train_data, train_labels, epochs=64, batch_size=10, shuffle=True, validation_split=0.2)
history.history
history.history will output a dictionary for you with the keys loss, acc, val_loss and val_acc, just like the one given below:
{'val_loss': [14.431451635814849,
14.431451635814849,
14.431451635814849,
14.431451635814849,
14.431451635814849,
14.431451635814849,
14.431451635814849,
14.431451635814849,
14.431451635814849,
14.431451635814849],
'val_acc': [0.1046428571712403,
0.1046428571712403,
0.1046428571712403,
0.1046428571712403,
0.1046428571712403,
0.1046428571712403,
0.1046428571712403,
0.1046428571712403,
0.1046428571712403,
0.1046428571712403],
'loss': [14.555215610322499,
14.555215534028553,
14.555215548560733,
14.555215588524229,
14.555215592157273,
14.555215581258137,
14.555215575808571,
14.55521561940511,
14.555215563092913,
14.555215624854679],
'acc': [0.09696428571428571,
0.09696428571428571,
0.09696428571428571,
0.09696428571428571,
0.09696428571428571,
0.09696428571428571,
0.09696428571428571,
0.09696428571428571,
0.09696428571428571,
0.09696428571428571]}
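Once you have such a dictionary, picking out the best epoch is plain list handling. The sketch below uses a small made-up dictionary as a stand-in for history.history; best_epoch is a hypothetical helper name, not a Keras function.

```python
def best_epoch(history_dict, key="val_loss"):
    """Return (one-indexed epoch, value) of the smallest entry under `key`."""
    values = history_dict[key]
    best_index = min(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


# Made-up numbers standing in for history.history
example_history = {
    "loss": [0.90, 0.70, 0.60, 0.55],
    "val_loss": [0.95, 0.80, 0.85, 0.90],
}

epoch, value = best_epoch(example_history)
print(epoch, value)
```

If 'val_loss' is missing from the dictionary, the lookup raises a KeyError, which is a quick way to notice that validation was not enabled in fit.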
You can save the data either by using CSVLogger, as suggested in the comments and shown below, or by the longer method of writing the dictionary to a CSV file yourself (see "writing a dictionary to a csv"):
from keras.callbacks import CSVLogger
csv_logger = CSVLogger('training.log')
model.fit(X_train, Y_train, callbacks=[csv_logger])
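For the longer, do-it-yourself route, a minimal sketch with the standard csv module could look like this. The dictionary is a made-up stand-in for history.history, history_to_csv is a hypothetical helper name, and the one-indexed epoch column is my own choice rather than something taken from CSVLogger.

```python
import csv
import io


def history_to_csv(history_dict, stream):
    """Write one row per epoch, one column per recorded key."""
    keys = sorted(history_dict)
    if not keys:
        return
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["epoch"] + keys)
    n_epochs = len(history_dict[keys[0]])
    for i in range(n_epochs):
        writer.writerow([i + 1] + [history_dict[k][i] for k in keys])


# Made-up numbers standing in for history.history
example_history = {"loss": [0.9, 0.7], "val_loss": [1.0, 0.8]}

buffer = io.StringIO()  # use open('history.csv', 'w', newline='') for a real file
history_to_csv(example_history, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```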
UPDATE: The val_accuracy dictionary key seems to no longer work today. I have no idea why, but I removed that code from here despite the OP asking how to log it (also, loss is what actually matters for comparing cross-validation results).
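One possible explanation, which I have not verified for this particular case: as far as I know the key follows the metric name given to compile, so it can be 'val_acc' in one setup and 'val_accuracy' in another, and it is absent altogether when no accuracy metric is compiled. A defensive lookup is easy to sketch in plain Python; first_present is a hypothetical helper and the dictionary is a made-up stand-in for history.history.

```python
def first_present(history_dict, candidates):
    """Return (key, values) for the first candidate key found, else (None, None)."""
    for key in candidates:
        if key in history_dict:
            return key, history_dict[key]
    return None, None


# Made-up numbers standing in for history.history
example_history = {
    "loss": [0.9, 0.7],
    "accuracy": [0.5, 0.6],
    "val_loss": [1.0, 0.8],
    "val_accuracy": [0.4, 0.55],
}

key, values = first_present(example_history, ("val_acc", "val_accuracy"))
print(key, values)
```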
Using Python 3.7 and TensorFlow 2.0, the following worked for me, after much searching, guessing, and failing repeatedly. I started with someone else's script to get what I needed written to a .json file; it produces one such .json file per training run, showing the validation loss per epoch so you can see how the model converged (or did not); accuracy is logged but not as a performance metric.
NOTE: You need to fill in yourTrainDir, yourTrainingData, yourValidationData, yourOptimizer, yourLossFunctionFromKerasOrElsewhere, yourNumberOfEpochs, etc. to enable this code to run:
import numpy as np
import os
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, LambdaCallback
import json
model.compile(
    optimizer=yourOptimizer,
    loss=yourLossFunctionFromKerasOrElsewhere()
)
# create a custom callback to enable future cross-validation efforts
yourTrainDir = os.getcwd() + '/yourOutputFolderName/'
uniqueID = np.random.randint(999999) # To distinguish validation runs by saved JSON name
epochValidationLog = open(
    yourTrainDir +
    'val_log_per_epoch_' +
    '{}_'.format(uniqueID) +
    '.json',
    mode='wt',
    buffering=1
)
ValidationLogsCallback = LambdaCallback(
    on_epoch_end = lambda epoch, logs: epochValidationLog.write(
        json.dumps(
            {
                'oneIndexedEpoch': epoch + 1,
                'Validationloss': logs['val_loss']
            }
        ) + '\n'
    ),
    on_train_end = lambda logs: epochValidationLog.close()
)
# set up the list of callbacks
callbacksList = [
    ValidationLogsCallback,
    EarlyStopping(patience=40, verbose=1),
]
results = model.fit(
    x=yourTrainingData,
    steps_per_epoch=len(yourTrainingData),
    validation_data=yourValidationData,
    validation_steps=len(yourValidationData),
    epochs=yourNumberOfEpochs,
    verbose=1,
    callbacks=callbacksList
)
This produces a JSON file in the yourTrainDir folder, recording the validation loss for each training epoch as its own dictionary-like item, one per line. Note that the epoch number is indexed to start at 1 so that it matches the output of TensorFlow, not the actual index in Python.
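The one-JSON-object-per-line format can be tried out without training anything. The sketch below imitates the per-epoch hook with a plain function that takes (epoch, logs), which is the argument order on_epoch_end receives; make_epoch_logger is a hypothetical name. Unlike the callback above, it writes null instead of raising a KeyError when 'val_loss' is missing, and it converts the value with float() on the assumption that the logged value might not be a plain Python float.

```python
import io
import json


def make_epoch_logger(stream):
    """Build a function that appends one JSON line per epoch to `stream`."""
    def on_epoch_end(epoch, logs=None):
        logs = logs or {}
        value = logs.get("val_loss")  # None if validation was not run
        record = {
            "oneIndexedEpoch": epoch + 1,
            "Validationloss": float(value) if value is not None else None,
        }
        stream.write(json.dumps(record) + "\n")
    return on_epoch_end


buffer = io.StringIO()  # stands in for the opened log file
log_epoch = make_epoch_logger(buffer)
log_epoch(0, {"loss": 0.9, "val_loss": 1.0})
log_epoch(1, {"loss": 0.7})  # no validation result for this epoch

# Reading the file back: parse each line separately.
records = [json.loads(line) for line in buffer.getvalue().splitlines()]
print(records)
```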
I am outputting to a .json file, but it could be anything. Here is my code for analyzing the JSON files produced; I could have put it all in one script but did not.
import os
from pathlib import Path
import json
currentDirectory = os.getcwd()
outFileName = 'CVResults.json'
outFile = open(outFileName, mode='wt')
validationLogPaths = Path().glob('val_log_per_epoch_*.json')
# Necessary list to detect short unique IDs for each training session
stringDecimalDigits = [
'1',
'2',
'3',
'4',
'5',
'6',
'7',
'8',
'9',
'0'
]
setStringDecimalDigits = set(stringDecimalDigits)
trainingSessionsList = []
# Load the JSON files into memory to allow reading.
for validationLogFile in validationLogPaths:
    trainingUniqueIDCandidate = str(validationLogFile)[18:21]
    # Pad unique IDs with fewer than three digits with zeros at front
    thirdPotentialDigitOfUniqueID = trainingUniqueIDCandidate[2]
    if setStringDecimalDigits.isdisjoint(thirdPotentialDigitOfUniqueID):
        secondPotentialDigitOfUniqueID = trainingUniqueIDCandidate[1]
        if setStringDecimalDigits.isdisjoint(secondPotentialDigitOfUniqueID):
            trainingUniqueID = '00' + trainingUniqueIDCandidate[:1]
        else:
            trainingUniqueID = '0' + trainingUniqueIDCandidate[:2]
    else:
        trainingUniqueID = trainingUniqueIDCandidate
    trainingSessionsList.append((trainingUniqueID, validationLogFile))
trainingSessionsList.sort(key=lambda x: x[0])
# Analyze and export cross-validation results
for replicate in range(len(dict(trainingSessionsList).keys())):
    validationLogFile = trainingSessionsList[replicate][1]
    fileOpenForReading = open(
        validationLogFile, mode='r', buffering=1
    )
    with fileOpenForReading as openedFile:
        jsonValidationData = [json.loads(line) for line in openedFile]
    bestEpochResultsDict = {}
    oneIndexedEpochsList = []
    validationLossesList = []
    for line in range(len(jsonValidationData)):
        tempDict = jsonValidationData[line]
        oneIndexedEpochsList.append(tempDict['oneIndexedEpoch'])
        validationLossesList.append(tempDict['Validationloss'])
    trainingStopIndex = min(
        range(len(validationLossesList)),
        key=validationLossesList.__getitem__
    )
    bestEpochResultsDict['Integer_unique_ID'] = trainingSessionsList[replicate][0]
    bestEpochResultsDict['Min_val_loss'] = validationLossesList[trainingStopIndex]
    bestEpochResultsDict['Last_train_epoch'] = oneIndexedEpochsList[trainingStopIndex]
    outFile.write(json.dumps(bestEpochResultsDict, sort_keys=True) + '\n')
outFile.close()
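One thing to be aware of in the script above: the slice [18:21] keeps only the first three characters after the 'val_log_per_epoch_' prefix, so longer IDs are truncated and two runs whose IDs share their first three digits would collide. As an alternative (not what the script above does), a regular expression can capture the whole run of digits. The pattern and the helper name unique_id_from_name are my own, and the padding width of 6 is an assumption based on the randint(999999) call used when the files are written.

```python
import re

# Matches names such as 'val_log_per_epoch_42_.json' and captures the digits.
LOG_NAME_PATTERN = re.compile(r"val_log_per_epoch_(\d+)_\.json$")


def unique_id_from_name(file_name, width=6):
    """Return the zero-padded numeric ID from a log file name, or None."""
    match = LOG_NAME_PATTERN.search(file_name)
    if match is None:
        return None
    return match.group(1).zfill(width)


names = [
    "val_log_per_epoch_42_.json",
    "val_log_per_epoch_123456_.json",
    "notes.txt",
]
ids = [unique_id_from_name(name) for name in names]
print(ids)
```

Zero-padding to a fixed width keeps the string sort order equal to the numeric order, which is what the padding branches in the script above are aiming for.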
This last block of code creates a JSON file summarizing what is in the CVResults.json file produced above:
from pathlib import Path
import json
import os
import statistics
outFile = open("CVAnalysis.json", mode='wt')
CVResultsPath = sorted(Path().glob('*CVResults.json'))
if len(CVResultsPath) > 1:
    print('\nPlease analyze only one CVResults.json file at a time.')
    userAnswer = input('\nI understand only one will be analyzed: y or n')
    if (userAnswer == 'y') or (userAnswer == 'Y'):
        print('\nAnalyzing results in file {}:'.format(str(CVResultsPath[0])))
# Load the first CVResults.json file into memory to allow reading.
CVResultsFile = CVResultsPath[0]
fileOpenForReading = open(
    CVResultsFile, mode='r', buffering=1
)
outFile.write(
    'Analysis of cross-validation results tabulated in file {}'.format(
        os.getcwd()
    ) +
    str(CVResultsFile) +
    ':\n\n'
)
with fileOpenForReading as openedFile:
    jsonCVResultsData = [json.loads(line) for line in openedFile]
minimumValidationLossesList = []
trainedOneIndexedEpochsList = []
for line in range(len(jsonCVResultsData)):
    tempDict = jsonCVResultsData[line]
    minimumValidationLossesList.append(tempDict['Min_val_loss'])
    trainedOneIndexedEpochsList.append(tempDict['Last_train_epoch'])
outFile.write(
    '\nTrained validation losses: ' +
    json.dumps(minimumValidationLossesList) +
    '\n'
)
outFile.write(
    '\nTraining epochs required: ' +
    json.dumps(trainedOneIndexedEpochsList) +
    '\n'
)
outFile.write(
    '\n\nMean trained validation loss: ' +
    str(round(statistics.mean(minimumValidationLossesList), 4)) +
    '\n'
)
outFile.write(
    'Median of mean trained validation losses per session: ' +
    str(round(statistics.median(minimumValidationLossesList), 4)) +
    '\n'
)
outFile.write(
    '\n\nMean training epochs required: ' +
    str(round(statistics.mean(trainedOneIndexedEpochsList), 1)) +
    '\n'
)
outFile.write(
    'Median of mean training epochs required per session: ' +
    str(round(statistics.median(trainedOneIndexedEpochsList), 1)) +
    '\n'
)
outFile.close()
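It is possible to record val_loss and val_acc while using the ModelCheckpoint class of Keras. The checkpoint itself saves the model; the val_loss and val_acc values end up in the History object returned by fit.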
from keras.callbacks import ModelCheckpoint
checkpointer = ModelCheckpoint(filepath='yourmodelname.hdf5',
                               monitor='val_loss',
                               verbose=1,
                               save_best_only=False)
history = model.fit(X_train, y_train, epochs=100, validation_split=0.02, callbacks=[checkpointer])
history.history.keys()
# output
# dict_keys(['val_loss', 'val_mae', 'val_acc', 'loss', 'mae', 'acc'])
An important point: if you omit the validation_split argument (and do not pass validation_data either), you will only get the values of loss, mae and acc.
Hope this helps!