
The clear_session() method of keras.backend does not clean up the fitting data

I am comparing the fitting accuracy results for different levels of data quality. "Good data" is data without any NA values in the features; "bad data" is data with NA values in the features. The bad data should be fixed by some value correction, for example replacing each NA with zero or with the feature mean.
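The two corrections mentioned above (NA → zero, NA → feature mean) can be sketched in plain NumPy; the array below is a hypothetical toy feature matrix, not the asker's data:

```python
import numpy as np

# Hypothetical toy feature matrix with missing values (NaN stands for NA).
x_bad = np.array([[1.0, np.nan],
                  [3.0, 4.0],
                  [np.nan, 6.0]])

# Correction 1: replace every NA with zero.
x_zero = np.nan_to_num(x_bad, nan=0.0)

# Correction 2: replace every NA with its feature (column) mean,
# computed over the non-NA entries only.
col_mean = np.nanmean(x_bad, axis=0)        # [2.0, 5.0]
nan_rows, nan_cols = np.where(np.isnan(x_bad))
x_mean = x_bad.copy()
x_mean[nan_rows, nan_cols] = col_mean[nan_cols]
```

Either corrected array can then be passed to model.fit() in place of the original xTrainBad.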

In my code, I am trying to perform multiple fitting procedures.

Review the simplified code:

from keras import backend as K
...

xTrainGood = ... # the good version of the xTrain data 

xTrainBad = ... #  the bad version of the xTrain data

...

model = Sequential()

model.add(...)

...

historyGood = model.fit(..., xTrainGood, ...) # fitting the model with 
                                              # the original data without
                                              # NA, zeroes, or the feature mean values

Review the fitting accuracy plot, based on the historyGood data:

[plot: fitting accuracy based on historyGood]

After that, the code resets the stored model and re-trains it with the "bad" data:

K.clear_session()

historyBad = model.fit(..., xTrainBad, ...)

Review the fitting process results, based on historyBad data:

[plot: fitting accuracy based on historyBad, after K.clear_session()]

As one can notice, the initial accuracy is > 0.7, which means the model "remembers" the previous fitting.

For comparison, these are the standalone fitting results on the "bad" data:

[plot: standalone fitting accuracy on the "bad" data]

How to reset the model to the "initial" state?

K.clear_session() isn't enough to reset states and ensure reproducibility. You'll also need to:

  • Set (& reset) random seeds
  • Reset TensorFlow default graph
  • Delete previous model

Code accomplishing each step is below.

reset_seeds()
model = make_model() # example function to instantiate model
model.fit(x_good, y_good)

del model
K.clear_session()
tf.compat.v1.reset_default_graph()

reset_seeds()
model = make_model()
model.fit(x_bad, y_bad)

Note that if other variables reference the model, you should del them as well - e.g. model = make_model(); model2 = model --> del model, model2 - else the object may persist. Lastly, TF random seeds aren't as easily reset as random's or numpy's, and require the graph to be cleared beforehand.
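The point about lingering references can be demonstrated without Keras at all; below is a pure-Python sketch with a stand-in Model class (the weakref just lets us observe when the object is actually freed; collection is immediate here because of CPython's reference counting):

```python
import weakref

class Model:          # stand-in for a Keras model
    pass

model = Model()
model2 = model                 # a second reference to the same object
ref = weakref.ref(model2)      # weak reference, to observe collection

model = Model()                # rebinding 'model' does NOT free the old
                               # object: 'model2' still keeps it alive
assert ref() is not None

del model2                     # last strong reference gone -> the old
                               # model is garbage-collected
assert ref() is None
```

This is why rebinding the name (or calling K.clear_session() alone) is not enough while other variables still point at the old model.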


Functions/modules used:

import tensorflow as tf
import numpy as np
import random
import keras.backend as K

def reset_seeds():
    np.random.seed(1)
    random.seed(2)
    if tf.__version__[0] == '2':
        tf.random.set_seed(3)
    else:
        tf.set_random_seed(3)
    print("RANDOM SEEDS RESET")

You are using K.clear_session() in the wrong way. To get a model with randomly initialized weights, you should delete the old model (using the del keyword), then create a new model and train it.

Then you can use K.clear_session() after each fitting procedure.

Instantiating a new model object with the same name is not enough?

model = make_model()

I had a similar issue when training many models in a loop in a single file. I tried many things in Keras/TF (version 2.5), including the answers in this thread. Nothing helped apart from one thing: running one file from another file using the subprocess module, which ensures a kernel restart every single time.

In the simplest way, you can keep the training code in a single file, and run the initial model or re-run the subsequent models from a different file. To run one file from another, simply do this in the second file:

import subprocess

run_no = [0, 1, 2, 3]
for i in run_no:
    subprocess.run(["ipython", "your_main_file.ipynb", str(i)])    # for jupyter
    #subprocess.run(["python3", "your_main_file.py", str(i)])      # for python
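On the receiving side, the training script picks up the run index from sys.argv. The self-contained sketch below inlines a hypothetical worker with python -c (standing in for your_main_file.py) so the whole driver/worker pattern is runnable as-is; each subprocess.run call starts a brand-new interpreter, so no Keras/TF state can leak between runs:

```python
import subprocess
import sys
import textwrap

# Hypothetical worker script; in practice this is your_main_file.py.
# With "python -c", the extra argument lands in sys.argv[1].
worker = textwrap.dedent("""
    import sys
    run_no = int(sys.argv[1])
    print(f"run {run_no}: fresh interpreter, no leftover model state")
""")

for i in range(3):
    # Each call is a separate OS process: weights, sessions, and seeds
    # from the previous run are gone before the next run starts.
    result = subprocess.run([sys.executable, "-c", worker, str(i)],
                            capture_output=True, text=True, check=True)
    print(result.stdout, end="")
```

The price of this approach is re-importing TensorFlow on every run, but it is the only method here that guarantees a truly clean slate.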
