
Why do I get different predictions using Keras sequential neural network in a loop?

I came across a strange difference between Keras's model.fit() and sklearn's model.fit(). When model.fit() is called inside a loop, I get inconsistent predictions from a Keras sequential model. This is not the case with an sklearn model. See the sample code below to reproduce the phenomenon.

from numpy.random import seed
seed(1337)
import tensorflow as tf
tf.random.set_seed(1337)

from sklearn.linear_model import LogisticRegression

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import InputLayer

from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler
import numpy as np

def get_sequential_dnn(NUM_COLS, NUM_ROWS):
    # The model definition was elided in the original post. This is a
    # minimal stand-in (an assumption) consistent with the single-column
    # probabilities printed below; NUM_ROWS is unused here.
    model = Sequential()
    model.add(Dense(NUM_COLS, input_dim=NUM_COLS, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

if __name__ == "__main__":

    input_size = 10
    X, y = make_blobs(n_samples=100, centers=2, n_features=input_size,
                      random_state=1
                      )

    scaler = MinMaxScaler()
    scaler.fit(X)
    X = scaler.transform(X)

    model = get_sequential_dnn(X.shape[1], X.shape[0])
    # print(model.summary())
    # model = LogisticRegression()

    for i in range(2):
        model.fit(X, y, epochs=100, verbose=0, shuffle=False)
        # model.fit(X, y)
    
        Xnew, _ = make_blobs(n_samples=3, centers=2, n_features=10, random_state=1)
        Xnew = scaler.transform(Xnew)

        # make a prediction  
        # ynew = model.predict_proba(Xnew)[:, 1]
        ynew = model.predict_proba(Xnew)
        ynew = np.array(ynew)
    
        # show the inputs and predicted outputs
        print('--------------')
        for j in range(len(Xnew)):
            print("X=%s \n Predicted=%s" % (Xnew[j], ynew[j]))

The output of this is:

--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
 0.77125788 0.73345369 0.2153754  0.35317172] 
 Predicted=[0.9931685]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
 0.12891829 0.25729677 0.69975833 0.73165292] 
 Predicted=[0.35249507]
X=[0.65154993 0.26153846 0.2416324  0.11793901 0.7047334  0.17706289
 0.07761879 0.45189967 0.8481064  0.85092378] 
 Predicted=[0.35249507]
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
 0.77125788 0.73345369 0.2153754  0.35317172] 
 Predicted=[1.]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
 0.12891829 0.25729677 0.69975833 0.73165292] 
 Predicted=[0.17942095]
X=[0.65154993 0.26153846 0.2416324  0.11793901 0.7047334  0.17706289
 0.07761879 0.45189967 0.8481064  0.85092378] 
 Predicted=[0.17942095]

If I instead use a LogisticRegression (uncomment the commented lines), the predictions are consistent:

--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
 0.77125788 0.73345369 0.2153754  0.35317172] 
 Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
 0.12891829 0.25729677 0.69975833 0.73165292] 
 Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324  0.11793901 0.7047334  0.17706289
 0.07761879 0.45189967 0.8481064  0.85092378] 
 Predicted=0.038716408758471876
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
 0.77125788 0.73345369 0.2153754  0.35317172] 
 Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
 0.12891829 0.25729677 0.69975833 0.73165292] 
 Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324  0.11793901 0.7047334  0.17706289
 0.07761879 0.45189967 0.8481064  0.85092378] 
 Predicted=0.038716408758471876

I understand that the obvious fix is to fit the model before the loop, and that there is presumably strong randomness in how Keras models fit the data to the labels. But there are cases where you need a loop to obtain prediction scores, for example performing 10-fold cross-validation to get AUC, sensitivity and specificity values on training data. In these situations this randomness is unacceptable.

What is causing this inconsistency and what is the solution to it?

There are a couple of issues with the way you are trying to get reproducible results with Keras.

  1. You are calling fit (when i==1) on a model that was already fitted (when i==0). The optimizer therefore sees a different set of initial weights the second time, so you end up with two different models. Solution: get a fresh model every time. This is not the case with sklearn, which starts from freshly initialized weights every time fit is called.
  2. model.fit internally uses the current state of the random number generator. You seeded it outside the loop, so the state is different when fit is called the second time. Solution: seed inside the loop. A short sketch isolating this effect follows.
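
To see Issue 2 in isolation, here is a minimal sketch (an added illustration, assuming TF 2.x global seeding behavior): building two fresh models without reseeding yields different initial weights, because constructing the first model already advances the global RNG state.

import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense

def fresh_weights():
  # Build an untrained model and return its initial kernel weights
  model = Sequential()
  model.add(Dense(1, input_dim=4))
  return model.get_weights()[0]

tf.random.set_seed(1337)
w_first = fresh_weights()
w_second = fresh_weights()                    # the RNG state has advanced
print(np.allclose(w_first, w_second))         # False

tf.random.set_seed(1337)                      # resetting the seed restores the state
print(np.allclose(w_first, fresh_weights()))  # True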

Sample code with the issues:

import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense

# Issue 2 here: seeded once, outside the training loop
tf.random.set_seed(1337)

def get_model():
  model = Sequential()
  model.add(Dense(4, input_dim=8, activation='relu'))
  model.add(Dense(1, activation='sigmoid'))
  model.compile(loss='binary_crossentropy', optimizer='adam')
  return model

X = np.random.randn(10,8)
y = np.random.randn(10,1)

# Issue 1 here: a single model instance is reused and refitted across iterations
model = get_model()

results = []
for i in range(10):
  model.fit(X, y, epochs=5, verbose=0, shuffle=False)
  results.append(np.sum(model.predict(X)))

assert np.all(np.isclose(results, results[0]))

As you can see, the assertion fails.

Corrected code

results = []
for i in range(10):
  tf.random.set_seed(1337)  # fix for Issue 2: reset the RNG state every iteration
  model = get_model()       # fix for Issue 1: start from a fresh model every iteration
  model.fit(X, y, epochs=5, verbose=0, shuffle=False)
  results.append(np.sum(model.predict(X)))

assert np.all(np.isclose(results, results[0]))
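
Now every iteration starts from the same RNG state and the same fresh initial weights, so the assertion passes.

The same two fixes cover the 10-fold cross-validation use case from the question. A sketch, assuming the X, y, scaling and get_sequential_dnn from the question's code (StratifiedKFold and roc_auc_score are just one way to collect fold-wise AUCs):

from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
aucs = []
for train_idx, test_idx in kfold.split(X, y):
  tf.random.set_seed(1337)                            # Issue 2: reset the RNG state per fold
  model = get_sequential_dnn(X.shape[1], X.shape[0])  # Issue 1: fresh model per fold
  model.fit(X[train_idx], y[train_idx], epochs=100, verbose=0, shuffle=False)
  scores = model.predict(X[test_idx]).ravel()         # predicted probabilities
  aucs.append(roc_auc_score(y[test_idx], scores))

print(np.mean(aucs))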
