
Why do I get different predictions using Keras sequential neural network in a loop?

I came across a strange difference between Keras's model.fit() and sklearn's model.fit(). When model.fit() is called inside a loop, I get inconsistent predictions from a Keras sequential model. This is not the case with an sklearn model. See the sample code below to reproduce the phenomenon.

from numpy.random import seed
seed(1337)
import tensorflow as tf
tf.random.set_seed(1337)

from sklearn.linear_model import LogisticRegression

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import InputLayer

from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler
import numpy as np

def get_sequential_dnn(NUM_COLS, NUM_ROWS):
    # The model definition was elided in the original post. This is a
    # minimal stand-in (an assumption) consistent with the single-column
    # probabilities printed below; NUM_ROWS is unused here.
    model = Sequential()
    model.add(Dense(NUM_COLS, input_dim=NUM_COLS, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

if __name__ == "__main__":

    input_size = 10
    X, y = make_blobs(n_samples=100, centers=2, n_features=input_size,
                      random_state=1
                      )

    scaler = MinMaxScaler()
    scaler.fit(X)
    X = scaler.transform(X)

    model = get_sequential_dnn(X.shape[1], X.shape[0])
    # print(model.summary())
    # model = LogisticRegression()

    for i in range(2):
        model.fit(X, y, epochs=100, verbose=0, shuffle=False)
        # model.fit(X, y)
    
        Xnew, _ = make_blobs(n_samples=3, centers=2, n_features=10, random_state=1)
        Xnew = scaler.transform(Xnew)

        # make a prediction  
        # ynew = model.predict_proba(Xnew)[:, 1]
        ynew = model.predict_proba(Xnew)
        ynew = np.array(ynew)
    
        # show the inputs and predicted outputs
        print('--------------')
        for j in range(len(Xnew)):
            print("X=%s \n Predicted=%s" % (Xnew[j], ynew[j]))

The output of this is:

--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
 0.77125788 0.73345369 0.2153754  0.35317172] 
 Predicted=[0.9931685]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
 0.12891829 0.25729677 0.69975833 0.73165292] 
 Predicted=[0.35249507]
X=[0.65154993 0.26153846 0.2416324  0.11793901 0.7047334  0.17706289
 0.07761879 0.45189967 0.8481064  0.85092378] 
 Predicted=[0.35249507]
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
 0.77125788 0.73345369 0.2153754  0.35317172] 
 Predicted=[1.]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
 0.12891829 0.25729677 0.69975833 0.73165292] 
 Predicted=[0.17942095]
X=[0.65154993 0.26153846 0.2416324  0.11793901 0.7047334  0.17706289
 0.07761879 0.45189967 0.8481064  0.85092378] 
 Predicted=[0.17942095]

If I instead use a LogisticRegression (uncomment the commented lines), the predictions are consistent:

--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
 0.77125788 0.73345369 0.2153754  0.35317172] 
 Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
 0.12891829 0.25729677 0.69975833 0.73165292] 
 Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324  0.11793901 0.7047334  0.17706289
 0.07761879 0.45189967 0.8481064  0.85092378] 
 Predicted=0.038716408758471876
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
 0.77125788 0.73345369 0.2153754  0.35317172] 
 Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
 0.12891829 0.25729677 0.69975833 0.73165292] 
 Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324  0.11793901 0.7047334  0.17706289
 0.07761879 0.45189967 0.8481064  0.85092378] 
 Predicted=0.038716408758471876

I understand that the obvious fix is to fit the model before the loop, and that there is presumably strong randomness in how Keras models fit the data to the labels. But there are cases where you need a loop to obtain prediction scores, for example performing 10-fold cross-validation to get AUC, sensitivity and specificity values on training data. In these situations this randomness is unacceptable.

What is causing this inconsistency and what is the solution to it?

There are a couple of issues with the way you are trying to get reproducible results with Keras.

  1. You are calling fit (when i==1) on a model that was already fitted (when i==0). The optimizer therefore sees a different set of initial weights the second time, so you end up with two different models. Solution: get a fresh model every time. This is not the case with sklearn, which starts from freshly initialized weights every time fit is called.
  2. model.fit internally uses the current state of the random number generator. You seeded it outside the loop, so the state is different when fit is called the second time. Solution: seed inside the loop. A short sketch isolating this effect follows.
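
To see Issue 2 in isolation, here is a minimal sketch (an added illustration, assuming TF 2.x global seeding behavior): building two fresh models without reseeding yields different initial weights, because constructing the first model already advances the global RNG state.

import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense

def fresh_weights():
  # Build an untrained model and return its initial kernel weights
  model = Sequential()
  model.add(Dense(1, input_dim=4))
  return model.get_weights()[0]

tf.random.set_seed(1337)
w_first = fresh_weights()
w_second = fresh_weights()                    # the RNG state has advanced
print(np.allclose(w_first, w_second))         # False

tf.random.set_seed(1337)                      # resetting the seed restores the state
print(np.allclose(w_first, fresh_weights()))  # True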

Sample code with the issues:

import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense

# Issue 2 here: seeded once, outside the training loop
tf.random.set_seed(1337)

def get_model():
  model = Sequential()
  model.add(Dense(4, input_dim=8, activation='relu'))
  model.add(Dense(1, activation='sigmoid'))
  model.compile(loss='binary_crossentropy', optimizer='adam')
  return model

X = np.random.randn(10,8)
y = np.random.randn(10,1)

# Issue 1 here: a single model instance is reused and refitted across iterations
model = get_model()

results = []
for i in range(10):
  model.fit(X, y, epochs=5, verbose=0, shuffle=False)
  results.append(np.sum(model.predict(X)))

assert np.all(np.isclose(results, results[0]))

As you can see, the assertion fails.

Corrected code

results = []
for i in range(10):
  tf.random.set_seed(1337)  # fix for Issue 2: reset the RNG state every iteration
  model = get_model()       # fix for Issue 1: start from a fresh model every iteration
  model.fit(X, y, epochs=5, verbose=0, shuffle=False)
  results.append(np.sum(model.predict(X)))

assert np.all(np.isclose(results, results[0]))
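
Now every iteration starts from the same RNG state and the same fresh initial weights, so the assertion passes.

The same two fixes cover the 10-fold cross-validation use case from the question. A sketch, assuming the X, y, scaling and get_sequential_dnn from the question's code (StratifiedKFold and roc_auc_score are just one way to collect fold-wise AUCs):

from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
aucs = []
for train_idx, test_idx in kfold.split(X, y):
  tf.random.set_seed(1337)                            # Issue 2: reset the RNG state per fold
  model = get_sequential_dnn(X.shape[1], X.shape[0])  # Issue 1: fresh model per fold
  model.fit(X[train_idx], y[train_idx], epochs=100, verbose=0, shuffle=False)
  scores = model.predict(X[test_idx]).ravel()         # predicted probabilities
  aucs.append(roc_auc_score(y[test_idx], scores))

print(np.mean(aucs))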
