Keras: Reproducible Results Simple MLP on CPU

I'm building and testing a simple MLP model, but am running into an issue with the reproducibility of my Keras results. I am trying to set up my neural network so that the prediction outputs won't change between runs.

I have already followed the Keras guide online as well as this post (Reproducible results using Keras with TensorFlow backend). I am running Keras on my local machine with the TensorFlow backend and the following versions:

tensorflow 2.0.0-alpha0, keras 2.2.4-tf, numpy 1.16.0

import os
os.environ['PYTHONHASHSEED'] = str(0)  # note: only affects hash randomization if set before the interpreter starts

import random
random.seed(0)

from numpy.random import seed
seed(1)
import tensorflow as tf
tf.compat.v1.set_random_seed(2)

# Use a single-threaded session to avoid nondeterminism from thread scheduling
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)

import numpy as np
from tensorflow.keras.layers import Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam


class Machine_Learning_Classifier_Keras(object):    
    @classmethod
    def _get_classifier(cls, n_input_features=None, **params):
        KerasClassifier = tf.keras.wrappers.scikit_learn.KerasClassifier
        Dense = tf.keras.layers.Dense
        Sequential = tf.keras.models.Sequential

        sk_params = {"epochs": 200, "batch_size": 128, "shuffle": False}

        def create_model(optimizer='adam', init='he_normal'):
            # create model
            model = Sequential()
            model.add(BatchNormalization())
            model.add(Dropout(0.2))
            model.add(Dense(500, input_dim=4, kernel_initializer=init, activation='relu'))  # input_dim is ignored here because BatchNormalization above is the first layer
            model.add(BatchNormalization())
            model.add(Dropout(0.2))
            model.add(Dense(250, kernel_initializer=init, activation='relu'))
            model.add(BatchNormalization())
            model.add(Dropout(0.2))
            model.add(Dense(500, kernel_initializer=init, activation='relu'))
            model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
            # Compile model (note: the `optimizer` argument of create_model is unused; Adam is hard-coded)
            model.compile(loss='binary_crossentropy', optimizer=Adam(lr=3e-3, decay=0.85), metrics=['accuracy'])
            return model

        return KerasClassifier(build_fn=create_model, **sk_params)

if __name__ == "__main__":
    X = np.asarray([[0.0, 0.0], [1.0, 1.0], [2.0, 2.5], [1.5, 1.6]])
    y = np.asarray([0, 0, 1, 1])

    nn = Machine_Learning_Classifier_Keras._get_classifier()
    nn.fit(X, y, sample_weight=np.asarray([0, 0, 1, 1]))

    values = np.asarray([[0.5, 0.5], [0.6, 0.5], [0.8, 1.0], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]])

    probas = nn.predict_proba(values)
    print(probas)

I would expect the predict_proba outputs to stay the same between runs; however, I am getting the following for two successive runs (results will vary):

Run 1:
[[0.9439231  0.05607685]
 [0.91351616 0.08648387]
 [0.06378722 0.9362128 ]
 [0.9439231  0.05607685]
 [0.9439231  0.05607685]
 [0.9439231  0.05607685]
 [0.94392323 0.05607677]
 [0.94392323 0.05607677]]

Run 2:
[[0.94391584 0.05608419]
 [0.91350436 0.08649567]
 [0.06378281 0.9362172 ]
 [0.94391584 0.05608419]
 [0.94391584 0.05608419]
 [0.94391584 0.05608419]
 [0.94391584 0.05608416]
 [0.94391584 0.05608416]]
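
For what it's worth, a quantitative way to check run-to-run drift is to persist the first run's probabilities and diff later runs against them. A minimal sketch (the check_reproducibility helper and the probas_run.npy file name are illustrative, not part of my actual code):

import os
import numpy as np

REF_FILE = "probas_run.npy"  # hypothetical file holding the first run's predictions

def check_reproducibility(probas, ref_file=REF_FILE, atol=1e-7):
    if not os.path.exists(ref_file):
        # First run: save the predictions as the reference
        np.save(ref_file, probas)
        print("Reference saved; rerun the script to compare.")
        return None
    # Later runs: compare against the saved reference
    ref = np.load(ref_file)
    print("Max absolute difference vs. first run:", np.max(np.abs(ref - probas)))
    return bool(np.allclose(ref, probas, atol=atol))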

You also need to fix the seed for the kernel_initializer in every layer. For example, a reproducible version looks like this:

import numpy as np
import tensorflow as tf

np.random.seed(1)
tf.compat.v1.set_random_seed(1)

model = tf.keras.Sequential()
# Pass an explicit seed to every kernel_initializer so the initial weights are identical across runs
model.add(tf.keras.layers.Dense(8, activation='relu', kernel_initializer=tf.keras.initializers.he_normal(seed=1), input_shape=[1]))
model.add(tf.keras.layers.Dense(1, activation='linear', kernel_initializer=tf.keras.initializers.he_normal(seed=1)))
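
Note that seeding the initializers only pins down the initial weights; dropout masks and any other random ops drawn during training fall back to the graph-level seed (tf.compat.v1.set_random_seed above), so both kinds of seed are needed for fully repeatable training.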

I opened a feature request ticket on TensorFlow's GitHub asking for a consolidated variable to fix the seed for all kernel initializers.
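
As a side note, newer TensorFlow releases consolidate exactly this: tf.keras.utils.set_random_seed (TensorFlow 2.7+) seeds Python's random, NumPy, and TensorFlow in one call, and tf.config.experimental.enable_op_determinism (TensorFlow 2.8+) additionally forces deterministic op implementations. A minimal sketch, assuming a recent TensorFlow:

import tensorflow as tf

# One call seeds random, numpy.random, and tf.random together (TensorFlow >= 2.7)
tf.keras.utils.set_random_seed(42)

# Additionally force deterministic kernels (TensorFlow >= 2.8)
tf.config.experimental.enable_op_determinism()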
