
Results not reproducible with Keras and TensorFlow in Python

I have the problem that I am not able to reproduce my results with Keras and TensorFlow.

It seems that a workaround for this issue was recently published on the Keras documentation site, but somehow it doesn't work for me.

What am I doing wrong?

I'm using a Jupyter Notebook on an MBP Retina (without an Nvidia GPU).

# ** Workaround from Keras Documentation **

import numpy as np
import tensorflow as tf
import random as rn

# The below is necessary in Python 3.2.3 onwards to
# have reproducible behavior for certain hash-based operations.
# See these references for further details:
# https://docs.python.org/3.4/using/cmdline.html#envvar-PYTHONHASHSEED
# https://github.com/fchollet/keras/issues/2280#issuecomment-306959926

import os
os.environ['PYTHONHASHSEED'] = '0'

# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.

np.random.seed(42)

# The below is necessary for starting core Python generated random numbers
# in a well-defined state.

rn.seed(12345)

# Force TensorFlow to use single thread.
# Multiple threads are a potential source of
# non-reproducible results.
# For further details, see: https://stackoverflow.com/questions/42022950/which-seeds-have-to-be-set-where-to-realize-100-reproducibility-of-training-res

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)

from keras import backend as K

# The below tf.set_random_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see: https://www.tensorflow.org/api_docs/python/tf/set_random_seed

tf.set_random_seed(1234)

sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)


# ** Workaround end **

# ** Start of my code **


# LSTM and CNN for sequence classification in the IMDB dataset
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from sklearn import metrics
# fix random seed for reproducibility
#np.random.seed(7)

# ... importing data and so on ...

# create the model
embedding_vecor_length = 32
neurons = 91
epochs = 1
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(neurons))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_logarithmic_error', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=epochs, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

Python version used:

Python 3.6.3 |Anaconda custom (x86_64)| (default, Oct  6 2017, 12:04:38) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]

The workaround is already included in the code (without effect).

Every time I run the training part, I get different results.

When I reset the kernel of the Jupyter Notebook, the 1st run after the reset matches the 1st run after the previous reset, the 2nd run matches the 2nd run, and so on.

So after a reset I always get, for example, 0.7782 on the first run, 0.7732 on the second run, etc.

But without a kernel reset the results are different every time I run the training.
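The pattern looks as if the seeded random stream simply keeps advancing within one kernel and is only rewound by the reset. A minimal NumPy-only analogue of what I observe:

import numpy as np

np.random.seed(42)
print(np.random.rand())   # "first run": always the same value
print(np.random.rand())   # "second run": a different, but also fixed, value

np.random.seed(42)        # a "kernel reset" rewinds the stream
print(np.random.rand())   # matches the first run again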

I would be grateful for any suggestions!

I had exactly the same problem and managed to solve it by closing and restarting the TensorFlow session every time I run the model. In your case it should look like this:

#START A NEW TF SESSION
np.random.seed(0)
tf.set_random_seed(0)
sess = tf.Session(graph=tf.get_default_graph())
K.set_session(sess)

embedding_vecor_length = 32
neurons = 91
epochs = 1
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(neurons))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_logarithmic_error', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=epochs, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

#CLOSE TF SESSION
K.clear_session()

I ran the following code and had reproducible results using the GPU and the tensorflow backend:

from datetime import datetime

import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.layers import Input, Dense
from keras.models import Model

# x, x_train, y_train_onehot, x_test and y_test_onehot are assumed to be loaded already.

print(datetime.now())
for i in range(10):
    np.random.seed(0)
    tf.set_random_seed(0)
    sess = tf.Session(graph=tf.get_default_graph())
    K.set_session(sess)

    n_classes = 3
    n_epochs = 20
    batch_size = 128

    task = Input(shape = x.shape[1:])
    h = Dense(100, activation='relu', name='shared')(task)
    h1= Dense(100, activation='relu', name='single1')(h)
    output1 = Dense(n_classes, activation='softmax')(h1)

    model = Model(task, output1)
    model.compile(loss='categorical_crossentropy', optimizer='Adam')
    model.fit(x_train, y_train_onehot, batch_size = batch_size, epochs=n_epochs, verbose=0)
    print(model.evaluate(x=x_test, y=y_test_onehot, batch_size=batch_size, verbose=0))
    K.clear_session()  # close the session so the next iteration starts from a clean state

And obtained this output:并获得了这个输出:

2017-10-23 11:27:14.494482
0.489712882132
0.489712893813
0.489712892765
0.489712854426
0.489712882132
0.489712864011
0.486303713004
0.489712903398
0.489712892765
0.489712903398

What I understood is that if you don't close your tf session (you do close it when you run in a new kernel), you keep sampling from the same "seeded" stream, so successive runs draw different values from it.
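A minimal sketch of that behaviour with the TF 1.x API: re-seeding alone does not rewind the sequence; only a fresh graph and session restart it from the seed.

import tensorflow as tf

def sample_once():
    # A fresh default graph + session restarts the random sequence from the seed.
    tf.reset_default_graph()
    tf.set_random_seed(0)
    with tf.Session() as sess:
        return sess.run(tf.random_normal([3]))

print(sample_once())  # prints the same vector on every call
print(sample_once())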

It looks like a bug in TensorFlow / Keras, I'm not sure. When setting the Keras backend to CNTK, the results are reproducible.

I even tried several versions of TensorFlow, from 1.2.1 up to 1.13.1. With all TensorFlow versions the results of multiple runs do not agree, even when the random seeds are set.
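For reference, with multi-backend Keras the backend can be switched for a single run by setting the KERAS_BACKEND environment variable before the first keras import; this assumes CNTK is installed:

import os
os.environ['KERAS_BACKEND'] = 'cntk'  # must be set before "import keras"
import keras  # Keras reports the active backend on import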

My answer is the following, which uses Keras with TensorFlow as the backend. Within your nested for loop, where one typically iterates through the various parameters you wish to explore for your model's development, call this function at the top of the innermost loop body (a concrete sketch appears at the end of this answer):

for...
   for...
      reset_keras()
      .
      .
      .

where the reset function is defined as其中重置函数定义为

def reset_keras():
    sess = tf.keras.backend.get_session()   # grab the current session
    tf.keras.backend.clear_session()        # drop Keras' global graph and state
    sess.close()                            # release the old session's resources
    sess = tf.keras.backend.get_session()   # a fresh session is created on demand
    np.random.seed(1)                       # re-seed NumPy...
    tf.set_random_seed(2)                   # ...and the TensorFlow graph-level RNG
PS: The function above also prevents your Nvidia GPU from accumulating too much memory (which happens after many iterations) and eventually becoming very slow; so the function both restores GPU performance and keeps the results reproducible.
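A concrete sketch of the nested loop described above; the hyper-parameter names and the build_model factory are illustrative stand-ins, not part of the original answer:

for units in [32, 64, 128]:
    for lr in [1e-3, 1e-4]:
        reset_keras()  # fresh session + fixed seeds before every configuration
        model = build_model(units=units, learning_rate=lr)  # your own model code
        model.fit(X_train, y_train, epochs=5, batch_size=64, verbose=0)
        print(units, lr, model.evaluate(X_test, y_test, verbose=0))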

What worked for me was to run the training in a new console every time. In addition to this, I also have these parameters set:

RANDOM_STATE = 42

os.environ['PYTHONHASHSEED'] = str(RANDOM_STATE)
random.seed(RANDOM_STATE)
np.random.seed(RANDOM_STATE)
tf.set_random_seed(RANDOM_STATE)

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)
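One reason the fresh console matters: PYTHONHASHSEED is read when the Python interpreter starts, so setting it inside an already-running notebook or console has no effect on hash randomization. A sketch of launching the training from a child process with the variable set in advance (train.py is a stand-in for your own script):

import os
import subprocess

# PYTHONHASHSEED must be in the environment before the interpreter starts,
# so pass it to a fresh child process instead of setting it in this one.
env = dict(os.environ, PYTHONHASHSEED='42')
subprocess.run(['python', 'train.py'], env=env, check=True)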

intra_op_parallelism_threads could also be set to a larger value.
