
tf.keras `predict()` gets different results

I was playing around with tf.keras and ran the predict() method on two Model objects initialized with the same weights.

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Masking, Input, Embedding, Dense
from tensorflow.keras.models import Model


X = np.asarray([
    [0, 1, 2, 3, 3],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
])

# y = [ ... ]  (labels; truncated in the source and not used below)

seq_len = X.shape[1]

inp = Input(shape=[seq_len])
emb = Embedding(4, 10, name='embedding')(inp)

x = emb
x = LSTM(5, return_sequences=False, name='lstm')(x)
out = Dense(1, activation='sigmoid', name='out')(x)

model = Model(inputs=inp, outputs=out)

preds = model.predict(X)

inp = Input(shape=[seq_len])
emb = Embedding(4, 10, name='embedding', weights=model.get_layer('embedding').get_weights()[0])(inp)

x = emb
x = LSTM(5, return_sequences=False, weights=model.get_layer('lstm').get_weights()[0])(x)
out = Dense(1, activation='sigmoid', weights=model.get_layer('out').get_weights()[0])(x)

model_2 = Model(inputs=inp, outputs=out)

preds_2 = model_2.predict(X)

print(preds, preds_2)

I am not sure why, but the results of the two predictions are different. I got these values when I ran the print call. You might get something different.

[[0.5027414 ]
 [0.5019673 ]
 [0.50134844]] [[0.5007331]

I am trying to understand how keras works. Any explanation would be appreciated. Thank you.

NOTE: THERE IS NO LEARNING INVOLVED HERE. I don't understand where the randomness comes from.

Try changing the optimizer from adam to SGD or something else. I noticed that I used to get different results with the same model, and changing the optimizer fixed the problem. Also, take a look here to fix the initial weights. By the way, I don't know why or how the optimizer can affect the results at test time with the same model.

The problem is that you are not copying all the weights. I have no idea why your call mechanically works, but it is easy to see that the copy is incomplete by examining get_weights() without the [0] indexing.

E.g., these are not copied:


array([[ 0.11243069, -0.1028666 ,  0.01080172, -0.07471965,  0.05566487,
        -0.12818974,  0.34882438, -0.17163819, -0.21306667,  0.5386005 ,
        -0.03643916,  0.03835883, -0.31128728,  0.04882491, -0.05503649,
        -0.22660127, -0.4683674 , -0.00415642, -0.29038426, -0.06893865],
       [-0.5117522 ,  0.01057898, -0.23182054,  0.03220385,  0.21614116,
         0.0732751 , -0.30829042,  0.06233712, -0.54017985, -0.1026137 ,
        -0.18011908,  0.15880923, -0.21900705, -0.11910527, -0.03808065,
         0.07623457, -0.13157862, -0.18740109,  0.06135096, -0.21589288],
       [-0.2295578 , -0.12452635, -0.08739456, -0.1880849 ,  0.2220488 ,
        -0.14575425,  0.32249492,  0.05235165, -0.09479579,  0.2496742 ,
         0.10411342, -0.0263749 ,  0.33186644, -0.1838699 ,  0.28964192,
        -0.2414586 ,  0.41612682,  0.13791762,  0.13942356, -0.36176005],
       [-0.14428475, -0.02090888,  0.27968913,  0.09452424,  0.1291543 ,
        -0.43372717, -0.11366601,  0.37842247,  0.3320751 ,  0.21959782,
        -0.4242381 ,  0.02412989, -0.24809352,  0.2508208 , -0.06223384,
         0.08648364,  0.17311276, -0.05988384,  0.02276517, -0.1473657 ],
       [ 0.28600952, -0.37206012,  0.21376705, -0.16566195,  0.0833357 ,
        -0.00887177,  0.01394618,  0.5345957 , -0.25116244, -0.17159337,
         0.096329  , -0.32286254,  0.02044407, -0.1393016 , -0.0767666 ,
         0.1505355 , -0.28456056,  0.16909163,  0.16806729, -0.14622769]],

Also, if you name the lstm layer in model 2, you can see that parts of the weights are not equal:

model_2.get_layer("lstm").get_weights()[1] - model.get_layer("lstm").get_weights()[1]
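
The point about copying every array can be checked with a short sketch. This is an illustration rather than the asker's exact code: `build_model` is a hypothetical helper that rebuilds the question's architecture, and `set_weights(get_weights())` is used here instead of the `weights=` constructor argument.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Input, Embedding, Dense
from tensorflow.keras.models import Model


def build_model(seq_len):
    # Hypothetical helper: same layers as in the question.
    inp = Input(shape=(seq_len,))
    x = Embedding(4, 10, name='embedding')(inp)
    x = LSTM(5, return_sequences=False, name='lstm')(x)
    out = Dense(1, activation='sigmoid', name='out')(x)
    return Model(inputs=inp, outputs=out)


X = np.asarray([
    [0, 1, 2, 3, 3],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
])

model = build_model(X.shape[1])
model_2 = build_model(X.shape[1])

# Each layer holds a list of arrays; taking [0] keeps only the first one.
for name in ('embedding', 'lstm', 'out'):
    shapes = [w.shape for w in model.get_layer(name).get_weights()]
    print(name, shapes)

# Copy every array of every layer in one call.
model_2.set_weights(model.get_weights())

preds = model.predict(X, verbose=0)
preds_2 = model_2.predict(X, verbose=0)
print(np.allclose(preds, preds_2))
```

With default settings the LSTM layer holds a kernel, a recurrent kernel, and a bias, and the Dense layer holds a kernel and a bias, so indexing with `[0]` leaves the remaining arrays at their fresh random initialization.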

Perhaps setting the numpy seed is not enough to make the operations and weights deterministic. The TensorFlow documentation suggests that, to have deterministic weights, you should instead enable op determinism:

https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_op_determinism#:~:text=Configures%20TensorFlow%20ops%20to%20run%20deterministically.&text=When%20op%20determinism%20is%20enabled,is%20useful%20for%20debugging%20models

Could you check if it helps? (Your code seems to be written for version 1 of TF, so it does not run on my v2 setup without adaptation.)
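
A minimal sketch of that suggestion, assuming TensorFlow 2.8 or later, where both `tf.keras.utils.set_random_seed` and `tf.config.experimental.enable_op_determinism` are available. `build_seeded_model` is a hypothetical helper, and the seed value 1 is arbitrary.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Input, Embedding, Dense
from tensorflow.keras.models import Model

# Ask TensorFlow to run ops deterministically (assumes TF 2.8+).
# This governs op execution; the seed below governs the initial weight values.
tf.config.experimental.enable_op_determinism()


def build_seeded_model(seed, seq_len=5):
    # Hypothetical helper: reset the Python, NumPy and TensorFlow seeds,
    # then build the question's architecture from scratch.
    tf.keras.utils.set_random_seed(seed)
    inp = Input(shape=(seq_len,))
    x = Embedding(4, 10, name='embedding')(inp)
    x = LSTM(5, return_sequences=False, name='lstm')(x)
    out = Dense(1, activation='sigmoid', name='out')(x)
    return Model(inputs=inp, outputs=out)


model_a = build_seeded_model(1)
model_b = build_seeded_model(1)

# Compare every weight array of the two independently built models.
same_init = all(
    np.array_equal(a, b)
    for a, b in zip(model_a.get_weights(), model_b.get_weights())
)
print(same_init)
```

If the seed is reset before each build, the two models should start from the same initial weights without any explicit copying, which is a separate route to matching predictions from the weight-copying fix.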

The thing about machine learning is that it doesn't always learn in quite the same way. It involves lots of probabilities, so on a larger scale the results will tend to converge towards one value, but individual runs can and will give varying results.

More info here

It is absolutely normal that many runs with the same input data give different output. It is mainly due to the internal stochasticity of such machine learning techniques (for example: ANNs, decision-tree building algorithms, etc.).

- Nabil Belgasmi, Université de la Manouba

There is not a specific method or technique. The results and the evaluation of performance depend on several factors: the data type, the parameters of the induction function, the training set (supervised), etc. What is important is to compare the results using metrics such as recall, precision, F-measure, ROC curves, or other graphical methods.

- Jésus Antonio Motta, Laval University

EDIT: The predict() function takes an array of one or more data instances.

The example below demonstrates how to make regression predictions on multiple data instances with an unknown expected outcome.

# example of making predictions for a regression problem
from keras.models import Sequential
from keras.layers import Dense
from sklearn.datasets import make_regression
from sklearn.preprocessing import MinMaxScaler
# generate regression dataset
X, y = make_regression(n_samples=100, n_features=2, noise=0.1, random_state=1)
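scalarX, scalarY = MinMaxScaler(), MinMaxScaler()
# fit the scalers first; transform() on an unfitted scaler raises an error
scalarX.fit(X)
scalarY.fit(y.reshape(100,1))
X = scalarX.transform(X)
y = scalarY.transform(y.reshape(100,1))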
# define and fit the final model
model = Sequential()
model.add(Dense(4, input_dim=2, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mse', optimizer='adam')
model.fit(X, y, epochs=1000, verbose=0)
# new instances where we do not know the answer
Xnew, a = make_regression(n_samples=3, n_features=2, noise=0.1, random_state=1)
Xnew = scalarX.transform(Xnew)
# make a prediction
ynew = model.predict(Xnew)
# show the inputs and predicted outputs
for i in range(len(Xnew)):
    print("X=%s, Predicted=%s" % (Xnew[i], ynew[i]))

Running the example makes multiple predictions, then prints the inputs and predictions side by side for review.

X=[0.29466096 0.30317302], Predicted=[0.17097184]
X=[0.39445118 0.79390858], Predicted=[0.7475489]
X=[0.02884127 0.6208843 ], Predicted=[0.43370453]


Disclaimer: The predict() function itself is slightly random (probabilistic).
