
Getting wrong prediction after loading a saved model

I am trying to save an Estimator and then load it to predict as required. Here is the part where I train the model:

classifier = tf.estimator.Estimator(model_fn=bag_of_words_model)

# Train
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"words": x_train},  # x_train is 2D numpy array of shape (26, 5)
    y=y_train,                   # y_train is 1D panda series of length 26
    batch_size=1000,
    num_epochs=None,
    shuffle=True)

classifier.train(input_fn=train_input_fn, steps=300)
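
The bag_of_words_model model_fn itself is not shown in the question. Purely for context, below is a minimal sketch of what such a model_fn could look like, assuming an embedding-average classifier and the prediction keys 'class' and 'prob' that appear in the outputs later; the vocabulary size and optimizer settings are made up, and this is not the asker's code.

import tensorflow as tf

VOCAB_SIZE = 1000   # assumption: size of the word-id vocabulary
NUM_CLASSES = 26    # assumption: number of target classes

def bag_of_words_model(features, labels, mode):
    # Average the embeddings of the word ids in each example.
    embeddings = tf.get_variable("embeddings", [VOCAB_SIZE, 50])
    embedded = tf.nn.embedding_lookup(embeddings, features["words"])
    averaged = tf.reduce_mean(embedded, axis=1)
    logits = tf.layers.dense(averaged, NUM_CLASSES, activation=None)

    predictions = {
        "class": tf.argmax(logits, axis=1),
        "prob": tf.nn.softmax(logits),
    }
    if mode == tf.estimator.ModeKeys.PREDICT:
        export_outputs = {
            "serving_default": tf.estimator.export.PredictOutput(predictions)
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions,
                                          export_outputs=export_outputs)

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.AdamOptimizer(0.01).minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)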

I then save the model as follows:

def serving_input_receiver_fn():
    serialized_tf_example = tf.placeholder(dtype=tf.int64, shape=(None, 5), name='words')
    receiver_tensors = {"predictor_inputs": serialized_tf_example}
    features = {"words": tf.tile(serialized_tf_example, multiples=[1, 1])}
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

full_model_dir = classifier.export_savedmodel(export_dir_base="E:/models/",
                                              serving_input_receiver_fn=serving_input_receiver_fn)
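
As an aside (a sketch, not the asker's code): tf.estimator.export.build_raw_serving_input_receiver_fn can build the receiver directly from a raw placeholder, so the serving input key matches the training feature key "words" and the tf.tile pass-through is not needed. The loaded predictor would then be called with {'words': x_test} instead of {'predictor_inputs': x_test}.

# Alternative export (sketch): the serving key is the same "words" key used
# during training, avoiding the separate "predictor_inputs" mapping above.
words_placeholder = tf.placeholder(dtype=tf.int64, shape=[None, 5], name="words")
raw_serving_input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    {"words": words_placeholder})

full_model_dir = classifier.export_savedmodel(
    export_dir_base="E:/models/",
    serving_input_receiver_fn=raw_serving_input_fn)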

I now load the model and give the test set to it for prediction:

from tensorflow.contrib import predictor

classifier = predictor.from_saved_model("E:\\models\\1547122667")
predictions = classifier({'predictor_inputs': x_test})
print(predictions)
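
One thing worth checking when a restored model behaves differently is whether the input keys, dtype, and shape match what the exported signature expects. A small sketch, assuming the tf.contrib.predictor object exposes the feed_tensors and fetch_tensors properties it has in TF 1.x:

# Inspect the serving signature of the loaded SavedModel (TF 1.x sketch).
print(classifier.feed_tensors)   # input keys, e.g. {'predictor_inputs': <tf.Tensor ...>}
print(classifier.fetch_tensors)  # output keys, e.g. 'class' and 'prob'

# The fed array must use exactly these keys and the dtype/shape the
# placeholder in serving_input_receiver_fn was defined with (int64, (None, 5)).
print(x_test.dtype, x_test.shape)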

This gives me predictions like:

{'class': array([ 0,  0,  0,  0,  0,  5,  0,  0,  0,  0,  0,  0,  0,  0,  0, 15,  0,
        0,  5,  0, 20,  0,  5,  0,  0,  0], dtype=int64),
'prob': array([[9.9397606e-01, 6.5355714e-05, 2.2225287e-05, ..., 1.4510043e-07,
            1.6920333e-07, 1.4865007e-07],
           [9.9886864e-01, 1.4976941e-06, 7.0847680e-05, ..., 9.4182191e-08,
            1.1828639e-07, 9.5683227e-08],
           [9.9884748e-01, 2.1105163e-06, 1.1994909e-05, ..., 8.3957858e-08,
            1.0476184e-07, 8.5592234e-08],
           ...,
           [9.6145850e-01, 6.9048328e-05, 1.1446012e-04, ..., 7.3761731e-07,
            8.8173107e-07, 7.3824998e-07],
           [9.7115618e-01, 2.9716679e-05, 5.9592247e-05, ..., 2.8933655e-07,
            3.4183532e-07, 2.9737942e-07],
           [9.7387028e-01, 6.9163914e-05, 1.5800977e-04, ..., 1.6116818e-06,
            1.9025001e-06, 1.5990496e-06]], dtype=float32)}

class and prob are the two outputs I am predicting. Now, if I predict on the same test set without saving and loading the model:

# Predict.
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"words": x_test}, y=y_test, num_epochs=1, shuffle=False)
predictions = classifier.predict(input_fn=test_input_fn)
print(predictions)

then I get the output as follows:

{'class': 0, 'prob': array([9.9023646e-01, 2.6038184e-05, 3.9950578e-06, ..., 1.3950405e-08,
       1.5713249e-08, 1.3064114e-08], dtype=float32)}
{'class': 1, 'prob': array([2.0078469e-05, 9.9907070e-01, 8.9245419e-05, ..., 6.6533559e-08,
       7.1365662e-08, 6.8764685e-08], dtype=float32)}
{'class': 2, 'prob': array([3.0828053e-06, 9.6484597e-05, 9.9906868e-01, ..., 5.9190391e-08,
       6.0995028e-08, 6.2322023e-08], dtype=float32)}
{'class': 3, 'prob': array([7.4923842e-06, 1.1112734e-06, 1.1697492e-06, ..., 4.4295877e-08,
       4.4563325e-08, 4.0475427e-08], dtype=float32)}
{'class': 4, 'prob': array([4.6085161e-03, 2.8403942e-05, 2.0638861e-05, ..., 7.6083229e-09,
       8.5255349e-09, 6.7836012e-09], dtype=float32)}
{'class': 5, 'prob': array([6.2119620e-06, 7.2357750e-07, 2.6231232e-06, ..., 7.4999367e-09,
       9.0847436e-09, 7.5630142e-09], dtype=float32)}
{'class': 6, 'prob': array([4.4882968e-06, 2.2007227e-06, 8.3352124e-06, ..., 2.3130213e-09,
       2.3657243e-09, 2.0045692e-09], dtype=float32)}
{'class': 7, 'prob': array([1.88617545e-04, 9.01482690e-06, 1.47353385e-05, ...,
       3.38567552e-09, 3.97709154e-09, 3.37017392e-09], dtype=float32)}
{'class': 8, 'prob': array([1.9843496e-06, 4.5909755e-06, 4.8804057e-05, ..., 2.2636470e-08,
       2.0094852e-08, 2.0215294e-08], dtype=float32)}
{'class': 9, 'prob': array([2.5907659e-04, 4.4661370e-05, 6.9490757e-06, ..., 1.6249915e-08,
       1.7579131e-08, 1.5439820e-08], dtype=float32)}
{'class': 10, 'prob': array([3.6456138e-05, 7.5861579e-05, 3.0208937e-05, ..., 2.7859956e-08,
       2.5423596e-08, 2.8662368e-08], dtype=float32)}
{'class': 11, 'prob': array([1.1723863e-05, 9.1407037e-06, 4.8835855e-04, ..., 2.3693143e-08,
       2.0524153e-08, 2.3223269e-08], dtype=float32)}
{'class': 12, 'prob': array([1.2886175e-06, 2.6652628e-05, 2.7812246e-06, ..., 4.8295210e-08,
       4.4282604e-08, 4.7342766e-08], dtype=float32)}
{'class': 13, 'prob': array([3.3486103e-05, 1.3361238e-05, 3.6493871e-05, ..., 2.2195401e-09,
       2.4768412e-09, 2.0150714e-09], dtype=float32)}
{'class': 14, 'prob': array([4.6108948e-05, 3.0377207e-05, 2.0945006e-06, ..., 4.2276231e-08,
       5.2376720e-08, 4.4969173e-08], dtype=float32)}
{'class': 15, 'prob': array([1.7165689e-04, 2.9350400e-05, 3.2283624e-05, ..., 7.1849078e-09,
       7.6871531e-09, 6.6224697e-09], dtype=float32)}
{'class': 16, 'prob': array([5.9876328e-07, 3.0931276e-06, 1.5760432e-05, ..., 4.0450086e-08,
       4.2720632e-08, 4.6017195e-08], dtype=float32)}
{'class': 17, 'prob': array([2.6658317e-04, 9.9656281e-05, 4.0355867e-06, ..., 1.2873563e-08,
       1.4808875e-08, 1.2155732e-08], dtype=float32)}
{'class': 18, 'prob': array([1.4914459e-04, 2.1025437e-06, 1.2505146e-05, ..., 9.8899635e-09,
       1.1115599e-08, 8.9312255e-09], dtype=float32)}
{'class': 19, 'prob': array([2.5615416e-04, 2.3750392e-05, 2.2886352e-04, ..., 3.9635733e-08,
       4.5139984e-08, 3.8605780e-08], dtype=float32)}
{'class': 20, 'prob': array([6.3949975e-04, 2.3652929e-05, 7.8577641e-06, ..., 2.0959168e-09,
       2.5495863e-09, 2.0428985e-09], dtype=float32)}
{'class': 21, 'prob': array([8.2179489e-05, 8.4409467e-06, 5.4756888e-06, ..., 2.2360982e-09,
       2.4820561e-09, 2.1206517e-09], dtype=float32)}
{'class': 22, 'prob': array([3.9681905e-05, 2.4394642e-06, 8.9102805e-06, ..., 2.0282410e-08,
       2.1132811e-08, 1.8368105e-08], dtype=float32)}
{'class': 23, 'prob': array([3.0794261e-05, 6.5104805e-06, 3.3528936e-06, ..., 2.0360846e-09,
       1.9360573e-09, 1.7195430e-09], dtype=float32)}
{'class': 24, 'prob': array([3.4596618e-05, 2.2907707e-06, 2.5318438e-06, ..., 1.1038886e-08,
       1.2148775e-08, 9.9556408e-09], dtype=float32)}
{'class': 25, 'prob': array([1.4846727e-03, 1.9189476e-06, 5.3232620e-06, ..., 3.1966723e-09,
       3.5612517e-09, 3.0947123e-09], dtype=float32)}

which is correct. Notice that the difference between the two outputs is that in the second one the class increases by 1 for each example, while in the first case the class is 0 in most places.

Why is there a difference in the predictions? Am I saving the model the wrong way?

Edit 1:

From this question, I came to know that an Estimator automatically saves checkpoints if the model_dir argument is given, and loads the same graph when the same model_dir is referred to. So I did this while saving the model:

classifier = tf.estimator.Estimator(model_fn=bag_of_words_model, model_dir="E:/models/")

I checked and found that checkpoints have been stored at E:/models/. Now, for the part where I want to restore the model, I wrote:

# Added model_dir args
classifier = tf.estimator.Estimator(model_fn=bag_of_words_model, model_dir="E:/models/")
# Predict.
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={WORDS_FEATURE: x_test}, y=y_test, num_epochs=1, shuffle=False)
predictions = classifier.predict(input_fn=test_input_fn)

The logs gave me:

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'E:/models/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x0000028240FAB518>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:From E:\ml_classif\venv\lib\site-packages\tensorflow\python\estimator\inputs\queues\feeding_queue_runner.py:62: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:From E:\ml_classif\venv\lib\site-packages\tensorflow\python\estimator\inputs\queues\feeding_functions.py:500: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
2019-01-14 19:17:51.157091: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
INFO:tensorflow:Restoring parameters from E:/models/model.ckpt-100
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
WARNING:tensorflow:From E:\ml_classif\venv\lib\site-packages\tensorflow\python\training\monitored_session.py:804: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.

which shows that the model has been successfully reconstructed from the given model_dir. I then try to predict the output on the test data, but I only get the same output as before:

{'class': 0, 'prob': array([9.8720157e-01, 1.9098983e-04, 8.6194178e-04, ..., 9.8885458e-08,
       1.0560690e-07, 1.1116919e-07], dtype=float32)}
{'class': 0, 'prob': array([9.9646854e-01, 7.3993037e-06, 1.6678940e-03, ..., 3.3662158e-08,
       3.7401023e-08, 3.9902886e-08], dtype=float32)}
{'class': 0, 'prob': array([9.9418157e-01, 2.2869966e-05, 7.2757481e-04, ..., 7.2877960e-08,
       8.5308180e-08, 8.7949694e-08], dtype=float32)}
{'class': 0, 'prob': array([9.8990846e-01, 2.0035572e-05, 5.0557905e-04, ..., 4.2098847e-08,
       4.6305473e-08, 4.8882491e-08], dtype=float32)}
{'class': 0, 'prob': array([9.3541616e-01, 1.6300696e-03, 2.8230180e-03, ..., 3.4934112e-07,
       3.5947951e-07, 3.8610020e-07], dtype=float32)}
{'class': 5, 'prob': array([4.5955207e-04, 3.9533910e-04, 2.9366053e-04, ..., 6.4991447e-08,
       6.5079021e-08, 6.8886770e-08], dtype=float32)}
{'class': 0, 'prob': array([9.2468429e-01, 4.9159536e-04, 9.2872838e-03, ..., 1.0636869e-06,
       1.1284576e-06, 1.1437518e-06], dtype=float32)}
{'class': 0, 'prob': array([9.5501184e-01, 2.6409564e-04, 3.8474586e-03, ..., 1.4077391e-06,
       1.4964197e-06, 1.4892942e-06], dtype=float32)}
{'class': 0, 'prob': array([9.4813752e-01, 2.7400412e-04, 2.2020808e-03, ..., 2.9592795e-06,
       3.0286824e-06, 3.0610188e-06], dtype=float32)}
{'class': 0, 'prob': array([9.6341538e-01, 3.4047980e-04, 2.0810752e-03, ..., 6.5900326e-07,
       6.7539651e-07, 7.0834898e-07], dtype=float32)}
{'class': 0, 'prob': array([9.9541759e-01, 7.4490390e-06, 3.9836011e-04, ..., 5.1197322e-08,
       5.6648332e-08, 5.9212919e-08], dtype=float32)}
{'class': 0, 'prob': array([9.9666804e-01, 1.2600798e-05, 3.1346193e-04, ..., 3.9119975e-08,
       4.3912351e-08, 4.7076494e-08], dtype=float32)}
{'class': 0, 'prob': array([9.9582565e-01, 2.3773579e-05, 5.5219355e-04, ..., 8.2924736e-08,
       9.1671566e-08, 9.3954029e-08], dtype=float32)}
{'class': 0, 'prob': array([9.4328243e-01, 1.5643415e-04, 3.1944674e-03, ..., 3.9115656e-07,
       4.2140312e-07, 4.4074648e-07], dtype=float32)}
{'class': 0, 'prob': array([9.9599898e-01, 1.3793178e-05, 6.0236652e-04, ..., 1.1893864e-07,
       1.3845128e-07, 1.4301372e-07], dtype=float32)}
{'class': 15, 'prob': array([1.8115035e-03, 1.0454544e-03, 2.0831774e-03, ..., 4.5647434e-06,
       5.0818121e-06, 4.9641203e-06], dtype=float32)}
{'class': 0, 'prob': array([9.9594927e-01, 9.6870117e-06, 3.7690319e-04, ..., 1.1332005e-07,
       1.2312253e-07, 1.3208249e-07], dtype=float32)}
{'class': 0, 'prob': array([9.4268161e-01, 7.6396938e-04, 3.4147443e-03, ..., 5.8237259e-07,
       5.8584078e-07, 5.9859877e-07], dtype=float32)}
{'class': 18, 'prob': array([1.2369211e-03, 7.1954611e-03, 3.4218519e-03, ..., 1.6767866e-06,
       1.5141470e-06, 1.5795833e-06], dtype=float32)}
{'class': 0, 'prob': array([9.9327940e-01, 2.4744159e-05, 8.3286857e-04, ..., 8.1387967e-08,
       9.2638246e-08, 9.4754824e-08], dtype=float32)}
{'class': 18, 'prob': array([4.3461438e-02, 7.7443835e-03, 1.0502382e-02, ..., 6.1044288e-06,
       6.4804617e-06, 6.6003668e-06], dtype=float32)}
{'class': 0, 'prob': array([9.1440409e-01, 2.1251327e-04, 1.9904026e-03, ..., 9.9065488e-08,
       1.0103827e-07, 1.0984956e-07], dtype=float32)}
{'class': 5, 'prob': array([4.2783137e-02, 1.3115143e-02, 1.6208552e-02, ..., 3.9897031e-06,
       3.9228212e-06, 4.1420644e-06], dtype=float32)}
{'class': 0, 'prob': array([9.0668356e-01, 6.9979503e-04, 4.9138898e-03, ..., 4.2717656e-07,
       4.3982755e-07, 4.7387920e-07], dtype=float32)}
{'class': 0, 'prob': array([9.3811822e-01, 1.6991694e-04, 2.0085643e-03, ..., 3.8740203e-07,
       4.0521365e-07, 4.3191656e-07], dtype=float32)}
{'class': 0, 'prob': array([9.5434970e-01, 2.1576983e-04, 2.3911290e-03, ..., 7.2219399e-07,
       7.4783770e-07, 7.9287622e-07], dtype=float32)}

Most of the classes are again 0. Why is this happening? Is there any alternative that could help me?

Finally, I got the answer. The model was saved and loaded correctly. The problem was that the x_test I was passing to the prediction with saving/loading and without saving/loading was different (I know, I am really sorry for this mistake). The x_test without saving/loading the model had values +1 compared to the x_test with saving/loading. This was suggested to me by a TensorFlow developer on GitHub, where I had opened the issue.
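
Given that root cause, a cheap sanity check before comparing the two prediction paths is to assert that both are fed byte-identical inputs. A sketch with illustrative variable names (x_test_saved for the array fed to the restored predictor, x_test_direct for the one fed to classifier.predict):

import numpy as np

# Both prediction paths must see the same dtype, shape and values; any
# preprocessing difference (here, an off-by-one in the word ids) will
# produce different predictions even though the model itself is fine.
assert x_test_saved.dtype == x_test_direct.dtype
assert x_test_saved.shape == x_test_direct.shape
assert np.array_equal(x_test_saved, x_test_direct), "test inputs differ"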

I'll start from Edit 1. According to the TF documentation:

Checkpointing Frequency

By default, the Estimator saves checkpoints in model_dir according to the following schedule:

Writes a checkpoint every 10 minutes (600 seconds). Writes a checkpoint when the train method starts (first iteration) and completes (final iteration). Retains only the 5 most recent checkpoints in the directory.

You may alter the default schedule by taking the following steps: create a tf.estimator.RunConfig object that defines the desired schedule, and when instantiating the Estimator, pass that RunConfig object to the Estimator's config argument.

Did you let your training finish properly? According to the log it doesn't seem so (since you restored model.ckpt-100 and not model.ckpt-300).
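
A quick way to see which checkpoint will actually be restored is to ask for the latest one in model_dir (a sketch using the directory from the question):

import tensorflow as tf

# Prints e.g. 'E:/models/model.ckpt-100'; if the step number is lower than
# the steps passed to classifier.train, training did not run to completion.
print(tf.train.latest_checkpoint("E:/models/"))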

If the experiment didn't take too long, I suggest you delete the contents of your saved model directory and either let the training finish the number of steps you defined when calling classifier.train, or, as suggested by the documentation, create a tf.estimator.RunConfig that you pass as an argument to the Estimator to decide when to save (see the sketch below).
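
A minimal sketch of that RunConfig suggestion (the step counts are illustrative, not recommended values):

import tensorflow as tf

run_config = tf.estimator.RunConfig(
    model_dir="E:/models/",
    save_checkpoints_steps=100,  # write a checkpoint every 100 training steps
    keep_checkpoint_max=5)       # keep only the 5 most recent checkpoints

classifier = tf.estimator.Estimator(model_fn=bag_of_words_model,
                                    config=run_config)
classifier.train(input_fn=train_input_fn, steps=300)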

I hope this will help you!

Another possible cause of wrong predictions is forgetting to standardize the input in the same way the training/validation data was standardized.

For example, if you standardized your training set by subtracting the mean and dividing by the variance, and you don't do the same to the data you want predictions for, then the predictions will be nonsense. A good clue that this is happening is that the predictions will be extremely high or extremely low values.
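
As an illustration (a sketch reusing the question's x_train/x_test names and scaling by the standard deviation): the statistics must be computed on the training set and then reused, not recomputed, on the data being predicted.

import numpy as np

# Fit the scaling on the training data only.
train_mean = x_train.mean(axis=0)
train_std = x_train.std(axis=0) + 1e-8   # avoid division by zero

x_train_scaled = (x_train - train_mean) / train_std
# Apply the *same* statistics to anything passed in for prediction.
x_test_scaled = (x_test - train_mean) / train_std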
