
joblib.load __main__ AttributeError

I'm starting to dive into deploying a predictive model to a web app using Flask, and unfortunately I'm getting stuck at the starting gate.

What I did:

I pickled my model in my model.py program:

import numpy as np
from sklearn.externals import joblib

class NeuralNetwork():
    """
    Two (hidden) layer neural network model. 
    First and second layer contain the same number of hidden units
    """
    def __init__(self, input_dim, units, std=0.0001):
        self.params = {}
        self.input_dim = input_dim

        self.params['W1'] = np.random.rand(self.input_dim, units)
        self.params['W1'] *= std
        self.params['b1'] = np.zeros((units))

        self.params['W2'] = np.random.rand(units, units)
        self.params['W2'] *= std * 10  # Compensate for vanishing gradients
        self.params['b2'] = np.zeros((units))

        self.params['W3'] = np.random.rand(units, 1)
        self.params['b3'] = np.zeros((1,))

model = NeuralNetwork(input_dim=12, units=64)

#####THIS RIGHT HERE ##############
joblib.dump(model, 'demo_model.pkl')

Then I created an api.py file in the same directory as my demo_model.pkl, per this tutorial ( https://blog.hyperiondev.com/index.php/2018/02/01/deploy-machine-learning-models-flask-api/ ):

import flask
from flask import Flask, render_template, request
from sklearn.externals import joblib

app = Flask(__name__)


@app.route("/")
@app.route("/index")
def index():
    return flask.render_template('index.html')


# create endpoint for the predictions (HTTP POST requests)
@app.route('/predict', methods=['POST'])
def make_prediction():
    if request.method == 'POST':
        return render_template('index.html', label='3')


if __name__ == '__main__':
    # LOAD MODEL WHEN APP RUNS ####
    model = joblib.load('demo_model.pkl')
    app.run(host='0.0.0.0', port=8000, debug=True)

I also made a templates/index.html file in the same directory with this content:

<html>
    <head>
        <title>NN Model as Flask API</title>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1">
    </head>
    <body>
        <h1>Boston Housing Price Predictor</h1>
        <form action="/predict" method="post" enctype="multipart/form-data">
            <input type="file" name="image" value="Upload">
            <input type="submit" value="Predict"> {% if label %} {{ label }} {% endif %}
        </form>
    </body>

</html>

Running:

>> python api.py

gives me an error with the pickler:

Traceback (most recent call last):
  File "api.py", line 22, in <module>
    model = joblib.load('model.pkl')
  File "C:\Users\joshu\Anaconda3\lib\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 578, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "C:\Users\joshu\Anaconda3\lib\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 508, in _unpickle
    obj = unpickler.load()
  File "C:\Users\joshu\Anaconda3\lib\pickle.py", line 1043, in load
    dispatch[key[0]](self)
  File "C:\Users\joshu\Anaconda3\lib\pickle.py", line 1342, in load_global
    klass = self.find_class(module, name)
  File "C:\Users\joshu\Anaconda3\lib\pickle.py", line 1396, in find_class
    return getattr(sys.modules[module], name)
AttributeError: module '__main__' has no attribute 'NeuralNetwork'

Why is the main module of the program getting involved with my NeuralNetwork model? I'm very confused at the moment... any advice would be appreciated.

UPDATE:

Adding an empty class definition, class NeuralNetwork(object): pass, to my api.py program fixed the bug.

import flask
from flask import Flask, render_template, request
from sklearn.externals import joblib


class NeuralNetwork(object):
    pass


app = Flask(__name__)

If anyone would be willing to offer me an explanation of what was going on, that would be hugely appreciated!

The specific exception you're getting refers to attributes in __main__, but that's mostly a red herring. I'm pretty sure the issue actually has to do with how you dumped the instance.

Pickle does not dump the actual code of classes and functions, only their names. It includes the name of the module each one was defined in, so it can find them again. If you dump a class defined in a module you're running as a script, it will dump the name __main__ as the module name, since that's what Python uses as the name for the main module (as seen in the if __name__ == "__main__" boilerplate code).
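One way to check that only names are stored is to list the class references inside a pickle. This is a minimal sketch using the standard library's pickletools; referenced_globals is a hypothetical helper name, and OrderedDict is used purely as a convenient importable class, not because it has anything to do with the question's model.

```python
import pickle
import pickletools
from collections import OrderedDict


def referenced_globals(obj):
    """Return the "module name" references stored in obj's pickle."""
    # Protocol 2 records class references with the readable GLOBAL opcode;
    # fix_imports=False keeps the Python 3 module names unchanged.
    payload = pickle.dumps(obj, protocol=2, fix_imports=False)
    return [
        arg
        for opcode, arg, _pos in pickletools.genops(payload)
        if opcode.name == "GLOBAL"
    ]


# The pickle holds a "module name" string, not the class body.
print(referenced_globals(OrderedDict(a=1)))
```

The printed list contains the string "collections OrderedDict". For a class defined in a file that is run as a script, the module half of that pair would be __main__.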

When you run model.py as a script and pickle an instance of a class defined in it, that class will be saved as __main__.NeuralNetwork rather than model.NeuralNetwork. When you run some other module and try to load the pickle file, Python will look for the class in the __main__ module, since that's where the pickle data tells it to look. This is why you're getting an exception about attributes of __main__.
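The failure can be reproduced in a single process without Flask or joblib. The sketch below is an illustration under assumptions: dump_time_module is a made-up module name that plays the role __main__ played when model.py was run as a script, and the plain pickle module stands in for joblib.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module that defined the class at dump time.
dump_module = types.ModuleType("dump_time_module")
sys.modules["dump_time_module"] = dump_module


class NeuralNetwork:
    def __init__(self, units):
        self.units = units


# Pretend the class lives in that module, the way it lives in __main__
# when model.py is run as a script.
NeuralNetwork.__module__ = "dump_time_module"
NeuralNetwork.__qualname__ = "NeuralNetwork"
dump_module.NeuralNetwork = NeuralNetwork

payload = pickle.dumps(NeuralNetwork(units=64))

# Loading works while the named module still exposes the class.
print(pickle.loads(payload).units)

# Simulate the loading program: a module of that name exists,
# but it has no NeuralNetwork attribute.
del dump_module.NeuralNetwork
load_error = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    load_error = exc
    print("load failed:", exc)
```

The second load raises AttributeError for the same reason api.py did: the pickle names a module, and that module has no NeuralNetwork attribute at load time.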

To solve this, you probably want to change how you're dumping the data. Instead of running model.py as a script, you should probably run some other module and have it do import model, so you get the module under its normal name. (I suppose you could have model.py import itself in an if __name__ == "__main__" block, but that's super ugly and awkward.) You probably also need to avoid recreating and dumping the instance unconditionally when model is imported, since that import needs to happen when you load the pickle file (and I assume the whole point of the pickle is to avoid recreating the instance from scratch).

So remove the dumping logic from the bottom of model.py, and add a new file like this:

# new script, dump_model.py, does the creation and dumping of the NeuralNetwork

from sklearn.externals import joblib

from model import NeuralNetwork

if __name__ == "__main__":
    model = NeuralNetwork(input_dim=12, units=64)
    joblib.dump(model, 'demo_model.pkl')

When you dump the NeuralNetwork using this script, it will correctly identify model as the module the class was defined in, and so the loading code will be able to import that module and make an instance of the class correctly.

Your current "fix" for the issue (defining an empty NeuralNetwork class in the __main__ module when you are loading the object) is probably a bad solution. The instance you get from loading the pickle file will be an instance of the new class, not the original one. It will be loaded with the attributes of the old instance, but it won't have any methods or other class variables set on it (which isn't an issue with the class you've shown, but probably will be for any kind of object that's more complicated).
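The effect of the empty stand-in class can be shown with a small sketch. It rests on assumptions: original_home is a made-up module name standing in for wherever the pickle says the class lives, predict is a hypothetical method added only to have something to lose, and plain pickle is used in place of joblib.

```python
import pickle
import sys
import types

# Hypothetical module standing in for the one named inside the pickle.
home = types.ModuleType("original_home")
sys.modules["original_home"] = home


class NeuralNetwork:
    def __init__(self, units):
        self.units = units

    def predict(self, x):  # hypothetical method, for illustration only
        return x * self.units


NeuralNetwork.__module__ = "original_home"
NeuralNetwork.__qualname__ = "NeuralNetwork"
home.NeuralNetwork = NeuralNetwork
payload = pickle.dumps(NeuralNetwork(units=64))


# The "fix" from the update: an empty class published under the same name.
class EmptyNeuralNetwork:
    pass


home.NeuralNetwork = EmptyNeuralNetwork

restored = pickle.loads(payload)
print(type(restored).__name__)        # EmptyNeuralNetwork
print(restored.units)                 # 64: instance attributes survive
print(hasattr(restored, "predict"))   # False: methods are gone
```

The load succeeds, which is why the workaround appears to fix the bug, but any call such as restored.predict(...) would fail with AttributeError.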

If you are using the Keras library to build your neural network, then pickle may not work. pickle works fine for models built with scikit-learn. Save your neural network model using JSON instead.

Keras provides the ability to describe any model in JSON format with the to_json() function. This can be saved to a file and later loaded via the model_from_json() function, which will create a new model from the JSON specification.

# Assumes `model` is an existing, trained Keras model.
from keras.models import model_from_json

# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)

# serialize weights to HDF5
# (Keras 3 expects this file name to end in ".weights.h5")
model.save_weights("model.h5")
print("Saved model to disk")

# later...

# load json and create model
with open("model.json", "r") as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)

# load weights into new model
loaded_model.load_weights("model.h5")
print("Loaded model from disk")
