
KeyError when loading pickled scikit-learn model using joblib

I have an object that contains two scikit-learn models, an IsolationForest and a RandomForestClassifier, that I would like to pickle and later unpickle and use to produce predictions. Apart from the two models, the object contains a couple of StandardScalers and a couple of Python lists.

Pickling this object using joblib is unproblematic, but when I try to unpickle it later I get the following exception:

Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/home/(...)/python3.5/site-packages/joblib/numpy_pickle.py", line 578, in load
   obj = _unpickle(fobj, filename, mmap_mode)
 File "/home/(...)/python3.5/site-packages/joblib/numpy_pickle.py", line 508, in _unpickle
   obj = unpickler.load()
 File "/usr/lib/python3.5/pickle.py", line 1039, in load
   dispatch[key[0]](self)
KeyError: 0

The same application both pickles and unpickles the object, so the versions of scikit-learn, joblib and other libraries are the same. I'm not sure where to start debugging, given the vague error. Any ideas or pointers?
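One way to read the vague error: the traceback ends in the pure-Python unpickler in pickle.py, which looks up each opcode byte in a dispatch table. `KeyError: 0` therefore suggests that the loader met a byte with value 0 where it expected a pickle opcode, i.e. the stream is not in a format that this unpickler understands. The following is a minimal sketch that reproduces the same kind of error with the standard library only. `pickle._Unpickler` is a private class, used here purely for illustration, and the byte string is made up.

```python
import io
import pickle


def first_opcode_error(data: bytes):
    """Feed bytes to the pure-Python unpickler and return the exception raised, if any."""
    try:
        # Private class, used for illustration only: it is the pure-Python
        # implementation that appears at the bottom of the traceback.
        pickle._Unpickler(io.BytesIO(data)).load()
    except Exception as exc:
        return exc
    return None


# A made-up stream whose first byte is 0, which is not a valid pickle opcode.
err = first_opcode_error(b"\x00not a pickle")
print(type(err).__name__, err.args)

# A well-formed pickle loads without raising.
ok = first_opcode_error(pickle.dumps([1, 2, 3]))
print(ok)
```

If this reading is right, the useful debugging question is what wrote the file and in which format, rather than what is wrong with the models inside it.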

The solution to this was pretty banal: without being aware of it, I was using the version of joblib in sklearn.externals.joblib for the pickling, but a newer version of joblib for unpickling the object. The problem was resolved when I used the newer version of joblib for both tasks.

Same for me: I had exported the model using from sklearn.externals import joblib and then tried to load it using import joblib.

Mine was interesting. I was working with git-lfs, so the files had been changed and joblib couldn't open them. I needed to run git lfs pull to get the actual files. So even if you are using compatible joblib versions, make sure your files have not been changed somehow!
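A quick way to check for this case, sketched under the assumption that the "changed" files were Git LFS pointer files (small text stand-ins that are checked out until git lfs pull fetches the real content). Pointer files begin with a fixed version line, so the first bytes are enough to tell them apart. The function name `is_git_lfs_pointer` is hypothetical.

```python
# First line of a Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_git_lfs_pointer(path) -> bool:
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

Calling this on the model path before loading it gives a clearer hint than the KeyError: if it returns True, the file is a pointer and git lfs pull is probably what is missing.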

For me, the same version of joblib was used for dumping and loading, but I saved the file under Python 3.7.4 and tried to load it with Python 3.7.6, which raised the same KeyError.
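Since several answers here come down to a mismatch between the saving and the loading environment, one option is to record the environment when the model is saved and compare it before loading. The following is a rough sketch using only the standard library; the helper names `environment_stamp` and `stamp_mismatches` are hypothetical, and the default package list is just an assumption about what matters for this question.

```python
import platform
from importlib import metadata


def environment_stamp(packages=("joblib", "scikit-learn")):
    """Collect the Python version and the installed versions of the given packages."""
    stamp = {"python": platform.python_version()}
    for name in packages:
        try:
            stamp[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            stamp[name] = None  # package is not installed in this environment
    return stamp


def stamp_mismatches(saved, current):
    """Return {key: (saved_value, current_value)} for every entry that differs."""
    keys = sorted(set(saved) | set(current))
    return {k: (saved.get(k), current.get(k)) for k in keys if saved.get(k) != current.get(k)}


# Made-up stamps mirroring the Python 3.7.4 vs 3.7.6 case described above;
# the joblib version number is invented for the example.
saved = {"python": "3.7.4", "joblib": "0.14.1"}
current = {"python": "3.7.6", "joblib": "0.14.1"}
print(stamp_mismatches(saved, current))  # {'python': ('3.7.4', '3.7.6')}
```

The stamp could be written as JSON next to the model file. Note that it only compares installed distributions, so on its own it would not detect the case where the model was saved through sklearn.externals.joblib and loaded through the standalone joblib.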

In my case, I was trying to load an XGBoost model. I found out that XGB is not compatible with other sklearn models, so I did the following:

from xgboost import XGBClassifier
import joblib

def get_model(model_path):
    if 'xgb' in model_path:
        # XGBoost models are restored with XGBoost's own loader
        xgb_model = XGBClassifier()
        xgb_model.load_model(model_path)
        model = xgb_model
    else:
        # get_obj was not defined in the original post; joblib.load is assumed here
        model = joblib.load(model_path)
    return model

xgb = get_model('Models/xgb_v1.pkl')  # an XGBoost model

tree = get_model('Models/dt_v1.pkl')  # a decision tree


 