
python 3.x joblib simple save function

I'm trying to create a simple joblib helper function that evaluates an expression and pickles the result, first checking whether the pickle file already exists. But when I put this function in another file and import it after adding that file's directory to sys.path, I get an error.

from pathlib import Path
import joblib as jl    
def saveobj(filename, expression_obj,ignore_file = False):
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        obj = jl.load(filename)
    else:
        obj = eval(expression_obj)
        jl.dump(obj,fname,compress = True)        
    return obj

Sample call:

rf_clf = saveobj(file, "rnd_cv.fit(X_train, np.ravel(y_train))", ignore_file=True)

Error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-11-02c2cae43c5d> in <module>
      1 file = Path("rf.pickle")
----> 2 rf_clf = saveobj(file, "rnd_cv.fit(X_train, np.ravel(y_train))", ignore_file=True)

~/Dropbox/myfnlib/util_funs.py in saveobj(filename, expression_obj, ignore_file)
     37         obj = jl.load(filename)
     38     else:
---> 39         obj = eval(expression_obj)
     40         jl.dump(obj,fname,compress = True)
     41     return obj

~/Dropbox/myfnlib/util_funs.py in <module>

NameError: name 'rnd_cv' is not defined

I guess Python evaluates the expression in the scope of the function's own module, and since the objects don't exist in that scope, it throws this error. Is there a better way of doing this? I need to do this repeatedly, which is why I want a function. Thanks a lot for your help.

You can check the documentation of eval:

Help on built-in function eval in module builtins:

eval(source, globals=None, locals=None, /)

 Evaluate the given source in the context of globals and locals. The source may be a string representing a Python expression or a code object as returned by compile(). The globals must be a dictionary and locals can be any mapping, defaulting to the current globals and locals. If only globals is given, locals defaults to it.

eval accepts arguments for the global and local namespaces, so in your case you can pass the caller's namespaces in explicitly:

from pathlib import Path
import joblib as jl    
def saveobj(filename, expression_obj, globals_, locals_, ignore_file = False):
    # "global" is a reserved keyword, so the namespace parameters need other names
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        obj = jl.load(filename)
    else:
        obj = eval(expression_obj, globals_, locals_)
        jl.dump(obj,fname,compress = True)        
    return obj

The call then becomes:

rf_clf = saveobj(file, "rnd_cv.fit(X_train, np.ravel(y_train))", globals(), locals(), ignore_file=True)
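To see why the explicit namespaces matter, here is a minimal sketch that needs neither joblib nor scikit-learn. It builds a stand-in module (named util_funs here purely for illustration) with a hypothetical evaluate helper, and shows that eval inside that module cannot see the caller's variables unless the caller hands over its globals() and locals().

```python
import types

# Source of a stand-in helper module; "evaluate" is a hypothetical helper,
# not part of joblib or the original code.
helper_src = '''
def evaluate(expression, globals_=None, locals_=None):
    if globals_ is None:
        # No namespaces given: eval() uses this module's globals and
        # this function's locals, not those of the caller.
        return eval(expression)
    return eval(expression, globals_, locals_)
'''

# Create the module object so the helper gets its own global namespace,
# just as it would if it lived in a separate file on sys.path.
util_funs = types.ModuleType("util_funs")
exec(helper_src, util_funs.__dict__)


def caller():
    x = 20  # exists only inside this function
    try:
        util_funs.evaluate("x + 1")
        failed = None
    except NameError as exc:
        failed = str(exc)
    # Passing the caller's namespaces makes x visible to eval().
    ok = util_funs.evaluate("x + 1", globals(), locals())
    return failed, ok


failed, ok = caller()
print(failed)
print(ok)
```

The first call fails for the same reason as the NameError in the question; the second succeeds because the names are looked up in the namespaces supplied by the caller.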

I was about to post an answer to my own question when I saw @youkaichao's answer. Thanks a lot. Here is one more way to skin the cat (although it is limited to keyword arguments):

def saveobj(filename,func, ignore_file = False, **kwargs):
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        obj = jl.load(filename)
    else:
        obj = func(**kwargs)
        jl.dump(obj,fname,compress = True)        
    return obj

Changed Call:

file = Path("rf.pickle")
rf_clf = saveobj(file, rnd_cv.fit, ignore_file=False, X=X_train, y= np.ravel(y_train))
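The keyword-only restriction can probably be lifted by forwarding positional arguments as well. The sketch below is one possible variant, not the original function: the name cached_call and the toy slow_square function are made up for illustration, and the standard-library pickle module stands in for joblib so the example runs without extra dependencies.

```python
import pickle
import tempfile
from pathlib import Path


def cached_call(filename, func, *args, ignore_file=False, **kwargs):
    """Load a pickled result if present, otherwise call func and save it."""
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        with fname.open("rb") as fh:
            return pickle.load(fh)
    # func is called with the caller's own objects, so no eval() is needed.
    obj = func(*args, **kwargs)
    with fname.open("wb") as fh:
        pickle.dump(obj, fh)
    return obj


calls = []  # records every real computation


def slow_square(n, offset=0):
    calls.append(n)
    return n * n + offset


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "result.pickle"
    first = cached_call(target, slow_square, 4, offset=1)   # computed and saved
    second = cached_call(target, slow_square, 4, offset=1)  # loaded from disk
    # A zero-argument lambda also works and captures the caller's variables.
    third = cached_call(target, lambda: slow_square(5), ignore_file=True)

print(first, second, third, calls)
```

Because ignore_file is keyword-only here, the wrapped function cannot itself receive a keyword argument with that name; wrapping the call in a lambda sidesteps that. Passing a callable also keeps name lookup in the caller's scope, which is the problem the eval version has to work around with globals() and locals().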

I would still love to know which of the two approaches is better.
