Persist an object to either local file system or to S3

I need a function that persists an object (a model) to either the local file system or an S3 bucket. The destination is determined by the environment variable MODELS_DIR. I have two versions: the first is a little longer, and I am fairly confident it is correct. The second is shorter, but I am worried that not using the with statement is actually wrong.

import os

import joblib
import s3fs


def persist_model(model, model_name):
    """ VERSION 1
    Persist `model` under the name `model_name` to the environment variable
    `MODELS_DIR` (having a trailing '/').
    """
    MODELS_DIR = os.getenv('MODELS_DIR')

    if MODELS_DIR.startswith('s3://'):
        s3 = s3fs.S3FileSystem()
        with s3.open(MODELS_DIR[5:] + model_name, 'wb') as f:
            joblib.dump(model, f)
    else:
        with open(MODELS_DIR + model_name, 'wb') as f:
            joblib.dump(model, f)

and:

def persist_model(model, model_name):
    """VERSION 2
    Persist `model` under the name `model_name` to the environment variable
    `MODELS_DIR` (having a trailing '/').
    """
    MODELS_DIR = os.getenv('MODELS_DIR')

    if MODELS_DIR.startswith('s3://'):
        s3 = s3fs.S3FileSystem()
        f = s3.open(MODELS_DIR[5:] + model_name, 'wb')
    else:
        f = open(MODELS_DIR + model_name, 'wb')

    joblib.dump(model, f)

Is the second version safe to use?

Yes and no. Normally you need to close the file after you have written (dumped) your content to it. When you use the with statement, Python takes care of that for you. If you skip the with, you have to call f.close() yourself.
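To see the difference, here is a minimal sketch using only the standard library. It assumes a local temporary file, and `pickle.dump` stands in for `joblib.dump` so that it runs without extra packages; the `closed` attribute shows whether the file handle was released.

```python
import os
import pickle
import tempfile

# Hypothetical target path in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'model.pkl')

# With the with statement, the file is closed when the block exits.
with open(path, 'wb') as f:
    pickle.dump({'weights': [1, 2, 3]}, f)
print(f.closed)  # True

# Without it, the file stays open until close() is called explicitly.
g = open(path, 'wb')
pickle.dump({'weights': [1, 2, 3]}, g)
print(g.closed)  # False
g.close()
```

Until the file is closed, buffered data may not have been written out, which is the main risk of version 2.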

Also, to make sure the file gets closed even if an error occurs, you would need a try-finally construct. So the second version, done correctly, would end up considerably longer than the first one.
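For illustration, version 2 with explicit cleanup could look roughly like the sketch below. Two things here are assumptions and not part of the question: the `dump` parameter (defaulting to `pickle.dump` as a stand-in for `joblib.dump`) and the lazy `s3fs` import, both added only so the local branch runs without third-party packages.

```python
import os
import pickle


def persist_model(model, model_name, dump=pickle.dump):
    """Version 2 with try-finally; MODELS_DIR must have a trailing separator."""
    models_dir = os.environ['MODELS_DIR']  # raises KeyError if unset

    if models_dir.startswith('s3://'):
        import s3fs  # imported lazily so the local branch works without it
        f = s3fs.S3FileSystem().open(models_dir[5:] + model_name, 'wb')
    else:
        f = open(models_dir + model_name, 'wb')

    try:
        dump(model, f)
    finally:
        f.close()  # runs whether dump succeeded or raised
```

This is already longer than version 1, which is why the with statement is usually the better choice.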

If you would like to avoid the code duplication, I would suggest a function selector:

def persist_model(model, model_name):

    def get_file_opener(path):
        # Return the matching open function and the path it expects.
        if path.startswith('s3://'):
            # Drop the 's3://' scheme for S3 only, as in version 1.
            return s3fs.S3FileSystem().open, path[5:]
        return open, path

    models_dir = os.getenv('MODELS_DIR')
    opener, base_path = get_file_opener(models_dir)
    with opener(base_path + model_name, 'wb') as f:
        joblib.dump(model, f)
