
How to load object in pickle?

I am trying to load a pickled dictionary but keep getting an error like this:

TypeError: a bytes-like object is required, not '_io.BufferedReader'

Below is the code that reads and writes the pickled object. I dump the pickle on a Linux workstation running Python 2.7.12. The file is then transferred to a Mac running Python 3.6.4, where readTrueData() is executed, resulting in the error above.

def readTrueData(name):
    fName = str('trueData/'+name+'.pkl')
    f = open(fName,'rb')
    #    print(f)
    #    print(type(f))
    pC = pickle.loads(f)
    return pC

def storeTrueData(atomicConfigs, name):
    import quippy
    storeDic = {}
    #rangeKeys = len(atomicConfigs)
    #print(rangeKeys)
    qTrain = quippy.AtomsList(atomicConfigs)
    print(len(qTrain))
    rangeKeys = len(qTrain)
    print(rangeKeys)
    for i in range(rangeKeys):
        #configConsidered = atomicConfigs[i]
        trueForce = np.array(qTrain[i].force).T
        storeDic[i] = trueForce
    f = open("trueData/"+ name + ".pkl", "wb")
    pickle.dump(storeDic, f)
    f.close()    
    return None

UPDATE

Following the suggestions in the comments, I tried each of the following changes:

a.) pC = pickle.load(f)
b.) pC = pickle.loads(f.read())

In both cases I got the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x87 in position 1: ordinal not in range(128)

You need to use pickle.load(...) to read the data if you open the file that way.

Source: https://docs.python.org/3/library/pickle.html

pC = pickle.loads(f.read()) is what you're looking for, but you should really use a with statement:

with open(fName, 'rb') as f: 
    pC = pickle.loads(f.read())

This ensures your file is closed properly, which matters here because your function never calls f.close().

Your first problem is caused by a mismatch between the argument type and the chosen load* method; loads expects bytes objects, load expects the file object itself. Passing the file object to loads is what caused your error.
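As a minimal sketch of that pairing (the temporary file and the sample dictionary are made up for illustration, not taken from the question), the following reads the same pickle once with load on a file object and once with loads on bytes, then reproduces the TypeError by mixing the two up:

```python
import os
import pickle
import tempfile

# Hypothetical sample data standing in for the question's dictionary.
sample = {0: [1.0, 2.0], 1: [3.0, 4.0]}

# Write the pickle to a throwaway file.
fd, path = tempfile.mkstemp(suffix=".pkl")
with os.fdopen(fd, "wb") as f:
    pickle.dump(sample, f)

# load() takes the open file object itself.
with open(path, "rb") as f:
    from_file = pickle.load(f)

# loads() takes a bytes object, e.g. the result of f.read().
with open(path, "rb") as f:
    from_bytes = pickle.loads(f.read())

# Passing the file object to loads() raises the TypeError from the question.
with open(path, "rb") as f:
    try:
        pickle.loads(f)
        mismatch_error = None
    except TypeError as exc:
        mismatch_error = exc

os.remove(path)
print(from_file == sample, from_bytes == sample, type(mismatch_error).__name__)
```

Both correct forms return the same object; load is usually preferable for files because it avoids reading the whole file into memory first.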

Your other problem is a cross-version compatibility issue involving numpy and datetime types. Python 2 pickles str objects with no specified encoding, but Python 3 must unpickle them with a known encoding (or 'bytes', to get raw bytes rather than str). For numpy and datetime types, you're required to pass encoding='latin-1', as the pickle documentation says:

Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to 'ASCII' and 'strict', respectively. The encoding can be 'bytes' to read these 8-bit string instances as bytes objects. Using encoding='latin1' is required for unpickling NumPy arrays and instances of datetime, date and time pickled by Python 2.
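To see the effect of the encoding argument without a Python 2 machine at hand, here is a small sketch that runs on Python 3. The byte string is a hand-written protocol 2 stream of the kind Python 2 produces for an 8-bit str; it is an illustrative stand-in, not data from the question:

```python
import pickle

# Hand-written protocol 2 stream for the Python 2 str '\x87\xff':
# PROTO 2, SHORT_BINSTRING (length 2), BINPUT 0, STOP.
# Illustrative stand-in only, not the question's actual file.
py2_stream = b"\x80\x02U\x02\x87\xffq\x00."

# The default encoding is ASCII, so bytes >= 0x80 cannot be decoded.
try:
    pickle.loads(py2_stream)
    default_error = None
except UnicodeDecodeError as exc:
    default_error = exc

# latin-1 maps every byte to the code point with the same value.
as_text = pickle.loads(py2_stream, encoding="latin-1")

# 'bytes' skips decoding and returns the raw bytes.
as_bytes = pickle.loads(py2_stream, encoding="bytes")

print(type(default_error).__name__, repr(as_text), repr(as_bytes))
```

The default call fails with the same kind of UnicodeDecodeError shown in the question's update, while the other two calls succeed.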

In any event, the fix is to change:

def readTrueData(name):
    fName = str('trueData/'+name+'.pkl')
    f = open(fName,'rb')
    #    print(f)
    #    print(type(f))
    pC = pickle.loads(f)
    return pC

to:

def readTrueData(name):
    fName = str('trueData/'+name+'.pkl')
    with open(fName, 'rb') as f:  # with statement avoids file leak
        # Match load with file object, and provide encoding for Py2 str
        return pickle.load(f, encoding='latin-1')

For correctness and performance reasons, I'd also recommend changing pickle.dump(storeDic, f) to pickle.dump(storeDic, f, protocol=2) on the Python 2 machine, so the stream is generated with a more modern pickle protocol, one which can efficiently pickle numpy arrays among other things. Protocol 0, the Python 2 default, can't use the top bit of each byte (it's ASCII compatible), which means raw binary data bloats dramatically in protocol 0, requiring a ton of bit twiddling, where protocol 2 can dump it raw. Protocol 2 is also the only Py2 protocol that efficiently pickles new style classes, and the only one that can properly pickle certain types of instances (stuff using __slots__ / __new__ and the like) at all.
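A rough way to see the size difference is to pickle the same data with both protocols and compare lengths. This sketch uses a plain list of floats rather than numpy arrays (an assumption made to keep it dependency-free) and runs on Python 3 as well, so the exact numbers will differ from what the question's data would give:

```python
import pickle

# Illustrative data only: 1000 floats standing in for numeric array contents.
values = [i / 7.0 for i in range(1000)]

# Protocol 0 writes each float as ASCII text; protocol 2 writes 8 raw bytes.
size_p0 = len(pickle.dumps(values, protocol=0))
size_p2 = len(pickle.dumps(values, protocol=2))

# Both streams decode back to the same list.
round_trip_ok = (
    pickle.loads(pickle.dumps(values, protocol=0)) == values
    and pickle.loads(pickle.dumps(values, protocol=2)) == values
)

print(size_p0, size_p2, round_trip_ok)
```

The protocol 2 stream should come out noticeably smaller, and both decode to identical data.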

I'd also recommend the script begin with:

try:
    import cPickle as pickle
except ImportError:
    import pickle

as on Python 2, pickle is implemented in pure Python, and is both slow and unable to use some of the more efficient pickle codes. On Python 3, cPickle is gone, but pickle is automatically accelerated. Between that and using protocol 2, pickling on the Python 2 machine should run much faster, and produce much smaller pickles.

