Loading utf-8 file with pickle in Python2

I am writing a script in Python that works on OSX (10.6) and uses Python 2.7. My commands are:

    morphcache = codecs.open('file.txt','r','utf-8')
    morphology = pickle.load(morphcache)
    morphcache.close()

It uses a text file (utf-8) generated by another site which contains newlines and characters like č, š, ž etc.

Since it uses escaped characters it reports this error:

    Traceback (most recent call last):
      File "createxml.py", line 38, in <module>
        morphology = pickle.load(morphcache)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1378, in load
        return Unpickler(file).load()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 858, in load
        dispatch[key](self)
    KeyError: 'sV\xc5\xbedeti\np1\nSVerb,\xc5\xbedeje,\xc5\xbedeti,\xc5\xbedeti,\xc5\xbedi,\xc5\xbedijo\np2\nsV\xc5\xbeupnik\np3\nVSu'
    make: *** [all] Error 1

I am looking for a way to make this work. Every solution I found says to write the text to the file differently in the first place (not as utf-8), but I cannot do that: I receive the input file already in this form.

Or should this file first be read and written in another way to disk - and then reopened to be pickled?

Thanks.

Pickle files are not text files. They contain Python object definitions (which could include unicode text objects, or str byte strings).
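As a minimal sketch of that point (the sample string is made up for illustration and is not taken from the real file), `pickle.dumps` hands back a byte string in pickle's own serialization format, and only `pickle.loads` turns it back into a text object:

```python
import pickle

# Hypothetical sample value; the real cache content is not known.
word = u"\u017edeti"  # "ždeti"

# Serialize with the ASCII-based protocol 0.
payload = pickle.dumps(word, protocol=0)

# The serialized form is a byte string, not decoded text.
print(type(payload))

# Only unpickling restores the original unicode object.
restored = pickle.loads(payload)
```

Decoding `payload` as utf-8 before handing it to pickle would not be a step pickle expects; it wants the bytes as they are.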

Open your file in binary mode and load that:

    import pickle

    with open('file.txt', 'rb') as morphcache:
        morphology = pickle.load(morphcache)
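For a self-contained check of the binary-mode approach, the following sketch writes a small pickle to a temporary file and reads it back. The dictionary content and the file name `morphcache.pkl` are assumptions chosen for illustration; they only imitate the kind of data seen in the traceback.

```python
import os
import pickle
import tempfile

# Hypothetical sample data standing in for the real morphology cache.
sample = {u"\u017edeti": u"Verb,\u017edeje,\u017edeti"}

# Temporary location so the sketch does not touch any real file.
path = os.path.join(tempfile.mkdtemp(), "morphcache.pkl")

# Write the pickle as bytes: mode 'wb', no codecs layer.
with open(path, "wb") as out:
    pickle.dump(sample, out, protocol=0)

# Read it back the same way: mode 'rb', so pickle sees the raw bytes.
with open(path, "rb") as morphcache:
    morphology = pickle.load(morphcache)

# The non-ASCII keys and values survive the round trip unchanged.
assert morphology == sample
```

Note that this only helps if the input file really is a pickle. If the other site produces a plain text format, it has to be parsed as text instead.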
