
Loading a UTF-8 file with pickle in Python 2

I am writing a script in Python 2.7 that runs on OS X (10.6). My commands are:

    morphcache = codecs.open('file.txt','r','utf-8')
    morphology = pickle.load(morphcache)
    morphcache.close()

It uses a UTF-8 text file generated by another site, which contains newlines and characters like č, š, ž etc.

Since it uses escaped characters, it reports this error:

    Traceback (most recent call last):
      File "createxml.py", line 38, in <module>
        morphology = pickle.load(morphcache)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1378, in load
        return Unpickler(file).load()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 858, in load
        dispatch[key](self)
    KeyError: 'sV\xc5\xbedeti\np1\nSVerb,\xc5\xbedeje,\xc5\xbedeti,\xc5\xbedeti,\xc5\xbedi,\xc5\xbedijo\np2\nsV\xc5\xbeupnik\np3\nVSu'
    make: *** [all] Error 1

I am looking for a way to make this work. Every solution I found says to write the text to the file differently in the first place (not as UTF-8), but I cannot do that: I receive the input file already in this form.

Or should this file first be read and written back to disk in another way, and then reopened to be unpickled?

Thanks.

Pickle files are not text files. They contain serialized Python object definitions (which may include unicode text objects or str byte strings), so they must not be decoded as UTF-8 text before unpickling.
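
To make that concrete, here is a minimal sketch showing that pickle produces a byte string even when the pickled object holds unicode text. The dictionary contents and the names `sample` and `payload` are made up for illustration and are not taken from the question's file.

```python
import pickle

# Hypothetical sample data: unicode keys and values with non-ASCII characters.
sample = {u"\u017eeti": u"Verb", u"\u010das": u"Noun"}

# Protocol 0 is the oldest, ASCII-based pickle protocol.
payload = pickle.dumps(sample, protocol=0)

# The result is a byte string (str on Python 2, bytes on Python 3),
# not decoded text, whatever characters the objects contain.
is_byte_string = isinstance(payload, bytes)

# Unpickling the untouched bytes gives back the original unicode objects.
restored = pickle.loads(payload)
```

Because the payload is bytes, any layer that decodes it to text before pickle sees it changes what the unpickler reads.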

Open your file in binary mode and load that:

    import pickle
    # Pickle data is a byte stream, so open the file in binary mode.
    with open('file.txt', 'rb') as morphcache:
        morphology = pickle.load(morphcache)
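
As for rewriting the file to disk first: that should not be needed if the file is a valid pickle. The following round-trip sketch illustrates this under that assumption. The temporary file and the sample dictionary are hypothetical stand-ins for `file.txt` and its contents.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the morphology data in the question.
morphology = {u"\u017eeti": u"Verb", u"\u017eupnik": u"Noun"}

# Write the pickle in binary mode to a temporary file.
fd, path = tempfile.mkstemp(suffix=".pkl")
with os.fdopen(fd, "wb") as out:
    pickle.dump(morphology, out, protocol=0)

# Read it back in binary mode; no codec is applied to the stream.
with open(path, "rb") as morphcache:
    loaded = pickle.load(morphcache)

# Clean up the temporary file.
os.remove(path)
```

The loaded dictionary compares equal to the original, non-ASCII keys included, without any explicit UTF-8 decoding step.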
