
Compressing a pkl file

My requirement is to convert a pkl file to a base64 string so that I can return a JSON file containing this string along with some other contents.

{
    'pkl_file': 'pkl_as_base64_string',
    'content1': 'content1_as_base64_string',
    'content2': 'content2_as_base64_string',
                     .
                     .
}

I have tried the following code, using https://stackoverflow.com/a/26349372/9316658 as the reference:

with open(DIR_PATH + 'd885d7a4bbb742cbb397c2642339e950.pkl', 'rb') as f:
    data = pickle.load(f)
    serialized_str = base64.b64encode(pickle.dumps(data))
    print serialized_str

I get this when I execute the above code:

Traceback (most recent call last):
File "/home/bhargav/PycharmProjects/Test/export_import.py", line 8, in <module>
    data = pickle.load(f)
ImportError: No module named ml.model.project_model

When I open the pkl file in a text editor, these are the first few lines:

(iml.model.project_model
ProjectModel
p0
(dp1
S'project_predict_pipe'
p2
(iml.pipeline.base
ICVPipeline
p3
(dp4
S'processors'
p5
(lp6
(iml.pi.file.pdf_to_img_pi
PdfFileConvertPI
p7
(dp8
S'process'
p9
Nsba(iml.pi.ocr.file_ocr_pi

I am not sure why Python is interpreting the text inside the pkl file as Python commands (I am new to Python programming and have never dealt with pkl files before). Also, the pkl file is huge (1.2 GB). How do I achieve the pkl to base64 conversion in the most efficient way possible? Any help is appreciated. TIA

The problem is probably that the pkl references a type or class that is not known in your environment. If you wrote this file, just import or declare the missing type (probably from ml.model.project_model, going by the traceback).
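To see which modules and classes a pickle refers to without actually loading it, the standard library's pickletools module can walk the opcodes. The sketch below is only an illustration: referenced_globals is a hypothetical helper name, and the sample pickle uses fractions.Fraction as a stand-in for the real ProjectModel file. It is written for Python 3 and only looks at GLOBAL and INST opcodes, which is what older protocols such as the one in the question use.

```python
import pickle
import pickletools
from fractions import Fraction


def referenced_globals(pickle_source):
    """List the 'module name' strings a pickle refers to, without unpickling.

    pickle_source may be a bytes object or an open binary file; a file is
    read incrementally, so a very large pickle is not loaded at once.
    """
    found = []
    for opcode, arg, _pos in pickletools.genops(pickle_source):
        # GLOBAL and INST carry the module and class name as their argument.
        if opcode.name in ("GLOBAL", "INST"):
            found.append(arg)
    return found


# Stand-in data: a small pickle that refers to fractions.Fraction.
sample = pickle.dumps(Fraction(1, 3), protocol=2)
names = referenced_globals(sample)
print(names)
```

For the file in the question, the same helper could be called with the open file object (opened in 'rb' mode) to list every module that must be importable before pickle.load() can succeed.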

Anyway, what your code does is convert the object stored in the pkl to base64, rather than the file itself as you described. For example, if the pkl contains a dictionary d, you were trying to get a base64 encoding of d. But b64encode expects a string or buffer rather than an arbitrary object, so that route means unpickling d and serializing it again, and the unpickling step is what fails here.

So I think what you really want is to take the pkl file that d was dumped to (the file you already have) and convert the file's contents to base64. For this you don't need to use dump (or load); just do:

import base64

# DIR_PATH is the same directory prefix used in the question
with open(DIR_PATH + 'd885d7a4bbb742cbb397c2642339e950.pkl', 'rb') as f:
    serialized_str = base64.b64encode(f.read())
print(serialized_str)
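Since the file in the question is about 1.2 GB, reading it in one call and encoding it in one call keeps both the raw bytes and the base64 text in memory. One possible alternative, sketched below under assumptions, is to encode in chunks whose size is a multiple of 3 bytes, so that no padding appears in the middle of the output. The function name b64encode_stream and the chunk size are illustrative choices, and the sketch assumes read() returns a full chunk except at end of file, which holds for regular files and BytesIO. It is written for Python 3.

```python
import base64
import io


def b64encode_stream(src, dst, chunk_size=3 * 1024 * 1024):
    """Base64-encode src into dst chunk by chunk.

    chunk_size must be a multiple of 3: every 3 input bytes map to exactly
    4 output characters, so only the final chunk can produce '=' padding.
    """
    if chunk_size % 3 != 0:
        raise ValueError("chunk_size must be a multiple of 3")
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        dst.write(base64.b64encode(block))


# Small in-memory demonstration; real use would pass open binary files.
payload = bytes(range(256)) * 5
src = io.BytesIO(payload)
dst = io.BytesIO()
b64encode_stream(src, dst, chunk_size=300)
streamed = dst.getvalue()
```

The streamed result is byte-for-byte the same as encoding the whole payload at once. Note that base64 output is about one third larger than the input, so this reduces peak memory but does not shrink the data.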

On the receiving side, decode the base64 string (using b64decode), write the result to a file, and open that file with pickle.load() to get the original object (d in my example). This will work provided the receiver has the ml.model.project_model module available.
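The whole round trip can be sketched as follows, as a minimal illustration rather than a definitive implementation. A small dictionary stands in for the real pkl contents, the 'pkl_file' key follows the JSON layout from the question, and pickle.loads is used on the decoded bytes instead of writing a temporary file; that substitution is my choice, not something the steps above require. The sketch targets Python 3, where b64encode returns bytes that must be decoded to text before being placed in JSON.

```python
import base64
import json
import pickle

# Stand-in for the bytes read from the .pkl file on the sending side.
original = {"weights": [1, 2, 3], "name": "demo"}
pkl_bytes = pickle.dumps(original)

# Sender: base64-encode the raw file bytes and embed them in a JSON document.
body = json.dumps({"pkl_file": base64.b64encode(pkl_bytes).decode("ascii")})

# Receiver: pull the string out of the JSON, decode it, and unpickle.
decoded = base64.b64decode(json.loads(body)["pkl_file"])
restored = pickle.loads(decoded)
```

Here decoded is identical to the bytes that were sent, so writing it to a file and calling pickle.load() on that file would give the same object.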

