
How to dill (pickle) to file?

The question may seem a little basic, but I wasn't able to find anything on the internet that I understood. How do I store something that I pickled with dill?

I have come this far for saving my construct (a pandas DataFrame that also contains custom classes):

import dill
dill_file = open("data/2017-02-10_21:43_resultstatsDF", "wb")
dill_file.write(dill.dumps(resultstatsDF))
dill_file.close()

and for reading:

dill_file = open("data/2017-02-10_21:43_resultstatsDF", "rb")
resultstatsDF_out = dill.load(dill_file.read())
dill_file.close()

but when reading I get the error:

TypeError: file must have 'read' and 'readline' attributes

How do I do this?


EDIT for future readers: After using this approach (pickling my DataFrame) for a while, I now refrain from doing so. As it turns out, different program versions (including changes to the objects stored in the dill file) can make it impossible to recover the pickled file. Now I make sure that everything I want to save can be expressed as a string (as efficiently as possible), ideally a human-readable one. I store my data as CSV, and objects in CSV cells can be represented as JSON. That way I make sure that my files will be readable in the months and years to come. Even if the code changes, I can rewrite the encoders by parsing the strings, and I can understand the CSV by inspecting it manually.
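As a rough illustration of the CSV-with-JSON-cells idea described in the edit above, here is a minimal sketch using only the standard library. The `Stats` class, the column names, and the sample rows are hypothetical stand-ins, not the asker's actual data.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass


@dataclass
class Stats:
    """Hypothetical custom class standing in for the objects in the DataFrame."""
    mean: float
    count: int


rows = [("run_a", Stats(0.5, 10)), ("run_b", Stats(0.75, 20))]

# Write: each custom object becomes a JSON string inside one CSV cell.
buffer = io.StringIO()  # an in-memory stand-in for a real file
writer = csv.writer(buffer)
writer.writerow(["name", "stats"])
for name, stats in rows:
    writer.writerow([name, json.dumps(asdict(stats))])

# Read: parse the JSON cell and rebuild the object explicitly.
buffer.seek(0)
reader = csv.DictReader(buffer)
restored = [(row["name"], Stats(**json.loads(row["stats"]))) for row in reader]

print(restored == rows)
```

Because the decoding step is written out by hand, it can be adapted if the class changes later, which is the robustness the edit is after.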

Just pass it the file object itself, without calling read():

resultstatsDF_out = dill.load(dill_file)

You can also dill to a file like this:

with open("data/2017-02-10_21:43_resultstatsDF", "wb") as dill_file:
    dill.dump(resultstatsDF, dill_file)
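Putting the write and the read together, a complete round trip might look like the sketch below. It assumes dill is installed and falls back to the standard library's pickle, which offers the same `dump`/`load` calls, if it is not. The `results` dictionary and the temporary path are hypothetical stand-ins for `resultstatsDF` and the data directory.

```python
import os
import tempfile

try:
    import dill as serializer  # third-party package, assumed to be installed
except ImportError:
    import pickle as serializer  # stdlib fallback with the same dump/load calls

# Hypothetical stand-in for resultstatsDF.
results = {"run": 1, "scores": [0.5, 0.75]}

# Hypothetical location; a temporary directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "resultstats.dill")

# Write: hand the open binary file to dump().
with open(path, "wb") as dill_file:
    serializer.dump(results, dill_file)

# Read: hand the open binary file to load(), not the result of read().
with open(path, "rb") as dill_file:
    restored = serializer.load(dill_file)

print(restored == results)
```

Note that a file name containing colons, as in the question, is not valid on Windows, so a different timestamp format may be safer if the code needs to be portable.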

So:

dill.dump(obj, open_file)

writes to a file directly. Whereas:

dill.dumps(obj) 

serializes obj to a bytes object, which you can then write to a file yourself.

Likewise:

dill.load(open_file)

reads from a file, and:

dill.loads(serialized_obj)

constructs an object from a serialized object, which you could have read from a file.
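The difference between the two pairs can be seen in a small sketch. It assumes dill is installed and falls back to the standard library's pickle (same four function names) otherwise. The sample object is hypothetical, and the exact exception raised when bytes are passed to `load` can vary between library versions, so the sketch catches both likely types.

```python
import io

try:
    import dill as serializer  # third-party package, assumed to be installed
except ImportError:
    import pickle as serializer  # stdlib fallback with the same function names

obj = {"name": "example", "values": [1, 2, 3]}  # hypothetical sample object

# dumps()/loads() work on bytes held in memory.
payload = serializer.dumps(obj)
copy_from_bytes = serializer.loads(payload)

# load() expects a file-like object, so bytes must be wrapped first.
copy_from_file = serializer.load(io.BytesIO(payload))

# Passing raw bytes to load() reproduces the mistake from the question.
try:
    serializer.load(payload)
    load_rejected_bytes = False
except (TypeError, AttributeError):
    load_rejected_bytes = True

print(copy_from_bytes == obj, copy_from_file == obj, load_rejected_bytes)
```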

It is recommended to open a file using the with statement.

Here:

with open(path) as fobj:
    pass  # do something with fobj

has the same effect as:

fobj = open(path)
try:
    pass  # do something with fobj
finally:
    fobj.close()

The file will be closed as soon as you leave the indented block of the with statement, even if an exception is raised.
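A small sketch can confirm this behavior. The file name and the simulated error are hypothetical and only serve to trigger an exception inside the block.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

try:
    with open(path, "w") as fobj:
        fobj.write("partial data")
        raise ValueError("simulated failure inside the with block")
except ValueError:
    pass  # the exception still propagates out of the with block

# The name fobj is still bound, but the file behind it has been closed.
print(fobj.closed)
```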
