Buffered/batch serialization in Python?
I have an algorithm that iteratively creates a very large, highly nested dictionary. I would like to buffer parts of this dictionary and then periodically stream the buffer to disk, so that I can re-create the whole dictionary at another time.
It seems like pickle is intended for one-pass serialization. Is there a way to serialize a dictionary in batches to a single output stream?
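It is: each call to `pickle.dump` appends one self-contained pickle to the stream, and repeated `pickle.load` calls on the same stream read them back in order, raising `EOFError` once the stream is exhausted. A minimal in-memory sketch (the chunk contents here are illustrative):

```python
import io
import pickle

# Each pickle.dump call writes one independent pickle frame; frames
# can be appended to the same stream and read back one per load.
buf = io.BytesIO()
for chunk in ({'a': 1}, {'b': 2}, {'c': 3}):
    pickle.dump(chunk, buf)

buf.seek(0)
merged = {}
try:
    while True:
        merged.update(pickle.load(buf))  # one chunk per load
except EOFError:
    pass  # end of stream reached

print(merged)  # {'a': 1, 'b': 2, 'c': 3}
```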
OK, it looks like the following will partially solve the problem:
import pickle

with open('file', 'ab') as f:
    while <stopping condition>:
        <generate the next (key, value) pair>
        # Dump a one-item dict so dict.update() works on reload.
        pickle.dump({key: value}, f)
Now, to reconstruct the whole dictionary, you just do the following:
import pickle

fullMapping = {}
with open('file', 'rb') as f:
    while True:
        try:
            fullMapping.update(pickle.load(f))
        except EOFError:
            # No more pickled chunks; the 'with' block closes the file.
            break
This will reconstitute the full dictionary when run.
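Putting both halves together, here is a self-contained round trip. A temporary file stands in for `'file'` so the sketch runs anywhere, and a fixed sequence of pairs stands in for the real algorithm's generator and stopping condition:

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'mapping.pkl')

# Write phase: stream small {key: value} chunks to disk in append mode.
with open(path, 'ab') as f:
    for key, value in (('x', [1, 2]), ('y', {'nested': True}), ('z', 3)):
        pickle.dump({key: value}, f)

# Read phase: replay every chunk and merge them into one dictionary.
fullMapping = {}
with open(path, 'rb') as f:
    while True:
        try:
            fullMapping.update(pickle.load(f))
        except EOFError:
            break  # end of the pickle stream

print(fullMapping)  # {'x': [1, 2], 'y': {'nested': True}, 'z': 3}
```

Because each chunk is written independently, the write phase can be interleaved with the algorithm and repeated across multiple runs of the program; the read phase simply replays every chunk in order.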