Read/write/update an object without loading it into memory
I have been trying out the klepto package to write, read, and update my object on the hard disk, aiming to avoid the "out of memory" issues that I experienced when training my model on my dataset. From my understanding, klepto lets me store my data with a key-value mechanism. But I am not sure whether I can directly update an object when I load the data back from the klepto archive: for example, appending a value to a list without loading the whole object into memory, to avoid the "out of memory" problem.
Here is a sample of how I save the data (please correct me if this is not the correct way to set it up):
from klepto.archives import file_archive

arch = file_archive('test.txt')
arch['a'] = [3,4,5,6,7]
arch.dump()    # dump to the file archive
arch.pop('a')  # delete from memory
I'm the klepto author. If I understand what you want, it looks like you have set it up correctly. The critical keyword is cached. If you use cached=True, then the archive is constructed as an in-memory cache with a manually-synchronized file backend. If you use cached=False, then there's no in-memory cache... you just access the file archive directly.
Python 3.7.16 (default, Dec 7 2022, 05:04:27)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from klepto.archives import *
>>> arch = file_archive('test.txt', cached=True)
>>> arch['a'] = [3,4,5,6,7]
>>> arch.dump() # dump to file archive
>>> arch.pop('a') # delete from memory
[3, 4, 5, 6, 7]
>>> arch
file_archive('test.txt', {}, cached=True)
>>> arch.load('a') # load from file archive
>>> arch
file_archive('test.txt', {'a': [3, 4, 5, 6, 7]}, cached=True)
>>>
>>> arch2 = file_archive('test.txt', cached=True)
>>> arch2
file_archive('test.txt', {}, cached=True)
>>> arch2.load() # load from file archive
>>> arch2
file_archive('test.txt', {'a': [3, 4, 5, 6, 7]}, cached=True)
>>>
>>> arch3 = file_archive('test.txt', cached=False)
>>> arch3 # directly access file-archive
file_archive('test.txt', {'a': [3, 4, 5, 6, 7]}, cached=False)
>>>
You can also manipulate objects that are already in the archive... unfortunately, for cached=False, the object needs to be loaded into memory to be edited (in-archive editing is not implemented, so you can only replace objects in a cached=False archive).
>>> arch2
file_archive('test.txt', {'a': [3, 4, 5, 6, 7]}, cached=True)
>>> arch2['a'].append(8) # edit the in-memory object
>>> arch2
file_archive('test.txt', {'a': [3, 4, 5, 6, 7, 8]}, cached=True)
>>> arch2.dump('a') # save changes to file-archive
>>> arch3
file_archive('test.txt', {'a': [3, 4, 5, 6, 7, 8]}, cached=False)
>>>
>>> arch3['a'] = arch2['a'][1:] # replace directly in-file
>>> arch3
file_archive('test.txt', {'a': [4, 5, 6, 7, 8]}, cached=False)