简体   繁体   English

读取/写入/更新 object,无需将 object 加载到 memory

[英]Read/Write/Updating object without loading the object to memory

I have been trying out with the Klepto package to Write/Read/Update my object to harddisk, aiming to avoid the "out of memory" issues that I experienced when training my model with my dataset.我一直在尝试使用 Klepto package 将我的 object 写入/读取/更新到硬盘,旨在避免我在使用我的数据集训练我的 model 时遇到的“内存不足”问题。 From my understanding, with the Klepto I could store my data as a key-value based mechanism.根据我的理解,使用 Klepto 我可以将我的数据存储为基于键值的机制。 But I am not quite sure if I could directly Update the object when I load the data back from the klepto.archieve.但是我不太确定当我从 klepto.archieve 加载数据时是否可以直接更新 object。 When updating, eg adding a value to the list, while keeping not to directly load the object out to memory to avoid "out of memory" problem.更新时,例如向列表中添加一个值,同时保持不要将 object 直接加载到 memory 以避免“内存不足”问题。

Here is a sample about the saved data (please correct me if this is also not the correct way for setting it up):这是一个关于保存数据的示例(如果这也不是设置它的正确方法,请纠正我):

from klepto.archives import *
arch = file_archive('test.txt')
arch['a'] = [3,4,5,6,7]
arch.dump()
arch.pop('a')

I'm the klepto author.我是klepto的作者。 If I understand what you want, it looks like you have set it up correctly.如果我明白你想要什么,那么看起来你已经正确设置了它。 The critical keyword is cached .关键关键字已cached If you use cached=True , then the archive is constructed as an in-memory cache with a manually-synchronized file backend.如果您使用cached=True ,那么存档将被构造为内存中的缓存,并带有手动同步的文件后端。 If you use cached=False , then there's no in-memory cache... you just access the file archive directly.如果您使用cached=False ,则没有内存缓存......您只需直接访问文件存档。

Python 3.7.16 (default, Dec  7 2022, 05:04:27) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from klepto.archives import *
>>> arch = file_archive('test.txt', cached=True)
>>> arch['a'] = [3,4,5,6,7]
>>> arch.dump() # dump to file archive
>>> arch.pop('a') # delete from memory
[3, 4, 5, 6, 7]
>>> arch
file_archive('test.txt', {}, cached=True)
>>> arch.load('a') # load from file archive
>>> arch
file_archive('test.txt', {'a': [3, 4, 5, 6, 7]}, cached=True)
>>> 
>>> arch2 = file_archive('test.txt', cached=True)
>>> arch2
file_archive('test.txt', {}, cached=True)
>>> arch2.load() # load from file archive
>>> arch2
file_archive('test.txt', {'a': [3, 4, 5, 6, 7]}, cached=True)
>>> 
>>> arch3 = file_archive('test.txt', cached=False)
>>> arch3 # directly access file-archive
file_archive('test.txt', {'a': [3, 4, 5, 6, 7]}, cached=False)
>>> 

You can also manipulate objects that are already in the archive... unfortunately, for cached=False , the object needs to be loaded into memory to be edited (due to lack of implementation for in-archive editing, you can only replace objects in a cached=False archive).您还可以操作存档中已有的对象...不幸的是,对于cached=False ,需要将 object 加载到 memory 中进行编辑(由于缺少存档内编辑的实现,您只能替换中的对象cached=False存档)。

>>> arch2
file_archive('test.txt', {'a': [3, 4, 5, 6, 7]}, cached=True)
>>> arch2['a'].append(8) # edit the in-memory object
>>> arch2
file_archive('test.txt', {'a': [3, 4, 5, 6, 7, 8]}, cached=True)
>>> arch2.dump('a') # save changes to file-archive
>>> arch3
file_archive('test.txt', {'a': [3, 4, 5, 6, 7, 8]}, cached=False)
>>> 
>>> arch3['a'] = arch2['a'][1:] # replace directly in-file
>>> arch3
file_archive('test.txt', {'a': [4, 5, 6, 7, 8]}, cached=False)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

相关问题 获取对象的字符串表示形式,而无需将该对象加载到内存中 - Get the string representation of an object without loading the object into memory 如何在不使用json和python腌制的情况下将对象写入文件并读回? - How do I write an object to a file and read it back without using json and pickling in python? 如何在二进制文件中写入许多对象并读取其特定对象? - How to write many objects in binary file and read it an especific object? 在 memory 中读取第 n 行 importlib.resources.files 而不加载整个文件 - Read nth line of importlib.resources.files without loading whole file in memory 更新 Object 的数据成员 - Updating data members of an Object 如何从文本文件中读取没有双反斜杠的字节对象? - How to read byte object from text file without double backslash? 如何在 python 中打开一个 csv 文件,一次读取一行,而不将整个 csv 文件加载到内存中? - How can I open a csv file in python, and read one line at a time, without loading the whole csv file in memory? Python 3:存储数据而不将其加载到内存中 - Python 3 : Storing Data without loading it into memory 在Python中更新复杂的JSON对象 - Updating complex JSON object in Python 无法更新 rect 对象 pygame - Trouble updating rect object pygame
 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM