Getting a MemoryError because list/array is too large
I have to download object_x. For simplicity's sake, object_x comprises a series of 1000 integers. The download is irregular: I receive groups, or chunks, of integers in seemingly random order, and I need to keep track of them until I have all 1000 to make up the final object_x.
The incoming chunks can also overlap, for instance:
Chunk 1: integers 0-500
Chunk 2: integers 600-1000
Chunk 3: integers 400-700
Create object_x as a list containing all of its integers, 0-1000. When a chunk is downloaded, remove all of the integers that make up the chunk from object_x. Keep doing this until object_x is empty, at which point the download is known to be complete.
object_x = list(range(0, 1000))  # list() is needed on Python 3, where range() is lazy

# download chunk 1
chunk = range(0, 500)
for number in chunk:
    if number in object_x:
        object_x.remove(number)

# repeat for every downloaded chunk
This method is very memory intensive. The script throws a MemoryError if object_x or chunk is too large.
I'm searching for a better way to keep track of the chunks needed to build object_x. Any ideas? I'm using Python, but I guess the language doesn't matter.
This is the kind of scenario where streaming is very important. Doing everything in memory is a bad idea because you might not have enough memory (as in your case). You should probably save the chunks to disk, keep track of how many you have downloaded, and when you reach 1000, process them on disk (or load them into memory one by one to process them).
"C# Security: Computing File Hashes" is a recent article I wrote. It covers a different subject, but it does illustrate the importance of streaming towards the end.
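One way the answer's suggestion of saving chunks to disk could look in Python is sketched below. It is only an illustration under some assumptions that are not in the original posts: each chunk is a contiguous run identified by its starting integer, every integer fits in a fixed 8-byte slot of a binary file, and the names RangeTracker, ChunkStore, add_chunk and read_all are hypothetical. Because chunks can overlap, the sketch records merged ranges of received integers instead of counting chunks, so memory use grows with the number of gaps, not with the size of object_x.

```python
import struct

ITEM = struct.Struct("<q")  # assumption: one 8-byte signed integer per slot


class RangeTracker:
    """Remembers which integers in [0, total) have arrived, as merged ranges."""

    def __init__(self, total):
        self.total = total
        self.ranges = []  # sorted, non-overlapping half-open (start, stop) pairs

    def add(self, start, stop):
        """Record that integers start .. stop - 1 have been received."""
        if start >= stop:
            return  # empty chunk, nothing to record
        merged = []
        for lo, hi in self.ranges:
            if hi < start or lo > stop:
                merged.append((lo, hi))  # neither overlaps nor touches
            else:
                # Overlapping or adjacent: absorb into the new range.
                start, stop = min(start, lo), max(stop, hi)
        merged.append((start, stop))
        merged.sort()
        self.ranges = merged

    def is_complete(self):
        return self.ranges == [(0, self.total)]


class ChunkStore:
    """Writes chunks into a file on disk and tracks coverage in memory."""

    def __init__(self, path, total):
        self.path = path
        self.tracker = RangeTracker(total)
        open(path, "wb").close()  # create (or truncate) the backing file

    def add_chunk(self, start, values):
        """Store a contiguous chunk whose first integer belongs in slot `start`."""
        with open(self.path, "r+b") as f:
            f.seek(start * ITEM.size)
            for value in values:  # written one item at a time, no big buffer
                f.write(ITEM.pack(value))
        self.tracker.add(start, start + len(values))

    def read_all(self):
        """Yield the stored integers one by one, without loading the file."""
        with open(self.path, "rb") as f:
            while True:
                data = f.read(ITEM.size)
                if not data:
                    break
                yield ITEM.unpack(data)[0]
```

With the three example chunks from the question (0-500, 600-1000 and 400-700, read as inclusive, so 1001 slots in total), is_complete() stays false after the first two add_chunk calls and becomes true after the third, since the last chunk closes the gap between 500 and 600.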