
Python Huge File Reading in multiprocessing

I have a binary file that contains a number of images (each image is 1024*768). I put each image onto a JoinableQueue and analyze it with multiprocessing. This works perfectly with small files, but I get a MemoryError when I try to read huge files. Does anybody know how I can store big files in a buffer/Queue (as strings)? (Unfortunately I can't use Manager or Pool.)
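
(For reference, a minimal sketch of the setup described above; the file name, worker count, and analyze function are placeholders and not from the original post. Note that multiprocessing.JoinableQueue takes an optional maxsize, which bounds how many images are buffered at once.)

import multiprocessing as mp

IMAGE_SIZE = 1024 * 768  # bytes per single-channel image, as in the question

def analyze(data):
    pass  # placeholder for the real per-image analysis

def worker(queue):
    while True:
        data = queue.get()
        if data is None:        # sentinel: no more images to process
            queue.task_done()
            break
        analyze(data)
        queue.task_done()

if __name__ == '__main__':
    # maxsize bounds how many images sit in the queue at once, so the
    # producer blocks instead of buffering the entire file in memory
    queue = mp.JoinableQueue(maxsize=16)
    workers = [mp.Process(target=worker, args=(queue,)) for _ in range(4)]
    for w in workers:
        w.start()
    with open('images.bin', 'rb') as fp:
        while True:
            chunk = fp.read(IMAGE_SIZE)
            if not chunk:       # end of file
                break
            queue.put(chunk)    # blocks while the queue is full
    for _ in workers:
        queue.put(None)         # one sentinel per worker
    queue.join()                # wait until every queued item is marked done
    for w in workers:
        w.join()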

Have you had a look at the io.BytesIO module? You can find it here: https://docs.python.org/release/3.1.3/library/io.html#binary-io. You can set your buffer size, which solved a memory problem for me once.
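
(As a rough illustration of that suggestion, a minimal sketch that reads the file one image at a time through a buffered reader and wraps each chunk in io.BytesIO; the file name and image size are assumptions.)

import io

IMAGE_SIZE = 1024 * 768  # bytes per image, assuming a single channel

# Open the file with an explicit buffer size so only a bounded amount of
# data is held in memory, then read one image's worth of bytes at a time.
with io.open('images.bin', 'rb', buffering=IMAGE_SIZE) as fp:
    while True:
        image_bytes = fp.read(IMAGE_SIZE)
        if not image_bytes:               # end of file
            break
        buf = io.BytesIO(image_bytes)     # wrap the chunk when a file-like object is needed
        # ... hand buf (or image_bytes) to the analysis code ...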

  1. You can read about buffers here.
  2. If your memory is small, you can try forcing gc like this:
import gc

SIZE = 1024 * 768  # bytes per single-channel image
MEMOSIZE = 1024    # your memory budget in bytes -- adjust to your machine
with open('xxx', 'rb') as fp:  # open the binary file
    i = 0          # count of images read since the last cleanup
    queue = []
    while True:
        if i * SIZE < MEMOSIZE:
            x = fp.read(SIZE)  # read one image's worth of bytes
            if not x:          # end of file
                break
            queue.append(x)
            i += 1
            # do something with the buffered images here
        else:
            queue = []       # drop the references so the memory can be reclaimed
            gc.collect()     # force a garbage collection pass
            i = 0
