

Memory use if multiprocessing queue is not used by two separate processes

I have a thread in my Python program that acquires images from a webcam and puts them in a multiprocessing queue. A separate process then takes these images from the queue and does some processing. However, if I try to empty the queue from the image acquisition (producer) thread, I do not free any memory, and the program eventually uses all the available memory and crashes the machine (Python 3.6.6 / Ubuntu 18.04 64-bit / Linux 4.15.0-43-generic).

I have a simple working example that reproduces the problem.

import multiprocessing
import time
import numpy as np

queue_mp = multiprocessing.Queue(maxsize=500)

def producer(q):
    while True:
        # Generate object to put in queue
        dummy_in = np.ones((1000,1000))

        # If the queue is full, get the oldest object (FIFO),
        # to make space for the latest incoming object.
        if q.full():
            __ = q.get()
        q.put(dummy_in)


def consumer(q):
    while True:
        # Get object from queue
        dummy_out = q.get()

        # Do some processing on the object, which we simulate here by time.sleep
        time.sleep(3)

# Guard so the example also runs safely under the "spawn" start method
if __name__ == "__main__":
    producer_process = multiprocessing.Process(target=producer,
                                               args=(queue_mp,),
                                               daemon=False)

    consumer_process = multiprocessing.Process(target=consumer,
                                               args=(queue_mp,),
                                               daemon=False)

    # Start producer and consumer processes
    producer_process.start()
    consumer_process.start()
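
To watch the producer's memory climb while this runs, one option is to poll its resident set size from the parent, continuing inside the same __main__ block (a sketch assuming the third-party psutil package; this is not part of the original example):

    # Poll the producer's resident memory (assumes `pip install psutil`)
    import psutil

    producer_info = psutil.Process(producer_process.pid)
    while producer_process.is_alive():
        print(f"producer RSS: {producer_info.memory_info().rss / 1e6:.1f} MB")
        time.sleep(5)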

I can rewrite my code to avoid this problem, but I'd like to understand what is happening. Is there a general rule that producers and consumers of a multiprocessing queue must run in separate processes?

If anyone understands why this happens, or what exactly is happening behind the scenes of multiprocessing queues that would explain this memory behavior, I would appreciate it. The docs did not go into a lot of detail.

I figured out what was happening, so I'll post it here for the benefit of anyone who stumbles across this question.

My memory problem resulted from a numpy bug in numpy version 1.16.0. Reverting to numpy version 1.13.3 resolved the problem.
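
Multiprocessing queues pickle objects on put and unpickle them on get, which is presumably how a numpy serialization bug surfaced in this setup. A trivial check for the affected release (my own sketch, not from the original answer):

import numpy
print(numpy.__version__)   # 1.16.0 is affected; the fix is expected in 1.16.1
# Known-good fallback used above: pip install numpy==1.13.3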

To answer the basic question: No, there is no need to worry about which thread/process is doing the consuming (get) and which thread/process is doing the producing (put) for multiprocessing queues. There is nothing special about multiprocessing queues with respect to garbage collection. As kindall explains in response to a similar question:

When there are no longer any references to an object, the memory it occupies is freed immediately and can be reused by other Python objects.
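
To illustrate that point, here is a minimal sketch (my own illustration, not from the original answer) in which a single process both puts and gets on a multiprocessing queue. With an unaffected numpy version, peak memory stays near the size of one array rather than growing with the number of items, because ordinary reference counting frees each array as soon as the last reference to it is dropped:

import multiprocessing
import resource   # Unix-only; matches the Ubuntu setup above
import numpy as np

q = multiprocessing.Queue(maxsize=10)
for _ in range(1000):
    q.put(np.ones((1000, 1000)))   # ~8 MB per array
    arr = q.get()                  # consumed by the same process
    del arr                        # last reference dropped, freed at once

# Peak resident memory stays near one array's size, not 1000 x 8 MB
print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss, "kB peak RSS")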

I hope that helps someone. In any case, the numpy bug should be resolved in the 1.16.1 release.
