
"尽管队列已满,但 Python 多处理队列 get() 超时"

[英]Python multiprocessing queue get() timeout despite full queue

I am using Python's multiprocessing module to do scientific parallel processing. In my code I use several worker processes which do the heavy lifting and a writer process which persists the results to disk. The data to be written is sent from the worker processes to the writer process via a Queue. The data itself is rather simple and solely consists of a tuple holding a filename and a list with two floats. After several hours of processing the writer process often gets stuck. More precisely, the following block of code

while (True):
    try:
        item = queue.get(timeout=60)
        break
    except Exception as error:
        logging.info("Writer: Timeout occurred {}".format(str(error)))

will never exit the loop and I get continuous 'Timeout' messages.

I also implemented a logging process which outputs, among other things, the status of the queue. Even though I get the timeout error message above, a call to qsize() constantly returns a full queue (size=48 in my case).

I have thoroughly checked the documentation on the queue object and can find no possible explanation for why get() times out while the queue is full at the same time.

Any ideas?

Edit:

I modified the code to make sure I catch an empty queue exception:

while (True):
    try:
        item = queue.get(timeout=60)
        break
    except Empty as error:
        logging.info("Writer: Timeout occurred {}".format(str(error)))

In multiprocessing, a queue is used as a synchronized message queue, which also seems to be the case in your problem. This, however, requires more than just a call to the get() method. After every task is processed you need to call task_done() so that the element gets removed from the queue. (Note that task_done() exists on queue.Queue and multiprocessing.JoinableQueue, but not on a plain multiprocessing.Queue.)

From the documentation:

Queue.task_done()

Indicate that a formerly enqueued task is complete. Used by queue consumer threads. For each get() used to fetch a task, a subsequent call to task_done() tells the queue that the processing on the task is complete.

If a join() is currently blocking, it will resume when all items have been processed (meaning that a task_done() call was received for every item that had been put() into the queue).

In the documentation you will also find a code example of proper threaded queue usage.

In the case of your code it should look like this:

while True:
    try:
        item = queue.get(timeout=60)
        if item is None:
            break
        # call the worker function here
        queue.task_done()  # requires queue.Queue or multiprocessing.JoinableQueue
    except Exception as error:
        logging.info("Writer: Timeout occurred {}".format(str(error)))
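In multiprocessing, the task_done()/join() bookkeeping described above is provided by JoinableQueue, not by the plain Queue. A minimal, self-contained sketch of that pattern (the doubling "work" and the None sentinel are illustrative assumptions, and the fork start method is assumed, so there is no __main__ guard):

```python
from multiprocessing import JoinableQueue, Process, Queue

def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:           # sentinel: tell the worker to stop
            tasks.task_done()
            break
        results.put(item * 2)      # stand-in for the real work
        tasks.task_done()          # mark the fetched task as done

tasks, results = JoinableQueue(), Queue()
p = Process(target=worker, args=(tasks, results))
p.start()
for i in range(5):
    tasks.put(i)
tasks.put(None)
tasks.join()                       # returns once every put() got a task_done()
p.join()
out = sorted(results.get() for _ in range(5))
print(out)                         # [0, 2, 4, 6, 8]
```

Without the task_done() calls, tasks.join() would block forever even after all items were consumed.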

You are catching a too generic Exception and assuming that it is a timeout error.

Try modifying the logic as follows:

from queue import Empty  # Python 2: from Queue import Empty

while (True):
    try:
        item = queue.get(timeout=60)
        break
    except Empty as error:
        logging.info("Writer: Timeout occurred {}".format(str(error)))
        print(queue.qsize())

and see if the log line is still printed.
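The point of the narrower except clause is that multiprocessing.Queue.get() signals a timeout by raising queue.Empty, so catching that specifically separates a real timeout from any other failure. A minimal sketch:

```python
import multiprocessing
from queue import Empty  # multiprocessing.Queue raises queue.Empty on timeout

q = multiprocessing.Queue()
try:
    q.get(timeout=0.1)   # nothing was put, so this times out
except Empty:
    caught = "Empty"
print(caught)            # Empty
```

If the generic `except Exception` in the original loop were hiding a different error (say, a pickling failure), this change would surface it immediately instead of mislabeling it as a timeout.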

Switching to a manager-based queue should help solve this issue.

from multiprocessing import Manager

manager = Manager()
queue   = manager.Queue()

For more details you can check the multiprocessing documentation here: https://docs.python.org/2/library/multiprocessing.html
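A sketch of what the manager-based setup might look like end to end, using the (filename, list of floats) tuples described in the question; the worker body and the file names are illustrative assumptions, and the fork start method is assumed (no __main__ guard):

```python
from multiprocessing import Manager, Process

def worker(q, n):
    # hypothetical stand-in for the real computation
    q.put(("file_{}.dat".format(n), [n * 0.5, n * 1.5]))

manager = Manager()
queue = manager.Queue()          # proxy queue served by the manager process

procs = [Process(target=worker, args=(queue, n)) for n in range(3)]
for p in procs:
    p.start()
for p in procs:
    p.join()

# drain the results in the parent (the "writer" role)
items = sorted(queue.get() for _ in range(3))
print(items)
```

A manager queue is a proxy: every put() and get() goes through a single server process, which sidesteps the pipe/feeder-thread machinery of a plain multiprocessing.Queue at some cost in throughput.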

I modified the code to make sure I catch an empty queue exception

The problem is not in the exception handling; it lies elsewhere.

A Process does not share memory with other processes, whereas the same problem does not occur with Threads, since threads do share memory.

Let's observe how the following code works with Process and with Thread.

A Queue change is visible only in the Process where it was made:

from queue import Queue
from multiprocessing import Process

q = Queue()

def set_data(q):
    q.put("hello")

def get_data(q):
    print(q.get())

p1 = Process(target=get_data, args=(q,))
p2 = Process(target=set_data, args=(q,))

# Wait for queue
p1.start()

# Put data in queue
p2.start()

OUTPUT: nothing happens

A Queue change is visible outside of the Thread:

from queue import Queue
from threading import Thread

q = Queue()

def set_data(q):
    q.put("hello")

def get_data(q):
    print(q.get())

t1 = Thread(target=get_data, args=(q,))
t2 = Thread(target=set_data, args=(q,))

# Wait for queue
t1.start()

# Put data in queue
t2.start()

OUTPUT: hello

So if you use multiprocessing.Process, you have to use multiprocessing.Manager (or a multiprocessing.Queue) instead of queue.Queue.
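For completeness, a queue designed to cross process boundaries does make the change visible in the parent. A minimal sketch using multiprocessing.Queue (fork start method assumed, so no __main__ guard):

```python
from multiprocessing import Process, Queue  # multiprocessing.Queue, not queue.Queue

def set_data(q):
    q.put("hello")

q = Queue()
p = Process(target=set_data, args=(q,))
p.start()
result = q.get()   # visible in the parent: the queue is backed by a pipe
p.join()
print(result)      # hello
```

Note the ordering: get() before join(), since joining a process that still has items in a queue's feeder buffer can deadlock.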
