
Running a Python multiprocessing Queue with two consumers

I just began to learn about Python multiprocessing. For my first exercise I am trying to create a simple Queue with two consumers. Each consumer gets an element from the queue, processes it, and prints the result to stdout.

Here's what I tried (much of it is taken from an example I tried from the Python standard library):

import random
import time
from multiprocessing import Queue, Process


stop_sentinel = "STOP"


def consumer(in_q: Queue, name: str) -> None:
    for func, args in iter(in_q, stop_sentinel):
        print(f"Process {name}, result: {func(*args)}")
        time.sleep(0.5 * random.random())


def fn(x: int) -> str:
    if x % 3:
        return "Fizz"
    if x % 5:
        return "Buzz"
    if x % 15:
        return "FizzBuzz"
    return str(x)


def main():
    proc_q = Queue()

    for i in range(20):
        inputs = (fn, (i + 1,))
        proc_q.put(inputs)

    proc_q.put(stop_sentinel)
    proc_q.put(stop_sentinel)

    p1 = Process(target=consumer, args=())
    p2 = Process(target=consumer, args=())

    p1._args = (proc_q, p1.name)
    p2._args = (proc_q, p2.name)

    p1.start()
    p2.start()


if __name__ == '__main__':
    main()

However, when I run this it fails immediately, without processing a single element. This is the stack trace:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory

What am I doing wrong?

I am having some difficulty accounting for your specific error message (I get a different one with your code), but clearly your use of iter in function consumer is not correct. When you call iter with two arguments, as you are doing, the first argument must be a callable object that takes no arguments, i.e. it should be a function. In this case that is specifically in_q.get.

import random
import time
from multiprocessing import Queue, Process


stop_sentinel = "STOP"


def consumer(in_q: Queue, name: str) -> None:
    for func, args in iter(in_q.get, stop_sentinel):
        print(f"Process {name}, result: {func(*args)}")
        time.sleep(0.5 * random.random())


def fn(x: int) -> str:
    if x % 3:
        return "Fizz"
    if x % 5:
        return "Buzz"
    if x % 15:
        return "FizzBuzz"
    return str(x)


def main():
    proc_q = Queue()

    for i in range(20):
        inputs = (fn, (i + 1,))
        proc_q.put(inputs)

    proc_q.put(stop_sentinel)
    proc_q.put(stop_sentinel)

    p1 = Process(target=consumer, args=())
    p2 = Process(target=consumer, args=())

    p1._args = (proc_q, p1.name)
    p2._args = (proc_q, p2.name)

    p1.start()
    p2.start()


if __name__ == '__main__':
    main()

Prints:

Process Process-1, result: Fizz
Process Process-2, result: Fizz
Process Process-1, result: Buzz
Process Process-2, result: Fizz
Process Process-1, result: Fizz
Process Process-1, result: Buzz
Process Process-2, result: Fizz
Process Process-1, result: Fizz
Process Process-2, result: Buzz
Process Process-2, result: Fizz
Process Process-2, result: Fizz
Process Process-1, result: Buzz
Process Process-2, result: Fizz
Process Process-1, result: Fizz
Process Process-1, result: 15
Process Process-2, result: Fizz
Process Process-1, result: Fizz
Process Process-2, result: Buzz
Process Process-2, result: Fizz
Process Process-1, result: Fizz
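
To see the two-argument form of iter in isolation, here is a minimal single-process sketch that uses the standard library's queue.Queue rather than multiprocessing.Queue (the helper name drain is made up for this illustration). iter(q.get, STOP) keeps calling q.get() with no arguments and stops, without yielding the value, as soon as a call returns something equal to the sentinel:

```python
from queue import Queue

STOP = "STOP"  # sentinel value, same idea as stop_sentinel above


def drain(q: Queue) -> list:
    """Collect items from q until q.get() returns the sentinel."""
    # iter(callable, sentinel) calls q.get() with no arguments each time.
    return list(iter(q.get, STOP))


if __name__ == "__main__":
    q = Queue()
    for item in (1, 2, 3, STOP, 4):
        q.put(item)

    print(drain(q))  # [1, 2, 3]; the 4 after the sentinel stays queued

    try:
        iter(q, STOP)  # passing the queue itself, as in the question
    except TypeError as exc:
        print(f"TypeError: {exc}")
```

Passing the queue object itself raises TypeError because a Queue is not callable, which may well be the "different" error mentioned above.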

Some Notes

You have no control over how the two processes will be dispatched by the operating system, and therefore no control over which process will "get" which of the items that have been put on the queue. If you rerun the program you will probably see the items distributed differently between the processes.
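
One way to observe this is sketched below, under the assumption that each consumer reports back on a second queue (out_q, worker and tally are names invented for this example, not part of the code above). Every item is handled exactly once, but the split between the two processes can change from run to run:

```python
import os
from collections import Counter
from multiprocessing import Process, Queue

STOP = "STOP"


def worker(in_q, out_q) -> None:
    """Tag each item with the PID of the process that consumed it."""
    pid = os.getpid()
    for item in iter(in_q.get, STOP):
        out_q.put((pid, item))


def tally(pairs) -> Counter:
    """Count how many items each PID handled."""
    return Counter(pid for pid, _ in pairs)


def main() -> None:
    in_q, out_q = Queue(), Queue()
    n_items, n_workers = 20, 2

    for i in range(n_items):
        in_q.put(i + 1)
    for _ in range(n_workers):
        in_q.put(STOP)  # one sentinel per consumer

    procs = [Process(target=worker, args=(in_q, out_q)) for _ in range(n_workers)]
    for p in procs:
        p.start()

    # Read the results before joining so the children can flush out_q.
    pairs = [out_q.get() for _ in range(n_items)]
    for p in procs:
        p.join()

    print(sorted(item for _, item in pairs))  # the numbers 1 to 20, each once
    print(tally(pairs))  # items per PID; this split is not deterministic


if __name__ == "__main__":
    main()
```

Because the work per item here is tiny, one process may occasionally take most or all of the items; the sleep in the original consumer makes the interleaving more visible.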

Object attributes that begin with an underscore, such as the _args attribute of the Process object, are generally to be thought of as "private". When you use them, you run the risk that a future version of Python may no longer use the attribute, or may use it in a different way. Consequently, I would personally use the following method for identifying processes:

import random
import time
from multiprocessing import Queue, Process
import os


stop_sentinel = "STOP"


def consumer(in_q: Queue) -> None:
    pid = os.getpid()
    for func, args in iter(in_q.get, stop_sentinel):
        print(f"Process {pid}, result: {func(*args)}")
        time.sleep(0.5 * random.random())


def fn(x: int) -> str:
    if x % 3:
        return "Fizz"
    if x % 5:
        return "Buzz"
    if x % 15:
        return "FizzBuzz"
    return str(x)


def main():
    proc_q = Queue()

    for i in range(20):
        inputs = (fn, (i + 1,))
        proc_q.put(inputs)

    proc_q.put(stop_sentinel)
    proc_q.put(stop_sentinel)

    p1 = Process(target=consumer, args=(proc_q,))
    p2 = Process(target=consumer, args=(proc_q,))

    p1.start()
    p2.start()

    # Explicitly wait for tasks to complete:
    p1.join()
    p2.join()


if __name__ == '__main__':
    main()

Prints:

Process 16672, result: Fizz
Process 15304, result: Fizz
Process 15304, result: Buzz
Process 16672, result: Fizz
Process 15304, result: Fizz
Process 16672, result: Buzz
Process 15304, result: Fizz
Process 16672, result: Fizz
Process 16672, result: Buzz
Process 15304, result: Fizz
Process 16672, result: Fizz
Process 15304, result: Buzz
Process 16672, result: Fizz
Process 16672, result: Fizz
Process 15304, result: 15
Process 15304, result: Fizz
Process 16672, result: Fizz
Process 16672, result: Buzz
Process 15304, result: Fizz
Process 16672, result: Fizz
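
If readable names are preferred over PIDs, another option that stays within the public API is to pass name= when constructing each Process and read it back inside the child with multiprocessing.current_process().name. The sketch below is an illustration rather than part of the original answer; the labels consumer-1 and consumer-2 and the simplified integer workload are made up for the example:

```python
from multiprocessing import Process, Queue, current_process

STOP = "STOP"


def consumer(in_q) -> None:
    """Print each item together with the name of the current process."""
    name = current_process().name  # public API, no need for p._args
    for item in iter(in_q.get, STOP):
        print(f"Process {name}, item: {item}")


def main() -> None:
    q = Queue()
    for i in range(6):
        q.put(i + 1)

    names = ["consumer-1", "consumer-2"]  # hypothetical labels
    for _ in names:
        q.put(STOP)  # one sentinel per consumer

    procs = [Process(target=consumer, name=n, args=(q,)) for n in names]
    for p in procs:
        p.start()
    for p in procs:
        p.join()  # keep main() and the queue alive until the children finish


if __name__ == "__main__":
    main()
```

The join() calls mirror the answer's final version. Under the spawn start method that appears in the question's traceback, returning from main() without joining may let the queue be cleaned up while the children are still starting. That would be consistent with the FileNotFoundError in the question, although the answer above does not confirm it as the cause.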


 