
python multiprocessing pool.map not blocking?

I'm trying to parallelize some web requests in Python using multiprocessing, but occasionally not all of the function calls I send to map complete.

These results appear whether I'm using Python 2 or 3.

Test script:

#!/usr/bin/env python

import multiprocessing

def my_print(string):
    print(string)

all_strings = ["alpaca", "bear", "cat", "dog", "elephant", "frog"]

pool = multiprocessing.Pool()
pool.map(my_print, all_strings)

I run it like so:

for i in `seq 1 50`; do ./test.py | wc -l; done | sort | uniq -c

And my results look like:

6 5
44 6

...so most of the time all 6 executions of the function run, but occasionally only 5 of them appear to have run by the time the script exits. I expected the result to be 50 6 (that is, all functions executed on every run).

The documentation https://docs.python.org/2/library/multiprocessing.html#multiprocessing.pool.multiprocessing.Pool.map says "It blocks until the result is ready." I assumed that to mean that all functions will complete before we move to the next line of code.

Am I misunderstanding that? Does using a pool require you to always call pool.close() and pool.join() to ensure the tasks are complete?

Edit: I'm running on AWS, if that makes any obvious difference; a coworker told me I should mention that.

Thanks very much in advance!

All workers run their functions and return any values before map returns. That is true. But that doesn't mean you will see all strings immediately.

You have multiple worker processes trying to write to the same file/terminal. To make that work you might have to import sys and call sys.stdout.flush() after every print() in the worker process.
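
As a minimal sketch of that suggestion, here is the question's test script with a flush added after the print. Two things are assumptions added for illustration and are not part of the answer: my_print returns its argument so the parent can count the results that map hands back, and the pool is shut down explicitly with pool.close() and pool.join(). The count is written to stderr so it does not change what wc -l sees.

```python
#!/usr/bin/env python
# Sketch only: the question's script plus the flush suggested in the answer.
import multiprocessing
import sys


def my_print(string):
    print(string)
    # Push this worker's buffered output to the shared pipe/terminal right away.
    sys.stdout.flush()
    # Returning the value is an addition for illustration only.
    return string


def main():
    all_strings = ["alpaca", "bear", "cat", "dog", "elephant", "frog"]
    pool = multiprocessing.Pool()
    # map blocks until every call has returned a result.
    results = pool.map(my_print, all_strings)
    # Assumption, not from the answer: shut the pool down explicitly.
    pool.close()
    pool.join()
    # Report on stderr so the stdout line count stays untouched.
    sys.stderr.write("%d results returned\n" % len(results))


if __name__ == "__main__":
    main()
```

Running this through the same loop as in the question is one way to check whether the missing lines were a buffering effect rather than map returning early.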
