`concurrent.futures.ProcessPoolExecutor` in Python runs from the beginning of the file instead of the defined function
I'm having trouble with `concurrent.futures`. For a short background: I was trying to do massive image manipulation with `python-opencv2`. I ran into a performance issue, which is a pain considering it can take hours to process only a few hundred images. I found a solution in `concurrent.futures`, which lets me use multiple CPU cores to make the process go faster (I noticed that although processing took a really long time, it only used about 16% of my 6-core processor, which is roughly a single core). So I wrote the code, but then I noticed that the multiprocessing actually starts from the beginning of the file instead of being isolated to the function I just created. Here's a minimal working reproduction of the error:
import glob
import concurrent.futures
import cv2
import os

def convert_this(filename):
    ### Read in the image data
    img = cv2.imread(filename)
    ### Resize the image
    res = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    res.save("output/"+filename)

try:
    #create output dir
    os.mkdir("output")
    with concurrent.futures.ProcessPoolExecutor() as executor:
        files = glob.glob("../project/temp/")
        executor.map(convert_this, files)
except Exception as e:
    print("Encountered Error!")
    print(e)
    filelist = glob.glob("output")
    for f in filelist:
        os.remove(f)
    os.rmdir("output")
It gave me an error:
Encountered Error!
Encountered Error!
[WinError 183] Cannot create a file when that file already exists: 'output'
Traceback (most recent call last):
File "M:\pythonproject\testfolder\test.py", line 17, in <module>
os.mkdir("output")
[WinError 183] Cannot create a file when that file already exists: 'output'
Encountered Error!
[WinError 183] Cannot create a file when that file already exists: 'output'
Traceback (most recent call last):
File "M:\pythonproject\testfolder\test.py", line 17, in <module>
os.mkdir("output")
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'output'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\<username>\Anaconda3\envs\py37\lib\multiprocessing\spawn.py", line 105, in spawn_main
Encountered Error!
[WinError 183] Cannot create a file when that file already exists: 'output'
Traceback (most recent call last):
File "M:\pythonproject\testfolder\test.py", line 17, in <module>
os.mkdir("output")
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'output'
...
(it was repeating errors of the same "can't create file")
As you can see, `os.mkdir` was run even though it's outside the `convert_this` function I just defined. I'm not that new to Python, but I'm definitely new to multiprocessing and threading. Is this just how `concurrent.futures` behaves? Or am I missing something in the documentation?
Thanks.
Yes, multiprocessing must load the file in each new process before it can run the function (just as it does when you run the file yourself), so it runs all the top-level code you have written. So, either (1) move your multiprocessing code to a separate file with nothing extra in it and call that, or (2) enclose your top-level code in a function (e.g., `main()`), and at the bottom of your file write

if __name__ == "__main__":
    main()

This code will only be run when you start the script, not by the copies that multiprocessing spawns. See the Python docs for details on this construction.
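For illustration, here is a minimal sketch of option (2) applied to a script shaped like the one in the question. It is not the asker's actual fix: `convert_this` here is a hypothetical stand-in that only computes an output path, so the sketch runs without OpenCV, and the `out_dir` parameter and the sample file names are assumptions made for the example.

```python
import concurrent.futures
import os


def convert_this(filename, out_dir="output"):
    # Hypothetical stand-in for the image conversion in the question:
    # it only computes where the converted file would be written.
    return os.path.join(out_dir, os.path.basename(filename))


def main(files=("a.png", "b.png"), out_dir="output"):
    # Side effects such as creating the output directory live inside main(),
    # so a worker process that re-imports this file does not repeat them.
    os.makedirs(out_dir, exist_ok=True)
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        # list() consumes the results, which also re-raises any exception
        # that occurred inside a worker.
        results = list(executor.map(convert_this, files))
    for path in results:
        print(path)
    return results


if __name__ == "__main__":
    # True only in the process that was started as the script;
    # spawned workers import the file under a different module name.
    main()
```

With this layout, the worker processes still import the file, but all they find at the top level are imports and function definitions, so the directory is created once.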