Enigma using Python multiprocessing related to if __name__ == '__main__'
I am using multiprocessing to speed up my program, and there is an enigma I cannot solve.

I am using multiprocessing to write a lot of short files (based on a lot of input files) with the function writing_sub_file, and I finally concatenate all these files once all the processes have finished, using the function my_concat.

Here are two samples of code. Note that this code is in my main .py file, but the function my_concat is imported from another module.

The first one:
if __name__ == '__main__':
    pool = Pool(processes=cpu_count())
    arg_tuple = (work_path, article_dict, cat_len, date_to, time_period, val_matrix)
    jobs = [(group, arg_tuple) for group in store_groups]
    pool.apply_async(writing_sub_file, jobs)
    pool.close()
    pool.join()
my_concat(work_path)
which gives many errors (as many as there are processes in the pool), since it tries to run my_concat before all my processes are done. I don't give the stack trace, since it is very clear that my_concat is called before all the files have been written by the pool processes.
The second one:
if __name__ == '__main__':
    pool = Pool(processes=cpu_count())
    arg_tuple = (work_path, article_dict, cat_len, date_to, time_period, val_matrix)
    jobs = [(group, arg_tuple) for group in store_groups]
    pool.apply_async(writing_sub_file, jobs)
    pool.close()
    pool.join()
    my_concat(work_path)
which works perfectly.

Can someone explain the reason to me?
In the second, my_concat(work_path) is inside the if statement, and is therefore only executed if the script is running as the main script.
In the first, my_concat(work_path) is outside the if statement. When multiprocessing imports the module in a new Python process, it is not imported as __main__ but under a different name (__mp_main__ in Python 3). Therefore this statement is run pretty much immediately, in each of your pool's processes, when your module is imported into that process.
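The effect can be reproduced with a small sketch. It is not the asker's code: the script name demo_main.py and the function work are hypothetical stand-ins, and the "spawn" start method is forced so that the behaviour is the same on every platform (with "fork" the module is not re-imported). The helper writes the stand-in script to a temporary directory, runs it in a fresh interpreter, and returns what it printed.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for the asker's main .py file.
SCRIPT = textwrap.dedent('''
    import multiprocessing as mp

    def work(x):
        return x * x

    # Module-level statement: runs every time this file is imported,
    # including when a spawned worker process imports it.
    print("module-level code, __name__ =", __name__, flush=True)

    if __name__ == "__main__":
        # Assumption: force "spawn" so workers re-import this file.
        mp.set_start_method("spawn")
        with mp.Pool(processes=2) as pool:
            results = pool.map(work, [1, 2, 3])
        # Guarded statement: runs only in the process started as a script.
        print("guarded code, results =", results, flush=True)
''')


def run_demo():
    """Run the stand-in script in a fresh interpreter and return its output lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo_main.py"
        path.write_text(SCRIPT)
        proc = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True,
            text=True,
            check=True,
        )
    return proc.stdout.splitlines()


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The module-level line is printed once by the parent (with __name__ equal to __main__) and again by each worker that finishes importing the file (with __name__ equal to __mp_main__), while the guarded line is printed only once. That is the same reason an unguarded my_concat(work_path) runs in the pool's processes.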