Python multiprocessing and sys.argv
Are sys.argv values passed to the branches of multiprocessing? What is the correct way of passing argv to all branches of the multiprocess?
Let's suppose I have two files. test1.py:
import sys

if len(sys.argv) > 1:
    env = sys.argv[1]
else:
    env = 'test'
And main_code.py:
from test1 import *
import concurrent.futures

def f():
    if env == 'test':
        print('bu')
    else:
        print('not bu')

if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        for i in range(2):
            executor.submit(f)
I invoke main_code.py from cmd: python main_code.py zzz. Is the sys.argv[1] variable (which is 'zzz') passed on each invocation of executor.submit(f), as it was first obtained from the import of test1.py? My confusion comes from the fact that concurrent.futures basically creates separate code instances by re-importing all the files.
On Windows, the spawn context is the only way to create worker processes. sys.argv is copied to each worker process once. Not all files are re-imported: only the modules required to unpickle the task function and its arguments are imported.
In the worker, the original __main__ is actually called __mp_main__. After copying sys.argv, the worker imports __mp_main__, which imports test1, so env is set correctly.
Though multiprocessing tries to keep the environment similar, the worker process entry point is somewhere inside multiprocessing.spawn. Several items are mentioned there: sys.argv, sys.path, os.getcwd(). See get_preparation_data() and prepare() for details.
It can be verified with Task Manager or the ps command that the worker process is started with different arguments. I wrote a simple script called mp.py that prints the arguments when run with python3 mp.py hello world.
Output:
29836 process ['C:/xxxx/stackoverflow/mp.py'] <module '__main__' from 'C:/xxxx/stackoverflow/mp.py'>
29836 my name is main
29836 true main <module '__main__' from 'C:/xxxx/stackoverflow/mp.py'>
18464 process ['C:\\xxxx\\stackoverflow\\mp.py'] <module '__main__' (built-in)>
18464 worker <module '__mp_main__' from 'C:\\xxxx\\stackoverflow\\mp.py'>
mp.py:
from __future__ import annotations

import multiprocessing
import os
import sys
import time
from concurrent.futures import ProcessPoolExecutor

def list_modules(who_am_i):
    the_main = sys.modules.get('__main__')
    print(os.getpid(), who_am_i, the_main)

def main():
    list_modules('true main')
    mp_context = multiprocessing.get_context('spawn')
    # mp_context = multiprocessing.get_context('fork')
    # mp_context = multiprocessing.get_context('forkserver')
    with ProcessPoolExecutor(1, mp_context=mp_context) as executor:
        executor.submit(list_modules, 'worker').result()
        time.sleep(100)

# This message is printed when this module is loaded.
# (never in fork, once in forkserver, multiple times in spawn)
print(os.getpid(), "process", sys.argv, sys.modules.get('__main__'))

if __name__ == '__main__':
    # Printed once in the main process
    print(os.getpid(), "my name is main")
    main()