Spawn processes using host file with mpi4py
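I'm trying to use mpi4py and OpenMPI to spawn a set of worker processes across multiple hosts, but the spawn command seems to ignore my host file. I have posted the complete test, but here are the key parts:
Following a forum discussion, my manager script calls Spawn with the hostfile option: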
mpi_info = MPI.Info.Create()
mpi_info.Set("hostfile", "worker_hosts")
comm = MPI.COMM_SELF.Spawn(sys.executable,
                           args=['testworker.py'],
                           maxprocs=args.worker_count,
                           info=mpi_info).Merge()
The worker_hosts file lists the nodes in my Scyld Beowulf cluster:
myhead1 slots=2
mycompute1 slots=2
mycompute2 slots=2
mycompute3 slots=2
mycompute4 slots=3
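Before spawning, it can be useful to check how many slots the host file actually offers, so that the requested worker count can be compared against it. The following is only a sketch: parse_hostfile is a made-up helper, not part of mpi4py or OpenMPI, and it assumes the simple "name slots=N" layout shown above, with "#" starting a comment and a host without slots= counted as one slot.

```python
def parse_hostfile(text):
    """Return a list of (host, slots) pairs from host file text."""
    hosts = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        slots = 1  # assumption: no slots= entry means a single slot
        for field in fields[1:]:
            if field.startswith("slots="):
                slots = int(field[len("slots="):])
        hosts.append((fields[0], slots))
    return hosts


# Same contents as the worker_hosts file above.
WORKER_HOSTS = """\
myhead1 slots=2
mycompute1 slots=2
mycompute2 slots=2
mycompute3 slots=2
mycompute4 slots=3
"""

hosts = parse_hostfile(WORKER_HOSTS)
total_slots = sum(slots for _, slots in hosts)
print(total_slots)  # 11
```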
Both the manager and the workers call MPI.Get_processor_name(), but they all report "myhead1". If I use the same host file with mpirun, it works as expected:
> mpirun -hostfile worker_hosts -np 3 python -c "from mpi4py import MPI; print MPI.Get_processor_name()"
myhead1
myhead1
mycompute1
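The mpirun output above is consistent with processes being placed slot by slot in host file order: the two slots on myhead1 fill first, and the third process lands on mycompute1. Here is a minimal sketch of that fill order, assuming by-slot placement. The function place_by_slot is made up for illustration and is not part of OpenMPI or mpi4py; the real mapping policy depends on the OpenMPI version and its options.

```python
def place_by_slot(hosts, nprocs):
    """Assign nprocs processes to hosts, filling each host's slots in order.

    Returns fewer than nprocs entries if the hosts run out of slots.
    """
    placement = []
    for host, slots in hosts:
        for _ in range(slots):
            if len(placement) == nprocs:
                return placement
            placement.append(host)
    return placement


# Hosts and slot counts copied from the worker_hosts file.
hosts = [("myhead1", 2), ("mycompute1", 2), ("mycompute2", 2),
         ("mycompute3", 2), ("mycompute4", 3)]

placement = place_by_slot(hosts, 3)
print(placement)  # ['myhead1', 'myhead1', 'mycompute1']
```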
If I change the host file name to one that doesn't exist, such as bogus_file, I get an error message:
--------------------------------------------------------------------------
Open RTE was unable to open the hostfile:
bogus_file
Check to make sure the path and filename are correct.
--------------------------------------------------------------------------
[Bulbasaur:86523] [[3458,0],0] ORTE_ERROR_LOG: Not found in file base/rmaps_base_support_fns.c at line 83
[Bulbasaur:86523] [[3458,0],0] ORTE_ERROR_LOG: Not found in file rmaps_rr.c at line 82
[Bulbasaur:86523] [[3458,0],0] ORTE_ERROR_LOG: Not found in file base/rmaps_base_map_job.c at line 88
[Bulbasaur:86523] [[3458,0],0] ORTE_ERROR_LOG: Not found in file base/plm_base_launch_support.c at line 105
[Bulbasaur:86523] [[3458,0],0] ORTE_ERROR_LOG: Not found in file plm_rsh_module.c at line 1173
So OpenMPI does notice the hostfile option, but it doesn't seem to use it. The hostfile option is listed in the OpenMPI documentation:
Key Type Description
--- ---- -----------
host char * Host on which the process should be spawned.
See the orte_host man page for an
explanation of how this will be used.
hostfile char * Hostfile containing the hosts on which
the processes are to be spawned. See
the orte_hostfile man page for an
explanation of how this will be used.
How do I specify a host file for a spawn request?
I found a newer version of the OpenMPI documentation, which gave me the magic option:
Key Type Description
--- ---- -----------
host char * Host on which the process should be
spawned. See the orte_host man
page for an explanation of how this
will be used.
hostfile char * Hostfile containing the hosts on which
the processes are to be spawned. See
the orte_hostfile man page for
an explanation of how this will be
used.
add-host char * Add the specified host to the list of
hosts known to this job and use it for
the associated process. This will be
used similarly to the -host option.
add-hostfile char * Hostfile containing hosts to be added
to the list of hosts known to this job
and use it for the associated
process. This will be used similarly
to the -hostfile option.
If I switch to add-hostfile, it works perfectly:
mpi_info.Set("add-hostfile", "worker_hosts")
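Putting the pieces together, a manager and worker pair might look roughly like the sketch below. The helper names spawn_info_items, spawn_workers and worker_main are invented for this example, and the mpi4py import is done inside the functions so the key-selection helper can be used without an MPI runtime. The mpi4py calls themselves (MPI.Info.Create, Info.Set, MPI.COMM_SELF.Spawn, MPI.Comm.Get_parent, Merge, MPI.Get_processor_name) are the same ones the question relies on.

```python
import sys


def spawn_info_items(hostfile, key="add-hostfile"):
    """Return the info key/value pair to attach to a spawn request.

    Only the two host file keys discussed in this post are accepted.
    """
    if key not in ("hostfile", "add-hostfile"):
        raise ValueError("unsupported info key: %r" % key)
    return {key: hostfile}


def spawn_workers(worker_script, worker_count, hostfile):
    """Manager side: spawn the workers and return the merged communicator."""
    from mpi4py import MPI  # imported lazily; needs an MPI runtime

    mpi_info = MPI.Info.Create()
    for info_key, value in spawn_info_items(hostfile).items():
        mpi_info.Set(info_key, value)
    intercomm = MPI.COMM_SELF.Spawn(sys.executable,
                                    args=[worker_script],
                                    maxprocs=worker_count,
                                    info=mpi_info)
    return intercomm.Merge()


def worker_main():
    """Worker side: join the manager and report which host we landed on."""
    from mpi4py import MPI  # imported lazily; needs an MPI runtime

    comm = MPI.Comm.Get_parent().Merge()
    print("rank %d on %s" % (comm.Get_rank(), MPI.Get_processor_name()))
```

In this sketch the manager script would call spawn_workers('testworker.py', worker_count, 'worker_hosts'), and testworker.py would call worker_main(). Passing key="hostfile" to spawn_info_items reproduces the original, non-working request for comparison.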
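If you're stuck on an older version of OpenMPI, try launching the manager script with mpirun and the same host file. That also worked for me while I was still using the hostfile option: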
mpirun -hostfile worker_hosts -np 1 python testmanager.py