Error when starting Open MPI in MPI_Init via Python

I am trying to access a shared library that uses OpenMPI from Python, but for some reason I get the following error message:

[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_paffinity_hwloc: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_carto_auto_detect: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_carto_file: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_mmap: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_posix: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_sysv: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
-------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here is some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
    --> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[Geo00433:01196] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here is some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
[Geo00433:1196] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!

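Any clue what the cause is? I have checked many web pages but have not found a solution to my problem.

I have Ubuntu 15.10 installed, along with both mpich and open-mpi.

Thanks a lot, guys!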

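I had the same problem on Ubuntu 16.04 (or a very similar one, with a slightly different error message), even with only Open MPI installed. As far as I can tell, there is a problem with how Ubuntu's mpi4py package is built, but I am not sure exactly what it is.

Reproduction: since the question does not make it entirely clear how the error message was produced (and I don't have the reputation to edit it), here is how I got it. First install Ubuntu's mpi4py package, then start Python: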

$ sudo apt-get install python-mpi4py
$ python

In Python, try the following:

>>> from mpi4py import MPI

You should then get an error message similar to the OP's.

Solution: here is how I got it working. First uninstall Ubuntu's package:

$ sudo apt-get remove python-mpi4py

Then install the Open MPI header files (the next step builds mpi4py from source) and pip:

$ sudo apt-get install libopenmpi-dev python-pip

Finally, install mpi4py:

$ sudo pip install mpi4py

If you now try the Python command above, it should work without errors.
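A failed MPI_Init aborts the whole interpreter rather than raising an exception, so it can be handy to test the import in a child process. The following is only a minimal sketch: the helper name `try_import_mpi4py` is made up for illustration, and `MPI.get_vendor()` is used just to show which MPI implementation mpi4py was built against.

```python
import subprocess
import sys


def try_import_mpi4py(python=sys.executable, timeout=60):
    """Import mpi4py.MPI in a child process and return (ok, output).

    Running the import in a separate process means an abort inside
    MPI_Init only kills the child, not the calling interpreter.
    """
    code = "from mpi4py import MPI; print(MPI.get_vendor())"
    try:
        proc = subprocess.run(
            [python, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    # A non-zero exit code covers both ImportError and an MPI abort.
    return proc.returncode == 0, (proc.stdout + proc.stderr).strip()


if __name__ == "__main__":
    ok, output = try_import_mpi4py()
    print("mpi4py import OK" if ok else "mpi4py import FAILED")
    print(output)
```

If the import fails, the captured output should contain either the Python traceback or the Open MPI error text, which makes it easier to compare before and after reinstalling.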

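As Hristo Iliev pointed out, the error message is indeed related to mismatched .so files. When I compiled the program I use, the compiler picked up the "wrong" OpenMPI on my Linux machine; explicitly specifying which OpenMPI to use solved the problem.

Thanks for your help!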
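To see which MPI library a compiled shared object actually resolves to, one option is to look at its `ldd` output. This is a rough sketch and not part of the answer above: the function names and the sample output are hypothetical, and the prefix list (`libmpi`, `libopen-rte`, `libopen-pal`) is an assumption about typical Open MPI and MPICH library names.

```python
import subprocess


def mpi_libraries(ldd_output):
    """Return (name, path) pairs for MPI-looking libraries in ldd output."""
    found = []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. linux-vdso or the dynamic loader itself
        name, _, rest = line.partition("=>")
        name = name.strip()
        # Drop the trailing load address such as " (0x00007f...)".
        path = rest.strip().split(" (")[0]
        # Assumed prefixes; "libmpi" also matches libmpich.
        if name.startswith(("libmpi", "libopen-rte", "libopen-pal")):
            found.append((name, path))
    return found


def ldd(path):
    """Run ldd on a shared library or executable and return its output."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical ldd output, for illustration only.
SAMPLE_LDD_OUTPUT = (
    "\tlinux-vdso.so.1 (0x00007ffc0a5f2000)\n"
    "\tlibmpi.so.1 => /usr/lib/libmpi.so.1 (0x00007f3a1c000000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3a1bc00000)\n"
)

if __name__ == "__main__":
    for name, path in mpi_libraries(SAMPLE_LDD_OUTPUT):
        print(name, "->", path)
```

On a real system you would call `mpi_libraries(ldd("/path/to/your/library.so"))`. If the path it reports belongs to a different MPI installation than the one your `mpirun` comes from, that mismatch is a likely suspect.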

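I ran into a similar error when using my own Python interface to MPI, wrapped with SWIG. As mentioned above, this error can be caused by having different MPI implementations on the same machine (for example, both OpenMPI and MPICH on your computer).

I solved it by compiling and installing a new version of MPICH, then updating the environment variables in .bashrc and compiling my own program with the new mpicxx or mpicc. After that the error went away.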

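I ran into a similar error when trying to use mpi4py on Ubuntu 16.04 LTS. In my case the error was caused by the mpicc wrapper not being in my search path.

Here is what I did to solve the problem:

  • Uninstall your current mpi4py

$ sudo pip uninstall mpi4py

  • Find the path to your mpicc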

$ which mpicc

  • Reinstall mpi4py with MPICC set to that path

$ sudo env MPICC=/path/to/mpicc pip install mpi4py

After that the error message disappeared and I could run MPI from Python.
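Note that `which mpicc` only prints the first match. If several MPI implementations are installed, it can help to list every `mpicc` on the search path. Below is a small sketch of that idea; the helper name `find_all_on_path` is hypothetical and not part of any MPI tooling.

```python
import os


def find_all_on_path(program, path=None):
    """Return every executable named `program` on PATH, in search order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in hits:  # PATH may list a directory twice
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    for wrapper in ("mpicc", "mpicxx", "mpirun"):
        matches = find_all_on_path(wrapper)
        print(wrapper, "->", matches if matches else "not found on PATH")
```

The first entry in each list is the one the shell would pick. More than one entry suggests several MPI installations are competing, in which case passing the full path through `MPICC` as shown above avoids the ambiguity.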

First try uninstalling your mpi4py:

$ pip uninstall mpi4py

Then:

$ conda install mpi4py

Source: https://nyu-cds.github.io/python-mpi/setup/
