Importing C modules from an embedded Python interpreter (pybind11) in a shared object raises an undefined symbol error

Update (1): The same problem can be seen with some compiled stdlib modules, so it is not related to numpy (I'm removing the numpy tag and numpy from the title).

I'm writing a shared object (a plugin for another piece of software) that contains an embedded Python interpreter. The shared object launches the interpreter, and the interpreter imports a Python module to be executed. If the imported module imports numpy, I get an undefined symbol error. The actual undefined symbol depends on the Python version or the numpy version, but it is always a struct of the PyExc_* family.

I've reduced the issue to this minimal example (it actually comprises two files):

// main.cc
#include "pybind11/embed.h"
namespace py = pybind11;

extern "C" {
int main() {
  py::scoped_interpreter guard{};
  auto py_module = py::module::import("numpy");
  auto version   = py_module.attr("__version__");
  py::print(version);
  return 0;
}
}

// load.cc
#include <dlfcn.h>

int main() {
  void * lib = dlopen("./libissue.so", RTLD_NOW);
  int(*fnc)(void) = (int(*)(void))dlsym(lib, "main");
  fnc();
  dlclose(lib);
  return 0;
}

which I'm compiling with this CMake file:

cmake_minimum_required(VERSION 3.14)

include(FetchContent)
FetchContent_Declare(
  pybind11
  GIT_REPOSITORY https://github.com/pybind/pybind11
  GIT_TAG v2.8.1)
FetchContent_MakeAvailable(pybind11)

project(
  pybind_issue
  LANGUAGES C CXX
  VERSION 1.0.0)

add_library(issue SHARED main.cc)
set_target_properties(issue PROPERTIES 
  POSITION_INDEPENDENT_CODE ON 
  CXX_STANDARD 11)
target_link_libraries(issue PRIVATE pybind11::embed)
# also tested with
# target_link_libraries(main PRIVATE mylib pybind11::lto pybind11::embed pybind11::module)

add_executable(issue_main main.cc)
set_target_properties(issue_main PROPERTIES 
  POSITION_INDEPENDENT_CODE ON
  CXX_STANDARD 11)
target_link_libraries(issue_main PRIVATE pybind11::embed)

add_executable(loader load.cc)
target_link_libraries(loader PRIVATE ${CMAKE_DL_LIBS})

This CMake file builds three targets:

  • an executable that loads the interpreter, imports numpy and prints its version
  • a shared object that exports a C function that does exactly the same thing
  • a simple loader for the shared object, that tries to run the exported function "main" from the shared object.
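
The loader shown earlier calls dlopen and dlsym without checking their results, so a missing library or symbol would show up as a crash rather than as a message. A minimal sketch of a more defensive variant follows. The helper name call_exported is hypothetical and not part of the original example:

```cpp
#include <dlfcn.h>
#include <cassert>
#include <cstdio>

// Hypothetical helper: loads `path`, looks up the exported function `name`
// (assumed to have the signature int(void)), calls it and returns its result.
// Returns -1 if the library cannot be loaded or the symbol cannot be found.
int call_exported(const char *path, const char *name) {
  assert(path != nullptr && name != nullptr);

  void *lib = dlopen(path, RTLD_NOW);
  if (lib == nullptr) {
    const char *err = dlerror();
    std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
    return -1;
  }

  dlerror();  // clear any stale error state before the lookup
  void *sym = dlsym(lib, name);
  if (sym == nullptr) {
    const char *err = dlerror();
    std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is null");
    dlclose(lib);
    return -1;
  }

  auto fnc = reinterpret_cast<int (*)(void)>(sym);
  int result = fnc();
  dlclose(lib);
  return result;
}
```

With such a helper, the body of the loader's main could reduce to a single call, call_exported("./libissue.so", "main"). This only makes loader failures visible; it does not change the import error reported further down.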

If I run the issue_main executable, I get the numpy version on screen correctly. If I run loader I get this error:

terminate called after throwing an instance of 'pybind11::error_already_set'
  what():  ImportError: 

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.8 from "/usr/bin/python3"
  * The NumPy version is: "1.20.3"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: /usr/local/lib/python3.8/dist-packages/numpy/core/_multiarray_umath.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyExc_RecursionError


At:
  /usr/local/lib/python3.8/dist-packages/numpy/core/__init__.py(51): <module>
  <frozen importlib._bootstrap>(219): _call_with_frames_removed
  <frozen importlib._bootstrap_external>(848): exec_module
  <frozen importlib._bootstrap>(686): _load_unlocked
  <frozen importlib._bootstrap>(975): _find_and_load_unlocked
  <frozen importlib._bootstrap>(991): _find_and_load
  <frozen importlib._bootstrap>(219): _call_with_frames_removed
  <frozen importlib._bootstrap>(1050): _handle_fromlist
  /usr/local/lib/python3.8/dist-packages/numpy/__init__.py(145): <module>
  <frozen importlib._bootstrap>(219): _call_with_frames_removed
  <frozen importlib._bootstrap_external>(848): exec_module
  <frozen importlib._bootstrap>(686): _load_unlocked
  <frozen importlib._bootstrap>(975): _find_and_load_unlocked
  <frozen importlib._bootstrap>(991): _find_and_load
  <frozen importlib._bootstrap>(219): _call_with_frames_removed
  <frozen importlib._bootstrap>(961): _find_and_load_unlocked
  <frozen importlib._bootstrap>(991): _find_and_load

The problem is specific to Linux (not tested on macOS); everything works as expected on Windows (the code changes slightly and is reported here for completeness):

// main.cc
#include "pybind11/embed.h"
namespace py = pybind11;

extern "C" {
__declspec(dllexport) int main() {
  py::scoped_interpreter guard{};
  auto py_module = py::module::import("numpy");
  auto version   = py_module.attr("__version__");
  py::print(version);
  return 0;
}
}

// load.cc
#include <windows.h>

int main() {
  HMODULE lib = LoadLibrary("./issue.dll");
  int(*fnc)(void) = (int(*)(void))GetProcAddress(lib, "main");
  fnc();
  FreeLibrary(lib);
  return 0;
}

Is there something that I'm missing?

Notes:

  • My first thought was a bug in the pybind11 CMake support, which is why I filed this bug report
  • My problem seems similar to the one described in this bug report, but I'm not sure, and I'm not even sure it is a bug
  • The problem is similar to the one described here, but I don't think I'm loading the interpreter more than once in the minimal example. I think I have seen an SO question about the same problem with the same solution (do not load the interpreter more than once), but I cannot find the reference now.
  • I've tested with several numpy versions (from 1.19 to 1.22, installed from the Ubuntu repository, installed from pip, and built locally), but the problem remained. Only the undefined symbol changed (though it was always a PyExc_* one)
  • Tested with Python 3.6 and 3.8 on Ubuntu 18.04 and Ubuntu 20.04
  • Tested with pybind11 2.6, 2.7 and 2.8.1
  • I tried to link against the static Python library, but it was not compiled with -fPIC, so the build fails

Notes on Update (1): this does not appear to be tied to numpy alone. If I import decimal (a stdlib numeric module with a C component), I get a similar error:

#include "pybind11/embed.h"
namespace py = pybind11;

extern "C" {
int main() {
  py::scoped_interpreter guard{};
  auto py_module = py::module::import("decimal");
  auto version   = py_module.attr("__name__");
  py::print(version);
  return 0;
}
}

This gives me:

terminate called after throwing an instance of 'pybind11::error_already_set'
  what():  ImportError: /usr/lib/python3.8/lib-dynload/_contextvars.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyContextVar_Type

At:
  /usr/lib/python3.8/contextvars.py(1): <module>
  <frozen importlib._bootstrap>(219): _call_with_frames_removed
  <frozen importlib._bootstrap_external>(848): exec_module
  <frozen importlib._bootstrap>(686): _load_unlocked
  <frozen importlib._bootstrap>(975): _find_and_load_unlocked
  <frozen importlib._bootstrap>(991): _find_and_load
  /usr/lib/python3.8/_pydecimal.py(440): <module>
  <frozen importlib._bootstrap>(219): _call_with_frames_removed
  <frozen importlib._bootstrap_external>(848): exec_module
  <frozen importlib._bootstrap>(686): _load_unlocked
  <frozen importlib._bootstrap>(975): _find_and_load_unlocked
  <frozen importlib._bootstrap>(991): _find_and_load
  /usr/lib/python3.8/decimal.py(8): <module>
  <frozen importlib._bootstrap>(219): _call_with_frames_removed
  <frozen importlib._bootstrap_external>(848): exec_module
  <frozen importlib._bootstrap>(686): _load_unlocked
  <frozen importlib._bootstrap>(975): _find_and_load_unlocked
  <frozen importlib._bootstrap>(991): _find_and_load

[1]    3095287 abort (core dumped)  ./loader
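
Both failures have the same shape: an extension module (_multiarray_umath or _contextvars) cannot resolve a symbol that libpython exports. A likely explanation, stated here as an assumption rather than something the traces prove, is that these modules are not linked against libpython themselves and expect its symbols in the process-wide scope, whereas dlopen without RTLD_GLOBAL keeps a library and its dependencies out of that scope. The sketch below shows one way to check this from the host executable. The helper name is_globally_visible is hypothetical:

```cpp
#include <dlfcn.h>
#include <cassert>

// Hypothetical diagnostic: returns true if `symbol` can be resolved through
// the default (process-wide) lookup order, which covers the executable, its
// direct dependencies and anything loaded with RTLD_GLOBAL.
bool is_globally_visible(const char *symbol) {
  assert(symbol != nullptr);
  dlerror();  // clear any stale error state
  return dlsym(RTLD_DEFAULT, symbol) != nullptr;
}
```

In load.cc, one might call is_globally_visible("PyExc_RecursionError") right after dlopen returns; under the assumption above it would report false when the library was opened with plain RTLD_NOW. Note that glibc resolves RTLD_DEFAULT relative to the calling object, so the same call made from inside libissue.so may still succeed through that library's own dependencies.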

I've found a solution. Knowing that the problem was not tied to numpy helped a lot in shifting the focus to the real problem: missing symbols. I took the suggestion from this answer, in particular this point:

Solve a problem. Load the library found in step 1 by dlopen first (use RTLD_GLOBAL there as well).

I've modified the minimal example as follows:

// main.cc
#include "pybind11/embed.h"
#include <dlfcn.h>
namespace py = pybind11;

extern "C" {
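void * python = nullptr;

int create() {
  // Load libpython into the global symbol scope (RTLD_GLOBAL) before the
  // interpreter imports any extension module. The path is system-specific.
  python = dlopen("/usr/lib/x86_64-linux-gnu/libpython3.8.so", RTLD_NOW | RTLD_GLOBAL);
  return python != nullptr ? 0 : -1;
}

int destroy() {
  // Release the handle taken in create(), if any.
  if (python != nullptr) {
    dlclose(python);
    python = nullptr;
  }
  return 0;
}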

int main() {
  py::scoped_interpreter guard{};
  auto py_module = py::module::import("numpy");
  auto version   = py_module.attr("__version__");
  py::print(version);
  return 0;
}
}

// load.cc
#include <dlfcn.h>

int main() {
  void * lib = dlopen("./libissue.so", RTLD_NOW | RTLD_DEEPBIND);
  int(*fnc)(void) = (int(*)(void))dlsym(lib, "main");
  int(*create)(void) = (int(*)(void))dlsym(lib, "create");
  int(*destroy)(void) = (int(*)(void))dlsym(lib, "destroy");
  create();
  fnc();
  destroy();
  dlclose(lib);
  return 0;
}

(Obviously, in CMake I had to add ${CMAKE_DL_LIBS} to the libraries linked by the issue target, since main.cc now calls dlopen.)
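
The path to libpython in create() is hard-coded. One possible refinement, sketched here under assumptions and not tested against the original setup, relies on the documented behaviour of RTLD_NOLOAD: re-opening an object that is already loaded with RTLD_NOLOAD | RTLD_GLOBAL promotes it to the global scope without loading anything new. The helper name promote_to_global is hypothetical:

```cpp
#include <dlfcn.h>
#include <cassert>

// Hypothetical helper: re-opens an already loaded shared object by name and
// adds RTLD_GLOBAL to its flags. Because of RTLD_NOLOAD nothing new is
// loaded; the call returns nullptr if no such object is resident.
void *promote_to_global(const char *soname) {
  assert(soname != nullptr);
  return dlopen(soname, RTLD_NOW | RTLD_GLOBAL | RTLD_NOLOAD);
}
```

In create(), the call could then read python = promote_to_global("libpython3.8.so.1.0"). The soname used here is an assumption; the actual one can be checked with ldd libissue.so. Since libissue.so already depends on libpython, the object is expected to be resident by the time create() runs.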
