Python extension (Boost.Python & Py++) and dlopen confusion

I'm wrapping a C++ project with Py++/Boost.Python under Windows and Linux. Everything works fine on Windows, but I'm a bit confused by the behavior on Linux. The C++ project is built into a single shared library called libsimif, but I'd like to split it up into 3 separate extension modules. For simplicity, I'll only discuss two of them, since the behavior for the third is identical. The first, called storage, contains definitions of data structures. It has no dependencies on anything defined in either of the other two extension modules. The second module, control, uses data structures that are defined in storage. On the C++ side of things, the headers and source files for storage and control are in entirely different directories. I've tried a number of different configurations to build the extensions, but one thing has remained consistent: for storage, I only generate Py++ wrappers for the headers in the storage directory, and I only build the source files in that directory along with the Py++ generated sources. Ditto for the control extension.

The configuration I am currently using, which works, passes libsimif as a library to the distutils.Extension constructor. Before starting Python, I need to ensure that libsimif can be found via LD_LIBRARY_PATH. Then I can launch Python and import either module (or import from them) and everything works as expected. Here is some sample output from this working configuration:

>>> import ast.simif.model_io.storage as storage
>>> import ast.simif.model_io.control as control
>>> dir(storage)
['DiscreteStore', 'PulseStore', 'RtStore', 'SerialStore', 'SharedMemoryBuilder', 'SharedMemoryDeleter', 'SpaceWireStore', '__doc__', '__file__', '__name__', '__package__']
>>> dir(control)
['DiscreteController', 'ModelIoController', 'PulseController', 'RtController', 'SerialController', 'SpaceWireController', '__doc__', '__file__', '__name__', '__package__']
>>> storage.__file__
'ast/simif/model_io/storage.so'
>>> control.__file__
'ast/simif/model_io/control.so'

As you can see, both modules have their own shared library and unique set of symbols. Now here is why I am confused. In Linux, we've always set the dlopen flags to include RTLD_NOW and RTLD_GLOBAL. If I do that, this is what happens:

>>> import sys
>>> import DLFCN
>>> sys.setdlopenflags(DLFCN.RTLD_NOW | DLFCN.RTLD_GLOBAL)
>>> import ast.simif.model_io.storage as storage
>>> import ast.simif.model_io.control as control
__main__:1: RuntimeWarning: to-Python converter for DiscreteStore::FrameData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for PulseStore::FrameData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for RtStore::Link already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for RtStore::FrameData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for RtStore::RtData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SerialStore::FrameData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SharedMemoryBuilder already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SharedMemoryDeleter already registered; second conversion method ignored.
>>> dir(storage)
['DiscreteStore', 'PulseStore', 'RtStore', 'SerialStore', 'SharedMemoryBuilder', 'SharedMemoryDeleter', 'SpaceWireStore', '__doc__', '__file__', '__name__', '__package__']
>>> dir(control)
['DiscreteStore', 'PulseStore', 'RtStore', 'SerialStore', 'SharedMemoryBuilder', 'SharedMemoryDeleter', '__doc__', '__file__', '__name__', '__package__']
>>> storage.__file__
'ast/simif/model_io/storage.so'
>>> control.__file__
'ast/simif/model_io/control.so'

So here storage imports OK, but control complains about a bunch of duplicate registrations. Then, when inspecting the modules, control is completely wrong. It's as if it tried to import storage twice, even though __file__ reports the correct shared libraries. Perhaps not surprisingly, if I change the import order and import control ahead of storage, this is what happens:

>>> import sys
>>> import DLFCN
>>> sys.setdlopenflags(DLFCN.RTLD_NOW | DLFCN.RTLD_GLOBAL)
>>> import ast.simif.model_io.control as control
>>> dir(control)
['DiscreteController', 'ModelIoController', 'PulseController', 'RtController', 'SerialController', 'SpaceWireController', '__doc__', '__file__', '__name__', '__package__']
>>> import ast.simif.model_io.storage as storage
__main__:1: RuntimeWarning: to-Python converter for DiscreteController already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for PulseController already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for RtController already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SerialController already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SpaceWireController already registered; second conversion method ignored.
>>> dir(storage)
['DiscreteController', 'ModelIoController', 'PulseController', 'RtController', 'SerialController', 'SpaceWireController', 'SpaceWireStore', '__doc__', '__file__', '__name__', '__package__']

Similar behavior, but now the storage import is FUBAR. Does anyone understand what is going on here?

I'm using:

  • x64 Python 2.6.6 on x64 RHEL6, GCC version 4.4.6
  • x64 Python 2.6.5 on x64 RHEL5, GCC version 4.1.2

Turns out this was actually due to a quirk in how the Boost.Python registration code is generated when using balanced_split_module in Py++. balanced_split_module basically splits all of the registration code into a fixed number of source files, each with its own registration function. The source files are named using the extension name plus the generated file number (e.g. _.cpp), but the gotcha is that the functions they contain do not include the extension name and are just a simple register_1(), register_2(), etc. This is fine and dandy when you are only importing a single module or not making a module's symbols global. With RTLD_GLOBAL set, however, the first module imports successfully, but all subsequent modules then call the registration functions that were loaded as part of the initial module.
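To make the name clash concrete, here is a toy model in plain Python. It does not load any shared libraries; it only imitates the "first definition loaded wins" lookup that RTLD_GLOBAL causes for identically named functions. The simulate() helper, the way classes are divided among register_N() functions, and the number of generated files per module (three for storage, two for control) are all assumptions made for illustration, not details taken from the real project.

```python
# Toy model only: each "module" is a dict mapping a generated function name
# to the classes that function would register. The split is hypothetical.
STORAGE = {
    "register_1": ["DiscreteStore", "PulseStore"],
    "register_2": ["RtStore", "SerialStore"],
    "register_3": ["SpaceWireStore"],
}
CONTROL = {
    "register_1": ["DiscreteController", "PulseController"],
    "register_2": ["RtController", "SerialController"],
}


def simulate(order, global_symbols):
    """Return {module name: sorted class names the module ends up exposing}."""
    global_table = {}  # symbol -> implementation; the first one loaded wins
    result = {}
    for name, register_funcs in order:
        if global_symbols:
            # Publish this module's symbols, but never override earlier ones.
            for symbol, classes in register_funcs.items():
                global_table.setdefault(symbol, classes)
        exposed = []
        # The module's init function calls each of its register_N() by name.
        for symbol, own_classes in register_funcs.items():
            resolved = global_table[symbol] if global_symbols else own_classes
            exposed.extend(resolved)
        result[name] = sorted(exposed)
    return result


if __name__ == "__main__":
    both = [("storage", STORAGE), ("control", CONTROL)]
    print(simulate(both, global_symbols=False))
    print(simulate(both, global_symbols=True))
    print(simulate(list(reversed(both)), global_symbols=True))
```

Under these assumptions the model shows the same pattern as the sessions in the question: with global symbols, whichever module is imported second ends up running the first module's register_N() functions, and it only runs its own code for any function number the first module did not define. That would explain why SpaceWireStore is missing from control in one ordering yet still appears in storage in the other.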
