Python C extension link with a custom shared library
I am writing a Python C extension on a very old Red Hat system. The system has zlib 1.2.3, which does not correctly support large files. Unfortunately, I can't just upgrade the system zlib to a newer version, since some of the packages poke into internal zlib structures and that breaks on newer zlib versions.
I would like to build my extension so that all the zlib calls (gzopen(), gzseek(), etc.) are resolved to a custom zlib that I install in my user directory, without affecting the rest of the Python executable and other extensions.
I have tried statically linking in libz.a by adding libz.a to the gcc command line during linking, but it did not work (I still cannot create large files using gzopen(), for example). I also tried passing -z origin -Wl,-rpath=/path/to/zlib -lz to gcc, but that also did not work.
Since newer versions of zlib are still named zlib 1.x, the soname is the same, so I think symbol versioning would not work. Is there a way to do what I want to do?
I am on a 32-bit Linux system. The Python version is 2.6, which is custom-built.
Edit:
I created a minimal example. I am using Cython (version 0.19.1).
File gztest.pyx:
from libc.stdio cimport printf, fprintf, stderr
from libc.string cimport strerror
from libc.errno cimport errno
from libc.stdint cimport int64_t

cdef extern from "zlib.h":
    ctypedef void *gzFile
    ctypedef int64_t z_off_t

    int gzclose(gzFile fp)
    gzFile gzopen(char *path, char *mode)
    int gzread(gzFile fp, void *buf, unsigned int n)
    z_off_t gztell(gzFile fp)
    char *gzerror(gzFile fp, int *errnum)

cdef void print_error(void *gzfp):
    cdef int errnum = 0
    cdef const char *s = gzerror(gzfp, &errnum)
    fprintf(stderr, "error (%d): %s (%d: %s)\n", errno, strerror(errno), errnum, s)

cdef class GzFile:
    cdef gzFile fp
    cdef char *path

    def __init__(self, path, mode='rb'):
        self.path = path
        self.fp = gzopen(path, mode)
        if self.fp == NULL:
            raise IOError('%s: %s' % (path, strerror(errno)))

    cdef int read(self, void *buf, unsigned int n):
        cdef int r = gzread(self.fp, buf, n)
        if r <= 0:
            print_error(self.fp)
        return r

    cdef z_off_t tell(self):
        # Current offset in the uncompressed stream.
        return gztell(self.fp)

    cdef int close(self):
        cdef int r = gzclose(self.fp)
        return 0

def read_test():
    cdef GzFile ifp = GzFile('foo.gz')
    cdef char buf[8192]
    cdef int i, j
    cdef int n
    errno = 0
    for 0 <= i < 0x200:
        for 0 <= j < 0x210:
            n = ifp.read(buf, sizeof(buf))
            if n <= 0:
                break

        if n <= 0:
            break

        printf('%lld\n', <long long>ifp.tell())

    printf('%lld\n', <long long>ifp.tell())
    ifp.close()
File setup.py:
import sys
import os
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

if __name__ == '__main__':
    if 'CUSTOM_GZ' in os.environ:
        d = {
            'include_dirs': ['/home/alok/zlib_lfs/include'],
            'extra_objects': ['/home/alok/zlib_lfs/lib/libz.a'],
            # One list element per compiler argument.
            'extra_compile_args': ['-D_LARGEFILE_SOURCE', '-D_FILE_OFFSET_BITS=64', '-g3', '-ggdb'],
        }
    else:
        d = {'libraries': ['z']}
    ext = Extension('gztest', sources=['gztest.pyx'], **d)
    setup(name='gztest', cmdclass={'build_ext': build_ext}, ext_modules=[ext])
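distutils hands each element of extra_compile_args to the compiler as a single argument, so several flags packed into one string are best split into separate elements. A minimal sketch using the standard shlex module; split_flags is a hypothetical helper name, and the flags string holds the same options used in the setup.py above:

```python
import shlex


def split_flags(flag_string):
    """Split a shell-style flag string into one list element per argument."""
    return shlex.split(flag_string)


flags = '-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -g3 -ggdb'
extra_compile_args = split_flags(flags)
print(extra_compile_args)
```

The resulting list can be passed directly as the extra_compile_args value of the Extension.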
My custom zlib is in /home/alok/zlib_lfs (zlib version 1.2.8):
$ ls ~/zlib_lfs/lib/
libz.a libz.so libz.so.1 libz.so.1.2.8 pkgconfig
To compile the module using this libz.a:
$ CUSTOM_GZ=1 python setup.py build_ext --inplace
running build_ext
cythoning gztest.pyx to gztest.c
building 'gztest' extension
gcc -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/alok/zlib_lfs/include -I/opt/include/python2.6 -c gztest.c -o build/temp.linux-x86_64-2.6/gztest.o -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -g3 -ggdb
gcc -shared build/temp.linux-x86_64-2.6/gztest.o /home/alok/zlib_lfs/lib/libz.a -L/opt/lib -lpython2.6 -o /home/alok/gztest.so
gcc is being passed all the flags I want (adding full path to libz.a, large file flags, etc.).
To build the extension without my custom zlib, I can compile without CUSTOM_GZ defined:
$ python setup.py build_ext --inplace
running build_ext
cythoning gztest.pyx to gztest.c
building 'gztest' extension
gcc -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/include/python2.6 -c gztest.c -o build/temp.linux-x86_64-2.6/gztest.o
gcc -shared build/temp.linux-x86_64-2.6/gztest.o -L/opt/lib -lz -lpython2.6 -o /home/alok/gztest.so
We can check the size of the gztest.so files:
$ stat --format='%s %n' original/gztest.so custom/gztest.so
62398 original/gztest.so
627744 custom/gztest.so
So, the statically linked file is much larger, as expected.
I can now do:
>>> import gztest
>>> gztest.read_test()
and it will try to read foo.gz in the current directory.
When I do that using the non-statically linked gztest.so, it works as expected until it tries to read more than 2 GB.
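As a rough check of why 2 GB is the boundary, the loop bounds in read_test() can be multiplied out and compared with the largest signed 32-bit offset. This is only a sketch: it assumes the system zlib stores file offsets in a signed 32-bit integer, which is typical on 32-bit Linux when large-file support is not compiled in.

```python
BUF_SIZE = 8192           # sizeof(buf) in read_test()
INNER_READS = 0x210       # reads per outer iteration
OUTER_ITERATIONS = 0x200  # outer loop count

total_bytes = BUF_SIZE * INNER_READS * OUTER_ITERATIONS
max_signed_32bit_offset = 2**31 - 1  # assumption: offsets are signed 32-bit

print(total_bytes)                            # 2214592512
print(total_bytes > max_signed_32bit_offset)  # True
```

So a full run of the test reads slightly more than 2 GiB of uncompressed data, enough to cross the limit.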
When I do that using the statically linked gztest.so, it dumps core:
$ python -c 'import gztest; gztest.read_test()'
error (2): No such file or directory (0: )
0
Segmentation fault (core dumped)
The error about No such file or directory is misleading: the file exists and gzopen() actually returns successfully. gzread() fails though.
Here is the gdb backtrace:
(gdb) bt
#0 0xf730eae4 in free () from /lib/libc.so.6
#1 0xf70725e2 in ?? () from /lib/libz.so.1
#2 0xf6ce9c70 in __pyx_f_6gztest_6GzFile_close (__pyx_v_self=0xf6f75278) at gztest.c:1140
#3 0xf6cea289 in __pyx_pf_6gztest_2read_test (__pyx_self=<optimized out>) at gztest.c:1526
#4 __pyx_pw_6gztest_3read_test (__pyx_self=0x0, unused=0x0) at gztest.c:1379
#5 0xf769910d in call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3690
#6 PyEval_EvalFrameEx (f=0x8115c64, throwflag=0) at Python/ceval.c:2389
#7 0xf769a3b4 in PyEval_EvalCodeEx (co=0xf6faada0, globals=0xf6ff81c4, locals=0xf6ff81c4, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2968
#8 0xf769a433 in PyEval_EvalCode (co=0xf6faada0, globals=0xf6ff81c4, locals=0xf6ff81c4) at Python/ceval.c:522
#9 0xf76bbe1a in run_mod (arena=<optimized out>, flags=<optimized out>, locals=<optimized out>, globals=<optimized out>, filename=<optimized out>, mod=<optimized out>) at Python/pythonrun.c:1335
#10 PyRun_StringFlags (str=0x80a24c0 "import gztest; gztest.read_test()\n", start=257, globals=0xf6ff81c4, locals=0xf6ff81c4, flags=0xffbf2888) at Python/pythonrun.c:1298
#11 0xf76bd003 in PyRun_SimpleStringFlags (command=0x80a24c0 "import gztest; gztest.read_test()\n", flags=0xffbf2888) at Python/pythonrun.c:957
#12 0xf76ca1b9 in Py_Main (argc=1, argv=0xffbf2954) at Modules/main.c:548
#13 0x080485b2 in main ()
One of the problems seems to be that the second line in the backtrace refers to libz.so.1! If I do ldd gztest.so, I get, among other lines:
libz.so.1 => /lib/libz.so.1 (0xf6f87000)
I am not sure why that is happening though.
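One way to see which zlib files are actually mapped into the running interpreter, rather than what ldd predicts for a single file, is to read /proc/self/maps on Linux. The sketch below is illustrative only: the helper names are hypothetical and the sample text is made up to resemble that file's format.

```python
def mapped_libraries(maps_text, needle):
    """Return the sorted, unique file paths in /proc/<pid>/maps text containing needle."""
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # The pathname, when present, is the sixth whitespace-separated field.
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5])
    return sorted(paths)


def loaded_in_this_process(needle):
    """Linux only: list mapped files of the current process matching needle."""
    with open('/proc/self/maps') as f:
        return mapped_libraries(f.read(), needle)


# Made-up sample in the /proc/<pid>/maps layout.
sample = (
    "f6f87000-f6f9a000 r-xp 00000000 08:01 1234 /lib/libz.so.1\n"
    "f6f9a000-f6f9b000 rw-p 00013000 08:01 1234 /lib/libz.so.1\n"
    "f7200000-f7350000 r-xp 00000000 08:01 5678 /lib/libc.so.6\n"
)
print(mapped_libraries(sample, 'libz'))
```

Calling loaded_in_this_process('libz') right after import gztest would show whether the system libz.so.1 is present in the process.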
Edit 2:
I ended up doing the following:
- Compiled my custom zlib with all the symbols exported with a z_ prefix. zlib's configure script makes this very easy: just run ./configure --zprefix ...
- Called gzopen64() instead of gzopen() in my Cython code. This is because I wanted to make sure I am using the correct "underlying" symbol.
- Used z_off64_t explicitly.
- Statically linked my custom libz.a into the shared library generated by Cython. I used '-Wl,--whole-archive /home/alok/zlib_lfs_z/lib/libz.a -Wl,--no-whole-archive' while linking with gcc to achieve that. There might be other ways, or this might not be needed, but it seemed the simplest way to make sure the correct library gets used.
With the above changes, large files work while the rest of the Python extension modules/processes work as before.
I would recommend using ctypes. Write your C library as a normal shared library and then use ctypes to access it. You would need to write a bit more Python code to transfer the data from Python data structures into C ones. The big advantage is that you can isolate everything from the rest of the system. You can explicitly specify the *.so file you would like to load. The Python C API is not needed. I have had quite good experiences with ctypes. This should not be too difficult for you since you seem proficient with C.
Looks like this is similar to the problem in another question, except I get the opposite behavior.
I downloaded a tarball of zlib-1.2.8, ran ./configure, then changed the following Makefile variables...
CFLAGS=-O3 -fPIC -D_LARGEFILE64_SOURCE=1 -D_FILE_OFFSET_BITS=64
SFLAGS=-O3 -fPIC -D_LARGEFILE64_SOURCE=1 -D_FILE_OFFSET_BITS=64
...mostly to add the -fPIC to libz.a so I could link to it in a shared library.
I then added some printf() statements in the gzlib.c functions gzopen(), gzopen64(), and gz_open() so I could easily tell if these were being called.
After building libz.a and libz.so, I created a really simple foo.c...
#include "zlib-1.2.8/zlib.h"

int main(void)
{
    gzFile foo = gzopen("foo.gz", "rb");
    return 0;
}
...and compiled both a foo standalone binary, and a foo.so shared library with...
gcc -fPIC -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -o foo.o -c foo.c
gcc -o foo foo.o zlib-1.2.8/libz.a
gcc -shared -o foo.so foo.o zlib-1.2.8/libz.a
Running foo worked as expected, and printed...
gzopen64
gz_open
...but using the foo.so in Python with...
import ctypes
foo = ctypes.CDLL('./foo.so')
foo.main()
...didn't print anything, so I guess it's using Python's libz.so...
$ ldd `which python`
...
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f5af2c68000)
...
...even though foo.so doesn't use it...
$ ldd foo.so
linux-vdso.so.1 => (0x00007fff93600000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc8bfa98000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc8c0078000)
The only way I could get it to work was to open the custom libz.so directly with...
import ctypes
libz = ctypes.CDLL('zlib-1.2.8/libz.so.1.2.8')
libz.gzopen64('foo.gz', 'rb')
...which printed out...
gzopen64
gz_open
Note that the translation from gzopen to gzopen64 is done by the pre-processor, so I had to call gzopen64() directly.
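Because ctypes bypasses the header, the choice between the plain and the 64-suffixed symbol has to be made by hand. A small sketch of one way to do that: prefer the 64-suffixed name when the library exports it. pick_symbol is a hypothetical helper, and the C library is used as a stand-in so the example runs without a custom zlib.

```python
import ctypes
import ctypes.util


def pick_symbol(lib, base_name):
    """Return base_name + '64' if lib exports it, otherwise base_name."""
    wide_name = base_name + '64'
    # ctypes raises AttributeError for unknown symbols, so hasattr() works.
    if hasattr(lib, wide_name):
        return wide_name
    return base_name


libc = ctypes.CDLL(ctypes.util.find_library('c'))
print(pick_symbol(libc, 'strlen'))  # strlen (there is no strlen64)
```

With a large-file zlib loaded as libz, pick_symbol(libz, 'gzopen') would be expected to return 'gzopen64'; the function looked up that way should also get restype set to ctypes.c_void_p, since it returns a pointer.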
So that's one way to fix it, but a better way would probably be to recompile your custom Python 2.6 to either link to the static zlib-1.2.8/libz.a, or disable zlibmodule.c completely; then you'll have more flexibility in your linking options.
Update
Regarding _LARGEFILE_SOURCE vs. _LARGEFILE64_SOURCE: I only pointed that out because of this comment in zlib.h...
/* provide 64-bit offset functions if _LARGEFILE64_SOURCE defined, and/or
* change the regular functions to 64 bits if _FILE_OFFSET_BITS is 64 (if
* both are true, the application gets the *64 functions, and the regular
* functions are changed to 64 bits) -- in case these are set on systems
* without large file support, _LFS64_LARGEFILE must also be true
*/
...the implication being that the gzopen64() function won't be exposed if you don't define _LARGEFILE64_SOURCE. I'm not sure if _LFS64_LARGEFILE applies to your system or not.
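The rules stated in that comment can be written out as a small truth function. This models only the wording of the comment, not the header's actual preprocessor logic, and the function and key names are made up for illustration.

```python
def visible_api(defines):
    """Model the zlib.h comment: what a given set of macro definitions enables."""
    lfs64 = bool(defines.get('_LFS64_LARGEFILE', 0))
    provides_64_functions = '_LARGEFILE64_SOURCE' in defines and lfs64
    regular_functions_64bit = defines.get('_FILE_OFFSET_BITS') == 64 and lfs64
    return {
        'provides_64_functions': provides_64_functions,
        'regular_functions_64bit': regular_functions_64bit,
    }


both = visible_api({'_LARGEFILE64_SOURCE': 1, '_FILE_OFFSET_BITS': 64,
                    '_LFS64_LARGEFILE': 1})
no_lfs = visible_api({'_LARGEFILE64_SOURCE': 1, '_FILE_OFFSET_BITS': 64})
print(both)
print(no_lfs)
```

In this reading, defining both macros has no effect unless _LFS64_LARGEFILE is also true.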