Pass FILE * into function from Python / ctypes

I have a library function (written in C) that generates text by writing the output to a FILE *. I want to wrap this in Python (2.7.x) with code that creates a temp file or pipe, passes it into the function, reads the result from the file, and returns it as a Python string.

Here's a simplified example to illustrate what I'm after:

/* Library function */
void write_numbers(FILE * f, int arg1, int arg2)
{
   fprintf(f, "%d %d\n", arg1, arg2);
}

Python wrapper:

import os
from ctypes import *
mylib = CDLL('mylib.so')


def write_numbers( a, b ):
   rd, wr = os.pipe()

   write_fp = MAGIC_HERE(wr)
   mylib.write_numbers(write_fp, a, b)
   os.close(wr)

   read_file = os.fdopen(rd)
   res = read_file.read()
   read_file.close()

   return res

#Should result in '1 2\n' being printed.
print write_numbers(1,2)

I'm wondering what my best bet is for MAGIC_HERE().

I'm tempted to just use ctypes and create a libc.fdopen() wrapper that returns a ctypes c_void_p, then pass that into the library function. That seems like it should be safe in theory; I'm just wondering whether there are issues with that approach, or an existing Python-ism that solves this problem.

Also, this will go in a long-running process (let's just assume "forever"), so any leaked file descriptors are going to be problematic.

First, do note that FILE* is an stdio-specific entity. It doesn't exist at the system level. What exists at the system level are descriptors in UNIX (retrieved with file.fileno(); os.pipe() already returns plain descriptors) and handles in Windows (retrieved with msvcrt.get_osfhandle()). Thus FILE* is a poor choice as an inter-library exchange format if more than one C runtime can be in play. You'll be in trouble if your library is compiled against a different C runtime than your copy of Python: 1) the binary layout of the structure may differ (e.g. due to alignment, additional members for debugging purposes, or even different type sizes); 2) in Windows, the file descriptors that the structure links to are C-specific entities as well, and their table is maintained internally by a C runtime.

Moreover, in Python 3, I/O was overhauled in order to untangle it from stdio. So, FILE* is alien to that Python flavor (and likely to most non-C flavors, too).

Now, what you need is to

  • somehow guess which C runtime you need, and
  • call its fdopen() (or equivalent).

(One of Python's mottoes is "make the right thing easy and the wrong thing hard", after all.)


The cleanest method is to use the precise instance that the library is linked to (do pray that it's linked with it dynamically, or there'll be no exported symbol to call).

For the 1st item, I couldn't find any Python modules that can analyze a loaded dynamic module's metadata to find out which DLLs/.so's it has been linked with (just a name, or even name+version, isn't enough, you know, because there may be multiple instances of the library on the system). It's definitely possible, though, since information about the file formats is widely available.

For the 2nd item, it's a trivial ctypes.CDLL('path').fdopen (_fdopen for MSVCRT).
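As a rough illustration of the two steps above, here is a minimal sketch. It assumes a POSIX system where the library and Python both use the default C library, so that `ctypes.util.find_library("c")` happens to pick the right runtime (which is exactly the guess warned about above). `capture_file_output` is a hypothetical helper name, and libc's own `fputs` stands in for the library's write_numbers().

```python
import ctypes
import ctypes.util
import os

# Assumption: the default libc is the runtime the target library is linked to.
# CDLL(None) falls back to the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None, use_errno=True)

# Declare pointer-sized types explicitly; the default int return type
# would truncate a FILE* on 64-bit platforms.
libc.fdopen.argtypes = (ctypes.c_int, ctypes.c_char_p)
libc.fdopen.restype = ctypes.c_void_p
libc.fclose.argtypes = (ctypes.c_void_p,)
libc.fclose.restype = ctypes.c_int
libc.fputs.argtypes = (ctypes.c_char_p, ctypes.c_void_p)
libc.fputs.restype = ctypes.c_int


def capture_file_output(writer):
    """Call writer(fp) with a FILE* attached to a pipe; return the bytes written."""
    rd, wr = os.pipe()
    reader = os.fdopen(rd, "rb")        # Python owns the read end from here on
    try:
        fp = libc.fdopen(wr, b"w")      # C runtime owns the write end on success
        if not fp:
            err = ctypes.get_errno()
            os.close(wr)
            raise OSError(err, os.strerror(err))
        try:
            writer(fp)
        finally:
            libc.fclose(fp)             # flushes stdio's buffer and closes wr
        return reader.read()
    finally:
        reader.close()                  # no descriptor is left open on any path


# Stand-in for mylib.write_numbers(fp, 1, 2)
result = capture_file_output(lambda fp: libc.fputs(b"1 2\n", fp))
```

Because nothing reads the pipe until the writer has finished, this sketch only suits output that fits in the pipe buffer (commonly 64 KiB on Linux); for larger output, a temporary file or a reader thread would be needed.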


Second, you can write a small helper module that is compiled against the same (or a guaranteed compatible) runtime as the library and does the conversion from the aforementioned descriptor/handle for you. This is effectively a workaround for editing the library proper.


Finally, there's the simplest (and the dirtiest) method: using Python's C runtime instance (so all the above warnings apply in full) through the Python C API, available via ctypes.pythonapi. It takes advantage of

  • the fact that Python 2's file-like objects are wrappers over stdio's FILE* (Python 3's are not)
  • the PyFile_AsFile API, which returns the wrapped FILE* (note that it's missing from Python 3)
    • for a standalone fd, you need to construct a file-like object first (so that there would be a FILE* to return ;) )
  • the fact that the id() of an object is its memory address (CPython-specific)

    >>> import ctypes
    >>> open("test.txt")
    <open file 'test.txt', mode 'r' at 0x017F8F40>
    >>> f=_
    >>> f.fileno()
    3
    >>> ctypes.pythonapi
    <PyDLL 'python dll', handle 1e000000 at 12808b0>
    >>> api=_
    >>> api.PyFile_AsFile
    <_FuncPtr object at 0x018557B0>
    >>> api.PyFile_AsFile.restype=ctypes.c_void_p   # as per ctypes docs,
                                                    # pythonapi assumes all fns
                                                    # to return int by default
    >>> api.PyFile_AsFile.argtypes=(ctypes.c_void_p,)   # as of 2.7.10, long integers are
                                                        # silently truncated to ints,
                                                        # see http://bugs.python.org/issue24747
    >>> api.PyFile_AsFile(id(f))
    2019259400

Do keep in mind that with fds and C pointers, you need to ensure proper object lifetimes by hand!

  • file-like objects returned by os.fdopen() do close the descriptor on .close()
    • so duplicate descriptors with os.dup() if you need them after a file object is closed/garbage collected
  • while working with the C structure, adjust the corresponding object's reference count with PyFile_IncUseCount() / PyFile_DecUseCount().
  • ensure there is no other I/O on the descriptors/file objects, since it would screw up the data (e.g. once iter(f) / for l in f has been called, internal caching is done that's independent from stdio's caching)
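The first point in the list can be illustrated with a short sketch. It only assumes a platform with os.pipe() and os.dup(); `write_after_close` is a hypothetical name used for illustration.

```python
import os


def write_after_close():
    """Show that os.dup() keeps a descriptor usable after the file object closes."""
    rd, wr = os.pipe()
    wr_copy = os.dup(wr)            # independent descriptor for the same pipe end

    f = os.fdopen(wr, "wb")
    f.write(b"first\n")
    f.close()                       # flushes and closes wr; wr_copy stays valid

    os.write(wr_copy, b"second\n")  # still works through the duplicate
    os.close(wr_copy)               # last write end closed, so the reader sees EOF

    with os.fdopen(rd, "rb") as reader:
        return reader.read()


data = write_after_close()
```

Every descriptor created here is closed explicitly, which matters in a long-running process: each os.dup() is one more descriptor that has to be closed by hand.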
