
Using swig, how do I pass binary data from python to C/C++?

If I have a C or C++ function like this in my my_module.h:

void my_function(const char* data, int len);

And my_module.c:

#include "my_module.h"

#include <unistd.h>

void my_function(const char* data, int len)
{
    // do fancy things with data
    write(1, data, len);
}

I will create a my_module.i for swig like this:

%inline %{
#include "my_module.h"
%}
%include "my_module.h"

I build like this:

swig -python -module my_module my_module.i

gcc -shared -fPIC my_module_wrap.c my_module.c -I/usr/include/python3.8 -lpython3.8 -o _my_module.so

Now from python I want to do this:

import my_module

in_file = open("my_binary_file", "rb")
bytes = in_file.read()
in_file.close()

my_module.my_function(bytes, len(bytes))

But I get:

TypeError: in method 'my_function', argument 1 of type 'char const *'

I inspected the type of the variable bytes:

>>> type(bytes)
<class 'bytes'>

I don't know what that means. How do I just get the raw data to pass to C?

I don't want to convert the bytes to a string, because it is not text. And when I tried to convert it anyway, the C side received text that encodes the binary instead of the raw binary. Something that looks like this:

b'\x\x00\n'
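To illustrate why the string conversion goes wrong, here is a minimal sketch in plain Python, with no SWIG involved. The sample bytes are made up for the example. Calling `str()` on a bytes object produces the printable representation of the object rather than decoding it, so the C side would receive the characters of that representation:

```python
# Three raw bytes: NUL, 0x01, newline (sample data, chosen for illustration).
data = b"\x00\x01\n"

# str() on bytes returns the repr text, starting with the literal characters b and '.
as_text = str(data)

print(len(data))     # number of raw bytes
print(len(as_text))  # number of characters in the textual representation
print(as_text)
```

The two lengths differ, which is a quick way to confirm that the text form is not the original data.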

Edit:

Reading the manual I found this:

The char * datatype is handled as a NULL-terminated ASCII string. SWIG maps this into a 8-bit character string in the target scripting language. SWIG converts character strings in the target language to NULL terminated strings before passing them into C/C++. The default handling of these strings does not allow them to have embedded NULL bytes. Therefore, the char * datatype is not generally suitable for passing binary data. However, it is possible to change this behavior by defining a SWIG typemap. See the chapter on Typemaps for details about this.
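The effect of NULL termination on binary data can be seen without SWIG. The sketch below uses the standard-library `ctypes` module purely as a stand-in for "a C string view of a buffer"; it is not how SWIG is implemented, and the payload is invented for the example:

```python
import ctypes

# Binary payload with an embedded NUL byte (sample data).
payload = b"ab\x00cd"

# A C char array initialised from the payload, plus one trailing NUL.
buf = ctypes.create_string_buffer(payload)

# .value reads the buffer the way C string functions do: up to the first NUL.
as_c_string = buf.value

# .raw returns every byte in the buffer; slice off the extra trailing NUL.
full = buf.raw[:len(payload)]

print(as_c_string)
print(full)
```

Anything after the first zero byte is invisible to code that treats the pointer as a C string, which is why an explicit length has to travel with the data.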

So swig says char* is good for text, but not for binary data. What is the alternative to char* that works then?

It suggests changing swig's default behavior using typemaps. Is this the only way? If so, how is it done? And will it be a different hack for each language?

After many hours of frustration, I finally did it. Hopefully this answer saves other people some time and frustration.

Note that this answer works for python 3. I don't know how things would work with python 2.

From the docs:

In some cases, users may wish to instead handle all byte strings as bytes objects in Python 3. This can be accomplished by adding SWIG_PYTHON_STRICT_BYTE_CHAR to the generated code:

This means you just need this in your interface file:

%begin %{
#define SWIG_PYTHON_STRICT_BYTE_CHAR
%}

This will modify the behavior so that only Python 3 bytes objects will be accepted and converted to a C/C++ string, and any string returned from C/C++ will be converted to a bytes object in Python.
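Since the docs quote says only bytes objects are accepted in this mode, callers that hold a `str` or a `bytearray` may want to normalise their input first. The following is a rough sketch of one way to do that; `ensure_bytes` is a hypothetical helper name and is not part of SWIG or of the wrapped module:

```python
def ensure_bytes(value, encoding="utf-8"):
    """Return value as an immutable bytes object.

    Hypothetical helper: str is encoded explicitly, other bytes-like
    objects are copied into bytes, anything else is rejected.
    """
    if isinstance(value, bytes):
        return value
    if isinstance(value, (bytearray, memoryview)):
        return bytes(value)
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError(f"expected bytes-like or str, got {type(value).__name__}")


# Sample input: a mutable buffer holding binary data.
payload = ensure_bytes(bytearray(b"\x00\x01\n"))
print(type(payload), len(payload))
```

The result could then be handed to the wrapped function, for example `my_module.my_function(payload, len(payload))`, assuming the module was built with the interface file shown above.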

Bonus:

If you don't want to pass both the binary data and its length from python, you can do this in the interface file:

%include "typemaps.i"

// change "(const char* data, int len)" to match your function's declaration
%apply (char *STRING, size_t LENGTH) { (const char* data, int len) }

%include "my_module.h"

Now from python you can do this:

my_module.my_function(bytes) instead of my_module.my_function(bytes, len(bytes))

So my final interface file looks like this:
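Put together, the Python side could look roughly like the sketch below. `send_file` is a hypothetical helper name. The wrapped function is passed in as a parameter, and a plain list's `append` stands in for `my_module.my_function` so the example runs without the compiled extension:

```python
import os
import tempfile


def send_file(path, func):
    """Read path as raw bytes and pass them to func in a single call."""
    with open(path, "rb") as in_file:
        data = in_file.read()
    func(data)  # one argument: the typemap supplies the length on the C side
    return len(data)


# Stand-in for my_module.my_function: collect whatever would be sent to C.
received = []

# Write a small sample binary file (contents invented for the example).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00\x01\n")

sent = send_file(tmp.name, received.append)
os.unlink(tmp.name)
print(sent, received)
```

With the real module built, the call would presumably be `send_file("my_binary_file", my_module.my_function)`. Using a `with` block also avoids shadowing the built-in name `bytes` and closes the file even if the read fails.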

%module my_module

%begin %{
#define SWIG_PYTHON_STRICT_BYTE_CHAR
%}

%inline %{
#include "my_module.h"
%}

%include "typemaps.i"

%apply (char *STRING, size_t LENGTH) { (const char* data, int len) }

%include "my_module.h"
