Passing a string from C to Python to multiprocessing without making an extra copy
I have a C application that embeds the Python 2.7 interpreter. At some point in my program, a potentially large string (char*) is generated and needs to be processed by some Python code. I use PyObject_CallFunction to call the Python function and pass the string as an argument. This Python function then uses the multiprocessing library to analyze the data in a separate process.
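The Python side of this setup might look something like the following minimal sketch. The names (analyze, _worker) are assumptions for illustration, not from the original program; the point is that multiprocessing pickles the argument when handing it to the child process, which is where the problem described below arises:

```python
# Hypothetical sketch of the Python function that C calls via
# PyObject_CallFunction. Names (analyze, _worker) are assumptions.
import multiprocessing

def _worker(data):
    # Stand-in for the real analysis: just report the payload size.
    return len(data)

def analyze(data):
    # Pool.apply pickles `data` to send it to the child process; a
    # non-picklable object passed from C would fail at this step.
    pool = multiprocessing.Pool(processes=1)
    try:
        return pool.apply(_worker, (data,))
    finally:
        pool.close()
        pool.join()
```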
Passing the string to the Python function will create a copy of the data in a Python str object. I tried to avoid this extra copy by passing a buffer object to the Python function. Unfortunately, this generates an error during unpickling in the multiprocessing child process:
TypeError: buffer() takes at least 1 argument (0 given)
It seems as though buffer objects can be pickled, but not unpickled.
Any suggestions on passing the char* from C to the multiprocessing function without making an extra copy?
Approach that worked for me:
Before you create your big C string, allocate memory for it using Python:
PyObject *pystr = PyString_FromStringAndSize(NULL, size);
char *str = PyString_AS_STRING(pystr);
/* now fill <str> with <size> bytes */
This way, when the time comes to pass it to Python, you don't have to create a copy:
PyObject *result = PyObject_CallFunctionObjArgs(callable, pystr, NULL);
/* or PyObject_CallFunction(callable, "O", pystr) if you prefer */
Note that you shouldn't modify the string once this is done.