How do you convert a char * with 0-value bytes into a Python string?

Using the ctypes module I can easily import a POINTER(c_char) or a c_char_p type into Python, but neither of these provides a way to end up with a Python string that contains zero-value bytes.

c_char_p is zero terminated, meaning that a char * array from C is cut off at the first zero value.

POINTER(c_char) is the recommended way of importing binary data that can have 0 values, but there doesn't seem to be a way to directly convert this into a Python string.

I can do this:

pixels = clibblah.get_pixels()
a = ""
for i in range(0, clibblah.get_pixel_length()):
    a += pixels[i]

...but this 1) doesn't seem very pythony, and 2) takes forever (converting a 640x480 block of pixel data takes about 2 seconds on my mac).

I've seen a bunch of questions regarding this on Stack Overflow, but darned if I can see one where the replies aren't either people going "why do you need to do that?" or "c_char_p will do what you want" (it doesn't, as I've described above).

The only credible advice I've seen is to use the C API function PyString_FromStringAndSize, as recommended here: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/FAQ.html

Can't really see how that helps though, because afaik that's a cython feature, not a python one.

For the interested, the reason I need to do this is I'm working with panda3d and a kinect, and the kinect C API provides an array of unsigned char * values and the panda3d API lovingly provides a setPixels() call that only takes a Python string as an argument.

As you said, use a POINTER(c_char) to get a pointer to the array of binary data. To put that together into a string, you can just take a slice of it, since array indexing and slicing work as expected with ctypes pointers:

import ctypes

clibblah = ctypes.cdll.LoadLibrary('clibblah.dylib')
get_pixels = clibblah.get_pixels
# Without this, ctypes assumes the function returns a C int
get_pixels.restype = ctypes.POINTER(ctypes.c_char)

pixels = get_pixels()
num_pixels = clibblah.get_pixel_length()

# Slice the ctypes pointer into a Python string (zero bytes included)
a = pixels[:num_pixels]
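Since clibblah is not available to try this against, here is a minimal self-contained sketch of the same slicing idea. It assumes Python 3 (where the result is a bytes object rather than a str), and the zero-filled `frame` array is only a hypothetical stand-in for the memory the C library would own:

```python
import ctypes

# Hypothetical stand-in for the C-owned pixel buffer: a zero-filled
# 640x480 block with a non-zero byte at each end.
WIDTH, HEIGHT = 640, 480
num_pixels = WIDTH * HEIGHT
frame = (ctypes.c_char * num_pixels)()
frame[0] = b"\x01"
frame[num_pixels - 1] = b"\xff"

# This is what a function with restype = POINTER(c_char) would return.
pixels = ctypes.cast(frame, ctypes.POINTER(ctypes.c_char))

# One slice copies the whole block, zero bytes included.
data = pixels[:num_pixels]

# For comparison, c_char_p stops at the first zero byte.
truncated = ctypes.cast(frame, ctypes.c_char_p).value
```

The slice is a single copy done inside ctypes, so it avoids the per-byte Python loop from the question.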

There are a few different methods. I like ctypes.string_at because it isn't finicky: it works regardless of whether you supply a c_char_p type, or a pointer-to-c_char, or a void pointer type, or even just an int address.

from ctypes import *

s = b'hello\x00world' # create a string containing null bytes
sz = len(s)

p = c_char_p(s) # obtain a pointer of various types
p2 = cast(p, POINTER(c_char))
address = cast(p, c_void_p).value

print(p.value) # by default it is interpreted as null-terminated

print(p2[:sz]) # various methods of explicitly specifying the full length
print(string_at(p, size=sz))
print((c_char * sz).from_address(address).raw)

I don't know what the best answer to the main question is, but here are a few comments about how PyString_FromStringAndSize could be used to accomplish what you want.

PyString_FromStringAndSize is part of the Python C API: http://docs.python.org/c-api/string.html

That means you can use this to

  • Write a Python module in C/C++ in which you define a new Python data type for your C-derived strings-with-null-characters.
  • You could define that data type so it provides a Python constructor that accepts arguments that somehow contain a pointer to the C-string in question. If nothing else helps, the argument the constructor accepts could be a c_void_p from ctypes.
  • The constructor you define (in C/C++) would have to store a pointer to the C-string in a member variable. It might also make a copy and/or increase reference counts etc. Since the constructor is written in C/C++, anything that is possible there is possible in that constructor.

You would have to build this into a .dll/.pyd library, and can then import it into any Python code.

Admittedly, a rather complicated procedure. Hopefully somebody else suggests a simpler way, perhaps based on ctypes directly.
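One ctypes-based route that avoids writing an extension module is to call the C API function through ctypes.pythonapi. The sketch below assumes Python 3, where the function is named PyBytes_FromStringAndSize (the Python 2 name is PyString_FromStringAndSize). The helper name `bytes_from_address` and the test buffer are made up for illustration:

```python
import ctypes

def bytes_from_address(address, size):
    """Copy `size` bytes starting at `address` into a new bytes object.

    Hypothetical helper: it calls the C API through ctypes.pythonapi
    instead of from a compiled extension module.
    """
    func = ctypes.pythonapi.PyBytes_FromStringAndSize
    func.restype = ctypes.py_object            # returns a new Python object
    func.argtypes = [ctypes.c_void_p, ctypes.c_ssize_t]
    return func(address, size)

# Stand-in for memory handed over by a C library.
payload = b"hello\x00world"
buf = ctypes.create_string_buffer(payload)

result = bytes_from_address(ctypes.addressof(buf), len(payload))
```

In practice ctypes.string_at or a pointer slice gives the same result with less ceremony, so this is mainly useful for seeing how the C API suggestion maps onto ctypes.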
