
Passing audio data from Python to C with ctypes

I have a C++ library which performs analysis on audio data, and a C API to it. One of the C API functions takes const int16_t* pointers to the data and returns the results of the analysis.

I'm trying to build a Python interface to this API, and most of it is working, but I'm having trouble getting ctypes pointers to use as arguments for this function. Since the pointers on the C side are to const, it feels to me like it ought to be possible to make this work with any contiguous data. However, the following does not work:

import ctypes
import wave

_native_lib = ctypes.cdll.LoadLibrary('libsound.so')
_native_function = _native_lib.process_sound_data
_native_function.argtypes = [ctypes.POINTER(ctypes.c_int16),
                             ctypes.c_size_t]
_native_function.restype = ctypes.c_int

wav_path = 'hello.wav'

with wave.open(wav_path, mode='rb') as wav_file:
    wav_bytes = wav_file.readframes(wav_file.getnframes())

data_start = ctypes.POINTER(ctypes.c_int16).from_buffer(wav_bytes) # ERROR: data is immutable
_native_function(data_start, len(wav_bytes)//2)

Manually copying wav_bytes to a bytearray allows the pointer to be constructed but causes the native code to segfault, indicating that the address it receives is wrong (it passes unit tests with data read in from C++). Fixing this by getting the address right would technically solve the problem but I feel like there's a better way.

Surely it's possible to just get the address of some data and promise that it's the right format and won't be altered? I'd prefer not to have to deep copy all my Pythonically-stored audio data to a ctypes format, since presumably the bytes are in there somewhere if I can just get a pointer to them!

Ideally, I'd like to be able to do something like this

data_start = cast_to(address_of(data[0]), c_int16_pointer)
_native_function(data_start, len(data))

which would then work with anything that has a [0] and a len. Is there a way to do something like this in ctypes? If not, is there a technical reason why it's impossible, and is there something else I should be using instead?

This should work for you. Use array for a writable buffer and create a ctypes array that references the buffer.

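import array
from ctypes import c_short

data = array.array('h', wav_bytes)         # copies the samples into a writable buffer
addr, size = data.buffer_info()            # address and length in elements, not bytes
arr = (c_short * size).from_address(addr)  # ctypes array viewing the same memory
_native_function(arr, size)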
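The aliasing can be checked without the native library. The sketch below is only an illustration: the four samples are made up, and native byte order is assumed. It also shows a bytearray variant using from_buffer on a ctypes array type. Calling from_buffer on POINTER(c_int16) instead, as in the question, makes the pointer object share the buffer's memory, so the first bytes of audio are read as the address; that would be consistent with the segfault described there.

```python
import array
import ctypes
import struct

# Made-up sample data standing in for wav_bytes (native byte order assumed).
samples = [1, -2, 300, -32768]
wav_bytes = struct.pack('=4h', *samples)

# Variant 1: array.array plus from_address, as in the answer above.
data = array.array('h', wav_bytes)
addr, count = data.buffer_info()
arr = (ctypes.c_int16 * count).from_address(addr)

# Variant 2: a bytearray viewed through a ctypes array type.
buf = bytearray(wav_bytes)
arr2 = (ctypes.c_int16 * (len(buf) // 2)).from_buffer(buf)

# Both ctypes arrays see the same samples as the original bytes.
view1 = list(arr)
view2 = list(arr2)

# Writing through the ctypes array changes the array.array, so no second
# copy was made.
arr[0] = 99
aliased_value = data[0]
```

Note that from_address does not keep data alive, so the array.array must stay referenced for as long as arr is in use.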

Alternatively, to skip the copy of wav_bytes into the data array, you could lie about the pointer type in argtypes. ctypes knows how to convert a byte string to a c_char_p. A pointer is just an address, so _native_function will receive the address but use it as an int16_t* internally:

from ctypes import c_char_p, c_size_t

_native_function.argtypes = c_char_p, c_size_t
_native_function(wav_bytes, len(wav_bytes) // 2)

Another way to work around the "underlying buffer is not writable" error is to leverage c_char_p, which allows an immutable byte string to be used, and then explicitly cast it to the pointer type you want:

from ctypes import POINTER, c_char_p, c_short, c_size_t, cast

_native_function.argtypes = POINTER(c_short), c_size_t
p = cast(c_char_p(wav_bytes), POINTER(c_short))
_native_function(p, len(wav_bytes) // 2)

In these latter cases you must ensure you don't actually write to the buffer as it will corrupt the immutable Python object holding the data.
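To try the cast approach end to end without libsound.so, a Python callback can stand in for the native function. This is only a sketch: fake_native_function and _sum_samples are hypothetical names, the samples are made up, and the callback merely sums them, which is not what the real library does.

```python
import ctypes
import struct

# Hypothetical stand-in with the same C signature as process_sound_data:
# int f(const int16_t *samples, size_t count)
PROTO = ctypes.CFUNCTYPE(ctypes.c_int,
                         ctypes.POINTER(ctypes.c_int16),
                         ctypes.c_size_t)

def _sum_samples(samples, count):
    # Reads through the pointer only; it never writes to the buffer.
    return sum(samples[i] for i in range(count))

fake_native_function = PROTO(_sum_samples)

# Four native-endian 16-bit samples in an immutable bytes object.
wav_bytes = struct.pack('=4h', 10, -3, 200, -7)

# Cast the bytes to the pointer type the function expects; no copy is made.
p = ctypes.cast(ctypes.c_char_p(wav_bytes), ctypes.POINTER(ctypes.c_int16))
result = fake_native_function(p, len(wav_bytes) // 2)
```

Keep wav_bytes referenced for as long as the pointer is in use, since the pointer is only an address into that object.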

I had a look around at the CPython bug tracker to see if this had come up before, and it seems it was raised as an issue in 2011. I agree with the poster that it's a serious mis-design, but it seems the developers at that time did not.

Eryk Sun's comment on that thread revealed that it's actually possible to just use ctypes.cast directly. Here is part of the comment:

cast calls ctypes._cast(obj, obj, typ). _cast is a ctypes function pointer defined as follows:

 _cast = PYFUNCTYPE(py_object, c_void_p, py_object, py_object)(_cast_addr)

Since cast makes an FFI call that converts the first arg to c_void_p, you can directly cast bytes to a pointer type:

 >>> from ctypes import *
 >>> data = b'123\x00abc'
 >>> ptr = cast(data, c_void_p)

It's a bit unclear to me if this is actually required by the standard or if it's just a CPython implementation detail, but the following works for me in CPython:

import ctypes
data = b'imagine this string is 16-bit sound data'
data_ptr = ctypes.cast(data, ctypes.POINTER(ctypes.c_int16))
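A quick way to convince yourself that the pointer really addresses the sample data is to read the values back through it. The sketch below assumes CPython and native byte order, and the sample values are made up for illustration.

```python
import ctypes
import struct

# Made-up 16-bit samples packed in native byte order.
samples = [1, -2, 300, -32768]
data = struct.pack('=%dh' % len(samples), *samples)

# Cast the immutable bytes object directly to an int16 pointer.
data_ptr = ctypes.cast(data, ctypes.POINTER(ctypes.c_int16))
count = len(data) // ctypes.sizeof(ctypes.c_int16)

# Slicing a ctypes pointer with an explicit stop returns a list of values.
round_trip = data_ptr[:count]
```

Here round_trip should equal samples, which is what the native function would see when handed data_ptr and count.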

The documentation on cast says the following:

ctypes.cast(obj, type)

This function is similar to the cast operator in C. It returns a new instance of type which points to the same memory block as obj. type must be a pointer type, and obj must be an object that can be interpreted as a pointer.

so it seems that CPython is of the opinion that bytes 'can be interpreted as a pointer'. This seems fishy to me, but these modern pointer-hiding languages have a way of messing with my intuition.
