
Porting a VBA Type / C struct to a Python ctypes.Structure: array of strings with fixed length

I am trying to port a piece of VBA code to Python. This effort includes calling a function in a Windows DLL. The function requires a pointer to a C struct (in VBA, they are called "Type") as a parameter. The mentioned struct contains strings of fixed length as well as arrays of strings of fixed length. I am struggling with finding a way to express this in Python using ctypes.

The original VBA code contains statements like this:

Public Type elements
    elementA As String * 48
    elementB(3) As String * 12
End Type

This may be represented in the following way in C, I think:

struct elements
{
    char elementA[48];
    char elementB[4][12];
};

What I have tried so far in Python:

import ctypes

class elements(ctypes.Structure):
    _fields_ = [
        ("elementA", ctypes.create_string_buffer(48)), 
        ("elementB", ctypes.create_string_buffer(12) * 4)
        ]

I can successfully declare elementA, though declaring elementB fails with

"TypeError: unsupported operand type(s) for *: 'c_char_Array_12' and 'int'"

How can this be done the right way?


UPDATE #1

I can successfully declare the following:

import ctypes

class elements(ctypes.Structure):
    _fields_ = [
        ("elementA", ctypes.c_char * 48), 
        ("elementB", ctypes.c_char * 12 * 4)
        ]

elementA exposes a "value" property, but I cannot find a way to work with elementB. How can I read or change its contents?


UPDATE #2

I think I understand the behaviour.

>>> e = elements()
>>> e.elementA
''
>>> e.elementA = 'test'
>>> e.elementA
'test'
>>> e.elementB
<__main__.c_char_Array_12_Array_4 object at 0x9878ecc>
>>> e.elementB[0][:] == '\x00' * 12
True
>>> e.elementB[0][:]
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> e.elementB[0][:] = 'test'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Can only assign sequence of same size
>>> e.elementB[0][:] = 'test' + '\x00' * 8
>>> e.elementB[0][:]
'test\x00\x00\x00\x00\x00\x00\x00\x00'
>>> testB = 'abcde'
>>> e.elementB[0][:] = testB + '\x00' * ( ctypes.sizeof(e.elementB[0]) - len(testB) )
>>> e.elementB[0][:]
'abcde\x00\x00\x00\x00\x00\x00\x00'
>>> e.elementB[0][:].rstrip('\x00')
'abcde'
>>> e.elementB[0].value
'abcde'
>>> e.elementB[0].value = 'abcdef'
>>> e.elementB[0][:]
'abcdef\x00\x00\x00\x00\x00\x00'

(This question refers to Python 2.6 and 2.7.)

create_string_buffer is a convenience function to create a c_char array instance. However, a field definition requires a C type, not an instance. For example:

import ctypes

class elements(ctypes.Structure):
    _fields_ = [("elementA", ctypes.c_char * 48), 
                ("elementB", ctypes.c_char * 12 * 4)]
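The difference between a type and an instance can be checked directly. The following is a minimal sketch, assuming Python 3 (the question itself targets Python 2.6/2.7); the variable names are illustrative only:

```python
import ctypes

# create_string_buffer returns an *instance* of a c_char array ...
buf = ctypes.create_string_buffer(48)
# ... whereas multiplying c_char by a length yields the array *type*
# that a _fields_ entry needs.
array_type = ctypes.c_char * 48

class elements(ctypes.Structure):
    _fields_ = [("elementA", ctypes.c_char * 48),
                ("elementB", ctypes.c_char * 12 * 4)]

buf_is_instance = isinstance(buf, ctypes.Array)
type_is_class = isinstance(array_type, type) and issubclass(array_type, ctypes.Array)
struct_size = ctypes.sizeof(elements)  # 48 bytes + 4 rows * 12 bytes
```

The struct size matches the C layout sketched in the question, since c_char fields need no alignment padding.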

Say the DLL exports a C function that takes a pointer to this struct. Declare its argument types as follows:

lib.func.argtypes = [ctypes.POINTER(elements)]

To call this function, pass an instance of elements using byref:

e = elements()
lib.func(ctypes.byref(e))
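Without the actual DLL at hand, the calling convention can still be tried out. The sketch below, assuming Python 3, replaces the DLL export with a hypothetical stand-in: a Python callback (`_fill`) wrapped through `ctypes.CFUNCTYPE` so that it is invoked like a C function receiving a pointer to the struct. The names `PROTOTYPE`, `_fill` and `func` are not part of any real library.

```python
import ctypes

class elements(ctypes.Structure):
    _fields_ = [("elementA", ctypes.c_char * 48),
                ("elementB", ctypes.c_char * 12 * 4)]

# Hypothetical stand-in for the DLL function: void func(struct elements *p)
PROTOTYPE = ctypes.CFUNCTYPE(None, ctypes.POINTER(elements))

def _fill(ptr):
    # ptr.contents is a view onto the caller's struct, not a copy
    ptr.contents.elementA = b"filled by callee"
    ptr.contents.elementB[0].value = b"row0"

func = PROTOTYPE(_fill)

e = elements()
func(ctypes.byref(e))  # the callee writes straight into e's buffer
```

After the call, `e.elementA` and `e.elementB[0].value` hold what the callee wrote, which is the behaviour to expect from a real DLL function that fills the struct.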

Accessing a 1-D c_char array field, such as elementA, is special cased to return a Python string. But accessing a 2-D array, such as elementB, returns a ctypes Array instance. In the case of elementB there are 4 rows, each containing 12 columns.

>>> len(e.elementB)
4
>>> map(len, e.elementB)
[12, 12, 12, 12]

sizeof returns the size of an array in bytes. For example, the buffer for elementB consists of 48 c_char elements, which are 1 byte each:

>>> ctypes.sizeof(e.elementB)
48
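For readers on Python 3, where map returns a lazy iterator rather than a list, the same shape and size checks might look like this (a sketch; the variable names are illustrative):

```python
import ctypes

class elements(ctypes.Structure):
    _fields_ = [("elementA", ctypes.c_char * 48),
                ("elementB", ctypes.c_char * 12 * 4)]

e = elements()
rows = len(e.elementB)                       # outer dimension
cols = [len(row) for row in e.elementB]      # inner dimension per row
total_bytes = ctypes.sizeof(e.elementB)      # whole 2-D buffer
row_bytes = ctypes.sizeof(e.elementB[0])     # a single row
```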

The c_char arrays of elementB, as character arrays, are special cased to have value and raw attributes. Getting the value attribute creates a Python string that treats the array as a null-terminated C string. The raw attribute returns the entire buffer, including any nulls. You can also assign Python strings using these attributes, and both accept a string with a null in it.

>>> e.elementB[3].value = 'abc\x00def'
>>> e.elementB[3].value
'abc'
>>> e.elementB[3].raw
'abc\x00def\x00\x00\x00\x00\x00'
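Under Python 3 the same attributes work with bytes instead of str. A short sketch of the value/raw distinction, assuming Python 3:

```python
import ctypes

class elements(ctypes.Structure):
    _fields_ = [("elementA", ctypes.c_char * 48),
                ("elementB", ctypes.c_char * 12 * 4)]

e = elements()
e.elementB[3].value = b"abc\x00def"   # embedded NUL is accepted on assignment

as_c_string = e.elementB[3].value     # reading stops at the first NUL
whole_buffer = e.elementB[3].raw      # all 12 bytes of the row
```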

Or slice the array to get a substring:

>>> e.elementB[3][:]
'abc\x00def\x00\x00\x00\x00\x00'
>>> e.elementB[3][4:7]
'def'
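Because slice assignment requires a sequence of exactly the row's length, a small helper can take care of the padding. The functions `set_fixed` and `get_fixed` below are hypothetical helpers, not part of ctypes, and the sketch assumes Python 3. Whether the DLL expects NUL or space padding is an assumption to verify against its documentation, so the pad byte is a parameter.

```python
import ctypes

class elements(ctypes.Structure):
    _fields_ = [("elementA", ctypes.c_char * 48),
                ("elementB", ctypes.c_char * 12 * 4)]

def set_fixed(arr, data, pad=b"\x00"):
    """Write data into a c_char array, padded to the array's full width."""
    width = len(arr)
    if len(data) > width:
        raise ValueError("data is longer than %d bytes" % width)
    arr[:] = data.ljust(width, pad)

def get_fixed(arr, strip=b"\x00 "):
    """Read a c_char array and drop trailing NUL and space padding."""
    return arr[:].rstrip(strip)

e = elements()
set_fixed(e.elementB[0], b"abcde")            # NUL-padded
set_fixed(e.elementB[1], b"abcde", pad=b" ")  # space-padded
```

Both rows read back as `b"abcde"` through `get_fixed`, while `raw` still shows the padding actually stored.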

c_wchar arrays only have the value attribute, which returns a unicode string. You can set value with either a unicode string or (in Python 2) an 8-bit string. An 8-bit string is decoded using the current ctypes encoding, which defaults to 'mbcs' on Windows and 'ascii' otherwise. set_conversion_mode (Python 2) sets the default encoding:

>>> s = (ctypes.c_wchar * 12)()
>>> v = u'\u0100'.encode('utf-8')
>>> v
'\xc4\x80'
>>> s.value = v
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)
>>> old_mode = ctypes.set_conversion_mode('utf-8', 'strict')
>>> old_mode
('ascii', 'strict')

Assigning '\xc4\x80' works now that the conversion encoding is set to UTF-8:

>>> s.value = v
>>> s.value
u'\u0100'
>>> s[:]
u'\u0100\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
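Python 3 has no set_conversion_mode, and c_wchar arrays accept only str there, so 8-bit data has to be decoded explicitly before assignment. A minimal sketch, assuming Python 3:

```python
import ctypes

s = (ctypes.c_wchar * 12)()
v = "\u0100".encode("utf-8")   # b'\xc4\x80'

try:
    s.value = v                # bytes are rejected for c_wchar arrays
    rejected = False
except TypeError:
    rejected = True

s.value = v.decode("utf-8")    # decode explicitly, then assign the str
padded = s[:]                  # full 12-character buffer, NUL padded
```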

Arrays are iterable:

for row in e.elementB:
    row[:] = 'abcdefghijkl'

>>> print '\n'.join(row[::-1] for row in e.elementB)
lkjihgfedcba
lkjihgfedcba
lkjihgfedcba
lkjihgfedcba
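A Python 3 rendering of the same loop, as a sketch: rows are filled and sliced as bytes, and decoded only for display.

```python
import ctypes

class elements(ctypes.Structure):
    _fields_ = [("elementA", ctypes.c_char * 48),
                ("elementB", ctypes.c_char * 12 * 4)]

e = elements()
for row in e.elementB:
    row[:] = b"abcdefghijkl"   # each row is a view onto e's buffer

reversed_rows = [row[::-1] for row in e.elementB]
text = b"\n".join(reversed_rows).decode("ascii")
print(text)
```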

ctypes data types support Python's buffer protocol for inter-operation with other types:

>>> bytearray(e.elementB)
bytearray(b'abcdefghijklabcdefghijklabcdefghijklabcdefghijkl')

>>> import numpy as np
>>> np.frombuffer(e.elementB, dtype='uint8')
array([ 97,  98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108,  97,
        98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108,  97,  98,
        99, 100, 101, 102, 103, 104, 105, 106, 107, 108,  97,  98,  99,
       100, 101, 102, 103, 104, 105, 106, 107, 108], dtype=uint8)
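The buffer protocol can also be exercised with the standard library alone. The sketch below, assuming Python 3, copies the array out with bytes, inspects it through a memoryview, and rebuilds a struct from raw bytes with from_buffer_copy; the variable names are illustrative.

```python
import ctypes

class elements(ctypes.Structure):
    _fields_ = [("elementA", ctypes.c_char * 48),
                ("elementB", ctypes.c_char * 12 * 4)]

e = elements()
for row in e.elementB:
    row[:] = b"abcdefghijkl"

snapshot = bytes(e.elementB)          # independent copy of all 48 bytes
view = memoryview(e.elementB)         # no copy; shape mirrors the C array
shape = view.shape
nbytes = view.nbytes

# Rebuild a separate struct from the raw bytes of the whole structure.
clone = elements.from_buffer_copy(bytes(e))
clone.elementB[0][:] = b"XXXXXXXXXXXX"  # does not affect e
```

This kind of round trip is handy for saving a struct to disk or comparing two structs byte for byte.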
