Decode C const char* in Python with ctypes

I am using ctypes (imported as c) in Python 3 to call into a C++ shared library. The library is loaded into Python using:

smpLib = c.cdll.LoadLibrary(os.getcwd()+os.sep+'libsmpDyn.so')

One of the functions has the extern "C" declaration const char* runSmpModel(...). The Python function prototype is coded and run as:

proto_SMP = c.CFUNCTYPE(c.c_char_p,...)
runSmpModel = proto_SMP(('runSmpModel',smpLib))
res = runSmpModel(...)

This all works beautifully, but I'm unable to decode the res variable and obtain the string passed out by the C runSmpModel function. The value of res is displayed (I'm using ipython3) as b'\xd0'. The best solution I've found online, res.decode('utf-8'), gives me the error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: unexpected end of data
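
For reference, the failure can be reproduced without the shared library. The single byte 0xd0 is a lead byte that announces a two-byte UTF-8 sequence, so the decoder reports truncated input rather than an invalid character. That hints that res holds stray memory contents, not a differently encoded copy of the expected ID. A minimal sketch (the variable names are illustrative only):

```python
# The one-byte value that ipython displayed for res.
raw = b'\xd0'

try:
    raw.decode('utf-8')
    reason = None
except UnicodeDecodeError as exc:
    # 0xd0 starts a two-byte UTF-8 sequence, but no second byte follows.
    reason = exc.reason

print(reason)

# For debugging, inspect the raw bytes instead of forcing a decode.
print(raw.hex())
print(raw.decode('utf-8', errors='replace'))
```

Looking at raw.hex() is usually more informative than trying other codecs, since garbage bytes will not decode meaningfully under any encoding.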

The const char* return value from the runSmpModel function comes from:

std::string scenID = SMPLib::SMPModel::runModel(...);
return scenID.c_str();

Inside runModel, it is ultimately defined as shown here, where scenName is an input string:

auto utcBuffId = newChars(500);
sprintf(utcBuffId, "%s_%u", scenName.c_str(), microSeconds); // catenate scenario name & time
uint64_t scenIdhash = std::hash<std::string>()(utcBuffId); // hash it

auto hshCode = newChars(100);
sprintf(hshCode, "%032llX", scenIdhash);
scenId = hshCode;

The value of this specific res should be 0000000000000000BBB00C6CA8B8872E. How can I decode this string?
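
As a cross-check on the expected value, the "%032llX" format pads the 64-bit hash with zeros to 32 uppercase hex digits, so the ID is always 32 characters long. A short Python sketch of the same formatting, using the hash value from the expected result (the names are illustrative):

```python
# Hash value taken from the expected result quoted in the question.
scen_id_hash = 0xBBB00C6CA8B8872E

# Python equivalent of C's "%032llX": zero-padded, width 32, uppercase hex.
scen_id = format(scen_id_hash, '032X')

print(scen_id)
print(len(scen_id))
```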

After a lot of further testing, I've identified the problem as the length of the string passed from the C function. There are no problems if the string is up to 15 characters long, but if it's 16 or longer, no dice. For a minimal working example, the C code is:

extern "C" {
  const char* testMeSO()
  {
    string scenarioID = "abcdefghijklmnop";
    return scenarioID.c_str();
  }
}

and the Python code is (same definition of smpLib as shown above):

proto_TST = c.CFUNCTYPE(c.c_char_p)
testMeSO = proto_TST(('testMeSO',smpLib))
res = testMeSO()
print("Scenario ID: %s"%res.decode('utf-8'))

This gives the decode error unless any one character is removed from the scenarioID variable in the C function. So it seems the question is: "How can Python read a C char* longer than 15 characters, using ctypes?"

After several days of debugging and testing, I've finally gotten this working, using the second solution posted by @Petesh on this SO post. I don't understand why ctypes is apparently limiting the char* value passed from C to 15 characters (+ termination = 16 bytes = 128 bits?).
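
A likely explanation, which the original post does not state: ctypes has no 15-character limit. Both C functions return c_str() from a std::string that is destroyed when the function returns, so the pointer is dangling and reading it is undefined behavior. The 15-character threshold is consistent with the small-string optimization in common std::string implementations, which keep short strings inside the string object itself, but that is an assumption about the build used here. The sketch below shows that a c_char_p return type handles long strings fine when the pointer stays valid. It uses Py_GetVersion from the CPython C API (reached through ctypes.pythonapi, so this assumes CPython) purely as a convenient function that returns a const char* to static storage. It is not part of the library in the question.

```python
import ctypes as c
import sys

# Same prototype style as in the question: a function returning const char*.
proto_VER = c.CFUNCTYPE(c.c_char_p)
# Py_GetVersion returns a pointer to a static buffer that outlives the call.
getVersion = proto_VER(('Py_GetVersion', c.pythonapi))

res = getVersion()             # ctypes copies the C string into a bytes object
version = res.decode('utf-8')

print(len(res) > 15)
print(version == sys.version)
```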

Essentially, the solution is to pass into the C function an extra char* buff buffer that has already been created using ctypes.create_string_buffer(32*16), as well as an unsigned int buffsize of value 32*16. Then, in the C function, execute scenarioID.copy(buff, buffsize). The Python prototype function is modified in the obvious way to take the two extra arguments.
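
The buffer-passing pattern can be sketched in Python as follows. Since libsmpDyn.so is not available here, a Python callback (_fake_testMeSO, a hypothetical stand-in) plays the role of the modified C function, and the void return type and argument order are assumptions rather than the author's actual signature. The point is that the memory is allocated and owned by Python, so it is still valid after the call returns.

```python
import ctypes as c

BUFFSIZE = 32 * 16  # buffer size used in the post

# Assumed prototype: void testMeSO(char* buff, unsigned int buffsize)
proto_TST = c.CFUNCTYPE(None, c.c_void_p, c.c_uint)

def _fake_testMeSO(buff, buffsize):
    # Hypothetical stand-in for the C++ body: scenarioID.copy(buff, buffsize)
    scenarioID = b"abcdefghijklmnop"
    n = min(len(scenarioID), buffsize)
    c.memmove(buff, scenarioID, n)   # buff arrives as an integer address

# With the real library this would be: proto_TST(('testMeSO', smpLib))
testMeSO = proto_TST(_fake_testMeSO)

buff = c.create_string_buffer(BUFFSIZE)  # zero-filled, owned by Python
testMeSO(buff, BUFFSIZE)
res = buff.value                         # bytes up to the first NUL
print("Scenario ID: %s" % res.decode('utf-8'))
```

Note that std::string::copy does not write a terminating NUL. The pattern relies on create_string_buffer zero-filling the buffer, so passing buffsize - 1 as the copy limit on the C side would be a slightly safer variant.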
