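C char array from Python string
I have a list of strings in Python that I'm trying to pass down to a C extension for character analysis. So far I have the list broken up into its individual string PyObjects. Next, I want to split each of these strings into its individual characters, so that every string PyObject becomes a corresponding C character array. I can't figure out how to do this.
Here is what I have so far. Currently, after building the .pyd file, it returns a list of 1s to Python as a placeholder (so everything else works). I just don't know how to split a string PyObject into a C character array.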
--- cExt.c ---
#include <Python.h>
#include <stdio.h>

static int *CitemCheck(PyObject *commandString, int commandStringLength) {
    // HAALP
    //char* commandChars = (char*) malloc(commandStringLength*sizeof(char*));
    // char c[] = PyString_AsString("c", commandString);
    // printf("%c" , c);
    // printf("%s", PyString_AsString(commandString));
    // for (int i=0; i<sizeof(commandChars)/sizeof(*commandChars); i++) {
    //     printf("%s", PyString_AsString(commandString));
    //     printf("%c", commandChars[i]);
    // }
    return 1; // TODO: RETURN PROPER RESULTANT
}

static PyObject *ClistCheck(PyObject *commandList, int commandListLength) {
    PyObject *results = PyList_New(commandListLength);
    for (int index = 0; index < commandListLength; index++) {
        PyObject *commandString;
        commandString = PyList_GetItem(commandList, index);
        int commandStringLength = PyObject_Length(commandString);
        // CitemCheck should take string PyObject and its length as int
        int x = CitemCheck(commandString, commandStringLength);
        PyObject* pyItem = Py_BuildValue("i", x);
        PyList_SetItem(results, index, pyItem);
    }
    return results;
}

static PyObject *parseListCheck(PyObject *self, PyObject *args) {
    PyObject *commandList;
    int commandListLength;
    if (!PyArg_ParseTuple(args, "O", &commandList)){
        return NULL;
    }
    commandListLength = PyObject_Length(commandList);
    return Py_BuildValue("O", ClistCheck(commandList, commandListLength));
}

static char listCheckDocs[] =
    ""; // TODO: ADD DOCSTRING

static PyMethodDef listCheck[] = {
    {"listCheck", (PyCFunction) parseListCheck, METH_VARARGS, listCheckDocs},
    {NULL,NULL,0,NULL}
};

static struct PyModuleDef DCE = {
    PyModuleDef_HEAD_INIT,
    "listCheck",
    NULL,
    -1,
    listCheck
};

PyMODINIT_FUNC PyInit_cExt(void){
    return PyModule_Create(&DCE);
}
For reference, my temporary extension build file:
--- _c_setup.py ---
(located in same folder as cExt.c)
"""
to build C files, pass:
python _c_setup.py build_ext --inplace clean --all
in a command prompt which is cd'd to the file's directory
"""
import glob
from setuptools import setup, Extension, find_packages
from os import path
here = path.abspath(path.dirname(__file__))
files = [path.split(x)[1] for x in glob.glob(path.join(here, '**.c'))]
extensions = [Extension(
    path.splitext(x)[0], [x]
) for x in files]
setup(
    ext_modules = extensions,
)
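You can use PyUnicode_AsEncodedString, which, according to the documentation, does the following:
Encode a Unicode object and return the result as a Python bytes object. encoding and errors have the same meaning as the parameters of the same name in the Unicode encode() method. The codec to be used is looked up using the Python codec registry. Return NULL if an exception was raised by the codec.
See https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_AsEncodedString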
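As a quick way to see what the encoded buffer will contain before writing any C, the same encoding step can be reproduced in pure Python with str.encode(), which takes the same encoding and errors arguments. This is only an illustrative sketch: the helper name encoded_bytes and the sample strings are made up for the example. It also shows that with UTF-8 a non-ASCII character takes more than one byte, so a char-by-char loop in C walks over bytes rather than over Python characters.

```python
def encoded_bytes(text, encoding="utf-8", errors="strict"):
    """Hypothetical helper: Python-level counterpart of PyUnicode_AsEncodedString."""
    return text.encode(encoding, errors)


plain = encoded_bytes("pear")
accented = encoded_bytes("caf\u00e9")  # 4 characters, but the last one needs 2 bytes in UTF-8

print(list(plain))                      # the byte values a C loop would see
print(len("caf\u00e9"), len(accented))  # character count vs. byte count

# With errors="strict", a character the codec cannot represent raises an
# exception; in C the same situation shows up as a NULL return value.
try:
    encoded_bytes("caf\u00e9", encoding="ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
print(strict_failed)
```

If the analysis needs to work on characters rather than bytes, that difference is worth keeping in mind when choosing the encoding.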
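Then, with PyBytes_AsString, you get a pointer to the internal buffer, which has a terminating NUL byte. This buffer must be neither deallocated nor modified. If you need a copy, you can use e.g. strdup.
See https://docs.python.org/3/c-api/bytes.html#c.PyBytes_AsString
Slightly modifying your code, it could look like this: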
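To experiment with these two calls without building an extension, CPython exposes its C API through ctypes.pythonapi. The following is a rough sketch under that assumption (it is CPython-specific and will not work on other Python implementations). The function name chars_via_c_api is invented for the example, and the argtypes/restype declarations are my reading of the documented C signatures.

```python
import ctypes

api = ctypes.pythonapi  # CPython only: the C API of the running interpreter

# Declare the C signatures of the two functions discussed above.
api.PyUnicode_AsEncodedString.argtypes = [
    ctypes.py_object, ctypes.c_char_p, ctypes.c_char_p]
api.PyUnicode_AsEncodedString.restype = ctypes.py_object
api.PyBytes_AsString.argtypes = [ctypes.py_object]
api.PyBytes_AsString.restype = ctypes.POINTER(ctypes.c_char)  # raw char *


def chars_via_c_api(text):
    """Hypothetical helper: walk the NUL-terminated buffer one char at a time."""
    encoded = api.PyUnicode_AsEncodedString(text, b"utf-8", b"strict")
    buf = api.PyBytes_AsString(encoded)  # points into encoded; read-only use
    chars = []
    i = 0
    while buf[i] != b"\x00":             # stop at the terminating NUL byte
        chars.append(buf[i])
        i += 1
    return chars


print(chars_via_c_api("apple"))
```

The local variable encoded keeps the bytes object alive while buf is being read. ctypes manages the reference returned by PyUnicode_AsEncodedString here, which is why this sketch has no explicit Py_DECREF, unlike real extension code.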
PyObject *encodedString = PyUnicode_AsEncodedString(commandString, "UTF-8", "strict");
if (encodedString) { //returns NULL if an exception was raised
    char *commandChars = PyBytes_AsString(encodedString); //pointer refers to the internal buffer of encodedString
    if(commandChars) {
        printf("the string '%s' consists of the following chars:\n", commandChars);
        for (int i = 0; commandChars[i] != '\0'; i++) {
            printf("%c ", commandChars[i]);
        }
        printf("\n");
    }
    Py_DECREF(encodedString);
}
If you test it with:
import cExt
fruits = ["apple", "pears", "cherry", "pear", "blueberry", "strawberry"]
res = cExt.listCheck(fruits)
print(res)
the output would be:
the string 'apple' consists of the following chars:
a p p l e
the string 'pears' consists of the following chars:
p e a r s
the string 'cherry' consists of the following chars:
c h e r r y
the string 'pear' consists of the following chars:
p e a r
the string 'blueberry' consists of the following chars:
b l u e b e r r y
the string 'strawberry' consists of the following chars:
s t r a w b e r r y
[1, 1, 1, 1, 1, 1]
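A side note not directly related to the question: your CitemCheck function returns a pointer to int, but judging by how it is called, you want to return an int value. The function signature should look more like this: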
static int CitemCheck(PyObject *commandString, int commandStringLength)
(note the removed * after int).