While learning to write Python C extension modules with CPython's C API, I have encountered a curious segfault (disclaimer: I have only a passing fluency in C). A typical module method written in C (which can then be imported into Python) might look like this:
static PyObject *method_echo(PyObject *self, PyObject *args) {
    long a;
    if(!PyArg_ParseTuple(args, "l", &a)) {
        return NULL;
    }
    printf("Value of the passed variable is: %li\n", a);
    return PyLong_FromLong(a);
}
This works for me without issue. The problem comes if I choose to declare a as a pointer and pass it to PyArg_ParseTuple, for example, changing the relevant lines to:
long *a;
if(!PyArg_ParseTuple(args, "l", a)) {
    return NULL;
}
(and of course modifying the remaining lines to work with a pointer), this results in a segfault. However, if I remove the return NULL line:
long *a;
PyArg_ParseTuple(args, "l", a);
This runs without issue. Even though the return NULL statement never gets executed (I have checked that explicitly with a printf in the conditional block), somehow it causes a segfault if I pass a pointer to PyArg_ParseTuple. Any ideas what's going on?
Here are some details of my system, followed by some example code that should be able to reproduce the problem:
macOS 11.6, Python 3.9, C compiler: clang (clang-1300.0.29.30)
C extension module (which will be imported in Python as test1_pptr), test1_parsepointer.c:
#define PY_SSIZE_T_CLEAN
#include <python3.9/Python.h>

static PyObject *method_parse_ptr1(PyObject *self, PyObject *args) {
    long *a;
    if(!PyArg_ParseTuple(args, "l",a)) {
        printf("PROBLEM ENCOUNTERED\n");
    };
    printf(" ptr-v1: Value of var is: %li\n", *a);
    return PyLong_FromLong(*a);
}

static PyObject *method_parse_ptr2(PyObject *self, PyObject *args) {
    long *a;
    if(!PyArg_ParseTuple(args, "l",a)) {
        return NULL;
    };
    printf(" ptr-v2: Value of var is: %li\n", *a);
    return PyLong_FromLong(*a);
}

static PyObject *method_parse_val(PyObject *self, PyObject *args) {
    long a;
    if(!PyArg_ParseTuple(args, "l",&a)) {
        return NULL;
    };
    printf(" val: Value of var is: %li\n", a);
    return PyLong_FromLong(a);
}

static PyMethodDef parseptr_methods[] = {
    {"parse_ptr_v1", method_parse_ptr1, METH_VARARGS, "Parse as pointer, no NULL"},
    {"parse_ptr_v2", method_parse_ptr2, METH_VARARGS, "Parse as pointer, with NULL"},
    {"parse_val", method_parse_val, METH_VARARGS, "Parse as val, with NULL"},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef parsing_ptrs = {
    PyModuleDef_HEAD_INIT,
    "test1_pptr",
    "Testing PyArg_ParseTuple vars as pointers",
    -1,
    parseptr_methods
};

PyMODINIT_FUNC PyInit_test1_pptr(void) {
    return PyModule_Create(&parsing_ptrs);
}
I compile this with the following command:
clang -shared -undefined dynamic_lookup -o test1_parsepointer.so test1_parsepointer.c
Create a .py file that bootstraps this module upon import (test1_pptr.py):
def __bootstrap__():
    global __bootstrap__, __loader__, __file__
    import sys, pkg_resources, importlib.util
    __file__ = pkg_resources.resource_filename(__name__, 'test1_parsepointer.so')
    __loader__ = None; del __bootstrap__, __loader__
    spec = importlib.util.spec_from_file_location(__name__,__file__)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
__bootstrap__()
And finally, the methods can be tested with the following python script:
import test1_pptr as tppr

"""
Three functions in tppr should be:
parse_ptr_v1(int)
parse_ptr_v2(int)
parse_val
"""

def main():
    a = int(3)
    print("about to test parse-by-value...")
    tppr.parse_val(a) # runs fine
    print("about to test parse-by-pointer v1...")
    tppr.parse_ptr_v1(a) # runs fine
    print("about to test parse-by-pointer v2...")
    tppr.parse_ptr_v2(a) # segfaults

if __name__ == "__main__":
    main()
long *a;
This doesn't point to anything valid because you haven't initialized it (either by allocating memory for a long or by taking the address of an existing long).
if(!PyArg_ParseTuple(args, "l", a))
This is attempting to write into whatever a points to. But a doesn't point to a valid long. Therefore it crashes.
The fact that it seems to work in some cases is completely uninteresting. Writing through an invalid pointer is undefined behaviour. In practice, what the uninitialized a happens to point at is arbitrary, and it can change with unrelated edits to the function (such as adding or removing the return NULL line). There's no value in attempting to understand it.
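To make the two valid options concrete, here is a minimal standalone sketch that does not need the Python headers. Note that parse_long is a hypothetical stand-in I made up for PyArg_ParseTuple(args, "l", ...): it writes a long through the pointer it is given and returns 1 on success or 0 on failure. The function names and the -1 error value are illustrative assumptions, not part of the C API.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyArg_ParseTuple(args, "l", ...):
 * parses `text` as a long and writes the result through `out`.
 * Returns 1 on success and 0 on failure. */
static int parse_long(const char *text, long *out) {
    char *end = NULL;
    long value;

    assert(out != NULL);  /* the caller must supply real storage */
    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE) {
        return 0;  /* not a number, trailing junk, or out of range */
    }
    *out = value;
    return 1;
}

/* Option 1: declare a long and pass its address. */
static long echo_by_address(const char *text) {
    long a = 0;
    if (!parse_long(text, &a)) {
        return -1;  /* illustrative error value */
    }
    return a;
}

/* Option 2: keep a pointer, but point it at allocated storage first. */
static long echo_by_heap_pointer(const char *text) {
    long *a = malloc(sizeof *a);
    long result = -1;  /* illustrative error value */

    if (a == NULL) {
        return -1;
    }
    if (parse_long(text, a)) {
        result = *a;
    }
    free(a);
    return result;
}
```

In the extension module itself, Option 1 is what method_parse_val already does, and it is the simpler choice: there is nothing to free and no allocation that can fail.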