PyArg_ParseTuple causes a segfault when passed a pointer instead of an address

Question

While learning to write Python C extension modules with CPython's C API, I ran into a curious segfault (disclaimer: I have only a passing fluency in C). A typical module method written in C, which can then be imported into Python, might look like this:

static PyObject *method_echo(PyObject *self, PyObject *args) {
    long a;
    if(!PyArg_ParseTuple(args, "l", &a)) {
        return NULL;
    }
    printf("Value of the passed variable is: %li\n", a);
    return PyLong_FromLong(a);
}

This works for me without issue. The problem appears if I declare a as a pointer and pass it to PyArg_ParseTuple directly, for example by changing the relevant lines to:

    long *a;
    if(!PyArg_ParseTuple(args, "l", a)) {
        return NULL;
    }

(and of course modifying the remaining lines to work with a pointer). This results in a segfault. However, if I remove the return NULL line:

    long *a;
    PyArg_ParseTuple(args, "l", a);

This runs without issue. Even though the return NULL statement never gets executed (I checked that explicitly with a printf in the conditional block), it somehow causes a segfault if I pass a pointer to PyArg_ParseTuple. Any ideas what's going on?

Here are some details of my system, followed by example code that should reproduce the problem:

macOS 11.6, python3.9, C compiler: clang (clang-1300.0.29.30)

C extension module (which will import in Python as test1_pptr):

test1_parsepointer.c

#define PY_SSIZE_T_CLEAN
#include <python3.9/Python.h>

static PyObject *method_parse_ptr1(PyObject *self, PyObject *args) {
    long *a;
    if(!PyArg_ParseTuple(args, "l",a)) {
        printf("PROBLEM ENCOUNTERED\n");
    };
    printf("  ptr-v1: Value of var is: %li\n", *a);
    return PyLong_FromLong(*a);
}

static PyObject *method_parse_ptr2(PyObject *self, PyObject *args) {
    long *a;
    if(!PyArg_ParseTuple(args, "l",a)) {
        return NULL;
    };
    printf("  ptr-v2: Value of var is: %li\n", *a);
    return PyLong_FromLong(*a);
}

static PyObject *method_parse_val(PyObject *self, PyObject *args) {
    long a;
    if(!PyArg_ParseTuple(args, "l",&a)) {
        return NULL;
    };
    printf("     val: Value of var is: %li\n", a);
    return PyLong_FromLong(a);
    
}

static PyMethodDef parseptr_methods[] = {
    {"parse_ptr_v1", method_parse_ptr1, METH_VARARGS, "Parse as pointer, no NULL"},
    {"parse_ptr_v2", method_parse_ptr2, METH_VARARGS, "Parse as pointer, with NULL"},
    {"parse_val", method_parse_val, METH_VARARGS, "Parse as val, with NULL"},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef parsing_ptrs = {
    PyModuleDef_HEAD_INIT,
    "test1_pptr",
    "Testing PyArg_ParseTuple vars as pointers",
    -1,
    parseptr_methods
};

PyMODINIT_FUNC PyInit_test1_pptr(void) {
    return PyModule_Create(&parsing_ptrs);
}

I compile this with the following command:

clang -shared -undefined dynamic_lookup -o test1_parsepointer.so test1_parsepointer.c

Create a .py file that bootstraps this module upon import:

test1_pptr.py:

def __bootstrap__():
    global __bootstrap__, __loader__, __file__
    import sys, pkg_resources, importlib.util
    __file__ = pkg_resources.resource_filename(__name__, 'test1_parsepointer.so')
    __loader__ = None; del __bootstrap__, __loader__
    spec = importlib.util.spec_from_file_location(__name__,__file__)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
__bootstrap__()

Finally, the methods can be tested with the following Python script:

import test1_pptr as tppr

"""
Three functions in tppr should be:
    parse_ptr_v1(int)
    parse_ptr_v2(int)
    parse_val
"""

def main():
    a = int(3)
    print("about to test parse-by-value...")
    tppr.parse_val(a) # runs fine

    print("about to test parse-by-pointer v1...")
    tppr.parse_ptr_v1(a) # runs fine
    
    print("about to test parse-by-pointer v2...")
    tppr.parse_ptr_v2(a) # segfaults

if __name__ == "__main__":
    main()

Answer 1

long *a;

This doesn't point to anything valid because you haven't initialized it (either by allocating memory for a long or by taking the address of an existing long).

if(!PyArg_ParseTuple(args, "l", a))

This attempts to write into whatever a points to. But a doesn't point to a valid long, so it crashes.

The fact that it seems to work in some cases is completely uninteresting. Writing through an invalid pointer is undefined behaviour. In practice, whatever value the uninitialized a happens to hold is arbitrary, so where the write lands is arbitrary too. There's no value in attempting to understand it.
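
To make the two fixes mentioned above concrete, here is a minimal sketch in plain C. It does not use the CPython API: parse_long is a hypothetical stand-in for PyArg_ParseTuple(args, "l", dest), and parse_via_address and parse_via_malloc are illustrative names, not part of any library. The only point is that the pointer handed to the parser must already refer to real storage.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyArg_ParseTuple(args, "l", dest):
 * parses text as a base-10 long and writes the result through dest.
 * Returns 1 on success, 0 on failure (dest is left untouched). */
static int parse_long(const char *text, long *dest) {
    char *end;
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        return 0;               /* empty input or trailing garbage */
    }
    *dest = value;              /* only safe if dest points to a real long */
    return 1;
}

/* Fix 1: make the pointer refer to an existing long.
 * -1 is used here purely as an illustrative error value. */
static long parse_via_address(const char *text) {
    long storage = 0;
    long *a = &storage;         /* a now points to valid memory */
    if (!parse_long(text, a)) {
        return -1;
    }
    return *a;
}

/* Fix 2: allocate memory for a long, and free it when done. */
static long parse_via_malloc(const char *text) {
    long *a = malloc(sizeof *a);
    long result;
    if (a == NULL) {
        return -1;              /* allocation failed */
    }
    if (!parse_long(text, a)) {
        free(a);
        return -1;
    }
    result = *a;
    free(a);
    return result;
}
```

Calling parse_via_address("3") or parse_via_malloc("3") yields 3 in both cases. In the extension module from the question, the first variant corresponds to the working parse_val function: declare a plain long and pass &a, which is the usual pattern with PyArg_ParseTuple.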
