(NumPy C API) Iterating over a single array: NpyIter vs for loop (with PyArray_DATA)

I am writing some C extension code for a Python module. The function I want to write is (in Python):

output = 1./(1. + input)

where input is a NumPy array of any shape.

Originally I was using NpyIter_MultiNew:

static PyObject *
helper_calc1(PyObject *self, PyObject *args){

    PyObject * input;
    PyObject * output = NULL;

    if (!PyArg_ParseTuple(args, "O", &input)){
        return NULL;
    }

    // -- input -----------------------------------------------
    PyArrayObject * in_arr;

    in_arr = (PyArrayObject *) PyArray_FROM_OTF(input, NPY_DOUBLE, NPY_ARRAY_IN_ARRAY);
    if (in_arr == NULL){
        goto fail;
    }

    // -- set up iterator -------------------------------------
    PyArrayObject * op[2];
    npy_uint32 op_flags[2];
    npy_uint32 flags;

    op[0] = in_arr;
    op_flags[0] = NPY_ITER_READONLY;

    op[1] = NULL;
    op_flags[1] = NPY_ITER_WRITEONLY | NPY_ITER_ALLOCATE;

    flags = NPY_ITER_EXTERNAL_LOOP | NPY_ITER_BUFFERED | NPY_ITER_GROWINNER;

    NpyIter * iter = NpyIter_MultiNew(2, op,
                            flags, NPY_KEEPORDER, NPY_NO_CASTING,
                            op_flags, NULL);

    if (iter == NULL){
        goto fail;
    }

    NpyIter_IterNextFunc * iternext = NpyIter_GetIterNext(iter, NULL);
    if (iternext == NULL){
        NpyIter_Deallocate(iter);
        goto fail;
    }

    // -- iterate ---------------------------------------------
    npy_intp count;
    char ** dataptr = NpyIter_GetDataPtrArray(iter);
    npy_intp * strideptr = NpyIter_GetInnerStrideArray(iter);
    npy_intp * innersizeptr = NpyIter_GetInnerLoopSizePtr(iter);

    do {
        count = *innersizeptr;

        while (count--){

            *(double *) dataptr[1] = 1. / (1. + *(double *)dataptr[0]);

            dataptr[0] += strideptr[0];
            dataptr[1] += strideptr[1];
        }

    } while (iternext(iter));

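    // The iterator owns its operands, so take our own reference to the
    // allocated output before the iterator is deallocated.
    output = (PyObject *) NpyIter_GetOperandArray(iter)[1];
    Py_INCREF(output);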

    if (NpyIter_Deallocate(iter) != NPY_SUCCEED){
        goto fail;
    }

    Py_DECREF(in_arr);

    return output;

    fail:
        Py_XDECREF(in_arr);
        Py_XDECREF(output);
        return NULL;
}

However, since this is just a single array (i.e. I don't need to be concerned about broadcasting multiple arrays), is there any reason I can't iterate over the data myself using PyArray_DATA, a for loop and the array size?

static PyObject *
helper_calc2(PyObject *self, PyObject *args){

    PyObject * input;
    PyObject * output = NULL;

    if (!PyArg_ParseTuple(args, "O", &input)){
        return NULL;
    }

    // -- input -----------------------------------------------
    PyArrayObject * in_arr;

    in_arr = (PyArrayObject *) PyArray_FROM_OTF(input, NPY_DOUBLE, NPY_ARRAY_IN_ARRAY);
    if (in_arr == NULL){
        return NULL;
    }

    int ndim = PyArray_NDIM(in_arr);
    npy_intp * shape = PyArray_DIMS(in_arr);
    // npy_intp avoids truncating the element count for very large arrays
    npy_intp size = PyArray_SIZE(in_arr);

    double * in_data = (double *) PyArray_DATA(in_arr);

    output = PyArray_SimpleNew(ndim, shape, NPY_DOUBLE);
    if (output == NULL){
        goto fail;
    }
    double * out_data = (double *) PyArray_DATA((PyArrayObject *) output);

    for (npy_intp i = 0; i < size; i++){
        out_data[i] = 1. / (1. + in_data[i]);
    }

    Py_DECREF(in_arr);
    return output;

fail:
    Py_XDECREF(in_arr);
    Py_XDECREF(output);
    return NULL;
}

This second version runs more quickly and the code is shorter.

Are there any dangers I need to look out for when using PyArray_DATA with a for loop instead of NpyIter_MultiNew?

From the PyArray_DATA documentation:

If you have not guaranteed a contiguous and/or aligned array then be sure you understand how to access the data in the array to avoid memory and/or alignment problems.

But I believe that this is taken care of by PyArray_FROM_OTF with the NPY_ARRAY_IN_ARRAY flag.
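To make the documentation's warning about contiguity concrete, here is a minimal sketch in plain C with no NumPy headers. The names recip_contiguous and recip_strided are hypothetical helpers for illustration, not NumPy functions. The first assumes the doubles are packed back to back, which is what NPY_ARRAY_IN_ARRAY arranges. The second walks the buffer by a byte stride, which is how a non-contiguous view (for example, every second element) would have to be read.

```c
#include <assert.h>
#include <stddef.h>

/* Assumes n doubles stored back to back (a C-contiguous, aligned buffer). */
static void recip_contiguous(const double *in, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = 1.0 / (1.0 + in[i]);
    }
}

/* Walks the input by a byte stride, as a non-contiguous view requires.
   This sketch only handles strides that are a multiple of sizeof(double)
   and an input pointer that is itself aligned for double. */
static void recip_strided(const char *in, ptrdiff_t stride_bytes,
                          double *out, size_t n)
{
    assert(stride_bytes % (ptrdiff_t)sizeof(double) == 0);
    for (size_t i = 0; i < n; i++) {
        const double *elem =
            (const double *)(in + (ptrdiff_t)i * stride_bytes);
        out[i] = 1.0 / (1.0 + *elem);
    }
}
```

With a base buffer {0, 1, 2, 3, 4, 5} and a view of every second element (a stride of 2 * sizeof(double)), recip_strided reads 0, 2, 4. Calling recip_contiguous on the same pointer with n = 3 would read 0, 1, 2 instead, which is the silent error a plain indexed loop produces on a non-contiguous array.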

You should be OK in this case. As you mentioned, NPY_ARRAY_IN_ARRAY with PyArray_FROM_OTF takes care of that issue, and since you are casting the data pointer to the right type, you should be fine. However, in general, as you seem to be aware, you can't use this approach if you are directly accepting a NumPy array from Python code without converting it first, because such an array may be non-contiguous, misaligned, or of a different dtype.
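For the alignment half of that caveat, one common C technique is to copy each element through memcpy rather than dereferencing a double pointer, since memcpy has no alignment requirement. The sketch below is plain C and makes no NumPy calls. The names load_double, store_double and recip_unaligned are hypothetical, and it assumes the elements are packed sizeof(double) bytes apart.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reads a double from an address that may not be aligned for double. */
static double load_double(const char *p)
{
    double value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Writes a double to an address that may not be aligned for double. */
static void store_double(char *p, double value)
{
    memcpy(p, &value, sizeof value);
}

/* Computes out = 1 / (1 + in) for n packed doubles at arbitrary alignment. */
static void recip_unaligned(const char *in, char *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        double x = load_double(in + i * sizeof(double));
        store_double(out + i * sizeof(double), 1.0 / (1.0 + x));
    }
}
```

This is slower to write and sometimes slower to run than the direct cast, which is why requesting an aligned array up front, as the question does, is usually the simpler route.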

Happy C extension writing.
