C array to PyArray

I'm writing a Python C extension without using Cython.

I want to allocate a double array in C, use it in an internal function (which happens to be written in Fortran), and return it. Note that the C-Fortran interface works perfectly when called from plain C.

static PyObject *
Py_drecur(PyObject *self, PyObject *args)
{
  // INPUT
  int n;
  int ipoly;
  double al;
  double be;

  if (!PyArg_ParseTuple(args, "iidd", &n, &ipoly, &al, &be))
    return NULL;

  // OUTPUT
  int nd = 1;
  npy_intp dims[] = {n};
  double a[n];
  double b[n];
  int ierr;

  drecur_(n, ipoly, al, be, a, b, ierr);

  // Create PyArray
  PyObject* alpha = PyArray_SimpleNewFromData(nd, dims, NPY_DOUBLE, a);
  PyObject* beta = PyArray_SimpleNewFromData(nd, dims, NPY_DOUBLE, b);

  Py_INCREF(alpha);
  Py_INCREF(beta);

  return Py_BuildValue("OO", alpha, beta);
}

I debugged this code and I get a segmentation fault when I try to create alpha from a. Up to that point everything works fine. The function drecur_ works, and I get the same problem if the call is removed.

Now, what is the standard way of defining a PyArray around C data? I found documentation but no good example. Also, what about memory leaks? Is it correct to INCREF before returning so that the instances of alpha and beta are preserved? What about deallocation when they are no longer needed?

EDIT: I finally got it right with the approach found in the NumPy cookbook.

static PyObject *
Py_drecur(PyObject *self, PyObject *args)
{
  // INPUT
  int n;
  int ipoly;
  double al;
  double be;
  double *a, *b;
  PyArrayObject *alpha, *beta;

  if (!PyArg_ParseTuple(args, "iidd", &n, &ipoly, &al, &be))
    return NULL;

  // OUTPUT
  int nd = 1;
  int dims[2];
  dims[0] = n;
  alpha = (PyArrayObject*) PyArray_FromDims(nd, dims, NPY_DOUBLE);
  beta = (PyArrayObject*) PyArray_FromDims(nd, dims, NPY_DOUBLE);
  a = pyvector_to_Carrayptrs(alpha);
  b = pyvector_to_Carrayptrs(beta);
  int ierr;

  drecur_(n, ipoly, al, be, a, b, ierr);

  return Py_BuildValue("OO", alpha, beta);
}

double *pyvector_to_Carrayptrs(PyArrayObject *arrayin)  {
  int n=arrayin->dimensions[0];
  return (double *) arrayin->data;  /* pointer to arrayin data as double */
}

Feel free to comment on this, and thanks for the answers.

The first thing that looks suspicious is that your arrays a and b are in the local scope of the function. That means that once the function has returned, any access to them through the NumPy arrays is an illegal memory access.

So you should allocate the arrays on the heap instead:

double *a = malloc(n*sizeof(double));
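To make the lifetime difference concrete, here is a minimal sketch in plain C with no NumPy calls. The helper name make_filled_buffer is hypothetical and only illustrates that heap memory stays valid after the allocating function returns.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles on the heap and fill them.
 * Unlike a local `double a[n]`, this block is still valid after the
 * function returns; whoever ends up owning it must call free(). */
double *make_filled_buffer(size_t n, double value)
{
    double *buf = malloc(n * sizeof *buf);
    if (buf == NULL)
        return NULL;            /* allocation failed (or n was 0) */
    for (size_t i = 0; i < n; ++i)
        buf[i] = value;
    return buf;
}
```

A caller would use the returned pointer as long as needed and then pass it to free(), or hand that responsibility to another object.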

Then you need to make sure that the memory is later freed by the object you have created. See this quote from the documentation:

PyObject* PyArray_SimpleNewFromData(int nd, npy_intp* dims, int typenum, void* data)

Sometimes, you want to wrap memory allocated elsewhere into an ndarray object for downstream use. This routine makes it straightforward to do that. The first three arguments are the same as in PyArray_SimpleNew, the final argument is a pointer to a block of contiguous memory that the ndarray should use as it's data-buffer which will be interpreted in C-style contiguous fashion. A new reference to an ndarray is returned, but the ndarray will not own its data. When this ndarray is deallocated, the pointer will not be freed.

You should ensure that the provided memory is not freed while the returned array is in existence. The easiest way to handle this is if data comes from another reference-counted Python object. The reference count on this object should be increased after the pointer is passed in, and the base member of the returned ndarray should point to the Python object that owns the data. Then, when the ndarray is deallocated, the base-member will be DECREF'd appropriately. If you want the memory to be freed as soon as the ndarray is deallocated then simply set the OWNDATA flag on the returned ndarray.

As for your second question: Py_INCREF(alpha); is generally only necessary if you intend to keep the reference in a global variable or a class member. Since you are only wrapping a function, you don't have to do it. I was not sure whether PyArray_SimpleNewFromData returns the array with its reference count already set to 1; if it did not, you would have to increase it yourself. The documentation quoted above says that a new reference is returned, though, so no extra Py_INCREF should be needed.

One problem might be that your arrays (a, b) have to last at least as long as the NumPy arrays that wrap them. You created them in local scope, so they are destroyed when you leave the function.

Try having Python allocate the array (e.g. using PyArray_SimpleNew), then copy your content into it or pass its data pointer to your routine. You might also want to use boost::python to take care of these details, if building against Boost is an option.
