Generator Comprehension and List Comprehension iterate differently

I wrote a function that passes numpy arrays into C code using CFFI. It utilizes the buffer protocol and memoryview to pass the data efficiently without copying it. However, this means that you need to pass C-contiguous arrays and ensure that you are using the right types. Numpy provides a function, numpy.ascontiguousarray, which does this. So I iterate over the arguments and apply this function. The implementation below works, and may be of general interest. However, it is slow given the number of times it is called. (Any general comments on how to speed it up would be helpful.)
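For reference, here is a minimal illustration (not from the original post) of what numpy.ascontiguousarray does: if the input is already C-contiguous with the requested dtype it is returned as-is, otherwise a C-contiguous copy with that dtype is made.

import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)
view = a[:, ::2]                            # a non-contiguous view of a

b = np.ascontiguousarray(view, np.float64)  # makes a new C-contiguous copy
c = np.ascontiguousarray(a, np.float64)     # already contiguous: returns a itself

print(view.flags['C_CONTIGUOUS'])           # False
print(b.flags['C_CONTIGUOUS'])              # True
print(c is a)                               # True, no copy was made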

However, the actual question is: when you replace the first list comprehension with a generator comprehension, or if you refactor the code so that np.ascontiguousarray is called in the second one, the pointers passed into the C code no longer point to the start of the numpy array. I think that it is not getting called. I'm iterating over the comprehension and only using the return values, so why would using a list comprehension or a generator comprehension change anything?

import numpy as np

# `ffi` is assumed to be a cffi FFI instance declared elsewhere (the one against
# which cffi_func was built).

def cffi_wrap(cffi_func, ndarray_params, pod_params, return_shapes=None):
    """
    Wraps a cffi function to allow it to be called on numpy arrays.

    It uses the numpy buffer protocol and the cffi buffer protocol to pass the
    numpy arrays into the C function without copying any of the parameters.
    You will need to pass dimensions into the C function, which you can do using 
    the pod_params.

    Parameters
    ----------
    cffi_func : C function
        This is a C function declared using cffi. It must take double pointers and
        plain old data types. The arguments must be in the form of numpy arrays,
        plain old data types, and then the returned numpy arrays.
    ndarray_params : iterable of ndarrays
        The numpy arrays to pass into the function.
    pod_params : tuple of plain old data
        The plain old data objects to pass in. This may include, for example,
        dimensions.
    return_shapes : iterable of tuples of positive ints
        The shapes of the returned objects.

    Returns
    -------
    return_vals : ndarrays of doubles.
        The objects to be calculated by the cffi_func.

    """

    # Convert every argument to a C-contiguous array of the right dtype.
    arr_param_buffers = [np.ascontiguousarray(param, np.float64)
        if np.issubdtype(param.dtype, np.floating)
        else np.ascontiguousarray(param, np.intc) for param in ndarray_params]
    # Cast each buffer to the matching pointer type for the C call.
    arr_param_ptrs = [ffi.cast("double *", ffi.from_buffer(memoryview(param)))
        if np.issubdtype(param.dtype, np.floating)
        else ffi.cast("int *", ffi.from_buffer(memoryview(param)))
        for param in arr_param_buffers]

    if return_shapes is not None:

        return_vals_ptrs = tuple(ffi.new("double[" + str(np.prod(shape)) + "]") 
            for shape in return_shapes)
        returned_val = cffi_func(*arr_param_ptrs, *pod_params, *return_vals_ptrs)
        return_vals = tuple(np.frombuffer(ffi.buffer(
              return_val))[:np.prod(shape)].reshape(shape)
              for shape, return_val in zip(return_shapes, return_vals_ptrs))
    else:
        returned_val = cffi_func(*arr_param_ptrs, *pod_params)
        return_vals = None

    if returned_val is not None and return_vals is not None:
        return_vals = return_vals + (returned_val,)
    elif return_vals is None:
        return_vals = (returned_val,)

    if len(return_vals) == 1:
        return return_vals[0]
    else:
        return return_vals
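For concreteness, the refactor described in the question, with only the first list comprehension turned into a generator expression and the rest of the function unchanged, would look roughly like this (a sketch reconstructed from the prose above, not code from the post):

    # Sketch of the questioned variant: the buffers are now produced lazily by
    # a generator expression instead of being held in a list for the whole call.
    arr_param_buffers = (np.ascontiguousarray(param, np.float64)
        if np.issubdtype(param.dtype, np.floating)
        else np.ascontiguousarray(param, np.intc) for param in ndarray_params)
    arr_param_ptrs = [ffi.cast("double *", ffi.from_buffer(memoryview(param)))
        if np.issubdtype(param.dtype, np.floating)
        else ffi.cast("int *", ffi.from_buffer(memoryview(param)))
        for param in arr_param_buffers]

It is this version in which the pointers reportedly no longer point at the start of the arrays.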

I'm just guessing, but the error could come from keepalives: with arr_param_buffers a list comprehension, as in your posted code, then as long as this local variable exists (i.e. for the whole duration of cffi_wrap()), all the created numpy arrays are alive. This allows you to do ffi.from_buffer(memoryview(...)) on the next line and be sure that they are all pointers to valid data.

If you replace arr_param_buffers with a generator expression, it will generate the new numpy arrays one by one, call ffi.from_buffer(memoryview(param)) on them, and then throw them away. ffi.from_buffer(x) returns an object that should keep x alive, but maybe x == memoryview(nd) does not itself keep alive the numpy array nd, for all I know.
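If that keepalive chain really is the culprit, a defensive variant (a sketch, not from the answer) is to hold explicit references to the contiguous arrays until the C call has returned, so their lifetime does not depend on what memoryview or ffi.from_buffer keep alive:

arr_param_buffers = []   # explicit references keep every converted array alive
arr_param_ptrs = []
for param in ndarray_params:
    if np.issubdtype(param.dtype, np.floating):
        buf = np.ascontiguousarray(param, np.float64)
        ptr = ffi.cast("double *", ffi.from_buffer(memoryview(buf)))
    else:
        buf = np.ascontiguousarray(param, np.intc)
        ptr = ffi.cast("int *", ffi.from_buffer(memoryview(buf)))
    arr_param_buffers.append(buf)
    arr_param_ptrs.append(ptr)
# ... call cffi_func(*arr_param_ptrs, ...) while arr_param_buffers is still in scope

The list-comprehension version in the question already has this property, which fits the observation that it works while the generator version does not.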
