Numpy nditer for memory saving?

I'm lost when iterating over an ndarray with nditer.

Background

I am trying to compute the eigenvalues of 3x3 symmetric matrices for each point in a 3D array. My data is a 4D array of shape [6,x,y,z], where the 6 values are the independent entries of the matrix at point x,y,z, over a ~500x500x500 cube of float32. I first used numpy's eigvalsh, but it's optimized for large matrices, while I can use an analytical simplification for 3x3 symmetric matrices.

I then implemented Wikipedia's simplification, both as a function that takes a single matrix and computes its eigenvalues (iterating naively with nested for loops), and as a vectorized version using numpy.
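For reference, a single-matrix version might look like the sketch below. It assumes the "simplification" is the trigonometric closed form given in Wikipedia's "Eigenvalue algorithm" article, and that the six stored values map to the matrix as [[a, b, c], [b, d, e], [c, e, f]]. Both are assumptions, and eig3_sym is a hypothetical name, not the asker's actual function.

```python
import math

def eig3_sym(a, b, c, d, e, f):
    """Eigenvalues (ascending) of [[a, b, c], [b, d, e], [c, e, f]].

    The mapping of the six values to matrix entries is an assumption.
    """
    p1 = b * b + c * c + e * e
    if p1 == 0.0:
        # Already diagonal: the eigenvalues are the diagonal entries.
        return tuple(sorted((a, d, f)))
    q = (a + d + f) / 3.0                      # mean of the eigenvalues
    p2 = (a - q) ** 2 + (d - q) ** 2 + (f - q) ** 2 + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # B = (A - q*I) / p, written out entry by entry.
    b00, b11, b22 = (a - q) / p, (d - q) / p, (f - q) / p
    b01, b02, b12 = b / p, c / p, e / p
    det_b = (b00 * (b11 * b22 - b12 * b12)
             - b01 * (b01 * b22 - b12 * b02)
             + b02 * (b01 * b12 - b11 * b02))
    r = max(-1.0, min(1.0, det_b / 2.0))       # clamp against rounding error
    phi = math.acos(r) / 3.0
    hi = q + 2.0 * p * math.cos(phi)
    lo = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    mid = 3.0 * q - hi - lo                    # trace equals the sum of eigenvalues
    return lo, mid, hi
```

Calling eig3_sym(2, 0, 0, 3, 4, 9) should give values close to (1, 2, 11), which can be cross-checked against np.linalg.eigvalsh on the same matrix.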

The problem is that inside my vectorized version, each operation creates an intermediate array the size of my data, which ends up using too much RAM and freezing my PC.

I tried using numexpr etc., but it still uses around 10 GB.

What I'm trying to do

I want to iterate (using numpy's nditer) through my array so that I compute the eigenvalues for each matrix in turn. This would remove the need to allocate huge intermediate arrays, because we only calculate ~10 floats at a time. Basically, I'm trying to replace the nested for loops with a single iterator.

I'm looking for something like this:

for a,b,c,d,e,f in np.nditer([symMatrix,eigenOut]): # for each matrix in x,y,z

    # computing my output for this matrix
    eigenOut[...] = myLovelyEigenvalue(a,b,c,d,e,f)

The best I have so far is this:

for i in np.nditer([derived],[],[['readonly']],op_axes=[[1,2,3]]):

But this means that i takes every value of the 4D array in turn instead of being a tuple of length 6. I can't seem to get the hang of the nditer documentation.

What am I doing wrong? Do you have any tips and tricks for iterating over "all but one" axis?

The point is to have an nditer that outperforms regular nested loops on iteration (once this works I'll change the function calls, buffer the iteration, ... but so far I just want it to work ^^).

You don't really need np.nditer for this. A simpler way of iterating over all but the first axis is just to reshape into a [6, 500 ** 3] array, transpose it to [500 ** 3, 6], then iterate over the rows:

for (a, b, c, d, e, f) in (symMatrix.reshape(6, -1).T):
    # do something involving a, b, c, d, e, f...
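As a small runnable illustration of the reshape-and-transpose approach, the sketch below uses a tiny stand-in array and a placeholder computation (a + d + f) in place of the real eigenvalue function. The array shape and the placeholder are assumptions for demonstration only.

```python
import numpy as np

# Tiny stand-in for the real [6, x, y, z] data (shape chosen for illustration).
symMatrix = np.arange(6 * 2 * 3 * 4, dtype=np.float32).reshape(6, 2, 3, 4)

# For a C-contiguous input, reshape and .T both return views, so no data is copied.
rows = symMatrix.reshape(6, -1).T            # shape (24, 6): one row per point

flat_out = np.empty(rows.shape[0], dtype=np.float32)
for i, (a, b, c, d, e, f) in enumerate(rows):
    flat_out[i] = a + d + f                  # placeholder for the real per-matrix work

# Rows were visited in C order over (x, y, z), so a plain reshape restores the grid.
out = flat_out.reshape(symMatrix.shape[1:])
```

Because the rows come out in C order over the spatial axes, the flat results map back to the [x, y, z] grid with an ordinary reshape.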

If you really want to use np.nditer then you would do something like this:

for (a, b, c, d, e, f) in np.nditer(symMatrix, flags=['external_loop'], order='F'):
    # do something involving a, b, c, d, e, f...
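A runnable sketch of the nditer variant follows, again with a tiny stand-in array named symMatrix and a placeholder computation (both assumptions). Note that with order='F' the points are visited with x varying fastest, so the flat results need an order='F' reshape to land back on the [x, y, z] grid.

```python
import numpy as np

# Tiny C-contiguous stand-in for the real [6, x, y, z] data.
symMatrix = np.arange(6 * 2 * 3 * 4, dtype=np.float32).reshape(6, 2, 3, 4)

flat = []
for chunk in np.nditer(symMatrix, flags=['external_loop'], order='F'):
    # Each chunk is a 1-D array running along the first axis (length 6 here).
    a, b, c, d, e, f = chunk
    flat.append(a + d + f)                   # placeholder for the real per-matrix work

# Points were visited in Fortran order, so reshape with order='F' to match.
out = np.array(flat, dtype=np.float32).reshape(symMatrix.shape[1:], order='F')
```

This relies on the input being C-contiguous. For other memory layouts nditer may merge axes and yield chunks of a different length, so it is worth checking len(chunk) before unpacking.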

A potentially important thing to consider is memory layout. If symMatrix is C-order (row-major) rather than Fortran-order (column-major), the six values belonging to one point are far apart in memory, so gathering them for each point may be significantly slower than it would be if they sat at adjacent memory addresses. You might therefore want to consider switching to Fortran-order.
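The layout difference can be seen directly from the strides. The sketch below uses a small float32 array (the shape is an assumption for illustration) and np.asfortranarray. Keep in mind that the conversion makes a full copy, which matters for a dataset this large.

```python
import numpy as np

c_arr = np.zeros((6, 4, 4, 4), dtype=np.float32)   # C order is the default
f_arr = np.asfortranarray(c_arr)                    # Fortran-order copy of the data

# Strides are in bytes. In C order, stepping along the first axis jumps
# 4*4*4*4 = 256 bytes; in Fortran order it moves by one float32 (4 bytes).
c_strides = c_arr.strides   # (256, 64, 16, 4)
f_strides = f_arr.strides   # (4, 24, 96, 384)
```

In the Fortran-order copy the six components of each point occupy 24 consecutive bytes, which is the access pattern the per-point loop wants.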

I wouldn't expect a massive performance gain from either of these, since you're still doing all of your looping in Python and operating only on scalars rather than taking advantage of vectorization.
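One way to keep vectorization while bounding the size of the temporaries is to process the cube in slabs along one spatial axis. This is not something the answer above proposes; it is a minimal sketch under several assumptions: the function names and chunk size are hypothetical, the six values map to [[a, b, c], [b, d, e], [c, e, f]], and the closed form is the trigonometric one from Wikipedia's "Eigenvalue algorithm" article.

```python
import numpy as np

def eigvals_sym3_block(m):
    """Vectorized eigenvalues (ascending) for a block of shape (6, ...)."""
    a, b, c, d, e, f = m.astype(np.float64)   # unpack along the first axis
    q = (a + d + f) / 3.0
    p1 = b * b + c * c + e * e
    p = np.sqrt(((a - q) ** 2 + (d - q) ** 2 + (f - q) ** 2 + 2.0 * p1) / 6.0)
    safe_p = np.where(p > 0, p, 1.0)          # avoid 0/0 for multiples of identity
    b00, b11, b22 = (a - q) / safe_p, (d - q) / safe_p, (f - q) / safe_p
    b01, b02, b12 = b / safe_p, c / safe_p, e / safe_p
    r = 0.5 * (b00 * (b11 * b22 - b12 * b12)
               - b01 * (b01 * b22 - b12 * b02)
               + b02 * (b01 * b12 - b11 * b02))
    phi = np.arccos(np.clip(r, -1.0, 1.0)) / 3.0
    hi = q + 2.0 * p * np.cos(phi)
    lo = q + 2.0 * p * np.cos(phi + 2.0 * np.pi / 3.0)
    mid = 3.0 * q - hi - lo
    return np.stack([lo, mid, hi])

def eigvals_chunked(symMatrix, chunk=16):
    """Fill a (3, x, y, z) output one slab of `chunk` x-planes at a time."""
    out = np.empty((3,) + symMatrix.shape[1:], dtype=symMatrix.dtype)
    for start in range(0, symMatrix.shape[1], chunk):
        sl = slice(start, start + chunk)
        # Temporaries inside the call are only as large as one slab.
        out[:, sl] = eigvals_sym3_block(symMatrix[:, sl])
    return out
```

With this shape of loop, peak temporary memory scales with chunk * y * z rather than with the whole cube, and the Python-level loop runs only x / chunk times.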
