
Broadcast advanced indexing in NumPy

I have an array of values, for example:

x = array([[[-0.78867513, -0.21132487,  0.        ,  0.78867513,  0.21132487,    0.        ,  0.        ,  0.        ,  0.        ],
            [ 0.        , -0.78867513, -0.21132487,  0.        ,  0.78867513,    0.21132487,  0.        ,  0.        ,  0.        ],
            [ 0.        ,  0.        ,  0.        , -0.78867513, -0.21132487,    0.        ,  0.78867513,  0.21132487,  0.        ],
            [ 0.        ,  0.        ,  0.        ,  0.        , -0.78867513,   -0.21132487,  0.        ,  0.78867513,  0.21132487]],
           [[-0.78867513, -0.21132487,  0.        ,  0.78867513,  0.21132487,    0.        ,  0.        ,  0.        ,  0.        ],
            [ 0.        , -0.78867513, -0.21132487,  0.        ,  0.78867513,    0.21132487,  0.        ,  0.        ,  0.        ],
            [ 0.        ,  0.        ,  0.        , -0.78867513, -0.21132487,    0.        ,  0.78867513,  0.21132487,  0.        ],
            [ 0.        ,  0.        ,  0.        ,  0.        , -0.78867513,   -0.21132487,  0.        ,  0.78867513,  0.21132487]]])

I want to use advanced indexing to pull out the nonzero values. I know the indices of the nonzero values:

idx = array([[4, 3, 1, 0],
             [5, 4, 2, 1],
             [7, 6, 4, 3],
             [8, 7, 5, 4]])

The desired result would be something like:

x[idx] = array([[[-0.78867513, -0.21132487,  0.78867513,  0.21132487],
                 [-0.78867513, -0.21132487,  0.78867513,  0.21132487],
                 [-0.78867513, -0.21132487,  0.78867513,  0.21132487],
                 [-0.78867513, -0.21132487,  0.78867513,  0.21132487]],
                [[-0.78867513, -0.21132487,  0.78867513,  0.21132487],
                 [-0.78867513, -0.21132487,  0.78867513,  0.21132487],
                 [-0.78867513, -0.21132487,  0.78867513,  0.21132487],
                 [-0.78867513, -0.21132487,  0.78867513,  0.21132487]]])

The actual x array is much larger along the first dimension, but the nonzero structure is always given by idx, so I need the indexing to broadcast along the first dimension. Is this possible?

EDIT: To be clear, x holds a sequence of 4 x 9 arrays along its first dimension, and idx lists the nonzero column positions row-for-row. Notice that in the first row of both 4 x 9 arrays in x, the entries at positions 4, 3, 1 and 0 are nonzero.

Try this one:

x[:,np.arange(idx.shape[0])[:,None],idx]

With this technique, np.arange(idx.shape[0])[:,None] (which has shape (idx.shape[0], 1) and is therefore a column vector) is broadcast against idx, so row number i is paired with every column index in idx[i]. The leading slice then applies the same row/column pairs to every entry along x's first axis.
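To make the broadcasting concrete, here is a minimal sketch on stand-in data. The integer-valued x and the names rows, ref and alt are assumptions for illustration, not part of the original answer. It checks the one-liner against an explicit loop and against np.take_along_axis, which expresses the same gather along the last axis.

```python
import numpy as np

# Stand-in data (hypothetical): distinct integers make the gather easy to verify.
x = np.arange(2 * 4 * 9).reshape(2, 4, 9)
idx = np.array([[4, 3, 1, 0],
                [5, 4, 2, 1],
                [7, 6, 4, 3],
                [8, 7, 5, 4]])

rows = np.arange(idx.shape[0])[:, None]   # shape (4, 1), a column vector
out = x[:, rows, idx]                     # rows and idx broadcast to (4, 4)

# Explicit-loop reference: out[n, i, j] == x[n, i, idx[i, j]]
ref = np.empty((x.shape[0],) + idx.shape, dtype=x.dtype)
for n in range(x.shape[0]):
    for i in range(idx.shape[0]):
        for j in range(idx.shape[1]):
            ref[n, i, j] = x[n, i, idx[i, j]]

# The same gather written with np.take_along_axis; idx gets a leading
# length-1 axis so it broadcasts over x's first dimension.
alt = np.take_along_axis(x, idx[None, :, :], axis=-1)

print(out.shape)
```

The result has one axis from the slice followed by the broadcast shape of the two index arrays, so it grows with x's first dimension while idx stays fixed.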

I tried this one-liner on your problem and it seems to do the job without needing idx. You may need to change the arguments to .reshape() according to the size of your problem.

np.array(list(filter(lambda v: v != 0, x.ravel()))).reshape(-1, 4, 4)

It flattens the array, removes the zeros and then reshapes the result to the required shape.

Here's another version, which is probably more efficient because it uses NumPy boolean indexing instead of the filter function:

y = x.ravel()
z = y[y!=0].reshape(-1, 4, 4)

EDIT:

While playing around with numpy I discovered yet another way to do it:

x[x!=0].reshape(-1, 4, 4)
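One thing worth checking with the mask-based versions is element order. The sketch below rebuilds the question's data (the names a, b, block, masked and gathered are assumptions for illustration) and compares the boolean mask with idx-based indexing. The mask returns the nonzeros left to right, which matches the desired output shown in the question, whereas idx as written lists the columns right to left. The mask approach also relies on every row having the same number of nonzeros; otherwise the reshape would fail or misalign rows.

```python
import numpy as np

a, b = 0.78867513, 0.21132487
idx = np.array([[4, 3, 1, 0],
                [5, 4, 2, 1],
                [7, 6, 4, 3],
                [8, 7, 5, 4]])

# Rebuild one 4 x 9 block of the question's data, then stack two copies.
block = np.zeros((4, 9))
for i, cols in enumerate(idx):
    block[i, cols] = [b, a, -b, -a]
x = np.stack([block, block])

rows = np.arange(idx.shape[0])[:, None]

masked = x[x != 0].reshape(-1, 4, 4)   # nonzeros in storage order (left to right)
gathered = x[:, rows, idx]             # nonzeros in the order idx lists them

# Sorting each row of idx reproduces the mask's ordering.
gathered_sorted = x[:, rows, np.sort(idx, axis=1)]

print(masked[0, 0])
print(gathered[0, 0])
```

If the order given by idx matters, the indexing approach is the one to use; if only the left-to-right values are needed, the mask is shorter.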

And here's the performance of all three methods:

  • Method 1: 10000 loops, best of 3: 21.2 µs per loop
  • Method 2: 100000 loops, best of 3: 2.42 µs per loop
  • Method 3: 100000 loops, best of 3: 1.97 µs per loop
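For anyone who wants to repeat the comparison, here is a rough sketch using the standard-library timeit module. The function names method1 to method3, the random test data and the loop counts are assumptions, and the absolute numbers will differ by machine, NumPy version and array size.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical test data: 2 x 4 x 9 layout with four nonzeros in every row.
x = np.zeros((2, 4, 9))
x[:, :, :4] = rng.uniform(1.0, 2.0, size=(2, 4, 4))

def method1(x):
    # Python-level filter over the flattened array
    return np.array(list(filter(lambda v: v != 0, x.ravel()))).reshape(-1, 4, 4)

def method2(x):
    # Boolean indexing on the flattened array
    y = x.ravel()
    return y[y != 0].reshape(-1, 4, 4)

def method3(x):
    # Boolean indexing directly on the 3-D array
    return x[x != 0].reshape(-1, 4, 4)

loops = 1000
timings = {}
for f in (method1, method2, method3):
    best = min(timeit.repeat(lambda: f(x), number=loops, repeat=3))
    timings[f.__name__] = best / loops
    print(f"{f.__name__}: {timings[f.__name__] * 1e6:.2f} µs per loop")
```

On larger inputs the gap between the Python-level filter and the two vectorized versions would be expected to widen, since the filter calls the lambda once per element.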

OK, this is a bit odd, but here goes...

idxes = np.ones((x.shape[0], x.shape[1], 1), dtype=bool) * idx
print(x[np.array(x, dtype=bool)].reshape(idxes.shape))

And of course you must remember to write np.array rather than array.

Cheers!

And you can spare yourself from computing idx with the following:

y = x[np.array(x, dtype=bool)]
print(y.reshape(x.shape[0], x.shape[1], y.size // (x.shape[0] * x.shape[1])))

In both this snippet and the one above, casting the floats to bools is what produces the mask: zeros become False and every nonzero value becomes True, so indexing with the mask drops the zeros.
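As a small illustration of the bool cast, the sketch below uses a made-up 1 x 2 x 4 array (the data and the names mask, z and cols are assumptions). It shows that the cast gives the same mask as x != 0, and that np.nonzero can recover the column positions if an idx-like array is still wanted. Both the reshape and the recovery assume every row has the same number of nonzeros.

```python
import numpy as np

# Hypothetical example: one 2 x 4 block with two nonzeros per row.
x = np.array([[[0.0, -0.5, 0.0, 2.0],
               [1.5, 0.0, 0.25, 0.0]]])

mask = np.array(x, dtype=bool)             # zeros -> False, nonzeros -> True
y = x[mask]                                # 1-D array of the nonzero values
z = y.reshape(x.shape[0], x.shape[1], -1)  # -1 lets NumPy infer the last axis

# Column positions of the nonzeros in the first block, one row per block row.
cols = np.nonzero(x[0])[1].reshape(x.shape[1], -1)

print(z)
print(cols)
```

Passing -1 as the last reshape argument is an alternative to computing the size by hand; NumPy infers it from the number of elements.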


 