Extract elements of a 2d array with indices from another 2d array
I have a 2D NumPy array with data.shape == (n, 8), and another with ind.shape == (n, 4). The ind array has the same length as data and contains indices like [4, 3, 0, 6]. How can I create another array with shape == (n, 4) containing the elements of data specified by the indices in ind? My actual arrays are long (large shape[0]), so the loop is slow. Is there a better way than looping?
import numpy as np
# Example data
data = np.array([[ 0.44180102, -0.05941365,  2.1482739 , -0.56875081, -1.45400572,
                  -1.44391254, -0.33710766, -0.44214518],
                 [ 0.79506417, -2.46156966, -0.09929341, -1.07347179,  1.03986533,
                  -0.45745476,  0.58853107, -1.08565425],
                 [ 1.40348682, -1.43396403,  0.8267174 , -1.54812358, -1.05854445,
                   0.15789466, -0.0666025 ,  0.29058816]])
ind = np.array([[3, 4, 1, 5],
                [4, 7, 0, 1],
                [5, 1, 3, 6]])
# This is the part I want to vectorize:
out = np.zeros(ind.shape)
for i in range(ind.shape[0]):
    out[i,:] = data[i,ind[i,:]]
# This should be good
assert np.all(out == np.array([[-0.56875081, -1.45400572, -0.05941365, -1.44391254],
                               [ 1.03986533, -1.08565425,  0.79506417, -2.46156966],
                               [ 0.15789466, -1.43396403, -1.54812358, -0.0666025 ]]))
This can be done easily if we index into the raveled data array:
out = data.ravel()[ind.ravel() + np.repeat(range(0, 8*ind.shape[0], 8), ind.shape[1])].reshape(ind.shape)
It might be easier to understand if it is broken down into three steps:
indices = ind.ravel() + np.repeat(range(0, 8*ind.shape[0], 8), ind.shape[1])
out = data.ravel()[indices]
out = out.reshape(ind.shape)
ind has the information on the elements from data that we want. Unfortunately, it is expressed as 2-D indices. The first line above converts these into indices of the 1-D raveled data. The second line selects those elements out of the raveled data array. The third line restores the 2-D shape to out. In short, the 2-D indices represented by ind are converted to the 1-D indices held in indices.
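The index arithmetic in the first step can be sketched without hardcoding the row width of 8. This is only an illustration under assumptions: the names n_rows, n_cols, row_offsets and flat_indices are made up here, and a small np.arange array stands in for the real data so the offsets are easy to follow.

```python
import numpy as np

# Stand-in data: element [i, j] equals its own flat position i * 8 + j.
data = np.arange(24, dtype=float).reshape(3, 8)
ind = np.array([[3, 4, 1, 5],
                [4, 7, 0, 1],
                [5, 1, 3, 6]])

n_rows, n_cols = data.shape

# Flat position of the first element of each row: 0, 8, 16, ...
row_offsets = np.arange(n_rows) * n_cols

# Broadcasting a (n, 1) column against the (n, 4) index array adds
# each row's offset to every column index in that row.
flat_indices = ind + row_offsets[:, None]

# Indexing the raveled array with a 2-D index array returns a result
# with the shape of the index array, so no reshape is needed here.
flat_out = data.ravel()[flat_indices]
```

Because the offsets are derived from data.shape, the same lines should work for any number of columns, provided data is a regular 2-D array.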
What you want is something like this:
import numpy as np
data = np.array([[ 0.4, -0.1,  2.1, -0.6, -1.5, -1.4, -0.3, -0.4],
                 [ 0.8, -2.5, -0.1, -1.1,  1. , -0.5,  0.6, -1.1],
                 [ 1.4, -1.4,  0.8, -1.5, -1.1,  0.2, -0.1,  0.3]])
expected = np.array([[-0.6, -1.5, -0.1, -1.4],
                     [ 1. , -1.1,  0.8, -2.5],
                     [ 0.2, -1.4, -1.5, -0.1]])
indI = np.array([[0, 0, 0, 0],
                 [1, 1, 1, 1],
                 [2, 2, 2, 2]])
indJ = np.array([[3, 4, 1, 5],
                 [4, 7, 0, 1],
                 [5, 1, 3, 6]])
out = data[indI, indJ]
assert np.all(out == expected)
Notice that indI and indJ are the same shape and that
out[i, j] == data[indI[i, j], indJ[i, j]]
for all i and j.
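To make that relation concrete, here is a small check that walks every position and compares both sides. It is only a sketch: the np.arange array is a stand-in for the real data, chosen so each value identifies its own position.

```python
import numpy as np

# Stand-in data with unique values so a wrong lookup would be visible.
data = np.arange(24, dtype=float).reshape(3, 8)

indI = np.array([[0, 0, 0, 0],
                 [1, 1, 1, 1],
                 [2, 2, 2, 2]])
indJ = np.array([[3, 4, 1, 5],
                 [4, 7, 0, 1],
                 [5, 1, 3, 6]])

# Integer-array indexing pairs the two index arrays element by element.
out = data[indI, indJ]

# Explicit double loop confirming the element-wise relation.
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        assert out[i, j] == data[indI[i, j], indJ[i, j]]
```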
You might have noticed that indI is very repetitive. Because of numpy's broadcasting magic you can simplify indI to:
indI = np.array([[0],
                 [1],
                 [2]])
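One way to see why the column form is enough is to ask NumPy to expand it explicitly. The sketch below uses np.broadcast_arrays for that; the names indI_col and indI_full are illustrative only, and the np.arange array is a stand-in for the real data.

```python
import numpy as np

data = np.arange(24, dtype=float).reshape(3, 8)  # stand-in data
indJ = np.array([[3, 4, 1, 5],
                 [4, 7, 0, 1],
                 [5, 1, 3, 6]])

# Column-shaped row index, shape (3, 1).
indI_col = np.array([[0],
                     [1],
                     [2]])

# Expand the column to indJ's shape the same way indexing would.
indI_full, _ = np.broadcast_arrays(indI_col, indJ)

out_col = data[indI_col, indJ]    # relies on implicit broadcasting
out_full = data[indI_full, indJ]  # uses the fully expanded index array
```

Both lookups should give the same result; the column form just avoids storing the repeated row numbers.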
You can build this kind of indI array a few different ways; here is my favorite:
a, b = indJ.shape
indI, _ = np.ogrid[:a, :0]
out = data[indI, indJ]
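Two other options, neither of which comes from the answer above, are sketched here for comparison. The first builds the row index with np.arange and a new axis; the second swaps in np.take_along_axis, which is available in NumPy 1.15 and later and does the per-row lookup directly. The variable names and the np.arange stand-in data are assumptions for illustration.

```python
import numpy as np

data = np.arange(24, dtype=float).reshape(3, 8)  # stand-in data
indJ = np.array([[3, 4, 1, 5],
                 [4, 7, 0, 1],
                 [5, 1, 3, 6]])

# Option 1: a (n, 1) column of row numbers, broadcast against indJ.
rows = np.arange(indJ.shape[0])[:, None]
out_fancy = data[rows, indJ]

# Option 2: let NumPy pick the values along axis 1 for each row.
out_take = np.take_along_axis(data, indJ, axis=1)
```

Which one reads better is a matter of taste; both avoid the Python-level loop from the question.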