Indexing multi-dimensional arrays
I know that multidimensional NumPy arrays can be indexed with other arrays, but I could not figure out how the following works.
I would like to get the items from raster, a 3D NumPy array, based on indx, a 3D index array:
import numpy as np
raster = np.random.rand(5, 10, 50)
indx = np.random.randint(0, high=50, size=(5, 10, 3))
What I want is another array with the dimensions of indx that holds the values of raster selected by the indices in indx.
To resolve your indices properly during broadcasting, we need two arrays a and b such that raster[a[i,j,k], b[i,j,k], indx[i,j,k]] is raster[i, j, indx[i,j,k]] for i, j, k in the corresponding ranges of indx's axes. The easiest solution would be:
x,y,z = indx.shape
a,b,_ = np.ogrid[:x,:y,:z]
raster[a,b,indx]
Here np.ogrid[...] creates three arrays with shapes (x,1,1), (1,y,1) and (1,1,z). We don't need the last one, so we throw it away. When the other two are broadcast against indx, they behave exactly the way we need.
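As a sanity check, here is a minimal sketch that compares the np.ogrid result with an explicit triple loop. The seeded generator and the names out and expected are assumptions added for illustration; they are not part of the original answer.

```python
import numpy as np

# Seeded generator so the check is reproducible (an assumption for this sketch)
rng = np.random.default_rng(0)
raster = rng.random((5, 10, 50))
indx = rng.integers(0, 50, size=(5, 10, 3))

x, y, z = indx.shape
a, b, _ = np.ogrid[:x, :y, :z]   # a has shape (x, 1, 1), b has shape (1, y, 1)

# a and b broadcast against indx, so each (i, j, k) picks raster[i, j, indx[i, j, k]]
out = raster[a, b, indx]

# Reference result built with explicit loops
expected = np.empty(indx.shape, dtype=raster.dtype)
for i in range(x):
    for j in range(y):
        for k in range(z):
            expected[i, j, k] = raster[i, j, indx[i, j, k]]

assert out.shape == indx.shape
assert np.array_equal(out, expected)
```

The loop is only there to spell out what the broadcasted index computes; for real data the single indexing expression is the part to keep.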
If I understood the question correctly, for each row of indx you are trying to index into the corresponding row of raster, but the column numbers vary depending on the actual values in indx. With that assumption, you can use a vectorized approach based on linear indexing, like so:
M,N,R = raster.shape
linear_indx = R*np.arange(M*N)[:,None] + indx.reshape(M*N,-1)
out = raster.ravel()[linear_indx].reshape(indx.shape)
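To see why the offsets work, the following sketch spells out the arithmetic on a small integer array. The name row_offsets, the seeded generator, and the spot-check position (i, j, k) are assumptions made for illustration.

```python
import numpy as np

# Small integer data, seeded for reproducibility (an assumption for this sketch)
rng = np.random.default_rng(1)
raster = rng.integers(0, 1000, size=(2, 3, 10))
indx = rng.integers(0, 10, size=(2, 3, 3))

M, N, R = raster.shape
# In the flattened (C-order) array, row number r = i*N + j starts at offset R*r
row_offsets = R * np.arange(M * N)[:, None]            # shape (M*N, 1)
linear_indx = row_offsets + indx.reshape(M * N, -1)    # shape (M*N, 3)
out = raster.ravel()[linear_indx].reshape(indx.shape)

# Spot check: element (i, j, k) sits at flat position (i*N + j)*R + indx[i, j, k]
i, j, k = 1, 2, 0
flat_pos = (i * N + j) * R + indx[i, j, k]
assert linear_indx[i * N + j, k] == flat_pos
assert out[i, j, k] == raster[i, j, indx[i, j, k]]
```

Note that raster.ravel() flattens in C order by default, which is what the R*(i*N + j) offset assumes.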
I'm assuming that you want to get 3 random values from each of the 3rd-dimension arrays.
You can do this with a list comprehension, thanks to advanced indexing.
Here's an example using fewer values, and integers, so the output is easier to read:
import numpy as np
raster=np.random.randint(0, high=1000, size=(2,3,10))
indices=np.random.randint(0, high=10, size=(2,3,3))
results = np.array([
    np.array([column[col_indices] for (column, col_indices) in zip(row, row_indices)])
    for (row, row_indices) in zip(raster, indices)
])
print("Raster:")
print(raster)
print("Indices:")
print(indices)
print("Results:")
print(results)
Output:
Raster:
[[[864 353 11 69 973 475 962 181 246 385]
[ 54 735 871 218 143 651 159 259 785 383]
[532 476 113 888 554 587 786 172 798 232]]
[[891 263 24 310 652 955 305 470 665 893]
[260 649 466 712 229 474 1 382 269 502]
[323 513 16 236 594 347 129 94 256 478]]]
Indices:
[[[0 1 2]
[7 5 1]
[7 8 9]]
[[4 0 2]
[6 1 4]
[3 9 2]]]
Results:
[[[864 353 11]
[259 651 735]
[172 798 232]]
[[652 891 24]
[ 1 649 229]
[236 478 16]]]
It iterates simultaneously over the corresponding 3rd-dimension arrays in raster and indices, and uses advanced indexing to pick the desired elements from raster.
Here's a more verbose version that does exactly the same thing:
results = []
for i in range(len(raster)):
    row = raster[i]
    row_indices = indices[i]
    row_results = []
    for j in range(len(row)):
        column = row[j]
        column_indices = row_indices[j]
        column_results = column[column_indices]
        row_results.append(column_results)
    results.append(np.array(row_results))
results = np.array(results)
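If a loop-free form is preferred, the same result can likely be obtained with np.take_along_axis (available in NumPy 1.15 and later). This is a different technique from the one in the answer above, shown here only as a cross-check. The names looped and vectorized and the seeded generator are assumptions for illustration.

```python
import numpy as np

# Seeded data with the same shapes as the example above (an assumption for this sketch)
rng = np.random.default_rng(2)
raster = rng.integers(0, 1000, size=(2, 3, 10))
indices = rng.integers(0, 10, size=(2, 3, 3))

# Loop-based result, equivalent to the list comprehension above
looped = np.array([
    [column[col_indices] for column, col_indices in zip(row, row_indices)]
    for row, row_indices in zip(raster, indices)
])

# take_along_axis picks raster[i, j, indices[i, j, k]] along the last axis
vectorized = np.take_along_axis(raster, indices, axis=2)

assert np.array_equal(looped, vectorized)
```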