
Indexing multi-dimensional arrays

I know that multidimensional NumPy arrays can be indexed with other arrays, but I could not figure out how the following works:

I would like to get the items from raster, a 3D NumPy array, based on indx, a 3D index array:

import numpy as np
raster = np.random.rand(5, 10, 50)
indx = np.random.randint(0, high=50, size=(5, 10, 3))

What I want is another array with the dimensions of indx that holds the values of raster selected by the indices in indx.

To resolve your indices correctly during broadcasting, we need two arrays a and b such that raster[a[i,j,k],b[i,j,k],indx[i,j,k]] equals raster[i,j,indx[i,j,k]] for all i, j, k in the corresponding ranges of indx's axes. The easiest solution would be:

x,y,z = indx.shape
a,b,_ = np.ogrid[:x,:y,:z]
raster[a,b,indx]

Here np.ogrid[...] creates three arrays with shapes (x,1,1), (1,y,1) and (1,1,z). We don't need the last one, so we throw it away. When the other two are broadcast against indx, they behave exactly the way we need.
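To make the broadcasting concrete, here is a minimal sketch that checks the np.ogrid result against explicit loops. The fixed seed, the use of np.random.default_rng, and the names out and expected are assumptions added for illustration; they are not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
raster = rng.random((5, 10, 50))
indx = rng.integers(0, 50, size=(5, 10, 3))

x, y, z = indx.shape
a, b, _ = np.ogrid[:x, :y, :z]  # shapes (x,1,1), (1,y,1), (1,1,z)

# a and b broadcast against indx, so the result has indx's shape
out = raster[a, b, indx]

# Reference result built element by element
expected = np.empty(indx.shape)
for i in range(x):
    for j in range(y):
        for k in range(z):
            expected[i, j, k] = raster[i, j, indx[i, j, k]]

print(a.shape, b.shape, out.shape)
print(np.array_equal(out, expected))
```

The loops are only there as a readable reference; the single indexing expression does the same work without Python-level iteration.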

If I understood the question correctly, for each row of indx you are trying to index into the corresponding row of raster, with the column numbers given by the actual values in indx. With that assumption, you can use a vectorized approach based on linear indexing, like so:

M,N,R = raster.shape
linear_indx = R*np.arange(M*N)[:,None] + indx.reshape(M*N,-1)
out = raster.ravel()[linear_indx].reshape(indx.shape)
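The linear-indexing approach relies on the fact that, in a C-ordered array of shape (M, N, R), element [i, j, k] sits at flat position (i*N + j)*R + k. The following sketch spells that out and compares the result with direct fancy indexing. The seed and the names row_offsets, ii, jj and expected are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility
raster = rng.random((5, 10, 50))
indx = rng.integers(0, 50, size=(5, 10, 3))

M, N, R = raster.shape

# Flat offset of the start of each (i, j) row in C order: (i*N + j) * R
row_offsets = R * np.arange(M * N)[:, None]           # shape (M*N, 1)
linear_indx = row_offsets + indx.reshape(M * N, -1)   # shape (M*N, 3)
out = raster.ravel()[linear_indx].reshape(indx.shape)

# Reference: index the first two axes explicitly and let them broadcast
ii = np.arange(M)[:, None, None]
jj = np.arange(N)[None, :, None]
expected = raster[ii, jj, indx]

print(out.shape)
print(np.array_equal(out, expected))
```

Note that this assumes raster is laid out in C (row-major) order, which is what np.random.rand and rng.random produce by default.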

I'm assuming that you want to get 3 random values from each of the 3rd-dimension arrays.

You can do this with a list comprehension combined with advanced indexing.

Here's an example using fewer values, and integers, so the output is easier to read:

import numpy as np

raster=np.random.randint(0, high=1000, size=(2,3,10))
indices=np.random.randint(0, high=10, size=(2,3,3))
results = np.array([
    np.array([column[col_indices] for (column, col_indices) in zip(row, row_indices)])
    for (row, row_indices) in zip(raster, indices)
])

print("Raster:")
print(raster)
print("Indices:")
print(indices)
print("Results:")
print(results)

Output:

Raster:
[[[864 353  11  69 973 475 962 181 246 385]
  [ 54 735 871 218 143 651 159 259 785 383]
  [532 476 113 888 554 587 786 172 798 232]]

 [[891 263  24 310 652 955 305 470 665 893]
  [260 649 466 712 229 474   1 382 269 502]
  [323 513  16 236 594 347 129  94 256 478]]]
Indices:
[[[0 1 2]
  [7 5 1]
  [7 8 9]]

 [[4 0 2]
  [6 1 4]
  [3 9 2]]]
Results:
[[[864 353  11]
  [259 651 735]
  [172 798 232]]

 [[652 891  24]
  [  1 649 229]
  [236 478  16]]]

It iterates simultaneously over the corresponding 3rd-dimension arrays in raster and indices and uses advanced indexing to pick the desired elements from raster.

Here's a more verbose version that does exactly the same thing:

results = []
for i in range(len(raster)):
    row = raster[i]
    row_indices = indices[i]
    row_results = []
    for j in range(len(row)):
        column = row[j]
        column_indices = row_indices[j]
        column_results = column[column_indices]
        row_results.append(column_results)
    results.append(np.array(row_results))
results = np.array(results)
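As a cross-check on the loop-based versions, the same result can be obtained in one call. This sketch swaps in np.take_along_axis (available in NumPy 1.15 and later), which none of the answers above use, so treat it as an alternative rather than the original method. The seed and the name vectorized are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, chosen only for reproducibility
raster = rng.integers(0, 1000, size=(2, 3, 10))
indices = rng.integers(0, 10, size=(2, 3, 3))

# Loop-based result, following the list comprehension above
results = np.array([
    [column[col_indices] for column, col_indices in zip(row, row_indices)]
    for row, row_indices in zip(raster, indices)
])

# Vectorized alternative: pick along the last axis using indices
vectorized = np.take_along_axis(raster, indices, axis=2)

print(results.shape)
print(np.array_equal(results, vectorized))
```

For small arrays the loops are fine and arguably easier to read; the vectorized call mainly pays off as the first two dimensions grow.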
