How to get a 2d array of indices from a 1d array?

I'm looking for an efficient way to return indices for a 2d array based on values in a 1d array. I currently have a nested for loop set up that is painfully slow.

Here is some example data and what I want to get:

data2d = np.array([[1, 2], [1, 3], [3, 4], [1, 2], [7, 9]])

data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

For each element of data2d, I would like to return the index at which that value occurs in data1d. My desired output would be this 2d array:

locs = np.array([[0, 1], [0, 2], [2, 3], [0, 1], [6, 8]])

The only thing I've come up with is the nested for loop:

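locs = np.full(data2d.shape, np.nan)

for i in range(data2d.shape[0]):
    for j in range(data2d.shape[1]):
        # np.where returns a tuple of index arrays; take the array for axis 0
        loc_val = np.where(data1d == data2d[i, j])[0]
        # store the first (and only) matching position as a scalar
        locs[i, j] = loc_val[0]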

This would be fine for a small set of data, but I have 87,600 2d grids of 428x614 grid points each.

Use np.searchsorted:

np.searchsorted(data1d, data2d.ravel()).reshape(data2d.shape)

array([[0, 1],
       [0, 2],
       [2, 3],
       [0, 1],
       [6, 8]])

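searchsorted performs a binary search for each value of the ravelled data2d in data1d, which must be sorted in ascending order. The result is then reshaped to the shape of data2d. Note that searchsorted returns insertion points, so a value that is absent from data1d still gets an index rather than an error.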
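If data1d is not guaranteed to be sorted, or some grid values might be missing from it, something along these lines may help. This is only a sketch: lookup_indices is a hypothetical helper name, and it assumes the values in the 1d array are unique. It uses the sorter argument of np.searchsorted and then checks that every returned index really points at the requested value.

```python
import numpy as np

def lookup_indices(values, table):
    """Return, for each element of `values`, its index in the 1-D `table`.

    Hypothetical helper: `table` may be unsorted but is assumed to hold
    unique values. Raises ValueError if any value is absent from `table`.
    """
    table = np.asarray(table)
    values = np.asarray(values)
    order = np.argsort(table)                      # permutation that sorts table
    pos = np.searchsorted(table, values, sorter=order)
    pos = np.clip(pos, 0, len(table) - 1)          # values above the max land past the end
    idx = order[pos]                               # map back to positions in the original table
    if not np.array_equal(table[idx], values):
        raise ValueError("some values are not present in table")
    return idx

data2d = np.array([[1, 2], [1, 3], [3, 4], [1, 2], [7, 9]])
data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
locs = lookup_indices(data2d, data1d)
```

Since np.searchsorted returns an array with the same shape as the values it is given, the helper skips the ravel/reshape step, and a whole stack of grids could be passed in one call if it fits in memory.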


Another option is to build an index and query it in constant time per lookup. You can do this with pandas' Index API.

import pandas as pd

idx = pd.Index([1,2,3,4,5,6,7,8,9])
idx
#  Int64Index([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')
#  (pandas 2.0 and later print this as Index([...], dtype='int64'))

idx.get_indexer(data2d.ravel()).reshape(data2d.shape)

array([[0, 1],
       [0, 2],
       [2, 3],
       [0, 1],
       [6, 8]])
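One thing worth knowing about get_indexer is how it treats values that are not in the index: it returns -1 for them instead of raising. A small sketch of checking for that, where grid is a made-up example containing a value (10) that is deliberately absent from data1d:

```python
import numpy as np
import pandas as pd

data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
grid = np.array([[1, 2], [7, 10]])   # hypothetical grid; 10 is not in data1d

idx = pd.Index(data1d)               # get_indexer requires the index values to be unique
locs = idx.get_indexer(grid.ravel()).reshape(grid.shape)

missing = locs == -1                 # True wherever the value was not found
```

Because -1 is also a valid NumPy index (the last element), it is probably safer to test the missing mask before using locs to index anything.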

This should also be fast:

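import numpy as np
data2d = np.array([[1, 2], [1, 3], [3, 4], [1, 2], [7, 9]])
data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
# Map each value in data1d to its position for constant-time lookups
idxdict = dict(zip(data1d, range(len(data1d))))
# Write into a fresh array; `locs = data2d` would alias and overwrite the input
locs = np.empty_like(data2d)
for i in range(data2d.shape[0]):
    for j in range(data2d.shape[1]):
        locs[i, j] = idxdict[data2d[i, j]]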
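The dictionary lookup still loops in Python over every grid point. If the values happen to be non-negative integers with a modest maximum (an assumption; the question does not say what the real grids contain), the same idea can be expressed as a NumPy lookup table, which is a different technique from the dict above. A rough sketch, with lut being a name chosen here for illustration:

```python
import numpy as np

data2d = np.array([[1, 2], [1, 3], [3, 4], [1, 2], [7, 9]])
data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# lut[v] holds the position of value v in data1d, or -1 if v is absent.
# Assumes non-negative integer values small enough for a table of size max + 1.
lut = np.full(data1d.max() + 1, -1, dtype=np.intp)
lut[data1d] = np.arange(len(data1d))

# Fancy indexing applies the table to every grid point at once.
locs = lut[data2d]
```

This does not need data1d to be sorted, but it will raise an IndexError for grid values larger than data1d.max(), and it does not apply to floating-point values.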
