How can I replace values in a Python NumPy array with the index of those values found in another array?

I have an n*m array "a" and another 1D array "b", such as the following:

a = array([[ 51, 30, 20, 10],
           [ 10, 32, 65, 77],
           [ 15, 20, 77, 30]])

b = array([10, 15, 20, 30, 32, 51, 65, 77])

I would like to replace every element of "a" with the index in "b" at which that element is found. In the case above, I would like the output to be:

a = array([[ 5, 3, 2, 0],
           [ 0, 4, 6, 7],
           [ 1, 2, 7, 3]])

Please note that in the real application my arrays are large: each has over 30k elements, and there are several thousand of them. I have tried for loops, but these take a long time to compute. I have also tried similar iterative methods, and using list.index() to grab the indices, but this also takes too much time.

Can anyone help me first identify the indices in "b" of the elements of "a" that appear in "b", and then construct the updated "a" array?

Thank you.

If the minimal and maximal elements of a and b span a small range (or at least one small enough for a table of that size to fit into RAM), this can be done very quickly using a lookup table:

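import numpy as np

a = np.array([[51, 30, 20, 10],
              [10, 32, 65, 77],
              [15, 20, 77, 30]])
b = np.array([10, 15, 20, 30, 32, 51, 65, 77])

# Offset by the smallest value so the table starts at index 0
lo = min(a.min(), b.min())
hi = max(a.max(), b.max())
# lut[v - lo] holds the index of value v in b
lut = np.zeros(hi - lo + 1, dtype=np.int64)
lut[b - lo] = np.arange(len(b))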

Then:

>>> a_indices = lut[a - lo]
>>> a_indices
array([[5, 3, 2, 0],
       [0, 4, 6, 7],
       [1, 2, 7, 3]])
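
One caveat with the lookup table above: because it is initialised with zeros, any value of a that does not occur in b silently maps to index 0. A minimal sketch of a variant that marks such values with a sentinel instead is shown here. The helper name index_lookup and the missing parameter are hypothetical, not part of the original answer, and the sketch assumes integer arrays whose combined value range fits in memory.

```python
import numpy as np

def index_lookup(a, b, missing=-1):
    """Hypothetical helper: map each value in `a` to its position in `b`.

    Values of `a` that do not occur in `b` map to `missing`.
    Assumes integer arrays with a value range small enough for a dense table.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    # Fill with the sentinel so absent values are distinguishable from index 0
    lut = np.full(hi - lo + 1, missing, dtype=np.int64)
    lut[b - lo] = np.arange(len(b))
    return lut[a - lo]

a = np.array([[51, 30, 20, 10],
              [10, 32, 65, 77],
              [15, 20, 77, 30]])
b = np.array([10, 15, 20, 30, 32, 51, 65, 77])

result = index_lookup(a, b)
# 11 is not in b, so it is reported as -1 rather than 0
missing_demo = index_lookup(np.array([10, 11, 77]), b)
```

The table costs hi - lo + 1 entries regardless of how many values b holds, which is why the approach only pays off when the value range is modest.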

This is posted as an answer only because it is too long for a comment. It supports orlp's solution posted above. NumPy's vectorize avoids writing an explicit loop, but it is essentially a Python-level loop internally and is clearly not the best approach here. Note that NumPy's searchsorted can only be applied as shown when b is sorted.
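
If b is not sorted, searchsorted can still be used by passing it a sorting permutation through its sorter argument and mapping the result back. The following is only a sketch: the helper name index_via_searchsorted and the sample b_unsorted are made up for illustration, and it assumes every value of a actually occurs in b (otherwise the returned positions are meaningless or out of range).

```python
import numpy as np

def index_via_searchsorted(a, b):
    """Hypothetical helper: position in `b` of each value in `a`.

    Assumes every value of `a` is present in `b`; `b` may be unsorted.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    order = np.argsort(b)                      # permutation that sorts b
    pos = np.searchsorted(b, a, sorter=order)  # positions within sorted b
    return order[pos]                          # translate back to positions in b

a = np.array([[51, 30, 20, 10],
              [10, 32, 65, 77],
              [15, 20, 77, 30]])
b_unsorted = np.array([51, 10, 77, 20, 32, 15, 65, 30])

result = index_via_searchsorted(a, b_unsorted)
```

Unlike the lookup table, this does not depend on the value range being small, at the price of a sort of b and a binary search per element of a.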

import timeit
import numpy as np

a = np.random.randint(1,100,(1000,1000))
b = np.arange(0,1000,1)

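def o1():
    # Lookup-table approach from orlp's answer
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    lut = np.zeros(hi - lo + 1, dtype=np.int64)
    lut[b - lo] = np.arange(len(b))
    a2 = lut[a - lo]
    return a2

def o2():
    # np.vectorize + np.place: one masked replacement per element of b
    a2 = a.copy()
    # Compare against the original a so indices already written are not matched again
    fu = np.vectorize(lambda i: np.place(a2, a == b[i], i))
    fu(np.arange(0, len(b), 1))
    return a2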

print(timeit.timeit("np.searchsorted(b, a)", globals=globals(), number=2))
print(timeit.timeit("o1()", globals=globals(), number=2))
print(timeit.timeit("o2()", globals=globals(), number=2))

This prints the timings in seconds for searchsorted, o1 (lookup table), and o2 (vectorize), in that order:

0.061956800000189105
0.012765400000716909
2.220097600000372
