How can I more efficiently assign values from one array to another based on an index?

Question

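I am trying to replace the values of an array based on how many ones each row of a source array contains. Using that per-row sum, I assign the value at the corresponding index of a replacement array. So if a row contains two ones, the output gets the value l1[1], and if it contains a single one, the output gets the value l1[0].

This is easier to see in a concrete example: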

import numpy as np

l1 = np.array([4, 5])
x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])

array([[0, 0],
       [0, 1],
       [1, 1],
       [0, 0],
       [1, 0],
       [1, 1]])

Desired output:

[[0]
 [4]
 [5]
 [0]
 [4]
 [5]]

I did this by counting the ones in each row and then assigning the values with np.where:

x1x2 = np.array([0, 1, 2, 0, 1, 2])  # number of 1s in each row of x112
x1x2 = np.where(x1x2 != 1, x1x2, l1[0]) 
x1x2 = np.where(x1x2 != 2, x1x2, l1[1])             
print(x1x2)

Output:

[0 4 5 0 4 5]
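
The counts above are typed in by hand. As a minimal sketch, assuming x112 contains only 0s and 1s, they could instead be derived from the array itself. The name counts is only illustrative and does not appear in the original code.

```python
import numpy as np

l1 = np.array([4, 5])
x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])

# Count how many entries in each row are equal to 1.
counts = np.count_nonzero(x112 == 1, axis=1)

# Same two-step replacement as above, applied to the computed counts.
x1x2 = np.where(counts != 1, counts, l1[0])
x1x2 = np.where(x1x2 != 2, x1x2, l1[1])
print(x1x2)
```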

Can this be done more efficiently?

Answer 1

Well, I actually tried de-vectorizing your code. First, here is the vectorized NumPy version you already have:

def op(x112, l1):
    # bit of cheating, adding instead of counting 1s
    x1x2 = x112[:,0] + x112[:,1]

    x1x2=np.where(x1x2 != 1, x1x2, l1[0])
    x1x2=np.where(x1x2 != 2, x1x2, l1[1])
    return x1x2
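
One caveat with chaining np.where this way: the second call inspects the already modified array, so a replacement value that happens to equal 2 would be replaced a second time. A hedged sketch of one way around this uses np.select, which tests its conditions against the original counts. The name op_select is hypothetical, and the sketch assumes x112 holds only 0s and 1s.

```python
import numpy as np

def op_select(x112, l1):
    # Hypothetical variant of op(); assumes x112 contains only 0s and 1s.
    counts = x112.sum(axis=1)
    # Conditions are evaluated on the untouched counts,
    # so a value that was just assigned is never matched again.
    return np.select([counts == 1, counts == 2], [l1[0], l1[1]], default=0)

x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])
print(op_select(x112, np.array([4, 5])))
print(op_select(x112, np.array([2, 5])))
```

With l1 = [2, 5], the chained np.where version would assign 5 to the rows with a single one, because the 2 written by the first call is matched by the second.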

The most efficient alternative is to loop over x112 only once, so let's write a Numba loop.

import numba as nb

@nb.njit
def loop(x112, l1):
    d0, d1 = x112.shape
    x1x2 = np.zeros(d0, dtype = x112.dtype)
    for i in range(d0):
        # actually count the 1s
        num1s = 0
        for j in range(d1):
            if x112[i,j] == 1:
                num1s += 1
        
        if num1s == 1:
            x1x2[i] = l1[0]
        elif num1s == 2:
            x1x2[i] = l1[1]
    return x1x2

On my laptop, the Numba loop is roughly 9-10x faster.

%timeit op(x112, l1)
8.05 µs ± 34.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit loop(x112, l1)
873 ns ± 5.09 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

As @Mad_Physicist requested, here are timings with a larger array. I have also included his advanced indexing approach.

x112 = np.random.randint(0, 2, size = (100000, 2))
l1_v2 = np.array([0,4,5])

%timeit op(x112, l1)
1.35 ms ± 27.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit loop(x112, l1)
956 µs ± 2.78 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit l1_v2[x112.sum(1)]
1.2 ms ± 1.05 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

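Edit: Well, maybe take these timings with a grain of salt, because when I restarted the IPython kernel and reran everything, op(x112, l1) improved to 390 µs ± 22.1 µs per loop, while the other methods kept the same performance (971 µs and 1.23 ms).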
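
Run-to-run variation like this is common in micro-benchmarks. Below is a minimal sketch of timing outside IPython with the standard timeit module, taking the best of several repeats. It covers only the two NumPy approaches, and the repeat counts are arbitrary choices, not values from the answer. A Numba function would also need one warm-up call before timing so that JIT compilation is not measured.

```python
import timeit
import numpy as np

def op(x112, l1):
    x1x2 = x112[:, 0] + x112[:, 1]
    x1x2 = np.where(x1x2 != 1, x1x2, l1[0])
    return np.where(x1x2 != 2, x1x2, l1[1])

rng = np.random.default_rng(0)
x112 = rng.integers(0, 2, size=(100000, 2))
l1 = np.array([4, 5])
l1_v2 = np.array([0, 4, 5])

candidates = {
    "op": lambda: op(x112, l1),
    "indexing": lambda: l1_v2[x112.sum(1)],
}

best = {}
for name, fn in candidates.items():
    # Best of 5 repeats with 20 calls each, reported per call.
    best[name] = min(timeit.repeat(fn, number=20, repeat=5)) / 20
    print(f"{name}: {best[name] * 1e6:.1f} µs per call")
```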

Answer 2

You can use direct indexing:

l1 = np.array([0, 4, 5])
x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])

result = l1[x112.sum(1)]

This works if you can prepend a zero to l1 when you create it. If not:

result = np.r_[0, l1][x112.sum(1)]
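
A short check that the lookup reproduces the desired output from the question. The row sum is used as an index into a table whose position k holds the value for rows with k ones; this assumes x112 contains only 0s and 1s. np.concatenate is shown as an equivalent way to prepend the zero, and the name table is only illustrative.

```python
import numpy as np

l1 = np.array([4, 5])
x112 = np.array([[0, 0], [0, 1], [1, 1], [0, 0], [1, 0], [1, 1]])

# Lookup table: index 0 -> 0, index 1 -> l1[0], index 2 -> l1[1].
table = np.concatenate(([0], l1))

# Each row sum (0, 1 or 2) selects one entry of the table.
result = table[x112.sum(axis=1)]
print(result)

# Reshape to a column to match the layout shown in the question.
print(result.reshape(-1, 1))
```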

2 answers

Answer 1
1 2021-07-24 00:36:40

Answer 2
0 2021-07-24 00:42:00
