Fastest way to get the indices of np.array elements that meet my criteria

Question

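I am trying to find the fastest way to get the indices of the matrix elements that meet my criteria. I have a (7,7) np.array (named "board") containing int16 values from 0 to 400. For example, I want to find the indices of the elements equal to 300.

I have tried many techniques, and so far the fastest is np.where(board == 300).

The function I am trying to optimize: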

def is_end(self, board):
    ind = np.where((board > 300) & (board - 300 < 100))
    try:
        victoriousPlayer = board[ind[0][0], ind[1][0]] % 100 // 10
        return victoriousPlayer
    except:
        return -1

Because I call this function tens of thousands of times, I need it to run as fast as possible.

Answer 1

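If you want to minimize the function's running time, the best thing you can do is avoid allocating new arrays on every call. This means maintaining extra arrays for temporary values outside the function, but it does give you a significant speedup.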

import numpy as np

# Original function
def is_end_1(board):
    ind = np.where((board > 300) & (board - 300 < 100))
    try:
        victoriousPlayer = board[ind[0][0], ind[1][0]] % 100 // 10
        return victoriousPlayer
    except:
        return -1

# Without array allocation
def is_end_2(board, tmpBool1, tmpBool2):
    np.less(300, board, out=tmpBool1)
    np.less(board, 400, out=tmpBool2)
    np.logical_and(tmpBool1, tmpBool2, out=tmpBool1)
    idx = np.unravel_index(np.argmax(tmpBool1), board.shape)
    return board[idx] % 100 // 10 if tmpBool1[idx] else -1

# Test
np.random.seed(0)
# Create some data
board = np.random.randint(500, size=(1000, 1000))
# Result from original function
res1 = is_end_1(board)
# Temporary arrays
tmpBool1 = np.empty_like(board, dtype=bool)  # builtin bool; the np.bool alias is missing in some NumPy releases
tmpBool2 = tmpBool1.copy()
# Result from function without allocations
res2 = is_end_2(board, tmpBool1, tmpBool2)
print(res1 == res2)
# True

# Measure time (%timeit is an IPython magic, so run these lines in IPython/Jupyter)
%timeit is_end_1(board)
# 9.61 ms ± 323 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit is_end_2(board, tmpBool1, tmpBool2)
# 1.38 ms ± 53.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
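
The timings above were taken on a 1000x1000 random board, whereas the board in the question is a (7,7) int16 array, so the gain at that size is not something this answer measured. As a rough sketch of how the same preallocation idea could be packaged so the caller does not have to pass the scratch arrays on every call, the buffers can live in a small object. The class name EndChecker and its attribute names are hypothetical, not part of the original answer:

```python
import numpy as np


class EndChecker:
    """Hypothetical helper that owns the scratch buffers used by is_end_2."""

    def __init__(self, shape):
        # Allocate the boolean scratch arrays once, up front.
        self._gt = np.empty(shape, dtype=bool)
        self._lt = np.empty(shape, dtype=bool)

    def is_end(self, board):
        # Same test as the original: 300 < value < 400, written into the buffers.
        np.greater(board, 300, out=self._gt)
        np.less(board, 400, out=self._lt)
        np.logical_and(self._gt, self._lt, out=self._gt)
        # argmax on a boolean array gives the flat index of the first True,
        # or 0 when there is no True at all, so check the flag explicitly.
        flat = int(np.argmax(self._gt))
        if not self._gt.flat[flat]:
            return -1
        return int(board.flat[flat]) % 100 // 10


board = np.zeros((7, 7), dtype=np.int16)
checker = EndChecker(board.shape)
no_winner = checker.is_end(board)
board[3, 4] = 325
winner = checker.is_end(board)
```

Whether this beats the plain np.where version on a 7x7 board should be checked with %timeit on the real data, since per-call overhead may dominate at that size.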

Answer 2

In this case, it seems you don't need the indices, only a mask.

ind = np.where((board > 300) & (board - 300 < 100))
victoriousPlayer = board[ind[0][0], ind[1][0]] % 100 // 10

is equivalent to

victoriousPlayer = board[(board > 300) & (board - 300 < 100)][0] % 100 // 10

Timings:

In [1]: import numpy as np                                                                                                    

In [2]: board = np.random.randint(0,401, (7,7))                                                                               

In [3]: %timeit ind = np.where((board > 300) & (board - 300 < 100));victoriousPlayer = board[ind[0][0], ind[1][0]] % 100 // 10
6.77 µs ± 260 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [4]: %timeit victoriousPlayer = board[(board  > 300) & (board - 300 < 100)][0] % 100 // 10                                 
5.02 µs ± 26.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
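
Note that the one-liner raises IndexError when no element matches, which is the case the original function handled with try/except. A minimal sketch of the mask approach as a complete function, assuming the same return convention of -1 for "no match" (the name is_end_mask is hypothetical, and board < 400 is used as an equivalent of board - 300 < 100):

```python
import numpy as np


def is_end_mask(board):
    # Boolean-mask indexing returns the matching values directly, in row-major order.
    hits = board[(board > 300) & (board < 400)]
    if hits.size == 0:
        # No element in the open range (300, 400): report "no winner".
        return -1
    return int(hits[0]) % 100 // 10


board = np.zeros((7, 7), dtype=np.int16)
empty_result = is_end_mask(board)
board[2, 5] = 347
hit_result = is_end_mask(board)
```

Checking hits.size avoids relying on an exception for the common "no winner yet" case.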

Answer 1: 2, 2020-02-21 14:55:41
Answer 2: 1, accepted, 2020-02-21 13:50:44
