How can I get better loop performance in Cython?
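I started a project in Python that consists mostly of loops. A few days ago I read about Cython, which can give you faster code through static typing. I wrote these two functions to compare performance (one in Python, the other in Cython):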
import numpy as np
from time import perf_counter as clock  # time.clock was removed in Python 3.8

size = 11
board = np.random.randint(2, size=(size, size))

def py_playout(board, N):
    black_rave = []
    white_rave = []
    for i in range(N):
        for x in range(board.shape[0]):
            for y in range(board.shape[1]):
                if board[(x, y)] == 0:
                    black_rave.append((x, y))
                else:
                    white_rave.append((x, y))
    return black_rave, white_rave
cdef cy_playout(board, int N):
    cdef list white_rave = [], black_rave = []
    cdef int M = board.shape[0], L = board.shape[1]
    cdef int i = 0, x = 0, y = 0
    for i in range(N):
        for x in range(M):
            for y in range(L):
                if board[(x, y)] == 0:
                    black_rave.append((x, y))
                else:
                    white_rave.append((x, y))
    return black_rave, white_rave
I then used the following code to test the performance:
t1 = clock()
a = py_playout(board, 1000)
t2 = clock()
b = cy_playout(board, 1000)
t3 = clock()
py = t2 - t1
cy = t3 - t2
print('cy is %.2f times faster than py' % (py / cy))
But I didn't see any noticeable improvement. I haven't used typed memoryviews yet. Can anyone suggest a useful way to improve the speed, or help me rewrite the code using typed memoryviews?
You're right: without a type on the board parameter of the Cython function, the speedup is modest:
%timeit py_playout(board, 1000)
# 321 ms ± 19.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit cy_playout(board, 1000)
# 186 ms ± 541 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
But that is still almost twice as fast. By adding a type to board, for example
cdef cy_playout(int[:, :] board, int N):
    # ...
# The memoryview element type must match board.dtype,
# so if you want explicit types:
# cimport numpy as np
# cdef cy_playout(np.int64_t[:, :] board, int N):  # or np.int32_t
it gets much faster (almost ten times faster than the pure-Python version):
%timeit cy_playout(board, 1000)
# 38.7 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
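One caveat with the `int[:, :]` signature: the memoryview only accepts an array whose element type is C `int`, and `np.random.randint` often returns 64-bit integers by default (this depends on platform and NumPy version). The sketch below, which assumes NumPy is installed, shows one way to check and convert the board before passing it in; the name `board_c` is only illustrative.

```python
import numpy as np

size = 11
board = np.random.randint(2, size=(size, size))

# np.intc is NumPy's name for the C `int` type that `int[:, :]` expects.
print(board.dtype, np.dtype(np.intc))

# Convert only when needed; astype() makes a copy with the new element type.
board_c = board if board.dtype == np.intc else board.astype(np.intc)
```

Alternatively, declare the parameter with the type the array already has, as in the commented `np.int64_t` variant.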
I also used timeit (here through IPython's %timeit magic) to get more accurate timings.
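Outside IPython, the standard-library `timeit` module gives similar measurements. This is a minimal sketch: the `py_playout` here is a stand-in that works on a plain list of lists rather than a NumPy array, and the repeat counts are arbitrary.

```python
import random
import timeit

def py_playout(board, n):
    """Stand-in for the question's function; board is a list of lists."""
    black_rave, white_rave = [], []
    for _ in range(n):
        for x, row in enumerate(board):
            for y, cell in enumerate(row):
                if cell == 0:
                    black_rave.append((x, y))
                else:
                    white_rave.append((x, y))
    return black_rave, white_rave

size = 11
random.seed(0)
board = [[random.randint(0, 1) for _ in range(size)] for _ in range(size)]

# repeat() returns one total per run; each run calls the function `number` times.
number = 5
timings = timeit.repeat(lambda: py_playout(board, 100), number=number, repeat=3)
best_per_call = min(timings) / number
print(f"best of 3: {best_per_call * 1e3:.2f} ms per call")
```

Taking the minimum over several runs reduces the influence of other processes, which a single `t2 - t1` measurement cannot do.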
Note that you can also get a large speedup with numba, without adding any static types:
import numba as nb
nb_playout = nb.njit(py_playout) # Just decorated your python function
%timeit nb_playout(board, 1000)
# 37.5 ms ± 154 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
I implemented a function that runs even faster. I simply declared black_rave and white_rave as memoryviews and returned slices of them:
cdef tuple cy_playout1(int[:, :] board, int N):
    cdef int M = board.shape[0], L = board.shape[1]
    # One slot per cell, so neither buffer can overflow and the function
    # no longer depends on the global `size`.
    cdef int[:, :] black_rave = np.empty([M * L, 2], dtype=np.int32)
    cdef int[:, :] white_rave = np.empty([M * L, 2], dtype=np.int32)
    cdef int i = 0, j = 0, x, y, h
    for h in range(N):
        # Reset at the start of each pass so the counters still hold the
        # last pass's counts when the slices are returned.
        i = 0
        j = 0
        for x in range(M):
            for y in range(L):
                if board[x, y] == 0:
                    black_rave[i, 0] = x
                    black_rave[i, 1] = y
                    i += 1
                elif board[x, y] == 1:
                    white_rave[j, 0] = x
                    white_rave[j, 1] = y
                    j += 1
    return black_rave[:i], white_rave[:j]
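The preallocate-and-slice pattern used above can be sketched in plain Python, which is handy as a reference when checking the Cython output. This is only an illustration under my own naming (`playout_prealloc` is hypothetical) and it performs a single pass over a list-of-lists board.

```python
def playout_prealloc(board):
    """Single pass: write into fixed-size buffers, then slice off the used part."""
    rows, cols = len(board), len(board[0])
    capacity = rows * cols            # worst case: every cell has the same colour
    black = [None] * capacity
    white = [None] * capacity
    i = j = 0                         # next free slot in each buffer
    for x in range(rows):
        for y in range(cols):
            if board[x][y] == 0:
                black[i] = (x, y)
                i += 1
            elif board[x][y] == 1:
                white[j] = (x, y)
                j += 1
    return black[:i], white[:j]

board = [[0, 1, 0],
         [1, 1, 0]]
black, white = playout_prealloc(board)
print(black)  # [(0, 0), (0, 2), (1, 2)]
print(white)  # [(0, 1), (1, 0), (1, 1)]
```

In pure Python this is not faster than `append`; the gain in Cython comes from writing C integers into a typed buffer instead of building Python tuples.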
Here are the speed test results:
%timeit py_playout(board, 1000)
%timeit cy_playout(board, 1000)
%timeit cy_playout1(board, 1000)
# 1 loop, best of 3: 200 ms per loop
# 100 loops, best of 3: 9.26 ms per loop
# 100 loops, best of 3: 4.88 ms per loop
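To put those numbers side by side, the speedups relative to the Python version can be computed directly from the reported timings. Keep in mind that cy_playout1 overwrites its buffers on every pass instead of accumulating N copies of the coordinates, so the last comparison is not strictly like for like.

```python
# Timings copied from the results above, in milliseconds per loop.
timings_ms = {"py_playout": 200.0, "cy_playout": 9.26, "cy_playout1": 4.88}

baseline = timings_ms["py_playout"]
speedups = {name: baseline / t for name, t in timings_ms.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```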