numpy - efficient value counts in 2D and 3D arrays
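I am writing a scheduling program for a group game. I have a schedule that works for 32-4-8 (32 players, 4 players per group, 8 rounds) with no repeated partners or opponents. However, due to venue constraints, only 28 players / 7 groups can play in each round. So I need to modify the schedule so that every player gets 7 games and 1 bye, with as few repeated partners or opponents as possible.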
```python
import numpy as np

# sched[round, game, seat] -> player id (8 rounds x 8 games x 4 players)
sched = np.array([
    [[ 3, 28, 17, 14],
     [23, 30, 22,  1],
     [ 2,  5, 27, 25],
     [20,  8, 10, 16],
     [ 0, 24, 26, 11],
     [ 4, 21, 31,  7],
     [19,  6, 29, 15],
     [13, 18, 12,  9]],
    [[20, 15, 24, 31],
     [ 3, 21, 16, 13],
     [ 6, 30,  4,  5],
     [28,  8,  0,  7],
     [25, 29, 17, 23],
     [14,  9,  2, 22],
     [27, 12,  1, 11],
     [26, 10, 19, 18]],
    [[10,  4, 23, 12],
     [ 9, 28, 25, 31],
     [ 5, 13, 22,  8],
     [15,  7, 30,  2],
     [16, 19, 11, 14],
     [18, 17, 24,  6],
     [21,  0, 27, 20],
     [ 3, 26, 29,  1]],
    [[18, 20, 28,  1],
     [ 8,  9,  3,  4],
     [12, 17, 31,  5],
     [13, 30, 27, 14],
     [19, 25, 24,  7],
     [ 2,  6, 21, 26],
     [10, 11, 29, 22],
     [15, 23,  0, 16]],
    [[22, 21, 25, 15],
     [26, 12, 20, 14],
     [28,  5, 24, 10],
     [11,  6, 31, 13],
     [23, 27,  7,  3],
     [ 0, 19,  9,  1],
     [18, 30,  8, 29],
     [16, 17,  2,  4]],
    [[29, 28, 12, 21],
     [ 9, 16, 27,  6],
     [19, 17, 20, 30],
     [ 2,  8, 24, 23],
     [ 5, 11, 18,  7],
     [26, 13, 25,  4],
     [ 1, 10, 15, 14],
     [ 0, 22, 31,  3]],
    [[31, 19, 27,  8],
     [20,  5, 29,  2],
     [24, 16, 22, 12],
     [25,  3, 10,  6],
     [17,  1,  7, 13],
     [ 4,  0, 14, 18],
     [23, 28, 26, 15],
     [11, 21,  9, 30]],
    [[31, 18,  1, 16],
     [23, 14, 21,  5],
     [ 8,  3, 11, 15],
     [26, 17,  9, 10],
     [30, 12, 25,  0],
     [22, 20,  7,  6],
     [27,  4, 29, 24],
     [13, 19, 28,  2]]
])
```
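To determine the best bye choices, I randomly pick one game from each round to serve as that round's bye. I then give each bye selection a score: the number of players who end up with exactly 1 bye. Maximizing that score minimizes the changes I have to make to the schedule.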
```python
def bincount2d(arr, bins=None):
    # Row-wise bincount: count[i, v] is how often value v occurs in arr[i]
    if bins is None:
        bins = np.max(arr) + 1
    count = np.zeros(shape=[len(arr), bins], dtype=np.int64)
    indexing = np.arange(len(arr))
    for col in arr.T:
        count[indexing, col] += 1
    return count

# randomly sample one game per round as byes
# repeat n times (here 10000)
times = 10000
idx1 = np.tile(np.arange(sched.shape[0]), times)
idx2 = np.random.randint(sched.shape[1], size=sched.shape[0] * times)
# one sampled game per round, so the middle axis is the number of rounds
# (the original used sched.shape[1], which only works because both are 8 here)
population_byes = sched[idx1, idx2].reshape(times, sched.shape[0], sched.shape[2])

# get player counts for byes
# can reshape because interested in # of byes for entire schedule
# so no need to segment players by rounds for these counts
count_shape = (population_byes.shape[0], population_byes.shape[1] * population_byes.shape[2])
counts = bincount2d(population_byes.reshape(count_shape))

# fitness is the number of players with one bye
# the higher the value, the less we need to do to mess with the schedule
fitness = np.apply_along_axis(lambda x: (x == 1).sum(), 1, counts)
byes = population_byes[np.argmax(fitness)]
```
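My questions are as follows:
(1) Is there an efficient way to account for values that have no counts (I know the indices should run from 0 to 31)? bincount2d has no values for missing values in that range.
(2) Is there a more efficient vectorized way than the np.apply_along_axis line to count the elements equal to 1?
(3) Ultimately, what I would like to do is have the application change the schedule to give everyone a bye by swapping player assignments. How do you swap elements in a 3D array?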
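> (1) Is there an efficient way to account for values that have no counts (I know the indices should run from 0 to 31)? bincount2d has no values for missing values in that range.
`bincount2d` is inefficient because of its memory access pattern. The transpose is only a lazy view in NumPy, so looping over `arr.T` reads the underlying data column by column with strided accesses, which is expensive. The loop is also inefficient because it works on fairly large arrays with random memory accesses, which is bad for CPU caches. That being said, NumPy is not well suited to this kind of computation. You can use Numba to implement the operation efficiently: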
```python
import numba as nb
import numpy as np

# You may need to tune the types on your machine.
# Alternatively, you can drop the signature (optionally with cache=True) and let
# Numba infer the types, which is slower the first time.
@nb.njit('int64[:,::1](int64[:,::1], optional(int64))')
def bincount2d_fast(arr, bins=None):
    if bins is None:
        nbins = np.max(arr) + 1
    else:
        nbins = np.int64(bins)
    count = np.zeros((arr.shape[0], nbins), dtype=np.int64)
    # Row-major traversal: each row of arr is read contiguously
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            count[i, arr[i, j]] += 1
    return count
```
The above code is about 10 times faster than the original `bincount2d` function on my machine.
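On the missing values asked about in (1): both versions start from a zero-filled array, so a player who never appears simply keeps a count of 0, provided `bins` covers the whole range (here `bins=32`). If Numba is not available, the following is a minimal pure-NumPy sketch of the same row-wise count. It is an alternative technique, not the approach above: it shifts every row into its own range and makes a single `np.bincount` call with `minlength`. The helper name `bincount2d_offsets` and the toy data are hypothetical, and the sketch assumes every value satisfies `0 <= value < bins`.

```python
import numpy as np

def bincount2d_offsets(arr, bins=None):
    """Row-wise bincount via one flat np.bincount call (hypothetical helper)."""
    arr = np.asarray(arr)
    if bins is None:
        bins = int(arr.max()) + 1
    n_rows = arr.shape[0]
    # Shift row i into its own range [i * bins, (i + 1) * bins).
    # Assumes 0 <= value < bins, otherwise counts leak into another row.
    shifted = arr + bins * np.arange(n_rows)[:, None]
    # minlength guarantees a slot for every (row, value) pair, even unseen ones
    flat = np.bincount(shifted.ravel(), minlength=n_rows * bins)
    return flat.reshape(n_rows, bins)

# Two sampled bye sets for a toy 6-player league (players 0..5)
toy = np.array([[0, 1, 1, 4],
                [2, 2, 2, 3]])
counts_toy = bincount2d_offsets(toy, bins=6)
print(counts_toy)
```

Player 5 never appears in either row, yet still gets a column filled with zeros because `bins=6` was passed explicitly.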
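> (2) Is there a more efficient vectorized way than the np.apply_along_axis line to count the elements equal to 1?
Yes. You can operate on the whole 2D array and perform a reduction along a given axis. Here is an example:
```python
fitness = (counts == 1).sum(axis=1)
byes = population_byes[np.argmax(fitness)]
```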
This is roughly 30 times faster on my machine.
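As a quick sanity check, here is a small sketch on made-up counts (the array `counts_demo` is hypothetical) suggesting that the axis reduction gives the same result as the `np.apply_along_axis` version. `np.count_nonzero` with an `axis` argument is another way to write the same reduction.

```python
import numpy as np

# Hypothetical bye counts: 3 candidate bye selections x 4 players
counts_demo = np.array([[1, 2, 0, 1],
                        [1, 1, 1, 1],
                        [0, 0, 4, 0]])

# Original approach: a Python-level lambda is called once per row
slow = np.apply_along_axis(lambda x: (x == 1).sum(), 1, counts_demo)

# Vectorized: build the boolean mask once, then reduce along axis 1
fast = (counts_demo == 1).sum(axis=1)

# Equivalent spelling of the same reduction
also = np.count_nonzero(counts_demo == 1, axis=1)

# Index of the selection with the most players having exactly one bye
best = int(np.argmax(fast))
print(fast, best)
```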
> (3) Ultimately, what I would like to do is have the application change the schedule to give everyone a bye by swapping player assignments. How do you swap elements in a 3D array?
A straightforward solution is to use Numba again with plain loops. Another option is indirect (fancy) indexing, depending on your exact needs (like what @WholeBrain proposed). Indexing with integer arrays returns copies, so both right-hand sides are read before anything is written, and a tuple assignment swaps the items. Something like:
```python
# all_x1, all_y1, etc. are 1D Numpy arrays containing coordinates of the items to swap
arr[all_x2, all_y2], arr[all_x1, all_y1] = arr[all_x1, all_y1], arr[all_x2, all_y2]
```
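For a 3D schedule indexed as `[round, game, seat]`, swapping individual players needs three index arrays per side. The following is a minimal sketch on a toy array; the names `toy_sched`, `r1`, `g1`, `s1` and so on, and the coordinates themselves, are hypothetical. It assumes no coordinate appears more than once across the two sides, since repeated positions would overwrite each other.

```python
import numpy as np

# Toy schedule: 2 rounds x 2 games x 2 seats, values are player ids 0..7
toy_sched = np.arange(8).reshape(2, 2, 2)

# Hypothetical swap list (one swap per round, within the same round):
#   round 0: (game 0, seat 1) <-> (game 1, seat 0)
#   round 1: (game 0, seat 0) <-> (game 1, seat 1)
r1, g1, s1 = np.array([0, 1]), np.array([0, 0]), np.array([1, 0])
r2, g2, s2 = np.array([0, 1]), np.array([1, 1]), np.array([0, 1])

swapped = toy_sched.copy()
# Integer-array indexing on the right-hand side returns copies,
# so both sides are read before either assignment happens.
swapped[r1, g1, s1], swapped[r2, g2, s2] = swapped[r2, g2, s2], swapped[r1, g1, s1]
print(swapped)
```

Because each swap stays within one round, every round still contains the same set of players afterwards. Note that this relies on integer-array indexing: with plain slices the right-hand sides would be views, and the tuple assignment would not swap correctly.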