Fastest way to write to a numpy array at specific indices?

Question

I would like to find the fastest way to write data into a 2D numpy array using an array of indexes.

I have a large 2D boolean numpy array, buffer:

import numpy as np

n_rows = 100000
n_cols = 250
shape_buf = (n_rows, n_cols)

row_indexes = np.arange(n_rows,dtype=np.uint32)
w_idx = np.random.randint(n_cols, size=n_rows, dtype = np.uint32)

buffer = np.full(shape=shape_buf,
                 fill_value=0,
                 dtype=np.bool_,order="C")

I want to write data into the buffer using an array of column indexes, w_idx:

data = np.random.randint(0,2, size=n_rows, dtype = np.bool_)
w_idx = np.random.randint(n_cols, size=n_rows, dtype = np.uint32)

One solution is to use standard indexing:

%timeit buffer[row_indexes, w_idx] = data
2.07 ms ± 20.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

A faster solution is to flatten the indexes and use np.put:

%timeit buffer.put(flat_row_indexes + w_idx, data, "wrap")
1.76 ms ± 18.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
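
The snippets above do not show how flat_row_indexes is built. A minimal sketch, assuming it holds the flat offset of the first element of each row in a C-ordered buffer (row_indexes * n_cols), and checking on a smaller array that put and standard indexing write the same values:

```python
import numpy as np

n_rows, n_cols = 1000, 250
rng = np.random.default_rng(0)

row_indexes = np.arange(n_rows, dtype=np.intp)
w_idx = rng.integers(n_cols, size=n_rows, dtype=np.intp)
data = rng.integers(0, 2, size=n_rows).astype(np.bool_)

# Assumption: flat_row_indexes is the flat offset of each row's first
# element, which is row * n_cols for a C-contiguous (n_rows, n_cols) array.
flat_row_indexes = row_indexes * n_cols

buf_fancy = np.zeros((n_rows, n_cols), dtype=np.bool_)
buf_put = np.zeros((n_rows, n_cols), dtype=np.bool_)

# Standard (fancy) indexing with a row array and a column array.
buf_fancy[row_indexes, w_idx] = data

# Same write expressed through flat indices and ndarray.put.
buf_put.put(flat_row_indexes + w_idx, data, mode="wrap")

same = np.array_equal(buf_fancy, buf_put)
print(same)
```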

However, this last solution is still too slow for my application. Is it possible to be faster? Maybe by using another library, say Numba?

Answer 1

The timing result you got makes sense given that the first assignment fills in all 800 rows for each column, while the second one actually places the individual elements you want into the array. The reason the first version appears to be ~100x faster instead of ~800x faster is that the overhead of a call to put for such a small dataset overwhelms the timing result.

First lesson: always test numpy operations on a small array that you can see. Usually no larger than 5x5, so you can compare to a hand-calculated version if necessary.
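
As an illustration of that first lesson, here is a small sketch on a 5x5 array. The index and value arrays are made up for the example, and the expected result is written out by hand so the indexed assignment can be compared against it:

```python
import numpy as np

small = np.zeros((5, 5), dtype=np.bool_)
rows = np.arange(5)
cols = np.array([0, 2, 4, 1, 3])                      # one column per row
values = np.array([True, False, True, True, False])   # value written per row

# One element per row: small[rows[i], cols[i]] = values[i]
small[rows, cols] = values

# Hand-calculated expectation: only the True values change the zero array.
expected = np.zeros((5, 5), dtype=np.bool_)
expected[0, 0] = True
expected[2, 4] = True
expected[3, 1] = True

matches = np.array_equal(small, expected)
print(small.astype(int))
print(matches)
```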

Second lesson: benchmarks on small arrays are unreliable. An O(n) algorithm only achieves linear scaling asymptotically. Timing for small arrays is dominated by function calls and other (usually constant-time) bookkeeping, especially in Python.
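
One way to see this effect is to time the same indexed write at several sizes and look at the cost per element. This is only a sketch: the sizes and repeat counts are arbitrary choices, and the numbers will vary by machine, so no expected timings are given.

```python
import timeit
import numpy as np

n_cols = 250
rng = np.random.default_rng(0)
timings = {}

for n_rows in (10, 1_000, 100_000):
    buf = np.zeros((n_rows, n_cols), dtype=np.bool_)
    rows = np.arange(n_rows)
    cols = rng.integers(n_cols, size=n_rows)
    vals = rng.integers(0, 2, size=n_rows).astype(np.bool_)

    def write():
        buf[rows, cols] = vals

    # Best of 3 repeats of 20 calls each, reported per call.
    best = min(timeit.repeat(write, number=20, repeat=3)) / 20
    timings[n_rows] = best
    print(f"{n_rows:>7} rows: {best * 1e6:8.1f} us per call, "
          f"{best / n_rows * 1e9:8.1f} ns per element")
```

If fixed overhead dominates at small sizes, the per-element figure should be noticeably higher for the smallest array than for the largest one.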

The best thing I can think of is to avoid the overhead of calling put:

buffer[np.arange(n_rows), w_idx] = data

Another option, since you already have the offsets pre-computed and all your arrays are contiguous, is to assign through linear indices:

buffer.ravel()[w_idx + flat_row_indexes] = data
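
The contiguity condition matters because ravel() only returns a view when it can; otherwise it returns a copy and the assignment is silently lost. A small sketch with made-up indices, where the flat offsets are assumed to be row * n_cols, comparing a C-ordered and a Fortran-ordered buffer:

```python
import numpy as np

n_rows, n_cols = 4, 3
w_idx = np.array([2, 0, 1, 2])
data = np.array([True, True, False, True])

# Assumption: flat index = row * n_cols + column (C-order layout).
flat_idx = np.arange(n_rows) * n_cols + w_idx

# C-contiguous buffer: ravel() is a view, so the write reaches c_buf.
c_buf = np.zeros((n_rows, n_cols), dtype=np.bool_, order="C")
c_is_view = np.shares_memory(c_buf, c_buf.ravel())
c_buf.ravel()[flat_idx] = data

# Fortran-ordered buffer: ravel() has to copy, so the write goes
# into a temporary array and f_buf stays unchanged.
f_buf = np.zeros((n_rows, n_cols), dtype=np.bool_, order="F")
f_is_view = np.shares_memory(f_buf, f_buf.ravel())
f_buf.ravel()[flat_idx] = data

print(c_is_view, int(c_buf.sum()))
print(f_is_view, int(f_buf.sum()))
```

Checking np.shares_memory (or buffer.flags["C_CONTIGUOUS"]) once before relying on this pattern is a cheap safeguard.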
