Convert an np.ndarray of tuples (dtype=object) into an array with dtype=int

Question

I need to convert np arrays of (short) tuples into np arrays of integers.

The most obvious approach does not work:

import numpy as np

# array_of_tuples is given, this is just an example:
array_of_tuples = np.zeros(2, dtype=object)
array_of_tuples[0] = 1,2
array_of_tuples[1] = 2,3

np.array(array_of_tuples, dtype=int)

ValueError: setting an array element with a sequence.

Answer 1

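It looks like placing the tuples into a pre-allocated buffer of fixed size and dtype is the way to go. It seems to avoid much of the overhead associated with working out the size, raggedness, and dtype.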
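As a minimal sketch of the pre-allocated buffer approach, assuming every tuple has the same length as the first one (the names `n_rows`, `n_cols`, and `tup` are illustrative and not from the original post):

```python
import numpy as np

# Example input, built the same way as in the question.
array_of_tuples = np.empty(2, dtype=object)
array_of_tuples[0] = (1, 2)
array_of_tuples[1] = (2, 3)

# Pre-allocate the output: one row per tuple, one column per tuple element.
# Assumption: all tuples have the same length as the first one.
n_rows = len(array_of_tuples)
n_cols = len(array_of_tuples[0])
res = np.empty((n_rows, n_cols), dtype=int)

# Copy each tuple into its row; NumPy converts the tuple on assignment.
for i, tup in enumerate(array_of_tuples):
    res[i] = tup

print(res)
```

Because the shape and dtype are fixed up front, NumPy does not have to inspect every element to discover them, which is presumably where the savings come from.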

Here are some slower alternatives, followed by benchmarks:

You can cheat and create a dtype with the required number of fields, since numpy supports converting tuples to custom dtypes:

 dt = np.dtype([('', int) for _ in range(len(array_of_tuples[0]))])
 res = np.empty((len(array_of_tuples), len(array_of_tuples[0])), int)
 res.view(dt).ravel()[:] = array_of_tuples
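
To see what the structured-dtype trick does, here is a small self-contained sketch on the two-tuple example from the question. The helper names `n_rows`, `n_cols`, and `structured_view` are illustrative only:

```python
import numpy as np

array_of_tuples = np.empty(2, dtype=object)
array_of_tuples[0] = (1, 2)
array_of_tuples[1] = (2, 3)

n_rows = len(array_of_tuples)
n_cols = len(array_of_tuples[0])

# One int field per tuple element; NumPy auto-names empty field names f0, f1, ...
dt = np.dtype([('', int) for _ in range(n_cols)])
field_names = dt.names

# The plain int buffer that will hold the result.
res = np.empty((n_rows, n_cols), int)

# Viewed through dt, each row of res becomes a single structured record,
# so the view has shape (n_rows, 1) and ravel() gives one record per tuple.
structured_view = res.view(dt)

# Assigning the tuples to the records writes straight into res.
structured_view.ravel()[:] = array_of_tuples

print(field_names)
print(res)
```

The view shares memory with `res`, so nothing is copied back after the assignment. This relies on `res` being C-contiguous, which holds for a freshly created `np.empty` array.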

You can stack the array:
```
 np.stack(array_of_tuples, axis=0)
```
Unfortunately, this is even slower than the other proposed approaches.

Pre-allocating the output does not help much:

 res = np.empty((len(array_of_tuples), len(array_of_tuples[0])), int)
 np.stack(array_of_tuples, out=res, axis=0)

Trying to cheat with np.concatenate, which lets you specify the output dtype, does not help either:

 np.concatenate(array_of_tuples, dtype=int).reshape(len(array_of_tuples), len(array_of_tuples[0]))

Neither does pre-allocating the array:

 res = np.empty((len(array_of_tuples), len(array_of_tuples[0])), int)
 np.concatenate(array_of_tuples, out=res.ravel())

You can also try doing the concatenation in Python space, which is slow as well:

 np.array(sum(array_of_tuples, start=()), dtype=int).reshape(len(array_of_tuples), len(array_of_tuples[0]))

Or

 np.reshape(np.sum(array_of_tuples), (len(array_of_tuples), len(array_of_tuples[0])))
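
Before comparing speed, it may be worth confirming that the alternatives agree. The following is a rough sanity check on the small example from the question; the `results` dictionary and its keys are illustrative, and `np.concatenate(..., dtype=...)` assumes NumPy 1.20 or newer:

```python
import numpy as np

array_of_tuples = np.empty(2, dtype=object)
array_of_tuples[0] = (1, 2)
array_of_tuples[1] = (2, 3)

shape = (len(array_of_tuples), len(array_of_tuples[0]))

# Each entry is one of the approaches discussed above.
results = {
    "tolist": np.array(array_of_tuples.tolist()),
    "stack": np.stack(array_of_tuples, axis=0),
    "concatenate": np.concatenate(array_of_tuples, dtype=int).reshape(shape),
    "python_sum": np.array(sum(array_of_tuples, start=()), dtype=int).reshape(shape),
}

# Use the first result as the reference and compare the rest against it.
reference = results["tolist"]
all_equal = all(np.array_equal(reference, r) for r in results.values())
print(all_equal)
```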

array_of_tuples = np.empty(100, dtype=object)
for i in range(len(array_of_tuples)):
    array_of_tuples[i] = tuple(range(i, i + 100))

%%timeit
res = np.empty((len(array_of_tuples), len(array_of_tuples[0])), int)
for i, res[i] in enumerate(array_of_tuples):
    pass
305 µs ± 8.55 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

dt = np.dtype([('', 'int',) for _ in range(100)])
%%timeit
res = np.empty((100, 100), int)
res.view(dt).ravel()[:] = array_of_tuples
334 µs ± 5.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit np.array(array_of_tuples.tolist())
478 µs ± 12.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
res = np.empty((100, 100), int)
np.concatenate(array_of_tuples, out=res.ravel())
500 µs ± 2.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit np.concatenate(array_of_tuples, dtype=int).reshape(100, 100)
504 µs ± 7.72 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
res = np.empty((100, 100), int)
np.stack(array_of_tuples, out=res, axis=0)
557 µs ± 25.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit np.stack(array_of_tuples, axis=0)
577 µs ± 6.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit np.array(sum(array_of_tuples, start=()), dtype=int).reshape(100, 100)
1.06 ms ± 11.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit np.reshape(np.sum(array_of_tuples), (100, 100))
1.26 ms ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
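
The timings above use IPython's `%timeit` magic. Outside IPython, a comparable measurement could be sketched with the standard-library `timeit` module. The function names and the `timings` dictionary below are illustrative, only two of the approaches are included, and absolute numbers will differ by machine and NumPy version:

```python
import timeit

import numpy as np

# Same benchmark input as above: 100 tuples of 100 integers each.
array_of_tuples = np.empty(100, dtype=object)
for i in range(len(array_of_tuples)):
    array_of_tuples[i] = tuple(range(i, i + 100))


def preallocated_loop(a):
    """Fill a pre-allocated int buffer row by row."""
    res = np.empty((len(a), len(a[0])), int)
    for i, tup in enumerate(a):
        res[i] = tup
    return res


def via_tolist(a):
    """Convert to a list of tuples and let NumPy infer shape and dtype."""
    return np.array(a.tolist())


candidates = {"preallocated_loop": preallocated_loop, "via_tolist": via_tolist}
number = 100  # calls per timing run
timings = {}
for name, func in candidates.items():
    # Take the best of 3 runs and report microseconds per call.
    best = min(timeit.repeat(lambda: func(array_of_tuples), number=number, repeat=3))
    timings[name] = best / number * 1e6
    print(f"{name}: {timings[name]:.1f} µs per call")
```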

Solution 1
3 Accepted 2022-09-12 16:54:48
