NumPy: 2D array from a list of arrays and scalars

Question

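I need to create a 2D numpy array from a list of 1D arrays and scalars, so that each scalar is replicated to match the length of the 1D arrays.

Example of the desired behaviour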

>>> x = np.ones(5)
>>> something([x, 0, x])
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.]])

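I know that the vector elements in the list always have the same length (shape), so I can do this "by hand" with something like:

import numpy as np

def something(lst):
    # Find the length of the first array in the list.
    for e in lst:
        if isinstance(e, np.ndarray):
            l = len(e)
            break
    tmp = []
    for e in lst:
        if isinstance(e, np.ndarray):
            tmp.append(e)
            l = len(e)
        else:
            # Replace a scalar with a row of length l filled with that value.
            tmp.append(np.empty(l))
            tmp[-1][:] = e
    return np.array(tmp)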

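What I am asking is whether there is a ready-made solution hidden somewhere in numpy or, if not, whether there is a better (e.g. more general, more reliable, faster) solution than the one above.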

Answer 1

In [179]: np.column_stack(np.broadcast(x, 0, x))
Out[179]: 
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.]])

or

In [187]: np.row_stack(np.broadcast_arrays(x, 0, x))
Out[187]: 
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.]])
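
The second variant can be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: the name rows_from_mixed is hypothetical, and np.vstack is used in place of np.row_stack because, as far as I know, np.row_stack is a deprecated alias for it in recent NumPy releases.

```python
import numpy as np

def rows_from_mixed(items):
    """Stack arrays and scalars as rows, repeating scalars to the common shape."""
    # broadcast_arrays expands every item to the shared broadcast shape,
    # so a scalar becomes a full-length row.
    return np.vstack(np.broadcast_arrays(*items))

x = np.ones(5)
result = rows_from_mixed([x, 0, x])
print(result.shape)
```

Because the helper relies on broadcasting rather than on len(), it should also accept items whose shapes differ but are broadcastable.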

Using np.broadcast is faster than using np.broadcast_arrays:

In [195]: %timeit np.column_stack(np.broadcast(*[x, 0, x]*10))
10000 loops, best of 3: 46.4 µs per loop

In [196]: %timeit np.row_stack(np.broadcast_arrays(*[x, 0, x]*10))
1000 loops, best of 3: 380 µs per loop

but slower than your something function:

In [201]: %timeit something([x, 0, x]*10)
10000 loops, best of 3: 37.3 µs per loop
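
These timings depend on the machine and the NumPy version, so it may be worth re-measuring. The following is a rough sketch using the standard timeit module; the dictionary names are made up for illustration, np.vstack stands in for np.row_stack, and the np.broadcast iterator is wrapped in list() on the assumption that newer NumPy versions want a real sequence for stacking.

```python
import timeit
import numpy as np

x = np.ones(5)
lst = [x, 0, x] * 10  # 30 items, below the np.broadcast argument limit

# Hypothetical names for the two approaches being compared.
candidates = {
    "broadcast": lambda: np.column_stack(list(np.broadcast(*lst))),
    "broadcast_arrays": lambda: np.vstack(np.broadcast_arrays(*lst)),
}

runs = 200
timings = {
    name: min(timeit.repeat(fn, number=runs, repeat=3)) / runs
    for name, fn in candidates.items()
}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} us per call")
```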

Note that np.broadcast can be passed at most 32 arrays:

In [199]: np.column_stack(np.broadcast(*[x, 0, x]*100))
ValueError: Need at least two and fewer than (32) array objects.

whereas np.broadcast_arrays has no such limit:

In [198]: np.row_stack(np.broadcast_arrays(*[x, 0, x]*100))
Out[198]: 
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       ..., 
       [ 1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.]])
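
Another way to sidestep the argument limit is to broadcast each item on its own. This is a sketch under stated assumptions, not something from the original answer: stack_any_length is a hypothetical name, the list is assumed to contain at least one ndarray, and all arrays are assumed to share one shape.

```python
import numpy as np

def stack_any_length(items):
    # Assumption: at least one item is an ndarray and all arrays share one shape.
    shape = next(e.shape for e in items if isinstance(e, np.ndarray))
    # Each item is broadcast individually, so the number of items is not
    # constrained by the np.broadcast argument limit.
    return np.stack([np.broadcast_to(e, shape) for e in items])

x = np.ones(5)
big = stack_any_length([x, 0, x] * 100)
print(big.shape)
```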

Using np.broadcast or np.broadcast_arrays is more general than something. It works with arrays of different (but broadcastable) shapes, for example:

In [209]: np.column_stack(np.broadcast(*[np.atleast_2d(x), 0, x]))
Out[209]: 
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.]])

whereas something([np.atleast_2d(x), 0, x]) returns:

In [211]: something([np.atleast_2d(x), 0, x])
Out[211]: 
array([array([[ 1.,  1.,  1.,  1.,  1.]]), array([ 0.]),
       array([ 1.,  1.,  1.,  1.,  1.])], dtype=object)
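
The difference comes from how the target size is found: something uses len(e), which is 1 for the (1, 5) array, while np.broadcast works from the common broadcast shape. The sketch below just inspects that shape; the variable names are illustrative, and list() is used on the assumption that the stacking function expects a real sequence.

```python
import numpy as np

x = np.ones(5)
b = np.broadcast(np.atleast_2d(x), 0, x)

# Shapes (1, 5), () and (5,) broadcast to a common shape of (1, 5).
common_shape = b.shape
print(common_shape)

# Iterating yields one tuple per broadcast position, each holding one
# value from every input; stacking the tuples as columns gives one row per input.
columns = list(b)
result = np.column_stack(columns)
print(result.shape)
```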

Answer 2

A shorter way, though I doubt it is any faster:

l = len(max(lst, key=lambda e: len(e) if isinstance(e, np.ndarray) else 0))
new_lst = np.array([(x if isinstance(x, np.ndarray) else np.ones(l) * x) for x in lst])

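Edit: using np.fromiter to do it faster (note that np.fromiter requires a dtype argument, so the call below raises a TypeError as written):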

l = len(max(lst, key=lambda e: len(e) if isinstance(e, np.ndarray) else 0))
new_lst = np.fromiter(((x if isinstance(x, np.ndarray) else np.ones(l) * x) for x in lst))

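And using a while loop to find the length makes it faster still, although the code is a bit longer (the np.fromiter call here also needs a dtype argument before it will run):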

i = 0
while not isinstance(lst[i], np.ndarray):
  i += 1
l = len(lst[i])
new_lst = np.fromiter(((x if isinstance(x, np.ndarray) else np.ones(l) * x) for x in lst))
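
For reference, here is one runnable take on the same idea. It is a sketch rather than the answerer's code: expand_scalars is a hypothetical name, np.full replaces np.ones(l) * x, np.array replaces np.fromiter, and the float dtype is an assumption made to match the question's example.

```python
import numpy as np

def expand_scalars(lst):
    # Length of the first ndarray in the list; assumes there is at least one.
    n = next(len(e) for e in lst if isinstance(e, np.ndarray))
    # np.full builds the constant row directly, without a multiplication.
    rows = [e if isinstance(e, np.ndarray) else np.full(n, e, dtype=float)
            for e in lst]
    return np.array(rows)

x = np.ones(5)
new_lst = expand_scalars([x, 0, x, 2.5])
print(new_lst)
```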

Answer 3

For 25 rows, a list-comprehension version of something falls between np.broadcast and np.broadcast_arrays in speed:

In [48]: ll=[x,0,x,x,0]*5

In [49]: np.vstack([y if isinstance(y,np.ndarray) else np.zeros(5) for y in ll]).shape
Out[49]: (25, 5)

In [50]: timeit np.vstack([y if isinstance(y,np.ndarray) else np.zeros(5) for y in ll]).shape
1000 loops, best of 3: 219 us per loop

In [51]: timeit np.vstack(np.broadcast_arrays(*ll))
1000 loops, best of 3: 790 us per loop

In [52]: timeit np.column_stack(np.broadcast(*ll)).shape
10000 loops, best of 3: 126 us per loop

Using np.array instead of vstack makes it better still:

In [54]: timeit np.array([y if isinstance(y,np.ndarray) else np.zeros(5) for y in ll]).shape
10000 loops, best of 3: 54.2 us per loop

For a 2D x, vstack with the if comprehension may be the only one that gives the correct result:

In [66]: x=np.arange(10).reshape(2,5)

In [67]: ll=[x,0,x,x,0]

In [68]: np.vstack([y if isinstance(y,np.ndarray) else np.zeros(5) for y in ll]) 
Out[68]: 
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [ 0.,  0.,  0.,  0.,  0.]])
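
The comprehension above hard-codes np.zeros(5), so it only covers the scalar 0 and a row width of 5. A slightly more general sketch follows; vstack_mixed is a hypothetical name, and the row width is assumed to be the last dimension of the first array found.

```python
import numpy as np

def vstack_mixed(lst):
    # Assumption: at least one ndarray is present and all arrays share
    # the same last dimension, which is taken as the row width.
    width = next(e.shape[-1] for e in lst if isinstance(e, np.ndarray))
    # Scalars become a single row filled with their value; 1D and 2D
    # arrays pass through unchanged and are stacked vertically.
    return np.vstack([e if isinstance(e, np.ndarray) else np.full(width, e)
                      for e in lst])

x = np.arange(10).reshape(2, 5)
out = vstack_mixed([x, 7, x])
print(out.shape)
```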


3 answers

Answer 1
Score 3, accepted, 2016-02-06 18:37:04

Answer 2
Score 1, 2016-02-06 18:10:48

Answer 3
Score 0, 2016-02-07 02:40:03
