Numpy ndarray unexpected shape broadcast error

I have one numpy ndarray with shape (3,). I have another ndarray with shape (3,100,100). The following works:

import numpy as np

a = np.array([1,1,1]) # Shape is (3,)
b = np.zeros((3,100,100)) # Shape is (3,100,100)
c = np.array([b[0], b[1], 0]) # Shape (3,)
c - a # works fine and as expected

But the following breaks:

c_wrong = np.array([b[0], b[1], b[2]]) # now c_wrong is (3,100,100) too

c_wrong - a # ValueError: operands could not be broadcast together with shapes (3,100,100) (3,)

Is there a way to reshape a (3,100,100) into a (3,)?

An ugly workaround that I figured out is to add a dummy extra component:

>>> c_wrong = np.array([b[0],b[1],b[2],0])
>>> a = np.array([1,1,1,1])
>>> d = c_wrong - a
>>> d[0:3]

This is quite ugly, but I hope it helps to explain the problem and the desired behavior.

Look at more than the shape!

In [82]: a = np.array([1,1,1]) # Shape is (3,) 
    ...: b = np.zeros((3,10,10)) # Shape is (3,10,10) 
    ...: c = np.array([b[0], b[1], 0]) # Shape (3,)                             
In [83]:                                                                        
In [83]: c                                                                      
Out[83]: 
array([array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]),
       array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]),
       0], dtype=object)
In [84]: c.shape                                                                
Out[84]: (3,)

Yes, c has just 3 elements, but each one is either an array or a scalar (the last 0).

In [85]: c-a                                                                    
Out[85]: 
array([array([[-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.]]),
       array([[-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.]]),
       -1], dtype=object)

So you managed to subtract 1 from each of those elements!

c_wrong is a very different array: it's 3d with a numeric dtype. Replacing that 0 with b[2] makes all the difference.

In [88]: c_wrong.shape                                                          
Out[88]: (3, 10, 10)
In [89]: c_wrong.dtype                                                          
Out[89]: dtype('float64')

To subtract a (3,) from a (3,N,N) you have to adjust the dimensions of a to (3,1,1). Then it can do proper broadcasting.

In [91]: c_wrong -  a[:,None,None]                                              
Out[91]: 
array([[[-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
        [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
        ....
        [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.]]])
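To make the shape reasoning concrete, here is a minimal sketch with smaller arrays and distinct values in a, so you can see which axis each value lands on. The names failed, a_col and d are only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])      # shape (3,)
b = np.zeros((3, 4, 4))      # shape (3, 4, 4)

# Broadcasting aligns shapes from the right, so the 3 in (3,) is compared
# with the trailing 4 in (3, 4, 4) and the subtraction fails.
try:
    b - a
    failed = False
except ValueError:
    failed = True

# Two trailing length-1 axes give shape (3, 1, 1), which stretches across
# the last two axes of b.
a_col = a[:, None, None]     # same result as a.reshape(3, 1, 1)
d = b - a_col

print(failed)         # True
print(a_col.shape)    # (3, 1, 1)
print(d.shape)        # (3, 4, 4)
print(d[:, 0, 0])     # [-1. -2. -3.]
```

Each value of a is subtracted from the whole (4, 4) plane with the same leading index.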

I think it's just an accident that your c - a works. By defining c with a 0 element, you created an object dtype array. Math with object dtype arrays is hit-or-miss. This subtraction happens to be one of the hits. But don't count on it; there are lots of ways in which math with such an array does not work, and it is always slower.


c_wrong is essentially the same thing as b.


The core of numpy is multidimensional numeric arrays. By default, np.array tries to construct a numeric array with as many dimensions as it can. In your c_wrong case it can make a 3d array; for c it can't, because of the scalar 0, so it falls back on making a 1d object array.
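A small sketch of that difference, looking at dtype as well as shape. The names stacked and ragged are only illustrative. Note one assumption: I pass dtype=object explicitly for the mixed case, because recent NumPy releases raise an error for such ragged input instead of silently falling back to an object array.

```python
import numpy as np

b = np.zeros((3, 10, 10))

# Three equally shaped 2d arrays combine into one 3d float array.
stacked = np.array([b[0], b[1], b[2]])

# Two 2d arrays plus a scalar cannot form a regular 3d array.
# dtype=object requests the 1d object array explicitly.
ragged = np.array([b[0], b[1], 0], dtype=object)

print(stacked.shape, stacked.dtype)   # (3, 10, 10) float64
print(ragged.shape, ragged.dtype)     # (3,) object
print(type(ragged[0]), ragged[2])     # an ndarray element and the scalar 0
```

Both arrays report 3 as their first dimension, but only stacked is a numeric array that takes part in normal broadcasting.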

The surest way to make an object array of the desired shape is to initialize a 'blank' one and fill it. But even then filling can be tricky. Here I managed to do it with:

In [92]: c3 = np.empty(3, object)                                               
In [93]: c3                                                                     
Out[93]: array([None, None, None], dtype=object)
In [94]: c3[:] = list(b)                                                        
In [95]: c3                                                                     
Out[95]: 
array([array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       ....
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])], dtype=object)
In [96]: c3-a                                                                   
Out[96]: 
array([array([[-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.],
....
       [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.]])], dtype=object)

A fill that doesn't work:

In [97]: c3[:] = b                                                              
------------------------------------------------------------------------ 
...
ValueError: could not broadcast input array from shape (3,10,10) into shape (3)
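If the list(b) assignment feels too magical, an explicit loop is another way to fill the object array. This is only a sketch of an alternative, not something timed above; the names plane, d and back are illustrative.

```python
import numpy as np

a = np.array([1, 1, 1])
b = np.zeros((3, 10, 10))

# Start from an empty 1d object array and store one 2d array per slot.
c3 = np.empty(len(b), dtype=object)
for i, plane in enumerate(b):
    c3[i] = plane            # integer indexing stores the array as one element

d = c3 - a                   # elementwise on the 3 slots: c3[i] - a[i]

# Convert the object result back to a regular 3d float array.
back = np.stack(list(d))

print(c3.shape, c3.dtype)     # (3,) object
print(d.shape, d.dtype)       # (3,) object
print(back.shape, back.dtype) # (3, 10, 10) float64
```

The round trip through np.stack gives the same values as b - a[:,None,None], which is a hint that the object array was not needed in the first place.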

a[:,None,None] doesn't look so ugly when you become familiar with broadcasting.

Compare the timings:

In [98]: timeit c_wrong-a[:,None,None]                                          
5.22 µs ± 6.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [99]: timeit c3-a                                                            
9.53 µs ± 20.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [100]: timeit c-a                                                            
7.66 µs ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Or with dot:

In [103]: timeit np.dot(a, b.reshape(3,-1)).shape                              
2.44 µs ± 9.63 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [104]: timeit np.dot(a,c).shape                                              
10.9 µs ± 16.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [105]: timeit np.dot(a,c3).shape                                             
11.6 µs ± 30.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

dot has very specific rules: the last axis of a must match the second-to-last axis of b. That's why I used reshape. With numeric arrays it also passes the task to a fast BLAS routine.

With the (3,) object array it does a 1d dot product, but iteratively.

@ (matmul) works with the reshaped b, but not with c or c3. The same goes for einsum: np.einsum('i,ijk->jk',a,b).shape works, but not anything using c.
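As a quick check that these numeric routes compute the same thing, here is a sketch with a small non-trivial b. The np.tensordot line is my addition for comparison and is not one of the calls timed above; the via_* names are illustrative.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])                            # shape (3,)
b = np.arange(3 * 4 * 5, dtype=float).reshape(3, 4, 5)   # shape (3, 4, 5)

# Flatten the last two axes so dot and @ see a (3,) with a (3, 20),
# then restore the (4, 5) layout.
via_dot = np.dot(a, b.reshape(3, -1)).reshape(4, 5)
via_matmul = (a @ b.reshape(3, -1)).reshape(4, 5)

# einsum and tensordot sum over the first axis of b directly.
via_einsum = np.einsum('i,ijk->jk', a, b)
via_tensordot = np.tensordot(a, b, axes=(0, 0))

print(via_dot.shape)                       # (4, 5)
print(np.allclose(via_dot, via_einsum))    # True
```

All four results are the weighted sum a[0]*b[0] + a[1]*b[1] + a[2]*b[2].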
