
Why does the NumPy dot product of a 2-D array with a 1-D array produce a 1-D array?

I ran the code below:

>>> import numpy as np
>>> A = np.array([[1,2], [3,4], [5,6]])
>>> A.shape
(3, 2)
>>> B = np.array([7,8])
>>> B.shape
(2,)
>>> np.dot(A,B)
array([23, 53, 83])

I thought the shape of np.dot(A,B) should be (3,1), not (3,).

I expected the result to be a column matrix:

array([[23],[53],[83]])

23
53
83

not

array([23,53,83])

23 53 83

Why does this result occur?

As its name suggests, the primary purpose of the numpy.dot() function is to deliver a scalar result by performing a traditional linear algebra dot product on two arrays of identical shape (m,).

Given this primary purpose, the documentation of numpy.dot() lists this scenario first (the first bullet point below):

numpy.dot(a, b, out=None)

 1. If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
 2. If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.
 3. If either a or b is 0-D (scalar), it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.
 4. If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
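The four cases listed above can be checked with a small sketch. The array names v, M and N are illustrative and do not come from the question; the comments state the shape each case should produce.

```python
import numpy as np

v = np.array([1, 2])                     # 1-D, shape (2,)
M = np.array([[1, 2], [3, 4], [5, 6]])   # 2-D, shape (3, 2)
N = np.array([[7], [8]])                 # 2-D, shape (2, 1)

case1 = np.dot(v, v)   # 1-D with 1-D: inner product, a scalar
case2 = np.dot(M, N)   # 2-D with 2-D: matrix product, shape (3, 1)
case3 = np.dot(M, 2)   # scalar operand: elementwise multiply, shape (3, 2)
case4 = np.dot(M, v)   # N-D with 1-D: sum product over last axis, shape (3,)

print(np.shape(case1), case2.shape, case3.shape, case4.shape)
```

Only the second case returns a 2-D column; the fourth case, which matches the question, drops the contracted axis entirely.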

Your case is covered by the 4th bullet point above (as @hpaulj pointed out in the comments). But that still does not fully answer why the result has shape (3,) and not (3,1), as you expected.

You would be justified in expecting a result shape of (3,1) only if B had shape (2,1). In that case, A with shape (3,2) and B with shape (2,1) follow the usual matrix multiplication rule, which gives shape (3,1).

But here, B has shape (2,), not (2,1). So we are now outside the jurisdiction of the usual rules of matrix multiplication, and it is really up to the designers of numpy.dot() how the result turns out. They could have chosen to treat this as an error ("dimension mismatch"). Instead, they chose to handle this scenario as described in this answer.

I'm quoting that answer, with some modifications to relate it to your code:

According to numpy, a 1-D array has only one dimension, and all checks are done against that dimension. Because of this, np.dot(A,B) checks the second dimension of A against the one dimension of B.

So, the check would succeed, and numpy wouldn't treat this as an error.
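As a rough illustration of that check, the sketch below uses a second, hypothetical array B_bad of length 3 (not part of the question). Its length does not match the last axis of A, so np.dot should raise a ValueError rather than return a result.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
B_ok = np.array([7, 8])                  # shape (2,): matches A's last axis
B_bad = np.array([7, 8, 9])              # shape (3,): hypothetical mismatch

ok = np.dot(A, B_ok)
print(ok.shape)                          # the check passes

try:
    np.dot(A, B_bad)
    raised = False
except ValueError as exc:
    # The length of B_bad (3) differs from A's last axis (2).
    raised = True
    print("ValueError:", exc)
```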

Now, the only remaining question is why the result shape is (3,) and not (3,1) or (1,3).

The answer is this: in A, which has shape (3,2), the last axis (of length 2) is consumed by the sum product. The unconsumed part of A's shape is (3,), so the result of np.dot(A,B) has shape (3,). To see this more clearly, take a different example in which A has shape (3,4,2) instead of (3,2). The unconsumed part of A's shape is then (3,4), and the result of np.dot(A,B) has shape (3,4) instead of the (3,) your example produced.

Here's the code for you to verify:

import numpy as np

A = np.arange(24).reshape(3, 4, 2)
print("A is:\n", A, ", and its shape is:", A.shape)
B = np.array([7, 8])
print("B is:\n", B, ", and its shape is:", B.shape)
C = np.dot(A, B)
print("C is:\n", C, ", and its shape is:", C.shape)

The output of this is:

A is:
 [[[ 0  1]
  [ 2  3]
  [ 4  5]
  [ 6  7]]

 [[ 8  9]
  [10 11]
  [12 13]
  [14 15]]

 [[16 17]
  [18 19]
  [20 21]
  [22 23]]] , and its shape is: (3, 4, 2)
B is:
 [7 8] , and its shape is: (2,)
C is:
 [[  8  38  68  98]
 [128 158 188 218]
 [248 278 308 338]] , and its shape is: (3, 4)

Here is another helpful way to understand the behavior in this example:

The array A of shape (3,4,2) can be visualized as an outer array of inner arrays, where the outer array has shape (3,4) and each inner array has shape (2,). The traditional dot product is performed between each of these inner arrays and the array B (which has shape (2,)), and each resulting scalar is left in the position of its inner array, forming the (3,4) shape of the outer array. So the overall result of numpy.dot(A,B), consisting of all these in-place scalar results, has shape (3,4).
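The "outer array of inner arrays" view can be written out as an explicit loop. This is only a sketch for checking the idea, not how numpy implements it; C_manual and outer_shape are names introduced here for illustration.

```python
import numpy as np

A = np.arange(24).reshape(3, 4, 2)
B = np.array([7, 8])

outer_shape = A.shape[:-1]                      # (3, 4): the unconsumed axes
C_manual = np.empty(outer_shape, dtype=A.dtype)

# Visit every position of the outer array and take the ordinary
# 1-D dot product of the inner array found there with B.
for idx in np.ndindex(*outer_shape):
    C_manual[idx] = np.dot(A[idx], B)

print(np.array_equal(C_manual, np.dot(A, B)))
```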

By the matrix multiplication rule described on Wikipedia, a (3, 2) array dotted with a (2, 1) array gives a (3, 1) result.


How to fix

np.dot(A,B[:,None])
Out[49]: 
array([[23],
       [53],
       [83]])
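A few other spellings should give the same (3,1) column. The sketch below assumes the A and B from the question; which one to use is mostly a matter of taste.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([7, 8])

col1 = np.dot(A, B[:, None])          # add a trailing axis to B, as above
col2 = np.dot(A, B.reshape(-1, 1))    # same idea using reshape
col3 = A @ B[:, np.newaxis]           # matmul operator on a (2, 1) column
col4 = np.dot(A, B)[:, None]          # compute the (3,) result, then add an axis

for col in (col1, col2, col3, col4):
    print(col.shape)
```

The first three make B a true (2,1) matrix so the ordinary matrix rule applies; the last one keeps the 1-D computation and only reshapes the answer.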

I just learned about this dot product while studying neural networks. It is the dot product between a 1-D array and an N-D array.

It computes the sum of products for each row of A separately. For the first row: 1*7 + 2*8 = 23.

Then, for the second row: 3*7 + 4*8 = 53.

Then, for the third row: 5*7 + 6*8 = 83.

A.shape is (3, 2) and B.shape is (2,), so rule #4 of the dot operation applies directly to np.dot(A,B):

If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.

The alignment happens between B's 2 (the only axis of B) and A's 2 (the last axis of A). Since the two lengths are equal, numpy accepts this as a legitimate dot operation. These two "2"s are "consumed", leaving only A's (3,), which becomes the shape of the result.
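The "consumed axis" idea can also be spelled out with other numpy calls that contract the same axis. This is a sketch for comparison only, using the A and B from the question; the variable names are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
B = np.array([7, 8])                     # shape (2,)

r_dot = np.dot(A, B)
r_sum = (A * B).sum(axis=-1)             # broadcast B along A's last axis, then sum it away
r_einsum = np.einsum("ij,j->i", A, B)    # index j is contracted, index i survives
r_tensordot = np.tensordot(A, B, axes=1) # contract A's last axis with B's first axis

print(r_dot.shape, r_sum.shape, r_einsum.shape, r_tensordot.shape)
```

In each form the axis of length 2 disappears and only the axis of length 3 remains.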
