Numpy dot product of 3D arrays with shapes (X, Y, Z) and (X, Y, 1)

I have 2 numpy 3D arrays: A of shape (X, Y, Z) and B of shape (X, Y, 1). I need to perform a dot product of each column of A with the single column of B, obtaining another array C of shape (X, Z).
I managed to achieve this by allocating C, iterating over the first dimension (X) and saving the results during the loop.

Here is some sample code:

import numpy as np

np.random.seed(10)

x, y, z = 7, 2048, 10

a = np.random.randint(0, 10, (x, y, z))
b = np.random.randint(0, 10, (x, y, 1))
c = np.zeros((x, z))

for i in range(x):
    c[i] = np.dot(b[i].T, a[i])

print(c)

With the output:

[[40223. 42505. 41040. 40772. 41213. 40311. 41813. 41632. 40578. 40859.]
 [40984. 42119. 41512. 40948. 40725. 40222. 41182. 42255. 41916. 41948.]
 [41824. 41908. 42118. 39690. 40537. 41394. 42446. 41598. 40710. 42171.]
 [41664. 41949. 40847. 39915. 41888. 41565. 40992. 41354. 41227. 41948.]
 [42766. 41490. 41291. 42317. 40691. 41544. 41440. 41111. 42395. 40857.]
 [41714. 40661. 41421. 42129. 42115. 42189. 41941. 41541. 41957. 42574.]
 [41236. 40527. 41599. 40372. 40897. 41287. 41953. 40968. 41700. 42033.]]

For small values of (X, Y, Z) the process is quite fast, but I usually work with large sample sets, which makes this solution too slow.

So, despite the fact that the code above works for me, I believe there is another (better) way to get the same results, maybe using matmul or tensordot, but I couldn't figure out how to use those functions properly.

The easiest way is to do it in two steps, like a manual dot product:

c = (a * b).sum(1)
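
As a quick sanity check, here is a self-contained sketch (reusing the question's seed and shapes; the names c_loop and c_broadcast are mine) showing that the broadcast-and-sum result matches the loop:

import numpy as np

np.random.seed(10)
x, y, z = 7, 2048, 10
a = np.random.randint(0, 10, (x, y, z))
b = np.random.randint(0, 10, (x, y, 1))

# loop version from the question
c_loop = np.zeros((x, z))
for i in range(x):
    c_loop[i] = np.dot(b[i].T, a[i])

# broadcast b against a along the last axis, then sum over the Y axis
c_broadcast = (a * b).sum(1)

print(c_broadcast.shape)                  # (7, 10)
print(np.allclose(c_loop, c_broadcast))   # True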

Or if you wanna get fancy and speed it up a bit:

c = np.einsum('ijk,ijl->ik', a, b)
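
Since the last axis of b has length 1, an equivalent spelling (my sketch, not part of the original answer) drops that axis and contracts the shared j axis directly:

import numpy as np

np.random.seed(10)
a = np.random.randint(0, 10, (7, 2048, 10))   # (X, Y, Z)
b = np.random.randint(0, 10, (7, 2048, 1))    # (X, Y, 1)

# b[..., 0] has shape (X, Y); 'ijk,ij->ik' sums over the shared j (= Y) axis
c_alt = np.einsum('ijk,ij->ik', a, b[..., 0])
print(np.allclose(c_alt, np.einsum('ijk,ijl->ik', a, b)))   # True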

Output:

array([[40223, 42505, 41040, 40772, 41213, 40311, 41813, 41632, 40578,
        40859],
       [40984, 42119, 41512, 40948, 40725, 40222, 41182, 42255, 41916,
        41948],
       [41824, 41908, 42118, 39690, 40537, 41394, 42446, 41598, 40710,
        42171],
       [41664, 41949, 40847, 39915, 41888, 41565, 40992, 41354, 41227,
        41948],
       [42766, 41490, 41291, 42317, 40691, 41544, 41440, 41111, 42395,
        40857],
       [41714, 40661, 41421, 42129, 42115, 42189, 41941, 41541, 41957,
        42574],
       [41236, 40527, 41599, 40372, 40897, 41287, 41953, 40968, 41700,
        42033]])

@fsl gave this einsum:

np.einsum('ijk,ijl->ik',a,b)

With a bit of transposing, you can place j, the sum-of-products dimension, in the standard dot order (last axis of A, second-to-last axis of B):

np.einsum('ikj,ijl->ik',a.transpose(0,2,1),b)

This can be used with matmul:

np.matmul(a.transpose(0,2,1),b).squeeze()

The squeeze is needed to remove the trailing size-1 dimension (the l that was 'omitted' in the einsum).
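
An equivalent arrangement (my sketch, not from the original answer) transposes b instead of a, which directly mirrors the np.dot(b[i].T, a[i]) in the question's loop:

import numpy as np

np.random.seed(10)
a = np.random.randint(0, 10, (7, 2048, 10))   # (X, Y, Z)
b = np.random.randint(0, 10, (7, 2048, 1))    # (X, Y, 1)

# b.transpose(0, 2, 1) has shape (X, 1, Y); matmul batches over X,
# producing (X, 1, Z), and squeeze(1) drops the singleton middle axis
c = np.matmul(b.transpose(0, 2, 1), a).squeeze(1)
print(c.shape)   # (7, 10)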

tensordot cannot be used here, since it does an "outer" product on the leading dimensions (np.dot has the same problem, as the shapes below show):

In [8]: np.matmul(a.transpose(0,2,1),b).shape
Out[8]: (7, 10, 1)
In [9]: np.dot(a.transpose(0,2,1),b).shape
Out[9]: (7, 10, 7, 1)
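
For illustration (my sketch), calling tensordot directly shows the same blow-up: the leading X axes are not paired, so the result carries both of them, just like np.dot above:

import numpy as np

a = np.zeros((7, 2048, 10))   # (X, Y, Z)
b = np.zeros((7, 2048, 1))    # (X, Y, 1)

# contract axis 1 of a with axis 1 of b; the leading axes form an outer product
print(np.tensordot(a, b, axes=(1, 1)).shape)   # (7, 10, 7, 1)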

In this case the einsum is nearly as good as matmul, possibly because the j dimension is relatively large.

In [10]: timeit np.matmul(a.transpose(0,2,1),b).squeeze()
291 µs ± 632 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [11]: timeit np.einsum('ijk,ijl->ik',a,b)
337 µs ± 77.4 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [12]: timeit np.einsum('ijk,ijl->ik',a,b,optimize=True)
