Numpy element-wise dot product
Is there an elegant, numpy way to apply the dot product elementwise? Or how can the code below be translated into a nicer version?
m0  # shape (5, 3, 2, 2)
m1  # shape (5, 2, 2)
r = np.empty((5, 3, 2, 2))
for i in range(5):
    for j in range(3):
        r[i, j] = np.dot(m0[i, j], m1[i])
Thanks in advance!
Approach #1
np.einsum('ijkl,ilm->ijkm',m0,m1)
Steps involved:

- Keep the first axes from the inputs aligned.
- Lose the last axis from m0 against the second one from m1 in sum-reduction.
- Let the remaining axes from m0 and m1 spread out/expand with elementwise multiplications in an outer-product fashion.
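The subscripts can be sanity-checked against the question's double loop, using small random inputs with the shapes from the question:

```python
import numpy as np

# Random inputs matching the shapes in the question
m0 = np.random.rand(5, 3, 2, 2)
m1 = np.random.rand(5, 2, 2)

# Loop version from the question
r = np.empty((5, 3, 2, 2))
for i in range(5):
    for j in range(3):
        r[i, j] = np.dot(m0[i, j], m1[i])

# einsum: i stays aligned, l is sum-reduced, j/k/m spread out
out = np.einsum('ijkl,ilm->ijkm', m0, m1)
print(np.allclose(r, out))  # True
```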
Approach #2
If you are looking for performance and the axis of sum-reduction has a smaller length, you are better off with one loop and matrix-multiplication with np.tensordot, like so -
s0,s1,s2,s3 = m0.shape
s4 = m1.shape[-1]
r = np.empty((s0,s1,s2,s4))
for i in range(s0):
    r[i] = np.tensordot(m0[i],m1[i],axes=([2],[0]))
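For one slice i, that tensordot call collapses the inner j loop in one shot; a quick check under the question's shapes:

```python
import numpy as np

m0 = np.random.rand(5, 3, 2, 2)
m1 = np.random.rand(5, 2, 2)

i = 0
# Contract axis 2 of m0[i] (shape (3, 2, 2)) against axis 0 of m1[i] (shape (2, 2))
one_shot = np.tensordot(m0[i], m1[i], axes=([2], [0]))

# Same result with the explicit j loop
looped = np.empty((3, 2, 2))
for j in range(3):
    looped[j] = np.dot(m0[i, j], m1[i])

print(np.allclose(one_shot, looped))  # True
```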
Approach #3

Now, np.dot could be used efficiently on 2D inputs for some further performance boost. So, with it, the modified version, though a bit longer, would hopefully be the most performant one -
s0,s1,s2,s3 = m0.shape
s4 = m1.shape[-1]
m0.shape = s0,s1*s2,s3   # Get m0 as 3D for temporary usage
r = np.empty((s0,s1*s2,s4))
for i in range(s0):
    r[i] = m0[i].dot(m1[i])
r.shape = s0,s1,s2,s4
m0.shape = s0,s1,s2,s3   # Put m0 back to 4D
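As an aside not covered in the answer above: on NumPy 1.10+, np.matmul (the @ operator) broadcasts its batched matrix product over the leading axes, so the whole computation can collapse to a one-liner; a sketch:

```python
import numpy as np

m0 = np.random.rand(5, 3, 2, 2)
m1 = np.random.rand(5, 2, 2)

# Insert a length-1 axis so m1 broadcasts against m0's second axis:
# (5, 3, 2, 2) @ (5, 1, 2, 2) -> (5, 3, 2, 2)
r = m0 @ m1[:, None]

# Reference: the double loop from the question
ref = np.empty((5, 3, 2, 2))
for i in range(5):
    for j in range(3):
        ref[i, j] = np.dot(m0[i, j], m1[i])

print(np.allclose(r, ref))  # True
```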
Function definitions -
def original_app(m0, m1):
    s0,s1,s2,s3 = m0.shape
    s4 = m1.shape[-1]
    r = np.empty((s0,s1,s2,s4))
    for i in range(s0):
        for j in range(s1):
            r[i, j] = np.dot(m0[i, j], m1[i])
    return r

def einsum_app(m0, m1):
    return np.einsum('ijkl,ilm->ijkm',m0,m1)

def tensordot_app(m0, m1):
    s0,s1,s2,s3 = m0.shape
    s4 = m1.shape[-1]
    r = np.empty((s0,s1,s2,s4))
    for i in range(s0):
        r[i] = np.tensordot(m0[i],m1[i],axes=([2],[0]))
    return r

def dot_app(m0, m1):
    s0,s1,s2,s3 = m0.shape
    s4 = m1.shape[-1]
    m0.shape = s0,s1*s2,s3   # Get m0 as 3D for temporary usage
    r = np.empty((s0,s1*s2,s4))
    for i in range(s0):
        r[i] = m0[i].dot(m1[i])
    r.shape = s0,s1,s2,s4
    m0.shape = s0,s1,s2,s3   # Put m0 back to 4D
    return r
Timings and verification -
In [291]: # Inputs
...: m0 = np.random.rand(50,30,20,20)
...: m1 = np.random.rand(50,20,20)
...:
In [292]: out1 = original_app(m0, m1)
...: out2 = einsum_app(m0, m1)
...: out3 = tensordot_app(m0, m1)
...: out4 = dot_app(m0, m1)
...:
...: print np.allclose(out1, out2)
...: print np.allclose(out1, out3)
...: print np.allclose(out1, out4)
...:
True
True
True
In [293]: %timeit original_app(m0, m1)
...: %timeit einsum_app(m0, m1)
...: %timeit tensordot_app(m0, m1)
...: %timeit dot_app(m0, m1)
...:
100 loops, best of 3: 10.3 ms per loop
10 loops, best of 3: 31.3 ms per loop
100 loops, best of 3: 5.12 ms per loop
100 loops, best of 3: 4.06 ms per loop
I think numpy.inner() is what you are actually looking for?