Black voodoo of NumPy Einsum
I have some working code that uses the einsum function, but einsum is still black voodoo to me. I was wondering what this code is actually doing, and whether it can somehow be optimized using np.dot.
My data looks like this:
import numpy as np
n, p, q = 40000, 8, 4
a = np.random.rand(n, p, q)
b = np.random.rand(n, p)
And my existing einsum calls look like this:
f1 = np.einsum("ijx,ijy->ixy", a, a)
f2 = np.einsum("ijx,ij->ix", a, b)
But what does it really do? Here is what I understand so far: each dimension (axis) is represented by a label. i is the first axis (n), j is the second axis (p), and x and y are different labels for the same axis (q). So the axis order of the output array f1 is ixy, and thus the output shape is 40000,4,4 (n,q,q).
But that's as far as I get.
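One way to read the two subscript strings is as explicit loops: a label that is missing from the output (here j) is summed over, while a label that appears in every operand and in the output (here i) simply indexes independent slices. The sketch below spells that out on small arrays. The sizes and the names f1_loop and f2_loop are illustrative assumptions, not part of the original code.

```python
import numpy as np

n, p, q = 5, 8, 4                      # small illustrative sizes
rng = np.random.default_rng(0)
a = rng.random((n, p, q))
b = rng.random((n, p))

f1 = np.einsum("ijx,ijy->ixy", a, a)
f2 = np.einsum("ijx,ij->ix", a, b)

# Same results written as one small matrix product per index i.
f1_loop = np.empty((n, q, q))
f2_loop = np.empty((n, q))
for i in range(n):
    f1_loop[i] = a[i].T @ a[i]         # (q, p) @ (p, q) -> (q, q), sums over j
    f2_loop[i] = a[i].T @ b[i]         # (q, p) @ (p,)   -> (q,),   sums over j

print(np.allclose(f1, f1_loop), np.allclose(f2, f2_loop))
```

Read this way, f1 holds one q-by-q product a[i].T @ a[i] per sample, and f2 holds one length-q vector a[i].T @ b[i] per sample.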
Let's play around with a couple of small arrays:
In [110]: a=np.arange(2*3*4).reshape(2,3,4)
In [111]: b=np.arange(2*3).reshape(2,3)
In [112]: np.einsum('ijx,ij->ix',a,b)
Out[112]:
array([[ 20,  23,  26,  29],
       [200, 212, 224, 236]])
In [113]: np.diagonal(np.dot(b,a)).T
Out[113]:
array([[ 20,  23,  26,  29],
       [200, 212, 224, 236]])
np.dot operates on the last dimension of the 1st array and the 2nd-to-last dimension of the 2nd. So I have to switch the arguments so that the size-3 dimensions line up. dot(b,a) produces a (2,2,4) array. diagonal selects 2 of those 'rows', and the transpose cleans up. Another einsum expresses that cleanup nicely:
In [122]: np.einsum('iik->ik',np.dot(b,a))
Since np.dot produces a larger array than the original einsum, it is unlikely to be faster, even if the underlying C code is tighter.
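To make the size difference concrete, here is a small sketch that compares the intermediate array built by np.dot with the final einsum result. The value n = 50 is an assumption chosen to keep the intermediate small; with the n = 40000 from the question, the intermediate would have shape (40000, 40000, 4).

```python
import numpy as np

n, p, q = 50, 8, 4                       # deliberately small n
rng = np.random.default_rng(0)
a = rng.random((n, p, q))
b = rng.random((n, p))

direct = np.einsum("ijx,ij->ix", a, b)   # shape (n, q)
full = np.dot(b, a)                      # shape (n, n, q): every b[i] paired with every a[k]
via_dot = np.einsum("iik->ik", full)     # keep only the i == k entries

print(direct.shape, full.shape)
print(full.size // direct.size)          # intermediate is n times larger
print(np.allclose(direct, via_dot))
```

Only the i == k entries of the intermediate are kept, so the work and memory spent on the remaining n*(n-1) pairs is discarded.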
(Curiously, I'm having trouble replicating np.dot(b,a) with einsum; it won't generate that (2,2,...) array.)
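For what it's worth, one subscript string that appears to reproduce np.dot(b,a) gives the leading axes of the two operands different labels, so that einsum keeps both instead of pairing them up. This is a sketch on the same small arrays as above; the names full_dot and full_einsum are illustrative.

```python
import numpy as np

a = np.arange(2 * 3 * 4).reshape(2, 3, 4)
b = np.arange(2 * 3).reshape(2, 3)

full_dot = np.dot(b, a)                        # sums b's last axis with a's 2nd-to-last
full_einsum = np.einsum("ij,kjx->ikx", b, a)   # i and k stay separate, j is summed

print(full_einsum.shape)
print(np.array_equal(full_dot, full_einsum))
```

Reusing the label i for both leading axes, as in 'ij,ijx->ix', is what collapses the result to the diagonal.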
For the a,a case we have to do something similar: roll the axes of one array so that its last dimension lines up with the 2nd-to-last of the other, do the dot, and then clean up with diagonal and transpose:
In [157]: np.einsum('ijx,ijy->ixy',a,a).shape
Out[157]: (2, 4, 4)
In [158]: np.einsum('ijjx->jix',np.dot(np.rollaxis(a,2),a))
In [176]: np.diagonal(np.dot(np.rollaxis(a,2),a),0,2).T
tensordot is another way of taking a dot over selected axes.
np.tensordot(a,a,(1,1))
np.diagonal(np.rollaxis(np.tensordot(a,a,(1,1)),1),0,2).T # with cleanup
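As a sanity check, the sketch below compares the dot and tensordot routes against the original einsum on the small integer array used above. The variable names are illustrative, and the axis comments reflect my reading of how each call orders its output.

```python
import numpy as np

a = np.arange(2 * 3 * 4).reshape(2, 3, 4)

ref = np.einsum("ijx,ijy->ixy", a, a)            # shape (2, 4, 4)

# dot route: intermediate has shape (4, 2, 2, 4), indexed [x, i, k, y]
dot_full = np.dot(np.rollaxis(a, 2), a)
via_dot = np.diagonal(dot_full, 0, 2).T          # diagonal over the two size-2 axes

# tensordot route: intermediate has shape (2, 4, 2, 4), indexed [i, x, k, y]
td_full = np.tensordot(a, a, (1, 1))
via_tensordot = np.diagonal(np.rollaxis(td_full, 1), 0, 2).T

print(dot_full.shape, td_full.shape)
print(np.array_equal(ref, via_dot), np.array_equal(ref, via_tensordot))
```

Both routes build a 4-dimensional intermediate and then throw away everything off the i == k diagonal, which is the same overhead noted for the a,b case.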