numpy - einsum notation: dot product of a stack of matrices with a stack of vectors

I want to multiply an n-dim stack of m*m matrices by an n-dim stack of vectors (length m), so that the nth entry of the resulting m*n array contains the dot product of the nth matrix with the nth vector:

vec1=np.array([0,0.5,1,0.5]); vec2=np.array([2,0.5,1,0.5])
vec=np.transpose(n.stack((vec1,vec2)))
mat = np.moveaxis(n.array([[[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3]],[[-1,2.,0,1.],[0,0,-1,2.],[0,1,-1,2.],[1,0.1,1,1]]]),0,2)
outvec=np.zeros((4,2))
for i in range(2):
    outvec[:,i]=np.dot(mat[:,:,i],vec[:,i])

Inspired by this post, Element wise dot product of matrices and vectors, I have tried all the different permutations of index combinations in einsum, and have found that

np.einsum('ijk,jk->ik',mat,vec)

gives the correct result.

Unfortunately I really do not understand this - I assumed that repeating the entry k in the 'ijk,jk' part means that I multiply AND sum over k. I've tried to read the documentation https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.einsum.html , but I still don't understand.

(My previous attempts included,

 np.einsum('ijk,il->ik', mat, vec)

I'm not even sure what this means. What happens to the index l when I drop it?)

Thanks in advance!

Read up on Einstein summation notation.

Basically, the rules are:

Without a ->

  • Any letter repeated in the inputs represents an axis to be multiplied and summed over
  • Any letter not repeated in the inputs is included in the output

With a ->

  • Any letter repeated in the inputs represents an axis to be multiplied over
  • Any letter not in the output represents an axis to be summed over

So, for example, with matrices A and B with the same shape:

np.einsum('ij, ij',       A, B)     # is A ddot B,                returns 0d scalar
np.einsum('ij, jk',       A, B)     # is A dot  B,                returns 2d tensor
np.einsum('ij, kl',       A, B)     # is outer(A, B),             returns 4d tensor
np.einsum('ji, jk, kl',   A, B, A)  # is A.T @ B @ A,             returns 2d tensor
np.einsum('ij, ij -> ij', A, B)     # is A * B,                   returns 2d tensor
np.einsum('ij, ij -> i' , A, A)     # is norm(A, axis=1)**2,      returns 1d tensor
np.einsum('ii'             , A)     # is tr(A),                   returns 0d scalar
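
A quick way to convince yourself of these identities is to check them numerically. The snippet below is a minimal sketch with two small arbitrary matrices (the names A and B are just placeholders):

import numpy as np

A = np.arange(9.).reshape(3, 3)
B = np.arange(9., 18.).reshape(3, 3)

assert np.allclose(np.einsum('ij, ij', A, B), (A * B).sum())               # ddot
assert np.allclose(np.einsum('ij, jk', A, B), A @ B)                       # dot
assert np.allclose(np.einsum('ij, kl', A, B), np.multiply.outer(A, B))     # outer, 4d
assert np.allclose(np.einsum('ji, jk, kl', A, B, A), A.T @ B @ A)          # chained product
assert np.allclose(np.einsum('ij, ij -> ij', A, B), A * B)                 # elementwise
assert np.allclose(np.einsum('ij, ij -> i', A, A), (A * A).sum(axis=1))    # squared row norms
assert np.allclose(np.einsum('ii', A), np.trace(A))                        # trace
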
In [321]: vec1=np.array([0,0.5,1,0.5]); vec2=np.array([2,0.5,1,0.5])
     ...: vec=np.transpose(np.stack((vec1,vec2)))
In [322]: vec1.shape
Out[322]: (4,)
In [323]: vec.shape
Out[323]: (4, 2)

A nice thing about the stack function is we can specify an axis, skipping the transpose:

In [324]: np.stack((vec1,vec2), axis=1).shape
Out[324]: (4, 2)

Why the mix of np. and n.? NameError: name 'n' is not defined. That kind of thing almost sends me away.

In [326]: mat = np.moveaxis(np.array([[[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3]],[[-1,2.,0
     ...: ,1.],[0,0,-1,2.],[0,1,-1,2.],[1,0.1,1,1]]]),0,2)
In [327]: mat.shape
Out[327]: (4, 4, 2)

In [328]: outvec=np.zeros((4,2))
     ...: for i in range(2):
     ...:     outvec[:,i]=np.dot(mat[:,:,i],vec[:,i])
     ...:     
In [329]: outvec
Out[329]: 
array([[ 4.  , -0.5 ],
       [ 4.  ,  0.  ],
       [ 4.  ,  0.5 ],
       [ 4.  ,  3.55]])

In [330]: # (4,4,2) (4,2)   'kji,ji->ki'

From your loop, the location of the i axis (size 2) is clear - last in all 3 arrays. That leaves one axis for vec; let's call that j. It pairs with the next-to-last axis of mat (the one just before i). k carries over from mat to outvec.

In [331]: np.einsum('kji,ji->ki', mat, vec)
Out[331]: 
array([[ 4.  , -0.5 ],
       [ 4.  ,  0.  ],
       [ 4.  ,  0.5 ],
       [ 4.  ,  3.55]])

Often the einsum string writes itself. For example, if mat were described as (m,n,k) and vec as (n,k), with the result being (m,k), the subscript string can be read straight off those shapes.
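
For the arrays above, that reading can be checked directly (reusing mat, vec and outvec from the session):

np.allclose(np.einsum('mnk,nk->mk', mat, vec), outvec)   # True
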

In this case only the j dimension is summed - it appears on the left, but not on the right. The last dimension, i in my notation, is not summed because it appears on both sides, just as it does in your iteration. I think of it as 'going along for the ride' - it isn't an active part of the dot product.
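
Written out as explicit scalar loops (a minimal sketch reusing mat and vec from the session above), the summed index and the carried indices are easy to see:

out = np.zeros((4, 2))
for k in range(4):                # k carries over from mat to the output
    for i in range(2):            # i 'goes along for the ride'
        out[k, i] = sum(mat[k, j, i] * vec[j, i] for j in range(4))   # j is summed
# out matches np.einsum('kji,ji->ki', mat, vec)
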

You are, in effect, stacking on the last dimension, the size 2 one. Usually we stack on the first, but you transpose both arrays to put that axis last.


Your 'failed' attempt runs, and can be reproduced as:

In [332]: np.einsum('ijk,il->ik', mat, vec)
Out[332]: 
array([[12. ,  4. ],
       [ 6. ,  1. ],
       [12. ,  4. ],
       [ 6. ,  3.1]])
In [333]: mat.sum(axis=1)*vec.sum(axis=1)[:,None]
Out[333]: 
array([[12. ,  4. ],
       [ 6. ,  1. ],
       [12. ,  4. ],
       [ 6. ,  3.1]])

The j and l dimensions don't appear on the right, so they are summed. They can be summed before multiplying because they each appear in only one term. I added the None to enable broadcasting (multiplying an ik with an i).

np.einsum('ik,i->ik', mat.sum(axis=1), vec.sum(axis=1))

If you'd stacked on the first axis instead, and added a trailing dimension to vec (making it (2,4,1)), it would matmul with the (2,4,4) mat: mat @ vec[...,None].

In [336]: v1 = vec.T            # stack axis first, shape (2, 4)
In [337]: m1 = mat.transpose(2,0,1)
In [338]: m1@v1[...,None]
Out[338]: 
array([[[ 4.  ],
        [ 4.  ],
        [ 4.  ],
        [ 4.  ]],

       [[-0.5 ],
        [ 0.  ],
        [ 0.5 ],
        [ 3.55]]])
In [339]: _.shape
Out[339]: (2, 4, 1)
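
Squeezing the trailing length-1 axis and moving the stack axis back to the end recovers the original outvec (a quick check reusing m1, v1 and outvec from above):

np.allclose((m1 @ v1[..., None])[..., 0].T, outvec)   # True
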

einsum is easy (once you have played with permutations of indices for a while, that is...).

Let's work with something simple: a triple stack of 2×2 matrices and a triple stack of length-2 vectors.

import numpy as np

a = np.arange(3*2*2).reshape((3,2,2))
b = np.arange(3*2).reshape((3,2))

We need to know what we are going to compute using einsum:

In [101]: for i in range(3): 
     ...:     print(a[i]@b[i])                                                                            
[1 3]
[23 33]
[77 95]

What have we done? We have an index i that is fixed when we perform a dot product between one of the stacked matrices and one of the stacked vectors (both indexed by i), and each individual output row implies a summation over the last index of the stacked matrix and the lone index of the stacked vector.

This is easily encoded in an einsum directive:

  • we want the same i index to specify the matrix, the vector and also the output,
  • we want to reduce along the last matrix index and the remaining vector index, say k,
  • we want to have as many columns in the output as rows in each stacked matrix, say j.

Hence

In [102]: np.einsum('ijk,ik->ij', a, b)                                                                   
Out[102]: 
array([[ 1,  3],
       [23, 33],
       [77, 95]])
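
As a cross-check, the same result also comes out of the batched matmul form discussed in the previous answer (a small sketch reusing a and b):

np.allclose(np.einsum('ijk,ik->ij', a, b), (a @ b[..., None])[..., 0])   # True
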

I hope that my discussion of how I got the directive right is clear, correct and useful.
