
Combining element-wise and matrix multiplication with multi-dimensional arrays in NumPy

I have two multidimensional NumPy arrays, A and B, with A.shape = (K, d, N) and B.shape = (K, N, d). I would like to perform an element-wise operation over axis 0 (K), with that operation being matrix multiplication over axes 1 and 2 (d, N and N, d). So the result should be a multidimensional array C with C.shape = (K, d, d), so that C[k] = np.dot(A[k], B[k]). A naive implementation would look like this:

C = np.vstack([np.dot(A[k], B[k])[np.newaxis, :, :] for k in range(K)])

but this implementation is slow. A slightly faster approach looks like this:

C = np.dot(A, B)[:, :, 0, :]

which uses the default behaviour of np.dot on multidimensional arrays, giving me an array with shape (K, d, K, d). However, this approach computes the required answer K times (the entries along axis 2 are all the same). Asymptotically it will be slower than the first approach, but the overhead is much less. I am also aware of the following approach:

from numpy.core.umath_tests import matrix_multiply
C = matrix_multiply(A, B)

but I am not guaranteed that this function will be available. My question is thus: does NumPy provide a standard way of doing this efficiently? An answer which applies to multidimensional arrays in general would be perfect, but an answer specific to only this case would be great too.

Edit: As pointed out by @Juh_, the second approach is incorrect. The correct version is:

C = np.dot(A, B).diagonal(axis1=0, axis2=2).transpose(2, 0, 1)

but the added overhead makes it slower than the first approach, even for small matrices. The last approach is winning by a long shot on all my timing tests, for small and large matrices. I'm now strongly considering using it if no better solution crops up, even if that would mean copying the numpy.core.umath_tests library (written in C) into my project.

A possible solution to your problem is:

C = np.sum(A[:, :, :, np.newaxis] * B[:, np.newaxis, :, :], axis=2)

However:

  1. it is quicker than the vstack approach only if K is much bigger than d and N;
  2. there might be a memory issue: in the above solution a K×d×N×d array is allocated (i.e. all possible product pairs, before summing). In fact I could not test with big K, d and N because I ran out of memory.
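To make point 2 concrete, here is a rough sketch (the sizes are made up for illustration) of how large that intermediate array gets compared with the result:

```python
import numpy as np

# Hypothetical sizes, just for illustration.
K, d, N = 100, 50, 40
A = np.random.rand(K, d, N)
B = np.random.rand(K, N, d)

# The broadcast product materializes a full (K, d, N, d) intermediate
# (every product pair) before the sum over axis 2 collapses it to (K, d, d).
intermediate_bytes = K * d * N * d * 8   # float64 = 8 bytes per element
result_bytes = K * d * d * 8
print(intermediate_bytes / 1e6, "MB")    # 80.0 MB intermediate
print(result_bytes / 1e6, "MB")          # vs 2.0 MB for the result

C = np.sum(A[:, :, :, np.newaxis] * B[:, np.newaxis, :, :], axis=2)
assert C.shape == (K, d, d)
```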

By the way, note that:

C = np.dot(A, B)[:, :, 0, :]

does not give the correct result. It had me fooled, because I first checked my method by comparing its results to those given by this np.dot command.

I have this same issue in my project. The best I've been able to come up with is the following, which I think is a little faster (maybe 10%) than using vstack:

K, d, N = A.shape
C = np.empty((K, d, d))
for k in range(K):
    C[k] = np.dot(A[k], B[k])

I'd love to see a better solution; I can't quite see how one would use tensordot to do this.
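For what it's worth, a small sketch of why tensordot alone doesn't get there: it has no notion of a shared batch axis, so contracting the N axes pairs every A[k] with every B[l], reproducing the redundant (K, d, K, d) result of np.dot:

```python
import numpy as np

K, d, N = 4, 3, 5
A = np.random.rand(K, d, N)
B = np.random.rand(K, N, d)

# Contract A's last axis with B's middle axis: the two K axes are NOT
# matched up, so we get T[k, i, l, j] = sum_n A[k, i, n] * B[l, n, j].
T = np.tensordot(A, B, axes=([2], [1]))
assert T.shape == (K, d, K, d)

# The wanted (K, d, d) result is the diagonal over the two K axes,
# i.e. the same post-processing the question's edit applies to np.dot.
C = T.diagonal(axis1=0, axis2=2).transpose(2, 0, 1)
assert C.shape == (K, d, d)
```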

A very flexible, compact, and fast solution:

C = np.einsum('Kab,Kbc->Kac', A, B, optimize=True)

Confirmation:

import numpy as np
K = 10
d = 5
N = 3
A = np.random.rand(K,d,N)
B = np.random.rand(K,N,d)
C_old = np.dot(A, B).diagonal(axis1=0, axis2=2).transpose(2, 0, 1)
C_new = np.einsum('Kab,Kbc->Kac', A, B)
print(np.max(C_old-C_new))  # should be 0 or a very small number

For large multi-dimensional arrays, the optional parameter optimize=True can save you a lot of time. You can learn about einsum here:

https://ajcr.net/Basic-guide-to-einsum/

https://rockt.github.io/2018/04/30/einsum

https://numpy.org/doc/stable/reference/generated/numpy.einsum.html

Quote:

The Einstein summation convention can be used to compute many multi-dimensional, linear algebraic array operations. einsum provides a succinct way of representing these. A non-exhaustive list of these operations is:

  • Trace of an array, numpy.trace.

  • Return a diagonal, numpy.diag.

  • Array axis summations, numpy.sum.

  • Transpositions and permutations, numpy.transpose.

  • Matrix multiplication and dot product, numpy.matmul numpy.dot.

  • Vector inner and outer products, numpy.inner numpy.outer.

  • Broadcasting, element-wise and scalar multiplication, numpy.multiply.

  • Tensor contractions, numpy.tensordot.

  • Chained array operations, in efficient calculation order, numpy.einsum_path.
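As a quick illustrative sketch, a few of the operations listed above written as einsum subscripts:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = np.arange(9, 18).reshape(3, 3)

assert np.einsum('ii', a) == np.trace(a)                  # trace
assert np.array_equal(np.einsum('ii->i', a), np.diag(a))  # diagonal
assert np.array_equal(np.einsum('ij->ji', a), a.T)        # transpose
assert np.allclose(np.einsum('ij,jk->ik', a, b), a @ b)   # matmul
assert np.einsum('ij->', a) == a.sum()                    # full summation
```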

You can do

np.matmul(A, B)

Look at https://numpy.org/doc/stable/reference/generated/numpy.matmul.html.

Should be faster than einsum for big enough K.
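A quick check (shapes made up) that matmul batches over the leading axis and agrees with the explicit per-slice loop:

```python
import numpy as np

K, d, N = 10, 5, 3
A = np.random.rand(K, d, N)
B = np.random.rand(K, N, d)

# matmul treats all leading axes as a batch: C[k] = A[k] @ B[k].
C = np.matmul(A, B)      # equivalently: C = A @ B
assert C.shape == (K, d, d)

C_loop = np.stack([np.dot(A[k], B[k]) for k in range(K)])
assert np.allclose(C, C_loop)
```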
