
Fast iteration over matrix-vector products in multidimensional numpy array

Essentially I have to do a lot of matrix-vector products, where the 3x3 matrices are stored as the last two axes of an (n, n, 3, 3) numpy array and the vectors are the last axis of a corresponding (n, n, 3) array.

Currently I'm looping quite inelegantly over the first two indices and calculating each matrix-vector product separately:

length=1000
x=np.random.rand(length,length,3)
A=np.random.rand(length,length,3,3)
result=np.zeros((length,length,3))

for i in range(0,length):
   for j in range(0,length):
      result[i,j,:]=A[i,j,:,:].dot(x[i,j,:])

This, however, is quite slow. Is there a more efficient way to avoid this i, j loop? I'm still trying to learn the numpy methods. ;)

Also, in my real-life case, the matrices are very sparse (>0.999 sparsity), so bonus points for anyone who can give me some hints on how to incorporate that.

Thank you very much!

You could use np.einsum to perform the matrix/vector dot products over both i and j:

>>> np.einsum('ijkl,ijl->ijk', A, x)

>>> np.allclose(np.einsum('ijkl,ijl->ijk', A, x), result)
True

You can use broadcasting with the matrix-multiplication operator (@) by temporarily introducing an extra dimension into x:

result = (A @ x[:,:,:,None]).squeeze(axis=-1)
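A quick sanity check of this broadcasting approach against the question's explicit loop, on a smaller grid than the question's length=1000:

```python
import numpy as np

length = 50  # smaller grid, just for checking
A = np.random.rand(length, length, 3, 3)
x = np.random.rand(length, length, 3)

# Broadcast matmul: add a trailing length-1 axis to x so @ treats it
# as a stack of (3, 1) column vectors, then drop that axis again.
result = (A @ x[:, :, :, None]).squeeze(axis=-1)

# Reference: the explicit double loop from the question.
expected = np.zeros((length, length, 3))
for i in range(length):
    for j in range(length):
        expected[i, j, :] = A[i, j, :, :].dot(x[i, j, :])

print(np.allclose(result, expected))  # True
```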

There are already some answers to this question, but I have a different method that a friend of mine taught me, and I want to share it.

When you have a matrix-vector (or matrix-matrix) product, supposing A is NxM and x has length M, you can compute the product in numpy by reshaping x, taking the classical elementwise product, and then summing over the last dimension:

n=10
m=3
A = np.random.randn(n,m)
x = np.random.randn(m)

result1 = A.dot(x)
result2 = (A*x.reshape(1,-1)).sum(-1)
print(np.allclose(result1,result2))
# prints True

Using that, you can solve your problem for larger arrays:

import numpy as np

def method_1(A,x,n,m): #OP
    result=np.zeros((n,n,m))
    for i in range(0,n):
        for j in range(0,n):
            result[i,j,:]=A[i,j,:,:].dot(x[i,j,:])
    return result

def method_2(A,x,n,m): #Me 
    return (A* x.reshape(n,n,1,m)).sum(-1)

def method_3(A,x,n,m): #orlp
    return (A @ x[:,:,:,None]).squeeze(axis=-1)

def method_4(A,x,n,m): #ivan
    return np.einsum('ijkl,ijl->ijk', A, x)

n= 7
m = 3
A = np.random.randn(n,n,m,m)
x = np.random.randn(n,n,m)

r1 = method_1(A,x,n,m)
r2 = method_2(A,x,n,m)
r3 = method_3(A,x,n,m)
r4 = method_4(A,x,n,m)

r12 = np.allclose(r1,r2)
r13 = np.allclose(r1,r3)
r14 = np.allclose(r1,r4)

print(" same result?")
print(f"op-me: {r12}, op-orlp: {r13}, op-ivan: {r14}")

#same result?
#op-me: True, op-orlp: True, op-ivan: True

Finally, I compared the speed of all these methods:

from time import time
import matplotlib.pyplot as plt
nList = [10,100,1000,10000]  # note: n = 10000 allocates a ~7 GB array for A alone

times_1 =[]
times_2 =[]
times_3 =[]
times_4 =[]
for n in nList:
    print(n)
    A = np.random.randn(n,n,m,m)
    x = np.random.randn(n,n,m)
    t1 = time()
    r1 = method_1(A,x,n,m)
    t2 = time()
    r2 = method_2(A,x,n,m)
    t3 = time()
    r3 = method_3(A,x,n,m)
    t4 = time()
    r4 = method_4(A,x,n,m)
    t5 = time()

    times_1.append(t2-t1)
    times_2.append(t3-t2)
    times_3.append(t4-t3)
    times_4.append(t5-t4)

#plot times
plt.figure()
plt.loglog(nList,times_1,'o-',label="op")
plt.loglog(nList,times_2,'o-',label="me")
plt.loglog(nList,times_3,'o-',label="orlp")
plt.loglog(nList,times_4,'o-',label="ivan")
plt.legend()

The results show that the method proposed by Ivan (np.einsum) is the fastest.
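The question also asks for hints on exploiting the sparsity of the matrices (>0.999), which none of the comparisons above address. One possible sketch, assuming scipy is available: view the (n, n, 3, 3) array as a block-diagonal sparse matrix via scipy.sparse.bsr_matrix, convert to CSR to drop the explicit zeros, and multiply by the flattened vector:

```python
import numpy as np
from scipy.sparse import bsr_matrix

n, m = 100, 3
A = np.random.rand(n, n, m, m)
A[A < 0.999] = 0.0  # make the entries very sparse, as in the question
x = np.random.rand(n, n, m)

# Stack the n*n 3x3 blocks on the diagonal of an (n*n*3, n*n*3) BSR matrix:
nblocks = n * n
S = bsr_matrix(
    (A.reshape(nblocks, m, m),   # block data
     np.arange(nblocks),         # column index of each block
     np.arange(nblocks + 1)),    # exactly one block per block-row
    shape=(nblocks * m, nblocks * m),
)
S = S.tocsr()
S.eliminate_zeros()  # drop the stored zeros so only ~0.1% of entries remain

result = (S @ x.reshape(-1)).reshape(n, n, m)

# Agrees with the dense einsum version:
print(np.allclose(result, np.einsum('ijkl,ijl->ijk', A, x)))  # True
```

Whether this beats the dense einsum depends on n and the actual sparsity pattern; the CSR conversion has a one-time cost, so it pays off mainly when the same A is reused for many vectors.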
