Essentially, I have to compute a lot of matrix-vector products, where the 3x3 matrices are stored in the last two axes of an (n x n x 3 x 3) NumPy array and the vectors are the last axis of a corresponding (n x n x 3) array.
Currently I'm looping rather inelegantly over the first two indices and computing each matrix-vector product separately:
import numpy as np

length = 1000
x = np.random.rand(length, length, 3)
A = np.random.rand(length, length, 3, 3)
result = np.zeros((length, length, 3))
for i in range(length):
    for j in range(length):
        result[i, j, :] = A[i, j, :, :].dot(x[i, j, :])
This, however, is quite slow. Is there a more efficient way that avoids the i, j loop? I'm still learning the NumPy methods. ;)
Also, in my real-life case the matrices are very sparse (more than 99.9% of entries are zero), so bonus points for anyone who can give me hints on how to exploit that.
Thank you very much!
You could use np.einsum to perform the matrix-vector dot products over both i and j:

>>> np.einsum('ijkl,ijl->ijk', A, x)
>>> np.allclose(np.einsum('ijkl,ijl->ijk', A, x), result)
True
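If it helps to see what the subscript string means, here is a small self-contained sketch (array sizes chosen arbitrarily): the index l is repeated across the inputs and absent from the output, so einsum sums over it, while i, j, k pass straight through.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 4, 3, 3))  # stack of 3x3 matrices
x = rng.random((4, 4, 3))     # stack of length-3 vectors

# 'ijkl,ijl->ijk': l is repeated and missing from the output, so it is summed over
out = np.einsum('ijkl,ijl->ijk', A, x)

# the same contraction written out with broadcasting and an explicit sum
manual = (A * x[:, :, None, :]).sum(axis=-1)
print(np.allclose(out, manual))  # True
```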
You can use broadcasting with the matmul operator (@) by temporarily introducing an extra dimension on x:
result = (A @ x[:,:,:,None]).squeeze(axis=-1)
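For completeness, a minimal runnable check of this approach (sizes chosen arbitrarily): x[:, :, :, None] turns each vector into a 3x1 column matrix so that @ broadcasts the matrix multiplication over the leading axes, and squeeze removes the trailing singleton axis again.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 5, 3, 3))
x = rng.random((5, 5, 3))

# x[:, :, :, None] has shape (5, 5, 3, 1); @ matmuls over the last two axes
result = (A @ x[:, :, :, None]).squeeze(axis=-1)
print(result.shape)  # (5, 5, 3)

# cross-check against an equivalent contraction
expected = np.einsum('ijkl,ijl->ijk', A, x)
print(np.allclose(result, expected))  # True
```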
There are already some answers to this question, but I have a different method that a friend of mine taught me, and I want to share it.
When you have a matrix-vector (or matrix-matrix) product, say A of shape NxM and x of length M, you can compute the product in NumPy by reshaping x, taking the element-wise product, and then summing over the last dimension:
import numpy as np

n = 10
m = 3
A = np.random.randn(n, m)
x = np.random.randn(m)
result1 = A.dot(x)
result2 = (A * x.reshape(1, -1)).sum(-1)
print(np.allclose(result1, result2))
# prints True
Using that, you can solve your problem for much larger stacks of matrices:
import numpy as np

def method_1(A, x, n, m):  # OP
    result = np.zeros((n, n, m))
    for i in range(n):
        for j in range(n):
            result[i, j, :] = A[i, j, :, :].dot(x[i, j, :])
    return result

def method_2(A, x, n, m):  # Me
    return (A * x.reshape(n, n, 1, m)).sum(-1)

def method_3(A, x, n, m):  # orlp
    return (A @ x[:, :, :, None]).squeeze(axis=-1)

def method_4(A, x, n, m):  # ivan
    return np.einsum('ijkl,ijl->ijk', A, x)
n = 7
m = 3
A = np.random.randn(n,n,m,m)
x = np.random.randn(n,n,m)
r1 = method_1(A,x,n,m)
r2 = method_2(A,x,n,m)
r3 = method_3(A,x,n,m)
r4 = method_4(A,x,n,m)
r12 = np.allclose(r1,r2)
r13 = np.allclose(r1,r3)
r14 = np.allclose(r1,r4)
print(" same result?")
print(f"op-me: {r12}, op-orlp: {r13}, op-ivan: {r14}")
#same result?
#op-me: True, op-orlp: True, op-ivan: True
Finally, I compared the speed of all these methods:
from time import time
import matplotlib.pyplot as plt
nList = [10,100,1000,10000]
times_1 =[]
times_2 =[]
times_3 =[]
times_4 =[]
for n in nList:
    print(n)
    A = np.random.randn(n, n, m, m)
    x = np.random.randn(n, n, m)
    t1 = time()
    r1 = method_1(A, x, n, m)
    t2 = time()
    r2 = method_2(A, x, n, m)
    t3 = time()
    r3 = method_3(A, x, n, m)
    t4 = time()
    r4 = method_4(A, x, n, m)
    t5 = time()
    times_1.append(t2 - t1)
    times_2.append(t3 - t2)
    times_3.append(t4 - t3)
    times_4.append(t5 - t4)
#plot times
plt.figure()
plt.loglog(nList,times_1,'o-',label="op")
plt.loglog(nList,times_2,'o-',label="me")
plt.loglog(nList,times_3,'o-',label="orlp")
plt.loglog(nList,times_4,'o-',label="ivan")
plt.legend()
plt.show()
The results show that the method proposed by Ivan (np.einsum) is the fastest one.
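One small addition that I have not benchmarked here, so treat it as a hint rather than a measured result: np.einsum accepts an optimize keyword that lets it search for a better contraction order and in some cases dispatch to faster routines. For this particular two-operand contraction the effect may be small, but it costs only one keyword to try:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((100, 100, 3, 3))
x = rng.random((100, 100, 3))

# same contraction, with and without path optimization; results are identical
r_default = np.einsum('ijkl,ijl->ijk', A, x)
r_opt = np.einsum('ijkl,ijl->ijk', A, x, optimize=True)
print(np.allclose(r_default, r_opt))  # True
```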