
Splitting matrix multiplication using einsum

I have a large data matrix and I want to calculate its similarity matrix, but because of memory limitations I want to split the calculation into pieces.

Let's assume I have the following (for this example I have used a smaller matrix):

data1 = data/np.linalg.norm(data,axis=1)[:,None]

(Pdb) data1
array([[ 0.        ,  0.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       [ 0.04777415,  0.00091094,  0.01326067, ...,  0.        ,
         0.        ,  0.        ],
       ...,
       [ 0.        ,  0.01503281,  0.00655707, ...,  0.        ,
         0.        ,  0.        ],
       [ 0.00418038,  0.00308079,  0.01893477, ...,  0.        ,
         0.        ,  0.        ],
       [ 0.06883803,  0.        ,  0.0209448 , ...,  0.        ,
         0.        ,  0.        ]])

Then I try to do the following:

similarity_matrix[n1:n2,m1:m2] = np.einsum('ik,jk->ij', data1[n1:n2,:], data1[m1:m2,:])

n1, n2, m1, m2 are calculated as follows (df is a data frame):

data = df.values
m, k = data.shape
n1=0; n2=m/2; m1=n2+1; m2=m;

But the error is:

(Pdb) similarity_matrix[n1:n2,m1:m2] = np.einsum('ik,jk->ij', data1[n1:n2,:], data1[m1:m2,:])
*** NameError: name 'similarity_matrix' is not defined

Didn't you do something like

similarity_matrix = np.empty((N,M),dtype=float)

at the start of your calculations?

You can't index an array, on either side of an assignment, before you create it.
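As a minimal sketch of that point, assuming the full matrix does fit in memory: allocate similarity_matrix first, then fill it block by block with the same einsum call. The random data here is only a stand-in for df.values. Two details differ from the question and are my assumptions about what was intended: m // 2 is used so the index stays an integer on Python 3, and the second block starts at n2 rather than n2+1, because slices exclude their end index and n2+1 would skip row n2.

```python
import numpy as np

# Stand-in for df.values; the real data comes from the asker's data frame.
rng = np.random.default_rng(0)
data = rng.random((6, 4))
data1 = data / np.linalg.norm(data, axis=1)[:, None]

m, k = data.shape

# Create the target array BEFORE assigning into slices of it.
similarity_matrix = np.empty((m, m), dtype=float)

n1, n2 = 0, m // 2   # integer division keeps the slice bounds integers
m1, m2 = n2, m       # slices are half-open, so the second block starts at n2

# Fill all four blocks: every row range against every column range.
row_blocks = [(n1, n2), (m1, m2)]
for r0, r1 in row_blocks:
    for c0, c1 in row_blocks:
        similarity_matrix[r0:r1, c0:c1] = np.einsum(
            'ik,jk->ij', data1[r0:r1, :], data1[c0:c1, :]
        )
```

Since the rows of data1 are normalized, the result should match data1.dot(data1.T), with ones on the diagonal.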

If that full (N,M) matrix is too big for memory, then just assign your einsum value to another variable, and work with that.

partial_matrix = np.einsum...

How you relate that partial_matrix to the virtual similarity_matrix is a different issue.
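One possible way to work with partial matrices only, sketched under assumptions: compute one block of rows at a time, reduce it to whatever you actually need, and discard it. The helper name iter_similarity_blocks, the block_size parameter, and the "most similar other row" reduction are all hypothetical choices for illustration; the question does not say what the similarity matrix is used for.

```python
import numpy as np

def iter_similarity_blocks(data1, block_size):
    """Yield (start, stop, partial_matrix) for consecutive blocks of rows.

    Each partial_matrix has shape (stop - start, m): the similarities of
    rows start..stop-1 against all m rows. Hypothetical helper.
    """
    m = data1.shape[0]
    for start in range(0, m, block_size):
        stop = min(start + block_size, m)
        partial_matrix = np.einsum('ik,jk->ij', data1[start:stop, :], data1)
        yield start, stop, partial_matrix

# Stand-in for the normalized data from the question.
rng = np.random.default_rng(0)
data = rng.random((7, 4))
data1 = data / np.linalg.norm(data, axis=1)[:, None]

# Example reduction: index of the most similar OTHER row, for every row.
nearest = np.empty(data1.shape[0], dtype=int)
for start, stop, partial_matrix in iter_similarity_blocks(data1, block_size=3):
    rows = np.arange(stop - start)
    partial_matrix[rows, start + rows] = -np.inf  # ignore self-similarity
    nearest[start:stop] = partial_matrix.argmax(axis=1)
```

Only one (block_size, m) block exists at a time, so peak memory is set by block_size rather than by m. If even that is too large, the columns can be split the same way.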
