
Python kernel dies when performing SVD on a sparse symmetric matrix

I would like to reproduce the SVD method described in a Stanford lecture on my own dataset. The lecture slide is as follows:

[Image: Stanford lecture slide]

My dataset is of the same type: a word co-occurrence matrix M of size

<13840x13840 sparse matrix of type '<type 'numpy.int64'>' 
with 597828 stored elements in Compressed Sparse Column format>

It was generated and processed from CountVectorizer(). Note that this is a symmetric matrix.

However, when I tried to extract features with SVD, none of the following code worked.

1st try:

scipy.linalg.svd(M)

I converted the matrix from sparse csr with todense() and toarray(). My computer ran for quite a few minutes and then reported that the kernel had stopped. I also played around with different parameter settings.

2nd try:

scipy.sparse.linalg.svds(M)

I also tried changing the matrix type from int64 to float64, but the kernel died after 30 seconds or so.

Could anyone suggest a way to perform SVD on this matrix?

Thank you so much

It seems that the matrix is too demanding for the available memory. You have several options:

  1. Perform an adaptive SVD,
  2. Use modred,
  3. Use the SVD from dask.

The latter two should work out of the box. All of these options load only as much data as fits in memory.
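
As a separate illustration (this is not one of the three options above), here is a minimal sketch of a truncated SVD using scipy.sparse.linalg.svds, the function from the question, with an explicit float64 conversion and a small k. The helper names dense_svd_gib and truncated_svd, the choice k=10, and the random 500x500 stand-in matrix are all assumptions made for illustration. The sketch also estimates how much memory a full dense SVD needs at minimum, which may explain why the todense() attempt ran out of memory. Whether it avoids the kernel crash described in the question is not something the sketch can confirm.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds


def dense_svd_gib(n, itemsize=8):
    """Lower bound (GiB) for a full dense SVD of an n x n matrix:
    the dense input plus the U and Vt factors, ignoring LAPACK workspace."""
    return 3 * n * n * itemsize / 2**30


def truncated_svd(M, k=10):
    """Top-k singular triplets of a sparse matrix, largest singular value first.
    (Hypothetical helper, not part of SciPy.)"""
    M = sp.csc_matrix(M, dtype=np.float64)  # svds needs a floating-point dtype
    U, s, Vt = svds(M, k=k)
    order = np.argsort(s)[::-1]  # sort explicitly instead of relying on svds ordering
    return U[:, order], s[order], Vt[order, :]


# Small random symmetric sparse matrix as a stand-in for the co-occurrence matrix.
A = sp.random(500, 500, density=0.01, format="csc", random_state=0)
A = A + A.T

U, s, Vt = truncated_svd(A, k=10)
word_vectors = U * s  # one k-dimensional row per word

print(word_vectors.shape)
print(f"dense full SVD of a 13840 x 13840 float64 matrix: >= {dense_svd_gib(13840):.1f} GiB")
```

Only k singular triplets are computed, so the dense 13840 x 13840 factors are never formed. For word vectors, a k in the range of a few tens to a few hundred is a common choice, but the right value depends on the data.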
