Python kernel dies when performing SVD on a sparse symmetric matrix

Question

I would like to reproduce the SVD method mentioned in a Stanford lecture on my own dataset. The slide from the lecture is as follows:

My dataset is of the same type, which is a word co-occurrence matrix M with a size of

<13840x13840 sparse matrix of type '<type 'numpy.int64'>' 
with 597828 stored elements in Compressed Sparse Column format>

generated and processed from CountVectorizer(). Note that this is a symmetric matrix.

However, when I tried to extract features with SVD, none of the following code worked:

1st try:

scipy.linalg.svd(M)

I converted the matrix from sparse csr with todense() and toarray(); my computer took quite a few minutes and then reported that the kernel had stopped. I also played around with different parameter settings.
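
For scale, here is a rough estimate of what the dense route asks for. It is only a sketch: it assumes float64 storage and that the full U and Vt are materialized next to the dense copy of M, and it ignores any LAPACK workspace. The helper name dense_svd_bytes is made up for illustration.

```python
def dense_svd_bytes(n, itemsize=8):
    """Bytes for a dense n x n input plus full U and Vt (assumed float64)."""
    one_matrix = n * n * itemsize   # a single dense n x n array
    return one_matrix, 3 * one_matrix  # (input alone, input + U + Vt)


n = 13840  # matrix size from the question
one_matrix, total = dense_svd_bytes(n)
one_gib = one_matrix / 2**30   # about 1.4 GiB for the dense copy alone
total_gib = total / 2**30      # roughly three times that, before any workspace
print(f"dense M: {one_gib:.2f} GiB, M + U + Vt: {total_gib:.2f} GiB")
```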

2nd try:

scipy.sparse.linalg.svds(M)

I have also tried changing the matrix type from int64 to float64; however, the kernel died after 30 seconds or so.

Could anyone suggest a way to run SVD on this matrix?

Thank you so much

Answer 1

It seems that the matrix is too large for the available memory. You have several options:

Perform an adaptive SVD,
Use modred,
Use the SVD from dask.

The latter two should work out of the box. All of these options load only as much data as fits in memory.
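
If only the leading components are needed as features, a truncated SVD on the sparse matrix may be worth retrying before reaching for the tools above. This is not one of the three options listed; it is the asker's second try (scipy.sparse.linalg.svds) with an explicit float64 cast and an explicit k. The sketch below is an illustration only: the small random symmetric count matrix stands in for M, and k=10 and the helper name truncated_svd are arbitrary, hypothetical choices.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import svds


def truncated_svd(matrix, k):
    """Return the k largest singular triplets, largest singular value first."""
    # Cast explicitly: the matrix in the question is int64, and the
    # iterative solver is meant to run on floating-point data.
    matrix = matrix.astype(np.float64)
    u, s, vt = svds(matrix, k=k)
    # Do not rely on the order svds returns; sort in descending order.
    order = np.argsort(s)[::-1]
    return u[:, order], s[order], vt[order, :]


# Stand-in data (not the asker's): a sparse symmetric matrix of integer counts.
rng = np.random.default_rng(0)
n = 200
mask = rng.random((n, n)) < 0.05
dense = rng.integers(1, 5, size=(n, n)) * mask
counts = sparse.csc_matrix(dense + dense.T)

U, s, Vt = truncated_svd(counts, k=10)
print(U.shape, s.shape, Vt.shape)
```

Each row of U is then a k-dimensional vector for the corresponding row of the matrix. The matrix stays sparse throughout, so the dense n x n copy is never created.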

solution1
1 ACCEPTED 2017-10-02 05:37:59
