Python kernel dies when performing SVD on a sparse symmetric matrix

Question

I would like to reproduce the SVD method mentioned in a Stanford lecture on my own dataset. The relevant slide from the lecture is as follows:

My dataset is of the same type: a word co-occurrence matrix M of size

<13840x13840 sparse matrix of type '<type 'numpy.int64'>' 
with 597828 stored elements in Compressed Sparse Column format>

generated and processed from CountVectorizer(). Note that this is a symmetric matrix.

However, when I tried to extract features with SVD, none of the following attempts worked.

1st try:

scipy.linalg.svd(M)

I converted the matrix from sparse csr to dense with both todense() and toarray(). My computer ran for quite a few minutes and then reported that the kernel had stopped. I also played around with different parameter settings.
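For context on why the dense conversion is so heavy, the size of the dense array can be estimated from the shape printed above. This is only a back-of-the-envelope sketch: `dense_bytes` is a hypothetical helper, and the figures cover a single dense copy of M only. A full `scipy.linalg.svd` additionally allocates the U and Vh factors, each as large as M, plus work arrays.

```python
def dense_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed for one dense n_rows x n_cols array (itemsize=8 for int64/float64)."""
    return n_rows * n_cols * itemsize


n = 13840      # matrix dimension, taken from the question
nnz = 597828   # stored elements reported for the sparse matrix

dense_size = dense_bytes(n, n)   # one dense copy of M
density = nnz / (n * n)          # fraction of entries that are actually stored

print(f"one dense copy: {dense_size / 2**30:.2f} GiB")
print(f"density of M:   {density:.4%}")
```

One dense copy comes to roughly 1.4 GiB, while well under 1% of the entries are non-zero, so staying in a sparse format avoids most of that cost.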

2nd try:

scipy.sparse.linalg.svds(M)

I have also tried changing the matrix type from int64 to float64; however, the kernel died after 30 seconds or so.
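For reference, a truncated call with an explicit number of components looks roughly like the sketch below. It runs on a small synthetic symmetric matrix (`M_demo`, an assumption standing in for the real M), and the value of `k` is arbitrary. It shows the call pattern only and is not claimed to resolve the crash described above.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

# Small synthetic stand-in for M (assumption: the real matrix is built elsewhere).
R = sp.random(200, 200, density=0.05, format="csc", random_state=0)
M_demo = (R + R.T).tocsc()   # symmetric, like a co-occurrence matrix

k = 5                        # number of singular triplets to keep (arbitrary here)

# svds works on floating-point data, so cast explicitly before calling it.
u, s, vt = svds(M_demo.astype(np.float64), k=k)

# Reorder so that the largest singular value comes first.
order = np.argsort(s)[::-1]
u, s, vt = u[:, order], s[order], vt[order, :]

print(u.shape, s.shape, vt.shape)
```

Only k singular triplets are computed, so the factors are n x k rather than n x n and the matrix never has to be made dense.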

Could anyone suggest a way to perform SVD on this matrix?

Thank you so much

Answer 1

It seems that the matrix is too demanding for the available memory. You have several options:

Perform an adaptive SVD,
Use modred,
Use the SVD from dask.

The latter two should work out of the box. All of these options load only as much data as fits in memory.
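As an illustration of the dask option, here is a minimal sketch using `dask.array.linalg.svd_compressed`, which computes an approximate rank-k SVD of a chunked array. The demo matrix, the chunk size, and k are all assumptions chosen for a quick run. The real co-occurrence matrix would first have to be exposed to dask as a chunked array, which is not shown here.

```python
import numpy as np
import dask.array as da

# Small symmetric demo matrix (assumption: stands in for the real M).
rng = np.random.default_rng(0)
dense = rng.random((500, 500))
dense = dense + dense.T

# Split the array into blocks so dask can process it piece by piece.
x = da.from_array(dense, chunks=(100, 100))

# Approximate rank-k SVD using a randomized algorithm; k is arbitrary here.
k = 5
u, s, v = da.linalg.svd_compressed(x, k=k, seed=0)

# Nothing is evaluated until compute() is called.
s_top = s.compute()
print(u.shape, s_top.shape, v.shape)
```

The result is an approximation, so the singular values will differ slightly from those of an exact decomposition.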

Accepted 2017-10-02 05:37:59
