
sklearn tsne with sparse matrix

I'm trying to run t-SNE on a very sparse matrix of precomputed distance values, but I'm having trouble with it.

It boils down to this:

import numpy as np
from scipy.sparse import csc_matrix
from sklearn.manifold import TSNE

row = np.array([0, 2, 2, 0, 1, 2])
col = np.array([0, 0, 1, 2, 2, 2])
distances = np.array([.1, .2, .3, .4, .5, .6])
X = csc_matrix((distances, (row, col)), shape=(3, 3))
Y = TSNE(metric='precomputed').fit_transform(X)

However, I get this error:

TypeError: A sparse matrix was passed, but dense data is required for method="barnes_hut". Use X.toarray() to convert to a dense numpy array if the array is small enough for it to fit in memory. Otherwise consider dimensionality reduction techniques (e.g. TruncatedSVD)

I don't want to run TruncatedSVD, since I have already computed the distances.

If I switch to method='exact', I get a different error (which seems somewhat questionable):

NotImplementedError: >= and <= don't work with 0.

NOTE: my distance matrix is about 100k x 100k, with approximately 1M non-zero values.

Any ideas?

I think this should solve your problem:

X = csr_matrix((distances, (row, col)), shape=(3, 3)).toarray()

That is, if you really meant csr_matrix instead of csc_matrix.
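For reference, here is a minimal sketch of the dense route on the toy data from the question. It is not a fix for the 100k x 100k case. The settings method="exact", init="random" and perplexity=1.5 are assumptions chosen so that a 3-point example runs on recent scikit-learn releases; they are not from the original posts. The helper dense_size_gb is hypothetical and only estimates how much memory a dense float64 copy would need.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.manifold import TSNE


def dense_size_gb(n_rows, n_cols, itemsize=8):
    """Hypothetical helper: size of a dense float64 array, in gigabytes."""
    return n_rows * n_cols * itemsize / 1e9


# Toy data from the question.
row = np.array([0, 2, 2, 0, 1, 2])
col = np.array([0, 0, 1, 2, 2, 2])
distances = np.array([.1, .2, .3, .4, .5, .6])
X_sparse = csr_matrix((distances, (row, col)), shape=(3, 3))

# toarray() returns a plain ndarray. Entries that were not stored become 0.0,
# so t-SNE reads them as "distance zero", not as "distance unknown".
X_dense = X_sparse.toarray()

# Assumed settings for a 3-point toy example: the perplexity has to stay
# below the number of samples, and init="random" avoids PCA initialisation,
# which is not available for precomputed distances.
tsne = TSNE(metric="precomputed", method="exact", init="random",
            perplexity=1.5, random_state=0)
Y = tsne.fit_transform(X_dense)
print(Y.shape)  # (3, 2)

# Rough memory check for the matrix size mentioned in the question.
print(dense_size_gb(100_000, 100_000))  # 80.0
```

At the size given in the question, a dense float64 copy would need roughly 80 GB, so converting is only realistic for much smaller matrices. Note also that densifying turns every missing pair into a zero distance, which is probably not what a sparse distance matrix is meant to express.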
