Problem with hierarchical clustering in Python
I am doing hierarchical clustering of a 2-dimensional matrix using the correlation distance metric (i.e. 1 - Pearson correlation). My code is the following (the data is in a variable called "data"):
from hcluster import *
Y = pdist(data, 'correlation')
cluster_type = 'average'
Z = linkage(Y, cluster_type)
dendrogram(Z)
The error I get is:
ValueError: Linkage 'Z' contains negative distances.
What causes this error? The matrix "data" that I use is simply:
[[ 156.651968 2345.168618]
[ 158.089968 2032.840106]
[ 207.996413 2786.779081]
[ 151.885804 2286.70533 ]
[ 154.33665 1967.74431 ]
[ 150.060182 1931.991169]
[ 133.800787 1978.539644]
[ 112.743217 1478.903191]
[ 125.388905 1422.3247 ]]
I don't see how pdist could ever produce negative numbers when taking 1 - Pearson correlation. Any ideas on this?
Thank you.
There are some lovely floating point problems going on. If you look at the results of pdist, you'll find there are very small negative numbers (-2.22044605e-16) in them. Essentially, they should be zero. You can use numpy's clip function to deal with it if you would like.
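A rough sketch of the clipping idea, assuming SciPy's pdist and linkage stand in for the hcluster functions from the question (newer versions may already avoid the negative values, so treat this as illustrative). Note that each row has only two values, so the Pearson correlation between any two rows is +1 or -1 in exact arithmetic, and every distance should be exactly 0 or 2; rounding is what pushes some of them just below zero.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage

# The matrix from the question.
data = np.array([
    [156.651968, 2345.168618],
    [158.089968, 2032.840106],
    [207.996413, 2786.779081],
    [151.885804, 2286.70533],
    [154.33665, 1967.74431],
    [150.060182, 1931.991169],
    [133.800787, 1978.539644],
    [112.743217, 1478.903191],
    [125.388905, 1422.3247],
])

# Condensed correlation distances (1 - Pearson correlation between rows).
Y = pdist(data, 'correlation')

# Force tiny negative rounding errors up to zero; leave the upper end alone.
Y = np.clip(Y, 0.0, None)

# With no negative distances left, the linkage matrix is valid for dendrogram.
Z = linkage(Y, 'average')
```

Passing Z to dendrogram afterwards should no longer trip the negative-distance check.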
If you were getting the error
KeyError: -428
and your code was along the lines of
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline
from scipy.cluster.hierarchy import ward, dendrogram
linkage_matrix = ward(dist) #define the linkage_matrix using ward clustering pre-computed distances
fig, ax = plt.subplots(figsize=(35, 20),dpi=400) # set size
ax = dendrogram(linkage_matrix, orientation="right",labels=queries);
It is due to a mismatch in the indexes of queries.
You might want to update it to
ax = dendrogram(linkage_matrix, orientation="right",labels=list(queries));
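To illustrate why list(queries) can help, here is a small sketch. It assumes queries is a pandas Series whose index is not 0..n-1 (for example after filtering rows), which the original post does not state; the values and index below are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for `queries`: a Series with a non-default integer index.
queries = pd.Series(["alpha", "beta", "gamma"], index=[10, 20, 30])

# Integer lookups on such a Series go by index label, not by position,
# so asking for 0 fails because no row is labelled 0.
try:
    queries[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# A plain list is indexed by position, which is what positional label
# lookups expect.
labels = list(queries)
first_label = labels[0]
```

If queries is already a plain list, this conversion changes nothing and the KeyError likely has another cause.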