DBSCAN in Python: Unexpected result
I'm trying to understand scikit-learn's DBSCAN implementation, but I'm having trouble. Here is my data sample:
X = [[0,0],[0,1],[1,1],[1,2],[2,2],[5,0],[5,1],[5,2],[8,0],[10,0]]
Then I calculate D as in the example provided:
from scipy.spatial import distance
D = distance.squareform(distance.pdist(X))
D is a matrix containing the distance between each point and every other point. The diagonal is thus always 0.
Then I run DBSCAN as:
from sklearn.cluster import DBSCAN
db = DBSCAN(eps=1.1, min_samples=2).fit(D)
eps = 1.1 means, if I understood the documentation correctly, that points at a distance less than or equal to 1.1 from each other will be considered part of the same cluster (core).
D[1] returns the following:
>>> D[1]
array([ 1. , 0. , 1. , 1.41421356,
2.23606798, 5.09901951, 5. , 5.09901951,
8.06225775, 10.04987562])
which means the second point has a distance of 1 to both the first and the third. So I expect them to form a cluster, but ...
>>> db.core_sample_indices_
[]
which means no core samples were found, right? Here are the other two outputs:
>>> db.components_
array([], shape=(0, 10), dtype=float64)
>>> db.labels_
array([-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.])
Why isn't there any cluster?
I figure the implementation might just assume your distance matrix is the data itself.
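A quick way to check this guess, as a sketch using only NumPy and SciPy: if the rows of D are treated as 10-dimensional feature vectors, the Euclidean distance between rows i and j is at least sqrt(2) * D[i, j], because coordinates i and j each differ by D[i, j]. The smallest pairwise distance in the sample is 1, so no two rows can be within eps = 1.1 of each other, which would explain why every point ends up labelled as noise. The names row_dist and off_diag are made up for this illustration.

```python
import numpy as np
from scipy.spatial import distance

X = [[0, 0], [0, 1], [1, 1], [1, 2], [2, 2],
     [5, 0], [5, 1], [5, 2], [8, 0], [10, 0]]

# Pairwise distances between the original 2-D points.
D = distance.squareform(distance.pdist(X))

# What a clusterer sees if D is taken as a feature array:
# distances between the *rows* of D (hypothetical name, for illustration).
row_dist = distance.squareform(distance.pdist(D))

# Mask out the zero diagonal before looking for the closest pair.
off_diag = ~np.eye(len(X), dtype=bool)

print("closest pair of points:   ", D[off_diag].min())
print("closest pair of rows of D:", row_dist[off_diag].min())
```

If the second number is larger than eps, no row has a neighbor other than itself, so no core samples can be found.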
Usually you wouldn't compute the full distance matrix for DBSCAN at all, but let it use a spatial index on the data for faster neighbor search.
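As a minimal sketch of that route, assuming a reasonably recent scikit-learn where the default metric is Euclidean and min_samples counts the point itself, the raw coordinates can be passed to fit directly and the library chooses its own neighbor search:

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[0, 0], [0, 1], [1, 1], [1, 2], [2, 2],
              [5, 0], [5, 1], [5, 2], [8, 0], [10, 0]])

# X is a feature array here, so no distance matrix is built by hand.
db = DBSCAN(eps=1.1, min_samples=2).fit(X)

print(db.core_sample_indices_)  # indices of the core samples
print(db.labels_)               # cluster label per point, -1 means noise
```

Under those assumptions I would expect the first five points to share one label, the three points at x = 5 to share another, and the last two points to be noise, since their nearest neighbors are 2 or more away.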
Judging from a one-minute Google search, consider adding metric="precomputed", since the documentation says:
fit(X)
X: Array of distances between samples, or a feature array. The array is treated as a feature array unless the metric is given as 'precomputed'.
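Applied to the code from the question, the change could look like the sketch below. It assumes a scikit-learn version whose DBSCAN constructor accepts the metric argument quoted above; everything else is taken from the question.

```python
from scipy.spatial import distance
from sklearn.cluster import DBSCAN

X = [[0, 0], [0, 1], [1, 1], [1, 2], [2, 2],
     [5, 0], [5, 1], [5, 2], [8, 0], [10, 0]]

# Square matrix of pairwise Euclidean distances, as in the question.
D = distance.squareform(distance.pdist(X))

# metric="precomputed" tells DBSCAN that D already holds distances,
# so it is no longer interpreted as a feature array.
db = DBSCAN(eps=1.1, min_samples=2, metric="precomputed").fit(D)

print(db.core_sample_indices_)
print(db.labels_)
```

With this setting the point at index 1, which is at distance 1 from both index 0 and index 2, should show up among the core samples instead of being labelled -1.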