sklearn.cluster.DBSCAN gives unexpected result
I'm using DBSCAN to cluster images, but it gives an unexpected result. Let's assume I have 10 images.
First, I read the images in a loop using cv2.imread. Then I compute the structural similarity index between each pair of images. After that, I have a matrix like this:
[
[ 1. -0.00893619 0. 0. 0. 0.50148778 0.47921832 0. 0. 0. ]
[-0.00893619 1. 0. 0. 0. 0.00996088 -0.01873205 0. 0. 0. ]
[ 0. 0. 1. 0.57884212 0. 0. 0. 0. 0. 0. ]
[ 0. 0. 0.57884212 1. 0. 0. 0. 0. 0. 0. ]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0.50148778 0.00996088 0. 0. 0. 1. 0.63224396 0. 0. 0. ]
[ 0.47921832 -0.01873205 0. 0. 0. 0.63224396 1. 0. 0. 0. ]
[ 0. 0. 0. 0. 0. 0. 0. 1. 0.77507487 0.69697053]
[ 0. 0. 0. 0. 0. 0. 0. 0.77507487 1. 0.74861881]
[ 0. 0. 0. 0. 0. 0. 0. 0.69697053 0.74861881 1. ]]
Looks good. Then I have a simple invocation of DBSCAN:
db = DBSCAN(eps=0.4, min_samples=3, metric='precomputed').fit(distances)
labels = db.labels_
n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)
And the result is:
[0 0 0 0 0 0 0 0 0 0]
What am I doing wrong? Why does it put all the images into one cluster?
DBSCAN usually assumes a dissimilarity (distance), not a similarity. It can be implemented with a similarity threshold, too (see Generalized DBSCAN).
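The problem was that I had computed the distance matrix incorrectly: in a distance matrix, the entries on the main diagonal are all zero.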
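A minimal sketch of one way to turn the similarity matrix into something DBSCAN can use, assuming the similarities lie in [-1, 1] (as SSIM values do) and that 1 - similarity is an acceptable distance for your images. The helper name similarity_to_distance is made up for illustration, and the matrix is rebuilt from the values in the question:

```python
import numpy as np
from sklearn.cluster import DBSCAN


def similarity_to_distance(similarity):
    """Hypothetical helper: map similarities in [-1, 1] to distances in [0, 2]."""
    distances = 1.0 - np.asarray(similarity, dtype=float)
    # A distance matrix must have zeros on its main diagonal.
    np.fill_diagonal(distances, 0.0)
    return distances


# Non-zero off-diagonal similarities copied from the matrix in the question.
pairs = {
    (0, 1): -0.00893619, (0, 5): 0.50148778, (0, 6): 0.47921832,
    (1, 5): 0.00996088, (1, 6): -0.01873205, (2, 3): 0.57884212,
    (5, 6): 0.63224396, (7, 8): 0.77507487, (7, 9): 0.69697053,
    (8, 9): 0.74861881,
}
similarity = np.eye(10)
for (i, j), value in pairs.items():
    similarity[i, j] = similarity[j, i] = value

distances = similarity_to_distance(similarity)

# Same parameters as in the question, but eps is now a distance threshold.
db = DBSCAN(eps=0.4, min_samples=3, metric='precomputed').fit(distances)
labels = db.labels_
print(labels)
```

With these values, only images 7, 8 and 9 end up in a cluster (label 0) and the rest are labelled -1 (noise), because eps=0.4 on a distance corresponds to a similarity of at least 0.6. A larger eps or a smaller min_samples may let the weaker groups form clusters as well.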