Clustering Resized Images with Sci-Kit
I have several product images of different sizes, and I am running a clustering algorithm in scikit-learn to group similar images together. The images vary in size but are generally around 500x500; I am shrinking them down to 250x250:
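from scipy import misc  # note: imread/imresize were removed from newer SciPy releases

def read_img(path, mode='L', size_one=(250, 250)):
    # Read the file as grayscale ('L'), then resize it to size_one.
    return misc.imresize(misc.imread(path, mode=mode), size_one)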
Once I have the image array, I remove the white borders from it and flatten it into a vector.
The issue is that some images that are visually very similar differ in quality (because of their size before resizing). They are being picked up as separate clusters.
For example, these two images, although very similar, have slightly different quality and don't get clustered together.
What can I do from a pre-clustering standpoint to improve this? I am just getting started with this, and any feedback would be very helpful.
Thanks in advance
EDIT: Here is how I am trimming the borders. A better approach would be very welcome as well.
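import numpy as np
from scipy import misc

def trim_img_border(img):
    shape = img.shape
    # Pass 1: keep only the rows that are not (near-)white.
    temp_rows = []
    for row in img:
        if check_row(row):
            temp_rows.append(row)
    # Pass 2: transpose and repeat, which drops (near-)white columns.
    temp_rows_T = np.transpose(np.array(temp_rows))
    out = []
    for row in temp_rows_T:
        if check_row(row):
            out.append(row)
    # Transpose back and resize to the original shape.
    # round_img is the author's own helper; its definition is not shown here.
    return round_img(misc.imresize(np.transpose(np.array(out)), shape))

def check_row(row):
    # A row counts as border if its only values are 255, {254, 255} or {253, 254, 255}.
    srow = sorted(set(row))
    if srow == [255] or srow == [254, 255] or srow == [253, 254, 255]:
        return False
    return True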
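On the trimming question, the row-by-row loops could probably be replaced by a boolean mask. The sketch below is a variant under stated assumptions, not a drop-in replacement: trim_white_border and its threshold parameter are hypothetical names, any pixel at or above the threshold is treated as white (slightly looser than check_row), only the outer bounding box is cropped (interior white rows are kept), and the result is not resized back to the original shape.

```python
import numpy as np

def trim_white_border(img, threshold=253):
    """Crop away outer rows/columns whose pixels are all >= threshold.

    Hypothetical helper: assumes a 2-D grayscale array with white near 255.
    """
    img = np.asarray(img)
    content = img < threshold                    # True where a pixel is not near-white
    rows = np.flatnonzero(content.any(axis=1))   # indices of rows with some content
    cols = np.flatnonzero(content.any(axis=0))   # indices of columns with some content
    if rows.size == 0:                           # image is entirely near-white
        return img
    # Slice the bounding box of the content.
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

Cropping to a bounding box keeps the product's proportions intact; resizing to a common shape afterwards would still be needed before flattening.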
The primary reason your clustering is finding different clusters is that you are clustering on raw pixels rather than on features derived from the pixels. Pixel values vary a lot even when things look visually similar. So I think you have two approaches:
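To make the pixels-versus-features point concrete, here is a minimal sketch of one very simple hand-made feature; it is not necessarily either of the approaches meant above. block_mean_features and grid are hypothetical names, and the input is assumed to be a 2-D grayscale array at least as large as the grid. Each image is summarized by the mean intensity of coarse grid cells, then standardized, so small per-pixel differences from resizing or compression mostly average out.

```python
import numpy as np

def block_mean_features(img, grid=(10, 10)):
    """Describe a grayscale image by the mean intensity of each grid cell.

    Hypothetical helper: assumes img is 2-D and at least grid-sized.
    """
    img = np.asarray(img, dtype=np.float64)
    gh, gw = grid
    # Crop so that height and width divide evenly into the grid.
    h = (img.shape[0] // gh) * gh
    w = (img.shape[1] // gw) * gw
    # Split into gh x gw cells and average the pixels inside each cell.
    cells = img[:h, :w].reshape(gh, h // gh, gw, w // gw)
    feats = cells.mean(axis=(1, 3)).ravel()
    # Standardize so global brightness/contrast shifts matter less.
    centered = feats - feats.mean()
    std = feats.std()
    return centered / std if std > 0 else centered
```

Stacking these vectors row-wise gives a matrix that can be passed to a scikit-learn clusterer such as sklearn.cluster.KMeans in place of the flattened pixels. Richer descriptors would probably separate products better; this only illustrates the idea.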
Hope something like this helps - would love to hear how it goes.