
Deep learning with Python (cosine similarity)

I am learning how to use the VGG16 model to recognize similar objects. I created a folder named "images" and put some .jpg files inside it.

But I am confused by the cosine_similarity part of the program. As I understand it, the program converts every .jpg in the "images" folder to a feature vector, and the cosine_similarity function compares those vectors with each other. Two images are more similar the closer the value is to 1.

What I don't understand is this line in the code below:

sim = ratings.dot(ratings.T)

Why is the jpg compared with itself (transposed) rather than with the others?

Could anyone explain the cosine_similarity function below?

from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np
import os
import sys

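# Compute the pairwise cosine similarity matrix.
# Each row of `ratings` is one feature vector, so sim[i, j] is the
# cosine similarity between row i and row j.
def cosine_similarity(ratings):
    # Dot product of every row with every row (including itself)
    sim = ratings.dot(ratings.T)
    # A sparse input yields a sparse result; convert it to a dense array
    if not isinstance(sim, np.ndarray):
        sim = sim.toarray()
    # The diagonal holds each row's squared length, so its square root is the norm
    norms = np.array([np.sqrt(np.diagonal(sim))])
    # Divide entry (i, j) by norm[i] * norm[j]
    return sim / norms / norms.T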


def main():
    # Find all JPEG files in the "images" folder
    y_test=[]
    x_test=[]
    for img_path in os.listdir("C:\\Users\\Desktop\\Python\\ML\\CNN model\\VGG16\\images"):
        if img_path.endswith(".jpg"):
            img = image.load_img("C:\\Users\\Desktop\\Python\\ML\\CNN model\\VGG16\\images\\" + img_path, target_size=(224,224))
            y_test.append(img_path[0:4])
            x = image.img_to_array(img)
            x= np.expand_dims(x,axis=0)
            if len(x_test) > 0:
                x_test = np.concatenate((x_test,x))
            else:
                x_test = x

    # Convert to the VGG16 input format
    x_test = preprocess_input(x_test)

    # include_top=False drops VGG16's final 3 fully connected layers
    model = VGG16(weights="imagenet", include_top=False)

    # Extract one 7x7x512 feature map per image
    features = model.predict(x_test)

    # Flatten each feature map into a row vector, then compute the similarity matrix
    features_compress = features.reshape(len(y_test), 7*7*512)
    sim = cosine_similarity(features_compress)

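    # Index of the query image, taken from the command line
    inputNo = int(sys.argv[1])

    # Indices of the 2 most similar images; position 0 is skipped because
    # it is normally the query image itself (similarity 1)
    top = np.argsort(-sim[inputNo], axis=0)[1:3]

    # Look up the labels of those 2 images
    recommend = [y_test[i] for i in top]
    print(recommend)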

if __name__ == "__main__":
    main()

Why is the jpg compared with itself (transposed) rather than with the others?

sim = cosine_similarity(features_compress)
So here, I reckon features_compress is the set of features for all the images contained in x_test, not a single image.

Because in the for loop earlier, that's what you seem to be doing with np.concatenate().

And if that is indeed the case, then think of the result returned from cosine_similarity() as a matrix that tells you the similarity of each image with every other image.
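A minimal sketch of that stacking step, using zero-filled arrays as stand-ins for the loaded images and for the VGG16 output. The array contents and the count of three images are assumptions for illustration; only the shapes matter here.

```python
import numpy as np

# Stand-ins for three loaded images, each shaped (1, 224, 224, 3)
# after np.expand_dims(x, axis=0), as in the question's loop.
images = [np.zeros((1, 224, 224, 3), dtype=np.float32) for _ in range(3)]

# np.concatenate joins along axis 0, so the batch gains one entry per image.
x_test = np.concatenate(images)
print(x_test.shape)  # (3, 224, 224, 3)

# Stand-in for model.predict(x_test): with include_top=False, VGG16 maps a
# 224x224 input to a 7x7x512 feature map per image.
features = np.zeros((len(images), 7, 7, 512), dtype=np.float32)

# Flatten each feature map into one row: one feature vector per image.
features_compress = features.reshape(len(images), 7 * 7 * 512)
print(features_compress.shape)  # (3, 25088)
```

So features_compress has one row per image, which is why a single call to cosine_similarity() covers the whole folder.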
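To see why the transpose is involved: each row of ratings is one image's feature vector, so entry [i, j] of ratings.dot(ratings.T) is the dot product of image i with image j. The matrix is multiplied by its own transpose only so that every row gets paired with every row. A small sketch with three made-up 2-D vectors standing in for image features (the vectors are assumptions for illustration, and the sparse-matrix branch of the original function is left out):

```python
import numpy as np

def cosine_similarity(ratings):
    # sim[i, j] = dot product of row i with row j
    sim = ratings.dot(ratings.T)
    # The diagonal holds each row's squared length
    norms = np.array([np.sqrt(np.diagonal(sim))])
    # Divide entry (i, j) by norm[i] * norm[j]
    return sim / norms / norms.T

ratings = np.array([
    [1.0, 0.0],   # "image" 0
    [1.0, 1.0],   # "image" 1: 45 degrees away from image 0
    [0.0, 3.0],   # "image" 2: orthogonal to image 0
])

sim = cosine_similarity(ratings)
# The diagonal is 1 (each image against itself), sim[0, 1] is about 0.707,
# and sim[0, 2] is 0.
print(np.round(sim, 3))

# Same selection as in the question: sort row 0 by descending similarity and
# skip position 0, which is image 0 itself.
top = np.argsort(-sim[0])[1:3]
print(top)  # [1 2]
```

Only the diagonal compares an image with itself; every off-diagonal entry compares two different images, and those are the values the [1:3] slice picks from.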


 