
Matching user interests with content (based on tags)

I have a lot of content items stored in the database, and I know which tags a user is interested in. Alice, for example, shows interest in tags like "healthcare", "sports" and "social". Each content item has one or more tags. How would I match these in order to recommend new content to Alice?

Consider these basic database tables:

CREATE TABLE `content_tag` (
   `id` INT(11) NOT NULL AUTO_INCREMENT,
   `item_id` INT(11) NOT NULL,
   `tag_id` INT(11) NOT NULL,
   PRIMARY KEY (`id`)
);

CREATE TABLE `tag` (
   `id` INT(11) NOT NULL AUTO_INCREMENT,
   `name` VARCHAR(50) NOT NULL,
   PRIMARY KEY (`id`)
);

I also have Alice's interests, each with a (relevance) score that acts as a weight:

array:3 [
    'healthcare' => 2.20
    'sports' => 1.30
    'social' => 0.5
]

How would you approach this?

Is there a way to use an algorithm for this, like cosine similarity, or is that only meant for sentences?

You can compute the similarity between Alice and each item and then sort the items by that score. The n most similar items are the ones to recommend.

One such similarity metric is cosine similarity (as you suggest), and it works as follows:

For each item you can build a vector from its tags. As far as I understand, your items don't have scores, so the values in an item vector will be 0 or 1. Each position in the vector represents one tag: 1 if the item has that tag, 0 if it does not.

Item representation:

[0,0,1,1,0,0] -> Let's say the first value represents 'healthcare', the second 'sports', and the last one tag5. This item does not have tag5, so its last value is 0.

Users also have vectors, built the same way as item vectors but holding the interest scores. For instance, Alice's vector is: [2.20,1.30,0.5,0,0,0]
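The two representations above can be sketched in a few lines of Python. This is only an illustration: the tag order, the placeholder names tag3 to tag5, and the helper names `item_vector` and `user_vector` are assumptions made for the example, not something taken from the question's schema.

```python
# Hypothetical tag vocabulary: the order fixes which position each tag occupies.
# Only 'healthcare', 'sports' and 'social' come from the question;
# tag3..tag5 are placeholders for the remaining tags in the system.
TAGS = ["healthcare", "sports", "social", "tag3", "tag4", "tag5"]
TAG_INDEX = {name: i for i, name in enumerate(TAGS)}


def item_vector(item_tags):
    """Binary vector: 1.0 where the item carries the tag, 0.0 elsewhere."""
    vec = [0.0] * len(TAGS)
    for name in item_tags:
        vec[TAG_INDEX[name]] = 1.0
    return vec


def user_vector(interests):
    """Weighted vector: the user's relevance score at each tag's position."""
    vec = [0.0] * len(TAGS)
    for name, score in interests.items():
        vec[TAG_INDEX[name]] = score
    return vec


alice = user_vector({"healthcare": 2.20, "sports": 1.30, "social": 0.5})
item = item_vector(["social", "tag3"])  # the [0,0,1,1,0,0] item from above
```

In practice the vocabulary would be loaded from the `tag` table and each item's tags from `content_tag`, rather than being hard-coded.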

After creating the vectors you can compute the similarity between them (e.g. by using cosine similarity).
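Cosine similarity is the dot product of the two vectors divided by the product of their lengths. A minimal sketch, assuming plain Python lists as vectors; the function name `cosine_similarity` and the choice to return 0.0 for an all-zero vector are conventions picked for this example:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen here: no tags or no interests means no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


alice = [2.20, 1.30, 0.5, 0, 0, 0]
item = [0, 0, 1, 1, 0, 0]
score = cosine_similarity(alice, item)  # roughly 0.136
```

The score is low because the only tag this item shares with Alice is her weakest interest (0.5), and the item's other tag does not match anything she likes.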

Note that the size of every user and item vector equals the total number of tags in the system. In this example there are 6 distinct tags in the system.
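With many tags, most positions in every vector are zero, so one option is to store only the non-zero entries and never build the full-length vectors. The sketch below does that and returns the top n items. It assumes the item tags have already been loaded into a dict (a hypothetical in-memory form of the `content_tag`/`tag` join); the function name `recommend` and the sample items are made up for illustration.

```python
import heapq
import math


def recommend(user_weights, item_tags, n=10):
    """Return the n items most similar to the user as (item_id, score) pairs.

    user_weights: dict mapping tag name -> relevance score (non-zero entries only).
    item_tags:    dict mapping item id -> set of tag names.
    """
    user_norm = math.sqrt(sum(w * w for w in user_weights.values()))
    scored = []
    for item_id, tags in item_tags.items():
        if not tags or user_norm == 0.0:
            continue
        # The dot product only needs the tags the item actually has.
        dot = sum(user_weights.get(t, 0.0) for t in tags)
        if dot == 0.0:
            continue  # skip items that score zero (e.g. no tag overlap)
        # A binary vector with k ones has length sqrt(k).
        scored.append((item_id, dot / (user_norm * math.sqrt(len(tags)))))
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


alice = {"healthcare": 2.20, "sports": 1.30, "social": 0.5}
items = {
    1: {"healthcare", "sports"},
    2: {"social", "tag3"},
    3: {"tag4"},
    4: {"healthcare"},
}
top = recommend(alice, items, n=2)  # items 1 and 4, in that order
```

Filtering out items Alice has already seen is left out here; that would be an extra check before scoring.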
