What model or approach to use for this kind of “nested” recommendation?
I have a very specific recommendation problem.
Suppose I have 3 types of entities: items, properties, and values. There are N items, A properties, and B values.
Each item has some number of property-value pairs.
Example:
Item#1
2374-23783
8455-5783
744-2438
Item#2
5435-23783
8455-54654
544-9778
...
Now, given an "anonymous" item, say Item#x, with 3-4 sample property-value pairs like those above, I want to get a recommendation for a specific property.
Example:
Item#x
5435-23783
544-9778
744-2438
8455-?? (get recommendation)
Now, the intuition: the recommended value for property 8455 in Item#x may be 54654. You'll see that properties 5435 and 544 have the same values in Item#2 as in Item#x. Therefore, it's more probable that the value for 8455 in Item#x will be similar to the value that 8455 has in Item#2.
Question:
What kind of model do you think would be best for this problem? What approach should I use?
Collaborative filtering, but how?
Simply dumping all property-value pairs into the dataset and fetching recommendations obviously wouldn't satisfy my needs.
Can you add any implementation-specific details too? Mahout? Myrrix? Machine learning/recommendation libraries?
Any machine learning approach will do the job. You can, for instance, use Bayesian networks, since they are a natural fit for these conditional item-property-value occurrences.
It is not realistic to add implementation-specific details without knowing what your concerns are. What do you care about most? Performance, accuracy, or scalability?
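To make the Bayesian suggestion a bit more concrete, here is a minimal sketch that uses naive Bayes, the simplest Bayesian network, rather than a full network with learned structure. It treats the value of the target property as the class and every other property-value pair as an independent feature. The function name `naive_bayes_scores`, the smoothing parameter `alpha`, and the way missing properties are skipped are all assumptions made for illustration, not something taken from the question.

```python
from collections import Counter, defaultdict
import math


def naive_bayes_scores(query, items, target, alpha=1.0):
    """Score each candidate value of `target` with a naive Bayes model.

    query  : dict property -> value for the anonymous item
    items  : list of dicts property -> value (the known items)
    target : property whose value we want to recommend
    alpha  : Laplace smoothing constant (an assumed default)
    """
    class_counts = Counter()            # target value -> number of items
    pair_counts = defaultdict(Counter)  # target value -> (property, value) counts
    values_seen = defaultdict(set)      # property -> distinct values observed

    for pairs in items:
        if target not in pairs:
            continue  # items without the target property carry no label
        label = pairs[target]
        class_counts[label] += 1
        for prop, value in pairs.items():
            if prop != target:
                pair_counts[label][(prop, value)] += 1
                values_seen[prop].add(value)

    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        log_score = math.log(n / total)  # log prior of this target value
        for prop, value in query.items():
            if prop == target:
                continue
            k = len(values_seen[prop] | {value})  # smoothing vocabulary size
            hits = pair_counts[label][(prop, value)]
            log_score += math.log((hits + alpha) / (n + alpha * k))
        scores[label] = log_score
    return scores


# Data from the example in the question.
known_items = [
    {2374: 23783, 8455: 5783, 744: 2438},   # Item#1
    {5435: 23783, 8455: 54654, 544: 9778},  # Item#2
]
item_x = {5435: 23783, 544: 9778, 744: 2438}

scores = naive_bayes_scores(item_x, known_items, target=8455)
recommended = max(scores, key=scores.get)
```

With only two known items the scores mean very little. The sketch only shows how the conditional counts would be combined once there is enough data.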
It doesn't appear you need any machine learning, just retrieval. The most straightforward way is to create a feature vector where each dimension is a property.
Vector position and property:
Position #0, property 2374
Position #1, property 8455
Position #2, property 744
Position #3, property 5435
Position #4, property 544
For each item, fill in the vector values.
Item #1 is represented as [23783, 5783, 2438, ?, ?]
Item #2 is represented as [ ?, 54654, ?, 23783, 9778]
Item #x is represented as [ ?, ?, 2438, 23783, 9778]
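The representation above can be sketched in a few lines of Python. The names `PROPERTIES` and `to_vector` are made up for illustration, and `None` stands in for the `?` placeholders.

```python
# Position -> property id, in the same order as the list above.
PROPERTIES = [2374, 8455, 744, 5435, 544]

# Property-value pairs from the example in the question.
items = {
    "Item #1": {2374: 23783, 8455: 5783, 744: 2438},
    "Item #2": {5435: 23783, 8455: 54654, 544: 9778},
    "Item #x": {5435: 23783, 544: 9778, 744: 2438},
}


def to_vector(pairs, properties=PROPERTIES):
    """Return one slot per property; None marks a property the item lacks."""
    return [pairs.get(prop) for prop in properties]


vectors = {name: to_vector(pairs) for name, pairs in items.items()}
```

If the number of properties is large and each item only has a few of them, keeping the dictionaries instead of dense vectors is probably the more practical choice.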
Item #x shares the most values with Item #2, whose position #1 is 54654. Basically, you find the best intersection with an item that has a value at the position you're interested in. It gets more interesting if you want values for several properties that can only be suggested by several items, but you haven't talked about the nature of the data.
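A minimal sketch of that lookup, assuming the vectors are plain Python lists with `None` for unknown positions. The function name `best_match` is hypothetical, and ties are resolved by simply keeping the first item seen.

```python
item_vectors = {
    "Item #1": [23783, 5783, 2438, None, None],
    "Item #2": [None, 54654, None, 23783, 9778],
}
query = [None, None, 2438, 23783, 9778]  # Item #x


def best_match(query, item_vectors, position):
    """Find the item that has a value at `position` and shares the most
    known values with `query`. Returns (item name, overlap count)."""
    best_name, best_overlap = None, 0
    for name, vec in item_vectors.items():
        if vec[position] is None:
            continue  # this item cannot suggest a value for the position
        overlap = sum(
            1 for q, v in zip(query, vec) if q is not None and q == v
        )
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name, best_overlap


name, overlap = best_match(query, item_vectors, position=1)
suggested = item_vectors[name][1] if name is not None else None
```

This is a linear scan over all items. For a large catalogue, an inverted index from (property, value) pairs to items would avoid comparing against items that share nothing with the query.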