Computing preference values in Apache Mahout

I am new to Apache Mahout and trying to learn it. I want to implement a user-based recommender. After searching online I found samples like the one below:

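import java.io.File;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class UserRecommenderExample {
    private static final Logger logger = Logger.getLogger(UserRecommenderExample.class.getName());
    public static void main(String[] args) {
        try {
            int userId = 2;
            // The delimiter must match the file; the dataset shown below is comma-separated.
            DataModel model = new FileDataModel(new File("data/mydataset.csv"), ",");
            UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
            UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
            UserBasedRecommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
            List<RecommendedItem> recommendations = recommender.recommend(userId, 3);
            for (RecommendedItem recommendation : recommendations) {
                logger.log(Level.INFO, "Item Id recommended : " + recommendation.getItemID() + " Ratings : "
                        + recommendation.getValue() + " For UserId : " + userId);
            }
        } catch (Exception e) {
            logger.log(Level.SEVERE, "Exception in main() ::", e);
        }
    }
}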

I am using the following dataset, where each line contains the user ID, item ID, and preference value, in that order:

1,10,1.0
1,11,2.0
1,12,5.0
1,13,5.0
1,14,5.0
1,15,4.0
1,16,5.0
1,17,1.0
1,18,5.0
2,10,1.0
2,11,2.0
2,15,5.0
2,16,4.5
2,17,1.0
2,18,5.0
3,11,2.5
3,12,4.5
3,13,4.0
3,14,3.0
3,15,3.5
3,16,4.5
3,17,4.0
3,18,5.0
4,10,5.0
4,11,5.0
4,12,5.0
4,13,0.0
4,14,2.0
4,15,3.0
4,16,1.0
4,17,4.0
4,18,1.0

This works fine. My main question is about a different dataset that has no preference values. It has other columns instead, and I am thinking of computing a preference value from them. This is the new dataset:

userid  itemid  likes   shares  comments
1        4       1      20      3
2        6       18     20      12
3        12      10     2       20
4        7       0      20      13
5        9       0      2       1
6        5       5      3       2
7        3       9      7       0
8        1       15     0       0

How can I compute a preference value for a particular record from other columns such as likes, shares, and comments? Is there any way to do this in Mahout?
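For illustration, the kind of computation being asked about could look like the sketch below: a weighted sum of the three counts, rescaled to the 1 to 5 range so the result resembles the ratings in the first dataset. This is plain Java, not a Mahout feature. The class name `PreferenceScore`, the weights, and the 1 to 5 scale are all assumptions made up for the example; sensible values depend on your data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Collapses likes/shares/comments into one preference value per (user, item) row. */
final class PreferenceScore {
    // Hypothetical weights, chosen for illustration only; they do not come from Mahout.
    static final double LIKE_WEIGHT = 1.0;
    static final double SHARE_WEIGHT = 2.0;
    static final double COMMENT_WEIGHT = 3.0;

    /** Weighted sum of the raw counts. */
    static double rawScore(int likes, int shares, int comments) {
        return LIKE_WEIGHT * likes + SHARE_WEIGHT * shares + COMMENT_WEIGHT * comments;
    }

    /** Rescales a raw score linearly into [1, 5], given the largest raw score in the data. */
    static double toPreference(double raw, double maxRaw) {
        if (maxRaw <= 0) {
            return 1.0; // no activity anywhere: fall back to the lowest value
        }
        return 1.0 + 4.0 * (raw / maxRaw);
    }

    /** Turns rows of {userId, itemId, likes, shares, comments} into "userId,itemId,preference" lines. */
    static List<String> toCsvLines(int[][] rows) {
        double maxRaw = 0;
        for (int[] r : rows) {
            maxRaw = Math.max(maxRaw, rawScore(r[2], r[3], r[4]));
        }
        List<String> lines = new ArrayList<>();
        for (int[] r : rows) {
            double pref = toPreference(rawScore(r[2], r[3], r[4]), maxRaw);
            lines.add(String.format(Locale.ROOT, "%d,%d,%.2f", r[0], r[1], pref));
        }
        return lines;
    }

    public static void main(String[] args) {
        // The rows from the question's second dataset.
        int[][] rows = {
            {1, 4, 1, 20, 3}, {2, 6, 18, 20, 12}, {3, 12, 10, 2, 20}, {4, 7, 0, 20, 13},
            {5, 9, 0, 2, 1}, {6, 5, 5, 3, 2}, {7, 3, 9, 7, 0}, {8, 1, 15, 0, 0}
        };
        toCsvLines(rows).forEach(System.out::println);
    }
}
```

The printed lines have the same userid, itemid, preference layout as the first dataset, so they could be saved to a CSV file and loaded the same way. Note that any fixed weighting bakes in a guess about how much a share is worth relative to a like.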

Yes. I think your snippet is from an older version of Mahout; what you want to use is the Correlated Cross-Occurrence (CCO) recommender. The CCO recommender is multi-modal, meaning it can take several kinds of user input.

There are CLI drivers, but I'm guessing you want to write code; there is also a Scala tutorial.

I think the tutorial recommends 'friends' based on the genres you tagged and the artists you 'liked', as well as your current friends.

As @rawkintrevo says, Mahout has moved on from the older "Taste" recommenders, and they will be deprecated soon.

You can build your own system from the CCO algorithm in Mahout. It lets you use data from different user behaviors such as likes, shares, and comments, which is why we call it multi-modal.
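A sketch of what preparing that kind of multi-modal input might look like, assuming each behavior is kept as its own list of (user, item) interactions instead of being merged into a single number. The class name `InteractionSplitter`, the action labels, and the output layout are assumptions for illustration; they are not a format defined by Mahout, so check the CCO documentation for what its drivers actually expect.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Splits a likes/shares/comments table into one interaction list per action type. */
final class InteractionSplitter {
    // Hypothetical action labels; use whatever names your pipeline expects.
    static final String[] ACTIONS = {"like", "share", "comment"};

    /**
     * rows: {userId, itemId, likes, shares, comments}.
     * Returns action -> list of "userId,itemId" pairs for which the count was above zero.
     */
    static Map<String, List<String>> split(int[][] rows) {
        Map<String, List<String>> byAction = new LinkedHashMap<>();
        for (String action : ACTIONS) {
            byAction.put(action, new ArrayList<>());
        }
        for (int[] r : rows) {
            for (int i = 0; i < ACTIONS.length; i++) {
                // Columns 2..4 hold the counts, in the same order as ACTIONS.
                if (r[2 + i] > 0) {
                    byAction.get(ACTIONS[i]).add(r[0] + "," + r[1]);
                }
            }
        }
        return byAction;
    }

    public static void main(String[] args) {
        // The rows from the question's second dataset.
        int[][] rows = {
            {1, 4, 1, 20, 3}, {2, 6, 18, 20, 12}, {3, 12, 10, 2, 20}, {4, 7, 0, 20, 13},
            {5, 9, 0, 2, 1}, {6, 5, 5, 3, 2}, {7, 3, 9, 7, 0}, {8, 1, 15, 0, 0}
        };
        split(rows).forEach((action, pairs) -> System.out.println(action + " -> " + pairs));
    }
}
```

Keeping the behaviors separate avoids having to decide up front how many likes a share is worth; that judgment is left to the recommender.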

Alternatively, in another project we have created a full-featured recommendation server based on Mahout, called the Universal Recommender (UR). It is built on Apache PredictionIO, where the UR is a plugin called a Template. Together they deliver a nearly turnkey server that takes input and responds to queries. To get started easily, try the AWS AMI that has the whole system working. Other installation methods are also documented.

This is all Apache-licensed OSS, but Mahout can no longer really provide a production-ready environment: Mahout supplies the algorithms, and you need a system around it. Build your own or try the PredictionIO-based one. Since everything is OSS, you can tweak things if needed.
