
Apache Mahout not giving any recommendation

I am trying to use Mahout to generate recommendations, but I am getting none.

My dataset:

0,102,5.0
1,101,5.0
1,102,5.0

Code:

    import java.io.File;
    import java.util.List;

    import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
    import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
    import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
    import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
    import org.apache.mahout.cf.taste.recommender.RecommendedItem;
    import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
    import org.apache.mahout.cf.taste.similarity.UserSimilarity;

    DataModel datamodel = new FileDataModel(new File("dataset.csv"));

    // Creating UserSimilarity object.
    UserSimilarity usersimilarity = new PearsonCorrelationSimilarity(datamodel);

    // Creating UserNeighborhood object.
    UserNeighborhood userneighborhood = new ThresholdUserNeighborhood(0.1, usersimilarity, datamodel);

    // Creating UserBasedRecommender.
    UserBasedRecommender recommender = new GenericUserBasedRecommender(datamodel, userneighborhood, usersimilarity);

    List<RecommendedItem> recommendations = recommender.recommend(0, 1);

    for (RecommendedItem recommendation : recommendations) {
        System.out.println(recommendation);
    }

I am using Mahout version 0.13.0.

Ideally, it should recommend item_id = 101 to user_id = 0: since user 0 and user 1 have item 102 in common, it should recommend item 101 (which user 1 also rated) to user 0.

Logs:

18:08:11.669 [main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Creating FileDataModel for file dataset.csv
18:08:11.700 [main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Reading file info...
18:08:11.702 [main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Read lines: 3
18:08:11.722 [main] INFO org.apache.mahout.cf.taste.impl.model.GenericDataModel - Processed 2 users
18:08:11.738 [main] DEBUG org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender - Recommending items for user ID '0'
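One likely explanation for the empty list, before getting to the broader answer: with this dataset, users 0 and 1 overlap on only a single item (102), and the Pearson correlation over a single co-rated item is undefined (it divides 0 by 0), so no neighbor clears the 0.1 threshold. A minimal plain-Java sketch of the arithmetic (this is an illustration of the formula, not Mahout's code):

```java
// Pearson correlation computed over the items two users rated in common.
// With a single data point, numerator and denominator are both 0.0,
// and 0.0 / 0.0 is NaN in Java.
public class PearsonDemo {
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;
        double num = 0, dx = 0, dy = 0;
        for (int i = 0; i < n; i++) {
            num += (x[i] - mx) * (y[i] - my);
            dx += (x[i] - mx) * (x[i] - mx);
            dy += (y[i] - my) * (y[i] - my);
        }
        return num / Math.sqrt(dx * dy); // 0.0 / 0.0 == NaN when n == 1
    }

    public static void main(String[] args) {
        // The only item both users rated is 102, preference 5.0 each.
        System.out.println(pearson(new double[]{5.0}, new double[]{5.0})); // NaN
    }
}
```

So with this particular dataset you would need either more overlapping ratings per user pair, or a similarity measure that is defined for sparse overlap (Mahout's LogLikelihoodSimilarity, for example), to get any neighbors at all.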

The Hadoop MapReduce code in Mahout is being deprecated. The new recommender code starts with @rawkintrevo's examples. If you are a Scala programmer, follow them.

Most engineers would like a system that works with no modification. The Mahout algorithm is encapsulated in The Universal Recommender, built on top of Apache PredictionIO. It has a server to accept events (like the ones in your example), internal event storage, and a query server for results. There are numerous improvements over the old MapReduce code, including using real-time user behavior to make recommendations. Neither the new Mahout nor the old includes servers for input and query; The Universal Recommender has REST endpoints for both.

Given that the code you are using will be deprecated, I strongly suggest that you dive into the Mahout code (@rawkintrevo's examples) or look at The Universal Recommender, which is an entire end-to-end system.

  • Install PredictionIO with a "single machine" setup here, or to really shortcut setup use our prepackaged AWS AMI here; it includes PIO and The Universal Recommender pre-installed.
  • Add the UR template here.
  • A Java SDK for sending events to the recommender is here.

Once you have this setup, you deal with config, the REST or Java SDK, and the PIO CLI. No Scala coding required.

I have three examples based on version 0.13.0 (and Scala, which is required for Samsara, the R-like Scala DSL Mahout utilizes as of v0.10+).

Walk

The first example is a very slow walkthrough: https://gist.github.com/rawkintrevo/3869030ff1a731d43c5e77979a5bf4a8 and is meant as a companion to Pat Ferrel's blog post/slide deck found here: http://actionml.com/blog/cco

Crawl

The second example is a little more "real" in that it utilizes SimilarityAnalysis.cooccurrencesIDSs(...), which is the proper interface for the CCO algorithm.

https://gist.github.com/rawkintrevo/c1bb00896263bdc067ddcd8299f4794c
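The scores that cooccurrencesIDSs produces are based on Dunning's log-likelihood ratio (LLR) test applied to the 2x2 contingency table of co-occurrence counts for each item pair. A minimal plain-Java sketch of that statistic (illustrative only; the class and method names here are not Mahout's API):

```java
// Dunning's log-likelihood ratio over a 2x2 contingency table:
//   k11 = # users who interacted with both items,
//   k12 = item A only, k21 = item B only, k22 = neither.
public class LlrDemo {
    static double xLogX(double x) { return x == 0 ? 0.0 : x * Math.log(x); }

    // Unnormalized entropy: xLogX(sum of elements) - sum of xLogX(element).
    static double entropy(double... elements) {
        double sum = 0, result = 0;
        for (double e : elements) { sum += e; result += xLogX(e); }
        return xLogX(sum) - result;
    }

    static double llr(double k11, double k12, double k21, double k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matEntropy = entropy(k11, k12, k21, k22);
        // Guard against tiny negative values from floating-point rounding.
        return 2.0 * Math.max(0.0, rowEntropy + colEntropy - matEntropy);
    }

    public static void main(String[] args) {
        // Perfectly correlated items score high...
        System.out.println(llr(10, 0, 0, 10));
        // ...independent items score ~0.
        System.out.println(llr(5, 5, 5, 5));
    }
}
```

High LLR means the co-occurrence is unlikely to be chance, which is why CCO keeps only the top-scoring "interesting" items per row rather than raw counts.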

Run

Here we load 'real' data and generate recommendations. The MovieLens data set doesn't have enough going on to showcase CCO's multi-modal power (the ability to recommend on multiple user behaviors). https://gist.github.com/rawkintrevo/f87cc89f4d337d7ffea80a6af3bee83e

Conclusion

I know you specifically asked for Java; however, Apache Mahout isn't geared for Java at the moment. In theory you could import the Scala into your Java, or maybe wrap the functions in another, more Java-friendly function... I've heard rumors late at night (or possibly in a dream) that some grad students somewhere were working on a Java API, but it's not in the trunk at the moment, nor is there a PR, nor is it a bullet in the road map.

Hope the above provides some insight.

Appendix

The most trivial example for Stack Overflow (you can run this interactively in the Mahout Spark shell by typing $MAHOUT_HOME/bin/mahout spark-shell, assuming SPARK_HOME, JAVA_HOME, and MAHOUT_HOME are set):

val inputRDD = sc.parallelize(Array(
  ("u1","purchase","iphone"),
  ("u1","purchase","ipad"),
  ("u2","purchase","nexus"),
  ("u2","purchase","galaxy"),
  ("u3","purchase","surface"),
  ("u4","purchase","iphone"),
  ("u4","purchase","galaxy"),
  ("u1","category-browse","phones"),
  ("u1","category-browse","electronics"),
  ("u1","category-browse","service"),
  ("u2","category-browse","accessories"),
  ("u2","category-browse","tablets"),
  ("u3","category-browse","accessories"),
  ("u3","category-browse","service"),
  ("u4","category-browse","phones"),
  ("u4","category-browse","tablets")))



import org.apache.mahout.math.indexeddataset.{IndexedDataset, BiDictionary}
import org.apache.mahout.sparkbindings.indexeddataset.IndexedDatasetSpark

val purchasesIDS = IndexedDatasetSpark.apply(inputRDD.filter(_._2 == "purchase").map(o => (o._1, o._3)))(sc)
val browseIDS = IndexedDatasetSpark.apply(inputRDD.filter(_._2 == "category-browse").map(o => (o._1, o._3)))(sc)

import org.apache.mahout.math.cf.SimilarityAnalysis

val llrDrmList = SimilarityAnalysis.cooccurrencesIDSs(Array(purchasesIDS, browseIDS),
  randomSeed = 1234,
  maxInterestingItemsPerThing = 3,
  maxNumInteractions = 4)

val llrAtA = llrDrmList(0).matrix.collect

IndexedDatasetSpark.apply(...) requires an RDD[(String, String)] where the first string is the 'row' (e.g. users) and the second string is the 'behavior'. For the 'buy' matrix the columns would be 'products', but this could also be a 'gender' matrix with two columns (male/female).

Then you pass an array of IndexedDatasets to SimilarityAnalysis.cooccurrencesIDSs(...).
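Conceptually, the primary matrix cooccurrencesIDSs scores is A'A: for each pair of items, the number of users who interacted with both. A plain-Java sketch of that counting step, run over the purchase events from the Scala example above (illustrative only, not the Mahout implementation):

```java
import java.util.*;

// Count, for every pair of items, how many users purchased both --
// the raw entries of the A'A co-occurrence matrix that CCO then
// filters down with the LLR test.
public class CooccurrenceDemo {
    static Map<String, Integer> cooccurrences(Map<String, Set<String>> itemsByUser) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> items : itemsByUser.values()) {
            List<String> sorted = new ArrayList<>(items);
            Collections.sort(sorted); // canonical order, so "a,b" == "b,a"
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    counts.merge(sorted.get(i) + "," + sorted.get(j), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> purchases = new HashMap<>();
        purchases.put("u1", Set.of("iphone", "ipad"));
        purchases.put("u2", Set.of("nexus", "galaxy"));
        purchases.put("u3", Set.of("surface"));
        purchases.put("u4", Set.of("iphone", "galaxy"));
        System.out.println(cooccurrences(purchases));
        // {galaxy,iphone=1, galaxy,nexus=1, ipad,iphone=1}
    }
}
```

Mahout does this as distributed matrix algebra over the DRMs rather than with hash maps, and the cross-occurrence matrices (e.g. purchase vs. category-browse) are built the same way from A'B.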
