
Connect To MongoDB using Apache Mahout

I'm trying to generate recommendations with Apache Mahout, using MongoDB as the data source through MongoDBDataModel. My code is as follows:

import java.net.UnknownHostException;
import java.util.List;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import com.mongodb.MongoException;


public class usingMongo {
public static void main(String[] args) throws UnknownHostException, MongoException,
        TasteException {
    final long startTime = System.nanoTime();

    MongoDBDataModel model = new MongoDBDataModel("AdamsLaptop", 27017,
            "test", "ratings100k", false, false, null);
    System.out.println("connected to mongo ");

    UserSimilarity UserSim = new PearsonCorrelationSimilarity(model);

    UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.5, UserSim, model);

    UserBasedRecommender UserRecommender = new GenericUserBasedRecommender(model, neighborhood, UserSim);
    List<RecommendedItem>UserRecommendations = UserRecommender.recommend(1, 3);
    for (RecommendedItem recommendation : UserRecommendations) {
          System.out.println("You may like movie " + recommendation.getItemID() + " as a user similar to you also rated it " + recommendation.getValue() + " USER");
    }

    ItemSimilarity ItemSim = new PearsonCorrelationSimilarity(model);//LogLikelihoodSimilarity(model);

    GenericItemBasedRecommender ItemRecommender = new GenericItemBasedRecommender(model, ItemSim);
    List<RecommendedItem>ItemRecommendations = ItemRecommender.recommend(1, 3);
    for (RecommendedItem recommendation : ItemRecommendations) {
          System.out.println("You may like movie " + recommendation.getItemID() + " as a user similar to you also rated it " + recommendation.getValue() + " ITEM");
        }


    final long duration = System.nanoTime() - startTime;
    System.out.println(duration);
}
}

I can't see where I've gone wrong. Despite numerous changes and a lot of trial and error, the error message remains the same:

Exception in thread "main" java.lang.NullPointerException
    at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.getID(MongoDBDataModel.java:743)
    at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.buildModel(MongoDBDataModel.java:570)
    at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.<init>(MongoDBDataModel.java:245)
    at recommender.usingMongo.main(usingMongo.java:24)

Any suggestions? Here's an example of my data in MongoDB:

{ "_id" : ObjectId("56ddf61f5960960c333f3dcb"),"userId" : 1, "movieId" : 292, "rating" : 4, "timestamp" : 847116936 }

I successfully integrated MongoDB data with Mahout.

The structure of the data in MongoDB depends on the kind of similarity algorithm you use. For example:

UserSimilarity

MongoDBDataModel datamodel = new MongoDBDataModel("127.0.0.1", 27017, "testing", "ratings", true, true, null);

Here user_id and item_id are integer values, preference is a float value, and created_at is a timestamp.

SVDRecommender

Here user_id and item_id are MongoDB objects, preference is a float value, and created_at is a timestamp.

The first thing to check is whether the MongoDB server is running. Judging by the exception, it is: the failure happens in buildModel, after the connection has been made. I think the problem lies in the structure of your data.

Use user_id instead of userId, item_id instead of movieId, and preference instead of rating. I don't know whether this will make a difference. I followed one of the online tutorials, but I can't find it at the moment.
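
The renaming suggested above could be sketched roughly as follows. This is only an illustration on a plain java.util.Map standing in for one document from the question; the class FieldRename and the helper toMahoutNames are hypothetical names, not part of Mahout or the MongoDB driver, and the target names user_id, item_id and preference are taken from the answers here rather than verified against Mahout itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldRename {
    // Hypothetical helper: copies a document and renames the question's
    // fields (userId, movieId, rating) to the names suggested in the answer.
    // All other fields, such as timestamp, are copied unchanged.
    static Map<String, Object> toMahoutNames(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            switch (e.getKey()) {
                case "userId":
                    out.put("user_id", e.getValue());
                    break;
                case "movieId":
                    out.put("item_id", e.getValue());
                    break;
                case "rating":
                    // The answers describe preference as a float value.
                    out.put("preference", ((Number) e.getValue()).floatValue());
                    break;
                default:
                    out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample document mirroring the one shown in the question (without _id).
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("userId", 1);
        doc.put("movieId", 292);
        doc.put("rating", 4);
        doc.put("timestamp", 847116936L);

        System.out.println(toMahoutNames(doc));
    }
}
```

In a real setup the same mapping would have to be applied to every document in the collection, for instance while re-importing the data.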

It works, but it is too slow once I have more than 10,000 users with 1,000 items.

I think the problem is that Mahout assumes default names for the fields it reads from MongoDB: user_id for the user ID, item_id for the item ID, and preference for the preference value. The solution might be either to use another MongoDBDataModel constructor that lets you pass the names of those fields in your collection as parameters, or to redesign your collection schema.

I hope that makes sense.
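
To see why mismatched field names could end in a NullPointerException, here is a minimal sketch using a plain java.util.Map in place of a MongoDB document. It is not Mahout's actual code: the class MissingFieldCheck and the helper missingFields are hypothetical, and the expected names are the defaults described in the answers above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MissingFieldCheck {
    // Default field names as described in the answers (an assumption here).
    static final String[] EXPECTED_FIELDS = {"user_id", "item_id", "preference"};

    // Returns the expected field names that are absent from the document.
    static List<String> missingFields(Map<String, Object> doc) {
        List<String> missing = new ArrayList<>();
        for (String field : EXPECTED_FIELDS) {
            if (!doc.containsKey(field)) {
                missing.add(field);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Document shaped like the one in the question.
        Map<String, Object> doc = new HashMap<>();
        doc.put("userId", 1);
        doc.put("movieId", 292);
        doc.put("rating", 4);

        System.out.println("Missing fields: " + missingFields(doc));

        // Looking up a name that does not exist yields null ...
        Object userId = doc.get("user_id");
        try {
            // ... and dereferencing that null throws.
            userId.toString();
        } catch (NullPointerException e) {
            System.out.println("Dereferencing the missing user_id throws a NullPointerException");
        }
    }
}
```

Running a check like this against one sample document before building the data model is a cheap way to confirm whether the field names are the culprit.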
