MongoDB's reduce-phase is not working as expected

I worked through a Java tutorial on mapReduce programming in MongoDB and ended up with the following code:

package mapReduceExample;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MapReduceCommand;
import com.mongodb.MapReduceOutput;
import com.mongodb.Mongo;

public class MapReduceExampleMain {

    /**
     * @param args
     */
    public static void main(String[] args) {

        Mongo mongo;

        try {
            mongo = new Mongo("localhost", 27017);
            DB db = mongo.getDB("library");

            DBCollection books = db.getCollection("books");

            BasicDBObject book = new BasicDBObject();
            book.put("name", "Understanding JAVA");
            book.put("pages", 100);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding JSON");
            book.put("pages", 200);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding XML");
            book.put("pages", 300);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding Web Services");
            book.put("pages", 400);
            books.insert(book);

            book = new BasicDBObject();
            book.put("name", "Understanding Axis2");
            book.put("pages", 150);
            books.insert(book);

            String map = "function()"
                    + "{ "
                        + "var category; "
                        + "if ( this.pages > 100 ) category = 'Big Books'; "
                        + "else category = 'Small Books'; "
                        + "emit(category, {name: this.name});"
                    + "}";

            String reduce = "function(key, values)"
                    + "{"
                        + "return {books: values.length};"
                    + "} ";

            MapReduceCommand cmd = new MapReduceCommand(books, map, reduce,
                    null, MapReduceCommand.OutputType.INLINE, null);

            MapReduceOutput out = books.mapReduce(cmd);

            for (DBObject o : out.results()) {
                System.out.println(o.toString());
            }

            // clean up
            db.dropDatabase();

        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }



    }
}

This is a pretty simple reduce phase, but it does not do what I want :(

The output is:

{ "_id" : "Big Books" , "value" : { "books" : 4.0}}
{ "_id" : "Small Books" , "value" : { "name" : "Understanding JAVA"}}

I would expect this:

{ "_id" : "Big Books" , "value" : { "books" : 4.0}}
{ "_id" : "Small Books" , "value" : { "books" : 1.0}}

Why does the reduce phase not return values.length in the case of the small book?

Greetings, Andre

Because if there is only one result for a key, the reduce function is never run for that key. Change it to be a finalize function or something.

A Basic Understanding of how mapReduce Works


Let us introduce the concepts of mapReduce:

  • mapper - This is the stage that emits the data to be fed into the reduce stage. It requires a key and a value to be sent. You can emit several times in a mapper if you want, but the requirements stay the same.

  • reducer - A reducer is called when there is more than one value for a given key, to process the list of values that have been emitted for that key.


That said, since the mapper emitted only one value for the 'Small Books' key, your reducer was not called for that key.
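
The pass-through can be imitated without a database. What follows is a minimal sketch in plain Java, not MongoDB's actual implementation: the class SingleValueReduceDemo and its mapReduce helper are hypothetical names made up for illustration, and the reducer returns a string instead of a document to keep things short. It groups the emitted pairs by key and calls the reducer only for keys that collected more than one value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;

// Hypothetical stand-in for the grouping step; this is NOT MongoDB code.
public class SingleValueReduceDemo {

    // Groups (key, value) pairs by key and applies the reducer only when a
    // key has more than one value. A lone value is passed through as-is.
    static Map<String, Object> mapReduce(List<String[]> emitted,
            BiFunction<String, List<Object>, Object> reducer) {
        Map<String, List<Object>> grouped = new TreeMap<>();
        for (String[] pair : emitted) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        Map<String, Object> results = new TreeMap<>();
        for (Map.Entry<String, List<Object>> entry : grouped.entrySet()) {
            List<Object> values = entry.getValue();
            results.put(entry.getKey(), values.size() > 1
                    ? reducer.apply(entry.getKey(), values)
                    : values.get(0));
        }
        return results;
    }

    // Replays what the mapper in the question emits: (category, book name).
    static Map<String, Object> run() {
        List<String[]> emitted = new ArrayList<>();
        emitted.add(new String[] {"Small Books", "Understanding JAVA"});
        emitted.add(new String[] {"Big Books", "Understanding JSON"});
        emitted.add(new String[] {"Big Books", "Understanding XML"});
        emitted.add(new String[] {"Big Books", "Understanding Web Services"});
        emitted.add(new String[] {"Big Books", "Understanding Axis2"});
        // Same idea as the reducer in the question: count the values.
        return mapReduce(emitted, (key, values) -> "books: " + values.size());
    }

    public static void main(String[] args) {
        // Prints:
        // Big Books -> books: 4
        // Small Books -> Understanding JAVA
        run().forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

The 'Small Books' entry keeps the raw emitted value because its list never reached the reducer, which is the same shape mismatch seen in the question's output.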

You can clean this up in finalize, but an emitted value going straight through when it is the only one for its key is standard behavior by design.
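
Another common way to avoid the problem, instead of a finalize step, is to make the mapper emit a value that already has the shape of the final result, so it does not matter whether the reducer runs. The sketch below assumes the same collection and driver setup as the question; the class name CountingMapReduce is hypothetical, and the two strings are meant to replace the map and reduce strings passed to MapReduceCommand there. Summing partial counts, rather than using values.length, also stays correct if the reducer is invoked again on its own earlier output.

```java
// Hypothetical holder class; only the two JavaScript strings matter.
public class CountingMapReduce {

    // Map: emit a partial count of 1 per book instead of the book's name.
    static final String MAP = "function() { "
            + "var category = this.pages > 100 ? 'Big Books' : 'Small Books'; "
            + "emit(category, {books: 1}); "
            + "}";

    // Reduce: add up the partial counts. The result has the same shape as
    // the emitted value, so a key with a single value already looks right.
    static final String REDUCE = "function(key, values) { "
            + "var total = 0; "
            + "for (var i = 0; i < values.length; i++) { total += values[i].books; } "
            + "return {books: total}; "
            + "}";

    public static void main(String[] args) {
        // Print the functions so they can be inspected or pasted elsewhere.
        System.out.println(MAP);
        System.out.println(REDUCE);
    }
}
```

With these two functions, the lone 'Small Books' value passes through as a document with a books field of 1, which matches the shape the question expected.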