
Riak: Feed MR with search result + apply limit

I know it's possible to feed a Riak map/reduce job with the results of a Search query. I have a bucket of items that I want to search, and then I need to process the top matches (let's say 100) with map/reduce. The naive solution is to search for the keyword, apply a limit, and start a new map/reduce job with the resulting set of 100 keys.

However, I would like to do the whole job in Riak and kick off map/reduce directly from the search. I currently use the map/reduce input specification described here:

"inputs": {
    "bucket":"mybucket",
    "query":"foo OR bar"
}

Is there a way to provide a limit so that the search does not return all keys, but just the top matches for the search? Something like this:

"inputs": {
    "bucket":"mybucket",
    "query":"foo OR bar",
    "limit": 10
}

The trick here is determining which 100 keys are the "top". Since the map phase runs separately on 1/N of the vnodes and sees only one object at a time, the map function cannot determine which keys will be the top ones overall. You would need the reduce phase to sort the results and return the top 100. You could pass the limit to the MR job as an arg to the reduce phase, so you don't need to recreate the function every time the limit changes. This question may have some relevant info for you.
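As a rough sketch of the reduce-side approach described above, something like the following JavaScript function could be used as the reduce phase. The name `reduceTopN`, the `{key, score}` shape of each value, and the `limit` field of the arg are all assumptions made for illustration: the map phase would have to emit such objects, and where the score comes from depends on your data and setup. Because Riak may run a reduce function more than once on partial results, the function only sorts and truncates, so feeding its own output back in is safe.

```javascript
// Hypothetical reduce phase: keep only the `limit` highest-scoring entries.
// Assumes each value looks like {key: "...", score: <number>} (emitted by the map phase).
// `arg` is the static argument given to the phase, assumed here to be {limit: <number>}.
function reduceTopN(values, arg) {
  // Fall back to 100 when no limit is supplied.
  var limit = (arg && arg.limit) || 100;
  return values
    .slice() // copy, so the input list is not reordered in place
    .sort(function (a, b) { return b.score - a.score; }) // highest score first
    .slice(0, limit); // safe under re-reduce: top N of partial top-N lists
}
```

In the job specification, the limit would then go into the reduce phase's `arg` field (for example `"arg": {"limit": 100}`) rather than into `inputs`, so the same function can be reused with different limits.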
