
Riak - Concurrent Erlang Map/Reduce jobs

I'm running Erlang Map/Reduce jobs on Riak.

In the past, when I used JavaScript M/R jobs, I had to tune the JS VM settings properly. At the time I found this conversation extremely useful: http://riak-users.197444.n3.nabble.com/Follow-up-Riak-Map-Reduce-error-preflist-exhausted-td4024330.html

Now, since I'm not an Erlang developer, I wonder what the main implications of running concurrent M/R jobs on Riak are, and whether there are any VM settings to tune (as I used to do with JS M/R).

Thanks

So far we have found these Riak MapReduce gotchas:

  • worker_limit_reached. This happens when a lot of data arrives at the MapReduce job and the job's queue fills up.
  • reads with r=1. All data read inside MapReduce is read with r=1.
  • no read repair. MapReduce reads do not trigger read repair.
  • you may get already-deleted data. Inside the map phase, check the object's special metadata header, which indicates that the object has already been deleted (a minimal sketch follows this list).
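
A minimal sketch of such a map-phase function, assuming Riak's riak_object API (riak_object:get_metadata/1 and riak_object:get_value/1) and the X-Riak-Deleted metadata key that Riak sets on tombstones; the module name mr_examples is made up:

    %% mr_examples.erl -- must be compiled and on the code path of every Riak node.
    -module(mr_examples).
    -export([map_skip_deleted/3]).

    %% Map phase: emit the values of live objects, skip missing keys and tombstones.
    map_skip_deleted({error, notfound}, _KeyData, _Arg) ->
        %% The input key does not exist; emit nothing.
        [];
    map_skip_deleted(Object, _KeyData, _Arg) ->
        Metadata = riak_object:get_metadata(Object),
        case dict:find(<<"X-Riak-Deleted">>, Metadata) of
            {ok, _} ->
                %% Tombstone: the object was deleted but not yet reaped.
                [];
            error ->
                [riak_object:get_value(Object)]
        end.

Returning [] from a map function simply contributes nothing to the result set, which is the usual way to filter tombstones out.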

P.S. This applies to Riak 1.2.1. The Basho folks resolve many issues quickly, so it may change in the near future.

Basically, what happens here is that all phases of the map/reduce query are performed by the Erlang VM alone, not by Erlang + JS. Since the jobs run as separate, isolated processes inside the Erlang VM, concurrent jobs do not interfere with each other's operation. Host-wise you have the same computational power, so that is fine as well. As for Erlang VM parameters, many of them are already tuned by Riak itself to improve its operation, so your query is good to go.
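
For reference, here is a hypothetical way to submit such a pure-Erlang job with the official riak-erlang-client (riakc_pb_socket); the host, bucket, and key names are made up, and the {modfun, Module, Function} phase spec requires the mr_examples module above to be deployed on every Riak node:

    %% Connect to a Riak node over Protocol Buffers (default port 8087).
    {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),

    %% An explicit input list; a whole bucket can be given instead, but
    %% full-bucket MapReduce is expensive and best avoided in production.
    Inputs = [{<<"mybucket">>, <<"key1">>},
              {<<"mybucket">>, <<"key2">>}],

    %% A single Erlang map phase; 'true' means keep this phase's results.
    Query = [{map, {modfun, mr_examples, map_skip_deleted}, none, true}],

    {ok, Results} = riakc_pb_socket:mapred(Pid, Inputs, Query).

riakc_pb_socket:mapred/3 returns results grouped by phase index, so here Results would look like [{0, Values}].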
