
worker_limit_reached on parallel map reduce jobs

I have 50 hosts trying to run the map reduce job below on Riak. I am getting the error below where some of the hosts complain about the worker_limit being reached.

Can I tune the system to avoid this error? I couldn't find much documentation on worker_limit.

{"phase":0,"error":"[worker_limit_reached]","input":"{<<\\"provisionentry\\">>,<<\\"R89Okhz49SDje0y0qvcnkK7xLH0\\">>}","type":"result","stack":"[]"} with query MapReduce(path='/mapred', reply_headers={'content-length': '144', 'access-control-allow-headers': 'Content-Type', 'server': 'MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)', 'connection': 'close', 'date': 'Thu, 27 Aug 2015 00:32:22 GMT', 'access-control-allow-origin': '*', 'access-control-allow-methods': 'POST, GET, OPTIONS', 'content-type': 'application/json'}, verb='POST', headers={'Content-Type': 'application/json'}, data=MapReduceJob(inputs=MapReduceInputs(bucket='provisionentry', key=u'34245e92-ccb5-42e2-a1d9-74ab1c6af8bf', index='testid_bin'), query=[MapReduceQuery(map=MapReduceQuerySpec(language='erlang', module='datatools', function='map_object_key_value'))]))

Map reduce in Riak does not scale well, and so does not work well as part of a user-facing service.

It is suitable for periodic administrative tasks, or pre-calculations when the number of jobs can be limited.

Since the map phase of the job is a coverage query, each map will involve at least 1/n_val of the vnodes (rounded up), using one worker at each. Because you cannot guarantee that the selected coverage sets do not overlap, you should not expect to run more simultaneous map reduce jobs than your worker limit setting.

The default worker limit is 50 ( https://github.com/basho/riak_pipe/blob/develop/src/riak_pipe_vnode.erl#L86 ), but you can change it by setting {worker_limit, N} in the riak_pipe section of app.config or advanced.config, where N is the new limit.
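As a rough sketch, the riak_pipe section of advanced.config might look like the fragment below. The value 100 is only an illustrative assumption, not a recommendation, and any other sections already in your file would sit alongside this one in the same top-level list. The node most likely needs a restart before the new limit is picked up.

```erlang
%% advanced.config (sketch; 100 is an assumed example value)
[
 {riak_pipe, [
   %% Worker limit for riak_pipe; defaults to 50 when unset.
   {worker_limit, 100}
 ]}
].
```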

Keep in mind that each worker is a process, so you may also need to increase the process limit for the Erlang VM.
