
How does the Hadoop JobTracker handle long-running tasks?

I am new to the Hadoop world, and I recently got stuck on an interview question.

Q: If the JobTracker finds that a particular TaskTracker is taking a long time to finish its assigned task, will the JobTracker suspend the execution on that TaskTracker and assign a new execution of the same task to another TaskTracker, or what will it do?

There is no network error and the child JVMs are executing properly. Will the JobTracker allow the TaskTracker to execute that task forever?

Thanks.

If speculative execution is enabled, the same task will be assigned to another TaskTracker without killing the existing attempt. The output of whichever attempt completes first is used, and the other attempt is killed. Speculative execution is enabled by default. Two pairs of properties control this behavior, as shown in the sketch after these lists.

In the new API, the properties are

mapreduce.map.speculative
mapreduce.reduce.speculative

In the old API, they are

mapred.map.tasks.speculative.execution
mapred.reduce.tasks.speculative.execution
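
For illustration, here is a minimal Java sketch of turning speculative execution off for a job, assuming the new mapreduce.* API; the class and job names are just placeholders for the example:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SpeculativeConfigExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Both properties default to true; setting them to false disables
        // speculative execution for map and reduce tasks respectively.
        conf.setBoolean("mapreduce.map.speculative", false);
        conf.setBoolean("mapreduce.reduce.speculative", false);

        Job job = Job.getInstance(conf, "speculative-config-example");
        // ... set the mapper, reducer, input and output paths as usual,
        // then submit with job.waitForCompletion(true).
    }
}

The same flags can of course also be set cluster-wide in mapred-site.xml or per run with -D on the command line.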

If speculative execution is disabled and the task is running fine and reporting progress, the JobTracker will allow the task to continue.

If the task is not progressing, the JobTracker waits for the interval defined by the property mapreduce.task.timeout and then kills the task. It will retry the same task on other nodes. The number of retry attempts is defined by the properties mapreduce.map.maxattempts and mapreduce.reduce.maxattempts.
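In the same spirit, a short sketch of tuning these limits from the driver; the values shown are the Hadoop defaults (600000 ms, i.e. 10 minutes, for the timeout and 4 attempts per task):

import org.apache.hadoop.conf.Configuration;

public class TaskRetryConfigExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Kill a task attempt that reports no progress within this window;
        // the default is 600000 ms (10 minutes).
        conf.setLong("mapreduce.task.timeout", 600000L);
        // Maximum attempts per map/reduce task before the whole job is
        // marked failed; both default to 4.
        conf.setInt("mapreduce.map.maxattempts", 4);
        conf.setInt("mapreduce.reduce.maxattempts", 4);
    }
}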

