
Mapreduce Job running in local mode instead of cluster

The configuration is set up to run MapReduce jobs in cluster mode on top of YARN, but the job runs in local mode instead. I can't figure out what the issue is.

Below is yarn-site.xml (on the master node):

    <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>namenode:8031</value>
    </property>
    <property>
            <name>yarn.nodemanager.aux-services</name>    <!-- NodeManager auxiliary service -->
            <value>mapreduce_shuffle</value>    <!-- specifies how mappers and reducers exchange data -->
    </property>

    <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>namenode:8030</value>
    </property>

    <property>
            <name>yarn.resourcemanager.address</name>
            <value>namenode:8032</value>
    </property>

    <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>namenode</value>
    </property>

    <property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>2042</value>
    </property>

    <property>
            <name>yarn.nodemanager.vmem-check-enabled</name>
            <value>false</value>
    </property>

yarn-site.xml (on the slave node):

    <property>
            <name>yarn.nodemanager.aux-services</name>    <!-- NodeManager auxiliary service -->
            <value>mapreduce_shuffle</value>    <!-- specifies how mappers and reducers exchange data -->
    </property>

    <property>
            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

    <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>namenode:8031</value>    <!-- address of the resource tracker -->
    </property>

mapred-site.xml (on both the master node and the slave node):

    <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
    </property>
    <property>
            <name>yarn.app.mapreduce.am.resource.mb</name>
            <value>2048</value>
    </property>
    <property>
            <name>mapreduce.map.memory.mb</name>
            <value>2048</value>
    </property>

    <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>2048</value>
    </property>

On submission, the job output looks like this:

18/12/06 16:20:43 INFO input.FileInputFormat: Total input paths to process : 1
18/12/06 16:20:43 INFO mapreduce.JobSubmitter: number of splits:2
18/12/06 16:20:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1556004420_0001
18/12/06 16:20:43 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
18/12/06 16:20:43 INFO mapreduce.Job: Running job: job_local1556004420_0001
18/12/06 16:20:43 INFO mapred.LocalJobRunner: OutputCommitter set in config null
18/12/06 16:20:43 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
18/12/06 16:20:43 INFO mapred.LocalJobRunner: Waiting for map tasks
18/12/06 16:20:43 INFO mapred.LocalJobRunner: Starting task: attempt_local1556004420_0001_m_000000_0
18/12/06 16:20:43 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
18/12/06 16:20:43 INFO mapred.MapTask: Processing split: hdfs://namenode:9001/all-the-news/articles1.csv:0+134217728
18/12/06 16:20:43 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
18/12/06 16:20:43 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
18/12/06 16:20:43 INFO mapred.MapTask: soft limit at 83886080
18/12/06 16:20:43 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
18/12/06 16:20:43 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
18/12/06 16:20:43 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
18/12/06 16:20:44 INFO mapreduce.Job: Job job_local1556004420_0001 running in uber mode : false
18/12/06 16:20:44 INFO mapreduce.Job:  map 0% reduce 0%
18/12/06 16:20:49 INFO mapred.LocalJobRunner: map > map
18/12/06 16:20:50 INFO mapreduce.Job:  map 1% reduce 0%
18/12/06 16:20:52 INFO mapred.LocalJobRunner: map > map
18/12/06 16:20:55 INFO mapred.LocalJobRunner: map > map
18/12/06 16:20:56 INFO mapreduce.Job:  map 2% reduce 0%
18/12/06 16:20:58 INFO mapred.LocalJobRunner: map > map
18/12/06 16:21:01 INFO mapred.LocalJobRunner: map > map
18/12/06 16:21:02 INFO mapreduce.Job:  map 3% reduce 0%
18/12/06 16:21:04 INFO mapred.LocalJobRunner: map > map

Why is it running in local mode? I am running this job on a 200 MB file with 3 nodes: 2 datanodes and 1 namenode.

The /etc/hosts file is shown below:

127.0.0.1       localhost
127.0.1.1       anil-Lenovo-Product

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

192.168.8.98 namenode
192.168.8.99 datanode
192.168.8.100 datanode2

[Image: YARN UI]

  1. First, check whether these configurations are actually in effect by opening the ResourceManager's configuration page, which is http://{your-resource-manager-host}:8088/conf by default, or the UI address you configured (here: http://namenode:8088/conf).

  2. Then make sure these properties are configured:

    in mapred-site.xml

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>

  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>

  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>

in yarn-site.xml

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

Restart the YARN services and check whether it works.
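
If you want to script the configuration check, here is a minimal sketch that parses the text of a Hadoop configuration file (for example mapred-site.xml, or the XML you saved from the /conf page) and reports the value of mapreduce.framework.name. The function names and the sample string are illustrative assumptions, not part of Hadoop. The fallback to local mirrors Hadoop's default for this property.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return a dict of name -> value from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # A later definition overwrites an earlier one in this sketch.
            props[name.strip()] = (value or "").strip()
    return props


def framework_name(xml_text):
    """Return mapreduce.framework.name, or 'local' when it is not set."""
    return read_properties(xml_text).get("mapreduce.framework.name", "local")


# Hypothetical sample; in practice pass the contents of your own file.
SAMPLE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(framework_name(SAMPLE))
```

If this reports local for the file that the submitting client actually reads, the job will be handed to the local runner no matter what the cluster nodes are configured with.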

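Jobs are submitted through the ClientProtocol interface. One of its two implementations is created at startup, depending on the MRConfig.FRAMEWORK_NAME configuration (whose value is "mapreduce.framework.name"). The job ID prefix shows which provider was used:

  • LocalClientProtocolProvider: job IDs are prefixed with job_local
  • YarnClientProtocolProvider: job IDs are prefixed with job_

The valid options for this setting are classic, yarn, and local. The log in the question shows job_local1556004420_0001, so the local provider was selected.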
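
The job ID prefix gives a quick way to tell from a client log which runner picked up the job. The sketch below is only an illustration of that rule: the function name is made up, and the cluster-style ID in the usage lines is a hypothetical example, while the job_local ID is the one from the log in the question.

```python
def runner_for_job_id(job_id):
    """Classify a MapReduce job ID by its prefix.

    'job_local...' means the local runner was used; any other 'job_...'
    ID is treated here as a cluster (YARN) submission.
    """
    if job_id.startswith("job_local"):
        return "local"
    if job_id.startswith("job_"):
        return "cluster"
    raise ValueError("not a MapReduce job ID: %r" % job_id)


if __name__ == "__main__":
    # ID taken from the log in the question.
    print(runner_for_job_id("job_local1556004420_0001"))
    # Hypothetical cluster-style ID, for comparison only.
    print(runner_for_job_id("job_1544093443000_0001"))
```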

Good luck!
