
Run Hadoop MR Job via IntelliJ IDEA

I have a map-only job configured to run in distributed mode. When I run it through the CLI, the job runs successfully. The launch command looks like:

hadoop jar FileHandy.jar com.company.MainRun arg1 arg2

But if I run it via the IDE (IntelliJ IDEA), it fails with an error (the Mapper class cannot be found):

14/07/30 01:07:34 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/07/30 01:07:34 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/07/30 01:07:35 INFO input.FileInputFormat: Total input paths to process : 1
14/07/30 01:07:36 INFO mapred.JobClient: Running job: job_201407300013_0001
14/07/30 01:07:37 INFO mapred.JobClient:  map 0% reduce 0%
14/07/30 01:07:55 INFO mapred.JobClient: Task Id : attempt_201407300013_0001_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.expedia.eww.FileMapper not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1617)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:191)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ClassNotFoundException: Class com.expedia.eww.FileMapper not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1523)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1615)
    ... 8 more

I've set up the IDE and use a Maven pom.xml for dependencies only. (I'm using the jar file generated by IDEA's build process instead of the Maven jar, but the result is the same with the Maven jar.) My Run Configuration in the IDE is as follows:

Main class: org.apache.hadoop.util.RunJar
Program arguments: /path/to/jar/FileHandy.jar com.company.FileRun arg1 arg2
Working directory: set

Code snippet:

Job job = new Job(conf, "File2Hdfs");
job.setJarByClass(FileRun.class);
job.setMapperClass(FileMapper.class);
job.setInputFormatClass(NLineInputFormat.class);
job.setNumReduceTasks(0);
//FileOutputFormat.setOutputPath(job, new Path("hdfs://localhost/user/cloudera/out111"));
FileOutputFormat.setOutputPath(job, new Path(arg0[1]));
FileInputFormat.addInputPath(job, new Path(fileForMapper));


return job.waitForCompletion(true) ? 0 : 1;

FileRun.class (with main) and FileMapper.class (the mapper) are both in the com.company package.

IDEA launches the following command when I run the project:

/usr/java/jdk1.6.0_32/bin/java -Didea.launcher.port=7547 -Didea.launcher.bin.path=/home/cloudera/Downloads/idea-IC-135.909/bin -Dfile.encoding=UTF-8 -classpath /usr/java/jdk1.6.0_32/jre/lib/rt.jar:/usr/java/jdk1.6.0_32/jre/lib/deploy.jar:/usr/java/jdk1.6.0_32/jre/lib/resources.jar:/usr/java/jdk1.6.0_32/jre/lib/jsse.jar:/usr/java/jdk1.6.0_32/jre/lib/management-agent.jar:/usr/java/jdk1.6.0_32/jre/lib/jce.jar:/usr/java/jdk1.6.0_32/jre/lib/plugin.jar:/usr/java/jdk1.6.0_32/jre/lib/charsets.jar:/usr/java/jdk1.6.0_32/jre/lib/javaws.jar:/usr/java/jdk1.6.0_32/jre/lib/ext/sunpkcs11.jar:/usr/java/jdk1.6.0_32/jre/lib/ext/dnsns.jar:/usr/java/jdk1.6.0_32/jre/lib/ext/localedata.jar:/usr/java/jdk1.6.0_32/jre/lib/ext/sunjce_provider.jar:/home/cloudera/IdeaProjects/MavenFileHandy/target/classes:/home/cloudera/.m2/repository/org/apache/hadoop/hadoop-client/2.0.0-mr1-cdh4.4.0/hadoop-client-2.0.0-mr1-cdh4.4.0.jar:/home/cloudera/.m2/repository/org/apache/hadoop/hadoop-common/2.0.0-cdh4.4.0/hadoop-common-2.0.0-cdh4.4.0.jar:/home/cloudera/.m2/repository/org/apache/hadoop/hadoop-annotations/2.0.0-cdh4.4.0/hadoop-annotations-2.0.0-cdh4.4.0.jar:/usr/java/jdk1.6.0_32/lib/tools.jar:/home/cloudera/.m2/repository/com/google/guava/guava/11.0.2/guava-11.0.2.jar:/home/cloudera/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/home/cloudera/.m2/repository/org/apache/commons/commons-math/2.1/commons-math-2.1.jar:/home/cloudera/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/home/cloudera/.m2/repository/commons-codec/commons-codec/1.4/commons-codec-1.4.jar:/home/cloudera/.m2/repository/commons-io/commons-io/2.1/commons-io-2.1.jar:/home/cloudera/.m2/repository/commons-net/commons-net/3.1/commons-net-3.1.jar:/home/cloudera/.m2/repository/commons-el/commons-el/1.0/commons-el-1.0.jar:/home/cloudera/.m2/repository/commons-logging/commons-logging/1.1.1/commons-logging-1.1.1.jar:/home/cloudera/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/home/cloudera/.m2/repository/junit/junit/4.8.2/junit-4.8.2.jar:/home/cloudera/.m2/repository/commons-lang/commons-lang/2.5/commons-lang-2.5.jar:/home/cloudera/.m2/repository/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar:/home/cloudera/.m2/repository/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar:/home/cloudera/.m2/repository/commons-digester/commons-digester/1.8/commons-digester-1.8.jar:/home/cloudera/.m2/repository/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar:/home/cloudera/.m2/repository/commons-beanutils/commons-beanutils-core/1.8.0/commons-beanutils-core-1.8.0.jar:/home/cloudera/.m2/repository/org/slf4j/slf4j-api/1.6.1/slf4j-api-1.6.1.jar:/home/cloudera/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar:/home/cloudera/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.8.8/jackson-core-asl-1.8.8.jar:/home/cloudera/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.8.8/jackson-mapper-asl-1.8.8.jar:/home/cloudera/.m2/repository/org/mockito/mockito-all/1.8.5/mockito-all-1.8.5.jar:/home/cloudera/.m2/repository/org/apache/avro/avro/1.7.4/avro-1.7.4.jar:/home/cloudera/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3/paranamer-2.3.jar:/home/cloudera/.m2/repository/org/xerial/snappy/snappy-java/1.0.4.1/snappy-java-1.0.4.1.jar:/home/cloudera/.m2/repository/org/apache/commons/commons-compress/1.4.1/commons-compress-1.4.1.jar:/home/cloudera/.m2/repository/org/tukaani/xz/1.0/xz-1.0.jar:/home/cloudera/.m2/repository/com/google/protobuf/protobuf-java/2.4.0a/protobuf-java-2.4.0a.jar:/home/cloudera/.m2/repository/org/apache/hadoop/hadoop-auth/2.0.0-cdh4.4.0/hadoop-auth-2.0.0-cdh4.4.0.jar:/home/cloudera/.m2/repository/com/jcraft/jsch/0.1.42/jsch-0.1.42.jar:/home/cloudera/.m2/repository/org/apache/zookeeper/zookeeper/3.4.5-cdh4.4.0/zookeeper-3.4.5-cdh4.4.0.jar:/home/cloudera/.m2/repository/jline/jline/0.9.94/jline-0.9.94.jar:/home/cloudera/.m2/repository/org/apache/hadoop/hadoop-hdfs/2.0.0-cdh4.4.0/hadoop-hdfs-2.0.0-cdh4.4.0.jar:/home/cloudera/.m2/repository/com/sun/jersey/jersey-core/1.8/jersey-core-1.8.jar:/home/cloudera/.m2/repository/com/sun/jersey/jersey-server/1.8/jersey-server-1.8.jar:/home/cloudera/.m2/repository/asm/asm/3.1/asm-3.1.jar:/home/cloudera/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/home/cloudera/.m2/repository/org/apache/hadoop/hadoop-core/2.0.0-mr1-cdh4.4.0/hadoop-core-2.0.0-mr1-cdh4.4.0.jar:/home/cloudera/.m2/repository/hsqldb/hsqldb/1.8.0.10/hsqldb-1.8.0.10.jar:/home/cloudera/Downloads/idea-IC-135.909/lib/idea_rt.jar com.intellij.rt.execution.application.AppMain org.apache.hadoop.util.RunJar /home/cloudera/IdeaProjects/MavenFileHandy/target/FileHandy.jar com.company.FileRun arg1 arg2

Why does the job throw an exception and fail to find the Mapper class when run via the IDE, yet complete successfully when run with the hadoop jar ... command?

Thanks

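I've found the reason. The TaskTrackers can't run the job's map task because the jar file is not in the Distributed Cache. This matches the "No job jar file set" warning in the log: no jar is shipped with the job, so the task JVMs cannot load the Mapper class. To solve the problem, add the jar file to the project classpath. The steps are: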

File -> Project Structure -> Libraries, click '+' in the bottom pane and add the jar file.
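
To check whether the IDE run is picking the job classes up from a directory rather than from a jar, a small diagnostic along these lines may help. It is only a sketch using the plain JDK: the class name JarOriginCheck and its helper methods are made up for illustration, and it does not reproduce Hadoop's own lookup. It rests on the assumption that setJarByClass can only register a jar when the class resolves to a jar: URL, which would be consistent with the "No job jar file set" warning above, since target/classes is on the IDE classpath.

```java
import java.net.URL;

public class JarOriginCheck {

    /** Locates the .class file of the given class through its class loader. */
    static URL classFileUrl(Class<?> clazz) {
        String resource = clazz.getName().replace('.', '/') + ".class";
        ClassLoader loader = clazz.getClassLoader();
        // Bootstrap classes report a null loader; fall back to the system lookup.
        return loader == null
                ? ClassLoader.getSystemResource(resource)
                : loader.getResource(resource);
    }

    /** True when the URL protocol denotes an entry inside a jar file. */
    static boolean isJarProtocol(String protocol) {
        return "jar".equals(protocol);
    }

    /** True when the class file was resolved from inside a jar. */
    static boolean isLoadedFromJar(Class<?> clazz) {
        URL url = classFileUrl(clazz);
        return url != null && isJarProtocol(url.getProtocol());
    }

    public static void main(String[] args) {
        // Replace JarOriginCheck.class with the driver class (FileRun.class
        // in the question) when running this inside the project.
        Class<?> clazz = JarOriginCheck.class;
        System.out.println("class file: " + classFileUrl(clazz));
        System.out.println("from jar:   " + isLoadedFromJar(clazz));
    }
}
```

If the driver class is reported as coming from a directory such as target/classes, there is no jar for the job to ship. Adding the built jar as a library, as described above, is one way to change that; the warning in the log also names JobConf#setJar(String) as a way to point at the jar explicitly.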
